Working with regular expressions in GNU Emacs is fun! Unlike conventional regex work in Perl or Bash, where one has to type the expression and execute it in order to test it, regex in Emacs is highly interactive. Emacs has a built-in regex builder which highlights the matches as we create the regular expression.
This post explains the interactive re-builder function in Emacs, which I personally enjoyed a lot. As an example, I am going to take a few header lines from the Linux kernel source code (I altered some of them), for which we will create a regular expression.
Consider the following header lines:
    #include <stdio.h>
    #include <linux/stdio.h>
    #include <linux/stdio.h>
    #include <linux/module.h>
    #include<linux/slab.h>
    #include<linux/init.h>
    #include <linux/types.h>
    #include <linux/dmi.h>
    #include <linux/delay.h>
    #include <linux/platform_device.h>
    #include <linux/power_supply.h>
    #include "stdio.h"
    #include "linux/stdio.h"
    #include "linux/stdio.h"
    #include "linux/module.h"
Call re-builder using M-x re-builder.
This will open a buffer with the name RE-Builder.
Build an expression
Each header line starts with a #, so let's begin there. ^ matches the beginning of a line (or of a string or buffer); it is followed by a literal # and the string include. Altogether the expression will be ^#include. This should highlight every #include that starts a line.
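To see the effect of the ^ anchor outside Emacs, here is a minimal sketch using Python's re module instead of re-builder. The indented line is a made-up example that is not in the header list, and re.MULTILINE is what makes ^ match at the start of every line, as it does in an Emacs buffer.

```python
import re

# Three sample lines; the indented one is hypothetical, not from the post.
text = '#include <stdio.h>\n  #include <indented.h>\n#include "linux/stdio.h"'

# re.MULTILINE lets ^ match at the start of each line, not only at the
# start of the whole string.
pattern = re.compile(r"^#include", re.MULTILINE)

# Offsets where the pattern matched.
starts = [m.start() for m in pattern.finditer(text)]

# Only the two lines that begin with #include are found; the indented
# line is skipped because # is not its first character.
print(len(starts))
```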
Next comes the white space after #include. Note that in some lines it does not exist, for example in the line #include<linux/slab.h>, so the white space has to be optional. As a first attempt, let's use square brackets, which match exactly one character out of the set written inside them, and append [ ] (notice the space between the square brackets). The expression will be
    ^#include[ ]
The problem with the above expression is that it requires exactly one space, so it skips lines like these:
    #include<linux/slab.h>
    #include<linux/init.h>
and does not highlight more than one space in lines like these:
    #include   <linux/stdio.h>
    #include   "linux/module.h"
This is easily handled using an asterisk (*), which matches the preceding expression zero or more times. So the modified expression will be
    ^#include[ ]*
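The difference between a bare [ ] and [ ]* can be checked outside Emacs as well. The following is a small sketch in Python's re module (not re-builder itself); the helper name matched and the three-space sample line are my own choices for illustration.

```python
import re

one_space = re.compile(r"^#include[ ]")    # exactly one space is required
any_spaces = re.compile(r"^#include[ ]*")  # zero or more spaces

samples = [
    "#include<linux/slab.h>",      # no space
    "#include <stdio.h>",          # one space
    "#include   <linux/stdio.h>",  # three spaces
]

def matched(pattern, line):
    """Return the matched text, or None when the pattern does not match."""
    m = pattern.match(line)
    return m.group(0) if m else None

for line in samples:
    # Show what each pattern would highlight on this line.
    print(repr(line), repr(matched(one_space, line)), repr(matched(any_spaces, line)))
```

With [ ] the line without a space is not matched at all and only the first of several spaces is picked up, while [ ]* handles all three cases.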
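Now we have to match < or " (double quote). We can use another pair of square brackets to match either of them. Both are escaped here with a leading \ (backslash): the double quote needs it because RE-Builder's default syntax reads the expression as a Lisp string, while the backslash before < is, as far as I can tell, harmless but not required inside brackets. The expression becomes
    ^#include[ ]*[\<\"]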
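As a quick cross-check of the bracket expression for the opening delimiter, here is a sketch in Python's re module rather than Emacs. It assumes that Python treats a backslash before < or " inside brackets as a plain literal, so the escaped and unescaped forms should behave the same there; the third sample line is made up to show a non-match.

```python
import re

escaped = re.compile(r'^#include[ ]*[\<\"]')  # with the backslashes
plain = re.compile(r'^#include[ ]*[<"]')      # same set, no backslashes

lines = [
    '#include <stdio.h>',       # angle bracket after one space
    '#include"linux/stdio.h"',  # double quote, no space
    '#include stdio.h',         # hypothetical line with no delimiter
]

# For each line, record whether each variant matches.
results = [(bool(escaped.match(l)), bool(plain.match(l))) for l in lines]
print(results)
```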
Now we need to match the file name. [a-z] matches any single character between a and z, so the expression will be
    ^#include[ ]*[\<\"][a-z]
Now we have the same problem as before: the above expression highlights just a single character. Appending a + sign matches the previous pattern one or more times, so the expression will be
    ^#include[ ]*[\<\"][a-z]+
To make it more flexible, let's also match capital letters, which turns the expression into
    ^#include[ ]*[\<\"][a-zA-Z]+
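The three steps for the name part ([a-z], then +, then adding A-Z) can be compared side by side. This is a sketch in Python's re module, not in re-builder; the mixed-case line is hypothetical and only serves to show why capital letters matter.

```python
import re

prefix = r'^#include[ ]*[<"]'
single = re.compile(prefix + r"[a-z]")        # one lower-case letter
lower = re.compile(prefix + r"[a-z]+")        # one or more lower-case letters
letters = re.compile(prefix + r"[a-zA-Z]+")   # capital letters allowed too

def matched(pattern, line):
    """Return the matched text, or None when the pattern does not match."""
    m = pattern.match(line)
    return m.group(0) if m else None

kernel = "#include <linux/stdio.h>"
mixed = "#include <QtCore/QString>"  # hypothetical line, not from the post

for pattern in (single, lower, letters):
    print(repr(matched(pattern, kernel)), repr(matched(pattern, mixed)))
```

On the kernel line the match grows from a single letter to the whole word linux and then stops at the slash. The mixed-case line is only matched once A-Z is part of the set.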
Now let's also match /, . and _. Escaping each of them with \ as before (not strictly necessary inside square brackets), the expression will look like
    ^#include[ ]*[\<\"][a-zA-Z\/\.\_]+
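With /, . and _ added to the set, the match should run through the whole path. A small sketch in Python's re module (again a stand-in for re-builder) illustrates this; the header containing a digit is not in the list above and is included only to show what the set still leaves out.

```python
import re

names = re.compile(r'^#include[ ]*[<"][a-zA-Z/._]+')

path = "#include <linux/platform_device.h>"
with_digit = "#include <linux/i2c.h>"  # not from the post; shows a limit

# The match now covers the full path but stops before the closing >.
full = names.match(path).group(0)

# Digits are not in the set, so the match ends at the first digit.
partial = names.match(with_digit).group(0)

print(full)
print(partial)
```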
Finally, > and the closing " (double quote) can be matched using [\>\"]. Our final expression will be
    ^#include[ ]*[\<\"][a-zA-Z\/\.\_]+[\>\"]
This ends the introduction to Emacs’s re-builder. For more info, please visit Xah Lee’s page on regex.