Subscribe for free with iTunes, Twitter, or RSS (Ogg or Quicktime).


VimGolf - Prime Numbers


For the VimGolf challenge “List the first 100 prime numbers”, there’s a solution that uses a regular expression to detect prime numbers. At 43 keystokes, it’s not the winning solution, but I think it’s the most interesting one. It uses a few clever Vim tricks, including macros, control-a to increment, the very magic pattern switch, and the :global command. There’s a lot to learn from those 43 keystrokes, so let’s study it!



Quicktime 11 MB

Here’s the full solution (with a running keystroke tally):

keystrokes tally
C<Tab>1<Esc> 4
qw 2
<C-a> 1
>> 2
Yp 2
q 1
540@w 5
:g/\v^(<Tab><Tab>+)\1+</d|%s/<Tab>*<CR> 24
ZZ 2
total: 43

The <Tab> notation stands for the tab key, and <Esc> for the escape key. <C-a> stands for ctrl-a, and <CR> for carriage return, a.k.a. the enter key.

Try it out for yourself, and see if you can figure out how it works. You might consider the following explanation a spoiler.


Here’s the general strategy: we’ll start by inserting the numbers 2, 3, 4, 5, and so on, up to 541, which is the one-hundredth prime number. Having inserted all of the numbers in this range, one per line, we’ll then delete each line that contain a non-prime number. That should leave us with the first 100 prime numbers, and nothing else.

And here’s the trick: we’ll indent each line by a number of tabs that equals the value of the number on that line. For example, the line containing the number 2 will lead with 2 tab characters. The line containing the number 3 will lead with 3 tab characters. And so on. Then we can examine the length of the string with a regular expression.

Insert numbers 2 up to 542

Combining the [ctrl-a][] command with a macro lets us rapidly insert the numbers from 2 up to 542:

keystrokes explanation
C<Tab>1<Esc> insert a tab followed by 1
qw start recording
<C-a> increment the number
>> insert a tab at the start of the line
Yp duplicate the line
q stop recording
540@w execute the macro 540 times

Delete all non-prime numbers

The :g/{pattern}/[cmd] command allows us to execute the specified [cmd] on every line that matches the specified {pattern}. It can be used in this form to delete all non-prime numbers:


Then we can flatten the remaining indentation with this substitute command:


Note that these two individual Ex commands can be combined into a single command-line, which saves a keystroke (this is VimGolf remember!):


How does that regular expression work?

The magic of the regex happens in this bit:


Inside the parentheses, we match two or more tab characters. These are captured in a sub-match, and we can reference this sub-match using the \1 notation. This bit of the pattern matches one or more occurrences of the original sub-match.

The + operator means “one or more” occurrences, and it performs a greedy match. That means it matches as many characters as it can. This diagram demonstrates how the regular expression inspects a string of 8 tab characters:

The regex finds that 4 is a factor of 8

The match indicates that 4 is a factor of 8, and therefore 8 is not a prime.

This diagram demonstrates how the same regex interacts with a string of 7 tab characters:

The regex finds no factors of 7, meaning it's a prime

There’s no match, which indicates that 7 is a prime number.

Credit where due

Thanks to Federicco Galassi for introducing me to this VimGolf solution. He credited it to Matthew Draper, but Matthew says his solution was a refinement of the one posted by Glenn. That seems to cover the history of this trick since it appeared on the VimGolf site, but the regex that forms the core of this solution has a much longer history.

This Perl one-liner was used in Abigail’s email signature back in 1997:

perl -wle 'print "Prime" if (1 x shift) =~ /^(?!(11+)\1+$)/'

That regex must have gained some notoriety. In this perlgolf “Sieve of Erastosthenes” challenge (from 2001), it was referred to as the infamous RE from Abigail. I’ve come across [several][] explanations of how the Perl one-liner works, as well as an article that points out its limitations.

If you can fill in any more of the details in the history of this solution, please leave a comment below.

Further reading