The project that first made me fall in love with Perl…
Perl 5 had recently launched, with new features for object-oriented programming, includable modules, lexically scoped variables, and references—including closures. At the time, I was writing boot code, diagnostics, and production test software for Kurzweil-brand electronic musical gear. Part of the challenge of doing embedded systems back then was getting the machine code into the actual machine. The C compiler, assembler, and linker generated a binary file, which needed to be physically programmed into multiple EPROM chips in order to run the code on our custom hardware. We actually had software tools to pad out, checksum, and split up the binary data so it could programmed into multiple EPROMs according to our specifications.
And then we started designing systems that exceeded the capabilities of our tools. And I had to hack together custom C programs to reformat the binary data according to the new requirements. I think I might have done that twice. And then I asked myself whether Perl, which was always very good with strings of data—and what is a binary file but a big string of data? I asked myself whether Perl might be able to do a better job.
A day or two later…
…and I thought it was so incredibly cool that I could replace a whole file of cryptic C code with a paragraph like:
use ROMImage::All; my $codedata = cat( pad( infile('code.bin'), 0x28000 ), pad( infile('data.bin'), 0x17FFC ), ); my $bin = cat($codedata, checksum($codedata, 32)); my $splits = &split($bin, 2); # lo-byte/hi-byte split my @outfiles = ( outfile( 'eprom0lo.img', section($splits->splitnum(0), 0x00000 => 0x10000) ), outfile( 'eprom0hi.img', section($splits->splitnum(1), 0x00000 => 0x10000) ), outfile( 'eprom1lo.img', section($splits->splitnum(0), 0x10000 => 0x20000) ), outfile( 'eprom1hi.img', section($splits->splitnum(1), 0x10000 => 0x20000) ), ); process($_) foreach @outfiles;
I don’t know how I let myself get away with naming a function the same as the split
keyword, but I did. At least I was using strict
for everything. I eventually constructed, on this framework, a deeply configurable system for mastering binary code and data for newer models that included FlashROM. I frankly don’t know how this system would have been possible were it not for the expressiveness of Perl.
Priming for Expressiveness
Enter Perl 6, whose list of features boggles the mind, but includes a few that really excite me… if they can pull it off.
And so I figured I would see whether I could convince Perl 6 to search for prime numbers. Why prime numbers? Of all the toy problems I could come up with, why that one? I could find a whole bunch of plausible-sounding reasons, like it provides a well-understood set of algorithms, of varying complexity, utilizing numerous parts of the language syntax and implementation features, allowing me to evaluate the ease and performance of Perl 6…
But true reason is: Why the hell not? I was inspired by a blog post showing a naïve primes algorithm implemented in Haskell and Scala using lazy evaluation, and I thought, Perl 6 does lazy evaluation. Perl 6 should be able to do that.
Note that, contrary to the above-linked post, this algorithm does not implement the Sieve of Eratosthenes. Rather it is a trial-division algorithm. And that is what I’ll be experimenting with in Perl 6.
The Perl 6 Native Approach
As it turns out, the Perl 6 Int
type includes an is-prime
method, which returns True
if the integer is (likely to be) prime. (The current docs say that it uses the Miller-Rabin primality test, but I’m not sure why it of necessity must do so, except that this is probably the best algorithm currently available.)
So we can write a simple function that returns all the primes up to and including a defined maximum thusly:
sub primes-native (Int $max --> Array[Int]) { grep({.is-prime}, 2 .. $max); }
That is, take all the integers from 2 through to max, and ask the is-prime
method whether they should be included in the filtered (“grepped”) list. Return the result. (As in Perl 5, the return
statement would be redundant.)
One thing you may notice here is that Perl 6 allows symbols to contain the occasional hyphen. It distinguishes a hyphen from a minus sign, because a minus sign always has whitespace around it. Yes, in Perl 6, whitespace is significant as a contextual marker. In particular, infix operators should have whitespace around them, and prefix and postfix operators (including function calls) usually should not. That’s why I put space around the ..
operator (even though it isn’t strictly required in this example), and I left no space between grep
and the opening parentheses. Whitespace matters now.
Perl 6 also supports function signatures, which is very convenient. And it supports gradual typing… which is another post, but the short explanation: Perl 6 declarations can include type information, in which case the compiler will statically check types. If you leave the types out, the compiler will use dynamic typing. This particular function takes one formal parameter, an Int
called $max
, and returns an Array of Int
(or Array[Int]
—same thing).
Rakudo seems to be the most complete Perl 6 implementation available. Unfortunately, when I try to say primes-native(10)
on the latest Rakudo, it replies: “Type check failed for return value; expected ‘Array[Int]’ but got ‘List’.” This is a known problem. That’s too bad. In general, I guess, for now at least, I’ll just have to use dynamic typing for anything that returns an array, or any formal parameter that might receive an array returned by another function. Basically, use dynamic typing for arrays.
So, we “fix” the code thusly:
sub primes-native (Int $max) { grep({.is-prime}, 2 .. $max); }
Based on my benchmarks, the JVM version of Rakudo seems to be the fastest Perl 6 engine available. And running on it, primes-native(1000)
takes about 115mS to find all 168 primes in that range. Even on my old 2.3GHz Intel Core 2 Duo (with one core tied behind its back, because Perl 6 is currently single-threaded, even on the JVM), a tenth of a second to go through 999 candidate primes, not exactly blazing fast.
The Elegant Approach
Of course, that’s not what I was originally aiming at.
As it turns out, I never quite made it to lazy nirvana, because laziness is only partially supported in Rakudo. But that’s another post.
In the meantime, I can scale back the algorithm. Remember those “plausible-sounding reasons”? A simple trial-division loop allows me to experiment with Perl 6, moving objects around (which just happen to be integers in this case), doing lots of simple operations on them and stuffing them into lists… basically what most web apps do, but on a much smaller scale.
So, a Perl 6 loop that goes through all the candidate integers from 2 through $max
, and eliminates any that are factors of any that come before.
There’s something very elegant about a language that allows to me to express that algorithm directly in code:
sub primes-any (Int $max) { my Int @primes; @primes.push($_) unless $_ %% any(@primes) for 2 .. $max; return @primes; }
The %%
operator is read “is divisible by.” That is, $x %% $y
is the same as $x % $y == 0
. So the highlighted line above can be read:
Push the candidate onto the list of primes, unless it is divisible by any of the already-found primes (in which case, it’s not a prime after all, so don’t push it). Do this for each of the candidates in the range 2 .. $max
.
The grace and power of this code is provided by the magical any
operator, which creates a Junction of @primes
. This “any” Junction causes the %%
operator to be applied to each of the primes in turn, and if the candidate is divisible by any of them, the expression returns True
. And—part of the unfulfilled promise—these individual operations can automatically occur in different threads: automagic multithreading.
(UPDATE: I happened to run across posts claiming that Rakudo fully implements autothreading, and has for some time, contrary to what I had believed. However, on both Parrot and the JVM, the above any
Junction still seems to use only one core of my dual-core CPU. I cannot explain this observation.)
But for now, Junctions degenerate to a simple, single-threaded loop…
Or worse? In my benchmarks, primes-any(1000)
required over 11 seconds to run. (Yes, seconds. Eleven of them. No, that is not a typo.)
Is that part of the price of elegant code?
I’d like to believe not. Surely there must be a more efficient implementation, even in Perl 6. Indeed there is. But unfortunately nothing comparable to extant production languages.
But that part of the story will have to wait for next time.
By Robert Czernkowski February 6, 2014 - 11:19 pm
Hi Tim,
I just stumbled across your website. What have you done with your website that stops me magnifying it (unpinch on an iPad) – I’m aging and need me writing to be bigger. (I gave up on HTML no longer reflowing upon enlargement almost a decade ago, when the graphic designers came along and destroyed usability:) )
Cheers – NB, this is a serious question. I guess I wonder why people do this.
Have a good weekend,
RobC
By Ultimape August 12, 2014 - 1:20 pm
The elegance of Perl is why I fell in love with it.
Most other languages just sound like simple English to me. Just like in Wikipedia, it may be easier to read for a non-native speaker, but it also increases the length of the content by a significant amount: http://en.wikipedia.org/wiki/Simple_English_Wikipedia
Once I was able to ‘read’ it as a language, it started to sound like English in my head. I think a large part of my affinity for it is for similar reasons to why immersion learning of a second languages is so effective. Being taught a language by someone familiar with the idioms is a completely different from learning it on your own.
I think much of that has to do with the non-context-free nature of Perl. We grow up learning to use context based placeholders like ‘it’ and other implied aspects, but very few programming languages leverage this part of our lexical powers. I’d venture to say that a programming language’s expressiveness can be tied almost directly with how many nuances of real human language it is able to capture.
Mentally replacing the ‘$_’ operator as ‘it’ makes want a pseudo code generator for simple scripts.
By Ultimape August 12, 2014 - 3:54 pm
To add to my previous comment, here is an article suggesting that the beauty of code may actually be part of the way we process language: http://www.huffingtonpost.com/chris-parnin/scientists-begin-looking-_b_4829981.html
And a related one on the beauty in mathematics. http://neurosciencenews.com/neurobiology-mathematical-beauty-756/