Perl 6 and the Price of Elegant Code

The project that first made me fall in love with Perl…

Perl 5 had recently launched, with new features for object-oriented programming, includable modules, lexically scoped variables, and references—including closures. At the time, I was writing boot code, diagnostics, and production test software for Kurzweil-brand electronic musical gear. Part of the challenge of doing embedded systems back then was getting the machine code into the actual machine. The C compiler, assembler, and linker generated a binary file, which needed to be physically programmed into multiple EPROM chips in order to run the code on our custom hardware. We actually had software tools to pad out, checksum, and split up the binary data so it could programmed into multiple EPROMs according to our specifications.

And then we started designing systems that exceeded the capabilities of our tools. And I had to hack together custom C programs to reformat the binary data according to the new requirements. I think I might have done that twice. And then I asked myself whether Perl, which was always very good with strings of data—and what is a binary file but a big string of data? I asked myself whether Perl might be able to do a better job.

A day or two later…

…and I thought it was so incredibly cool that I could replace a whole file of cryptic C code with a paragraph like:

use ROMImage::All;

my $codedata = cat(
    pad( infile('code.bin'), 0x28000 ),
    pad( infile('data.bin'), 0x17FFC ),
);

my $bin = cat($codedata, checksum($codedata, 32));

my $splits = &split($bin, 2); # lo-byte/hi-byte split

my @outfiles = (
    outfile( 'eprom0lo.img', section($splits->splitnum(0), 0x00000 => 0x10000) ),
    outfile( 'eprom0hi.img', section($splits->splitnum(1), 0x00000 => 0x10000) ),
    outfile( 'eprom1lo.img', section($splits->splitnum(0), 0x10000 => 0x20000) ),
    outfile( 'eprom1hi.img', section($splits->splitnum(1), 0x10000 => 0x20000) ),
);

process($_) foreach @outfiles;

I don’t know how I let myself get away with naming a function the same as the split keyword, but I did. At least I was using strict for everything. I eventually constructed, on this framework, a deeply configurable system for mastering binary code and data for newer models that included FlashROM. I frankly don’t know how this system would have been possible were it not for the expressiveness of Perl.

Priming for Expressiveness

Purple Shock, by Andras Pfaff, http://flic.kr/p/4seYGi

Enter Perl 6, whose list of features boggles the mind, but includes a few that really excite me… if they can pull it off.

And so I figured I would see whether I could convince Perl 6 to search for prime numbers. Why prime numbers? Of all the toy problems I could come up with, why that one? I could find a whole bunch of plausible-sounding reasons, like it provides a well-understood set of algorithms, of varying complexity, utilizing numerous parts of the language syntax and implementation features, allowing me to evaluate the ease and performance of Perl 6…

But true reason is: Why the hell not? I was inspired by a blog post showing a naïve primes algorithm implemented in Haskell and Scala using lazy evaluation, and I thought, Perl 6 does lazy evaluation. Perl 6 should be able to do that.

Note that, contrary to the above-linked post, this algorithm does not implement the Sieve of Eratosthenes. Rather it is a trial-division algorithm. And that is what I’ll be experimenting with in Perl 6.

The Perl 6 Native Approach

As it turns out, the Perl 6 Int type includes an is-prime method, which returns True if the integer is (likely to be) prime. (The current docs say that it uses the Miller-Rabin primality test, but I’m not sure why it of necessity must do so, except that this is probably the best algorithm currently available.)

So we can write a simple function that returns all the primes up to and including a defined maximum thusly:

sub primes-native (Int $max --> Array[Int]) {
    grep({.is-prime}, 2 .. $max);
}

That is, take all the integers from 2 through to max, and ask the is-prime method whether they should be included in the filtered (“grepped”) list. Return the result. (As in Perl 5, the return statement would be redundant.)

One thing you may notice here is that Perl 6 allows symbols to contain the occasional hyphen. It distinguishes a hyphen from a minus sign, because a minus sign always has whitespace around it. Yes, in Perl 6, whitespace is significant as a contextual marker. In particular, infix operators should have whitespace around them, and prefix and postfix operators (including function calls) usually should not. That’s why I put space around the .. operator (even though it isn’t strictly required in this example), and I left no space between grep and the opening parentheses. Whitespace matters now.

Perl 6 also supports function signatures, which is very convenient. And it supports gradual typing… which is another post, but the short explanation: Perl 6 declarations can include type information, in which case the compiler will statically check types. If you leave the types out, the compiler will use dynamic typing. This particular function takes one formal parameter, an Int called $max, and returns an Array of Int (or Array[Int]—same thing).

Rakudo seems to be the most complete Perl 6 implementation available. Unfortunately, when I try to say primes-native(10) on the latest Rakudo, it replies: “Type check failed for return value; expected ‘Array[Int]’ but got ‘List’.” This is a known problem. That’s too bad. In general, I guess, for now at least, I’ll just have to use dynamic typing for anything that returns an array, or any formal parameter that might receive an array returned by another function. Basically, use dynamic typing for arrays.

So, we “fix” the code thusly:

sub primes-native (Int $max) {
    grep({.is-prime}, 2 .. $max);
}

Based on my benchmarks, the JVM version of Rakudo seems to be the fastest Perl 6 engine available. And running on it, primes-native(1000) takes about 115mS to find all 168 primes in that range. Even on my old 2.3GHz Intel Core 2 Duo (with one core tied behind its back, because Perl 6 is currently single-threaded, even on the JVM), a tenth of a second to go through 999 candidate primes, not exactly blazing fast.

The Elegant Approach

Of course, that’s not what I was originally aiming at.

As it turns out, I never quite made it to lazy nirvana, because laziness is only partially supported in Rakudo. But that’s another post.

In the meantime, I can scale back the algorithm. Remember those “plausible-sounding reasons”? A simple trial-division loop allows me to experiment with Perl 6, moving objects around (which just happen to be integers in this case), doing lots of simple operations on them and stuffing them into lists… basically what most web apps do, but on a much smaller scale.

So, a Perl 6 loop that goes through all the candidate integers from 2 through $max, and eliminates any that are factors of any that come before.

There’s something very elegant about a language that allows to me to express that algorithm directly in code:

sub primes-any (Int $max) {
    my Int @primes;
    @primes.push($_) unless $_ %% any(@primes) for 2 .. $max;
    return @primes;
}

The %% operator is read “is divisible by.” That is, $x %% $y is the same as $x % $y == 0. So the highlighted line above can be read:

Push the candidate onto the list of primes, unless it is divisible by any of the already-found primes (in which case, it’s not a prime after all, so don’t push it). Do this for each of the candidates in the range 2 .. $max.

The grace and power of this code is provided by the magical any operator, which creates a Junction of @primes. This “any” Junction causes the %% operator to be applied to each of the primes in turn, and if the candidate is divisible by any of them, the expression returns True. And—part of the unfulfilled promise—these individual operations can automatically occur in different threads: automagic multithreading.

(UPDATE: I happened to run across posts claiming that Rakudo fully implements autothreading, and has for some time, contrary to what I had believed. However, on both Parrot and the JVM, the above any Junction still seems to use only one core of my dual-core CPU. I cannot explain this observation.)

But for now, Junctions degenerate to a simple, single-threaded loop…

Or worse? In my benchmarks, primes-any(1000) required over 11 seconds to run. (Yes, seconds. Eleven of them. No, that is not a typo.)

Is that part of the price of elegant code?

I’d like to believe not. Surely there must be a more efficient implementation, even in Perl 6. Indeed there is. But unfortunately nothing comparable to extant production languages.

But that part of the story will have to wait for next time.

This entry was posted in Uncategorized and tagged , , , , . Bookmark the permalink.

3 Responses to "Perl 6 and the Price of Elegant Code"

Leave a reply