Mopping up with Perl

Photo: 30 Ways to Shock Yourself #2, by Bre Pettis, from the book Elektroschutz in 132 Bildern

I ported my DFW.pm Hackathon entry to the latest Perl dev release (5.19.9), the one with built-in sub signatures.

I had originally designed it using the latest p5-mop and Алексей Капранов’s (Alexei Kapranov’s) signatures package. (I thought this was more in the spirit of mop than Sub::Signatures.) And that worked fairly well, with only a few glitches, and only a few complaints. But since sub signatures are now available (Yay!) in core (No hiding how I feel about that, eh?), it seemed a good time to bring the experiment fully into the present.

And here are some discoveries from my experiences with Perl mop+signatures.

(Note: Regarding the photo, “30 Ways to Shock Yourself”… with a mop… I hope you’ll agree, I was shocked only in how powerful and relatively pain-free mop was to develop with. I chose this particular photo, because it was the only CC mop photo I could find that was relatively provocative and interesting. If you were dismayed or offended, don’t worry; next time I’ll probably choose a more subtle photo, like this one.)

Throw a package on the boiler-plate, will ya?

In the file Data/Dedup/Engine.pm, p5-mop standard boilerplate would seem to include something like this:

package Data::Dedup;

use mop;

class Engine {
}

no mop;

This defines a new mop class, Data::Dedup::Engine.

But it puts all imported symbols in Data::Dedup, including all the mop syntax. Subroutine names also end up there. And these could potentially conflict with symbols from class Data::Dedup::Files (which is probably the reason for the no mop at the end).

I actually hadn’t quite figured this all out when I originally wrote the code. However, I played with a few other options and ended up with the following boilerplate (which may or may not be an improvement on standard practice):

package Data::Dedup::Engine;
# VERSION: dist tool inserts version here

package Data::Dedup::Engine::_guts;
use 5.019_009;
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';
use mop 0.03;

class Data::Dedup::Engine {
}

Yeah, it might be nice to reduce that boilerplate. In reality, it’s not as bad as it looks.

The first package line is really only there to give Dist::Zilla a way to auto-insert the package version number, and if I were more clever, I would have figured out how to do that without the funny prelude. Moving further down the page, everything regarding signatures and mop will eventually be subsumed by something like use 5.022.

(I should note at this juncture that most of these lines add features that would also need to be added when using the standard practice. That is, do not compare this boilerplate to the standard mop boilerplate above. The two accomplish very different things. If we were to add all the signature stuff, the strict and warnings, and the version stuff back into the standard p5-mop form, the p5-mop form would end up with just as many lines. One more line, actually, because of the additional no mop line.)

The interesting part of what that leaves is the line:

package Data::Dedup::Engine::_guts;

…which places all the package’s private implementation into an internal namespace.

(Naturally, Perl being Perl, there may be other ways of achieving a similar result, like namespace::autoclean, Lexical::Imports, and lexically, not all of which I actually tried. To my mind, using a _guts namespace within a shared lexical scope seemed the simplest thing that could possibly work.)
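
The effect of the _guts trick can be sketched in plain Perl, without mop at all. This is only an analogy under stated assumptions: My::Widget, its total method, and the _total helper are hypothetical names invented for the illustration, and fully qualified sub names stand in for the methods that a mop class block would declare.

```perl
use strict;
use warnings;

package My::Widget;            # public namespace: only the API should live here
our $VERSION = '0.01';

package My::Widget::_guts;     # private namespace: imports and helpers land here
use List::Util qw(sum);        # imported into My::Widget::_guts, not My::Widget

sub _total { return sum(@_) }  # private helper, also in _guts

# The sub name is fully qualified, so the method is installed in the
# public package, but its body is compiled in _guts and can see _total.
sub My::Widget::total { shift; return _total(@_) }

package main;

print My::Widget->total(1, 2, 3), "\n";
print My::Widget->can('sum')    ? "sum leaked\n"    : "sum did not leak\n";
print My::Widget->can('_total') ? "_total leaked\n" : "_total did not leak\n";
```

Neither the imported sum nor the helper shows up as a method on the public class, which is the property the _guts package is there to provide.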

Production, errors, and other rules of play

When I first started developing my hackathon solution, I installed the p5-mop version 0.02-TRIAL release. And I soon discovered that it does not play nice with Carp. Then I discovered that this issue had been addressed in git. So I upgraded to the development codebase. (Hence, the 0.03 version requirement.)

I also discovered that the signatures package had problems with parsing certain formal parameter names, sometimes, if the names had underscores in them. (WTF?) Fortunately, use feature 'signatures' does not have this problem. So that’s also now a non-issue.

The lesson, though, is clear: for now, you want to be using the latest, greatest, most-est bleeding-edgiest version you can find.

But that’s almost the only reason it’s not “ready for production” yet.

I put “ready for production” in quotes, because “ready” depends on what you mean by “production,” that is, what your particular production requirements are. In my opinion, it was more than ready for the production of the hackathon solution that I wanted to develop. And even with 20/20 hindsight, adopting p5-mop was a good call. And if I could have adopted feature 'signatures' at the time, I would have done that, too.

The biggest problem I experienced is unfortunately one of the banes of Perl syntax sugar: bad error messages. With the signatures module, I would commonly wrestle with completely useless and meaningless syntax errors, on so-not-the-correct line numbers, that would be solved, for example, by removing the underscores from formal parameter names. (I don’t even remember how I figured that out. I’ve kinda blocked it from my memory.)

Porting to feature 'signatures' seems to have improved the situation a little, but unfortunately still not enough.

After making the changeover, I encountered “Too few arguments for subroutine at lib/Data/Dedup/Engine.pm line 373.” The actual problem was that the code at line 429 included the following function call:

_block( $object, $!blocking, \($!_blocks_by_key) );

And this function call is actually correct, except that _block was declared thusly on line 369:

sub _block($object, $blocking_subs, $r_blockslot, $keys)

That is, _block is called with one fewer actual parameter than it has formal parameters. This was not a problem with the signatures module, because it simply sets any unfilled formal parameters to undef. With feature 'signatures', I needed to make the default explicit… which is probably a good thing.

sub _block($object, $blocking_subs, $r_blockslot, $keys=undef)

So the error was correct. But the line number was completely wrong. And the error message didn’t really give me the information I needed to understand what really went wrong. Ideally, it should have said something like: “Too few arguments calling subroutine _block at lib/Data/Dedup/Engine.pm line 429.”
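
Here is a minimal sketch of that behavior, assuming a Perl recent enough to have feature 'signatures'. The names block_strict and block_lenient are hypothetical stand-ins for _block, and the arguments are dummies. The exact wording of the error, and the line number it reports, differ between Perl versions, so the sketch prints whatever your perl says rather than promising a particular message.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical stand-in for _block: all four parameters are required.
sub block_strict($object, $blocking_subs, $r_blockslot, $keys) {
    return defined $keys ? 'keys given' : 'keys missing';
}

# Same signature, but the last parameter has an explicit default.
sub block_lenient($object, $blocking_subs, $r_blockslot, $keys = undef) {
    return defined $keys ? 'keys given' : 'keys missing';
}

# Three arguments against four required parameters: this dies at run time,
# so the call is wrapped in eval and the message is captured.
my $error = eval { block_strict('obj', [], \my $slot); 1 } ? '' : $@;
print "strict:  $error";

# With the explicit default, the same three-argument call succeeds.
my $result = block_lenient('obj', [], \my $slot2);
print "lenient: $result\n";
```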

Other more minor quibbles: mop and signatures don’t play nice with syntax highlighters (but that’s to be expected). They don’t play nice with perlcritic. (Yeah, well, what does?) Pod::Coverage doesn’t work with mop (yet?).

The internal mop data structures used much more RAM than I expected. This seems to be because they use an additional layer of indirection for each construct they store. That is, an internal mop hash will contain a reference that refers to another reference that then finally refers to the actual object associated with that slot in the hash. When you have lots and lots of objects, such as I did during some of the hackathon tests, those extra references add up fast, using up RAM along the way.
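
To make the shape of that extra layer concrete, here is a purely illustrative sketch. It is not mop's actual internal layout (the hash names and the slot key are invented for the example); it only shows what "a reference to a reference to the object" looks like next to a direct reference.

```perl
use strict;
use warnings;

my $object = { id => 42 };           # the "actual object" (a hash reference)

# Direct storage: the hash slot holds the object reference itself.
my %direct = ( slot => $object );

# Indirect storage: the hash slot holds a reference to the scalar that
# holds the object reference, i.e. one extra scalar per stored item.
my %indirect = ( slot => \$object );

my $via_direct   = $direct{slot}{id};             # one hop to the object
my $via_indirect = ${ $indirect{slot} }->{id};    # two hops to the object

print "direct:   $via_direct\n";
print "indirect: $via_indirect\n";
print "indirect slot type: ", ref( $indirect{slot} ), "\n";   # a REF, not a HASH
```

One extra scalar per slot is small on its own; multiplied across every slot of every object, it is the kind of overhead that becomes visible in RAM.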

There’s also the issue of speed. But although mop is certainly slower than more primitive Perl, it’s not lethargic, even in this early stage, and I was able to get decent performance out of it. So as with anything, you have to weigh low-level performance against other design considerations. But even if mop were not to improve further, it would still be a useful tool to have in the toolbox.
