2010/04/30

Perl - ExtUtils::H2PM

I've spent a lot of time lately writing modules that wrap Linux kernel features in some way or another. They all boil down to basically the same thing - export a bunch of constants, and structure packing/unpacking functions. Various bits of extra fancy interface code are sometimes nice, but most of the time these can be written in Pure Perl once the base bits are done.

It's always annoyed me that one has to write an XS module just to obtain these. It's a lot of extra work and bother, for something that ought to be so simple. So I started thinking: how can I make this much simpler from Perl?

What I came up with is ExtUtils::H2PM: a module to make it trivially easy to write this sort of boring module, which wraps some constants and structures from the OS's header files.

As a brief example, consider the following:

use ExtUtils::H2PM;

module "Fcntl";

include "fcntl.h";

constant "O_RDONLY";

write_output $ARGV[0];

This is it. This is all the code you, as a module author, have to write. You store that in some file; let's call it Fcntl.pm.PL. You let the build system go about converting that into Fcntl.pm, which on my system now looks like this:

package Fcntl;
# This module was generated automatically by ExtUtils::H2PM from -e

push @EXPORT_OK, 'O_RDONLY';
use constant O_RDONLY => 0;

1;

This is a plain standard Perl module that can be installed and used, to provide that constant.

We can also create pack/unpack-style functions to wrap structure types. Consider:

use ExtUtils::H2PM;

module "Time";

include "time.h";

structure "struct timespec",
   members => [
      tv_sec  => member_numeric,
      tv_nsec => member_numeric,
   ];

write_output $ARGV[0];

For this we obtain:

package Time;
# This module was generated automatically by ExtUtils::H2PM from -e

use Carp;
push @EXPORT_OK, 'pack_timespec', 'unpack_timespec';

sub pack_timespec
{
   @_ == 2 or croak "usage: pack_timespec(tv_sec, tv_nsec)";
   my @v = @_;
   pack "q q ", @v;
}

sub unpack_timespec
{
   length $_[0] == 16 or croak "unpack_timespec: expected 16 bytes";
   my @v = unpack "q q ", $_[0];
   @v;
}

1;

This was done entirely automatically too.
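To show what calling code would do with such a module, here is a minimal round-trip sketch. It inlines copies of the two generated functions from the listing above so that it runs on its own; in practice you would import them from the generated module instead. It assumes a perl built with 64-bit integer support (needed for the "q" pack format) and the same 16-byte struct timespec layout as in that output. The sample values are made up.

```perl
use strict;
use warnings;
use Carp;

# Copies of the generated functions from the listing above, inlined so
# this sketch is self-contained.
sub pack_timespec
{
   @_ == 2 or croak "usage: pack_timespec(tv_sec, tv_nsec)";
   my @v = @_;
   pack "q q ", @v;
}

sub unpack_timespec
{
   length $_[0] == 16 or croak "unpack_timespec: expected 16 bytes";
   my @v = unpack "q q ", $_[0];
   @v;
}

# Build the binary form of a timespec (illustrative values), then
# decode it again.
my $buf = pack_timespec( 1272585600, 500_000_000 );
my ( $sec, $nsec ) = unpack_timespec( $buf );

printf "%d bytes: tv_sec=%d tv_nsec=%d\n", length( $buf ), $sec, $nsec;
```

The packed string is what you would hand to a function expecting the raw structure, and the unpack function turns a buffer received from the kernel back into a plain list.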

In a real use-case, you'd be using this to wrap things like socket option constants and structures: cases where existing Perl functions already wrap the syscalls, and you simply want to provide new option values or structures. I've now written several modules mostly or entirely relying on ExtUtils::H2PM to build them, freeing me of having to write all that XS code to implement them.

2010/04/26

Order matters even when it doesn't

Revision control diffs are most readable when they aren't noisy. Operations that disturb the order of many lines in the file create noise, which makes it hard to read the interesting change. YAML specifies mappings (hashes, to us Perl types), which are unordered associations of keys to values. Even though YAML doesn't put an ordering on those, sometimes we'd like to pretend that it does, so as to preserve the order when we load a file, edit the data, then dump it back to the file.

At work we store a YAML document in Subversion, which describes a lot of details about IPsec tunnels. In an ideal world this would be the initial source of the information. The world, as you may have observed, is not yet ideal, so this file is in fact back-scraped from information in the actual config files, to keep it up to date. Naturally that's done in Perl.

The YAML file stores a big mapping, each entry itself being a record-like mapping, containing details in named keys. This causes great trouble for our load/edit/dump script, because YAML doesn't specify an ordering on mapping keys. They'll be dumped in "no particular order". This wouldn't normally be a problem, except that because it's stored in Subversion, a commit changing one line of actual detail might suddenly produce hundreds of lines of false diff, because of reordered keys.

To solve this, I had to apply Much Evil Hackery. The YAML Perl module, it turns out, has a data structure tied to a hash, which remembers the order of keys. By subclassing YAML::Loader and replacing its method to read a mapping into a hash ref, we can force it to use this structure instead. This alteration is transparent to the Perl code in between; it just sees a normal hash. However, YAML::Dumper sees the ordering and preserves it when it writes out.
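To illustrate the general idea of a tied hash that remembers key order, here is a minimal sketch. The OrderedHash class and the tunnel fields are hypothetical, written only for this example; it is not the structure the YAML module itself uses, and it does not show the YAML::Loader subclass. It only shows why ordinary code sees a plain hash while iteration order stays stable.

```perl
use strict;
use warnings;

# Hypothetical illustration class: a tied hash that returns its keys in
# insertion order. Not the structure used inside the YAML module.
package OrderedHash;

sub TIEHASH
{
   my $class = shift;
   return bless { keys => [], data => {}, iter => 0 }, $class;
}

sub STORE
{
   my ( $self, $key, $value ) = @_;
   # Only record the position the first time a key is seen
   push @{ $self->{keys} }, $key unless exists $self->{data}{$key};
   $self->{data}{$key} = $value;
}

sub FETCH  { $_[0]{data}{ $_[1] } }
sub EXISTS { exists $_[0]{data}{ $_[1] } }

sub DELETE
{
   my ( $self, $key ) = @_;
   @{ $self->{keys} } = grep { $_ ne $key } @{ $self->{keys} };
   return delete $self->{data}{$key};
}

sub CLEAR
{
   my $self = shift;
   @{ $self->{keys} } = ();
   %{ $self->{data} } = ();
}

# keys/each walk the remembered key list rather than the hash's own order
sub FIRSTKEY { my $self = shift; $self->{iter} = 0; return $self->NEXTKEY }
sub NEXTKEY  { my $self = shift; return $self->{keys}[ $self->{iter}++ ] }

package main;

tie my %tunnel, 'OrderedHash';
$tunnel{remote} = '192.0.2.1';
$tunnel{local}  = '198.51.100.7';
$tunnel{cipher} = 'aes256';
$tunnel{remote} = '192.0.2.99';   # editing a value keeps its position

print join( ",", keys %tunnel ), "\n";   # prints "remote,local,cipher"
```

Because the ordering lives behind the tie interface, code that only reads and assigns values never needs to know about it; only whatever writes the data back out has to iterate the keys in the remembered order.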

The upshot: Load/edit/dump of trees of mappings in YAML preserves ordering, allowing cleaner commits into revision control.

This has been suggested as a wishlist bug against YAML; see also https://rt.cpan.org/Ticket/Display.html?id=56741

2010/04/06

Why I won't tell you what to do

Often in #perl we get people asking a question, whether explicitly or implicitly, that requires us to pick a solution for them. I try hard not to do this. The closest I'll get is to listen to their description of the problem, and suggest some things which I think might help. Hardly ever do I suggest just one thing.

I do this because it's hard for us to know the entire surrounding context of a problem. If anyone else is like me, then there'll be days, weeks, maybe even months of history behind it; various attempts they've tried already, other bits and pieces of code, system, whatever, that they haven't been able to explain in the 5 minutes they took to give us the problem. You can't explain a 1-month problem in 5 minutes, nor can I give a solution to it in a similar timeframe.

What I can do is name a few things that I'd include in a shortlist of things to think about in more detail, were I to find myself with a similar problem. They can then go away and think about these things. Maybe they were already aware of them, and we've just given some more confidence that those might be correct. Or maybe they weren't, so we've given them something new to read about. Either way, we've helped guide the decision process, without outright saying "thou shalt do this" - because, without having that month of context around it, for all we know it could be completely wrong. But it's a good start.