perl5 – Juggling Bits

Mocking LWP::UserAgent with POST redirection

2010-04-11 thomas11

To test code that uses the ubiquitous (and awesome) LWP::UserAgent module, I use Test::Mock::LWP. It works as advertised, with minimal code. However, I still found myself writing a little plumbing. Here are two code snippets that I use alongside Test::Mock::LWP.

First, a way of mocking code that uses LWP::UserAgent’s requests_redirectable array, as in:


sub _new_agent {
    my $agent = LWP::UserAgent->new;
    push @{$agent->requests_redirectable}, 'POST';
    return $agent;
}

This won’t work with vanilla Test::Mock::LWP: the code needs the array reference that requests_redirectable returns, and the mock doesn’t provide one. Exposing a raw array like this is a small flaw in UserAgent’s API. To circumvent the problem, mock your own UserAgent creation sub (you did encapsulate that, didn’t you?):


use Test::MockObject;

# In addition to Test::Mock::LWP, we also need to mock our own
# wrapper for LWP::UserAgent->new, as it needs UserAgent's
# requests_redirectable array and the mock doesn't have that.
Test::MockObject->new()
  ->fake_module('My::Foo',
                '_new_agent' => sub { return LWP::UserAgent->new; }
               );
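
For readers who have not met fake_module before, the underlying idea can be sketched in core Perl: mark the package as loaded and install a replacement sub into its symbol table. This is only an illustration of the technique, not Test::MockObject's actual code, and My::FakeAgent is a hypothetical stand-in for the mocked user agent.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the mocked user agent.
package My::FakeAgent;
sub new { return bless {}, shift }

package main;

# Pretend My::Foo is already loaded, so a later "require My::Foo"
# will not go looking for the file on disk.
$INC{'My/Foo.pm'} = 1;

# Install a replacement _new_agent into My::Foo's symbol table.
{
    no strict 'refs';
    *{'My::Foo::_new_agent'} = sub { return My::FakeAgent->new };
}

my $agent = My::Foo::_new_agent();
print ref($agent), "\n";
```

Code that obtains its agent through My::Foo::_new_agent now gets the stand-in and never touches requests_redirectable.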

Second, just a tiny sub to make things nicer, IMHO: it provides a clean abstraction for getting the latest requested URL, which is useful if that’s all you want to test. It makes tests nicer to read and isolates them from the mocking module. It also encapsulates the magic index we need to pull the requested URL out of the array of constructor arguments that the mocked request records. The index is 2 here, which works for code making requests using HTTP::Request::Common. It’s probably different if you use HTTP::Request directly, but the test runner’s expected/actual output will quickly show you where the URL hides.

sub latest_requested_url {
    return scalar $Mock_request->new_args()->[2];
}
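
To make the magic index less magic, here is a small self-contained sketch. My::RecordingRequest is a hypothetical stand-in for the mocked request class, and the layout of the recorded list (class name, then HTTP method, then URL) is an assumption that merely matches the index 2 used above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the mocked request class: it only records
# the arguments its constructor was called with, class name included.
package My::RecordingRequest;
my $last_args;
sub new      { $last_args = [@_]; return bless {}, $_[0] }
sub new_args { return $last_args }

package main;

sub latest_requested_url {
    return My::RecordingRequest->new_args()->[2];
}

# The code under test would do something like this internally.
My::RecordingRequest->new(POST => 'http://example.org/submit');

# Assumed layout: index 0 is the class name, 1 the method, 2 the URL.
print latest_requested_url(), "\n";
```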

Interesting Perl 5 modules

2010-04-08 thomas11

In my journey into Perl this last year or so I’ve naturally done a lot of research on available modules. After all, the huge variety of CPAN is one of Perl’s strongest points. I journaled a few modules that struck me as particularly interesting, so why not merge these notes and links and publish them for others finding their way in the modern Perl world?

PSGI/Plack

Superglue interface between perl web application frameworks and web servers, just like Perl is the duct tape of the internet.

Inspired by Python’s WSGI and Ruby’s Rack, this is the modern way of doing Perl web apps.

PSGI is a specification to decouple web server environments from web application framework code. […] Web application developers (end users) are not supposed to run their web applications directly using the PSGI interface, but instead are encouraged to use frameworks that support PSGI, or use the helper implementations like Plack (more on that later).

A large number of HTTP servers support PSGI, so you’ll find anything from simple embedded solutions to non-blocking, asynchronous ones with Comet support.

Similarly, a large number of web frameworks have been built with support for the PSGI spec, again from the advanced and complex like Catalyst to the simple like Dancer. At first I wanted to show off some cool Perl web frameworks in this post, but there are so many nowadays that would deserve to be mentioned that I’ll just refer you to the PSGI/Plack site and let you pick whatever floats your boat.
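
To give an idea of what the specification asks for, here is a minimal sketch of a PSGI application in core Perl. The response text and the hand-made environment hash are made up for illustration, and a real server passes a much fuller environment.

```perl
use strict;
use warnings;

# A PSGI application is just a code reference: it receives the request
# environment as a hash reference and returns a three-element array
# reference of status, headers and body.
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} || '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello from $path\n" ],
    ];
};

# Call it by hand with a deliberately incomplete environment, just to
# show the shape of the response.
my $res = $app->({ REQUEST_METHOD => 'GET', PATH_INFO => '/demo' });
print "$res->[0] $res->[2][0]";
```

When such a code reference is the last expression in an app.psgi file, the plackup command that ships with Plack can serve it.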

AnyEvent

the DBI of event loop programming

AnyEvent lets you write event-based (callback-based) code without limiting you to a certain event loop. The event loop based nature of your module is transparent to the user. A large number of external loops are supported, among them Glib (for GTK/Gnome apps) and Qt.

There’s also POE with a very similar purpose. This post and its comments contain a lot of useful information and opinions to compare the two.

Dist::Zilla

distribution builder; installer not included!

Like many other modules, this one helps build distributions for upload to CPAN, but it deliberately does not handle installation of the module. Because it is only run by developers, typically on a repository, it can afford to do powerful stuff. It features a promising git integration module, for instance. Reviews are very favorable.

Data::Traverse

Callback-based depth-first traversal of Perl data structures

Data::Traverse exports a single function, traverse, which takes a BLOCK and a reference to a data structure containing arrays or hashes. […] Data::Traverse performs a depth-first traversal of the structure and calls the code in the BLOCK for each scalar it finds.

Simple and useful.
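
The traversal described above can be sketched in a few lines of core Perl. Note that walk_scalars is a hypothetical helper written for this illustration; it is not Data::Traverse's API or implementation, only the same depth-first idea.

```perl
use strict;
use warnings;

# Hypothetical helper, not Data::Traverse's API: walk nested arrays and
# hashes depth-first and call $callback for every value that is not an
# array or hash reference.
sub walk_scalars {
    my ($callback, $data) = @_;
    if (ref $data eq 'ARRAY') {
        walk_scalars($callback, $_) for @{$data};
    }
    elsif (ref $data eq 'HASH') {
        # Sort the keys so the visiting order is predictable.
        walk_scalars($callback, $data->{$_}) for sort keys %{$data};
    }
    else {
        $callback->($data);
    }
    return;
}

my $struct = [ 1, { foo => 42, bar => [ 2, 3 ] }, 4 ];
my @seen;
walk_scalars(sub { push @seen, $_[0] }, $struct);
print "@seen\n";
```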

Email::Stuff

A more casual approach to creating and sending Email:: emails

In a “Why use this?” section – a good idea – the author shows that using this module, you don’t need to know how to structure MIME messages, and the code is very short and clear.


Email::Stuff->to('Simon Cozens<simon@somewhere.jp>')
            ->from('Santa@northpole.org')
            ->text_body("You've been a good boy this year.")
            ->attach_file('choochoo.gif')
            ->send;

Parameter hash patterns in Perl 5

2010-03-08 (updated 2010-03-09) thomas11

It’s a common pattern in Perl 5 to use a hash for a subroutine’s arguments, or some of them. Damian Conway explains this pattern in his excellent Perl Best Practices. I’ll first briefly recap the standard forms, then show how you can support both standard arguments and a hash for extra arguments.

The basic form looks like this:

pad({ text=>$line, cols=>20 })

You can actually leave out the curly hash-braces and just pass a list of key-value pairs:

pad( text=>$line, cols=>20 )

That’s what you often see in practice, but Conway argues against doing that. It allows mismatches such as passing cols=>20..21 (two values on the right-hand side) to pass compilation.
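
A small sketch of what happens with such a mismatch when raw pairs are used; the values are made up. The range expands in list context, and the problem only shows up as a run-time warning when the list is assigned to a hash.

```perl
use strict;
use warnings;

# The range operator expands in list context, so the "pair"
# cols => 20..21 contributes three elements instead of two.
my @raw = ( text => 'abc', cols => 20 .. 21 );
print scalar(@raw), " elements: @raw\n";

# Assigning the odd-sized list to a hash only triggers a run-time
# warning, which is captured here so it can be inspected.
my @warnings;
my %args;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    %args = @raw;
}
print "warning: $_" for @warnings;

# The stray 21 has silently become a key with an undefined value.
print join(', ', sort keys %args), "\n";
```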

Most of the time that won’t be a problem in practice, as the values of the pairs will be simple enough. But it’s better to do things in a uniform way that works in all situations, and the sub’s implementation depends on the way of passing the hash.

When passing an explicit hash enclosed in {}, you get it as a reference:

my ($hashref) = @_;
my $foo = $hashref->{foo};

Using raw key-value pairs, you get a flat list that you can assign directly to a hash:

my %hash = @_;
my $foo = $hash{foo};

Obviously, the latter form does not allow passing any arguments other than the hash, which is one more argument against it. I often write subs that take the necessary arguments directly, and optional ones, or “configuration” parameters, in a hash that may or may not be passed:

$uniprot->retrieve(@ids, {format=>'rdf', include=>1});

You can implement a routine once, say _get_args_and_conf, and re-use it to handle this distinction between arguments and configuration so that your subs don’t have to. It looks at the arguments, checks if the last one is a hash reference, and if that’s the case, merges it with the default configuration and returns the arguments and the configuration separately. You would use it like this in your code:

my %RETRIEVE_DEFAULTS = (
    format => 'fasta',
    debug => 0 );

sub retrieve {
    my $self = shift;  # retrieve is called as a method, see above
    my ($ids_ref, $conf_ref) =
        _get_args_and_conf(\%RETRIEVE_DEFAULTS, @_);
    # $ids_ref now contains the arguments, here some ids to
    # retrieve from uniprot.org, and $conf_ref contains the
    # configuration hash with the user's values if given, and the
    # default ones otherwise.
}

My implementation looks like this. The meat of the routine, the hash handling, is straight from Conway’s Best Practices.

use Carp qw(croak);  # croak reports errors from the caller's perspective

sub _get_args_and_conf {
    my $default_conf_ref = shift;
    my @args = @_;
    croak "I need at least one argument!" if @_ < 1;

    # if last arg is a hash, it's additional configuration
    my %defaults = %{$default_conf_ref};
    my %conf = ref $args[-1] eq 'HASH' ?
        (%defaults, %{pop @args}) : %defaults;
    if (@args < 1) {
        croak "I need at least one argument in addition to the hash!";
    }

    # TODO Deal with the case that the argument list is given as a
    # reference.

    return (\@args, \%conf);
}
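
To see the pieces working together, here is a self-contained sketch. The helper is repeated in slightly condensed form so the example runs on its own. The retrieve shown here is a hypothetical plain function that only reports what it was given, and the ids are made up.

```perl
use strict;
use warnings;
use Carp qw(croak);

my %RETRIEVE_DEFAULTS = ( format => 'fasta', debug => 0 );

# Condensed copy of the helper, so this example is runnable by itself.
sub _get_args_and_conf {
    my ($default_conf_ref, @args) = @_;
    croak "I need at least one argument!" if @args < 1;

    # Start from the defaults; a trailing hash ref overrides them.
    my %conf = %{$default_conf_ref};
    if (ref $args[-1] eq 'HASH') {
        %conf = (%conf, %{ pop @args });
    }
    croak "I need at least one argument in addition to the hash!"
        if @args < 1;

    return (\@args, \%conf);
}

# Hypothetical retrieve() that just reports its ids and configuration.
sub retrieve {
    my ($ids_ref, $conf_ref) = _get_args_and_conf(\%RETRIEVE_DEFAULTS, @_);
    return sprintf '%s as %s (debug=%d)',
        join(',', @{$ids_ref}), $conf_ref->{format}, $conf_ref->{debug};
}

print retrieve('P12345', 'Q67890'), "\n";
print retrieve('P12345', { format => 'rdf' }), "\n";
```

The first call falls back to the defaults, while the second overrides only format and keeps the default debug value. If retrieve is called as a method, the invocant has to be shifted off @_ before the remaining arguments are handed to the helper.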