catdev: 2009

Wednesday, August 26, 2009

Moose: adding attributes to a base class

I'm using FormHandler for an example here, but the technique is general.

FormHandler forms are Perl classes with a number of attributes, including an array of Field objects that have another (probably too large) number of attributes. A few of the field attributes are validation and data related, but a lot of the others are related to producing HTML for display. Even with too many attributes in the field classes, users still want more. One of the FormHandler users was developing forms that used a JavaScript form package and wanted to interface the FormHandler forms to the JavaScript forms. To do this, he wanted an additional attribute in the fields to store information that would be used by ExtJS.

One possibility would be to subclass every last single field and add an additional attribute. This does not sound like either fun or a good idea. A much better alternative was to use Moose to add attributes to the base class by applying a role containing the attributes.

So I started out by creating a small role containing a single attribute:

   package MyApp::Field::Extra;
   use Moose::Role;
   has 'my_extra_attribute' => (is => 'rw', isa => 'Str' );
   1;

Now I needed someplace to apply the role. The BUILD method of the user form looks like a good place. Like good Moose classes, all of the fields have '__PACKAGE__->meta->make_immutable' in them, so a role can't be applied to them as they stand. So I make the class mutable, apply the role using Moose::Util, then make it immutable again:

   my $class = 'HTML::FormHandler::Field';
   $class->meta->make_mutable;
   Moose::Util::apply_all_roles( $class->meta, ('MyApp::Field::Extra'));
   $class->meta->make_immutable;

Using a test file I make a form class using this code in the BUILD method, create an instance of the form, and find that the fields now have an additional attribute that I can retrieve and set.

This looks good until I try to set the new attribute in a 'has_field' declaration:

   has_field 'my_field' => ( my_extra_attribute => 'some_value' );

Ooops. This doesn't work. I'd forgotten that the fields are constructed in the base class BUILD method which fires before my form class's BUILD method. So now I need someplace else to move my role setting that will happen before the base class BUILD. Maybe after BUILDARGS...

   after 'BUILDARGS' => sub {
      my $class = 'HTML::FormHandler::Field';
      $class->meta->make_mutable;
      Moose::Util::apply_all_roles( $class->meta, ('MyApp::Field::Extra'));
      $class->meta->make_immutable;
   };

This works. Now I can treat the new attribute just like an original field attribute. And it's a lot easier than subclassing every field...

There are other ways of achieving the same thing. You could add an attribute instead of applying a role. But roles are more general purpose and flexible, so I'm satisfied with this solution for now.
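If you want to try the mutable/apply/immutable sequence outside of FormHandler, here is a minimal sketch. The class and role names (My::Base::Field, My::Field::Extra) are made up for the example and only stand in for the real field classes. It passes the class name to apply_all_roles rather than the metaclass, and it assumes Moose is installed.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for an immutable base class such as
    # HTML::FormHandler::Field.
    package My::Base::Field;
    use Moose;
    has 'name' => ( is => 'rw', isa => 'Str' );
    __PACKAGE__->meta->make_immutable;
}

{
    # Hypothetical role holding the extra attribute.
    package My::Field::Extra;
    use Moose::Role;
    has 'my_extra_attribute' => ( is => 'rw', isa => 'Str' );
}

package main;
use Moose::Util ();

# Open the class up, apply the role, then close it again.
my $meta = My::Base::Field->meta;
$meta->make_mutable;
Moose::Util::apply_all_roles( 'My::Base::Field', 'My::Field::Extra' );
$meta->make_immutable;

# The new attribute can now be set in the constructor or via its accessor.
my $field = My::Base::Field->new(
    name               => 'title',
    my_extra_attribute => 'some_value',
);
print $field->my_extra_attribute, "\n";
```

Re-running make_immutable matters here, because it regenerates the inlined constructor so that it knows about the new attribute.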

And I definitely <3 the flexibility that comes with Moose.

Sunday, August 2, 2009

Fields: Moose attributes or arrays of objects?

Moose attributes are lovely things. They're a part of Moose that helps to make programming fun again. But having a magical, Dr. Who screwdriver doesn't mean that you can lose all of your other tools. If you have a nail, a good old-fashioned hammer is the right tool.

Moose type constraints are great things to catch programmer errors, typos, incorrect objects. Throwing errors for this kind of type failure is a good thing. You want the programmer to notice that something has gone wrong even if he hasn't had his morning coffee. I first did object oriented programming with C++. I still remember all the hoops that we had to jump through to handle construction time errors (many C++ compilers didn't have exceptions yet). The problem of what to do if you can't create a valid object is the problem that Moose attributes are designed to solve, and throwing errors is the proper thing to do.

But in my opinion, when the task is validation, not construction of a valid object, throwing errors isn't always the best way. You want to take an input string, hand it to the validator which examines it and hands it calmly back saying "good" or "not good", or some more specific error message which can be presented to a user--a user who is not the programmer. Handing it to a construct which instead has a tantrum and throws it back at your head - picture the programmer frantically dodging this way and that, trying to catch the message and decipher the problem to present some reasonable message to the user - introduces unnecessary difficulties.
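To make the "calm" validator concrete, here is a small sketch in plain Perl. The validate_username sub and its two rules are invented for illustration and are not FormHandler code. The point is only that the validator returns messages for the user instead of dying.

```perl
use strict;
use warnings;

# Hypothetical validator: returns a list of error messages, which is
# empty when the value is acceptable. It never dies on bad user input.
sub validate_username {
    my ($value) = @_;
    my @errors;
    push @errors, 'Username is required'
        unless defined $value && length $value;
    push @errors, 'Username must be at most 20 characters'
        if defined $value && length($value) > 20;
    return @errors;
}

# The caller decides what to show the user; nothing has to be caught.
for my $input ( 'joe', '', 'x' x 30 ) {
    my @errors = validate_username($input);
    print @errors ? "not good: @errors\n" : "good\n";
}
```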

When I started doing the work that turned into HTML::FormHandler I looked at the possibility of turning the field objects into Moose attributes, and saw a number of problems. One of the problems was simply names. Any attribute in the form that was not a field became a name that was not allowed as a field name. This sucked. Of course the programmer could pick some other name and have some additional attribute trait that specified the "real" field name. Yuck, but I suppose it would have worked.

There was a lot of functionality in the field class already, so it seemed to make most sense to have the attributes be of a field object type. But then the Moose attribute type validation doesn't actually apply to the job of the field classes -- validation -- but instead to whether or not this is a valid field, which is appropriate but not the same thing at all. Moving all of the attributes in the field class into some Moose attribute metaclass didn't appeal at all. I didn't see the point. The fields worked fine as objects. If you didn't ever want to provide assistance in constructing HTML or allow validation using simple object methods or an endless list of other things, I suppose you could go in that direction. But I already had working code and working functionality that I liked that I wasn't willing to give up for no reason other than ... um, fashion? Mooseish purity?

So there would be a bunch of Moose attributes that had a field class type. Then you need to be able to specify all of the attributes to use to construct the fields. This is not insurmountable, but to do it in a non-irritating way (for me) would require some Moose-ish sugar. Fine, that would work. The next problem is that the fields are really a kind of set and you often need to iterate over them. So there would need to be some kind of array (or other collection) pointing to all of these objects. This was starting to look suspiciously like the array of field objects that I already had.

At this point I started wondering what advantage there was to making the fields Moose attributes. Sure, you could do $form-> sometimes. If the field name didn't conflict with other attributes. But I was already planning to have nested sub-fields and arrays of repeatable fields and that model didn't fit well at all with the Moose attribute idea. It's not like it was necessary to be able to do method modifiers or other Moose-ish things on the fields. They were objects. You could do whatever you wanted in them already. You could already use method modifiers (etc) on all of the parts of the form validation process.

I'm sure there would have been some way of making fields-as-Moose-attributes work. But this is Just A Program. The idea is to make it work in as simple a way as possible (but no simpler), not to make sure that it uses the latest shiny thing whether it provides any advantage or not. (Well, that's my goal anyway. Your goal may be playing with shiny new technologies and tools. :) )

The Collection::Array of field objects was working fine. It's still working fine. I still don't see any advantage to turning the fields into Moose attributes that isn't canceled out by some other disadvantage or complication. The current architecture, which I'm happy with, would be just silly shoehorned into a bunch of Moose attributes.

I guess I won't win the Most Unnecessary Moose Metaprogramming (MUMM) award (which it sometimes seems to me that the Mooserati are competing for). Shrug.

Saturday, July 18, 2009

why HTML::FormHandler...

Dan Dascalescu (dandv) said about FormHandler that it "would be awesome if the POD could mention what's different from FormFu, why create the module at all." I have tried to put some of that into the pod, but apparently it's not enough. So I've been thinking about this issue (the problem of too many packages and not enough info on the differences), and I think that there are a number of things getting in the way of being clear enough and loud enough with "what's different".

For one thing, Carl Franks and many other people have put lots of work and time into FormFu and like it. The reasons that HTML::FormHandler exists are personal, very human feelings and reactions that don't belong in "official documentation". In order for the statements about what I think to not be offensive, they have to be put into the form of: "This is the way I feel, this is the way I reacted. YMMV." I don't want to start a flame war and I don't want to hurt anybody's feelings. (Yeah, I know that's kinda girly. So sue me.)

The next problem is that I don't really know FormFu. When I was looking for a form package a year and a half ago I looked at it and I just could not bring myself to use it. I hated it on sight. So I'm not competent to compare FormHandler and FormFu. (Anybody willing to submit a doc patch who has used FormFu?)

The last problem is that I don't think that I'm necessarily the right person to be really loud and clear about the advantages of FormHandler. I'm too close to it. It would feel like I'm tooting my own horn, boasting. I'm almost certainly not going to be seeing its weaknesses clearly. I think that somebody else will have to make the definitive comparison.

I guess I can talk about why FormHandler exists, but many of the reasons are human reasons, not technical reasons.

Programmers have emotions, sometimes strong emotions, about the tools and libraries that they use. Some packages are clean, easy, fun. Modifications can be made easily. The pieces can be clumped together in ways that are accessible and readable. Programmers can have strong emotions about variable names. So clearly they're a little unbalanced. (You should have seen the arguments that me and my manager had about whether a database column should be named 'report_no' or 'report_num'. It was downright silly but we both had emotions that were too strong to give up our positions. About a database column name.) Sometimes the emotions are irrational, or intuitions based on experience that can't easily be put into words.

So I'm trying to remember why I looked at FormFu and thought: "Yuck. I'm not going to use that. No, no, no." (Warning: this may not be helpful to you in making a choice of packages.)

The first problem that I had with FormFu was the YAML config files. I hate YAML. With a passion. I think that whitespace sensitive formats are stupid. I hated hated hated the tab-sensitivity of makefiles, and YAML was just more of the same. My eyes/brain can't adjust to pulling out the significant information when I glance over a YAML file. Yes, I KNOW that FormFu can use any kind of Config::General format. But all of the documentation was in YAML and I Didn't Want to Look at It.

There. I told you this wasn't going to be a rational technical discussion, didn't I?

I also hated the fact that the forms were even defined in config files to start with. I looked at it, and my first thought was: but what if I want to do something in a way that hasn't been pre-defined? Yes, I know you can make your own constraint classes, yadda, yadda, yadda. It was too disconnected to the particular form. I looked at the way it worked and I got claustrophobia. I felt like I would be having to adapt myself to FormFu, instead of me being able to adapt FormFu to the way I wanted to work. FormFu code couldn't be easily subclassed or overridden for a particular form. You'd have to do something weird to change the way that it worked. I have no idea if my reaction was accurate or not. I couldn't bring myself to try it enough to find out.

It felt heavyweight, cumbersome, and rigid. And no, I can't give you a detailed list of why it felt that way to me. I read the documentation and listened to people complain in #catalyst, and that was it.

Speaking of #catalyst, that was a big source of my dislike for FormFu. Lots of people showed up trying to do something particular with HTML and could not figure out how to do it. I saw people spend days on HTML changes that would have taken them minutes by hand. I saw how hard it was to figure out how to achieve particular results (that I can't remember anymore). This may be totally unfair of me. Maybe boatloads of people will eventually show up complaining about FormHandler in the same way. But I'm trying to be honest here (painfully so, maybe). And those complaints did play a part in my disinclination to use a package that was so hard to customize.

So I looked at the other options out there. Formbuilder was deprecated. Rose Forms was ok, but not quite right. Reaction was interesting but overkill for my application. I found Form::Processor, and it certainly wasn't perfect, but at least the thought of using it didn't depress me. I liked the architecture and I liked the way that it could automatically save forms to the database. The problem was that it didn't have a DBIC model. I decided to write one.

Unfortunately Form::Processor had almost no tests. It had a few field tests, and a small handful of non-database form tests. There were NO tests for interfacing with the database. There were no examples for interfacing with the database. So the first thing I had to do was create a CDBI example so that I could figure out how a DBIC model would work. Eventually I got the DBIC model to work, and uploaded my first package to CPAN. They did not come and arrest me for inadequate code.

I was happy enough with Form::Processor for quite a while. Then I started to use Moose in most of my new code. Form::Processor used Rose::Object. I really liked Moose and it seemed silly to be using two different object systems. The Form::Processor code, because it used Rose::Object, looked easy to convert to Moose. So in a burst of energetic yak shaving, I converted it to Moose.

Bill Moseley, the owner of Form::Processor, had some interest in moving toward Moose, but he didn't have time to work on it. There was no public repository. He had a suite of tests he wanted it to pass that I didn't have access to. There were many Moose features that couldn't be used and feature improvements that couldn't be done because they wouldn't be compatible with Bill's codebase.

I was getting some interest in a Moosified form processor from other programmers and I liked how my new code was shaping up. So I put the code up on github and released it to CPAN. Followed by deafening silence. But I had been communicating with Zbigniew Lukasiak about the new project and he had lots of experience with FormFu and ideas he wanted to try out, so he joined the project, which was a godsend. It's so much better to have other programmers contributing too.

So now it's six months from the first CPAN release. I've gotten a lot of positive feedback from people who've used it who like it. The codebase feels more stable now and we've implemented most of the large features we had in mind (though we're hoping for better rendering in the future...)

We've tried hard to make the API consistent, we've refactored to support compound and repeatable fields. We have a comprehensive test suite. It still seems to be very easy to customize - Moose helps with that. To me, it feels flexible, not cumbersome and rigid. You can use hand-built HTML if you want. The rendering is straightforward and simple to a fault. Adding new features has not been painful. I'm happy with it. YMMV.

I suspect that this was NOT what Dan had in mind with his request for 'why create the module at all'. It might be of more interest to a sociologist studying open source than somebody looking for reasons to pick a package. But it is the answer that I have, such as it is.

Perl Blogging for personal satisfaction

Okay, so I suck at dealing with certain sorts of motivational programs. In another persona I write fiction. Fiction writers have this quaint concept BIAW - Book In a Week. It's not really a book in a week. It's a group of writers who get together and commit to some writing goal in the beginning, and then cheer each other on, or boo and hiss. Whatever. It's SUPPOSED to be motivational. The idea is that committing yourself to a public goal like that is supposed to motivate you to actually do it, since you will (theoretically) be ashamed of not meeting your goal. Or be inspired by the achievements of others. Or something.

I tried joining these sorts of BIAW programs on multiple occasions. And then I noticed something odd. I was writing MORE when I wasn't enrolled in some BIAW program than when I was. Sigh.

Apparently I'm not inspired enough by the idea of not achieving what I said I would in front of other people. It actually has a *cough* negative effect. I blow off a day, and then I start resenting the whole thing. It becomes a chore, a drag, some irritating task that I'm "supposed to" be doing. It's not fun anymore.

And the Perl ironman thing has started to have the same feeling for me. Not the fault of the idea. The idea is great. It's just me and the idea that don't get along.

So in an attempt to actually fulfill the spirit of the Ironman challenge (as opposed to the rules - ewww, rules, I hate rules) henceforth I'm not going to even TRY to meet the rules for achieving the various IronPerson levels. Instead - *gasp* - I'm going to blog about Perl and programming when I feel like I have something to say. The habit has started - and that was the whole point, after all.

Now the only remaining question is ... can I think of some clever riff on Ironman for my personal perl blogging program? Um ... the MarshmallowMan? I know! the StayPuftMan!

Now that I have an inspiring symbol and everything, I can stop. I'm happy now.

Tuesday, June 30, 2009

Refactoring with roles for testing

The programming I've been doing lately is very straightforward Perl. It's hard to think what might be interesting about it... and I don't want to look like an idiot and blather on about something that everybody else was doing in kindergarten.

So maybe this is nothing exciting, but it's what I was doing yesterday.

I'm working on replacing an ancient grew-up-over-seventeen-years Perl backend system with more modern Perl. Instead of using a database for most of it, information on state and status is encoded in Linux file permissions and empty files, and in strings in files. Etc. Etc. I'm updating a large portion of it to use a semi-real database (MySQL - but in this case it really is a step upward), but I'm a one person programming team and it just wasn't feasible to rewrite the whole system in one project. As it is, the project was really too big for one person, but it just wasn't possible to do a decent job without completely re-doing a fairly large chunk.

So I get to an interface to a part of the old system, where instead of just storing some flags in the database I have to write an empty file in one directory for one state, copy a file to a different directory for another state, append a line with an identifier in it to a file for a third state... You get the idea.

At this point it's not clear whether the section I'm interfacing to will ever be rewritten. The higher-ups have gotten pretty twitchy at how much this is costing, and I'll probably have to work on some PHP project part time in the fall. So I don't know if this interface code will ever be used in any other module. But it's irritating stuff to test and debug because it's used in the middle of a fairly long and complex process.

So I plopped it all into a Moose role and created a test case with a dummy package:

   use Test::More tests => 3;
   
   {
      package Test::SomeInterface;
      use Moose;
      with 'Some::Kludgy::Interface';
   }
   my $test = Test::SomeInterface->new;
   ok( $test, 'it compiled!' );
   ok( $test->process, "it didn't blow up!" );
   ok( $test->some_status, 'it worked!' );

So now the funky interface code is packaged off by itself and easily replaced, it was much easier to test than buried in some larger module, and if I'm lucky I'll never have to look at it again. One nice side effect of doing this was that I was forced to clean up what I was passing in to the methods instead of relying on the fairly global object attributes.

Pretty obvious, I guess. But it's yet another way in which Moose makes programming more bearable - when fun is too much to hope for. I <3 Moose.

Tuesday, June 16, 2009

Where the cleaver falls...

Way back in the dark ages, I worked on one of the first projects at IBM to use that fancy new methodology, object oriented programming. They sent the whole team to a month of OO training, and then that wasn't quite enough to bring us up to speed, so they sent us to another month with a different training company. I remember that one month used a well-regarded (at that time) book by Grady Booch. The whole object thing wasn't that hard to understand, but the thing that I couldn't quite wrap my mind around was how you figured out what to make the objects, how to split up the problem domain into pieces. So I eagerly flipped through the Booch book, thinking that they must have some words of wisdom on this point, and finally I found it. Buried somewhere in the middle they had one page with three paragraphs which basically said: and then a miracle occurs.

Pretty much the equivalent of the humorous saying somebody I know was fond of: Where the cleaver falls, there the chicken parts. (Maybe you had to be there...)

So I was working on a comment screen that needed a captcha. I had never used one before, but I figured it couldn't be that hard. HTML::FormHandler is a form processor, and a captcha is a form element, so it seemed obvious that a captcha field for HFH was in order. That part was easy... And then I realized that in order to check the value typed in by the user, the captcha had to be stored in the session. The session (in this case) is managed by a Catalyst plugin, and for the form field to be aware of the session just seems wrong. Too much mixing of areas of concern.

So a Moose role comes to the rescue. (Perl!) I created a Moose role that contained the captcha field, plus provided 'get_captcha' and 'set_captcha' methods. This seems relatively clean - or at least better than having the field mess with the session. Of course in order to access the session, the Catalyst context or session needs to be passed in to the form object. Not ideal, but not all that bad either. But now these methods are form methods... They could be turned into callbacks, or coderefs that are set on the field, but I've already had to deal with not-entirely-clean relationships between form and field before, so for now I decide to just call them as form methods. I don't really like this, because I'd prefer that the fields not know about the object that contains them, but it's easy and it works.

Then I realize that I'll need a URL for the image in the field. Damn. This is getting messy. I suppose that I can add yet another attribute 'captcha_url' and set THAT too. But then there's another required piece of code that's neither form nor field, but a controller action. Sigh. It's too many different pieces of code for one simple thing.

Object oriented programming is no longer the new, shiny thing that it was so long ago, but the problem of splitting up the problem into nice little pieces is still with us. That OO training was for an early attempt at an object oriented interface to a SQL database. It failed utterly. Looking back, I have to say that the problem wasn't fully understood - or at least we didn't understand it. It was a brave attempt, but a fair amount of knowledge about the problems associated with doing ORMs has grown up, and they're better these days.

So I guess progress is being made. In some places by some people...

Friday, June 5, 2009

Flexible extensions with Moose method modifiers

When constructing an API for a library intended to be subclassed by users, you run into the classic tension between power and simplicity. Too many ways to call your code, too many different methods and the user is overwhelmed. Yet if you don't have enough power, the user may be stymied when he reaches some level of complexity.

Moose method modifiers provide a lot of flexibility that doesn't necessarily have to be provided by an explicit API. In some cases they may provide stopgap ways of extending, in other cases they may actually be better than creating more complexity by providing more explicit hooks.

The main processing method for HTML::FormHandler is pretty straightforward:

sub process
{
   my $self = shift;

   $self->clear if $self->processed;
   $self->_setup_form(@_);
   $self->validate_form if $self->has_params;
   $self->update_model if $self->validated;
   $self->processed(1);
   return $self->validated;
}

The above method is called when a form is processed:

   my $form = MyApp::Form::User->new;
   $form->process( item => $user, params => $c->req->params );

The form is something like:

   package MyApp::Form::User;
   use HTML::FormHandler::Moose;
   extends 'HTML::FormHandler::Model::DBIC';

   has_field 'username' => ( required => 1 );
   has_field 'some_attr';

A common need is to have particular fields be required in some circumstances and not in others. So maybe it would be nice to have some kind of callback to determine whether the field is required or not... But a simple Moose method modifier on one of the methods called in the 'process' routine will do the trick just fine:

   before 'validate_form' => sub {
      my $self = shift;
      $self->field('some_attr')->required(1)
         if( ...some condition... );
   };

Now maybe there's some magical API syntax that could achieve the same thing, but really this is pretty straightforward and easy. Maybe there's some additional database update that you want to do that's not directly related to the fields, such as recording that the user updated his record. Another Moose method modifier comes to the rescue:

   before 'update_model' => sub {
      shift->item->user_updated;
   };

...where 'user_updated' is a method on your DBIx::Class result source that sets a flag or an updated time or whatever you want. The same result could be achieved by subclassing the methods of course, but the method modifiers can also be used in Moose roles, making it possible to split up your form pieces into nice chunks that can be reused in multiple forms.

   package MyApp::Form::Options;
   use HTML::FormHandler::Moose::Role;

   has_field 'opt_in';
   has_field 'something_else';
   after 'validate' => sub {
      # ...do some specific cross validation...
   };
   before 'update_model' => sub {
      # ...some db processing...
   };

And then a form class could just be a collection of roles:

   package MyApp::Form::User;
   use HTML::FormHandler::Moose;
   extends 'HTML::FormHandler::Model::DBIC';
   with 'MyApp::Form::Options';
   with 'MyApp::Form::Login';
   1;

So the power and flexibility of HTML::FormHandler's API are increased without extra code simply by using Moose and its method modifiers. When I was originally hired to program in Perl I wasn't that happy about it, partly because the native Perl OO features are weak and kludgy. But with Moose that's a thing of the past. Perl + Moose are as good as or better than any other object oriented language I've worked with.
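For a runnable picture of how the modifiers stack up, here is a minimal sketch using plain Moose instead of FormHandler. My::Processor, My::Role::Audit, My::Form, and the 'trace' attribute are all hypothetical names. The processor just records which steps ran so the ordering can be seen.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for a form base class with a 'process' method.
    package My::Processor;
    use Moose;

    # Records the order in which the steps run.
    has 'trace' => (
        is      => 'ro',
        isa     => 'ArrayRef[Str]',
        default => sub { [] },
    );

    sub validate_form { push @{ $_[0]->trace }, 'validate_form' }
    sub update_model  { push @{ $_[0]->trace }, 'update_model' }

    sub process {
        my $self = shift;
        $self->validate_form;
        $self->update_model;
        return $self->trace;
    }
}

{
    # Hypothetical role: a reusable chunk that hooks into update_model.
    package My::Role::Audit;
    use Moose::Role;
    before 'update_model' => sub { push @{ $_[0]->trace }, 'audit' };
}

{
    # The "form" adds its own modifier and pulls in the role.
    package My::Form;
    use Moose;
    extends 'My::Processor';
    with 'My::Role::Audit';
    before 'validate_form' => sub { push @{ $_[0]->trace }, 'set required' };
}

print join( ' -> ', @{ My::Form->new->process } ), "\n";
```

Neither the base class nor its 'process' method had to change. Both hooks were added from outside, one directly in the subclass and one through a role.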

Wednesday, May 27, 2009

Validating structured data

In an earlier post I discussed how HTML form processing, when sufficiently generalized, leads to processing structured data (by which I mean data in Perl-ish hashes and lists). Here is an example of some structured data:

my $user_record = {
   username => 'Joe Blow',
   occupation => 'Programmer',
   tags => ['Perl', 'programming', 'Moose' ],
   employer => {
      name => 'TechTronix',
      country => 'Utopia',
   },
   options => {
      flags => {
         opt_in => 1,
         email => 0,
      },
      cc_cards => [
         {
            type => 'Visa',
            number => '4248999900001010',
         },
         {
            type => 'MasterCard',
            number => '4335992034971010',
         },
      ],
   },
   addresses => [
      {
         street => 'First Street',
         city => 'Prime City',
         country => 'Utopia',
         id => 0,
      },
      {
         street => 'Second Street',
         city => 'Secondary City',
         country => 'Graustark',
         id => 1,
      },
      {
         street => 'Third Street',
         city => 'Tertiary City',
         country => 'Atlantis',
         id => 2,
      }
   ]
};

Here is the HTML::FormHandler form that defines field validators to process that structure:

{
   package Structured::Form;
   use HTML::FormHandler::Moose;
   extends 'HTML::FormHandler';

   has_field 'username';
   has_field 'occupation';
   has_field 'tags' => ( type => 'Repeatable' );
   has_field 'tags.contains' => ( type => 'Text' );
   has_field 'employer' => ( type => 'Compound' );
   has_field 'employer.name';
   has_field 'employer.country';
   has_field 'options' => ( type => 'Compound' );
   has_field 'options.flags' => ( type => 'Compound' );
   has_field 'options.flags.opt_in' => ( type => 'Boolean' );
   has_field 'options.flags.email' => ( type => 'Boolean' );
   has_field 'options.cc_cards' => ( type => 'Repeatable' );
   has_field 'options.cc_cards.type';
   has_field 'options.cc_cards.number';
   has_field 'addresses' => ( type => 'Repeatable' );
   has_field 'addresses.street';
   has_field 'addresses.city';
   has_field 'addresses.country';
   has_field 'addresses.id';

}

The names of the fields are flattened references to the elements of the structure, with special field types for Repeatable and Compound elements. These types of structures can be used to update a database with DBIx::Class (although there are limits, of course).
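To show what "flattened references" means, here is a sketch in plain Perl that resolves a dotted name against a cut-down version of the structure above. The values_for helper is hypothetical and is not how HTML::FormHandler does it internally. It treats any array it meets as a repeatable element and applies the rest of the name to each item.

```perl
use strict;
use warnings;

# Hypothetical helper: walk nested hashes and arrays by dotted name.
# Arrays are expanded, so the remaining parts apply to every element.
sub values_for {
    my ( $data, $name ) = @_;
    my @nodes = ($data);
    for my $part ( split /\./, $name ) {
        @nodes = map  { $_->{$part} }
                 grep { ref($_) eq 'HASH' && exists $_->{$part} }
                 map  { ref($_) eq 'ARRAY' ? @$_ : $_ } @nodes;
    }
    return @nodes;
}

my $user_record = {
    username  => 'Joe Blow',
    employer  => { name => 'TechTronix', country => 'Utopia' },
    addresses => [
        { street => 'First Street',  city => 'Prime City' },
        { street => 'Second Street', city => 'Secondary City' },
    ],
};

# A compound element yields one value, a repeatable one yields several.
print join( ', ', values_for( $user_record, 'employer.name' ) ),  "\n";
print join( ', ', values_for( $user_record, 'addresses.city' ) ), "\n";
```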

I've left off the actual validators, but they can be defined pretty easily using Moose types or other constraints.

  has_field 'cc_type' => ( apply => [ CCType ] );

It would probably be better to define some of the fields in a role or field, to keep some of the related validation in the same place. The validation of the credit card numbers depends on the type of the credit card, for example. Error messages can be retrieved from an array of error fields or plain error messages, but there need to be more flexible ways of getting those messages. I'm not exactly sure what people will want yet.

Real Soon Now (tm) I'm going to work on a KiokuDB model... So many programming tasks, so little time.

Tuesday, May 19, 2009

Complexity happens

I've heard a programmer's job described as 'managing complexity'. People who like programming tend to like other complex systems, like, say D&D. (They also like fantasy in general... no doubt there's some clever comment there that my cold-fogged brain can't work out.)

And yet, one of the primary goals in a programming project is to keep it simple. (Otherwise known as KISS.) So there's this continual tension between simplicity and complexity. Simplicity in one area may require complexity in another. Creating a simple, easy-to-use API often requires more underlying complexity than a non-intuitive but straightforward interface. Though there are occasionally golden moments when things fall into place and you can achieve both greater simplicity of interface AND greater code simplicity. Just don't hold your breath waiting for them...

Problems that seem simple to start with acquire complexity when you add features, when you handle more use cases. Simple, stupid things like the fact that you don't get anything in CGI parameters for an un-checked checkbox introduce irregularities that flow through to surprising corners of code. Decisions made about what it means to not have a particular parameter or have it set to empty cascade through formerly pristine and clean lines of code.

I'm not quite sure whether this is a complaint or simply a report. Sometimes the logical complexity is fascinating. You poke something to see what happens; you try some new way of factoring to see if that magical moment of greater order occurs... And then sometimes you can hardly stay awake and certainly can't concentrate, and you find yourself surfing Amazon for some new fantasy novel that's a lot more your speed.

Or desperately trying to think of something to write that's somehow remotely related to Perl programming. Because you foolishly committed to DOING THAT in a moment of insanity.

Let me just repeat that key phrase a couple of more times, in case foolish sites that count dumb things and claim they mean something aren't paying attention: Perl programming Perl programming Perl programming

Thanks. I'm done blathering now. I think I've got enough of a word count.

Tuesday, May 12, 2009

Defining the form processing problem

This week I've been discussing the goals of a form processor and working on adding support to HTML::FormHandler for multiple rows. When I first started doing web programming and learning Perl 18 months ago, I looked for a module that did what I wanted, and thought "it can't be that hard". Once I got into it, the problem started to look more and more complex. You need to map different representations of data onto each other, each level having different kinds of relationships to each other, and yet maintain the accessibility of the information from multiple representations.

Let's start with a Perl-ish structure of hashes and lists:


{
   addresses => [
      {
         city => 'Middle City',
         country => 'Graustark',
         address_id => 1,
         street => '101 Main St',
      },
      {
         city => 'DownTown',
         country => 'Utopia',
         address_id => 2,
         street => '99 Elm St',
      },
      {
         city => 'Santa Lola',
         country => 'Grand Fenwick',
         address_id => 3,
         street => '1023 Side Ave',
      },
   ],
   'occupation' => 'management',
   'user_name' => 'jdoe',
}

This structure represents a user with a user name, an occupation, and an array of addresses. This example includes an array of hashrefs, because that's what I'm working on this week... The first problem is that this structure can't be directly represented in CGI/HTTP parameters, since they don't do nested hashrefs. So in order to get this structure into and out of an HTML form, we flatten it into a hashref with names that can be munged by something like CGI::Expand:

my $params = {
   'addresses.0.city' => 'Middle City',
   'addresses.0.country' => 'Graustark',
   'addresses.0.address_id' => 1,
   'addresses.0.street' => '101 Main St',
   'addresses.1.city' => 'DownTown',
   'addresses.1.country' => 'Utopia',
   'addresses.1.address_id' => 2,
   'addresses.1.street' => '99 Elm St',
   'addresses.2.city' => 'Santa Lola',
   'addresses.2.country' => 'Grand Fenwick',
   'addresses.2.address_id' => 3,
   'addresses.2.street' => '1023 Side Ave',
   'occupation' => 'management',
   'user_name' => 'jdoe',
};
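
The mapping between the two shapes can be sketched in plain Perl. This is only an illustration of the idea, not how CGI::Expand or FormHandler do it: `flatten` and `expand` are hypothetical helper names, and the sketch assumes that a path segment made only of digits is an array index.

```perl
use strict;
use warnings;

# Hypothetical helper: nested structure -> flat hashref with dotted keys.
sub flatten {
    my ( $data, $prefix, $flat ) = @_;
    $flat ||= {};
    if ( ref $data eq 'HASH' ) {
        flatten( $data->{$_}, defined $prefix ? "$prefix.$_" : $_, $flat )
            for keys %$data;
    }
    elsif ( ref $data eq 'ARRAY' ) {
        flatten( $data->[$_], defined $prefix ? "$prefix.$_" : $_, $flat )
            for 0 .. $#$data;
    }
    else {
        $flat->{$prefix} = $data;
    }
    return $flat;
}

# Hypothetical helper: flat hashref with dotted keys -> nested structure.
# Assumption: an all-digit segment means "array index".
sub expand {
    my ($flat) = @_;
    my $root;
    for my $name ( keys %$flat ) {
        my $slot = \$root;
        for my $seg ( split /\./, $name ) {
            # Taking a reference autovivifies the array or hash as needed
            $slot = $seg =~ /^\d+$/ ? \$$slot->[$seg] : \$$slot->{$seg};
        }
        $$slot = $flat->{$name};
    }
    return $root;
}

my $nested = {
    user_name => 'jdoe',
    addresses => [
        { city => 'Middle City', country => 'Graustark' },
        { city => 'DownTown',    country => 'Utopia' },
    ],
};

my $flat = flatten($nested);
print "$_ => $flat->{$_}\n" for sort keys %$flat;

my $back = expand($flat);
print $back->{addresses}[1]{country}, "\n";    # prints "Utopia"
```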

A corollary of this is that the form processing program should be able to take in structures of either type, and output at least the flattened structure so that it can be used to fill in the form with current data. Then we consider where the initial data is going to come from. Often the data is in a database, so now we have the problem of taking a database object, like a 'user' row with a relationship pointing to a number of addresses, and converting it to the flat CGI hash. And we also want to go in the opposite direction, converting a flat CGI hash into a structure suitable for putting back into the database. The data in a database (or other data source) isn't necessarily in a form suitable for displaying as strings in an HTML form. So there are inflation and deflation steps. The database structure and data must be mapped to a CGI structure.

Then there's the question of how to define the validators which are the main purpose of this exercise. The data that's input from the parameters passed in must be validated (and/or inflated). If there are errors, the program has to present that information in such a way that an HTML form can be constructed with the errors presented to the user for correction.

There are a number of choices to be made about how to define the fields to allow these validations and conversions to happen in a simple and regular fashion. A common solution is to treat the nested elements as subforms, but they are not actually separate forms, they are simply ... nested elements.

The way that feels best to me is to allow the definition to be done in one 'form' class, which represents one HTML form.

   package HasMany::Form::User;
   use HTML::FormHandler::Moose;
   extends 'HTML::FormHandler::Model::DBIC';

   has_field 'user_name';
   has_field 'occupation';

   has_field 'addresses' => ( type => 'Repeatable' );
   has_field 'addresses.address_id' => ( type => 'PrimaryKey' );
   has_field 'addresses.street';
   has_field 'addresses.city';
   has_field 'addresses.country';

This flat representation matches the flatness of the HTML form. The field names with dots give information to allow the creation of nested elements. The field names are also related to the database object, where 'addresses' is the DBIC relationship accessor, and street/city/country are columns in the address table. In practice there would be more to these field definitions, since there would be validators associated with them, but I'll leave them out for now to simplify the problem.

Constructing the arrays is tricky. The form object doesn't know how many elements are in the array until it is handed the information from the database or the parameters from the form. So the arrays of address fields must be cloned from the fields that have been defined and put into some structure to hold the definitions and the data. There is a choice of structures here. We can either match the

 'addresses.1.country' => 'Utopia'

format, or match the

 { addresses => []}

structure. These structures have different numbers of levels, since we have to add the ".1." level to indicate the array. It could be set up either way and mapped to the other. For the purposes of constructing HTML, however, you want to have some place to act as a container for an individual address so that it can be wrapped in a div, so the structure with the numbered level seems more useful. So now the 'HasMany' field container will create an array of field container objects (instances) that contain an address record.

Once constructed and filled, the nested fields can be accessed with

 $form->field('addresses')->field('1')->field('city')->value

or using the shortcut method

 $form->field('addresses.1.city')->value

There's something awkward about this, because it's oddly modal. The field structures are different depending on whether the form has been filled out with data or not. The implementation which I have working right now has an array of fields (the same as other non-has_many compound fields) which is cloned into subfields which are created on the fly. It would be possible, I suppose, to have a dummy subfield to contain the field definitions. I'll have to think about that one...
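
To make the two access styles concrete, here is a minimal sketch of a tree of nodes where a dotted name is just shorthand for walking down one level per segment. `My::Node` and its methods are hypothetical stand-ins written for this example; they are not FormHandler's field classes or its actual lookup code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a form or a container field.
package My::Node;

sub new {
    my ( $class, %args ) = @_;
    return bless { value => $args{value}, children => {} }, $class;
}

sub add {
    my ( $self, $name, $child ) = @_;
    $self->{children}{$name} = $child;
    return $child;
}

sub value { $_[0]{value} }

# Look up a child by name; a dotted name descends one level per segment.
sub field {
    my ( $self, $name ) = @_;
    my $node = $self;
    for my $seg ( split /\./, $name ) {
        $node = $node->{children}{$seg} or return;
    }
    return $node;
}

package main;

my $form      = My::Node->new;
my $addresses = $form->add( addresses => My::Node->new );
my $second    = $addresses->add( 1 => My::Node->new );
$second->add( city => My::Node->new( value => 'DownTown' ) );

# Both spellings reach the same leaf node
print $form->field('addresses')->field('1')->field('city')->value, "\n";
print $form->field('addresses.1.city')->value, "\n";
```

The numbered level exists here only because a node was added for it, which is the modal quality described above: before any data arrives there is no '1' to look up.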

In order to interface with the database object in a regular, MVC-ish way, the form program should output structured data that can be saved by the database model. Inflations may be associated with this.

So in the end, it seems like what you end up with is a program which will take structured data, process it and validate it, and return structured data. This is a much more general problem than it first appears when "all" you want to do is process an HTML form.

Monday, May 4, 2009

Moose beginners: clear, predicate, and triggers

One of the nice things about Moose is that it adds another state for your instance variables -- whether or not the variable is actually set. With standard Perl variables you can check whether the variable is defined or undefined and true or false. In Moose the state of being undefined is different from the state of not being set. In order to take advantage of this additional state you need to use the 'clearer' and 'predicate' methods for your attribute:

   has 'my_var' => ( isa => 'Str|Undef', is => 'rw', 
          clearer => 'clear_my_var',
          predicate => 'has_my_var' );

Setting 'my_var' to 'undef' is different than doing 'clear_my_var'. If you set it to undefined:

   $my_obj->my_var(undef);

then the predicate 'has_my_var' will return true. If you check for truth in the usual way:

   if( $my_obj->my_var ) { ... }

false will be returned both when the attribute has been set to undefined and when it has been cleared. So you have to use the predicate method:

   if( $my_obj->has_my_var ) { ... }

The predicate method will return true if 'my_var' has been set to undefined, and false if 'my_var' has been cleared.
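
Putting the pieces together, a small runnable sketch shows the states side by side. The class name `My::Thing` and the `state_of` helper are made up for this example; the attribute definition is the one from above.

```perl
package My::Thing;
use Moose;

has 'my_var' => (
    is        => 'rw',
    isa       => 'Str|Undef',
    clearer   => 'clear_my_var',
    predicate => 'has_my_var',
);

package main;
use strict;
use warnings;

# Hypothetical helper: report whether the attribute is set,
# and whether its current value is defined.
sub state_of {
    my ($obj) = @_;
    return join ', ',
        ( $obj->has_my_var     ? 'set'     : 'not set' ),
        ( defined $obj->my_var ? 'defined' : 'undefined' );
}

my $obj = My::Thing->new;
print state_of($obj), "\n";    # not set, undefined

$obj->my_var('hello');
print state_of($obj), "\n";    # set, defined

$obj->my_var(undef);
print state_of($obj), "\n";    # set, undefined

$obj->clear_my_var;
print state_of($obj), "\n";    # not set, undefined
```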

An important piece of related behavior is that a trigger on an attribute is called when you set it, whether or not you are setting it to undefined, but the trigger is not called when you clear it. So if you have an object in which only one of two variables should have a value, you can create triggers for both of them and use clear to un-set the other variable.

   has 'my_var' => ( isa => 'Str', is => 'rw',
           clearer => 'clear_my_var',
           predicate => 'has_my_var',
           trigger => sub { shift->clear_my_other_var }
   );
   has 'my_other_var' => ( isa => 'Str', is => 'rw',
           clearer => 'clear_my_other_var',
           predicate => 'has_my_other_var',
           trigger => sub { shift->clear_my_var }
   );

If you tried to set 'my_var' to undef in the trigger, you would end up in an infinite recursion, since each attempt to set the other variable would cause the trigger in that variable to fire. Another issue is whether or not you should allow a particular attribute to be set to undefined. Some attributes may need an explicit undefined state, in which case you must set your isa to 'Str|Undef', but if you don't actually need an undefined state then you are better off not allowing it and using a predicate, in which case you must clear the variable to un-set it since setting it to undef will fail.
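
Here is a short runnable sketch of the mutually exclusive pair in use, assuming a made-up class name `My::Choice`. It shows the trigger clearing the other attribute, and the type constraint rejecting undef when the isa is plain 'Str'.

```perl
package My::Choice;
use Moose;

has 'my_var' => (
    is        => 'rw',
    isa       => 'Str',
    clearer   => 'clear_my_var',
    predicate => 'has_my_var',
    trigger   => sub { shift->clear_my_other_var },
);
has 'my_other_var' => (
    is        => 'rw',
    isa       => 'Str',
    clearer   => 'clear_my_other_var',
    predicate => 'has_my_other_var',
    trigger   => sub { shift->clear_my_var },
);

package main;
use strict;
use warnings;

my $obj = My::Choice->new;

$obj->my_var('first');
$obj->my_other_var('second');    # the trigger clears my_var

print 'my_var is ',       ( $obj->has_my_var       ? 'set' : 'not set' ), "\n";
print 'my_other_var is ', ( $obj->has_my_other_var ? 'set' : 'not set' ), "\n";

# With isa => 'Str', setting undef dies on the type constraint,
# so clearing is the only way to un-set the attribute.
my $accepted = eval { $obj->my_var(undef); 1 };
print 'setting undef was ', ( $accepted ? 'accepted' : 'rejected' ), "\n";
```

Because the clearers do not fire triggers, the two attributes can clear each other without looping.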

Monday, April 27, 2009

Compound fields in forms

Many Perl form packages require nested forms of some sort for compound fields, but there are problems with that model. A form class--whether constructed from config settings in a file or declared in a Perl class--has lots of other attributes besides the fields. It made more sense to me to have a form with nested fields, where the fields can form a tree. That way attributes of the form do not have to be repeated in sub-forms. It also allows declaring subfields in the containing form, if that's what makes sense for this particular case. Here's an example of how to do a form with a nested field in HTML::FormHandler:

   package MyApp::Form;
   use HTML::FormHandler::Moose;
   extends 'HTML::FormHandler';

   has_field 'address' => ( type => 'Compound' );
   has_field 'address.street';
   has_field 'address.city';
   has_field 'address.zip' => ( type => '+Zip' );
   1;

The subfields are indicated by prefacing the field name with the name of the compound field and a dot. This form class will create a form with an 'address' field that contains an array of fields.

You can achieve the same thing by creating an 'Address' field:

   package MyApp::Field::Address;
   use HTML::FormHandler::Moose;
   extends 'HTML::FormHandler::Field::Compound';

   has_field 'street';
   has_field 'city';
   has_field 'zip' => ( type => '+Zip' );
   1;

(Where '+Zip' is a field class that you have written...) Then this field can be used in a form by simply doing:

   has_field 'address' => ( type => '+Address' );

This feels simple and straightforward and, for me at least, matches my conceptual model of a form better than nested forms.

Thursday, April 23, 2009

HTML::FormHandler

When I started on a Catalyst project a year and a half ago I picked Form::Processor for form handling. I liked the architecture and I liked having the form in a Perl class so that I could do more-or-less anything that I wanted to with it. But then there was Moose, and I didn't want to have to switch back and forth between object systems. Plus you can do lots of shiny things with Moose. It's like a big Christmas present with lots of little parts that you can play with for ages...

So I converted Form::Processor to Moose. But it was tough to maintain back compatibility and still use all that Moose-y goodness. In the end it seemed rational to start fresh with a new project: HTML::FormHandler. It has many excellent features, such as being able to define the fields declaratively, and there are plans for many more. Moose continues to be a pleasure to work with. It makes features easy that would have been horrendously difficult without it. Of course, Reaction does wonderful things with Moose too, and is in many ways a more complete solution. But I think that HTML::FormHandler can still fill a niche. This is Perl after all--there has to be more than one way to do things.