Saturday, 13 June 2009

The Future of Perl 5

Bjarne Stroustrup is quoted as having said: "It is my firm belief that all successful languages are grown and not merely designed from first principles." Being a perlist (or a perlian? a perlard?), I can only agree. Perl 5 was grown, and continues to grow, although most of its growth is now outsourced to the CPAN (with modules like Devel::Declare, Method::Signatures, Moose, MooseX::*, and so on).

Thus, I see little added value in adding new, complex syntax to Perl 5. My preference goes to making Perl 5 easier to extend. (Here you can feel the influence of the Perl 6 approach.) The core's job is not to duplicate and override the efforts made on CPAN to extend Perl, but to encourage them. CPAN is, after all, the killer app of Perl.

The big advantage of outsourcing syntax experiments to CPAN is that the community can run many experiments in parallel, and have those experiments reach a larger public (because installing a CPAN module is easier than compiling a new perl); it also makes it possible to prune the failed approaches more quickly. This is a way to optimize and accelerate innovation.

So, a patch to add a form of subroutine parameter declaration, or to add some syntactic sugar for declaring classes, is probably not going to be included in Perl 5 today. Such patches would extend the syntax, but not help extensibility -- actually they would hinder it, imposing and freezing a single core way to do things. And saying that there is only one way to do it is a bit unperlish. It's basically for the same reason that we don't add an XML parser or a template preprocessor to the core: no single, one-size-fits-all solution has emerged.

Now, if one single way of declaring subroutine parameters or classes emerges and stabilizes, it makes sense to add it to the core, and even to re-implement it in C, for efficiency reasons. But whatever is added will also impose some backwards compatibility requirements on future core releases: we must be careful to avoid getting stuck with useless or ugly syntax. That also means that new syntax can sometimes be added for purely aesthetic reasons (like the stacking of filetest operators in 5.10).
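For readers who have not met the stacked filetest operators mentioned above, here is a minimal sketch of what that piece of sugar looks like next to the older spelling. The temporary file and the variable names are only there for illustration; it assumes a perl 5.10 or later.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so the tests have something real to look at.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

# Older spelling: test once, then reuse the stat buffer through "_".
my $old_style = (-r $path && -w _ && -f _) ? 1 : 0;

# 5.10 spelling: the same three tests, stacked on one operand.
my $stacked = (-f -w -r $path) ? 1 : 0;

# A path that does not exist fails the stacked test as a whole.
my $missing = (-f -w -r "$path.does-not-exist") ? 1 : 0;

print "old=$old_style stacked=$stacked missing=$missing\n";
```

Both spellings ask the same question; the stacked form changes nothing in what Perl can do, only in how it reads.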

Those general considerations don't mean, however, that all new syntax is to be ruled out from the core: if some new syntax is introduced in a way that improves the internals or can be taken advantage of by CPAN modules, it's worthwhile to include. For example, we can consider adding a way to declare methods differently from ordinary subroutines (possibly via a new keyword method that would supplement sub), so that we can forbid calling them as subroutines, or by magic-goto from a subroutine. We could also add a way for a subroutine to know whether it has been called as a method or as a subroutine. That kind of thing. Improving the possibilities of introspection and self-checking of Perl does improve its extensibility.
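To make the method-versus-subroutine point concrete, here is a small sketch of why a subroutine cannot currently tell how it was called. The Greeter package and its hello routine are hypothetical names invented for this illustration.

```perl
use strict;
use warnings;

package Greeter;

# A hypothetical class method. It only ever sees @_, and both call
# styles below deliver exactly the same list there.
sub hello {
    my @args = @_;
    return join ',', @args;
}

package main;

my $as_method = Greeter->hello('world');             # method call
my $as_sub    = Greeter::hello('Greeter', 'world');  # plain subroutine call

print "method: $as_method\n";
print "sub:    $as_sub\n";
print "indistinguishable from inside hello()\n" if $as_method eq $as_sub;
```

In both cases hello() receives the class name followed by 'world', so nothing inside the routine can forbid one style or detect the other; that is the kind of gap a distinct way of declaring methods could close.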

What about, then, the future directions of Perl 5? The Big Picture? The priorities? The Plan for 5.12 and beyond?

New syntax is nice and shiny. But for me, there are more important and urgent features that are needed now. New syntax introduces new bugs, and, Perl 5 being what it is, new edge cases; we should aim to reduce those instead.

In my view (I could even use the word vision, but I won't), the future of Perl 5 should be mostly organised around two directions:


  • clean-up and orthogonalisation

  • giving more facilities for extensibility



To elaborate on that a bit:

Clean-up and orthogonalisation: the big TO-DO here for 5.12 is the clean-up of the internal handling of UTF-8 strings and the abstraction leakage that ensues in some corners of it. (This was referred to as the Unicode bug by many people.) Briefly, perl builtins like lc() or uc(), or regex metacharacters like \w, have a behaviour that depends on whether the string they operate on has the internal UTF-8 flag set. This shouldn't be the case -- that flag should be kept strictly internal. Fixing this will make life easier for many people who handle Unicode strings with Perl, but it will break backwards compatibility. That's why it's not planned for a 5.10.x release.
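A minimal sketch of the Unicode bug described above, assuming a perl where no pragma or locale that changes string semantics is in effect. The same one-character string is stored twice, once without and once with the internal UTF-8 flag (forced here with utf8::upgrade), and the builtins are asked the same questions of both. The variable names are mine.

```perl
use strict;
use warnings;

# U+00E9 (e with acute accent), stored as one byte, UTF-8 flag off.
my $native = "\xE9";

# The same character, forced into the internal UTF-8 representation.
my $upgraded = $native;
utf8::upgrade($upgraded);

# At the Perl level the two are the same string...
my $same = ($native eq $upgraded) ? 1 : 0;

# ...yet \w and uc() answer differently depending on the flag.
my $native_word   = ($native   =~ /\w/) ? 1 : 0;
my $upgraded_word = ($upgraded =~ /\w/) ? 1 : 0;
my $native_uc     = sprintf "%02X", ord uc $native;
my $upgraded_uc   = sprintf "%02X", ord uc $upgraded;

print "same string:   $same\n";
print "flag off: \\w=$native_word uc=$native_uc\n";
print "flag on:  \\w=$upgraded_word uc=$upgraded_uc\n";
```

With the flag off, the character is not treated as a word character and uc() leaves it alone (E9); with the flag on, it matches \w and is uppercased to C9. Two strings that compare equal behaving differently is exactly the leak that needs plugging.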

Another field that could be improved is the behaviour of autovivification, which is currently not extremely consistent. Autovivification is also sometimes annoying -- a common example is a test like exists($h->{a}{b}), which autovivifies $h->{a}.
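The exists() annoyance is easy to reproduce. This sketch shows the side effect, followed by the usual manual workaround of testing each level in turn (the variable names are only illustrative):

```perl
use strict;
use warnings;

my $h = {};

# Merely asking about the inner key creates the outer one.
my $found       = exists($h->{a}{b}) ? 1 : 0;
my $side_effect = exists($h->{a})    ? 1 : 0;

# Workaround: check each level before descending into it.
my $g = {};
my $found_safely = (exists($g->{a}) && exists($g->{a}{b})) ? 1 : 0;
my $still_empty  = exists($g->{a}) ? 0 : 1;

print "found=$found, but \$h->{a} now exists: $side_effect\n";
print "found_safely=$found_safely, \$g still empty: $still_empty\n";
```

The first test reports that the inner key is absent yet leaves an empty hash behind in $h->{a}; the guarded version gives the same answer without touching $g.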

Either of those two cleanups, once implemented, would be important enough to deserve a 5.12.0 release (at least that's what I'm thinking this week).

Giving more facilities for extensibility: providing more hooks to module writers. Perl is pretty hookable already, but the creativity of module writers has no limits. Vincent, for example, was talking yesterday about making the internal function op_free(), used to free code from memory, hookable -- which would help for some evil manipulations of eval(), the details of which I don't remember at the moment. More generally, a better API for manipulating the optree internals would be useful. I would like to have more hooks in the tokenizer as well -- Devel::Declare, for example, could benefit from this.

Once those goals are achieved, it will be time to add new syntax, on steadier grounds. The core will never have an XML parser, because the diversity of needs for parsing XML makes the diversity of modules necessary and welcome; this is not true for object models -- many competing object models are not necessarily a good thing, especially inside the same application. But large-scale experimentation on CPAN enabled the community to make Moose much better than whatever a handful of P5Pers could have designed by themselves.

And now I shall STFUAWSC.

3 comments:

Michael said...

At first it would seem that some of your arguments are in opposition to what chromatic is writing about on his blog, but I don't think they really are.

You talk about why having a patch for "syntactic sugar for declaring classes" is probably not going to be included in Perl 5 today. But then you later say that "improving the possibilities of introspection and self-checking of Perl does improve its extensibility." That's what I see happening with a declarative class syntax and with subroutine parameters. Make things more declarative. Methods are different from subroutines and classes are different from packages. We've known these differences for a long time and Moose is making it easier to declaratively mark them as different.

Maybe it's just a matter of time or incubation for you, but I don't think Moose is going anywhere. The foundations are strong and its influence is spreading faster than any Perl module I've ever seen. So having support for a declarative class syntax that works with Moose and which Moose could build on to make other things would be a welcome thing in Perl. And in my opinion, now would be a good time. Especially since it seems that Perl 5 development happens so slowly, if it was started now we wouldn't see it for another 2-3 years in production :)

Robin said...

I've thought for years that formal subroutine parameters are a prime example of a feature that can (and therefore should!) be implemented in a way that improves the internals. If we had formal parameters then it should be possible to just leave the params on the stack and access them there directly from the subroutine, rather than having to do the messy and inefficient business of bundling everything up into @_.

I strongly suspect that lack of formal parameters is the main reason calling a sub is so irredeemably slow. At least that was the conclusion I reached the last time I thought about how it could be speeded up.

To me formal parameters are a prime example of something that should be implemented in core, not only because of the real prospect of efficiency gains but also because it's so syntactically uncontroversial.

Theory said...

I think that the comparison between “adding new syntax” and “adding an XML parser” is not valid. An XML parser is not, after all, syntax. chromatic's class patch, OTOH, is new syntax, syntax that's pretty unquestionably stable (hence the proliferation of modules in the class namespace, mine included). It seems to me that, once a minor issue about making the class keyword work at runtime is worked out, the Moose folks would be happy. And as someone who wrote a module that creates a class(&) function, all I can say is, yes please!

So I think that there is unambiguous, extremely stable syntax that could be added to Perl 5, and very few would complain, IMHO.

—Theory