Friday, 10 July 2009

The strictperl experiment

The strictperl experiment by chromatic, and the strict-by-default patch by Steffen Mueller, will help me explain what approaches are right or wrong in dealing with Perl 5 evolutions.

strictperl is, as chromatic puts it, unilaterally enabling strict for all code not run through -e. Unsurprisingly, that breaks a lot of code. And I mean a lot, as in "most". Not only on the DarkPAN (I seldom use strict for scripts of four lines), on the CPAN (I have modules that will break under strictperl), but in the core itself. chromatic mentions Exporter and vars as modules that break under it. Well, of course they break. Their very purpose is to manipulate symbols by name, which is exactly the kind of thing that strict prevents. (Incidentally all code that uses Exporter or vars can't run under strict perl, and I think that's most of the Perl code out there actually.) That is why chromatic added a separate makefile target to build strictperl: if the patch was going in perl itself, perl couldn't be even built! That's how broken it is.

It's certainly interesting to dive in Perl's internals, but to experiment with enabling strict on a global level, a simple source filter would have been sufficient. One could have written one in ten minutes, including time to look up the documentation, and it would have been immediately obvious afterwards why it was a bad idea. Except to chromatic, who still thinks it's (quoting) a feature I believe is worth considering.

So now let's look at Steffen Mueller's solution, which is strictperl done right, and which is already in bleadperl.

Currently, with bleadperl, a simple use 5.11.0 will enable strictures in the current lexical scope, like use strict would do, but implicitly. (It will also enable features, but that's unrelated.) Look:

$ bleadperl -e 'use 5.11.0; print $foo'
Global symbol "$foo" requires explicit package name at -e line 1.
Execution of -e aborted due to compilation errors.

Moreover, like in chromatic's strictperl, a simple -e or -E won't enable them, because strictures are not wanted for one-liners.

That way, we don't break code that doesn't want strictures, (actually we don't break anything at all that doesn't require 5.11.0 already, it's completely backwards compatible), but it's still removing some boilerplate for default strictness if you're requesting a perl recent enough.

Steffen's patch itself is not very recent, but I didn't apply it to bleadperl immediately, because I disagreed with the implementation. As you can see in the sources, it uses Perl_load_module to load the file strict.pm and execute its import() method. That's very slow. I applied it when I got around to make it faster with the next commit, which replaces that method call by just flipping three bits on an internal integer variable.

All this goodness is coming to you in Perl 5.12 when someone will be willing to take the pumpking's hat.

Next time, I'll explain why enabling warnings by default is not a good idea.

Monday, 6 July 2009

Resigning

So I'm resigning from my role of Perl 5 pumpking. That doesn't mean that I'm not proud of what I did for Perl 5 in the past, or that I don't stand behind my choices anymore.

Let's begin by inspecting some of the core ideas that drove the chromatic rants since the last six months, and examine why they are a sure recipe for failure.

First, the regular snapshots, labeled releases. When you cannot control how much development time you'll have for the next release, nor the list of hot topics that the contributors will volunteer to contribute to, you don't release easily; at the best, you snapshot. Which is fine for alpha-grade software as parrot, as I've already said; but not for production-ready software. Did I point out that Perl was a very complex project with a lot of interdependent parts? Care is essential.

(As a side-effect, it might quite be easy to predict the release dates of parrot for the next century, but not the date where it will be used in production. If that happens at all, that is.)

Secondly, a lack of feature plan for the next versions of Perl. Again, this is not something that you can control in a volunteer-based project. I've said many times that I'd like 5.12 to solve some inconsistencies on the handling of UTF-8 strings with regard to case-insensitivity comparisons and case-changing operators. That would be a backwards incompatible change and would break code -- contrary to the gross mischaracterisations that chromatic presents as truth, P5P is certainly not opposed to backwards incompatibility. But to achieve that, you need a regex + UTF-8 guru, or somebody willing to invest countless hours into becoming one. If someone comes with another itch to scratch, sends patches, implements a neat new feature, then that would warrant a shiny 5.12 without the UTF-8 revamp.

Thirdly, the "untested code is not worth caring about" fallacy. One can try to wave away the DarkPAN into oblivion with a blog post, but if Perl still exists for more important purposes than the amusement of a few computer language geeks, it's because of the DarkPAN. And not all the DarkPAN code is regression-tested, or even testable. Quick glue scripts, crontabbed email reports, network management helpers, post-commit hooks: they cost too much to test, because the environment they run in is orders of magnitude more complex than their internal logic. I've code out there in the DarkPAN that deals with the relative replication delays on two pairs of master and slaves databases. That's not testable. But this code is critical.

Fourthly, the main argument chromatic has to back up his proposals for Perl 5 is "we should change defaults and add syntactic sugar without thinking about the consequences because that will allow me to rip off three lines of boilerplate code." The hints I gave at the time on P5P that maybe adding syntax would be a better idea if there was some semantics behind it were then aggressively relabelled as reactionary. (As a coincidence, modernperlbooks.com started its anti-P5P propaganda shortly afterwards.)

At the contrary, every language extension that went into 5.10 was there because it solved an actual problem, in a more efficient way than the CPAN could provide, and sufficiently thought out (I hope) to avoid being sorry having to maintain it in a couple of versions from now. (But now I'm not so sure about UNIVERSAL::DOES() anymore.)

Fifthly, the way chromatic has arbitrary chosen one particular regression from 5.8 to 5.10 and presented it as if it was as serious as a remote root zero-day vulnerability, willingly ignoring every other regression (or improvement) in the hope of making a point. Note, that regression wasn't even a bug in the language, something that could have made Perl programs misbehave or segfault. It was a performance regression. In a world where most programs are I/O bound anyway. And there were many other more interesting regressions to choose from, I won't hide that. So why this one? Certainly because of its marketing value. It was certainly sexier than a more severe regression on an obscure feature he would have needed to explain to his readers.

At the end, I've had enough of those gratuitous attacks. I've always accepted and even encouraged criticism about my decisions as a pumpking, but only as long as there were based on technical arguments, not on marketing slogans repeated hysterically by someone who remains deaf to any form of discussion. So I'm stepping down from my role of pumpking. I'm burnt out. I don't want to have to justify myself again and again in front of all this.

Moreover, if my resignation can help Perl 5, it will be for the greater good. There are many committers and knoledgeable contributors, and they'll probably start reviewing the patches to apply a bit more: avoiding bottlenecks is good. The release process will be documented and distributed (and thanks to Dave Mitchell for having worked a lot on this) and the whole bus factor of Perl 5 will go up. Don't worry, I'll still be around to ensure that the future of Perl 5 is not handed to the marketroids, and to produce the occasional patch. But I'm withdrawing from the front line. Which will have, I'm sure, positive effects in the end.

Perlbuzz is no longer useful

I've been since a long time disappointed at perlbuzz, but here's another blatant proof of their lack of professionalism. This supposed "news roundup" post points to several articles by chromatic and his fanboys, and supplement them by pure FUD, like the retransmission of a twitter message saying "Is getting code into the !perl core a fight against inertia and petty hostility?", linking to an anonymous comment on a random web forum. If this is not calumny, I don't know what it is. It's just unfair to see perlbuzz relaying it without checking.

However, I note that none of the rebuttals written by me or by others elsewhere have deserved a mention.

Someone is actively trying to damage the community image of P5P out there, and perlbuzz is helping him. Willingly?

Sunday, 5 July 2009

Time-based releases in open source

Whenever I hear someone saying we should have regular releases, I hear we should release when Venus enters Pisces. Because, when you have so many uncontrolled parameters to deal with, sticking to a predefined calendar boils down to superstition.

Time-based releases actually make sense in one case : when you're selling your software. That way, you can ensure a regular stream of revenue via upgrades. Moreover, if you're a commercial entity, you probably are already paying full-time a couple of developers, so you can predict how much time you'll spend on the next version during the next few months.

All those premises are not true for volunteer-based open source projects, such as Perl.

Actually, time-based releases still make sense if the project is in an alpha stage and is being prototyped. In this case, however, it's more accurate to speak about snapshots than releases, because such a "release" is only a way to publicize the newest changelog. That's the case for parrot, for example: it doesn't aim for backwards compatibility, or even stability, and has no user base. That would be also the case for the development branch of Perl 5 (currently 5.11), although nobody got around to do it since the migration to git -- I agree that would be a nice improvement.

In the specific case of Perl 5, even if you wanted to try to have regular releases, more factors would complicate the task. You can't release bleadperl at any point and call it stable, even when all dual-life modules are perfectly in sync with the CPAN and when all tests pass on all platforms and all configurations. That's because you also need, for a new major release, to have a consistent set of features. For example, it would not have made sense to release 5.10.0 with the lexical $_ variable, but without the (_) prototype.

With the upcoming release of 5.10.1 by Dave Mitchell, we're trying to improve the release process to be less of a pain for the pumpkings. Changes to dual-life modules will no longer be accepted in the core unless they are put on CPAN first. Module::Corelist will refer to tags in the git repository instead of perforce change numbers. The perldelta will hopefully be written a bit more incrementally.

I plan to copy 5.10.1's perldelta in bleadperl when it's done, then to supplement it by blead-only changes. Afterwards I'll have more elements to judge whether we're ready for a 5.12.0 or not -- and that decision will be motivated by the contents of the release, not by the time of the year.

Saturday, 4 July 2009

The DarkPAN matters

There is currently some FUD going about in some parts of the Perl community about why we should break Perl 5 backwards compatibility. A short blog entry, schmarkpan, is a good example of the trend: loud, noisy, but clueless and devoid of any content.

The author writes, DarkPAN was discussed quite a bit at YAPC::NA. But really, what is it? How is it defined? Well, DarkPAN is just a slang word to express the fact that Perl 5 is now currently used all over the world in production and sometimes critical systems: websites with high availibility, financial systems, operating system build processes, and so on. That would also include the so-called GreyPAN, all Perl OSS applications that are not on CPAN. So, asking this question, even rhetorically, makes you sound as if you didn't knew that Perl is actually used in production.

He continues, Who are these people that have a vested interested in Perl and yet do not participate? For a start, I would like to point out that Mr Perez himself is someone with apparently some interest in Perl, but who does not participate in Perl 5 at the slightest. I might be biased, but I tend to think that the regular contributors of perl5-porters are a lot more likely to have informed opinions about the Perl 5 core development than people who don't even read it.

I think attempting to appease a faceless entity with no defined boundaries has fail written all over it. It is fear mongering. Who's the fear monger here? And who's trying to be realistic by attempting to release software with some quality expectations, notably by making the upgrade process seamless and introducing as little bugs as possible? It's a well-known political technique to accuse the opposite party of being guilty of one's own defects. But it doesn't make you right.

If we break it, they can choose not to upgrade. Several fallacies in one single sentence. First, we should not consider breaking it. Breakage happens, but our goal is to avoid it. Secondly, breakage is detected after upgrades, usually (or it would have been avoided). Third, this not only applies to Perl 5, but also to every open source project widely used (in the Perl world, DBI or LWP come to mind -- upgrading them is not a trivial operation either.) This is a matter of common sense.

Then, suddenly and without warning, the discussion shifts from breaking backwards compatibility to simply breaking Perl: What about bug fixes?, I hear you say. That is the joy of open source. There are lots of unemployed hackers out there that would be willing to backport those fixes for you. Oh heavens forbid! Actually paying open source hackers to write code! The sky is falling! Wishful thinking at its finest. Mr Perez might have found a tuit tree in the middle of a magic garden, but I have never seen "lots of unemployed hackers" who are at the same time fluent in the Perl internals, and lots of generous donators that fight over the privilege of paying them to work on Perl. Actually, there is one, currently: Dave Mitchell, who is paid to get 5.10.1 released. Yes, paid: the sky is falling.

Perl does not belong to DarkPAN. It belongs to us who participate in its wellbeing. In other words, "fuck the users". But we're not in the land of toy projects here. We can't bless any change (or revert it two releases afterwards) just because it looks shiny at the time. This is not the parrot you're looking for. People do use Perl 5 for serious things, and we must ensure that a certain level of quality is preserved.

But have you, Mr Perez, even skimmed through a perldelta manpage, to look at the list of incompatible changes? Because there's actually a lot of them, in spite of what you seem to believe. I had code that broke because strict is stricter in 5.10 (as mentioned in the manual that you didn't bother to read.) ghc broke because we removed the magical variable $*. It's not like we fear incompatible change. We just don't like to introduce incompatible changes just for the sake of being incompatible.

I say we make the decisions best for our language of choice. But code changes don't happen because someone yells about it on a blog. They happen because someone actually writes code. If you really want to see something changed in Perl 5, write a patch and come discussing it on P5P. Although you might not have an idea about what you might want to see changed in Perl 5, since your empty, uninformed and slightly inconsistent rant fails to identify any precise problem.

Saturday, 20 June 2009

The new \N regex escape

At the French Perl Workshop 2009, a talk on Perl 6 parsing by Stéphane Payrard reminded me of the existence of \N in Perl 6. It's a regex escape that is the exact opposite of \n : it will match any character that is not a newline, independently from the presence or absence of the single line match modifier /s. So I have now added it in Perl 5. This is my first feature contribution to the regex engine!

Sunday, 14 June 2009

Back from the FPW 2009

The French Perl Workshop 2009 has ended. I took some pictures (not much). I had the occasion to try on a smaller French audience the talk I'll give in Lisbon for YAPC::Europe.

One of the interesting presentations was made by the rtgi guys, who mapped the CPAN authors and modules, as well as the Perl community as seen on the web, on cpan-explorer.org. Enjoy the maps.

Saturday, 13 June 2009

The Future of Perl 5

Bjarne Stroustrup is quoted having said: It is my firm belief that all successful languages are grown and not merely designed from first principles. Being a perlist, (or a perlian? a perlard?), I can only agree. Perl 5 was grown, and continues to grow; although most of its growth is now outsourced to the CPAN (with modules like Devel::Declare, Method::Signatures, Moose, MooseX::*, and so on.)

Thus, I see little added value in adding new, complex syntax to Perl 5. My preference goes to making Perl 5 easier to extend. (Here you can feel the influence of the Perl 6 approach.) The core's job is not to duplicate and override the efforts made on CPAN to extend Perl, but to encourage them. CPAN is, after all, the killer app of Perl.

The big advantages of outsourcing syntax experiments to CPAN is that the community can run many experiments in parallel, and have those experiments reach a larger public (because installing a CPAN module is easier than compiling a new perl); that enables also to prune more quickly the failed approaches. This is a way to optimize and dynamize innovation.

So, a patch to add a form of subroutine parameter declaration, or to add some syntactic sugar for declaring classes, are probably not going to be included in Perl 5 today. Those would extend the syntax, but not help extensibility -- actually they would hinder it, imposing and freezing a single core way to do things. And saying that there is only one way to do it is a bit unperlish. It's basically for the same reason that we don't add an XML parser or a template preprocessor in the core: no single, one-size-fits-all solution has emerged.

Now, if a one single way of declaring subroutine parameters or classes emerges and stabilizes, it makes sense to add it in the core, and even to re-implement it in C, for efficiency reasons. But whatever is added will also impose some backwards compatibility requirements on the future core releases: we must be careful to avoid getting stuck with useless or ugly syntax. -- In turn, that means that new syntax can eventually be added for purely aesthetic reasons (like the stacking of filetest operators in 5.10).

Those general considerations don't mean, however, that all new syntax is to be ruled out from the core: if some new syntax is introduced in a way that improves the internals or can be taken advantage of by CPAN modules, it's worthwhile to include. For example, we can consider adding a way to declare methods differently than ordinary subroutines (possibly via a new keyword method that would supplement sub) : so we can forbid calling them as subroutines, or by magic-goto from a subroutine. We could also add a way for a subroutine to know whether it has been called as a method or as a subroutine. That kind of thing. Improving the possibilities of introspection and self-checking of Perl do improve its extensibility.

What about, then, the future directions of Perl 5? The Big Picture? The priorities? The Plan for 5.12 and beyond?

New syntax is nice and shiny. But for me, there are more important and urgent features that are needed now. New syntax introduces new bugs, and, Perl 5 being what it is, new edge cases; we should aim to reduce those instead.

In my view, (I could even use the word vision, but I won't do it), the future of Perl 5 should be mostly organised around two directions:


  • clean-up and orthogonalisation

  • giving more facilities for extensibility



To elaborate on that a bit:

Clean-up and orthogonalisation : the big TO-DO here for 5.12 is the clean-up of the internal handling of UTF-8 strings and the abstraction leakage that ensues in some corners of it. (This was referred to as the Unicode bug by many people.) Briefly, perl builtins like lc() or uc(), or regex metacharacters like \w, have a behaviour that depends on whether the string they operate on have the internal UTF-8 flag set. This shouldn't be the case -- that flag should be kept strictly internal. Fixing this will make the life easier to many people that handle Unicode strings with Perl, but it will break backwards compatibility. That's why it's not planned for a 5.10.x release.

Another field that could be improved is the behaviour of autovivification, which is currently not extremely consistent. Sometimes also autovivification is annoying -- a common example is a test like exists($h->{a}{b}), that autovivs $h->{a}.

Any one of those two cleanups, once implemented, would be important enough to deserve a 5.12.0 release (at least that's what I'm thinking this week).

Giving more facilities for extensibility : Providing more hooks to module writers. Perl is pretty hookable already; but the creativity of modules writers has no limits. Vincent for example was talking yesterday about making the internal function op_free(), used to free code from the memory, hookable -- which would help for some evil manipulations of eval(), the details of which I don't remember at the moment. More generally, a better API for manipulating the optree internals would be useful. I would like to have more hooks in the tokenizer as well -- Devel::Declare, for example, could benefit from this.

Once those goals are achieved, it will be time to add new syntax, on steadier grounds. The core will never have an XML parser, because the diversity of needs for parsing XML makes the diversity of modules necessary and welcomed; this is not true for object models -- many competing object models are not necessarily a good thing, especially inside a same application. But large-scale experimentation on CPAN enabled the community to make Moose much better than whatever a handful of P5Pers could have designed by themselves.

And now I shall STFUAWSC.

Tuesday, 5 May 2009

Evolution of smart matching

The semantics of the smart match (~~) operator, introduced in Perl 5.10.0, are undergoing major changes. They are justified partly by the divergence with Perl 6's smart match, and partly by the inconvenience of the current semantics.

The major change that drove all other ajdustments is that now, ~~ is no longer symmetrical, but the dispatch will be done based on the type of the matcher (the right hand side) only. The type of the matchee will be taken into account only in a second phase.

The loss of symmetry however allowed to introduce better distributivity (in the ~~ %hash and ~~ @array cases.)

Another consequence is that overloading the ~~ operator will be taken into account only when the object is on the right of the match. Allowing the matchee to invoke ~~ overload would interfere with the property of distributivity, notably in the $object ~~ @array case.

The full semantics can be checked out in the perlsyn man page in the smartmatch branch of the Perl 5 source repository. I consider those docs final (unless Larry invokes Rule One...)

I plan to do a short talk on the new smart match in Lisbon, at the next YAPC::EU -- more detailed than the braindump you're currently reading!

The implementation is still in progress. Tests, notably, are needed, if you want to help (because my tuits are quite short those days...)