Tuesday 3 November 2009

Ubuntu upgrade woes

After I upgraded my laptop (a Dell Latitude D420) to Ubuntu Karmic Koala, it refused to boot. The normal boot process showed only the usual splash screen, which consists of nothing but the Ubuntu logo, then switched to a black screen where nothing was possible anymore -- no switching to a text console, nothing. Damn.


Here's how I fixed that (in the hope that it could be useful to someone). I found out, by booting without the splash screen, that the root partition wasn't mounted on boot, which was the cause of all the problems. For the record, to boot without the splash screen, you have to select the kernel you want in the grub menu, edit its command line, and remove the words "quiet splash" from it.


So I booted the rescue kernel (by selecting it in grub), which gave me a basic busybox shell in RAM. There, I manually mounted my root partition, /dev/sda1, on a newly created directory /grumpf. I moved /grumpf/etc/fstab away and wrote a basic working fstab with the commands:

echo proc /proc proc defaults 0 0 > /grumpf/etc/fstab
echo /dev/sda1 / ext3 defaults,errors=remount-ro 0 1 >> /grumpf/etc/fstab

Then I rebooted. In the grub selection menu, I selected the regular kernel, but edited its command line: I replaced the part root=UUID=deafbeef... with root=/dev/sda1, in effect asking for the root device to be looked up by device name instead of by UUID (the root= parameter is passed on to the kernel; grub itself doesn't interpret it). At this point the computer booted successfully.


Once there, I could log in as root, edit /boot/grub/menu.lst to make my changes to the kernel command line permanent, and complete the fstab with appropriate lines for the swap, the cdrom and my /home partition. One last reboot, and voilà, the system was fully functional again.


This doesn't explain why device UUIDs aren't supported in the boot sequence on that hardware, though.

Monday 27 July 2009

Smart match to-do

The "new" smart match in Perl 5.10.1 can still be improved.

Unlike the changes between 5.10.0 and 5.10.1, which won't be backwards-compatible, further improvements to smart matching will be considered only if they don't break code. However, the semantics can be extended.

Here's an example. The following code, which prints "ok" under 5.10.0, will now issue a run-time error:

use 5.10.1;
$obj = bless { foo => 42 };
say "ok" if 'foo' ~~ $obj;

The new error, Smart matching a non-overloaded object breaks encapsulation, which was suggested by Ricardo Signes, prevents accessing the implementation details of an object by mistake.

However, from inside a method, it can be completely correct to use ~~ on the object's underlying reference. That's no longer possible in 5.10.1, unless you make a shallow copy of the object, for example with:
'foo' ~~ { %$obj }

That won't perform very well if you have lots of keys there.
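To make the cost of that workaround concrete, here is a minimal sketch of a method that uses the shallow-copy trick, next to the plain key lookup it amounts to. The class name Point and the method names has_field_copy and has_field_exists are made up for the illustration, and the no warnings line is only there because perls newer than 5.10.1 may warn about smart match itself.

```perl
use 5.10.1;
no warnings;    # perls newer than 5.10.1 may warn about ~~ itself

package Point;  # hypothetical class, for illustration only

sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

# Smart match against a shallow copy: allowed in 5.10.1, but it builds
# a brand new anonymous hash (every key and value) on each call.
sub has_field_copy {
    my ($self, $name) = @_;
    return $name ~~ +{ %$self } ? 1 : 0;
}

# What a string-against-hash smart match boils down to: a key lookup
# on the underlying reference, with no copy at all.
sub has_field_exists {
    my ($self, $name) = @_;
    return exists $self->{$name} ? 1 : 0;
}

package main;

my $p = Point->new(foo => 42);
say "copy, foo:   ", $p->has_field_copy('foo');
say "copy, bar:   ", $p->has_field_copy('bar');
say "exists, foo: ", $p->has_field_exists('foo');
```

Both methods give the same answer; only the first one pays for copying the whole hash, which is the overhead a %$object syntax for smart match would avoid.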

That's why I recently added this little item to perltodo:

Currently $foo ~~ $object will die with the message "Smart matching a non-overloaded object breaks encapsulation". It would be nice to allow to bypass this by using explicitly the syntax $foo ~~ %$object or $foo ~~ @$object.

Wednesday 22 July 2009

On the influence of religion and astrology on science

I have, in addition, introduced a new method of philosophizing on the basis of numbers. -- Pico della Mirandola, Oration on the Dignity of Man


Wolfgang Pauli, in addition to being a Nobel Prize-winning physicist, was most interested in the mental and psychological processes that make scientific discoveries possible. In a book co-authored with C.G. Jung, The Interpretation of Nature and the Psyche, he studies how Johannes Kepler discovered his three laws. By looking at the original texts of Kepler, who was a strict Protestant, Pauli shows that the image Kepler had of the Trinity led him to accept Copernican heliocentrism, and to invent the idea that the movements of the planets could be measured and mathematically described.

The main enemy of Kepler was Robert Fludd, an English physician and astrologer. Fludd had a background in alchemy, kabbalah and mysticism. For him, the heavens (the macrocosm) were the reflection of the human body (the microcosm), and vice versa: consequently, applying mathematical rules to the movements of planets implied a negation of free will. (It should be noted at this point that both Kepler and Fludd believed that planets were living beings.)

The same gap between Fludd's and Kepler's presuppositions can be seen in their approach to astrology. Fludd believed that astrology worked because of the mystical correspondence between heavens and earth. Kepler supposed an action on the human mind induced by remote sources of light placed at certain angles -- the same angles that appear in Kepler's second law. As Pauli notes, Kepler's thesis is experimentally verifiable, but Kepler didn't seem to think that, if he was correct about astrology, artificial sources of light would have the same effect. Here too, Kepler completely externalizes the physical processes, something that Fludd refuses to do on religious grounds.

It's remarkable that Fludd's conceptions made him refuse Kepler's approach to astronomy, but enabled him to correctly discover the principle of blood circulation. That is, as the human body is like the heavens, where planets revolve around the Sun, image of God, then in the body blood must revolve around the heart, where the Soul, image of God, is located. Or at least that's how Fludd saw it.

Even with false assumptions, both men advanced science. Pauli's conclusion was, more or less, that while all assumptions need to be scrutinized with skepticism, only the results will validate a scientific theory; but that those assumptions are precisely what makes scientists creative. So, what kind of assumptions led to Pauli's Exclusion Principle?

Tuesday 21 July 2009

Deprecation in the Perl core

chromatic goes on arguing that deprecated language constructs deserve to be removed just because they're deprecated.

I'll argue, on the contrary, that deprecated constructs should be removed only when they stand in the way of a performance improvement or of a bug fix. Which is precisely why, for example, pseudo-hashes, empty package names or $* got removed in Perl 5.10. In other cases the deprecation warning is more of a "bad style, don't do this" message -- until someone finds a good reason to remove the deprecated construct.

Here are a few examples, covering a range of different cases:


  • The $[ variable used to be a waste of memory. Now that it is re-implemented as a %^H element, its existence no longer has any runtime memory impact (as long as it's not used). It's very possible, however, that it has a speed impact (I'd be glad to see a benchmark). If someone demonstrates that perl can be made faster by removing it, it's worth removing. (Note that it's currently deprecated in bleadperl.)

  • The apostrophe syntax for namespaces (foo'bar instead of foo::bar) has been used quite recently for its cuteness value (like the function isn't). It's not yet deprecated. Moreover the core still has Perl 4 files that use it. (Those would be worth considering for removal, by the way -- they're not maintained.) However, it's largely considered bad style to use this syntax and it can be confusing (as in print "$name's"). Adding a deprecation warning would be nice to warn users against potential errors, but I think that removing it would cause too many problems; so, that deprecation warning is likely to stay for some value of ever.

  • The defined(@foo) and defined(%bar) syntax is actually only deprecated for lexicals -- not for globals. (You can check.) Some deep magic uses them for stashes. I'm not sure about the details, but a comment in toke.c asserts that removing them would break the Tk module. So that construct can't be removed without some heavy engineering to devise a replacement.

  • Finally, the "split to @_ in scalar context" deprecation. Removing assignment to @_ would change the behaviour of split, and that's what caused the discussion on P5P: if we remove it and replace it with something more sensible, code developed on 5.12 might cause problems when run on an older perl. Opinions differ here. I would personally favor removing it unconditionally, which is likely to be chromatic's point of view too (for once). More conservative opinions were voiced.
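The confusion caused by the apostrophe syntax inside strings, as in print "$name's", can be seen in a few lines. This is only a sketch: the variable names are made up, and strictures and warnings are switched off on purpose so that the mistake goes through silently, the way it would in a quick script.

```perl
no strict;      # so that the accidental package variable compiles
no warnings;    # some perls warn about the old separator in strings

my $name = "Larry";

# Perl reads this as the package variable $name::s (never set here),
# not as $name followed by the two characters 's.
my $wrong = "$name's";

# Braces end the variable name where it was meant to end.
my $right = "${name}'s";

print "wrong: [$wrong]\n";
print "right: [$right]\n";
```

That silent empty string is the kind of potential error a deprecation warning would point out, without removing anything.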


Now, let's have a look at the two arguments against deprecation and removal that are misrepresented by chromatic:

The arguments for retaining these features are likewise simple: modifying code may cause bugs and removing features may break existing code.

First, "modifying code may cause bugs". This meme surfaced on P5P about a very complex and hairy Perl module with almost no tests (ExtUtils::ParseXS). However, the code changes we're talking about here are C changes. Written properly, C is a more straightforward language than Perl, and the test coverage of the core is excellent. So I don't buy that argument at all, and I don't think anyone knowledgeable used it seriously about core changes. Also, if we're speaking only about adding a warning, it's usually a very simple change. Removal itself might be more complex, but still a lot simpler than adding features. And features get added.

Secondly, "removing features may break existing code". Right, but are we talking about deprecation or removal? chromatic seems to suppose that deprecation must necessarily be followed by removal. He says: The problem, I believe, is that there's little impetus to migrate away from deprecated features; features can remain deprecated for 15 years. But this is not a problem -- this is a sensible policy decision. Unlike mere deprecation, removal of constructs will break code, so we must weigh it against convenience, and proceed only when it brings more benefits than inconveniences for Perl users.

Basically the only real argument against removal of features is precisely the one that chromatic persists in ignoring: preservation of backward compatibility, to avoid gratuitous breakage of Perl programs. But instead of repeating myself on that point, let me finish this post with a quote:

Perl has broken backward compatibility in the past where I judged very few people would be affected.
-- Larry Wall in P5P.

Friday 17 July 2009

The job of the pumpking

The job of a pumpking is a difficult one. It demands technical and human skills, and constant commitment.

Historically, the pumpking has been the one who applied most of the patches. (That's actually the origin of the word -- the pumpking holds a semaphore known as the patch pumpkin.) Applying a patch is never completely trivial. Even if the code change is clear enough, you'll have to look at the regression tests, and possibly write some; make sure the docs are still correct and complete (and possibly write some); if applicable, increment the version numbers of the core modules, or contact the maintainer of the module if it's dual-life.

Sometimes, the code change is not clear at all, or really big, or it touches areas of the code that the pumpking is not familiar with. In this case he has to learn about the surrounding code, or ask experts in that field (if any are around and willing to respond), understand how the patch works, determine what kind of performance or footprint impact it will have, think about possible better implementations, detect potential subtle regressions, and consider test coverage of the new code.

Even if it's a doc patch, he will have to check that the new documentation is right, doesn't duplicate or contradict information elsewhere, and is written clearly enough and in grammatically correct English. (Which makes it only more difficult for non-native pumpkings.)

In short, patch application is a time-consuming activity, and it demands a fair amount of concentration. In any case, that's not something you can do when you have ten minutes free between two tasks.

Is it rewarding? Well, not really. Applying patches is a lot like scratching other people's itches. You don't get much time left to work on what would interest you personally. Like fixing funny bugs, or adding new features. And people tend to get upset when their patches are not applied fast enough.

Sometimes you have to reject a patch. You do this in the most diplomatic way possible, because the person who sent it usually is full of good will, and volunteered some of his own free time, and you don't want to drive volunteers off. So you ask, could I have some more docs with that, because I don't understand it fully? or: what problem does it solve for you? or: excuse me, but have you considered this alternate implementation that might be better suited? Care to work on it a bit more? Thanks. Even when you know at first sight that a patch is totally worthless, you can't reply, "ur patch iz worthless, lol", you have to try to be pedagogical about why the proposed patch is a bad idea.

And all of this, you do on borrowed time, because you're not paid for it. You could be sleeping, or cooking, or having fun with your family and your friends, or enjoying a nice day's weather. No, you stay inside and you apply other people's patches. And why do you do it? Because you like Perl, you want to make it better, and you're not doing a bad job at it. And probably because no-one else would be doing it for you.

And of course no-one except the previous pumpkings fully realizes what all of this really means. No. You make decisions, led by technical expertise and years of familiarity with perl5-porters. Sometimes heat is generated, but that settles down quickly and stays contained within the mailing list. You listen to all parties and you show respect. And you make an informed decision, trying to remember, as Chip Salzenberg noted, that good synthesis is different from good compromise, although they're easy to mistake for each other, and that only the first one leads to good design. All of this is normal, and expected.

Being a pumpking is difficult. It's demanding; it's not rewarding. But that's not why I quit. I quit because I can't continue to do it under constant fire. I want my work to be properly criticized based on technical considerations, not to be denigrated. (Also, I want a pony, too. I realize that it's difficult to deal with people who are in the middle of the "hazard" peak of this graph.) That's also why I would refuse to be paid to be a full-time pumpking: in my day job, I get recognition.

Actually, I don't think that a new pumpking will step up, and I think that this will be for the best. P5P probably needs to transition from pumpkingship to a more oligarchic form of governance. More people need to take the responsibility to review and apply patches. More people need to back up the technical decisions that are made. A vocal consensus is stronger than a single pumpking, and it will force us to write down a policy, which will be a good thing and increase the bus factor of Perl. Moreover, a system involving many committers would scale better. All committers are volunteering time. It's a scarce resource, but it can be pooled. And new major Perl releases are not going to happen without more copious free time.

As a side-effect, if many people start handling the patch backlog, that means that I'll be able to devote time to long-standing issues without feeling guilty. Like the UTF-8 cleanup I have been speaking about.

For once, I plan to take a vacation this summer -- away from P5P. That hasn't happened to me in years. I think I deserve it.

Friday 10 July 2009

The strictperl experiment

The strictperl experiment by chromatic, and the strict-by-default patch by Steffen Mueller, will help me explain which approaches are right or wrong in dealing with the evolution of Perl 5.

strictperl is, as chromatic puts it, unilaterally enabling strict for all code not run through -e. Unsurprisingly, that breaks a lot of code. And I mean a lot, as in "most". Not only on the DarkPAN (I seldom use strict for scripts of four lines) and on the CPAN (I have modules that will break under strictperl), but in the core itself. chromatic mentions Exporter and vars as modules that break under it. Well, of course they break. Their very purpose is to manipulate symbols by name, which is exactly the kind of thing that strict prevents. (Incidentally, all code that uses Exporter or vars can't run under strictperl, and I think that's most of the Perl code out there actually.) That is why chromatic added a separate makefile target to build strictperl: if the patch went into perl itself, perl couldn't even be built! That's how broken it is.

It's certainly interesting to dive into Perl's internals, but to experiment with enabling strict on a global level, a simple source filter would have been sufficient. One could have written one in ten minutes, including time to look up the documentation, and it would have been immediately obvious afterwards why it was a bad idea. Except to chromatic, who still thinks it's (quoting) a feature I believe is worth considering.

So now let's look at Steffen Mueller's solution, which is strictperl done right, and which is already in bleadperl.

Currently, with bleadperl, a simple use 5.11.0 will enable strictures in the current lexical scope, like use strict would do, but implicitly. (It will also enable features, but that's unrelated.) Look:

$ bleadperl -e 'use 5.11.0; print $foo'
Global symbol "$foo" requires explicit package name at -e line 1.
Execution of -e aborted due to compilation errors.

Moreover, like in chromatic's strictperl, a simple -e or -E won't enable them, because strictures are not wanted for one-liners.

That way, we don't break code that doesn't want strictures (actually we don't break anything at all that doesn't already require 5.11.0; it's completely backwards compatible), but we still remove some boilerplate by making strictness the default if you request a recent enough perl.
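The same behaviour can be checked from inside a script rather than from the command line. This is only a sketch, and it needs a perl that has the change: the helper name compiles_ok is made up, and the snippets are compiled through a string eval so that the compilation error can be caught instead of aborting the program.

```perl
no strict;    # baseline: strictures off, as in a plain script

# Hypothetical helper: compile a snippet with a string eval and report
# whether it compiled, instead of letting a compile error abort us.
sub compiles_ok {
    my ($code) = @_;
    my $ok = eval "$code; 1";
    return $ok ? 1 : 0;
}

# No version requested: the undeclared global is accepted.
my $plain = compiles_ok('$foo = 1');

# Requesting 5.11.0 switches strictures on implicitly, so the very
# same assignment now fails to compile.
my $strict = compiles_ok('use 5.11.0; $foo = 1');
my $error  = $@;

print "without use VERSION: ", ($plain  ? "compiles" : "fails"), "\n";
print "with use 5.11.0:     ", ($strict ? "compiles" : "fails"), "\n";
print $error unless $strict;
```

The first snippet is the backwards-compatible case: code that doesn't ask for 5.11.0 keeps compiling exactly as before.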

Steffen's patch itself is not very recent, but I didn't apply it to bleadperl immediately, because I disagreed with the implementation. As you can see in the sources, it uses Perl_load_module to load the file strict.pm and execute its import() method. That's very slow. I applied it when I got around to making it faster with the next commit, which replaces that method call with just flipping three bits in an internal integer variable.

All this goodness is coming to you in Perl 5.12, when someone is willing to take up the pumpking's hat.

Next time, I'll explain why enabling warnings by default is not a good idea.

Monday 6 July 2009

Resigning

So I'm resigning from my role of Perl 5 pumpking. That doesn't mean that I'm not proud of what I did for Perl 5 in the past, or that I don't stand behind my choices anymore.

Let's begin by inspecting some of the core ideas that have driven chromatic's rants over the last six months, and examine why they are a sure recipe for failure.

First, the regular snapshots, labeled releases. When you can control neither how much development time you'll have for the next release, nor the list of hot topics that the contributors will volunteer to work on, you don't release easily; at best, you snapshot. Which is fine for alpha-grade software such as parrot, as I've already said; but not for production-ready software. Did I point out that Perl was a very complex project with a lot of interdependent parts? Care is essential.

(As a side effect, it might be quite easy to predict the release dates of parrot for the next century, but not the date when it will be used in production. If that happens at all, that is.)

Secondly, the lack of a feature plan for the next versions of Perl. Again, this is not something that you can control in a volunteer-based project. I've said many times that I'd like 5.12 to solve some inconsistencies in the handling of UTF-8 strings with regard to case-insensitive comparisons and case-changing operators. That would be a backwards-incompatible change and would break code -- contrary to the gross mischaracterisations that chromatic presents as truth, P5P is certainly not opposed to backwards incompatibility. But to achieve that, you need a regex + UTF-8 guru, or somebody willing to invest countless hours into becoming one. If someone comes along with another itch to scratch, sends patches, implements a neat new feature, then that would warrant a shiny 5.12 without the UTF-8 revamp.

Thirdly, the "untested code is not worth caring about" fallacy. One can try to wave away the DarkPAN into oblivion with a blog post, but if Perl still exists for more important purposes than the amusement of a few computer language geeks, it's because of the DarkPAN. And not all the DarkPAN code is regression-tested, or even testable. Quick glue scripts, crontabbed email reports, network management helpers, post-commit hooks: they cost too much to test, because the environment they run in is orders of magnitude more complex than their internal logic. I have code out there in the DarkPAN that deals with the relative replication delays on two pairs of master and slave databases. That's not testable. But this code is critical.

Fourthly, the main argument chromatic has to back up his proposals for Perl 5 is "we should change defaults and add syntactic sugar without thinking about the consequences because that will allow me to rip off three lines of boilerplate code." The hints I gave at the time on P5P that maybe adding syntax would be a better idea if there was some semantics behind it were then aggressively relabelled as reactionary. (As a coincidence, modernperlbooks.com started its anti-P5P propaganda shortly afterwards.)

On the contrary, every language extension that went into 5.10 was there because it solved an actual problem, in a more efficient way than the CPAN could provide, and was sufficiently thought out (I hope) that we won't regret having to maintain it a couple of versions from now. (But now I'm not so sure about UNIVERSAL::DOES() anymore.)

Fifthly, the way chromatic has arbitrarily chosen one particular regression from 5.8 to 5.10 and presented it as if it were as serious as a remote root zero-day vulnerability, willfully ignoring every other regression (or improvement) in the hope of making a point. Note that this regression wasn't even a bug in the language, something that could have made Perl programs misbehave or segfault. It was a performance regression. In a world where most programs are I/O bound anyway. And there were many other more interesting regressions to choose from, I won't hide that. So why this one? Certainly because of its marketing value. It was certainly sexier than a more severe regression on an obscure feature he would have needed to explain to his readers.

In the end, I've had enough of those gratuitous attacks. I've always accepted and even encouraged criticism about my decisions as a pumpking, but only as long as it was based on technical arguments, not on marketing slogans repeated hysterically by someone who remains deaf to any form of discussion. So I'm stepping down from my role of pumpking. I'm burnt out. I don't want to have to justify myself again and again in front of all this.

Moreover, if my resignation can help Perl 5, it will be for the greater good. There are many committers and knowledgeable contributors, and they'll probably start reviewing the patches to apply a bit more: avoiding bottlenecks is good. The release process will be documented and distributed (and thanks to Dave Mitchell for having worked a lot on this) and the whole bus factor of Perl 5 will go up. Don't worry, I'll still be around to ensure that the future of Perl 5 is not handed to the marketroids, and to produce the occasional patch. But I'm withdrawing from the front line. Which will have, I'm sure, positive effects in the end.

Perlbuzz is no longer useful

I've been disappointed with perlbuzz for a long time, but here's another blatant proof of their lack of professionalism. This supposed "news roundup" post points to several articles by chromatic and his fanboys, and supplements them with pure FUD, like the retransmission of a twitter message saying "Is getting code into the !perl core a fight against inertia and petty hostility?", linking to an anonymous comment on a random web forum. If this is not calumny, I don't know what is. It's just unfair to see perlbuzz relaying it without checking.

However, I note that none of the rebuttals written by me or by others elsewhere has earned a mention.

Someone is actively trying to damage the community image of P5P out there, and perlbuzz is helping him. Willingly?

Sunday 5 July 2009

Time-based releases in open source

Whenever I hear someone saying we should have regular releases, I hear we should release when Venus enters Pisces. Because, when you have so many uncontrolled parameters to deal with, sticking to a predefined calendar boils down to superstition.

Time-based releases actually make sense in one case: when you're selling your software. That way, you can ensure a regular stream of revenue via upgrades. Moreover, if you're a commercial entity, you are probably already paying a couple of developers full-time, so you can predict how much time you'll spend on the next version during the next few months.

None of those premises hold for volunteer-based open source projects, such as Perl.

Actually, time-based releases still make sense if the project is in an alpha stage and is being prototyped. In this case, however, it's more accurate to speak about snapshots than releases, because such a "release" is only a way to publicize the newest changelog. That's the case for parrot, for example: it doesn't aim for backwards compatibility, or even stability, and has no user base. That would also be the case for the development branch of Perl 5 (currently 5.11), although nobody has got around to doing it since the migration to git -- I agree that would be a nice improvement.

In the specific case of Perl 5, even if you wanted to try to have regular releases, more factors would complicate the task. You can't release bleadperl at any point and call it stable, even when all dual-life modules are perfectly in sync with the CPAN and when all tests pass on all platforms and all configurations. That's because you also need, for a new major release, to have a consistent set of features. For example, it would not have made sense to release 5.10.0 with the lexical $_ variable, but without the (_) prototype.

With the upcoming release of 5.10.1 by Dave Mitchell, we're trying to improve the release process to be less of a pain for the pumpkings. Changes to dual-life modules will no longer be accepted in the core unless they are put on CPAN first. Module::CoreList will refer to tags in the git repository instead of Perforce change numbers. The perldelta will hopefully be written a bit more incrementally.

I plan to copy 5.10.1's perldelta into bleadperl when it's done, then to supplement it with blead-only changes. Afterwards I'll have more elements to judge whether we're ready for a 5.12.0 or not -- and that decision will be motivated by the contents of the release, not by the time of the year.

Saturday 4 July 2009

The DarkPAN matters

There is currently some FUD going about in some parts of the Perl community about why we should break Perl 5 backwards compatibility. A short blog entry, schmarkpan, is a good example of the trend: loud, noisy, but clueless and devoid of any content.

The author writes, DarkPAN was discussed quite a bit at YAPC::NA. But really, what is it? How is it defined? Well, DarkPAN is just a slang word to express the fact that Perl 5 is currently used all over the world in production and sometimes critical systems: websites with high availability, financial systems, operating system build processes, and so on. That would also include the so-called GreyPAN, all Perl OSS applications that are not on CPAN. So, asking this question, even rhetorically, makes you sound as if you didn't know that Perl is actually used in production.

He continues, Who are these people that have a vested interested in Perl and yet do not participate? For a start, I would like to point out that Mr Perez himself is someone with apparently some interest in Perl, but who does not participate in Perl 5 in the slightest. I might be biased, but I tend to think that the regular contributors to perl5-porters are a lot more likely to have informed opinions about Perl 5 core development than people who don't even read it.

I think attempting to appease a faceless entity with no defined boundaries has fail written all over it. It is fear mongering. Who's the fear monger here? And who's trying to be realistic by attempting to release software with some quality expectations, notably by making the upgrade process seamless and introducing as few bugs as possible? It's a well-known political technique to accuse the opposing party of being guilty of one's own defects. But it doesn't make you right.

If we break it, they can choose not to upgrade. Several fallacies in one single sentence. First, we should not consider breaking it. Breakage happens, but our goal is to avoid it. Secondly, breakage is detected after upgrades, usually (or it would have been avoided). Third, this not only applies to Perl 5, but also to every open source project widely used (in the Perl world, DBI or LWP come to mind -- upgrading them is not a trivial operation either.) This is a matter of common sense.

Then, suddenly and without warning, the discussion shifts from breaking backwards compatibility to simply breaking Perl: What about bug fixes?, I hear you say. That is the joy of open source. There are lots of unemployed hackers out there that would be willing to backport those fixes for you. Oh heavens forbid! Actually paying open source hackers to write code! The sky is falling! Wishful thinking at its finest. Mr Perez might have found a tuit tree in the middle of a magic garden, but I have never seen "lots of unemployed hackers" who are at the same time fluent in the Perl internals, nor lots of generous donors fighting over the privilege of paying them to work on Perl. Actually, there is one, currently: Dave Mitchell, who is paid to get 5.10.1 released. Yes, paid: the sky is falling.

Perl does not belong to DarkPAN. It belongs to us who participate in its wellbeing. In other words, "fuck the users". But we're not in the land of toy projects here. We can't bless any change (or revert it two releases afterwards) just because it looks shiny at the time. This is not the parrot you're looking for. People do use Perl 5 for serious things, and we must ensure that a certain level of quality is preserved.

But have you, Mr Perez, even skimmed through a perldelta manpage, to look at the list of incompatible changes? Because there are actually a lot of them, in spite of what you seem to believe. I had code that broke because strict is stricter in 5.10 (as mentioned in the manual that you didn't bother to read). ghc broke because we removed the magical variable $*. It's not like we fear incompatible change. We just don't like to introduce incompatible changes just for the sake of being incompatible.

I say we make the decisions best for our language of choice. But code changes don't happen because someone yells about it on a blog. They happen because someone actually writes code. If you really want to see something changed in Perl 5, write a patch and come and discuss it on P5P. Although you might not have an idea about what you might want to see changed in Perl 5, since your empty, uninformed and slightly inconsistent rant fails to identify any precise problem.

Saturday 20 June 2009

The new \N regex escape

At the French Perl Workshop 2009, a talk on Perl 6 parsing by Stéphane Payrard reminded me of the existence of \N in Perl 6. It's a regex escape that is the exact opposite of \n: it will match any character that is not a newline, regardless of the presence or absence of the single-line match modifier /s. So I have now added it to Perl 5. This is my first feature contribution to the regex engine!
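Here is a minimal sketch of the difference with the dot. The sample string and the variable names are only for illustration, and it needs a perl that has the new escape, hence the use line.

```perl
use 5.11.0;    # \N needs the 5.11 development series or later

my $text = "first line\nsecond line\n";

# Under /s the dot matches everything, newlines included,
# so this captures the whole string.
my ($dot) = $text =~ /\A(.+)/s;

# \N keeps refusing newlines, whether /s is there or not,
# so both captures stop at the end of the first line.
my ($with_s)    = $text =~ /\A(\N+)/s;
my ($without_s) = $text =~ /\A(\N+)/;

say "dot with /s captured ", length($dot), " characters";
say "\\N with /s:    $with_s";
say "\\N without /s: $without_s";
```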

Sunday 14 June 2009

Back from the FPW 2009

The French Perl Workshop 2009 has ended. I took some pictures (not many). I had the opportunity to try out, on a smaller French audience, the talk I'll give in Lisbon for YAPC::Europe.

One of the interesting presentations was made by the rtgi guys, who mapped the CPAN authors and modules, as well as the Perl community as seen on the web, on cpan-explorer.org. Enjoy the maps.

Saturday 13 June 2009

The Future of Perl 5

Bjarne Stroustrup is quoted as having said: "It is my firm belief that all successful languages are grown and not merely designed from first principles." Being a perlist (or a perlian? a perlard?), I can only agree. Perl 5 was grown, and continues to grow; although most of its growth is now outsourced to the CPAN (with modules like Devel::Declare, Method::Signatures, Moose, MooseX::*, and so on).

Thus, I see little added value in adding new, complex syntax to Perl 5. My preference goes to making Perl 5 easier to extend. (Here you can feel the influence of the Perl 6 approach.) The core's job is not to duplicate and override the efforts made on CPAN to extend Perl, but to encourage them. CPAN is, after all, the killer app of Perl.

The big advantage of outsourcing syntax experiments to CPAN is that the community can run many experiments in parallel, and have those experiments reach a larger public (because installing a CPAN module is easier than compiling a new perl); it also allows failed approaches to be pruned more quickly. This is a way to optimize and dynamize innovation.

So, a patch to add a form of subroutine parameter declaration, or to add some syntactic sugar for declaring classes, is probably not going to be included in Perl 5 today. Those would extend the syntax, but not help extensibility -- actually they would hinder it, imposing and freezing a single core way to do things. And saying that there is only one way to do it is a bit unperlish. It's basically for the same reason that we don't add an XML parser or a template preprocessor to the core: no single, one-size-fits-all solution has emerged.

Now, if one single way of declaring subroutine parameters or classes emerges and stabilizes, it makes sense to add it to the core, and even to re-implement it in C, for efficiency reasons. But whatever is added will also impose some backwards compatibility requirements on the future core releases: we must be careful to avoid getting stuck with useless or ugly syntax. In turn, that means that new syntax can eventually be added for purely aesthetic reasons (like the stacking of filetest operators in 5.10).

Those general considerations don't mean, however, that all new syntax is to be ruled out from the core: if some new syntax is introduced in a way that improves the internals or can be taken advantage of by CPAN modules, it's worthwhile to include. For example, we can consider adding a way to declare methods differently from ordinary subroutines (possibly via a new keyword method that would supplement sub), so that we can forbid calling them as subroutines, or by magic-goto from a subroutine. We could also add a way for a subroutine to know whether it has been called as a method or as a subroutine. That kind of thing. Improving the possibilities of introspection and self-checking of Perl does improve its extensibility.

What about, then, the future directions of Perl 5? The Big Picture? The priorities? The Plan for 5.12 and beyond?

New syntax is nice and shiny. But for me, there are more important and urgent features that are needed now. New syntax introduces new bugs, and, Perl 5 being what it is, new edge cases; we should aim to reduce those instead.

In my view, (I could even use the word vision, but I won't do it), the future of Perl 5 should be mostly organised around two directions:


  • clean-up and orthogonalisation

  • giving more facilities for extensibility



To elaborate on that a bit:

Clean-up and orthogonalisation : the big TO-DO here for 5.12 is the clean-up of the internal handling of UTF-8 strings and the abstraction leakage that ensues in some corners of it. (This was referred to as the Unicode bug by many people.) Briefly, perl builtins like lc() or uc(), or regex metacharacters like \w, have a behaviour that depends on whether the string they operate on has the internal UTF-8 flag set. This shouldn't be the case -- that flag should be kept strictly internal. Fixing this will make life easier for many people who handle Unicode strings with Perl, but it will break backwards compatibility. That's why it's not planned for a 5.10.x release.

Another field that could be improved is the behaviour of autovivification, which is currently not extremely consistent. Autovivification is also sometimes annoying -- a common example is a test like exists($h->{a}{b}), which autovivifies $h->{a}.

Either one of those two cleanups, once implemented, would be important enough to deserve a 5.12.0 release (at least that's what I'm thinking this week).

Giving more facilities for extensibility : Providing more hooks to module writers. Perl is pretty hookable already; but the creativity of module writers has no limits. Vincent for example was talking yesterday about making the internal function op_free(), used to free code from memory, hookable -- which would help for some evil manipulations of eval(), the details of which I don't remember at the moment. More generally, a better API for manipulating the optree internals would be useful. I would like to have more hooks in the tokenizer as well -- Devel::Declare, for example, could benefit from this.

Once those goals are achieved, it will be time to add new syntax, on steadier grounds. The core will never have an XML parser, because the diversity of needs for parsing XML makes the diversity of modules necessary and welcome; this is not true for object models -- many competing object models are not necessarily a good thing, especially inside the same application. But large-scale experimentation on CPAN enabled the community to make Moose much better than whatever a handful of P5Pers could have designed by themselves.

And now I shall STFUAWSC.

Tuesday 5 May 2009

Evolution of smart matching

The semantics of the smart match (~~) operator, introduced in Perl 5.10.0, are undergoing major changes. They are justified partly by the divergence from Perl 6's smart match, and partly by the inconvenience of the current semantics.

The major change that drove all other adjustments is that now, ~~ is no longer symmetrical, but the dispatch will be done based on the type of the matcher (the right hand side) only. The type of the matchee will be taken into account only in a second phase.

The loss of symmetry, however, made it possible to introduce better distributivity (in the ~~ %hash and ~~ @array cases).

Another consequence is that overloading the ~~ operator will be taken into account only when the object is on the right of the match. Allowing the matchee to invoke ~~ overload would interfere with the property of distributivity, notably in the $object ~~ @array case.

The full semantics can be checked out in the perlsyn man page in the smartmatch branch of the Perl 5 source repository. I consider those docs final (unless Larry invokes Rule One...)

I plan to do a short talk on the new smart match in Lisbon, at the next YAPC::EU -- more detailed than the braindump you're currently reading!

The implementation is still in progress. Tests, notably, are needed, if you want to help (because my tuits are quite short these days...)

Tuesday 7 April 2009

Deprecating $[

I just committed in perl a patch by James Mastros that emits a deprecation warning each time $[ is assigned to. This patch will go in the 5.12.X series.

What does this mean exactly? This only means that we discourage using $[ in any new code now. This doesn't necessarily mean that $[ will be removed in 5.14.0. Some variables or features have been deprecated for a longer time before being actually removed. However, $[ is likely to be removed in 5.14, and here's why:

Basically, $[ allows the programmer to adjust the index of the first element of an array -- by default, 0. $[ has been in the past the source of many worrisome behaviours, and has evolved to accommodate them.

In Perl 4, if I remember correctly, $[ was global. This was not a good idea. Of course, you could set it with local(), but called subroutines would still see the new value. That's why $[ began to be treated like a lexical pragma in Perl 5.0; that way, all side-effects were avoided. The "downside" of this was that $[ needed to be assigned at compile-time. That could be surprising to "clever" programmers in search of novel obfuscation techniques.

However, memory was needed to keep track of the value of $[, in each lexical scope. That's why in 5.10 $[ was removed from the global compiler hints structure and was stored in the improved version of the %^H lexical hints variable. I'll spare you the details (perlpragma has some of them), but, in short, that allowed us to reclaim the memory used for $[ (at least in programs that didn't use it).

So, now, in 5.10, the existence of $[ has no memory impact on perl. However, it's possible that it still has a small run-time impact. Removing the code that tests whether $[ exists and is non-zero before accessing an array element could make perl run a tiny bit faster. And that's why we're now deprecating $[.

Thanks to James Mastros for Friendly Doing It. And, by the way: that was a one-line patch to the C code of the interpreter. Useful stuff doesn't need to be difficult to implement!

Tuesday 31 March 2009

Autovivification in Perl 5

On the Perl 5 Porters mailing list, Graham Barr points at an interesting bit of history.

The question discussed is autovivification of non-existent hash elements when used in an lvalue context. Usually, they are autovivified (this is with 5.10.0):


$ perl -E '1 for $x{a}; say for keys %x'
a
$ perl -E 'map $_, $x{a}; say for keys %x'
a

But not when used as subroutine arguments:

$ perl -E 'sub f{} f($x{a}); say for keys %x; say "END"'
END

Graham says that sub args were special cased out during 5.005 development, for unclear reasons.

A small bit of inconsistency is that hash slices are autovivified in sub calls:

$ perl -E 'sub f{} f(@x{a}); say for keys %x'
a

We have no good overview of the current state of autovivification behaviour in Perl 5. It's known that it can change with strictures. The current perltodo notes: Make all autovivification consistent w.r.t LVALUE/RVALUE and strict/no strict. (It's a known bug that strictures can affect autovivification; I wonder whether taintedness can affect this too.)

A good first step towards consistency would be to produce a matrix of autovivification for a bunch of data types (array elements, hash elements, hash slices, tied or not) and a bunch of operators that take lvalues (map, grep, foreach, sub call, chained -> dereferences, etc.) with and without strictures/taint. Then, that could be turned into a regression test, and finally we could tweak the behaviour in 5.12 to make it more consistent.

Want to help?

Friday 27 March 2009

Polybius at the funeral

Plutarch, in his Life of Philopoemen, mentions that a young man named Polybius was carrying the urn of the general:

They burnt his body, and put the ashes into an urn, and then marched homeward, not as in an ordinary march, but with a kind of solemn pomp, half triumph, half funeral, crowns of victory on their heads, and tears in their eyes, and their captive enemies in fetters by them. Polybius, the general's son, carried the urn, so covered with garlands and ribbons as scarcely to be visible; and the noblest of the Achaeans accompanied him.

Polybius is, of course, the great historian, and also one of the major sources of Plutarch for the second Punic war and the conquest of Greece by Rome. But Plutarch does not mention that. He just expects his reader to know who he's talking about.

Do we read here, between the lines, Plutarch's secret regret of not having lived in interesting times -- of not having something original to write on, and of being a mere compiler? Was Plutarch dreaming of being a Thucydides or a Polybius, who, like Clausewitz twenty centuries afterwards, were unlucky officers before becoming great historians?

Or maybe Plutarch, who was a subject of the Roman Caesar, but still Greek and proud of it, didn't want to insist on, but rather allude to, the image of the historian of the downfall of Greece, in his youth, taking part in the funeral of the man who was nicknamed, by the Romans themselves, "the last of the Greeks".

Thursday 26 March 2009

Sons of a snake

Both Alexander the Great and P. Cornelius Scipio Africanus the Elder were said to be sons of Zeus (respectively Jupiter), the god having taken the shape of a giant snake to impregnate their mothers. It's difficult to judge what part Alexander and Scipio themselves had in the fabrication of those legends. According to Plutarch, Alexander asserted his divine parenthood when he was talking with Asians and Egyptians, but not with Macedonians and Greeks. Scipio never asserted it, but never denied it either. Moreover, Scipio was familiar with the Greek culture, so he might have just copied Alexander's legend for his own political reasons. Also, Alexander's legend might not be completely unrelated to the legend of Buddha's birth, notably the part where his mother dreams about being pregnant with a powerful animal.

The theme of the great general as son of a god in the Indo-European history and mythology would be interesting to explore. I don't see what role the snake symbol plays in this system, but it reminds me a bit of the Celtic Melusine.