Tuesday, 5 May 2009

Evolution of smart matching

The semantics of the smart match (~~) operator, introduced in Perl 5.10.0, are undergoing major changes. They are justified partly by the divergence from Perl 6's smart match, and partly by the inconvenience of the current semantics.

The major change that drove all other adjustments is that ~~ is no longer symmetrical: the dispatch is now done based on the type of the matcher (the right hand side) only. The type of the matchee will be taken into account only in a second phase.

The loss of symmetry, however, made it possible to introduce better distributivity (in the ~~ %hash and ~~ @array cases).

Another consequence is that overloading the ~~ operator will be taken into account only when the object is on the right of the match. Allowing the matchee to invoke ~~ overload would interfere with the property of distributivity, notably in the $object ~~ @array case.

The full semantics can be checked out in the perlsyn man page in the smartmatch branch of the Perl 5 source repository. I consider those docs final (unless Larry invokes Rule One...)

I plan to do a short talk on the new smart match in Lisbon, at the next YAPC::EU -- more detailed than the braindump you're currently reading!

The implementation is still in progress. Tests, notably, are needed, if you want to help (because my tuits are quite short these days...)

Tuesday, 7 April 2009

Deprecating $[

I just committed in perl a patch by James Mastros that emits a deprecation warning each time $[ is assigned to. This patch will go in the 5.12.X series.

What does this mean exactly? This only means that we discourage using $[ in any new code now. This doesn't necessarily mean that $[ will be removed in 5.14.0. Some variables or features have been deprecated for a longer time before being actually removed. However, $[ is likely to be removed in 5.14, and here's why:

Basically, $[ allows the programmer to adjust the index of the first element of an array -- by default, 0. $[ has been in the past the source of many worrisome behaviours, and has evolved to accommodate them.

In Perl 4, if I remember correctly, $[ was global. This was not a good idea. Of course, you could set it with local(), but called subroutines would still see the new value. That's why $[ began to be treated like a lexical pragma in Perl 5.0; that way, all side-effects were avoided. The "downside" of this was that $[ needed to be assigned at compile-time. That could be surprising to "clever" programmers in search of novel obfuscation techniques.

However, memory was needed to keep track of the value of $[, in each lexical scope. That's why in 5.10 $[ was removed from the global compiler hints structure and was stored in the improved version of the %^H lexical hints variable. I'll spare you the details (perlpragma has some of them), but, in short, that allowed us to reclaim the memory used for $[ (at least in programs that didn't use it).

So, now, in 5.10, the existence of $[ has no memory impact on perl. However, it's possible that it still has a small run-time impact. Removing the code that tests whether $[ exists and is non-zero before accessing an array element could make perl run a tiny bit faster. And that's why we're now deprecating $[.

Thanks to James Mastros for Friendly Doing It. And, by the way: that was a one-line patch to the C code of the interpreter. Useful stuff doesn't need to be difficult to implement!

Tuesday, 31 March 2009

Autovivification in Perl 5

On the Perl 5 Porters mailing list, Graham Barr points at an interesting bit of history.

The question discussed is autovivification of non-existent hash elements when used in an lvalue context. Usually, they are autovivified: (this is with 5.10.0)


$ perl -E '1 for $x{a}; say for keys %x'
a
$ perl -E 'map $_, $x{a}; say for keys %x'
a

But not when used as subroutine arguments:

$ perl -E 'sub f{} f($x{a}); say for keys %x; say "END"'
END

Graham says that sub args were special cased out during 5.005 development, for unclear reasons.

A small bit of inconsistency is that hash slices are autovivified in sub calls:

$ perl -E 'sub f{} f(@x{a}); say for keys %x'
a

We have no good overview of the current state of autovivification behaviour in Perl 5. It's known that it can change with strictures. The current perltodo notes: Make all autovivification consistent w.r.t LVALUE/RVALUE and strict/no strict. (It's a known bug that strictures can affect autovivification; I wonder whether taintedness can affect this too.)

A good first step towards consistency would be to produce a matrix of autovivification for a bunch of data types (array elements, hash elements, hash slices, tied or not) and a bunch of operators that take lvalues (map, grep, foreach, sub call, chained -> dereferences, etc.) with and without strictures/taint. Then, that could be turned into a regression test, and finally we could tweak the behaviour in 5.12 to make it more consistent.

Want to help?

Friday, 27 March 2009

Polybius at the funeral

Plutarch, in his Life of Philopoemen, mentions that a young man named Polybius was carrying the urn of the general:

They burnt his body, and put the ashes into an urn, and then marched homeward, not as in an ordinary march, but with a kind of solemn pomp, half triumph, half funeral, crowns of victory on their heads, and tears in their eyes, and their captive enemies in fetters by them. Polybius, the general's son, carried the urn, so covered with garlands and ribbons as scarcely to be visible; and the noblest of the Achaeans accompanied him.

Polybius is, of course, the great historian, and also one of Plutarch's major sources for the second Punic war and the conquest of Greece by Rome. But Plutarch does not mention that. He just expects his reader to know who he's talking about.

Do we read here, between the lines, Plutarch's secret regret of not having lived in interesting times -- of not having something original to write on, and of being a mere compiler? Was Plutarch dreaming of being a Thucydides or a Polybius, who, like Clausewitz twenty centuries afterwards, were unlucky officers before becoming great historians?

Or maybe Plutarch, who was a subject of the Roman Caesar, but still Greek and proud of it, didn't want to insist on, but rather allude to, the image of the historian of the downfall of Greece, in his youth, taking part in the funeral of the man who was nicknamed, by the Romans themselves, "the last of the Greeks".

Thursday, 26 March 2009

Sons of a snake

Both Alexander the Great and P. Cornelius Scipio Africanus the Elder were said to be sons of Zeus (respectively Jupiter), the god having taken the shape of a giant snake to impregnate their mothers. It's difficult to judge which part Alexander and Scipio themselves had in the fabrication of those legends. According to Plutarch, Alexander asserted his divine parenthood when he was talking with Asians and Egyptians, but not with Macedonians and Greeks. Scipio never asserted it, but never denied it either. Moreover, Scipio was familiar with Greek culture, so he might have just copied Alexander's legend for his own political reasons. Also, Alexander's legend might not be completely unrelated to the legend of Buddha's birth, notably the part where his mother dreams about being pregnant with a powerful animal.

The theme of the great general as son of a god in Indo-European history and mythology would be interesting to explore. I don't see what role the snake symbol plays in this system, but it reminds me a bit of the Celtic Melusine.

Tuesday, 14 October 2008

Git: on rebasing

(This is a follow-up to How remotes work).

We've seen how git manages to merge your local changes when you pull from a remote repository.

This approach has a small downside, aesthetically: that is, the creation of a large number of merge commits, making the history more difficult to read. Wouldn't it be nice if git offered you the possibility to simply re-apply your local changes on top of what you just pulled?

Rejoice, because that's what the git rebase command is for.

git rebase origin/master will take the local commits (that are reachable from the master head, but not from origin/master), remove them from the commit tree, and re-apply them on top of origin/master, before moving the master head to the top of the new line of commits it just created. That way, the history is kept linear.

You can afterwards just push your new commits.

Important warning: git rebase changes your commits. Because their place in the tree will be different, their SHA1 will be different as well; and the old ones will disappear. For that reason, you must not manipulate commits with rebase if you have already published them in a shared repository from which someone else might have fetched.
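
Here is a self-contained sketch you can run in a scratch shell to watch both effects: the linear history and the changed SHA1. Everything in it (the rebase_demo function, the upstream and local directories, the file names, the demo identity) is invented for the example.

```shell
# Sketch: throwaway repositories under a temporary directory.
rebase_demo() (
    set -e
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    cd "$(mktemp -d)"

    # "upstream" plays the role of the remote repository
    git init -q upstream
    git -C upstream symbolic-ref HEAD refs/heads/master
    echo one > upstream/a.txt
    git -C upstream add a.txt
    git -C upstream commit -q -m "first commit"
    git clone -q upstream local

    # One new commit on each side: the histories diverge
    echo two > upstream/b.txt
    git -C upstream add b.txt
    git -C upstream commit -q -m "upstream work"
    echo three > local/c.txt
    git -C local add c.txt
    git -C local commit -q -m "local work"
    before=$(git -C local rev-parse master)

    # Fetch, then re-apply the local commit on top of origin/master
    git -C local fetch -q origin
    git -C local rebase -q origin/master
    after=$(git -C local rev-parse master)

    echo "merge commits: $(git -C local rev-list --merges --count master)"
    [ "$before" != "$after" ] && echo "local commit was rewritten"
    git -C local log --format=%s master
)
rebase_demo
```

The log at the end lists the local commit directly on top of the upstream one, with no merge commit in between.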

Rebasing is a powerful tool that will enable you to manipulate your branches, moving lines of commits from one location to another. The git-rebase man page has more examples.

Friday, 10 October 2008

Git: how remotes work

One of the difficult things for a git beginner to understand is how remote branches work.

Basically, as git is a distributed version control system, every developer has a full and independent repository. So, how can you pass changes around?

In the examples below, we'll consider a remote repository, that we'll call origin, and a local one (that we'll call local). The remote repository has one branch, called master, that has been cloned as origin/master on the local repository. Moreover, the local repository has one local branch, called master as well (but it doesn't need to be), which is set up to track changes that happen on origin/master. Note that, as origin/master is a remote branch, it cannot be checked out -- only master can.

The fetch operation (command git fetch) copies the latest commits from the master on origin to origin/master, and updates the HEAD of the origin/master branch:

git fetch origin
The circles on the diagram represent commits, and the arrows are the parent->child relationship between commits. The labels indicate the various HEADs (or branches). Note that a branch is nothing more than a label following the HEAD of a series of commits.

At the end of this operation, origin/master matches the master branch on the origin, but master on the local repository is still behind. We need to use the git merge command to make master point at the same commit as origin/master:
git merge origin/master
This kind of merge is called a fast-forward because no actual merging of changes is involved. No new commit is created; we have just moved a HEAD forward in history. And that's fast.
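
The fetch-then-fast-forward sequence can be tried out with the sketch below. The repositories are throwaway ones and all names (ff_demo, upstream, local, the file names) are made up. The --ff-only flag is an addition of mine: it asks git merge to refuse anything that is not a fast-forward, which makes the point of the demo explicit.

```shell
# Sketch: fetch two new commits, then fast-forward master onto them.
ff_demo() (
    set -e
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    cd "$(mktemp -d)"

    git init -q upstream
    git -C upstream symbolic-ref HEAD refs/heads/master
    echo one > upstream/one.txt
    git -C upstream add one.txt
    git -C upstream commit -q -m "first commit"
    git clone -q upstream local

    # Two new commits appear on the remote side
    for n in two three; do
        echo "$n" > "upstream/$n.txt"
        git -C upstream add "$n.txt"
        git -C upstream commit -q -m "commit $n"
    done

    git -C local fetch -q origin
    echo "after fetch, master is behind by: $(git -C local rev-list --count master..origin/master)"
    git -C local merge -q --ff-only origin/master
    echo "after merge, master is behind by: $(git -C local rev-list --count master..origin/master)"
    echo "merge commits: $(git -C local rev-list --merges --count master)"
)
ff_demo
```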

Now, what happens if you committed a change on your master on local? Nothing changes for the fetch; the two new commits are still created from origin's master so origin/master matches it exactly:
git fetch origin
However, on local, master and origin/master have bifurcated. To reunite them, you'll need to use git merge, which will create another commit, and make master point to it:
git merge origin/master
The new commit (in orange) is a merge commit: it has two parents. (If conflicts happen, git will ask you to resolve them.)

Ah, but now, your master has two more commits than the origin's master. And you surely want to share your changes with your fellow developers. That's where the git push command is useful:
git push origin

git push will start by copying your two commits to the origin, and ask it to update its master to point at the same location as yours. At the end of the operation, both commit trees should match exactly. Note that git push will refuse to push if your origin/master is not up to date.
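
The whole divergence, merge and push cycle can be replayed with the following sketch. It is only an illustration: push_demo, the seed, origin.git, local and other directories and the commit messages are all invented, and a second clone (other) stands in for a fellow developer.

```shell
# Sketch: diverge from origin, get refused, merge, then push.
push_demo() (
    set -e
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    cd "$(mktemp -d)"

    # Build a bare "origin" holding one commit, plus two clones of it
    git init -q seed
    git -C seed symbolic-ref HEAD refs/heads/master
    echo one > seed/one.txt
    git -C seed add one.txt
    git -C seed commit -q -m "first commit"
    git clone -q --bare seed origin.git
    git clone -q origin.git local
    git clone -q origin.git other

    # A fellow developer publishes a commit
    echo two > other/two.txt
    git -C other add two.txt
    git -C other commit -q -m "their work"
    git -C other push -q origin master

    # Meanwhile you commit on your own master
    echo three > local/three.txt
    git -C local add three.txt
    git -C local commit -q -m "your work"

    # origin has moved on, so this push is rejected
    if ! git -C local push -q origin master 2>/dev/null; then
        echo "push refused"
    fi

    # Fetch and merge: the two lines of history are reunited
    git -C local fetch -q origin
    git -C local merge -q --no-edit origin/master
    echo "merge commits: $(git -C local rev-list --merges --count master)"

    # Now the push goes through
    git -C local push -q origin master
    if [ "$(git -C local rev-parse master)" = "$(git --git-dir=origin.git rev-parse master)" ]; then
        echo "origin and local agree"
    fi
)
push_demo
```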

The "origin" argument to git fetch and git push is optional; if you don't specify one, git uses the remote configured for your current branch, falling back to origin.

Finally, a note: as fetch and merge are often done together, a git command combines both: git pull. It's smarter than the addition of the two commands, because it knows how to look up in your git config which remote branch is actually tracked by your current local branch, and merge from there -- so you don't even need to type the name of origin/master for the merge.
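
If you are curious about where that tracking information lives, the sketch below prints the two config keys that git clone sets up for master, then runs a bare git pull. As before, the repositories and the tracking_demo name are made up for the illustration.

```shell
# Sketch: show the tracking configuration, then let git pull use it.
tracking_demo() (
    set -e
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    cd "$(mktemp -d)"

    git init -q upstream
    git -C upstream symbolic-ref HEAD refs/heads/master
    echo one > upstream/one.txt
    git -C upstream add one.txt
    git -C upstream commit -q -m "first commit"
    git clone -q upstream local

    # What the clone recorded for the local master branch
    echo "remote: $(git -C local config branch.master.remote)"
    echo "merge: $(git -C local config branch.master.merge)"

    # A new commit upstream, picked up without naming origin or origin/master
    echo two > upstream/two.txt
    git -C upstream add two.txt
    git -C upstream commit -q -m "second commit"
    git -C local pull -q
    echo "behind by: $(git -C local rev-list --count master..origin/master)"
)
tracking_demo
```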

Next time, we'll speak about rebasing.

Monday, 6 October 2008

Git tip: rewind master, keep head in a branch

Imagine that you just committed something on your master branch, and suddenly realize that you'll have to work a bit more on it. Wouldn't it be great to have committed this last patch on a branch instead?

Git allows you to do this. You can rewind the master by one patch, while retaining the current HEAD in a new branch.

First, create a new branch (let's call it newbranch) that points at your HEAD:

git branch newbranch
Then, rewind the master by one commit:
git reset --hard HEAD^
That's all. You just moved your last commit onto its own branch. If you want to continue working on top of it, you just have to
git checkout newbranch
and start hacking.
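
To try the three commands safely, here is a sketch that replays them in a throwaway repository and prints where each branch ends up. The rewind_demo name, the repo directory and the commit messages are invented for the example.

```shell
# Sketch: keep the last commit on newbranch, rewind master by one.
rewind_demo() (
    set -e
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    cd "$(mktemp -d)"
    git init -q repo
    cd repo
    git symbolic-ref HEAD refs/heads/master

    for n in one two; do
        echo "$n" > "$n.txt"
        git add "$n.txt"
        git commit -q -m "commit $n"
    done

    git branch newbranch          # a new label on the current HEAD
    git reset -q --hard HEAD^     # master goes back one commit
    echo "master: $(git log -1 --format=%s master)"
    echo "newbranch: $(git log -1 --format=%s newbranch)"

    git checkout -q newbranch
    echo "on branch: $(git rev-parse --abbrev-ref HEAD)"
)
rewind_demo
```

Nothing is lost: the commit that master no longer points at is still reachable from newbranch.
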
(Now, this is a very entry-level git tip, but these kinds of examples, which demonstrate what makes git different from centralized version control systems, certainly have good pedagogical value. More to come when I have tuits.)

Tuesday, 23 September 2008

Back from the Italian Perl Workshop

I'm back from Pisa, where I attended the Italian Perl Workshop 2008 last week (kindly invited by the organizers).


This was a very enjoyable conference. Several talks were in English, but most of them were in Italian: however, it was still easy for me to understand what was going on, partly because Italian is not really far from French, partly because of my already acquired familiarity with this language, and finally because good slides always help. (I'm sure I would have had a lot more trouble understanding presentations in Dutch or in Spanish, for example.)


I presented two talks; the first one, on Thursday, was a bit improvised: it was for the first part a plea in favour of writing code that stylistically will avoid clobbering the history in a version control system; for a second part, a live demonstration of a tool I wrote for consulting the history of a git repository from within vim. (More on that tool later.) My second talk, on Friday, was a presentation of the main new features in perl 5.10 -- basically an updated and translated version of the talk I gave at the French Perl Workshop 2007 in Lyon. The fact that I had a cold did not help my voice and my throat was aching after both presentations. I'm not sure I want to hear the audio recordings that were made.


We stayed the whole week-end in Pisa afterwards, visiting the city. Have I mentioned how much I enjoy Italy? Each time I go there I wish I could stay longer; and if I had to choose a country to live in besides France, that would certainly be Italy. (Moreover, they have the best food in the world. And the wines are not bad either.)


I come back with a to-do list that includes starting to use Devel::NYTProf, which was presented by Tim Bunce; and looking at Matt Trout's Devel::Declare craziness. It seems that I have joined #moose on the perl IRC network, too. Those things just happen after conferences. And that's why you should go to conferences: they build the community much more than mailing lists and IRC channels. A great thanks to all conference organizers!


I didn't take many pictures of the IPW, but they're on flickr already. More sightseeing shots will follow.