Tuesday, 31 March 2009

Autovivification in Perl 5

On the Perl 5 Porters mailing list, Graham Barr points out an interesting bit of history.

The question under discussion is the autovivification of non-existent hash elements used in lvalue context. Usually they are autovivified (this is with 5.10.0):


$ perl -E '1 for $x{a}; say for keys %x'
a
$ perl -E 'map $_, $x{a}; say for keys %x'
a

But not when used as subroutine arguments:

$ perl -E 'sub f{} f($x{a}); say for keys %x; say "END"'
END

Graham says that sub args were special-cased during 5.005 development so that they are not autovivified, for reasons that are unclear.

One small inconsistency is that hash slices are autovivified in sub calls:

$ perl -E 'sub f{} f(@x{a}); say for keys %x'
a

We have no good overview of the current state of autovivification behaviour in Perl 5. It's a known bug that strictures can affect it, and I wonder whether taintedness can too. The current perltodo notes: "Make all autovivification consistent w.r.t LVALUE/RVALUE and strict/no strict."

A good first step towards consistency would be to produce a matrix of autovivification for a bunch of data types (array elements, hash elements, hash slices, tied or not) and a bunch of operators that take lvalues (map, grep, foreach, sub call, chained -> dereferences, etc.) with and without strictures/taint. Then, that could be turned into a regression test, and finally we could tweak the behaviour in 5.12 to make it more consistent.

Want to help?

1 comment:

Anonymous said...

I don't find the sub call thing an ambiguity; it's a situation where the argument is much more likely to be intended as an rvalue than as an lvalue.

for(), on the other hand, is used with lvalue intent often enough that it is best to treat it consistently as an lvalue context.

Note that in either case, you can quote the expression or prefix it with 0+ to produce an explicit rvalue. I remember controversy over whether scalar() should have a similar effect.