Perl Circus - Three Rings of Perl Tricks.

Hashes

From perldoc perlintro... A hash represents a set of key/value pairs:

my %fruit_color = ("apple", "red", "banana", "yellow");

You can use whitespace and the => operator to lay them out more nicely:

my %fruit_color = (
    apple  => "red",
    banana => "yellow",
);

To get at hash elements:

$fruit_color{"apple"};           # gives "red"

Create...

Create a hash from a list or an array

@arr = ("Canada", 1);
%hash = ("Albania", 355, "Bolivia", 591, @arr);

# improve readability
%hash = ("Albania"=>355, "Bolivia"=>591, @arr);
while (($k, $v) = each %hash) {print "$k=>$v, "};
Canada=>1, Albania=>355, Bolivia=>591,

Once you have a list (which may contain arrays, which are flattened into it), the promotion to a hash is automatic. Perl builds the hash for you, using the first list element as a key, the next as its value, and so on. Notice the use of the "=>" symbol which, in lists, is a synonym for the comma (it also quotes a bare word on its left, as in the perlintro example). It can be used to visually emphasize the matching pairs of keys and values. Beware the trap of assuming that the hash key/value pairs will be in any particular order (e.g. "Canada=>1" appears first in the printout, even though it would seem that it should be last). Unlike arrays, the order of hash entries is not predictable.
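
As a minimal sketch of the flattening described above (the variable names are only illustrative), the following builds the same hash and prints it with sorted keys so that the output is repeatable:

```perl
use strict;
use warnings;

my @arr  = ("Canada", 1);
# The array is flattened into the surrounding list before the hash is built.
my %hash = ("Albania" => 355, "Bolivia" => 591, @arr);

# Sorting the keys gives a repeatable order; the hash itself has none.
my $line = join ", ", map { "$_=>$hash{$_}" } sort keys %hash;
print "$line\n";    # Albania=>355, Bolivia=>591, Canada=>1
```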

Create a hash using arrays for keys and values

# use ranges, or even reversed ranges
@hash1{'a'..'d'} = (reverse 'A'..'D');

# or use arrays
@keys = qw(Albania Bolivia Canada);
@vals = (355, 591, 1);
@hash2{@keys} = @vals;
while (($k, $v) = each %hash2) {
    print "$k=>$v, ";
}
a=>D, b=>C, c=>B, d=>A, or Canada=>1, Albania=>355, Bolivia=>591,

This trick uses a "hash slice" (with the "@..{..}" notation) to specify a range or collection of keys in a hash. The hash, helpfully, is automagically created simply because we assigned values to it. Beware the trap of misreading the "@" symbol: it seems out of place next to otherwise hash-like syntax, but it marks the slice as a list of values, while the curly braces are what tell Perl that those values live in a hash.
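
Slices work for reading as well as writing. Here is a small sketch, assuming a hash named %code that holds the country codes used earlier (the Denmark and Egypt entries are added purely for illustration):

```perl
use strict;
use warnings;

my %code = (Albania => 355, Bolivia => 591, Canada => 1);

# Reading a slice: "@" because a list comes back, "{}" because %code is a hash.
my @some = @code{qw(Canada Albania)};
print "@some\n";    # 1 355

# Writing a slice: assign several keys in one statement.
@code{qw(Denmark Egypt)} = (45, 20);
my $count = keys %code;    # keys in scalar context gives the number of keys
print "$count\n";          # 5
```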

Create a hash from other hashes

%hash1 = ("Canada", 1);
%hash2 = ("Albania", 355, "Bolivia", 591);
%hash3 = (%hash1, %hash2);

while (($k, $v) = each %hash3) {
    print "$k=>$v, ";
}
Albania=>355, Canada=>1, Bolivia=>591,

Perl understands from the context you place a hash in that it should treat those hashes as if they were flat lists (this is called "list context"). The first two hashes are combined (by appending the second "list" onto the first) and then promoted back into a hash again. Beware the trap that appears when keys in the second hash overlap (i.e. are the same as) keys in the first. In these cases the values from the first hash will be summarily overwritten by those from the second -- see below for tricks that avoid this "lossy" outcome.
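
To see the overwrite happen, here is a short sketch with one overlapping key (the hash names %first, %second and %merged are illustrative):

```perl
use strict;
use warnings;

my %first  = (NY => 212, CT => 203);
my %second = (NJ => 914, NY => 516);

# Later pairs win: NY from %second replaces NY from %first.
my %merged = (%first, %second);

my $line = join ", ", map { "$_=>$merged{$_}" } sort keys %merged;
print "$line\n";    # CT=>203, NJ=>914, NY=>516
```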

Create a hash and preserve the add-order

use Tie::IxHash;
tie(%hash, 'Tie::IxHash');
$hash{'one'} = "un";
$hash{'two'} = "deux";
$hash{'three'} = "trois";
while (($k, $v) = each %hash) {
    print "$k=>$v, ";
}
one=>un, two=>deux, three=>trois,

Ordinarily Perl does not guarantee that the items in a hash come back in any predictable order. If order is important you can easily add a "keep-in-order" behavior to a particular hash by "tying" that hash to the Tie::IxHash module. Tie::IxHash does not ship with Perl, so it must first be installed from CPAN. Then you use the module and, once you tie your hash to it, the module automatically adds the desired behavior.
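
If installing a module is not an option, a different technique (not what Tie::IxHash does internally, just a plain-Perl alternative) is to keep a companion array of keys. The helper name "store" below is hypothetical:

```perl
use strict;
use warnings;

my (%hash, @order);

# Hypothetical helper: remember each key the first time it is stored.
sub store {
    my ($key, $value) = @_;
    push @order, $key unless exists $hash{$key};
    $hash{$key} = $value;
}

store(one   => "un");
store(two   => "deux");
store(three => "trois");

# Walk the companion array instead of calling keys or each.
my $line = join ", ", map { "$_=>$hash{$_}" } @order;
print "$line\n";    # one=>un, two=>deux, three=>trois
```

This sketch does not handle deletions; removing a key would also mean removing it from @order.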

Create a hash by combining other hashes, in a non-lossy way

%hash1 = ('NY', 212, 'CT', 203);
%hash2 = ('NJ', 914, 'NY', 516);

while (($key, $val) = each %hash2) {
    $hash1{$key} .= "," if (exists $hash1{$key});
    $hash1{$key} .= $val;
}
while (($k, $v) = each %hash1) {
    print "$k=>$v, ";
}
NJ=>914, CT=>203, NY=>212,516,

Like the trick in which we create a hash from other hashes, here we want to combine two hashes into one. In this case, however, we do not want to lose any values, even when key names overlap. Perl still does not allow duplicate keys in the same hash, so we concatenate each new value onto the existing old value (if there is one), separating them with commas in a string.
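
Getting the individual values back out of such a string is a job for split. A minimal sketch, assuming the combined hash already looks like the printout above:

```perl
use strict;
use warnings;

my %hash1 = (NJ => "914", CT => "203", NY => "212,516");

# split turns the comma-joined string back into a list of values.
my @ny = split /,/, $hash1{NY};
print scalar(@ny), " codes for NY: @ny\n";    # 2 codes for NY: 212 516
```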

Create a hash by combining other hashes, in a non-lossy way using references

%hash1 = ('NY', 212, 'CT', 203);
%hash2 = ('NJ', 914, 'NY', 516);

while (($key, $val) = each %hash1) {
    $hash1{$key} = [$val];
}

while (($key, $val) = each %hash2) {
    push @{$hash1{$key}}, $val;
}

while (($k, $v) = each %hash1) {
    print "$k=>@{$v}, ";
}
NJ=>914, CT=>203, NY=>212 516,

The trap in the last trick is that whatever character (like our comma) you use to separate multiple values in a single hash entry must never appear in the values themselves. For example, if a value already had a comma in the middle of it, you would have a lot of confusion when it came time to separate the combined values again. You could check each value first to be sure it didn't contain any commas, or, if you never intend to separate them again, you can just ignore the "problem". On the other hand, if you are willing to convert your hash values into array references you can neatly avoid the trap altogether. The "[ ]" wrapper turns the enclosed list into a reference to an anonymous array (so called because the actual array was never named) containing that list. Next we dereference each value back into an array (using the @{ } wrapper), just long enough to push the new value onto it. For a key that exists only in the second hash, such as NJ, Perl creates the anonymous array on the fly. Naturally you must remember that your hash values are all references to arrays now and will need to be dereferenced accordingly before you can use them.
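
Reading the combined values back is where the dereferencing comes in. A short sketch, assuming a hash %codes that already holds the array references built above:

```perl
use strict;
use warnings;

my %codes = (NY => [212, 516], CT => [203], NJ => [914]);

foreach my $state (sort keys %codes) {
    my @list  = @{ $codes{$state} };    # dereference into a real array
    my $count = scalar @list;
    print "$state has $count: ", join("/", @list), "\n";
}
# CT has 1: 203
# NJ has 1: 914
# NY has 2: 212/516

# A single element can be reached directly; the arrow between subscripts is optional.
my $first_ny = $codes{NY}[0];
```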

Create a hash and prepopulate all values

%salaries = ();
@employees = qw(Bob Sue Deb Jim);

# base salary for all employees
@salaries{@employees} = (10_000) x @employees;

# some get a 20% bonus
$salaries{Sue} *= 1.20;

foreach (@employees) {
    print "$_ \$$salaries{$_}\n";
}
Bob $10000
Sue $12000
Deb $10000
Jim $10000

We have a list of employees and we wish to create an associated list of salaries. The hash is initially empty, and the trick is to populate its keys and values in one line, using a hash slice along with the "x" operator. In line 5 of the code we use the "employees" array as the keys of our "@salaries" slice of the salaries hash. This gives us a way to specify that we want to assign something to the hash slot for every employee in the "employees" array. Now we can use the "x" operator to easily create a list that contains a base salary for each of these slots. A "foreach" loop would work too, but I like the way the "x" operator reads in code. We start with a list of one salary and "times" it by the length of the employees array (recall that in scalar context an array evaluates to its length). The parentheses around 10_000 matter: they make "x" repeat a list rather than a string. The result is that the right side of the assignment becomes a list of 10_000's, the same length as the employees array.
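
The difference the parentheses make is easy to miss, so here is a small sketch of both forms of "x", plus a map-based alternative for building the same hash (the variable names @list and $string are illustrative):

```perl
use strict;
use warnings;

my @employees = qw(Bob Sue Deb Jim);

# Parentheses on the left make "x" repeat a list rather than a string.
my @list   = (10_000) x @employees;    # (10000, 10000, 10000, 10000)
my $string = 10_000 x @employees;      # "10000100001000010000"

# An equivalent way to build the hash, using map instead of a slice.
my %salaries = map { $_ => 10_000 } @employees;

print scalar(@list), " ", length($string), " $salaries{Deb}\n";    # 4 20 10000
```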

Create a hash of hashes

my %contacts = (
    work => {
        Jack => 'jsmith@example.com',
        Jill => 'jgirl7@example.com',
    },
    family => {
        Lisa => 'lisrfr@example.com',
    },
);

foreach my $cat (sort keys %contacts) {
    print "$cat:\n";
    foreach my $name (sort keys %{$contacts{$cat}}) {
        print "\t$name\t$contacts{$cat}->{$name}\n";
    }
}
family:
	Lisa	lisrfr@example.com
work:
	Jack	jsmith@example.com
	Jill	jgirl7@example.com

Keeping your data stored as a hash makes sense when you have information associated with names, like addresses. You might like this so much you want to do it twice: associating category names with hashes of people, for example. There is no limit to how deeply you can nest hashes. The trick is to remember that the inner hashes are stored as references to hashes, so you must use the dereference operators. For example, we turn $contacts{$cat} back into a hash by writing %{$contacts{$cat}}. Then in the next line we use the arrow operator to get a member from a hash reference as well.
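
Adding to and testing a nested hash has a trap of its own: merely looking through a missing outer key can create it. The sketch below uses made-up entries ("friends", "school", Omar, Ann) purely for illustration:

```perl
use strict;
use warnings;

my %contacts = (
    work   => { Jack => 'jsmith@example.com' },
    family => { Lisa => 'lisrfr@example.com' },
);

# Assigning through a missing category creates the inner hash automatically.
$contacts{friends}{Omar} = 'omar@example.com';    # hypothetical entry

# Check the outer key first so the lookup does not create an empty category.
my $found = (exists $contacts{school} && exists $contacts{school}{Ann}) ? 1 : 0;

my @cats = sort keys %contacts;
print "@cats ($found)\n";    # family friends work (0)
```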

Select...

Select every element of a hash that matches a pattern

%hash1 = ('OH', 'Ohio', 'AK', 'Alaska', 'TX', 'Texas');

# if you only want the matching values...
@vals = grep {$_ =~ m/^T/} values %hash1;

# or create a second hash
@keys = grep {$hash1{$_} =~ m/^T/} keys %hash1;
@hash2{@keys} = @hash1{@keys};

while (($k, $v) = each %hash2) {
    print "$k=>$v, ";
}
TX=>Texas,

We use the grep function the same way we did in the Array section. The source list for grep comes from the keys function, which provides each value of $_; the regular expression is tested against the hash value stored under that key. The result is an array of the keys whose values match. This is used to assign a hash slice from the first hash to a slice of a second hash. Alternatively, if you are only interested in the values, you should (not surprisingly) use the values function with grep.
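
The same selection can also be written as a single expression by feeding grep into map. This is one possible sketch, not the only way to do it:

```perl
use strict;
use warnings;

my %hash1 = (OH => 'Ohio', AK => 'Alaska', TX => 'Texas');

# grep picks the keys whose value starts with "T"; map rebuilds key/value pairs.
my %hash2 = map  { $_ => $hash1{$_} }
            grep { $hash1{$_} =~ /^T/ } keys %hash1;

my $line = join ", ", map { "$_=>$hash2{$_}" } sort keys %hash2;
print "$line\n";    # TX=>Texas
```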

Modify...

Modify hash by removing elements with keys that match a pattern

%hash = ('OH', 'Ohio', 'AK', 'Alaska', 'TX', 'Texas');

@keys = grep {$_ =~ m/^T/} keys %hash;
foreach (@keys) { delete $hash{$_} }

while (($k, $v) = each %hash) {
    print "$k=>$v, ";
}
AK=>Alaska, OH=>Ohio,

This is a variation on the last trick that really just points out the usefulness of having an array of keys. By using foreach with such an array it is very easy to modify a collection of hash entries. Note that here the pattern is tested against the keys themselves, not the values.
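
Since delete also accepts a hash slice, the foreach loop can be folded into one statement. A brief sketch (the names @gone and $left are illustrative):

```perl
use strict;
use warnings;

my %hash = (OH => 'Ohio', AK => 'Alaska', TX => 'Texas');
my @keys = grep { /^T/ } keys %hash;

# delete on a hash slice removes every listed key and returns the removed values.
my @gone = delete @hash{@keys};

my $left = join ", ", sort keys %hash;
print "removed @gone; left $left\n";    # removed Texas; left AK, OH
```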

Modify every hash value

%hash = ('Tycho', 'Brache', 'Lev', 'Tolstoy');
foreach (keys %hash) {$hash{$_} = uc($hash{$_})};
while (($k, $v) = each %hash) {
    print "$k=>$v, ";
}
Lev=>TOLSTOY, Tycho=>BRACHE,

This trick uses the keys function to retrieve a list of all the keys in the hash. It is then a trivial matter to iterate through the list and access or modify each hash value. Okay, so some tricks really are easy!
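
A shorter variant relies on the fact that the elements returned by the values function are aliases to the stored values, so assigning to them changes the hash. A minimal sketch:

```perl
use strict;
use warnings;

my %hash = (Tycho => 'Brache', Lev => 'Tolstoy');

# Each element returned by values is an alias to the stored value.
$_ = uc($_) for values %hash;

my $line = join ", ", map { "$_=>$hash{$_}" } sort keys %hash;
print "$line\n";    # Lev=>TOLSTOY, Tycho=>BRACHE
```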

Modify a hash by swapping its keys and values

%hash1 = ('Tycho', 'Brache', 'Lev', 'Tolstoy');
%hash2 = reverse %hash1;
while (($k, $v) = each %hash2) {
    print "$k=>$v, ";
}
Brache=>Tycho, Tolstoy=>Lev,

Like any great trick this one looks easy. The first hash uses first names for its keys, while the second hash uses the last names. The magic is all in the reverse function. Perl treats the hash passed to reverse as a list, which is then completely stood on its head so that the last element becomes the first. Perl then converts this upside-down list back into a hash to complete the assignment. If you follow what's happening to each of the elements you see that the keys and values are, in effect, exchanged. Beware the trap of duplicate values: if two keys share a value, only one of them survives as the value in the new hash. You might also notice that the rehashed pairs have been reordered too, but this is not really important (or necessarily true) since Perl never guarantees its hash pairs to be in any predictable order, with the one exception that every value always follows its key.
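
When values can repeat, a plain reverse silently drops entries. The sketch below shows the loss and one way around it using array references; the "cherry" entry is made-up sample data added to force a duplicate:

```perl
use strict;
use warnings;

my %color = (apple => 'red', cherry => 'red', banana => 'yellow');

# Plain reverse keeps only one fruit per color.
my %by_color = reverse %color;
my $lossy    = keys %by_color;    # 2, not 3

# Collect every original key in an array reference instead.
my %fruits_of;
push @{ $fruits_of{ $color{$_} } }, $_ for sort keys %color;

print "$lossy colors; red: @{ $fruits_of{red} }\n";    # 2 colors; red: apple cherry
```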

Sort...

Sort a hash by its keys

%hash = ('Tycho', 'Brache', 'Lev', 'Tolstoy');
@keys = sort keys %hash;
foreach (@keys) {
    print "$_=>$hash{$_}, ";
}
Lev=>Tolstoy, Tycho=>Brache,

It isn't really correct to say that a hash is "sorted". In fact Perl keeps hashes in whatever order its internal hashing produces, and in recent versions of Perl that order can differ from one run of a program to the next. Hashes, unlike arrays, have no predictable order. The trick is to grab the keys off the hash (using Perl's keys function) and then sort them into a nice ordered array. Of course this has no effect on the hash, but it does allow us to iterate through the array and at least access the hash elements in order.

Sort a hash by its values

%hash = ('Tycho', 'Brache', 'Lev', 'Tolstoy');
@keys = sort { $hash{$a} cmp $hash{$b} } keys %hash;

foreach (@keys) {
    print "$_=>$hash{$_}, ";
}
Tycho=>Brache, Lev=>Tolstoy,

Remembering the caveats in the last trick, this is just a more explicit sort, using the string comparison operator "cmp" to compare the values of the hash. If you knew that the values were numbers you would use <=> to compare.
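
For numeric values the same pattern applies with <=> in place of cmp (with cmp, "10" would sort before "9"). Here is a sketch using the country codes from earlier, including a descending sort that falls back to the key name when two values are equal:

```perl
use strict;
use warnings;

my %code = (Albania => 355, Bolivia => 591, Canada => 1);

# Ascending numeric order of the values.
my @by_num = sort { $code{$a} <=> $code{$b} } keys %code;

# Descending order: swap $a and $b; break ties by comparing the keys as strings.
my @desc = sort { $code{$b} <=> $code{$a} or $a cmp $b } keys %code;

print "@by_num\n";    # Canada Albania Bolivia
print "@desc\n";      # Bolivia Albania Canada
```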