Published by Curtis Ross

Topic 5: Hashes
CSE2395/CSE3395 Perl Programming
Learning Perl 3rd edition chapter 5, pages 73-85
Programming Perl 3rd edition pages 76-78, 697-700, 703-704, 733-734
perldata manpage

Original Slides by Debbie Pickett, Modified by David Abramson, 2006, Copyright Monash University

In this topic
Hashes
  ► aka associative arrays
Hash variables
Functions which use hashes
Uses of hashes
Accessing Perl’s environment

Arrays
Arrays are
  ► ordered
  ► indexed by a number (integer)
  ► dense
    – if element n exists, so do elements 0 to n-1
[Diagram: @array with indices 0 to 5, holding mixed scalar values including 42, "dog", -0.2 and undef]

Arrays
Arrays aren’t always the best data structure
Imagine an array of students’ marks
  ► indexed by 8-digit student ID number
[Diagram: @marks holding 89, 43, undef and 70 at indices 12345678 to 12345681, with index 0 far to the left. Ten million empty elements in here!]

Arrays
Student ID numbers aren’t really numbers anyway
  ► can’t do arithmetic on them
  ► order of two student IDs not really important
  ► really just strings that happen to contain digits
Want some data structure where indices are strings
  ► usually called associative arrays
    – or dictionary
    – or (lookup) table
    – or hash table

Associative arrays
An associative array is an array where
  ► can locate an array element’s value given index
  ► indices are strings
  ► indices are unique
  ► indices are unordered
For example, to look up capital cities of countries
  Index:  Peru  Japan  UK      Russia  Canada  Egypt
  Value:  Lima  Tokyo  London  Moscow  Ottawa  Cairo
In Perl, associative arrays are called “hashes” (because they’re implemented using hash tables)

Hashes in Perl
Indices called keys
  ► strings
  ► must be unique
  ► e.g., country names
Contents called values
  ► any scalar
  ► may be duplicated
  ► e.g., capital city names
Can look up value given key, but not vice versa
  ► What’s the capital of Egypt? (easy)
  ► What country is Monrovia the capital of? (hard)
Unordered
  ► You can’t sort a hash!
  ► Perl stores elements in an order optimized for fast lookup
Llama3 pages 73-74; Camel3 pages 51, 76-77; perldata manpage
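
The "hard" direction (value to key) is still possible, just not built in. A minimal sketch, assuming a small made-up %capital hash and assuming every value is unique: either scan all keys, or build an inverted hash once with reverse. The name %country_of is invented for this illustration.

```perl
use strict;
use warnings;

my %capital = (Peru => "Lima", Japan => "Tokyo", Egypt => "Cairo");

# Slow way: look at every key until the value matches.
my $found;
foreach my $nation (keys %capital) {
    $found = $nation if $capital{$nation} eq "Tokyo";
}

# For repeated lookups: build an inverted hash once.
# In list context the hash flattens to (key, value, ...) and
# reverse turns that into (value, key, ...).
# This only works cleanly when no value appears twice.
my %country_of = reverse %capital;

print "Tokyo is the capital of $found\n";
print "Cairo is the capital of $country_of{Cairo}\n";
```

If two countries shared a capital, only one of them would survive in the inverted hash, which is why lookup by value is not offered directly.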

Hash elements
Hash key written inside { curly braces }
  ► contrast with normal arrays using [ square brackets ]
  ► $capital{"Egypt"} # Equal to "Cairo"
  ► $capital{$nation} # Depends on $nation
Can assign to a hash element
  ► overwrites the old value, if there was one
    – or creates a new element, if there wasn’t
  ► doesn’t change any other element
  ► $capital{"Australia"} = "Canberra";
Using nonexistent key returns undef
  ► $capital{"Atlantis"} # No such country
Llama3 pages 76-78; Camel3 page 67
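
The rules above can be tried out in a few lines. This is only a sketch with made-up data; the "Sydney" assignment is a deliberate mistake that is then overwritten.

```perl
use strict;
use warnings;

my %capital = (Egypt => "Cairo");
my $nation  = "Egypt";

my $via_variable = $capital{$nation};   # key taken from a variable

$capital{"Australia"} = "Sydney";       # no such key yet: creates it
$capital{"Australia"} = "Canberra";     # key exists: overwrites the value

# Reading a key that was never stored yields undef
# and does not add the key to the hash.
my $missing = $capital{"Atlantis"};

print "Capital of $nation is $via_variable\n";
print "Hash now has ", scalar(keys %capital), " keys\n";
```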

Testing hash elements
Can determine if hash key exists using exists function
  ► exists $capital{"Canada"} # True
  ► exists $capital{"Atlantis"} # False
Not same as using defined
  ► key can exist, but value can be undefined
  ► exists $capital{"Vatican City"} # True
  ► defined $capital{"Vatican City"} # False
Llama3 page 83; Camel3 pages 697-698, 710-711; perlfunc manpage
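
The three possible states of a key (absent, present with an undefined value, present with a defined value) can be told apart as follows. This is an illustrative sketch; the %status hash and its message strings are invented here.

```perl
use strict;
use warnings;

my %capital = (Canada => "Ottawa", "Vatican City" => undef);

my %status;
foreach my $name ("Canada", "Vatican City", "Atlantis") {
    if (!exists $capital{$name}) {
        $status{$name} = "no such key";
    }
    elsif (!defined $capital{$name}) {
        $status{$name} = "key exists, value undefined";
    }
    else {
        $status{$name} = "key exists, value is $capital{$name}";
    }
}

print "$_: $status{$_}\n" foreach sort keys %status;
```

Testing with exists first matters: defined alone cannot distinguish the first two cases.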

Deleting hash elements
To remove an entry from a hash, use delete function
  ► delete $capital{"Czechoslovakia"};
  ► exists will now return false for that key
To clear a hash, assign empty list to entire hash
  ► %capital = (); # World anarchy
Llama3 pages 76-77, 83-84; Camel3 pages 699-700; perlfunc manpage
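
A short sketch of both operations on made-up data. It also shows that delete returns the value it removed, which can be handy.

```perl
use strict;
use warnings;

my %capital = (Czechoslovakia => "Prague", Peru => "Lima");

# delete removes the key/value pair and returns the removed value.
my $removed = delete $capital{"Czechoslovakia"};
my $gone    = !exists $capital{"Czechoslovakia"};
my $left    = scalar keys %capital;

# Assigning the empty list removes every pair at once.
%capital = ();
my $after_clear = scalar keys %capital;

print "Removed $removed; $left key left, then $after_clear after clearing\n";
```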

Entire hashes
To refer to an entire hash, use %hash
  ► % instead of $
  ► no curly braces
Can copy hashes
  ► %clone = %hash;
Can initialize hash with many elements by assigning list to it
  ► for each element, write key followed by value
  ► order of key/value pairs not important
  ► %capital = ("Peru", "Lima", "Japan", "Tokyo", "UK", "London", "Russia", "Moscow", "Canada", "Ottawa", "Egypt", "Cairo");
Hashes flatten back into lists when used in list context
  ► e.g., when passed to a subroutine
Llama3 pages 78-79; Camel3 pages 76-78
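
A sketch of copying and flattening, using made-up data. The subroutine name count_args is invented for this example, and "Cusco" is just a stand-in value to show that the copy is independent of the original.

```perl
use strict;
use warnings;

my %capital = (Peru => "Lima", Japan => "Tokyo");

# Copying makes an independent hash: changing the copy
# leaves the original alone.
my %clone = %capital;
$clone{Peru} = "Cusco";

# In list context a hash flattens to key, value, key, value, ...
# Each key stays next to its value, but pair order is indeterminate.
my @flat = %capital;

# A subroutine just sees the flattened list in @_.
sub count_args { return scalar @_; }
my $n = count_args(%capital);

print "Original still says $capital{Peru}; sub received $n arguments\n";
```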

Hash elements
Hashes, subroutines, arrays and scalars occupy different namespaces
  ► %x, $x{... } refer to hash %x
  ► @x, $x[... ] refer to array @x
  ► &x, x(... ) refer to subroutine &x
  ► $x refers to scalar $x
Hash elements interpolate into double-quoted strings
  ► print "The capital of $nation is $capital{$nation}\n";
Entire hashes don’t interpolate at all.
  ► print "%capital"; # Prints "%capital"

Functions that use hashes
How do you print out the contents of a hash?
  ► need to know what keys a hash has
    – from each key, can get value with $hash{key}
keys function returns a list of all keys in a hash
  ► order is indeterminate, but stays the same for as long as the hash is not modified
  ► every key is unique
    – by definition of hash
  ► keys %capital # Returns list ("Canada", "UK", "Egypt", "Japan", "Peru", "Russia") (maybe)
values function returns a list of all values in a hash
  ► order is same as from keys function
  ► values may be duplicated
    – values may be any scalar
  ► values %capital # Returns list ("Ottawa", "London", "Cairo", "Tokyo", "Lima", "Moscow")
Llama3 pages 80-81; Camel3 pages 733-734, 824; perlfunc manpage
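
The claim that keys and values return their lists in matching order can be checked directly. A small sketch, assuming the hash is not modified between the two calls; the $mismatches counter is invented for this check.

```perl
use strict;
use warnings;

my %capital = (Peru => "Lima", Japan => "Tokyo", UK => "London",
               Russia => "Moscow", Canada => "Ottawa", Egypt => "Cairo");

my @nations = keys %capital;
my @cities  = values %capital;

# Position i in @cities should hold the value for position i in @nations.
my $mismatches = 0;
foreach my $i (0 .. $#nations) {
    $mismatches++ if $capital{ $nations[$i] } ne $cities[$i];
}

print "Checked ", scalar(@nations), " pairs, $mismatches mismatches\n";
```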

Timeout
    # Printing an entire hash using keys function.

    # Initialize the hash.
    # The => notation is just a pretty-looking
    # synonym for the , (comma) operator that also quotes
    # the word on the left side. Great for hashes.
    %capital = (Peru => "Lima", Japan => "Tokyo",
                UK => "London", Russia => "Moscow",
                Canada => "Ottawa", Egypt => "Cairo");

    # Iterate over the hash, once per nation.
    # Order is indeterminate.
    foreach $nation (keys %capital)
    {
        print "Capital of $nation is $capital{$nation}\n";
    }

Timeout
    # Printing an entire hash, sorted by country.

    # Initialize the hash.
    %capital = (Peru => "Lima", Japan => "Tokyo",
                UK => "London", Russia => "Moscow",
                Canada => "Ottawa", Egypt => "Cairo");

    # Iterate over the hash, once per nation.
    # Note that this isn't sorting the hash,
    # nor even iterating over the hash, but
    # iterating over a sorted list of the hash's keys.
    foreach $nation (sort keys %capital)
    {
        print "Capital of $nation is $capital{$nation}\n";
    }
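
The same idea extends to ordering by value: the keys are still what gets sorted, but the comparison looks up each key's value. A sketch only; the array name @by_city is made up.

```perl
use strict;
use warnings;

my %capital = (Peru => "Lima", Japan => "Tokyo", UK => "London",
               Russia => "Moscow", Canada => "Ottawa", Egypt => "Cairo");

# Sort the list of keys, comparing the values they map to.
# $a and $b are the two keys being compared by sort.
my @by_city = sort { $capital{$a} cmp $capital{$b} } keys %capital;

foreach my $nation (@by_city) {
    print "$capital{$nation} is the capital of $nation\n";
}
```

The hash itself is untouched; only the list of keys is reordered.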

Functions that use hashes
keys may return a very large list
  ► perhaps inefficient if you need only one hash element at a time
each function iterates over a hash
  ► one element at a time
  ► on first call, returns a two-element list containing one key/value pair
  ► subsequent calls return other key/value pairs
    – order indeterminate, but guaranteed not to repeat any pairs
  ► when all key/value pairs have been returned once, returns empty list
  ► state is kept by Perl with hidden attribute on hash variable
  ► much more space-efficient than using keys
  ► typical use
    – while (($key, $value) = each %hash) {... }
Llama3 pages 81-82; Camel3 pages 703-704; perlfunc manpage

Timeout
    # Printing an entire hash, using each function.

    # Initialize the hash.
    %capital = (Peru => "Lima", Japan => "Tokyo",
                UK => "London", Russia => "Moscow",
                Canada => "Ottawa", Egypt => "Cairo");

    # Iterate over the hash, once per nation.
    # No provision for sorting the output here,
    # because order returned by each function
    # is indeterminate.
    while (($nation, $city) = each %capital)
    {
        print "Capital of $nation is $city\n";
    }
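
Because each keeps its position as hidden state on the hash, a loop that starts after an earlier, unfinished iteration would skip pairs. A minimal sketch, relying on the perlfunc-documented behaviour that calling keys on the hash resets that iterator; the variable names are invented.

```perl
use strict;
use warnings;

my %capital = (Peru => "Lima", Japan => "Tokyo", UK => "London");

# Take one pair and stop: the hidden iterator is now part-way through.
my ($first_nation, $first_city) = each %capital;

# keys in scalar context returns the number of keys and,
# as a side effect, resets the iterator used by each.
my $size = keys %capital;

# This loop therefore sees every pair again.
my $visited = 0;
while (my ($nation, $city) = each %capital) {
    $visited++;
}

print "Hash has $size pairs, loop visited $visited\n";
```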

Uses of hashes
Hashes useful for
  ► implementing sparse arrays
  ► implementing lookup tables/databases
  ► counting strings
  ► removing duplicates from a list
  ► passing named parameters to subroutines
Llama3 pages 75-76

Hashes: sparse arrays
Normal arrays are dense
  ► creating $a[10000] creates @a[0..9999] too.
Hash keys are independent
  ► creating $h{"10000"} creates no other elements
    – only elements that exist need to take up memory
  ► just have to pretend that keys (really strings) are integers
    – like student ID numbers
  ► may have to write some code to fake “order” of elements
    – foreach $element (sort {$a <=> $b} keys %h)
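
A sketch of a hash standing in for a sparse array of marks, with made-up ID numbers. The 7-digit ID is included only to show why the numeric comparison <=> is needed: plain sort compares keys as strings.

```perl
use strict;
use warnings;

# Only four elements are stored, no matter how large the "indices" are.
my %marks = (12345678 => 89, 12345679 => 43, 12345681 => 70);
$marks{9999999} = 55;

# Default sort compares strings, so "12345678" sorts before "9999999".
my @string_order = sort keys %marks;

# Numeric comparison gives the order an array would have had.
my @numeric_order = sort { $a <=> $b } keys %marks;

foreach my $id (@numeric_order) {
    print "Student $id scored $marks{$id}\n";
}
```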

Hashes: lookup table
Using hash, can look up string (value) given string (key)
  ► look up the capital of a country
    – capital of Malaysia is Kuala Lumpur
  ► look up a word in a dictionary
    – definition of dog is “domestic canine”
  ► look up the IP address of machine
    – slashdot.org’s IP address is 66.35.250.150
  ► look up the value of a variable in an interpreter
    – value of variable x is 5
  ► look up the title of a book
    – book with ISBN 0-596-00027-8 is “Programming Perl”
  ► look up the real name of a student
    – student 11111111 is Bart Simpson
Any relationship where each key maps to exactly one value (and several keys may share a value) is perfect for a hash

Timeout
    # Using the program's environment

    # All processes have a set of names and values which
    # they inherit from their parents. These can be
    # set in the shell by typing NAME=VALUE.
    print "Your home directory is $ENV{'HOME'}\n";
    if ($ENV{'SHELL'} eq "/bin/csh")
    {
        # Commiserate with user.
        print "Your shell is csh. Yuck!\n";
    }
    print "Commands are looked for in these dirs:\n";
    print " $_\n" foreach (split /:/, $ENV{'PATH'}); # split: Topic 7
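
%ENV behaves like any other hash, so exists, delete and assignment all apply to it. A hedged sketch: GREETING is a hypothetical variable name chosen for this example, and the PATH handling assumes a Unix-style colon-separated value.

```perl
use strict;
use warnings;

# Assigning to %ENV sets the variable for this process
# (and for any child processes it starts later).
$ENV{GREETING} = "hello";
my $was_set = exists $ENV{GREETING};
my $value   = $ENV{GREETING};

# delete removes it again, just as with an ordinary hash.
delete $ENV{GREETING};
my $still_set = exists $ENV{GREETING};

# A variable may be missing altogether, so check before using it.
my $path = defined $ENV{PATH} ? $ENV{PATH} : "";
my @dirs = split /:/, $path;

print "GREETING was $value; PATH has ", scalar(@dirs), " entries\n";
```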

Hashes: counting strings
Use hash to count frequency of strings
  ► key is the string (“dog”)
  ► value (integer) is the count (has been seen 3 times so far)
  ► increment the value every time a key is read
Can be used to find intersection (common elements) between two arrays
  ► iterate over first array: count elements found
  ► iterate over second array: include element in result only if it was seen in the first array
  ► can compute union and difference similarly

Timeout
    # Counting strings.

    %seen = (); # Nothing has been seen so far.
    while (<>)  # Read lines from input.
    {
        chomp;
        # Increment the counter with line's text as key.
        $seen{$_}++;
        print "$_ has been seen $seen{$_} times so far\n";
    }

    # Final report.
    while (($line, $count) = each %seen)
    {
        print "$line was seen $count times overall\n";
    }
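
The final report above comes out in indeterminate order. A sketch of one way to rank it, most frequent first, using a fixed made-up word list in place of input so it can be run as is; @ranked is an invented name.

```perl
use strict;
use warnings;

my @words = qw(dog cat dog bird dog cat);

# Count occurrences: a missing key starts as undef, which ++ treats as 0.
my %seen;
$seen{$_}++ foreach @words;

# Highest count first; ties broken alphabetically.
my @ranked = sort { $seen{$b} <=> $seen{$a} or $a cmp $b } keys %seen;

foreach my $word (@ranked) {
    print "$word was seen $seen{$word} times overall\n";
}
```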

Timeout
    # Intersection of two arrays.

    %seen = ();
    @intersection = ();
    foreach (@one) # Iterate through first array.
    {
        # Remember which elements have been seen.
        $seen{$_} = 1; # Any true value will do.
    }
    foreach (@two) # Now iterate through second array.
    {
        # Only add to result if was seen in @one.
        push @intersection, $_ if $seen{$_};
    }
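
Union and difference follow the same pattern. A sketch with made-up arrays, assuming neither array contains duplicates; the names %in_one, %in_two and %in_either are invented for this illustration.

```perl
use strict;
use warnings;

my @one = qw(apple banana cherry);
my @two = qw(banana cherry date);

# Record membership of each array in its own hash.
my %in_one;
$in_one{$_} = 1 foreach @one;
my %in_two;
$in_two{$_} = 1 foreach @two;

# Union: merge both hashes; duplicate keys collapse into one.
my %in_either = (%in_one, %in_two);
my @union = sort keys %in_either;

# Difference: elements of @one that were never seen in @two.
my @difference = grep { !$in_two{$_} } @one;

print "Union: @union\n";
print "In one but not two: @difference\n";
```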

Hashes: removing duplicates
An extension of counting elements in a list
  ► if this is the first time element seen, include in result
  ► otherwise, skip this element

Timeout
    # Simple implementation of Unix sort and sort -u

    # Was -u (unique) switch given?
    if (@ARGV and $ARGV[0] eq "-u")
    {
        $unique = 1;
        shift; # Remove -u argument.
    }

    # Read all input lines and sort them.
    @result = sort <>;

    if ($unique)
    {
        # Filter out anything already seen.
        @result = grep { !$seen{$_}++ } @result;
    }

    print @result; # Output remaining lines.
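
The grep idiom does not depend on sorting: on an unsorted list it keeps the first occurrence of each element and preserves the original order. A sketch on a made-up word list.

```perl
use strict;
use warnings;

my @words = qw(dog cat dog bird cat dog);

# $seen{$_}++ returns the old count and then increments it.
# The old count is 0 (false) only the first time a word appears,
# so !$seen{$_}++ is true exactly once per distinct word.
my %seen;
my @unique = grep { !$seen{$_}++ } @words;

print "Unique words in original order: @unique\n";
```

As a side effect, %seen ends up holding the frequency of every word, which ties this back to the counting technique.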

Hashes: named parameters
Calling subroutines with many parameters is messy
  ► printformatted(56, '$', 8, 2, "decimal");
    – what did the 8 mean again?
  ► especially when some parameters are optional and have a reasonable default anyway
Can use hash to identify optional parameters and give them values
  ► printformatted(56, prefix => '$', format => "decimal", precision => 8, places => 2);
    – self-documenting code
    – order of parameters no longer matters
  ► printformatted(56, format => "hex");
    – only need to name the parameters with non-default values
  ► subroutines require a little code to handle this

Timeout
    # Map formats to printf percent-things.
    %format = (decimal => "d", hex => "x", octal => "o");

    # Print a number with a certain format.
    sub printformatted
    {
        my $number = shift;             # Value to print.
        my %param = (
            prefix    => "",            # Defaults.
            format    => "decimal",
            precision => "6",
            places    => "0",
            @_                          # Rest of sub params override defaults.
        );
        printf(                         # Build up printf format string.
            ($param{"prefix"} . "%" . $param{"precision"} . "." .
             $param{"places"} . $format{$param{"format"}}),
            $number);
    }
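
The defaults work because later pairs in a list assignment replace earlier pairs with the same key, so anything in @_ wins. A sketch of the same technique that returns the string instead of printing it, which makes the effect easy to inspect; format_number is a hypothetical name, not part of the original slides.

```perl
use strict;
use warnings;

my %format = (decimal => "d", hex => "x", octal => "o");

# Hypothetical variant of printformatted that returns the formatted text.
sub format_number {
    my $number = shift;
    my %param = (
        prefix    => "",          # defaults first ...
        format    => "decimal",
        precision => 6,
        places    => 0,
        @_,                       # ... caller's pairs last, so they override
    );
    my $template = $param{prefix} . "%" . $param{precision} . "."
                 . $param{places} . $format{ $param{format} };
    return sprintf($template, $number);
}

my $default = format_number(56);                                  # "%6.0d"
my $hex     = format_number(255, format => "hex", precision => 4); # "%4.0x"
my $money   = format_number(56, prefix => '$',
                            precision => 8, places => 2);         # '$%8.2d'

print "[$default] [$hex] [$money]\n";
```

Callers name only what they want to change, in any order, and everything else falls back to the defaults.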

Covered in this topic
Hashes
Hash variables
  ► $hash{key}, %hash
Functions which use hashes
  ► keys, values
  ► each
Uses of hashes
  ► data lookup
  ► sparse arrays
  ► counting elements in a list
  ► removing duplicates from a list
  ► accessing a process’ environment
  ► subroutines with optional parameters

Going further
Tying
  ► treat an external file (or any other object) like an internal hash (or any other type)
  ► Camel3 pages 363-398
Databases
  ► talking to databases with Perl
  ► Programming the Perl DBI by Alligator Descartes and Tim Bunce, O’Reilly 2000
Shells
  ► the Unix command-line interface
  ► man sh

Next topic
Regular expressions
  ► pattern matching
Llama3 chapters 7-9, pages 98-127
Camel3 pages 139-195
perlre manpage