

Practical Extraction and Report Language

(Larry Wall 1986)

Manish Sharma

| D- 4 Sector 59. NOIDA 201307. India | | Main: +91 120 4074000 | Fax: +91 120 9999999 |

Introduction to Perl

How to run Perl

Perl is an interpreted language: you run a script through the perl interpreter instead of compiling it to an executable first. There are two ways to do that:
Run the interpreter directly, giving the name of the Perl script as an argument.

Run the script as a Unix shell program. The first line of the program tells the shell where the Perl interpreter is located.
This line is called the shebang. The shebang MUST be the very first line in the code: #!/usr/bin/perl

One more step

If you choose to use the shebang, you must tell the OS that this is an executable file. Use chmod (see the intro to Unix slides). Usually you only need to give yourself execute permission. Once it's executable, type the filename at a prompt, and it runs. This is the preferred method of running Perl.

Perl Script Structure

A semicolon ends each simple statement. The semicolon is optional after the final statement of a block or loop
optional, but very much recommended

Functions and variables are case sensitive. Comments begin with # and extend to the end of the line
don't try to use // or /* */

Perl will try to figure out what you mean

It won't even give warnings unless you tell it to: either add use warnings; to the script, or put -w after the shebang or on the command line. This is VERY recommended.

Variable Declarations
In Perl, you do not need to declare your variables.
Unless you declare that you need to declare them

To force needed declarations:

use strict; use warnings;
(use strict; is what enforces declarations; use warnings; turns on the warnings recommended above)

Three (basic) types of variables.
Scalar Array Hash

There are others, but we'll talk about them at a later time.

Scalars

Scalar means a single value. In C/C++ there are many different kinds of scalars:
int, float, double, char, bool

In Perl, none of these types need to be declared. A scalar variable can hold all of these types, and more.

All scalar variables begin with a $
the next character is a letter or _
the remaining characters are letters, numbers, or _
Variable names can be between 1 and 251 characters in length
Ex: $foo, $a, $zebra1, $F87dr_df3
Wrong: $24da, $hi&bye, $bar$foo

Scalar Assignments
Scalars hold any data type: $foo = 3; $foo = 4.43; $foo = 'Z'; $foo = "Hello, I'm Paul.";

A list (aka list literal) is a sequence of scalar values, separated by commas and enclosed in parentheses. A list can hold any number or type of scalars:
(43, "Hello World", 3.1415)

Lists provide a way of assigning several scalars at once:

($a, $b, $c) = (42, "Foo bar", $size); $a → 42, $b → "Foo bar", $c → $size

A list can also be represented with ranges:

($a, $b, $c, $d, $e) = (1..4, 10); ($x, $y, $z) = ('a' .. 'c');

List assignments
Both sides of the = do not necessarily need to have the same number of elements.
($a, $b, $c) = (5, 10, 15, 20);
$a → 5, $b → 10, $c → 15 (20 is ignored)

($a, $b, $c) = (5, 10);
$a → 5, $b → 10, $c → undef

List assignment also swaps two values without a temporary variable:
($t1, $t2) = ($t2, $t1);
has the same effect as
$temp = $t1; $t1 = $t2; $t2 = $temp;
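
The list-assignment rules above can be tried out with a short script. This is only a sketch: the variable names are made up for illustration, and it uses my declarations (covered later in these slides) so that it runs under use strict.

```perl
use strict;
use warnings;

# Extra values on the right-hand side are silently dropped.
my ($x, $y, $z) = (5, 10, 15, 20);

# Missing values leave the remaining variables undefined.
my ($p, $q, $r) = (5, 10);

# Swap two scalars without a temporary variable.
my ($t1, $t2) = ("first", "second");
($t1, $t2) = ($t2, $t1);

print "x=$x y=$y z=$z\n";
print "r is ", (defined $r ? "defined" : "undef"), "\n";
print "t1=$t1 t2=$t2\n";
```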

Arrays are variables that hold a list
(analogous to difference between scalar variable and string literal)

much more dynamic than C/C++

no declaration of size, type can hold any kind of value, and multiple kinds of values

All array variables start with the @ character

@arr, @foo, @My_Array, @temp34

Array assignments
@foo = (1, 2, 3, 4); @bar = ("my", "name", "is", "Paul"); @temp = (34, 'z', "Hi!", 43.12);
Arrays are 0-indexed, just as in C/C++
$let = $temp[1]; # $let is now 'z'
NOTE: This is a *single value*, hence the $
$bar[2] = "was"; @bar is now ("my", "name", "was", "Paul");

Lists of Arrays
Arrays within LHS of list will eat remaining values on RHS:
($foo, @bar, $baz)=(1, 2, 3, 4, 5, 6); $foo=1; @bar=(2, 3, 4, 5, 6); $baz=undef;

Arrays within RHS flatten to a single array.

@a1 = (1, 2, 3); @a2 = (4, 5, 6); @a3 = (@a1, @a2); @a3 → (1, 2, 3, 4, 5, 6)

Array vs. Scalar

$foo = 3; @foo = (43.3, 10, 8, 5.12, "a");
$foo and @foo are *completely unrelated*. In fact, $foo has nothing to do with $foo[2].
"This may seem a bit weird, but that's okay, because it is weird."
Programming Perl, pg. 54

More about arrays

special variable for each array:
@foo = (6, 25, 43, 31);

$#foo → 3. Last index of @foo. $foo[$#foo] → 31;

This can be used to dynamically alter the size of an array:

$#foo = 5;
creates two undefined values on the end of @foo

$#foo = 2;
destroys all but the first three elements of @foo

Even more about arrays

Arrays can take a negative index as well. (since 0 is first, -1 is last, -2 is second-to-last, etc)
$foo[$#foo] and $foo[-1] always refer to same element

Slices: a piece of an array (or list) (or hash)

@bar = @foo[1..3]; @bar → (25, 43, 31)
@bar = @foo[0,2]; @bar → (6, 43)
@bar = @foo[1]; @bar → (25)

That last one is a one-element slice. You probably don't want that; use $foo[1] for a single element.
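
A minimal sketch of the array features above ($#, negative indices, slices), using the same @foo values as the slides. The other variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my @foo = (6, 25, 43, 31);

my $last_index = $#foo;          # index of the last element
my $last       = $foo[-1];       # same element as $foo[$#foo]

my @middle = @foo[1 .. 3];       # slice of three elements
my @picked = @foo[0, 2];         # slice of the first and third elements

$#foo = 1;                       # truncate @foo to its first two elements

print "last index was $last_index, last element was $last\n";
print "middle: @middle\n";
print "picked: @picked\n";
print "foo is now: @foo\n";
```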

Built-in Perl functions
split: split a string into a list of values
$BigString = "Hello,_I_am_Paul"; @strings = split '_', $BigString; @strings → ("Hello,", "I", "am", "Paul");

join: join a list/array of values together

$BigString = join ' ', @strings; $BigString → "Hello, I am Paul";

Hashes

(Somewhat) analogous to a hashtable data type.
More closely resembles the STL map

aka associative array, i.e., an array not indexed by numerical sequence: a list of keys and values.
Values can be any kind of scalar value, including mixed kinds within the same hash. Keys can be written as any scalar, but are stored as strings

All hash variables start with %. Use a hash to keep a list of corresponding values
TIP: Any time you feel the need to have two separate arrays, and do something with elements at corresponding positions in the arrays (but don't care where in the array the elements actually are), USE A HASH

Hash example
Want a list of short names for months: %months = ( Jan => "January", Feb => "February", Mar => "March", );

reference elements by *curly* brackets

Avoids confusion with array notation

$months{Jan} → "January";

More Hash Examples

Hash elements can be dynamically created (in fact, so can entire hashes)
$profs{"Perl"} = "Paul Lalli"; $profs{"Op Sys"} = "Robert Ingalls"; $profs{"CS1"} = "David Spooner";
%profs → ("Perl" => "Paul Lalli", "Op Sys" => "Robert Ingalls", "CS1" => "David Spooner");
Hashes will flatten into normal lists:
@p_arr = %profs; @p_arr → ("Perl", "Paul Lalli", "Op Sys", "Robert Ingalls", "CS1", "David Spooner")
(the pairs may come out in a different order)
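
The hash operations above can be sketched as follows. The Apr entry and the variable names $full and @flat are assumptions added for illustration.

```perl
use strict;
use warnings;

my %months = (Jan => "January", Feb => "February", Mar => "March");

$months{Apr} = "April";      # elements are created on assignment

my $full = $months{Feb};     # curly brackets, and a $ for a single value
my @flat = %months;          # key/value pairs, in no particular order

print "Feb is short for $full\n";
print "flattened list has ", scalar(@flat), " elements\n";
```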

Special Variables
See Chapter 2 of Camel for full list
$! last error received from the operating system
$, string used to separate items in a printed list
$_ default variable, used by several functions
%ENV environment variables
@INC directories Perl searches for include files
$0 name of currently running script
@ARGV command line arguments
$#ARGV index of the last command line argument (number of arguments minus one)

Very basic I/O

simple introduction to reading/writing from keyboard/terminal.

This will be just enough to allow us to do some examples, if necessary.

Output to terminal
the print statement.

Takes a list of arguments to print out Before the list of arguments, optionally specify a filehandle to which to print
If omitted, default to STDOUT Also have STDIN, STDERR

If the list of arguments is omitted, print whatever value is currently in variable $_

Output examples
Hello World program:
#!/usr/bin/env perl
print "Hello World\n";
As this is Perl, you can put the string in parentheses, but you don't need to (usually, because this is Perl).
more examples:
print "My name is $name\n";
print "Hi ", "what's ", "yours?\n";
print 5 + 3;
print ((4 * 4) . "\n");

One catch
Recall that print takes a list of arguments. By default, print outputs that list to the terminal one item right after another
@nums = (23, 42, 68); print @nums, "\n";

234268
To change the string printed between list items, set the $, variable:
$, = ", "; print @nums, "\n";

23, 42, 68,
(the separator is also printed between 68 and the final "\n")

Input from keyboard

read line operator: <>
aka angle operator, diamond operator Encloses file handle to read from. Defaults to STDIN, which is what we want.

$input = <>;
read one line from STDIN, and save in $input

@input = <>;
read all lines from STDIN, and save as array in @input

Our First Bit of Magic

The Camel will describe several Perl features as "magical". The <> operator is the first such feature.
If you pass the name of a file (or files) as command line arguments to your script, <> does not read from STDIN. Instead, it automatically opens the first file on the command line and reads from that file.
When the first file is exhausted, it opens and reads from the next file.
Only when no file names are left on the command line does <> read from STDIN.
If you want to read from STDIN before the files have been read, you must do it explicitly:
$line = <STDIN>;

Chop & Chomp

When reading in a line, the newline ("\n") is included.
Usually you don't want that.

chomp removes the newline from the end of a string. chop takes off the last character of a string, regardless of what it is.
Hence, chomp is safer.

chomp ($foo = <>);

Very common method of reading in one string from input.

chomp actually takes a list, and will chomp each element of that list: chomp (@s = ("foo\n", "bar\n", "baz\n")); @s → ("foo", "bar", "baz");
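
A small sketch of the difference between chomp and chop, using fixed strings instead of keyboard input so that it runs without interaction. The variable names are illustrative.

```perl
use strict;
use warnings;

my $line = "hello\n";
chomp $line;        # removes the trailing newline
chomp $line;        # nothing left to remove, string unchanged

my $word = "hello";
chop $word;         # removes the last character, whatever it is

my @lines = ("foo\n", "bar\n", "baz\n");
chomp @lines;       # chomps every element

print "[$line] [$word] [@lines]\n";
```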

Control Structures

semantically the same as C/C++; syntactically, slightly different.
if ($a > 0){ print "\$a is positive\n"; } elsif ($a == 0){ print "\$a equals 0\n"; } else { print "\$a is negative\n"; }
brackets are *required*!

unless

another way of writing if (!CONDITION) {}

analogous to the English meaning of "unless": unless (CONDITION) BLOCK

"do BLOCK unless CONDITION is true" means "do BLOCK if CONDITION is false"

can use elsif and else with unless as well

while/until loops
while is similar to C/C++ while (EXPR) BLOCK
While EXPR is true, do BLOCK

until (EXPR) BLOCK

Until EXPR is true, do BLOCK, i.e., while EXPR is false, do BLOCK. Another way of saying while (!EXPR) {}

again, brackets are *required*

do BLOCK

Executes all statements in the following block, and returns the value of the last statement executed. When modified by while or until, runs through the block once before checking the condition:
do { $i++; } while ($i < 10);
Note that Perl does not consider do to be an actual loop structure. This is important later on

for loops
Perl has 2 styles of for.

First kind is virtually identical to C/C++: for (INIT; TEST; INCREMENT) { }
for ($i = 0; $i < 10; $i++){ print "\$i = $i\n"; }
yes, the brackets are required.

foreach loops
Second kind of for loop in Perl
no equivalent in core C/C++ language

foreach VAR (LIST) {}
each member of LIST is assigned to VAR, and the loop body is executed
$sum = 0; foreach $value (@nums){ $sum += $value; }

More About for/foreach

for and foreach are actually synonyms
Anywhere you see for you can replace it with foreach and vice versa
Without changing ANYTHING ELSE

they can be used interchangeably. usually easier to read if conventions followed:

for ($i = 0; $i<10; $i++) {} foreach $item (@array) {}

but this is just as syntactically valid:

foreach ($i = 0; $i<10; $i++) {} for $i (@array) {}

Two More Things (about for)

foreach VAR (LIST) {}

while iterating through list, VAR becomes an *alias* to each member of LIST
Changes within loop to VAR affect LIST

if VAR omitted, $_ used @array = (1, 2, 3, 4, 5); foreach (@array) { $_ *= 2; }

@array now (2, 4, 6, 8, 10)

Reading it in English
Perl has a cute little feature that makes simple loop constructs more readable. If your if, unless, while, until, or foreach block contains only a single statement, you can put the condition at the end of the statement:
if ($a > 10) {print "\$a is $a\n";}
print "\$a is $a\n" if $a > 10;
Using this modifier method, brackets and parentheses are unneeded. This is syntactic sugar: whichever looks and feels right to you is the way to go.

Loop Control next, last, redo

last equivalent of C++ break
exit innermost loop

next (mostly) equivalent of C++ continue

begin next iteration of innermost loop

redo: no real equivalent in C++

restart the current iteration of the innermost loop, without evaluating the conditional

Recall that do is not a looping block. Hence, you cannot use these keywords in a do block (even if it's modified by while)

continue block
while, until, and foreach loops can have a continue block.
placed after the end of the loop, executed at the end of each iteration
executed even if the iteration is cut short via next (but not if the loop is exited via last or restarted via redo)
foreach $i (@array){ next if ($i % 2 != 0); $sum += $i; } continue { $nums++; }
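
The continue block above can be sketched as a complete script. The sample data and the counter name $seen are assumptions made for illustration.

```perl
use strict;
use warnings;

my @array = (1, 2, 3, 4, 5, 6);
my ($sum, $seen) = (0, 0);

foreach my $i (@array) {
    next if $i % 2 != 0;    # skip odd numbers...
    $sum += $i;
}
continue {
    $seen++;                # ...but still count every element visited
}

print "sum of evens = $sum, elements seen = $seen\n";
```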

Breaking Out of More Loops

next, last, redo operate on the innermost loop. Labels are needed to break out of nested loops:
TOP: while ($i < 10){
  MIDDLE: while ($j > 20) {
    BOTTOM: foreach (@array){
      if ($j % 2 != 0){ next MIDDLE; }
      if ($i * 3 < 10){ last TOP; }
    }
  }
}

goto

yes, it exists, and works as in any other language
LABEL: some code

goto LABEL;

Variable Interpolation, Backslash Interpolation

Sometimes called "substitution"
In Perl, "substitution" means something else

Interpolation = replacing a symbol/variable with its meaning/value within a string. Two kinds of interpolation: variable and backslash. Done *only* in double-quoted strings, not single-quoted strings.

Backslash interpolation
aka: character interpolation, character escapes, escape sequences. When any of these sequences are found inside a double-quoted string, they're interpolated. All escapes are listed on page 61 of Camel. Most common: \n, \t

Backslashes in Reverse
A backslash in a double-quoted string makes normal characters special.
makes n into a newline, t into tab, etc

Also makes special characters normal.

$, @, and \ are all special. If you want to use them in a double-quoted string, you must backslash them.
print "My address is bauerd@rpi...";
Error, Perl thinks @rpi is an array

print "My address is bauerd\@rpi...";

Prints correctly.

Translation Escapes
pg 61, table 2-2 of Camel
\u next character is uppercase
\l next character is lowercase
\U all characters until \E are uppercase
\L all characters until \E are lowercase
\Q all characters until \E are backslashed
\E end \U, \L, or \Q

Variable Interpolation
Variables found within "" are interpolated. '' strings are NOT searched for interpolation
$foo = "hello"; $bar = "$foo world"; $bar gets value: hello world
$bar2 = '$foo world'; $bar2 gets value: $foo world

Dont confuse the parser

perl looks in double-quoted strings for anything that looks like a variable. The parser stops only when it gets to a character that cannot be part of the variable name
$thing = "bucket"; print "I have two $things\n";

perl assumes you are printing a variable $things. Specify where the variable name ends with {}
print "I have two ${thing}s\n";

What can be interpolated?

Scalars, arrays, slices of arrays, slices of hash
NOT entire hashes

Arrays (and slices) will print out each member of the array separated by a space:
@array = (1, 3, 5, 7); print "The numbers are @array.\n";

output: The numbers are 1 3 5 7.

Change the separation sequence via the $" variable
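
The interpolation rules above can be sketched in one script. The variable names $double, $single, $spaced, and $dashed are illustrative, and the block uses my and local, which these slides only introduce later.

```perl
use strict;
use warnings;

my $thing = "bucket";
my @array = (1, 3, 5, 7);

my $double = "I have two ${thing}s";     # braces mark where the name ends
my $single = 'I have two ${thing}s';     # single quotes: no interpolation

my $spaced = "The numbers are @array.";
my $dashed;
{
    local $" = "-";                      # separator used for arrays inside "..."
    $dashed = "The numbers are @array.";
}

print "$double\n$single\n$spaced\n$dashed\n";
```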

Quote-like operators
You might not always want to specify a string by double quotes:
"He said, \"John said, \"blah\"\"\n". You would have to backslash all those quotes

Perl allows you to choose your own quoting delimiters, via the quote-like operators: q() and qq(). A string in a q() block is treated as a single-quoted string. A string in a qq() block is treated as a double-quoted string.

Choosing your own delimiter

Choose any non-alphanumeric character: /, !
print qq/Hi John\n/; $s = q!Foo Bar!;

If you choose a paren-like character (), [], {}, you must start the string with the left character and end it with the right.
print "I said \"Jon said \"take it\"\"\n";

Is equivalent to:
print qq(I said "Jon said "take it""\n);


Operators

Perl has MANY operators. Many operators have a numeric and a string version
remember that Perl will convert the variable type for you.

We will go through them in decreasing precedence.

++ and --: prefix and postfix work as they do in C/C++
$y = 5; $x = $y++;
$y → 6, $x → 5

$y = 5; $x = ++$y;
$y → 6, $x → 6

Incrementation Magic
++ is "magical". (-- is not)
if the value is purely numeric, it works as expected
if it is a string value, or was ever used as a string, magic happens:
"99"++ → 100, "a9"++ → "b0", "Az"++ → "Ba", "zz"++ → "aaa"

Try it, see what happens.

Even better
In addition to that magic, ++ will also automatically convert undef to numeric context, and then increment it.
#!/usr/bin/env perl -w
$a++; print "$a\n";

Prints 1 with no errors/warnings. undef is equivalent to 0 in numeric context

** Exponentiation.
works on floating points or integers: 2**3 → pow(2, 3) → "2 to the power of 3" → 8

NOTE: higher precedence than negation

-2**4 → -(2**4) → -16

Unary Operators
! logical negation
0, "0", "", (), and undef are all false; anything else is true

- arithmetic negation (if numeric)

if non-numeric, "negates" the string. ex: $foo = "-abc"; $bar = -$foo; $bar gets value "+abc";

~ bitwise negation

/ -- Division. Done in floating point.
% -- Modulus. Same as in C.
* -- Numeric multiplication
x -- String multiplication (aka repetition).
123 * 3 → 369
"123" x 3 → "123123123" (scalar context)
(123) x 3 → (123, 123, 123) (list context)

+ normal addition
- normal subtraction
. string concatenation
$var1 = "hello"; $var2 = "world"; $var3 = $var1 . $var2;
$var3 contains "helloworld"

$var3 = "$var1 $var2";

$var3 contains "hello world"
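
A short sketch of the string-oriented operators covered so far: repetition with x, concatenation with ., and the magical string increment. The names $line, @zeros, $joined, and $id are made up for this example.

```perl
use strict;
use warnings;

my $line  = "-" x 10;               # string repetition
my @zeros = (0) x 3;                # list repetition

my $var1 = "hello";
my $var2 = "world";
my $joined = $var1 . " " . $var2;   # same result as "$var1 $var2"

my $id = "a9";
$id++;                              # magical string increment

print "$line\n@zeros\n$joined\n$id\n";
```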

Shift operators
<< and >> work as in C.
Shift the bits in the left argument the number of places given by the right argument

1 << 4 → 16
0000 0001 (binary) << 4 → 0001 0000 (binary) → 16 (decimal)

32 >> 4 → 2
0010 0000 (binary) >> 4 → 0000 0010 (binary) → 2 (decimal)

Relational Operators

Numeric  String  Meaning
>        gt      Greater Than
>=       ge      Greater Than or Equal
<        lt      Less Than
<=       le      Less Than or Equal

Equality Operators

Numeric  String  Meaning
==       eq      Equal to
!=       ne      Not equal to
<=>      cmp     Comparison

About the comparison operator:
-1 if left < right
0 if left == right
1 if left > right

The danger of mixing contexts

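$s1 = "Foo Bar"; $s2 = "Hello World"; if ($s1 == $s2){print "Yes\n"; }
Prints Yes: == compares numerically, and both strings convert to 0.

$a = <>; #user enters 42
$b = <>; #user enters 42.00
if ($a eq $b) {print "Yes\n"; }
Prints nothing: eq compares as strings, and "42\n" is not the same string as "42.00\n".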
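
The effect of choosing the numeric or the string comparison can be checked with a sketch like the following. It uses fixed strings in place of keyboard input, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $first  = "42";
my $second = "42.00";

my $same_number = ($first == $second) ? "yes" : "no";   # compared as numbers
my $same_string = ($first eq $second) ? "yes" : "no";   # compared as strings

print "numerically equal: $same_number\n";
print "string equal:      $same_string\n";
```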

Bitwise Operators
& -- AND. | -- OR ^ -- XOR
& has higher precedence

if either value numeric:

convert to integer, bitwise comparison on integers

if both values strings:

bitwise comparison on corresponding bits from the two strings

Logical Operators
&& - AND || - OR
&& has higher precedence

operate in short-circuit evaluation

i.e., evaluate only what's needed. This creates the common Perl line: open (FILE, "file.txt") || die "Can't open file.txt";

return last value evaluated

Conditional Operator
?: -- Trinary operator, as in C. Like an if-else statement, but it's an expression
$a = $ok ? $b : $c; if $ok is true, $a = $b. if $ok is false, $a = $c

Assignment operators
=, **=, *=, /=, %=, x=, +=, -=, .=, &=, |=, ^=, <<=, >>=, &&=, ||= In all cases, all assignments of form TARGET OP= EXPR evaluate as: TARGET = TARGET OP EXPR

Comma Operator
Scalar context:
evaluate each list element, left to right. Throw away all but last value. $a = (fctn(), fctn2(), fctn3());
fctn() and fctn2() called, $a gets value of fctn3()

Array context:
list separator, as in array assignment @a = (fctn(), fctn2(), fctn3());
@a gets return values of ALL three functions

Logical and, or, not, xor

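Functionally equivalent to &&, ||, !
$xyz = $x || $y || $z;
$xyz = $x or $y or $z;

What's the difference? Precedence. and, or, not, and xor bind more loosely than any other operator, including =, so the second line parses as ($xyz = $x) or $y or $z; and $xyz always gets $x.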

Incomplete list
some operators were skipped over; we'll talk about them later.


Context

Every operation in Perl is done in a specific "context".
mode, manner, meaning

The return value of an operation can change depending on its context. Perl variables and functions are evaluated in whatever context Perl is expecting for that situation. Two *major* contexts: Scalar & List

Scalar Context
$x = fctn(); if (fctn() < 5) { }
Perl is expecting a scalar, so fctn() is evaluated in scalar context
This happens when you assign to a scalar variable, or use an operator or function that takes a scalar argument

Also, force scalar context by scalar keyword

$x = scalar fctn();

Scalar Sub-contexts
Scalar values can be evaluated in Boolean, String, or Numeric contexts
Boolean:
0, "0", "", and undef are all false; anything else is true

String: "hello world", "I have 4"
Numeric: 5, 3.4, -5

Perl will *automatically* convert to and from each of these contexts for you. You almost never need to concern yourself with them.

Automatic Conversions
If a number is used as a string, the conversion is straightforward.
853 becomes "853", -4.7 becomes "-4.7"

If a string is used as a number, Perl will convert the string based on the first character(s)
If the first character is numeric (i.e., a digit, period (decimal point), or hyphen (negative sign)), the converted number reads from the start to the first non-numeric character: "-534.4ab32" → -534.4
If the first character is non-numeric, the converted number is 0: "a4332.5" → 0

If a scalar is used in a conditional (if, while), it is treated as a boolean value

When does this happen?

$foo = 4; print "Enter a number"; $bar = <STDIN>; #do we need to chomp?
$sum = $foo + $bar;

Note that $bar is unaffected. It's used as a number in that one statement, assuming the input started with a numeric value. The method for checking input for numeric data involves Regular Expressions
don't worry about it now

List Context
@x = fctn(); @x = split (" ", fctn());
Assign to a list/array, or use in a function or operator that is expecting a list. There is no analogy to the scalar keyword for lists. If you use a scalar in any kind of list context, it is "promoted" to a list.
@array = 5; @array gets value: (5)

Context Fun
arrays evaluated in scalar context produce the size of that array
@x = (4, 8, 12); $sizex = @x; $sizex is assigned value 3.

print "@x has " . @x . " values.\n";

4 8 12 has 3 values.

@x = ('a', 'b', 'c');
$y = @x; # Scalar context
($z) = @x; # List context
$y → 3, $z → 'a'
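
A minimal sketch of how the same array yields different values depending on context. The names $count, $first, and $forced are illustrative.

```perl
use strict;
use warnings;

my @x = ('a', 'b', 'c');

my $count   = @x;            # scalar context: number of elements
my ($first) = @x;            # list context: first element
my $forced  = scalar(@x);    # scalar context forced explicitly

print "count=$count first=$first forced=$forced\n";
```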

undef

Any Perl variable which exists but is not defined has the default value undef
($a,$b,$c)=(15,20); # $c is undef

In string context, undef → ""
In numeric context, undef → 0
In boolean context, undef → false
In list context, undef → ()
i.e., an empty list

Command Line Arguments


Similar (yet different) to C/C++
in C/C++:
argv[] contains program name and arguments; argc contains number of arguments plus one

in Perl:
@ARGV contains list of arguments; $0 contains program name

myscript 15 4 hello
array @ARGV → (15, 4, "hello"); scalar $0 → "myscript"; scalar(@ARGV) → 3

myscript
@ARGV → ( ); $0 → "myscript"; scalar(@ARGV) → 0

Built-In Functions

For each function, its name and prototype have been given.
prototype = number and type of arguments

ARRAY means an actual named array (i.e., a variable starting with @)
LIST means any list of elements (i.e., a list literal or a named array)
HASH means a named hash variable (%)
other types will identify the purpose of a scalar value


push ARRAY, LIST

add values of LIST to end of ARRAY
push @array, 5;
adds 5 to end of @array

push @foo, (4, 3, 2);

adds 4, 3, and 2, to the end of @foo

@a = (1, 2, 3); @b = (10, 11, 12); push @a, @b;

@a now (1, 2, 3, 10, 11, 12)

pop ARRAY

remove and return last element of ARRAY
@array = (1, 5, 10, 20); $last = pop @array;
$last → 20, @array → (1, 5, 10)

@empty = (); $value = pop @empty;

$value → undef.

unshift ARRAY, LIST

Add elements of LIST to front of ARRAY unshift @array, 5;
adds 5 to front of @array

unshift @foo, (4, 3, 2);

adds 4, 3, and 2, to the front of @foo

@a = (1, 2, 3); @b = (10, 11, 12); unshift @a, @b;

@a now (10, 11, 12, 1, 2, 3)

shift ARRAY
remove and return first element of ARRAY
@array = (1, 5, 10, 20); $first = shift @array;
$first → 1, @array → (5, 10, 20);

@empty = (); $value = shift @empty;

$value → undef


splice ARRAY, OFFSET, LENGTH, LIST

functionality of push, pop, shift, unshift
(plus a little bit more)

remove LENGTH elements from ARRAY, starting at position OFFSET, and replace them with LIST.
In scalar context, return the last element removed
In list context, return all elements removed
@foo = (1 .. 10); @a = splice @foo, 4, 3, 'a' .. 'e';
@foo → (1, 2, 3, 4, 'a', 'b', 'c', 'd', 'e', 8, 9, 10)
@a → (5, 6, 7)

splice w/o some arguments

Omit LIST: remove elements, don't replace them
Omit LIST and LENGTH: remove all elements starting at OFFSET
Omit LIST, LENGTH, and OFFSET: clear the entire ARRAY as it's being read

splice equivalencies
splice ARRAY, OFFSET, LENGTH, LIST

push @a, ($x, $y);
splice (@a, @a, 0, $x, $y);

pop @a;
splice (@a, $#a); splice (@a, -1); # remove last element

shift @a;
splice (@a, 0, 1); # remove first element

unshift @a, ($x, $y);
splice (@a, 0, 0, $x, $y);

$a[$x] = $y;
splice (@a, $x, 1, $y); # replace the element at position $x
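
The splice behaviour and two of the equivalences above, sketched as a runnable script. @queue and @removed are hypothetical names introduced for this example.

```perl
use strict;
use warnings;

my @foo = (1 .. 10);

# Remove 3 elements starting at index 4 and put 'a' .. 'e' in their place.
my @removed = splice @foo, 4, 3, 'a' .. 'e';

my @queue = (1, 2, 3);
splice @queue, scalar(@queue), 0, 4, 5;   # same effect as push @queue, 4, 5
splice @queue, 0, 1;                      # same effect as shift @queue

print "removed: @removed\n";
print "foo:     @foo\n";
print "queue:   @queue\n";
```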

keys HASH; values HASH

keys return list of all keys from HASH
seemingly random order

values return list of all values from HASH

same random order as keys produces

Example hash:
%months = (Jan => "January", Feb => "February", Mar => "March", );

keys (%months) → ("Jan", "Feb", "Mar", )
values (%months) → ("January", "February", "March", )
NOT necessarily in that order.

length EXPR
return number of characters in EXPR
$a = "Hello\n"; $b = length $a;

$b → 6
Cannot be used to find the size of an array or hash


index STR, SUBSTR, OFFSET

Look for first occurrence of SUBSTR within STR (starting at OFFSET)
OFFSET defaults to 0 if omitted

Return first position within STR at which SUBSTR is found.

$a = index "Hello World\n", "o"; $b = index "Hello World\n", "o", $a+1;

$a → 4, $b → 7

Returns -1 if SUBSTR is not found. rindex returns the last position found

reverse LIST
in list context, return a list consisting of the elements of LIST, in opposite order
@foo = (1 .. 10); @bar = reverse @foo;
@foo → (1, 2, 3, 4, 5, 6, 7, 8, 9, 10), @bar → (10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

in scalar context, take LIST, concatenate all elements into a string, and return the reverse of that string
$rev = reverse @foo; $rev → "01987654321"

stat FILEHANDLE

Return a 13-element list containing statistics about the file named by FILEHANDLE
can also be used on a string containing a file name.

($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size, $atime, $mtime, $ctime, $blksize, $blocks) = stat $filename;
See Camel page 801 for full description
Common uses:
@info = stat $file; print "File size is $info[7], last access time is $info[8], last modified time is $info[9]\n";

sort LIST
returns LIST sorted in "ASCIIbetical" order.
undef comes before "", then sort by ASCII chart
does not affect the LIST that is passed in
note that by ASCII chart, "100" comes before "99"

@f = ("Banana", "Apple", "Carrot"); @sorted = sort @f;

@sorted → ("Apple", "Banana", "Carrot")

@f unmodified

@nums = (97 .. 102); @s_nums = sort @nums; @s_nums → (100, 101, 102, 97, 98, 99)

Advanced Sorting
You can tell sort how you want a list sorted
Write a small function describing the sort order. In this function, Perl will assign $a and $b to be two list elements.

If you want $a to come before $b in sort order, return -1. If you want $b first, return 1.
If the order of $a and $b doesn't matter, return 0

sub numeric_sort{ if ($a < $b) {return -1;} elsif ($a > $b) {return 1;} else {return 0;} }

Using Your Own Sort

Now that we have that function, use it in sort:
@nums = (4, 2, 9, 10, 14, 11); @sorted = sort numeric_sort @nums; @sorted → (2, 4, 9, 10, 11, 14);

Look at that function again: if ($a < $b) {return -1;} elsif ($a > $b) {return 1;} else {return 0;} This can be simplified quite a bit:
return ($a <=> $b);

Simplifying Further
We now have: sub by_number{ return ($a <=> $b); }

When the sort function is that simple, you don't even need to declare it:
@sorted = sort {$a <=> $b} @nums;
Excellent description of sorting in Llama chapter 15
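
To compare the default string sort with a numeric sort block, a sketch along these lines may help. The descending variant is an extra illustration that is not in the slides.

```perl
use strict;
use warnings;

my @nums = (97 .. 102);

my @as_strings = sort @nums;                 # default: string comparison
my @ascending  = sort { $a <=> $b } @nums;   # numeric, smallest first
my @descending = sort { $b <=> $a } @nums;   # numeric, largest first

print "@as_strings\n@ascending\n@descending\n";
```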


References

Analogous (somewhat) to pointers in C/C++
Far less messy, and definitely less dangerous

Assign the memory location of a variable to a scalar variable. Use the \ to create a reference:
@foo = (1, 2, 3); $foo_ref = \@foo;

$foo_ref now contains a reference to the array @foo;

Changes to @foo will affect the array referenced by $foo_ref

Once you have a reference, de-reference it using the appropriate variable symbol (@ for array, % for hash, etc) $foo_ref = \@foo; @new_array = @$foo_ref;
@new_array is a different array, which contains the same values of members that the array referenced by $foo_ref contained. Changes to @foo (or even $foo_ref) do NOT affect @new_array

Referencing Other Types

You can also reference other kinds of variables:
%hash = (Paul => 23, Justin => 22); $h_ref = \%hash;
$bar = "hello world\n"; $bar_ref = \$bar;

Anonymous References
A value need not be contained in a defined variable to create a reference. To create an anonymous array reference:
use square brackets, instead of parens

$a_ref = [20, 30, 50, "hi!!"]; @a = @$a_ref;

@a → (20, 30, 50, "hi!!");

For hash references, use curly brackets, instead of parens:
$h_ref = {sky => "blue", grass => "green"}; %h = %$h_ref;
%h → (sky => "blue", grass => "green");

There is more than one way to de-reference a specific element of a reference.

In fact, there are three:
$a_ref = ["Hi", "Hiya", "Hello"];
$$a_ref[2] = "Hello"; ${$a_ref}[2] = "Hello"; $a_ref->[2] = "Hello";
$$h_ref{$key} = $value; ${$h_ref}{$key} = $value; $h_ref->{$key} = $value;
These are all valid and acceptable. The form you choose is whatever looks the best to you.

Two-Dimensional Array
To create a two-dimensional array, create an array of array references:
@two_d = ([1, 2], [3, 4], [5, 6]);
$two_d[1] is a reference to an array containing (3, 4)
@{$two_d[1]} is an array containing (3, 4)
$two_d[1][0] is the scalar value 3.

More Complicated
Using similar methods, you can create arrays of hashes, hashes of arrays, hashes of hashes, arrays of arrays of hashes, hashes of hashes of arrays, arrays of hashes of arrays, ...
%letters = ( lower => ['a' .. 'z'], upper => ['A' .. 'Z'] );
$letters{lower} is an array reference;
@{$letters{lower}} is an array;
$letters{lower}[1] is the scalar value 'b'.
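
The nested structures above can be explored with a short script. The loop over rows and the names $cell, @row, $second, and $count are illustrative assumptions.

```perl
use strict;
use warnings;

my @two_d = ([1, 2], [3, 4], [5, 6]);
my %letters = (lower => ['a' .. 'z'], upper => ['A' .. 'Z']);

my $cell   = $two_d[1][0];             # scalar in row 1, column 0
my @row    = @{ $two_d[1] };           # copy of row 1
my $second = $letters{lower}[1];       # second lowercase letter
my $count  = @{ $letters{upper} };     # number of uppercase letters

# Walk the two-dimensional array row by row.
foreach my $row_ref (@two_d) {
    print join(" ", @$row_ref), "\n";
}
print "cell=$cell second=$second count=$count\n";
```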


Subroutines

aka: user-defined functions, methods, procedures, sub-procedures, etc etc etc. We'll just say Subroutines.
"Functions" generally means built-in functions

We'll attempt to start out most basic, and work our way up to complicated.

sub myfunc { print "Hey, I'm in a function!\n"; } myfunc( ); # call to subroutine

The Basics

Because the subroutine is already declared, the () are optional (i.e., you can just say myfunc; )
If you call the function before declaring it, the () are required

You can declare a subroutine without defining it (yet):

sub myfunc;
Make sure you define it eventually.
The actual name of the subroutine is &myfunc
the ampersand is not normally necessary to call it

Parameters

(aka arguments, inputs, etc.) You can call any subroutine with any number of parameters. The parameters get passed in via the local @_ variable.
sub myfunc{ foreach $word (@_){ print "$word "; } }
$foobar = 82; myfunc "hello", "world", $foobar;
prints "hello world 82 "


Passing current parameters

Can call a function with the current value of @_ as the parameter list by using &.
&myfunc; # myfunc's @_ is an alias to the current @_

same as saying myfunc(@_);

it's faster internally

Squashing array parameters

If arrays or hashes are passed into a subroutine, they get squashed into one flat array: @_
@a = (1, 2, 3); @b = (8, 9, 10); myfunc (@a, @b);

inside myfunc, @_ (1, 2, 3, 8, 9, 10); Maybe this is what you want.

if not, you need to use references

References in Parameters
To pass arrays (or hashes), and not squash them: sub myfunc{ ($ref1, $ref2) = @_; @x = @$ref1; @y = @$ref2; } @a = (1, 2, 3); @b = (8, 9, 10); myfunc (\@a, \@b);

Passing by reference
Within subroutine, changes to the array reference passed in affect the array that was referenced:
sub f1{ $ref1 = shift(@_); $ref2 = shift(@_); @a2 = @$ref2; for ($i=0; $i<@$ref1; $i++){ $$ref1[$i]++; } for ($i=0; $i<@a2; $i++){ $a2[$i]--; } } @foo=(1, 1, 1); @bar=(1, 1, 1); f1(\@foo, \@bar);

At this point,

@foo → (2, 2, 2), but @bar → (1, 1, 1)
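
The same pass-by-reference idea, sketched under use strict with lexical variables. The subroutine name bump and the variable @copy are hypothetical stand-ins for f1 and @a2.

```perl
use strict;
use warnings;

# Takes two array references; modifies the first array in place and
# only a private copy of the second.
sub bump {
    my ($ref1, $ref2) = @_;
    my @copy = @$ref2;
    $_++ for @$ref1;     # changes the caller's array
    $_-- for @copy;      # changes only the local copy
    return;
}

my @foo = (1, 1, 1);
my @bar = (1, 1, 1);
bump(\@foo, \@bar);

print "foo: @foo\nbar: @bar\n";
```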

Return values
In Perl, subroutines return the last expression evaluated.
sub count { ... $_[0] + $_[1]; }
$total = count(4, 5); $total → 9
Standard practice is to use the return keyword:
sub myfunc{ ... return $retval; }

Return issues
Can return values in list or scalar context.
sub toupper{ @params = @_; foreach (@params) {tr/a-z/A-Z/;} return @params; } @uppers = toupper ($word1, $word2); $upper = toupper($word1, $word2);

$upper gets size of @params

Anonymous functions
You can declare a subroutine without giving it a name. Store the return value of sub in a scalar variable
$subref = sub { print "Hello\n"; };

to call, de-reference the stored value:

&$subref;

works with parameters too..

&$subref($param1, $param2);

Up to now, we've used global variables exclusively. Perl has two ways of creating local variables
local and my

what you may think of as local (from C/C++) is actually achieved via my.

my creates a new variable scoped to the innermost block
The block may be a subroutine, loop, or bare { }

variables created with my are not accessible (or even visible) to anything outside the scope.
sub fctn{ my $x = shift(@_); } print $x; #ERROR!!!

lexical variables
Variables declared with my are called "lexical variables" or "lexicals". Not only are they not visible outside the block, they mask globals with the same name:
$foo = 10; { my $foo = 3; print $foo; #prints 3
} print $foo; #prints 10

Where's the scope

Subroutines declared within a lexical's scope have access to that lexical.
This is one way of implementing static variables in Perl.

    {
        my $num = 20;
        sub add_to_num { $num++ }
        sub print_num { print "num = $num\n"; }
    }
    add_to_num;
    print_num;     # prints num = 21
    print $num;    # the lexical $num is not visible here
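The static-variable idiom above can be sketched as a counter. The names next_id and current_id are hypothetical; the point is only that both subroutines share one lexical that nothing else can reach.

```perl
use strict;
use warnings;

{
    my $count = 0;    # visible only to the two subs declared in this block

    sub next_id    { return ++$count; }
    sub current_id { return $count; }
}

my $first  = next_id();    # 1
my $second = next_id();    # 2

print "issued $first and $second, current is ", current_id(), "\n";
```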


local does not create a new variable.
Instead, it assigns a temporary value to an existing (global) variable.
It has dynamic scope, rather than lexical scope.
Functions called from within the scope of a local variable get the temporary value.

    sub fctn { print "a = $a, b = $b\n"; };
    $a = 10;
    $b = 20;
    {
        local $a = 1;
        my $b = 2;
        fctn();
    }
    # prints a = 1, b = 20
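A runnable sketch of the same contrast under use strict. It assumes a package variable declared with our (here called $level, a hypothetical name) instead of $a and $b, which are special to sort.

```perl
use strict;
use warnings;

our $level = 'global';

sub report { return $level; }    # always reads the package variable

sub with_local {
    local $level = 'local';      # temporary value, seen by subs called from here
    return report();
}

sub with_my {
    my $level = 'lexical';       # new variable, invisible to report()
    return report();
}

my $seen_local = with_local();   # 'local'
my $seen_my    = with_my();      # 'global'
my $afterwards = report();       # 'global' again once with_local has returned

print "$seen_local, $seen_my, $afterwards\n";
```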

What to know about scope

my is statically (lexically) scoped
Look at the actual code. Whatever block encloses my is the scope of the variable

local is dynamically scoped

The scope is the enclosing block, plus any subroutines called from within that block

Almost always want my instead of local

Notable exception: you cannot create lexical versions of built-in variables such as $_. Only normal, alphanumeric variables can be lexical. For built-in variables, localize them.

Prototypes
Perl's way of letting you limit how you'll allow your subroutine to be called. When defining the function, give it the 'type' of variable you want it to take:
    sub f1($$) {}
f1 must take two scalars

    sub f2($@) {}
f2 takes a scalar, followed by a list

    sub f3(\@$) {}
f3 takes an actual array, followed by a scalar

Prototype conversions
    sub fctn($$) { }
    fctn(@foo, $bar)
Perl converts @foo to a scalar (i.e., takes its size), and passes that into the function.

    sub fctn2(\@$) {}
    fctn2(@foo, $bar)
Perl automatically creates a reference to @foo to pass as the first member of @_.

Prototype generalities
If prototype char is:    Perl expects:
    \$                   actual scalar variable
    \@                   actual array variable
    \%                   actual hash variable
    $                    any scalar value
    @                    array or list; eats rest of params and forces list context
    %                    hash or list; eats rest of params and forces list context
    *                    file handle
    &                    subroutine (name or ...

Getting around parameters

If you want to ignore the prototype, call the subroutine with the & character in front:
    sub myfunc(\$\$) { }
    myfunc(@array);     # ERROR!
    &myfunc(@array);    # No error here
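A short sketch of the conversions described in this section. The subroutine names pair and size_plus are hypothetical. Note that a prototype only takes effect when the subroutine is declared before the call is compiled.

```perl
use strict;
use warnings;

# ($$): both arguments are evaluated in scalar context.
sub pair($$) { return "$_[0],$_[1]"; }

# (\@$): the first argument must be a real array; Perl passes a reference to it.
sub size_plus(\@$) {
    my ($aref, $extra) = @_;
    return scalar(@$aref) + $extra;
}

my @items = (10, 20, 30);

my $pair   = pair(@items, 'x');       # @items becomes its size: '3,x'
my $size   = size_plus(@items, 1);    # 3 elements + 1 = 4
my $bypass = &pair(@items, 'x');      # & skips the prototype: '10,20'

print "$pair / $size / $bypass\n";
```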

Object-Oriented Perl

An introduction

Classes in Perl
A class is defined by storing the code which defines it in a separate file, and then use-ing that file.
The file must be named with the name of the class (starting with a capital letter), followed by the extension .pm
After the shebang in your 'main' file, add this line of code:
    use Classname;
You can then create instances of the class anywhere in your file.

Defining an Object
In Perl, an object is simply a reference containing the members of a class.
typically, a reference to a hash, but can be any kind of reference

The reference becomes an object when it is 'blessed', that is, when you tell Perl the reference belongs to a certain class.

Simple Example
    package Student;
    $obj = {Name => "Bob", ID => 123};
    bless($obj, "Student");
$obj is now an object of the class Student.

package Student is the first line of your .pm file. It identifies all following code as belonging to this class/package/module

Unlike C++, a constructor in Perl is simply another subroutine. It is typically named 'new', but you can give it any name you want.
    package Student;
    sub new {
        my $ref = {Name => "", ID => 0};
        bless($ref, "Student");
        return $ref;
    }
In this example, you don't actually have to give $ref any elements. You can define them all in a later subroutine, if you choose.

Calling the Constructor

As you may be able to guess, TMTOWTDI:
    $student = new Student;
    $student = Student->new;
    $student = Student::new("Student");
The first two methods get translated to the third method internally by perl. This has beneficial consequences.

Arguments to Constructor
(Actually, this applies to arguments to any class method.) Every time the constructor is called, the first argument to the function is the name of the class. Remaining arguments are caller-defined.
    $obj = new Student("Bob", 123);
    $obj = Student->new("Bob", 123);
    $obj = Student::new("Student", "Bob", 123);

So, when defining a constructor, you often see this:

    sub new {
        my $class = shift;
        my ($name, $ID) = @_;
        my $ref = {Name => $name, ID => $ID};
        bless($ref, $class);
        return $ref;
    }

More Methods
Within the .pm file, any subroutines you declare become methods of that class. For all object methods, the first argument is always the object that the method is being called on. This is also beneficial.
    sub name {
        my $ref = shift;
        my $name = shift;
        return $ref->{Name} unless defined $name;
        $ref->{Name} = $name;
    }
To call this method:
    $obj->name("Bob");
Perl translates this to:
    Student::name($obj, "Bob");

One more thing

You need to place the following statement at the end of your .pm file: 1; This is because the use keyword needs to take something that returns a true value. Perl returns the last statement evaluated.
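Putting the pieces together, a minimal sketch of the Student class might look like this. For the sake of a single runnable listing it assumes the package and the calling code live in one file, so no use Student; line is needed; in a separate Student.pm the file would end with 1; as described above.

```perl
use strict;
use warnings;

package Student;

# Constructor: the first argument is the class name, the rest are caller-defined.
sub new {
    my $class = shift;
    my ($name, $id) = @_;
    my $self = { Name => $name, ID => $id };
    return bless $self, $class;
}

# Combined getter/setter: the first argument is the object itself.
sub name {
    my $self = shift;
    $self->{Name} = shift if @_;
    return $self->{Name};
}

package main;

my $student = Student->new('Bob', 123);
my $before  = $student->name;      # 'Bob'
$student->name('Robert');
my $after   = $student->name;      # 'Robert'

print ref($student), ": $before -> $after\n";
```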

Be Kind to One Another

Note that class variables are not strictly 'private' in the C++ sense. There is nothing preventing the user of your class from modifying the data members directly, bypassing your interface functions. Perl's general philosophy is "If someone wants to shoot himself in the foot, who are you to stop him?" When using other people's classes, it's almost always a better idea to use the functions they've given you, and pretend you can't get at the internal data. There are, of course, methods you can use to prevent users from doing this.
Really not worth the trouble. Significantly beyond the scope of this tutorial.

Standard Modules
Perl distributions come with a significant number of pre-installed modules that you can use in your own programs. These files are found in system-specified directories. To find where the files are located on your system, examine the @INC array:
    print "@INC\n";
This gives a listing of all directories that are looked at when Perl finds a use statement.

Standard Module Example

We'll look at a pre-defined class that implements complex-number support in Perl. This module is defined in a subdirectory of the include path, called Math.
    use Math::Complex;
The constructor for this class is called make, not new.
    $z = Math::Complex->make(4,3);
Creates an instance of the class Complex.

$z can be thought to hold the value 4 + 3i

More Math::Complex
Math::Complex is also good enough to overload the basic mathematical operators to work with members of the complex class.
Overloading operators is beyond the scope of this lecture. It also defines a constant i = sqrt(-1);

    use Math::Complex;
    $z = Math::Complex->make(3,5);
    print "$z\n";        # prints 3+5i
    $z = $z + 4;
    print "$z\n";        # prints 7+5i
    $z = $z - 3*i;
    print "$z\n";        # prints 7+2i

If you examine the file, you'll see that the internal structure of the class is (at least partially) represented by an array reference called cartesian. Perl will let you modify this directly:
    $z = Math::Complex->make(3,5);
    ${$z->cartesian}[0] = 40;
Don't do that. Instead, use the functions provided by the class: Re() and Im()
    $z->Re(40);          # set real part of $z to 40
    $img = $z->Im();     # get imaginary part of $z
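A brief sketch that exercises the accessors instead of the internals. It relies only on make, the overloaded + and - operators, the exported constant i, and the Re and Im methods mentioned above.

```perl
use strict;
use warnings;
use Math::Complex;

my $z = Math::Complex->make(3, 5);   # 3 + 5i
$z = $z + 4;                         # 7 + 5i
$z = $z - 3 * i;                     # 7 + 2i

my $re = $z->Re;                     # read the real part
my $im = $z->Im;                     # read the imaginary part

$z->Re(40);                          # set the real part through the interface
my $new_re = $z->Re;

print "re=$re im=$im new_re=$new_re\n";
```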


Pragmatic modules
Some modules do not define classes, but rather affect how your perl script is compiled. These modules are called "pragmatic modules" or just "pragmas". By convention, their names should be all lowercase (whereas class modules start with a capital). You've already seen two of these pragmas: warnings and strict.

use and no
Certain pragmas take a list of options, declaring which policies of the pragma to import.
    use warnings qw(syntax numeric);

This will only give warnings that fall into the syntax or numeric categories. Pragmas can be turned off using no:
    use warnings;
    {
        no warnings 'digit';
        # something that would give warnings
    }

More Standard Pragmas

    use integer;
    print "10/3=", 10/3;        # prints 3
    {
        no integer;
        print "10/3=", 10/3;    # prints 3.333333333
    }
    use constant PI => 4 * atan2 1, 1;
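Because pragmas are lexically scoped, they can also be switched on for a single block. The following sketch turns the slide's example around (integer arithmetic only inside the braces); the variable names are assumptions.

```perl
use strict;
use warnings;
use constant PI => 4 * atan2(1, 1);

my $float_div = 10 / 3;      # ordinary floating-point division

my $int_div;
{
    use integer;             # in effect only until the closing brace
    $int_div = 10 / 3;       # 3
}

printf "%.4f %d %.5f\n", $float_div, $int_div, PI;
```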

Help on Standard Modules

For documentation for all these modules, you have several resources: Unix program perldoc
ex, perldoc Math::Complex


A Few More Functions

One more quoting operator

qw//
Takes a space-separated sequence of words, and returns a list of single-quoted words.
No interpolation is done.

    @animals = qw/cat dog bird mouse/;

@animals is ('cat', 'dog', 'bird', 'mouse'). As with q//, qq//, qx//, m//, and s///, you may choose any non-alphanumeric character for the delimiter.

map
Evaluates EXPRESSION (or BLOCK) for each value of LIST. Sets $_ to each value of LIST, much like a foreach loop. Returns a list of all results of the expression.

@words = map {split } @lines;

Set $_ to first member of @lines, run split (split acts on $_ if no arg provided), push results into @words, set $_ to next member of @lines, repeat.

More map examples

    @times = qw/morning afternoon night/;
    @greetings = map "Good $_\n", @times;

@greetings is ("Good morning\n", "Good afternoon\n", "Good night\n")

    @nums = (1..5);
    @doubles = map {$_ * 2} @nums;

@doubles is (2, 4, 6, 8, 10)

grep
Similar concept to map (and same syntax). Returns a list of all members of the original list for which the evaluation was true.
(map returns a list of all return values of each evaluation.)

Typically used to pick out lines you want to keep:

    @comments = grep {/^\s*#/} @all_lines;
Picks out all lines beginning with comments. Assigns $_ to each member of @all_lines, then evaluates the pattern match. If the pattern match is true, $_ is added to @comments.
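A small runnable sketch of grep and map working together. The sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: two comment lines and two code lines.
my @all_lines = ("# header\n", "print 1;\n", "  # note\n", "exit;\n");

my @comments = grep { /^\s*#/ }  @all_lines;    # keep lines that start with a comment
my @code     = grep { !/^\s*#/ } @all_lines;    # keep everything else
my @lengths  = map  { length($_) } @code;       # one result per remaining line

print scalar(@comments), " comment lines, code line lengths: @lengths\n";
```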

each
Pick out keys and values from a hash. In scalar context, just gets the next key:
    while ($key = each %hash) {...}
is equivalent to
    foreach $key (keys %hash) {...}
In list context, gets the next key and value of the hash:
    while (($key, $val) = each %hash) {...}
is equivalent to
    foreach $key (keys %hash) { $val = $hash{$key}; ... }
If your list contains more than 2 variables, the others get assigned undef.

glob EXPR
returns EXPR after it has passed through Unix filename expansion.

In Unix, ~ is a wild card that means "home directory of this user"

i.e., "my" home directory is ~user23/

Unix also uses * to mean 0 or more of any character, and ? to mean exactly one of any character.
This fails:
    opendir DIR, "~user23";
This doesn't:
    opendir DIR, glob("~user23");

glob returns
If the pattern expansion results in more than one directory/file, only the first one is returned in scalar context.
They're all returned in list context.

    @files = glob("*.pl");

Gets a list of all files with a .pl extension in the current directory.

File/Directory manipulation

Opening a File
To read from a file, you must use a file handle.
By convention, file handle names are all caps.

Open a file (for reading):
    open FILE, "myfile.txt";
Opens the file if it exists, else returns false.

Open a file for writing:
    open OUTFILE, ">output.txt";
Clobbers the file if it exists, else creates a new one.

Open a file for appending:
    open APPFILE, ">>append.txt";
Opens the file, positioned at the end, if it exists, else creates a new one.

Reading from a file

    open FILE, "myfile.txt";
    $one_line = <FILE>;
    @all_lines = <FILE>;
@all_lines gets all remaining lines from myfile.txt; each line of the file goes into one member of the array.

Remember chomp! Rewind a file by "seeking" to the beginning:

    seek(FILE, 0, 0);
See Camel for an explanation.

printing to a file
    open OUTFILE, ">output.txt";
    print OUTFILE "Hello World!\n";
This can be tedious if all outputs go to the same output file.
    select OUTFILE;
makes OUTFILE the default file handle for all print statements.

Close your files!

    open FILE, "myfile.txt";
    @all_lines = <FILE>;
    close FILE;

Opening another file to the same filehandle will implicitly close the first one.
Don't rely on this. It's not Good Programming Practice.
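The write, append, read, rewind and close steps can be sketched end to end. This example deliberately differs from the slides in two ways, both assumptions of this sketch: it uses the three-argument form of open with lexical file handles, and it writes to a temporary file created with File::Temp so that it does not touch real files.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed automatically at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

open(my $out, '>', $path) or die "cannot write $path: $!";    # clobber or create
print {$out} "first line\n";
close $out;

open(my $app, '>>', $path) or die "cannot append $path: $!";  # position at end
print {$app} "second line\n";
close $app;

open(my $in, '<', $path) or die "cannot read $path: $!";
my $one_line = <$in>;          # scalar context: one line
my @rest     = <$in>;          # list context: all remaining lines
seek($in, 0, 0);               # rewind to the beginning
my @all_lines = <$in>;
close $in;

chomp($one_line, @rest, @all_lines);
print "first: $one_line, rest: @rest, total: ", scalar(@all_lines), "\n";
```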

File Test Operators

Test to see if something is true about a file. Full list on page 98 of Camel:
exists, readable, zero size, directory, link, text file, etc.
    if (-e "myfile.txt") {
        print "file exists, now opening\n";
        open FILE, "myfile.txt";
    }

Can operate on a filename or an already existing filehandle.

Directory manipulation
Directories can be opened, read, created, and deleted, much like files. Take care when doing these operations: you're affecting your directory structure. The success of many of these functions will depend on what permissions you have on the specified directories.

open, read, close

    opendir DIR, "public_html";
    $nextfile = readdir DIR;
    @remaining_files = readdir DIR;
    closedir DIR;

    opendir DIR, ".";
    $firstfile = readdir DIR;
    $secondfile = readdir DIR;
    rewinddir DIR;
    @allfiles = readdir DIR;

Change, Create, and Delete

chdir: change working directory.
mkdir: create directory (like the unix call)
rmdir: remove directory (like the unix call)
Works if and only if the directory is empty.

    chdir "public_html";
    mkdir "images";
    rmdir "temp";

Don't do useless work

There's no reason to change to a directory to open a file in that directory. You can specify the path of the file in the open statement:
    open FILE, "public_html/index.html";
This opens the file, without going through the bother of changing your working directory twice.

Running external programs

back-ticks, system(), and pipes

system()
Takes a pathname, followed by a list of arguments. Executes the program specified by the pathname. Returns the return value of the command that was executed.
You may have to shift the result right 8 bits.

Backticks
Third kind of quoting operator: enclose a pathname in backticks. Executes the command specified by the pathname. Returns the output of that command:
a single string in scalar context, one line per element in list context.

    @files = `ls -al`;
@files contains one line of ls output per element.

more on backticks
Just like double quotes, backticks are subject to interpolation.
$cmd =; $output = `$cmd h`;

like q// for single quotes, and qq// for double quotes, backticks can be represented by qx//
same delimiter rules apply

Pipes
The open function can link a filehandle to a running process, instead of a file. To write to a process, prepend the pathname with |. To read from a process, append | to the pathname.
    open MAIL, "| /usr/lib/sendmail";
    open LS, "ls -al |";

Piping issues
Why pipe instead of using ` `?
with pipes, you can read one line of output at a time, and terminate process at any time, using close()

Opening a process for bi-directional communication is more complicated.

Involves using the IPC module.
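A hedged sketch of system() and a read pipe. To stay independent of which commands are installed, it runs the current perl interpreter itself (available as $^X) as the child program, and it uses the list forms of system and open, which avoid the shell; both choices are assumptions of this example rather than what the slides show.

```perl
use strict;
use warnings;

# system() returns the wait status; the exit code is in the high byte.
my $status = system($^X, '-e', 'exit 3');
my $exit   = $status >> 8;                 # 3

# Read from a child process one line at a time through a pipe.
open(my $pipe, '-|', $^X, '-e', 'print "one\ntwo\n"')
    or die "cannot start child: $!";
my $first = <$pipe>;                       # only the first line is read
close $pipe;                               # we may stop reading at any time
chomp $first;

print "exit code $exit, first line '$first'\n";
```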

Regular Expressions

What are regular expressions?

A means of searching, matching, and replacing substrings within strings.
Very powerful
(Potentially) Very confusing
Fundamental to Perl
Something C/C++ can't even begin to accomplish correctly

Getting started
Matching: STRING =~ m/PATTERN/;
Searches for PATTERN within STRING. If found, return true. If not, return false. (in scalar context)

Substituting/Replacing/Search-and-replace: STRING =~ s/PATTERN/REPLACEMENT/;

Searches for PATTERN within STRING. If found, replaces PATTERN with REPLACEMENT, and returns the number of times matched. If not, leaves STRING as it was, and returns false.

*Most* characters match themselves. They 'behave' (according to our text).
    if ($string =~ m/foo/) {
        print "$string contains 'foo'\n";
    }

some characters misbehave. They affect how other characters are treated: \ | ( ) [ { ^ $ * + ? .
To match any of these, precede them with a backslash:

    if ($string =~ m/\+/) {
        print "$string contains a plus sign\n";
    }

The same rules apply to the PATTERN, but not the REPLACEMENT.
No need to backslash the dirty dozen in the replacement. Except you must backslash the / no matter what, since it's the RegExp's delimiter.

    $greeting =~ s/hello/goodbye/;
    $sentence =~ s/\?/./;
    $path =~ s/\\/\//;

Leaning Toothpicks
That last example looks pretty bad.
    s/\\/\//;

This can sometimes get even worse:
    s/\/foo\/bar\//\\foo\\bar\\/;

This is known as "Leaning toothpick syndrome". Perl has a way around this: instead of /, use any non-alphanumeric, non-whitespace delimiters, just as you can with q() and qq()
    s#/foo/bar/#\\foo\\bar\\#;

No more toothpicks
Recall that any non-alphanumeric, non-whitespace characters can be used as delimiters. If you choose brackets, braces, or parens:
close each part; you can choose different delimiters for the second part.
    s(egg)<larva>;

If you do use /, you can omit the 'm' (but not the 's'):
    $string =~ /found/;
    $sub =~ /hi/bye/;    # WRONG!!

Binding and Negative Binding

=~ is the 'binding' operator. Usually read "matches" or "contains".
    $foo =~ /hello/
"Dollar foo contains hello"

!~ is the negative binding operator. Read "doesn't match" or "doesn't contain".

    $foo !~ /hello/
"Dollar foo doesn't contain hello"; equivalent to !($foo =~ /hello/)

No binding
If no string is given to bind to (either via =~ or !~), the match or substitution is carried out on $_
    if (/foo/) {
        print "$_ contains the string foo";
        print "\n";
    }

Variable interpolation is done inside the pattern match/replace, just as in a double-quoted string
UNLESS you choose single quotes for your delimiters.

    $foo1 = "hello";
    $foo2 = "goodbye";
    $bar =~ s/$foo1/$foo2/;    # same as $bar =~ s/hello/goodbye/;
    $a = "hi";
    $b = "bye";
    $c =~ s'$a'$b';            # this does NOT interpolate. Will literally search for '$a' in $c and replace it with '$b'

Saving your matches

Parts of your matched substring can be automatically saved for you. Group the part you want to save in parentheses; matches are saved in $1, $2, $3, ...
    if ($string =~ /(Name)=(Paul)/) {
        print "First = $1, Second = $2";
        print "\n";
    }
Prints First = Name, Second = Paul. If the match fails, $1, $2, etc. are unchanged.
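A small sketch of capturing with a slightly more general pattern than the literal one above. The sample string and variable names are assumptions.

```perl
use strict;
use warnings;

my $string = 'Name=Paul';
my ($key, $value);

# Copy $1 and $2 right after a successful match, before another match can change them.
if ($string =~ /(\w+)=(\w+)/) {
    ($key, $value) = ($1, $2);
}

print "First = $key, Second = $value\n";   # First = Name, Second = Paul
```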

Now we're ready

Up to this point, no real regular expressions
pattern matching only

Now we get to the heart of the beast recall 12 misbehaving characters:

\ | ( ) [ { ^ $ * + ? .

Each one has specific meaning inside of regular expressions.

We've already seen 3 of them.

Alternation
Simply: "or". Use the vertical bar: |
Similar (logically) to the || operator.

    $string =~ /(Paul|Justin)/
Search $string for "Paul" or for "Justin"; return the first one found in $1.

    /Name=(Robert(o|a))/
Search $_ for "Name=Roberto" or "Name=Roberta"; return either Roberto or Roberta in $1 (also returns either o or a in $2).

Capturing and Clustering

We've already seen examples of this, but let's spell it out: anything within the match enclosed in parentheses is returned ('captured') in the numerical variables $1, $2, $3, ... Order is read left-to-right by *opening* parenthesis.
    /((foo)=(name))/
$1 is "foo=name", $2 is "foo", $3 is "name".

Parentheses are also used to 'cluster' parts of the match together.
Similar to the function of parens in mathematics.

    /prob|n|r|l|ate/
Matches "prob" or "n" or "r" or "l" or "ate".

    /pro(b|n|r|l)ate/
Matches "pro", followed by one of b, n, r or l, followed by "ate". Matches "probate" or "pronate" or "prorate" or "prolate". $1 equals b, n, r or l if the match occurs.

Clustering without Capturing

For whatever reason, you might not want to capture the matches, only cluster something together with parens. use (?: ) instead of plain ( ) in previous example: /pro(?:b|n|r|l)ate/
matches probate or pronate or prorate or prolate this time, $1 does not get value of b, n, r, or l

Beginnings of strings
^ matches the beginning of a string.
    $string = "Hi Bob. How goes it?";
    $string2 = "Bob, how are you?\n";
    $string =~ /^Bob/;     # returns false
    $string2 =~ /^Bob/;    # returns true

Ends of Strings
$ matches the end of a string.
    $s1 = "Go home";
    $s2 = "Your home awaits";
    $s1 =~ /home$/;    # returns true
    $s2 =~ /home$/;    # returns false

$ does not consider a terminating newline:
    "foo bar\n" =~ /bar$/;    # returns true

*Some* meta-characters
For the complete list, see pg 161 of Camel.

\d any digit: 0-9
\D any non-digit
\w any 'word' character: a-z, A-Z, 0-9, _
\W any 'non-word' character
\s any whitespace: " ", "\n", "\t"
\S any non-whitespace character
\b a word boundary
This is 'zero-length'. It's simply "true" when at the boundary of a word, but doesn't match any actual characters.
\B true when not at a word boundary

The . Wildcard
A single period matches 'any character',
except the new line.

    /filename\..../
Matches filename.txt, filename.doc, filename.exe, or any other 3 character extension.

Quantifiers
How many of the previous characters to match:
*      0 or more
+      1 or more
?      0 or 1
{N}    exactly N times
{N,}   at least N times
{N,M}  between N and M times

Quantifier examples
/a*/ matches 0 or more letter a's
Matches "a", "aa", "aaa", "", "bb"

/((?:foo)+)/ matches 1 or more "foo", and saves them all in $1
Matches "foob", "foobfoob", "bfoofoofoo"

/o{2}/ matches 2 letter o's
Matches "foo", "foooooo"

/(b{3,5})/ matches 3, 4, or 5 letter b's, and saves what it matched in $1
Matches "bbb", "abbbba", "abbbbbba"

All quantifiers are 'greedy' by nature. They match as much as they possibly can. They can be made non-greedy by adding a ? at the end of the quantifier.
    $string = "hello there!";
    $string =~ /e(.*)e/;
$1 gets "llo ther"

    $string =~ /e(.*?)e/;
$1 gets "llo th"
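The greedy and non-greedy results above can be checked with a short sketch. Assigning the match to a parenthesised variable is a shorthand (an assumption of this example) for reading $1 after the match.

```perl
use strict;
use warnings;

my $string = 'hello there!';

my ($greedy) = $string =~ /e(.*)e/;     # .* runs to the last possible 'e'
my ($lazy)   = $string =~ /e(.*?)e/;    # .*? stops at the first possible 'e'

print "greedy: '$greedy'\n";   # greedy: 'llo ther'
print "lazy:   '$lazy'\n";     # lazy:   'llo th'
```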

Character classes
Use [ ] to match characters that have a certain property.
Can be either a list of specific characters, or a range.

    /[aeiou]/
Search $_ for a vowel.

    /[a-mA-M]/
Search $_ for any character in the 1st half of the alphabet, in either case.

    /[0-9a-fA-F]/
Search $_ for any 'hex' digit.

Use ^ at the very beginning of your character class to negate it:
    /[^aeiou]/

Search $_ for any non-vowel. Careful! This matches consonants, numbers, whitespace, and non-alphanumerics too!

Character class catches

. wildcard loses its specialness in a character class

/[\w\s.]/ Search $_ for a word character, a whitespace, or a dot

to search for ] or ^, make sure you backslash them in a character class

More Regular Expressions

List vs. Scalar Context for m//

We said that m// returns 'true' or 'false' in scalar context (really, 1 or 0). In list context, it returns a list of all matches enclosed in the capturing parentheses.
i.e.: $1, $2, $3, etc. are still set

If there are no capturing parentheses, returns (1). If m// doesn't match, returns ().

Modifiers
Following the final delimiter, you can place one or more special characters. Each one modifies the regular expression and/or the matching operator. Full list of modifiers on pages 150 (for m//) and 153 (for s///) of Camel.

/i Modifier
/i: case insensitive matching. Ordinarily, m/hello/ would not match "Hello". However, this match *does* work:
    print "Yes!" if "Hello" =~ m/hello/i;

Works for both m// and s///

/s Modifier
/s: treat string as a single line. Ordinarily, the . wildcard matches any character except the newline. If the /s modifier is provided, Perl will treat your RegExp as a single line, and therefore the . wildcard will match \n characters as well. Also works for both m// and s///
    "Foo\nbar\nbaz" =~ m/F(.*)z/;
Match fails

    "Foo\nbar\nbaz" =~ m/F(.*)z/s;
Match succeeds: $1 is "oo\nbar\nba"

/m Modifier
/m: treat string as containing multiple lines. As we saw last week, ^ and $ match "beginning of string" and "end of string" respectively. If /m is provided, ^ will also match right after a \n, and $ will match right before a \n
in effect, they match the beginning or end of a line rather than a string

Yet again, works on both m// and s///

/x Modifier
/x: allow formatting of the pattern match. Ordinarily, whitespace (tabs, newlines, spaces) inside of a regular expression will match themselves. With /x, you can use whitespace to format the pattern match to look better.
    m/\w+:(\w+):\d{3}/;
Match a word, colon, word, colon, 3 digits

    m/\w+ : (\w+) : \d{3}/;
Match word, space, colon, space, word, space, colon, space, 3 digits (literal interpretation of whitespace in search string)

    m/\w+ : (\w+) : \d{3}/x;
Match a word, colon, word, colon, 3 digits. Makes it look pretty, but who cares?

More /x Fun
/x also allows you to place comments in your regexp. A comment extends from # to the end of the line, just as normal.
    m/               #begin match
        \w+ :        #word, then colon
        (\w+)        #word, returned by $1
        : \d{3}      #colon, and 3 digits
    /x               #end match
Do not put the end-delimiter in your comment. Yes, this works on m// and s///.
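A runnable version of the commented pattern, applied to a made-up record. The string and variable names are assumptions.

```perl
use strict;
use warnings;

my $record = 'alice:admin:042';   # hypothetical word:word:3-digit record
my $role;

if ($record =~ m/
        \w+ :        # word, then colon
        (\w+)        # word, captured in $1
        : \d{3}      # colon, then three digits
    /x) {
    $role = $1;
}

print "role: $role\n";   # role: admin
```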

/g Modifier (for m//)

List context: return a list of all matches within the string, rather than just 'true'.
If there are any capturing parentheses, return all occurrences of those sub-matches; if not, return all occurrences of the entire match.

    $nums = "1-518-276-6505";
    @nums = $nums =~ m/\d+/g;

@nums is (1, 518, 276, 6505)

    $string = "ABC123 DEF GHI789";
    @foo = $string =~ /([A-Z]+)\d+/g;

@foo is ("ABC", "GHI")

More m//g
Scalar context: initiate a 'progressive' match. Perl will remember where your last match on this variable left off, and continue from there.
    $s = "abc def ghi";
    for (1..3) {
        print "$1 " if $s =~ m/(\w+)/;
    }
Prints abc abc abc

    for (1..3) {
        print "$1 " if $s =~ m/(\w+)/g;
    }
Prints abc def ghi
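Both uses of m//g can be sketched together: list context collects every match at once, while scalar context in a while loop walks through the string one match at a time. Variable names are assumptions.

```perl
use strict;
use warnings;

# List context: all matches in one go.
my $nums  = '1-518-276-6505';
my @parts = $nums =~ /\d+/g;

# Scalar context: each iteration resumes where the previous match ended.
my $s = 'abc def ghi';
my @words;
while ($s =~ /(\w+)/g) {
    push @words, $1;
}

print "@parts\n";   # 1 518 276 6505
print "@words\n";   # abc def ghi
```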

/g Modifier (for s///)

/g global replacement

Ordinarily, s/// only replaces the first instance of PATTERN with REPLACEMENT. With /g, it replaces all instances at once.
    $a = '$a / has / many / slashes /';
    $a =~ s#/#\\#g;
$a is now '$a \ has \ many \ slashes \'

Return Value of s///

Regardless of context, s/// always returns the number of times it successfully search-and-replaced. If the search fails, it didn't succeed at all, so it returns a false value. Unless the /g modifier is used, s/// will always return false or 1. With /g, it returns the total number of global search-and-replaces it did.
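A quick sketch of the return value, using a made-up path string. The braces are used as delimiters to avoid leaning toothpicks.

```perl
use strict;
use warnings;

my $path  = 'a/b/c/d';
my $count = ($path =~ s{/}{\\}g);   # replaces all three slashes, returns 3
my $none  = ($path =~ s/x/y/);      # nothing to replace, returns a false value

print "path is now $path after $count replacements\n";
```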

/e Modifier
/e: evaluate Perl code in the replacement. Looks at the REPLACEMENT string and evaluates it as perl code first, then does the substitution.
    s/ hello                                        # replace hello
     / "Good ".($time<12?"Morning":"Evening")       # with Good Morning or Good Evening
                                                    # depending on value of $time variable
     /xe
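A runnable sketch of /e: once with the greeting from the slide and once with arithmetic on a captured number. The $time value and the sample strings are assumptions.

```perl
use strict;
use warnings;

my $time = 9;                      # hypothetical hour of the day
my $text = 'hello, world';
$text =~ s/hello/'Good ' . ($time < 12 ? 'Morning' : 'Evening')/e;

# With /g as well, the code runs once per match.
my $prices = 'apple 3, pear 5';
(my $doubled = $prices) =~ s/(\d+)/$1 * 2/ge;

print "$text\n";      # Good Morning, world
print "$doubled\n";   # apple 6, pear 10
```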

Modifier notes
Modifiers can be used alone, or with any other modifiers. Order of more-than-one modifiers does not matter s/$a/$b/gixs;
search $_ for $a and replace it with $b. Search globally, ignoring case, allow whitespace, and allow . to match \n.

A Bit More on Clustering

So far, we know that after a pattern match, $1, $2, etc. contain sub-matches. What if we want to use the sub-matches while still in the pattern match? If we're in the replacement part of s///, no problem: go ahead and use them.
    s/(\w+) (\w+)/$2 $1/;    # swap two words

If still in the match, however...

Clustering Within Pattern

To find another copy of something you've already matched, you cannot use $1, $2, etc.
The operation is passed to variable interpolation *first*, then to the regexp parser.

Instead, use \1, \2, \3, etc.
    m/(\w+) .* \1/;
Find a word, followed by a space, followed by anything, followed by a space, followed by that same word.
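One common use of a backreference is spotting a doubled word. This is a sketch with made-up input strings; the \b anchors are an addition so that only whole words are compared.

```perl
use strict;
use warnings;

my $text = 'this is is a test';

# (\w+) captures a word; \1 must then match that exact same text again.
my ($dup)  = $text =~ /\b(\w+)\s+\1\b/;
my ($none) = 'no repeats here' =~ /\b(\w+)\s+\1\b/;

print "doubled word: $dup\n";   # doubled word: is
```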

Transliteration Operator
tr/// does not use regular expressions.
Probably shouldn't be in the RegExp section of the book. The authors couldn't find a better place for it.

tr/// does, however, use the binding operators =~ and !~
Formally:
    tr/SEARCH_LIST/REPLACEMENT_LIST/;
Search for characters in SEARCH_LIST, replace with corresponding characters in REPLACEMENT_LIST.

What to Search, What to Replace?

Much like character classes, tr/// takes a list or range of characters. tr/a-z/A-Z/;
replace any lowercase characters with corresponding capital character.

TAKE NOTE: SearchList and ReplacementList are NOT REGULAR EXPRESSIONS

attempting to use RegExps here will give you errors

Also, no variable interpolation is done in

tr/// Notes
In either context, tr/// returns the number of characters it modified. If no binding string is given, tr/// operates on $_, just like m// and s///. tr/// has an alias, y///. It is rarely used, but you may see it in old code.

tr/// Notes
If the Replacement list is shorter than the Search list, the final character is repeated until it's long enough.
    tr/a-z/A-N/;
Replaces a-m with A-M. Replaces n-z with N.

If the Replacement list is null, the Search list is repeated.

Useful to count characters.

If the Search list is shorter than the Replacement list, the extra characters in Replacement are ignored.

tr/// Modifiers
/c: Complement the search list
The 'real' search list contains all characters *not* in the given search list.

/d: Delete characters with no corresponding characters in the replacement
    tr/a-z/A-N/d;
Replaces a-n with A-N. Deletes o-z.

/s: Squash duplicate replaced characters
Sequences of characters replaced by the same character are squashed to a single instance of the character.
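The notes and modifiers above can be tried out on a sample string. The (my $copy = $s) =~ tr/.../.../ idiom, used here so the original string stays intact, is an assumption of this sketch and not something the slides show.

```perl
use strict;
use warnings;

my $s = 'hello world';

(my $upper = $s) =~ tr/a-z/A-Z/;               # plain transliteration
my $vowels = ($s =~ tr/aeiou//);               # empty replacement: just counts
(my $letters = $s) =~ tr/a-zA-Z//cd;           # /c /d: delete everything that is not a letter
(my $squashed = 'bookkeeper') =~ tr/a-z//s;    # /s: squash runs of repeated letters

print "$upper, $vowels vowels, $letters, $squashed\n";
```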

CGI Programming

What is CGI?
Common Gateway Interface A means of running an executable program via the Web. CGI is not a Perl-specific concept. Almost any language can produce CGI programs
even C++ (gasp!!)

However, Perl does have a *very* nice interface to creating CGI methods

How Does it Work?

A program is created, like normal.
The program must be made user-executable.
A user visits a web page.
The web browser contacts the server where the CGI program resides, and asks it to run the program.
It passes parameters to the program,
similar to command line arguments.

The server runs the program with the parameters passed in.
The server takes the output of the program and returns it to the web browser.
The web browser displays the output as an HTML page.

Most (not all) CGI scripts are contacted through the use of HTML forms. A form is an area of a web page in which the user can enter data, and have that data submitted to another page. When user hits a submit button on the form, the web browser contacts the script specified in the form tag.

Creating a Form
    <form method="post" action="file.cgi">
    <input type="submit" value="Submit Form">
    </form>
The method attribute specifies how parameters are passed to the CGI program. "post" means they're passed in the body of the HTTP request (and therefore aren't seen in the address bar). "get" means they're passed in the URL, and so appear in the address bar of the web browser.
The action attribute specifies which program you want the web browser to contact.
<input> is a tag used to accept user data. type="submit" specifies a Submit button. When user ...

Form Input Types

There are many different ways of getting data from the user. Most are specified by the <input> tag, with the type specified by the type attribute:
text: a text box
checkbox: a check box
radio: a radio button
password: password field (text box, characters display as ******)
hidden: hidden field (nothing displayed in browser)
submit: Submit button. Submits the form
reset: Reset button. Clears form of all data.
button: a button the user can press (usually used w/ JavaScript. *shudder*)
file: field to upload a file
image: an image the user can click to submit the form

Other Attributes of <input>

name: name of the input field.
value: value returned from checks & radios; text of submits and buttons; contents of text, password, and hidden
size: width of text or password
checked: radio or checkbox 'turned on'
src: URL of an image

Inputs that Don't Use <input>

<textarea>: multi-line text field. You can specify rows and cols attributes.
<select>: create a drop-down menu.
<option value="...">: options in the drop-down menu.

Great. We can input. Now what?

Now, we can write a CGI program that takes those inputs and *does stuff* with them. This is still a perl script. Therefore, you still need the shebang as the top line of the code. Next, you need to include all the CGI methods. These are stored in CGI.pm.
As you may guess, TMTOWTDI.

    use CGI;
    use CGI ":standard";

Object-Oriented
CGI.pm actually defines a class. Therefore, if you use CGI; you can then do $q = new CGI; and access the CGI subroutines as methods of $q.
    print $q->start_html();
This allows you to maintain multiple 'states' using more than one instance of the CGI class. Very useful for complicated programs.

Alternatively, you can import a set of subroutines to be called directly, without the need to declare an object. Tell Perl which subroutines to import by giving a quoted list in the use statement:
    use CGI ("start_html", "end_html", "header");
CGI.pm defines sets of these functions. Most common is the 'standard' set:
    use CGI ":standard";
For a full list of which functions are in which sets, examine the CGI.pm file, looking at the variable %EXPORT_TAGS. Now, you can call all the CGI subroutines directly, without declaring any objects.

Outputting from CGI

Just as the CGI program took input from the user via the web browser, it outputs back to the user via the web browser as well. STDOUT is redirected to the web browser that contacted it. This means you don't have to learn any new output functions. print() will now throw data to the web browser.

Beginning a CGI Program

    #!/usr/local/bin/perl
    use CGI ":standard";
    print header("text/html");

header() prints the HTTP header to the web browser. The argument is the MIME type of the data. It defaults to text/html, so you can usually just leave the argument out.

Now Create your Output

Remember, you're building an HTML page in the output. So you must follow the HTML format:
    print "<html><head>",
          "<title>My CGI Program</title>\n",
          "</head><body>\n";
CGI.pm gives you a better way to do this. We'll get to it soon.

Where'd Those Inputs Go?

They were passed into the CGI program as parameters. You can retrieve them using the param() function.
Called in list context w/ no argument, it returns the names of all parameters.
Called in scalar context, it takes the name of one parameter, and returns the value of that parameter.
Called in list context w/ an argument, it returns an array of all values for that parameter (i.e., for checks and radios).

dump
A subroutine defined in CGI.pm. Retrieves the list of parameters and creates an HTML list of all parameters and all values. Like most CGI functions, it doesn't print anything. You must manually print the return value of the function call.
    print dump;
For some reason that I haven't analyzed yet, this causes abort errors on the CS CGI machine. (A likely cause: Perl has a built-in function named dump that aborts the program with a core dump.) To fix it, capitalize:
    print Dump;

HTML Shortcuts
CGI.pm gives you methods to create HTML code without actually writing HTML. Most HTML tags are aliased to CGI functions.
Unpaired tags:
    print br();    # sends <br> to browser
    print p;       # sends <p> to browser

Paired tags:
    print b("Bold text");    # <b>Bold text</b>
    print i("Italic");       # <i>Italic</i>

More shortcuts
For tags with attributes, place the name/value attributes in a hash reference as the first argument. The string enclosed in the tag is the second argument. Ex:
a({href=>"sample.html", target=>"top"}, "Sample file");

Produces: <a href="sample.html" target="top">Sample file</a>
You may think this is a needless amount of extra learning, with no time or length benefits.
You're probably right. In this case.

start_html() can take one parameter, the title.
print start_html("My title");

<html><head><title>My title</title></head><body>

It can also take named parameters, with attributes to give to the <body> tag:
print start_html(-title=>"My Title", -bgcolor=>"Red");
<html><head><title>My Title</title></head><body bgcolor="Red">

print end_html;   # </body></html>

HTML Form Shortcuts

For the full list of form shortcuts, see the CGI.pm documentation, or examine the %EXPORT_TAGS variable in CGI.pm. Each one takes parameters as name/value pairs. Each name starts with a dash. Most parameters are optional.
startform(-method=>"post", -action=>"foo.cgi")
The default method is post. The default action is the current script
this is *very* beneficial.

produces: <form method="post" action="foo.cgi">
endform(); # produces </form>

Input Shortcuts
textfield(-name=>"MyText", -default=>"This is a text box")
<input type="text" name="MyText" value="This is a text box">

All HTML form input shortcuts are similar. Again, see the CGI.pm documentation for the full list and description.

Programmer Beware
-default in input methods is the value used the *first* time the script is loaded only. After that, the fields hold the values they had the last time the script was run. To override this (i.e., force the default value to appear), set the parameter -override=>1 inside the input method:
textfield(-name=>"foo", -default=>"bar", -override=>1);

Avoid Conflicts
Some HTML tags have the same name as internal Perl functions. Perl will get very confused if you try to use the CGI shortcut methods to print these tags.
<tr> - table row. Conflicts with tr///. Use Tr() or TR() instead.
<select> - dropdown list. Conflicts with select(). Use Select() instead.
<param> - pass parameters to a Java applet. Conflicts with param(). Use Param() instead.
<sub> - make text subscripted. Conflicts with the sub keyword. Use Sub() instead.

Running CGI on CS machines

The CS Department has a CGI-enabled server, cgi2.cs. To run your scripts on cgi2.cs:
set the shebang: #!/usr/bin/env perl
make the file user-executable
put the file in the public.html/cgi-bin directory
make sure public.html and cgi-bin are world-executable
go to the script's URL in your web browser

If all goes well, your script's output will be displayed. If all does not go well, you'll get HTTP 500.

Debugging a CGI program can be a very frustrating task. If there are any errors whatsoever, the web browser will simply display "500 Internal Server Error" with no helpful information. One solution is to run the program from the command line. CGI.pm will ask you for name=value pairs of parameters. Enter each name=value pair on its own line, then press Ctrl-D.
(Some newer versions of CGI.pm don't support this; pass the name=value pairs on the command line instead.)

This way, you get the compiler errors, and you can see the pure HTML output of your CGI script. The other method is to examine the server's error logs.

Well, lastly for today anyway. One common method of CGI programming is to make both the form and the response all in one script. Here's how you do that:
#!/usr/bin/env perl
use warnings;
use CGI qw(:standard);
print header;
if (!param()) {
    print start_html(-title => "Here's a form");
    print startform;    # no action
    # create your form here
} else {
    print start_html(-title => "Here's the result");
    # display the results of the submitted form
}
print end_html;

More CGI Programming

Multiple Submits Cookies Emailing File Uploading

Deciding Where to Go
What if you want to have more than one functionality on your form? In other words, have more than one button the user can push. The name and value of the submit button are passed as params. This is useful.

Multiple Submits
Just as you can have many different text fields or checkboxes, you can have different submit buttons. Make sure you give each submit a different name. Only the submit button that is pressed will be passed as a parameter. Check to see if this parameter exists.
<input type="submit" name="Submit1" value="Go Here!">
<input type="submit" name="Submit2" value="Go There!">
if (param("Submit1")) {
    # the user pressed "Go Here!"
} elsif (param("Submit2")) {
    # the user pressed "Go There!"
} else {
    # no submit button was pressed
}
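The check for which button was pressed can be tried without a web server. This is only a sketch: the hash reference stands in for the submitted form data, and chosen_action is a hypothetical helper name. In a real CGI script the tests would be param("Submit1") and param("Submit2").

```perl
use strict;
use warnings;

# Decide what to do based on which submit button name is present.
# $params is a stand-in for the form data that param() would expose.
sub chosen_action {
    my ($params) = @_;
    if    ( $params->{Submit1} ) { return 'here'; }
    elsif ( $params->{Submit2} ) { return 'there'; }
    else                         { return 'form'; }   # nothing pressed yet
}

# Only the pressed button arrives as a parameter.
print chosen_action( { Submit2 => 'Go There!' } ), "\n";
```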

File Uploading
Another input method we did not talk about is file uploading. To use the file-uploading feature, you must use a special kind of HTML form:
Add ENCTYPE="multipart/form-data" to <form>. Or, in Perl, use start_multipart_form() instead of start_form().
HTML: <input type="file" name="uploaded">
Perl: filefield(-name=>"uploaded")

This creates a field in which the user can enter the name of a file to send to the server. It also creates a Browse button to search the local machine. The user enters the name or path of a file to upload. When the form is submitted, the CGI script can then get this file.

Getting the File

To get the name of the file the user wants to upload, use the param() function.
my $file = param("uploaded");
If you use $file as a string, it will be the name of the file. If you use $file as a filehandle, it will be a link to the actual file.
print "Contents of file $file are:<br>\n";
while (my $line = <$file>) {
    print "$line<br>";
}

That's Great for Text Files

But users can upload any kind of file, so you need to find out what kind of file it was. Use the uploadInfo() function. It returns a reference to a hash containing info about the file.
my $file = param("uploaded");
my $info = uploadInfo($file);
my $type = $info->{'Content-Type'};

$type may contain text/html, text/plain, image/jpeg, etc etc

If File is not Text

You need a function to read from binary files: read($filename, $buffer, $size)
$filename: filehandle to read
$buffer: scalar in which to store the data
$size: max number of bytes to read
returns the number of bytes read

my $file = param("uploaded");
open(UPLOAD, ">binary.jpg") or die "Cannot open binary.jpg: $!";
binmode(UPLOAD);
while (my $num = read($file, my $buf, 1024)) {
    print UPLOAD $buf;
}
close UPLOAD;
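The read() loop can be exercised on its own, outside a CGI script. A minimal sketch, assuming any readable filehandle stands in for the upload handle that param() returns; copy_binary is a hypothetical helper name, not a CGI.pm function.

```perl
use strict;
use warnings;

# Copy everything from an open input handle to a new file,
# 1024 bytes at a time. Returns the total number of bytes copied.
sub copy_binary {
    my ($in, $out_path) = @_;
    open(my $out, '>', $out_path) or die "Cannot write $out_path: $!";
    binmode($in);     # treat both sides as raw bytes
    binmode($out);
    my ($buf, $total) = ('', 0);
    # read() returns the number of bytes read, 0 at end of file
    while (my $num = read($in, $buf, 1024)) {
        print {$out} $buf;
        $total += $num;
    }
    close($out) or die "Cannot close $out_path: $!";
    return $total;
}
```

In the upload case the call would look like copy_binary($file, "binary.jpg"), with $file being the value returned by param("uploaded").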

Emailing from your CGI Script

In actuality, you can use this process to email from any Perl program. Note that this is Unix-specific.

sendmail is a barebones emailing program, with no friendly user interface whatsoever. It is standard with most Unix distributions. We need to run it with the -t flag. This tells the program to search the message for the To:, Cc:, Bcc:, etc. headers.
For more information, man sendmail

You can open a pipe to another program or process in almost the same way you open a file. A pipe is a connection between your program and another executable program. You can feed it input as though you were writing to a file. Instead of <, >, or >>, use the | character in front of the file name.
open(PROG, "|myprogram.exe") or die "Cannot open program";
For more information, CSCI-4210 & CSCI-4220
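The same idea can be written with a lexical filehandle and the three-argument form of open, where the mode '|-' means "pipe for writing". A sketch under the assumption of a Unix-like shell; send_lines is a hypothetical helper name, and cat is used only as a harmless program to receive the input.

```perl
use strict;
use warnings;

# Feed lines of text to another program through a pipe and
# return that program's exit code.
sub send_lines {
    my ($command, @lines) = @_;
    open(my $prog, '|-', $command) or die "Cannot start $command: $!";
    print {$prog} "$_\n" for @lines;
    close($prog);       # waits for the program; its status lands in $?
    return $? >> 8;     # upper bits of $? hold the exit code
}

my $status = send_lines('cat > /dev/null', 'first line', 'second line');
print "exit code: $status\n";
```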

Put Them Together

open(MAIL, "|/usr/lib/sendmail -t") || die "Cannot begin mail program";
print MAIL "From: lallip\n";
print MAIL "To: president\n";
print MAIL "Subject: I want a raise!\n\n";    # blank line ends the headers
print MAIL "You know, Dr. J, I'm not quite sure this is really worth it.\n";
close MAIL;

Cookies
Love them or hate them, they exist. And you'll learn how to use them.
Learning to use them responsibly is your own task.

A cookie is a (usually very small) piece of text that a server sends to a web browser for later retrieval. It can be used to track a user's preferences, or other information the user has told the server.

To Set a Cookie
Create the cookie with the cookie() function. It takes many (mostly optional) parameters:
-name=> Name of the cookie
-value=> Value of the cookie; can be a scalar, array reference, or hash reference
-expires=> Expiration date/time of the cookie
-path=> Path to which the cookie will be returned
-domain=> Domain to which the cookie will be returned
-secure=> 1 if the cookie is returned over SSL only

Cookie Expiration
-expires: absolute or relative time for the cookie to expire
+30s    in 30 seconds
+10m    in 10 minutes
+1h     in one hour
-1d     yesterday (ASAP)
now     immediately
+3M     in 3 months
+10y    in 10 years
Wed, 05-Dec-2001 18:00:00 GMT    on Wednesday, 12/5/2001 at 6pm GMT
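To make the relative forms concrete, here is a sketch of how such a string could be turned into an offset in seconds. The helper expires_to_seconds is hypothetical, and the 30-day month and 365-day year are assumptions of this sketch, not a statement of what CGI.pm does internally.

```perl
use strict;
use warnings;

# Seconds represented by each unit letter (month and year lengths
# are approximations chosen for this sketch).
my %SECONDS_PER = (
    s => 1,
    m => 60,
    h => 60 * 60,
    d => 24 * 60 * 60,
    M => 30 * 24 * 60 * 60,
    y => 365 * 24 * 60 * 60,
);

# Returns the offset from the current time in seconds, or undef
# when the string is not a relative expiration (e.g. an absolute date).
sub expires_to_seconds {
    my ($spec) = @_;
    return 0 if $spec eq 'now';
    if ( $spec =~ /^([+-]?\d+)([smhdMy])$/ ) {
        return $1 * $SECONDS_PER{$2};
    }
    return;
}

print "+10m is ", expires_to_seconds('+10m'), " seconds from now\n";
```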

Cookie Path
The path is the region of the server to check before sending back the cookie. If I set a cookie with path = /perl/f01/, then only CGI scripts in /perl/f01 (and its subdirectories) will receive the cookie. By default, the path is equal to the path of the current CGI script. To send the cookie to all CGI scripts on the server, specify path = /

Cookie Domain
The domain (or partial domain) to send the cookie back to. It must contain at least 2 periods (so you can't send a cookie to all .com domains). If the domain is set to a partial domain, the cookie will be sent to scripts on every host within that domain. If it is set to a full host name, the cookie is sent only to pages on that host. Note that both domain and path must match.

Cookie Created, Now Set it.

$cookie = cookie( );
print header(-cookie=>$cookie);
To set more than one cookie, use an array reference:
$cookie1 = cookie( );
$cookie2 = cookie( );
print header(-cookie=>[$cookie1, $cookie2]);

Read the Cookies

Once again, use the cookie() function. This time, just give the name:
$mycookie = cookie("lallip");

$mycookie now has the value of the cookie with the name lallip.

Perl Database Interface (DBI)

Outline
Introduction to DBI
Working with DBI
Manipulating a Database with DBI
DBI and the Web
DBI Utility Functions
MySQL Server
Internet and World Wide Web Resources

Introduction to DBI
E-mail, purchasing online, storing files

Distributed Applications
Several machines get data from one database That data is then shown on one machine
Called the client

Introduction to DBI (cntd.)

Perl DBI
The interface provides uniform access across all databases. It works through handle objects:
Driver handles encapsulate the driver and make database handles
Database handles encapsulate a connection to a specific database
Statement handles encapsulate specific SQL statements and are created by database handles

Working with DBI

Must be registered with a valid ODBC source before use

#!/usr/bin/perl
# Fig. 15.18:
# Program to query a database and display contents in a table

use warnings;
use strict;
use DBI;          # loads the DBI module
use DBD::ODBC;    # loads the ODBC driver

my $dbh = DBI->connect( "DBI:ODBC:employeeDB", "", "" )
   or die( "Could not make connection to database: $DBI::errstr" );

# Create a statement handle
my $sth = $dbh->prepare( q{ SELECT * FROM employee } )
   or die( "Cannot prepare statement: ", $dbh->errstr(), "\n" );

# Call the execute method of the statement handle
$sth->execute()
   or die( "Cannot execute statement: ", $sth->errstr(), "\n" );

my @array;

# Display each row in the format defined below
while ( @array = $sth->fetchrow_array() ) {
   write();
}

# Check to see if fetch terminated early
warn( $DBI::errstr ) if $DBI::err;

# Finish the statement before closing the connection
$sth->finish();
$dbh->disconnect();

# The format used to display a row
format STDOUT =
@<<<<<<@<<<<<<<<<@<<<<<<<<<<@<<<<<@<<<<<<<<<<<
$array[ 0 ], $array[ 1 ], $array[ 2 ], $array[ 3 ], $array[ 4 ]
.

0004   Michael   Black      1965  222-44-8888
0001   Jim       Blue       1943  999-85-3698
0002   Kate      Green      1977  111-21-7454
0003   Wendy     White      1959  000-84-3196

Program Output

Working with DBI

Function Name       Return Type   Description
fetchrow_array      array         Returns a single row in an array.
fetchrow_arrayref   array ref     Returns a single row in an array reference.
fetchrow_hashref    hash ref      Returns a single row in a hash reference with fieldname-value pairs.
fetchall_arrayref   array ref     Returns the whole result set in a reference to an array. The array consists of references to arrays that hold the rows of data.

Fig. 15.19 Functions for extracting the results of a query.
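The nested shape of a whole result set is easier to see with data in hand. The sketch below uses hard-coded sample rows laid out the way the description says (a reference to an array of row references) instead of a live query; full_names is a hypothetical helper name.

```perl
use strict;
use warnings;

# Sample data in the array-of-array-references layout; with a live
# statement handle this would come from $sth->fetchall_arrayref().
my $rows = [
    [ '0004', 'Michael', 'Black', 1965, '222-44-8888' ],
    [ '0001', 'Jim',     'Blue',  1943, '999-85-3698' ],
];

# Each element of @$rows is a reference to one row's array, so a
# field is reached as $row->[index].
sub full_names {
    my ($result) = @_;
    return map { join( ' ', @{$_}[ 1, 2 ] ) } @$result;
}

print "$_\n" for full_names($rows);
print scalar(@$rows), " rows\n";
```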

Manipulating a Database with DBI

How to:
Add records Delete records Update records

#!/usr/bin/perl
# Fig. 15.20:
# Program to insert a new record into the database.

use warnings;
use strict;
use DBI;
use DBD::ODBC;

# If there are connection problems, RaiseError will kill the
# program and generate a message
my $dbh = DBI->connect( "dbi:ODBC:employeeDB", "", "",
                        { RaiseError => 1 } );

# Take in all the data fields to be added to the database
print( "Please enter your employee ID: " );
chomp( my $newemploy = <STDIN> );
print( "Please enter your first name: " );
chomp( my $newfirst = <STDIN> );
print( "Please enter your last name: " );
chomp( my $newlast = <STDIN> );
print( "Please enter your year of birth: " );
chomp( my $newbirthyr = <STDIN> );
print( "Please enter your social security number: " );
chomp( my $newsoc = <STDIN> );

my $querystring = "INSERT INTO employee VALUES
                   ( '$newemploy','$newfirst','$newlast',
                     '$newbirthyr','$newsoc' );";

# Execute the statement, inserting the values into the database
$dbh->do( $querystring );

# Now print the updated database
my $sth = $dbh->prepare( q{ SELECT * FROM employee } );
$sth->execute();

print( "\n" );

my @array;

while ( @array = $sth->fetchrow_array() ) {
   write();
}

# Clean up: close the statement and the connection to the database
warn( $DBI::errstr ) if $DBI::err;
$sth->finish();
$dbh->disconnect();

format STDOUT =
@<<<<<<@<<<<<<<<<@<<<<<<<<<<@<<<<<@<<<<<<<<<<<
$array[ 0 ], $array[ 1 ], $array[ 2 ], $array[ 3 ], $array[ 4 ]
.

Please enter your employee ID: 0005
Please enter your first name: Orinthal
Please enter your last name: Orange
Please enter your year of birth: 1947
Please enter your social security number: 999-88-7777

0004   Michael   Black      1965  222-44-8888
0001   Jim       Blue       1943  999-85-3698
0005   Orinthal  Orange     1947  999-88-7777
0002   Kate      Green      1977  111-21-7454
0003   Wendy     White      1959  000-84-3196

Program Output
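The insert program builds its SQL by interpolating user input directly into the string, so a value that contains a quote character (a last name such as O'Brien, say) breaks the statement. DBI also accepts ? placeholders, with the values passed separately to do(). A minimal sketch: placeholder_insert is a hypothetical helper, and the DBI call is shown only as a comment because it needs the live $dbh handle and input variables from the program.

```perl
use strict;
use warnings;

# Build "INSERT INTO <table> VALUES ( ?, ?, ... )" with one
# placeholder per column value.
sub placeholder_insert {
    my ($table, $count) = @_;
    die "need at least one value\n" if $count < 1;
    my $marks = join( ', ', ('?') x $count );
    return "INSERT INTO $table VALUES ( $marks )";
}

my $sql = placeholder_insert( 'employee', 5 );
print "$sql\n";

# With a live connection the values are bound at execution time and
# never become part of the SQL text:
#   $dbh->do( $sql, undef, $newemploy, $newfirst, $newlast,
#             $newbirthyr, $newsoc );
```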

#!/usr/bin/perl
# Fig. 15.21:
# Program to delete a record from the database

use warnings;
use strict;
use DBI;
use DBD::ODBC;

my $dbh = DBI->connect( "dbi:ODBC:employeeDB", "", "",
                        { RaiseError => 1 } );

# Prompt the user for which employee they want to delete
print( "Enter the Employee ID number of the record ",
       "you wish to delete: " );
chomp( my $IDdel = <STDIN> );

# Confirm the delete
print( "Delete this record: ($IDdel)? (Y/N) " );
chomp( my $choice = <STDIN> );

if ( $choice eq 'Y' || $choice eq 'y' ) {
   my $query = "DELETE FROM employee " .
               "WHERE EmployeeID = '$IDdel'";
   print( "$query \n\n" );
   $dbh->do( $query );
}

my $sth = $dbh->prepare( q{ select * FROM employee } );
$sth->execute();

my @array;

# Display the table in the correct format
while ( @array = $sth->fetchrow_array() ) {
   write( STDOUT );
}

# Clean up
warn( $DBI::errstr ) if $DBI::err;
$sth->finish();
$dbh->disconnect();

format STDOUT =
@<<<<<<@<<<<<<<<<@<<<<<<<<<<@<<<<<@<<<<<<<<<<<
$array[ 0 ], $array[ 1 ], $array[ 2 ], $array[ 3 ], $array[ 4 ]
.

Enter the Employee ID number of the record you wish to delete: 0005
Delete this record: (0005)? (Y/N) y
DELETE FROM employee WHERE EmployeeID = '0005'

0004   Michael   Black      1965  222-44-8888
0001   Jim       Blue       1943  999-85-3698
0002   Kate      Green      1977  111-21-7454
0003   Wendy     White      1959  000-84-3196

Program Output

#!/usr/bin/perl
# Fig. 15.22:
# Program to update a record in the database.

use warnings;
use strict;
use DBI;
use DBD::ODBC;

my $dbh = DBI->connect( "dbi:ODBC:employeeDB", "", "",
                        { RaiseError => 1 } );

# Find out from the user what should be changed and what the
# new data should be
print( "Enter the Employee ID number of the record ",
       "you wish to change: " );
chomp( my $ID = <STDIN> );

print( "Which value would you like to change:\n" );
print( "1. Employee Identification. \n" );
print( "2. First name.\n" );
print( "3. Last name.\n" );
print( "4. Year of Birth.\n" );
print( "5. Social Security Number.\n" );
print( "? " );
chomp( my $change = <STDIN> );

my $field;

# Determine the proper field that the user wants to change
if ( $change == 1 ) {
   $field = "EmployeeID";
   print( "Enter the employee's new employee number: " );
}
elsif ( $change == 2 ) {
   $field = "FirstName";
   print( "Enter the employee's new First name: " );
}
elsif ( $change == 3 ) {
   $field = "LastName";
   print( "Enter the employee's new Last name: " );
}
elsif ( $change == 4 ) {
   $field = "YearBorn";
   print( "Enter the employee's new year of birth: " );
}
elsif ( $change == 5 ) {
   $field = "SocialSecurity";
   print( "Enter the employee's new social security number: " );
}
else {
   print( "Invalid value.\n" );
   exit;    # return is not allowed outside a subroutine
}

chomp( my $newvalue = <STDIN> );

# Update the field the user chose
my $query = "UPDATE employee SET $field = '$newvalue' " .
            "WHERE EmployeeID = '$ID'";
print( "$query \n" );
$dbh->do( $query );

# Now print the updated database with the special format
my $sth = $dbh->prepare( q{ SELECT * FROM employee } );
$sth->execute();

print( "\n" );

my @array;

while ( @array = $sth->fetchrow_array() ) {
   write();
}

# Clean up
warn( $DBI::errstr ) if $DBI::err;
$sth->finish();
$dbh->disconnect();

format STDOUT =
@<<<<<<@<<<<<<<<<@<<<<<<<<<<@<<<<<@<<<<<<<<<<<
$array[ 0 ], $array[ 1 ], $array[ 2 ], $array[ 3 ], $array[ 4 ]
.

Enter the Employee ID number of the record you wish to change: 0004
Which value would you like to change:
1. Employee Identification.
2. First name.
3. Last name.
4. Year of Birth.
5. Social Security Number.
? 2
Enter the employee's new First name: Michelle
UPDATE employee SET FirstName = 'Michelle' WHERE EmployeeID = '0004'

0004   Michelle  Black      1965  222-44-8888
0001   Jim       Blue       1943  999-85-3698
0002   Kate      Green      1977  111-21-7454
0003   Wendy     White      1959  000-84-3196

Program Output

DBI and the Web

Same as when making any other CGI script No special issues when dealing with the web

#!perl
# Fig. 15.21:
# Demonstrates providing a web interface for a database.

use warnings;
use strict;
use DBI;
use DBD::ODBC;
use CGI qw( :standard );

my $DSN = "dbi:ODBC:employeeDB";

# Sets the background
print header(), start_html( { title => "Working with DBI",
   background => "http://localhost/images/background.jpg" } );

# Create a new form with the drop-down list to choose an action
# to perform on the table
unless ( param ) {
   print h1( "Database Manager" ),
      start_form(),
      popup_menu( -name => 'selection',
                  -value => [ 'View the Database',
                              'Insert a Record',
                              'Delete a Record',
                              'Update a Record' ] ),
      hidden( { -name => "LAST", -value => "MAIN" } ),
      br(), br(), br(), br(), br(),
      submit( -value => "Click to Proceed" ),
      end_form();
}
else {
   # Connect to the database, or else generate an error
   my $dbh = DBI->connect( $DSN, "", "", { RaiseError => 1 } );

   # Execute the code based on the choice made by the user
   if ( param( "LAST" ) eq "MAIN" ) {
      my $selection = param( "selection" );

      view( $dbh ) if ( $selection eq "View the Database" );
      displayInsert() if ( $selection eq "Insert a Record" );
      displayDelete( $dbh ) if ( $selection eq "Delete a Record" );
      displayUpdate( $dbh ) if ( $selection eq "Update a Record" );
   }
   elsif ( param( "LAST" ) eq "INSERT" ) {
      insertRecord( $dbh );
      view( $dbh );
   }
   elsif ( param( "LAST" ) eq "DELETE" ) {
      deleteRecord( $dbh );
      view( $dbh );
   }
   elsif ( param( "LAST" ) eq "UPDATE1" ) {
      updateRecordForm( $dbh );
   }
   elsif ( param( "LAST" ) eq "UPDATE2" ) {
      updateRecord( $dbh );
      view( $dbh );
   }
   $dbh->disconnect();
}

print end_html();

sub view
{
   my $dbh = shift();

   # Order all the employees by their ID number
   my $sth = $dbh->prepare(
      "SELECT * FROM employee ORDER BY EmployeeID ASC" );
   $sth->execute();

   my $rows = $sth->fetchall_arrayref();
   $sth->finish();

   # Create a table to display all of the database information
   my $tablerows =
      Tr( th( { -bgcolor => "#dddddd", -align => 'left' },
              [ "ID", "First", "Last" ] ),
          th( { -bgcolor => "#dddddd" }, [ "YOB", "SSN" ] ) );

   foreach my $row ( @$rows ) {
      $tablerows .= Tr( td( { -bgcolor => "#dddddd" }, $row ) );
   }

   # A page that will display the database results
   print h1( "Employee Database" ),
      table( { -border => 0, -cellpadding => 5,
               -cellspacing => 0 }, $tablerows ),
      br(), br(),
      "Your query yielded ", b( scalar( @$rows ) ),
      " records.", br(), br(),
      a( { -href => "/cgi-bin/" },
         "Back to the Main Database Page" );
}

sub displayInsert
{
   # A group of fields to be filled in to add the new employee
   # to the database
   print h3( "Add a new employee to the database." ), br(),
      start_form(),
      "Employee ID", br(),
      textfield( -name => 'ID' ), br(),
      "First Name", br(),
      textfield( -name => 'FIRST' ), br(),
      "Last Name", br(),
      textfield( -name => 'LASTNAME' ), br(),
      "Year of Birth", br,
      textfield( -name => 'YEAR' ), br(),
      "Social Security Number", br(),
      textfield( -name => 'SSN' ),
      hidden( { -name => "LAST", -value => "INSERT",
                -override => "1" } ),
      br(), br(), submit( -value => "Add New Employee" ),
      end_form(), br(), br(),
      a( { -href => "/cgi-bin/" },
         "Back to the Main Database Page" );
}

sub displayDelete
{
   my $dbh = shift();

   my $sth = $dbh->prepare(
      "SELECT EmployeeID, FirstName, LastName FROM employee " );
   $sth->execute();

   # Generate a list of the employees in the database to be removed
   my ( %names, @ids );

   while ( my @row = $sth->fetchrow_array ) {
      push( @ids, $row[ 0 ] );
      $names{ $row[ 0 ] } = join( " ", @row[ 1, 2 ] );
   }

   $sth->finish;

   # Use the list to create a drop-down list with the employee
   # names; the hidden field passes on the employee to be deleted
   print h3( "Delete an employee from the database" ), br(),
      start_form(),
      "Select an Employee to delete ",
      popup_menu( -name => 'DELETE_ID',
                  -value => \@ids,
                  -labels => \%names ), br(), br(), br(),
      hidden( { -name => "LAST", -value => "DELETE",
                -override => 1 } ),
      submit( -value => "Delete a Record" ), br(), br(),
      end_form(),
      font( { -color => "red" },
            "This action removes the record permanently." ),
      br(), br(),
      a( { -href => "/cgi-bin/" },
         "Back to the Main Database Page" );
}

sub displayUpdate
{
   my $dbh = shift();

   my $sth = $dbh->prepare(
      "SELECT EmployeeID, FirstName, LastName FROM employee " );
   $sth->execute();

   # Generate a list of the users in the database for updating
   my ( %names, @ids );

   while ( my @row = $sth->fetchrow_array ) {
      push( @ids, $row[ 0 ] );
      $names{ $row[ 0 ] } = join( " ", @row[ 1, 2 ] );
   }

   $sth->finish;

   # The hidden field passes on the employee to have fields changed
   print h3( "Update an employee in the database" ), br(),
      start_form(),
      "Select an Employee to update ",
      popup_menu( -name => 'UPDATE_ID',
                  -value => \@ids,
                  -labels => \%names ), br(), br(), br(),
      hidden( { -name => "LAST", -value => "UPDATE1",
                -override => 1 } ),
      submit( -value => "Update a Record" ), br(), br(),
      end_form(),
      a( { -href => "/cgi-bin/" },
         "Back to the Main Database Page" );
}

sub updateRecordForm
{
   my $dbh = shift();

   # Get the desired employee's information
   my $statement = "SELECT * FROM employee " .
                   "WHERE EmployeeID = '" .
                   param( 'UPDATE_ID' ) . "'";
   my $sth = $dbh->prepare( $statement );
   $sth->execute();
   my @values = $sth->fetchrow_array;
   $sth->finish();

   my @names = ( "", "First Name ", "Last Name ",
                 "Year Born ", "Social Security Number " );

   # Display the employee's information and pass that employee
   # on to have the record changed
   print h3( "Updating the record for employee #$values[ 0 ]." ),
      br(), br(),
      start_form(),
      "@values\n", br(),
      hidden( { -name => '0', -value => $values[ 0 ] } );

   foreach ( 1 .. 4 ) {
      print $names[ $_ ], br(),
         textfield( -name => $_, -value => $values[ $_ ],
                    -override => 1 ), br();
   }

   print submit( -value => "Update the Record" ),
      hidden( { -name => "LAST", -value => "UPDATE2",
                -override => 1 } ),
      end_form(),
      a( { -href => "/cgi-bin/" },
         "Back to the Main Database Page" );
}

# Adds a new record to the database
sub insertRecord
{
   my $dbh = shift();
   my ( $id, $first, $last, $year, $ssn ) =
      ( param( 'ID' ), param( 'FIRST' ), param( 'LASTNAME' ),
        param( 'YEAR' ), param( 'SSN' ) );
   my $string = "INSERT INTO employee VALUES
                 ( '$id', '$first', '$last', '$year', '$ssn' );";

   $dbh->do( $string );
}

# Removes a specified record from the database
sub deleteRecord
{
   my $dbh = shift();
   my $string = "DELETE FROM employee " .
                "WHERE EmployeeID = '" .
                param( 'DELETE_ID' ) . "'";

   $dbh->do( $string );
   print "Employee #", param( 'DELETE_ID' ),
         " deleted.", br(), br();
}

# Updates the correct record in the database with the
# user-entered information
sub updateRecord
{
   my $dbh = shift();
   my ( $id, $first, $last, $year, $ssn ) =
      ( param( '0' ), param( '1' ), param( '2' ),
        param( '3' ), param( '4' ) );
   my $string = "UPDATE employee SET FirstName = '$first', " .
                "LastName = '$last', YearBorn = '$year', " .
                "SocialSecurity = '$ssn' " .
                "WHERE EmployeeID = '$id'";

   $dbh->do( $string );
}

Program Output

DBI Utility Functions

Utility functions
Allow a user to determine the database support provided by the computer that the user is using
The available_drivers function
Returns the available database drivers installed
The data_sources function
Returns the available databases registered to the system

MySQL Server
Multi-platform database
Multi-user database
Multithreaded database
Open source software

#!/usr/bin/perl
# Fig. 15.24:
# Creating a table.

use warnings;
use strict;
use DBI;
use DBD::mysql;    # loads the MySQL driver

my $dbh = DBI->connect( "DBI:mysql:USERDB", "root", "",
                        { RaiseError => 1 } );

# Create an SQL table. VARCHAR declares a variable-length
# text column.
my $string = "CREATE TABLE Users (
                 FirstName VARCHAR( 30 ),
                 LastName  VARCHAR( 30 ),
                 Email     VARCHAR( 30 ),
                 Phone     VARCHAR( 30 ),
                 Continent ENUM( 'North America', 'South America',
                                 'Europe', 'Asia', 'Africa',
                                 'Australia', 'Antarctica' ),
                 OpSys     ENUM( 'Windows NT', 'Windows 98',
                                 'Macintosh', 'Linux', 'Other' ),
                 Hours     INT,
                 Rating    INT )";

# Execute the SQL statement
$dbh->do( $string );

# Insert one record into the database
$dbh->do( "INSERT INTO Users (
              FirstName, LastName, Email, Phone,
              Continent, OpSys, Hours, Rating )
           VALUES ( 'John', 'Doe', 'john\', '(555)555-5555',
                    'North America', 'Windows 98', 3, 4 )" );

my $sth = $dbh->prepare( "SELECT * FROM Users" );
$sth->execute();

while ( my @row = $sth->fetchrow_array() ) {
   print( "@row\n" );
}

warn( $DBI::errstr ) if ( $DBI::err );
$sth->finish();
$dbh->disconnect();

John Doe (555)555-5555 North America Windows 98 3 4

#!/usr/bin/perl
# Fig. 15.25:
# Using a MySQL database

use warnings;
use strict;
use DBI;
use DBD::mysql;
use CGI qw( :standard );

print( header(), start_html( "Registration Form" ),
       h1( "Registration Form" ) );

if ( param( "No" ) || !param ) {
   registrationForm();
}
elsif ( !param( "Yes" ) ) {
   # Collect all the user-entered data, then ask the user
   # if the information is correct
   my $first = param( "FIRST" );
   my $last = param( "LAST" );
   my $email = param( "EMAIL" );
   my $phone = param( "PHONE" );
   my $land = param( "CONTINENT" );
   my $os = param( "OS" );
   my $time = param( "HOURS" );
   my $value = param( "RATING" );

   # Check the format of the phone number entered
   if ( $phone !~ / \( \d{3} \) \d{3} - \d{4} /x ) {
      print( "Please enter your phone number ",
             "in the correct format.", br() );
      registrationForm();
   }
   # Check the format of the hours the product was used
   elsif ( ( $time !~ / \d+ /x ) || $time < 0 || $time > 24 ) {
      print( "Please enter an integer for hours that is ",
             "between 0 and 24", br() );
      registrationForm();
   }
   else {
      # The hidden fields pass the information to the next form
      print( h4( "You entered", br(),
                 "Name: $first $last", br(),
                 "E-mail: $email", br(),
                 "Phone: $phone", br(),
                 "Continent: $land", br(),
                 "OS: $os", br(),
                 "Hours using product: $time", br(),
                 "Rating of product: $value:" ), br(),
             start_form(),
             hidden( -name => "FIRST" ),
             hidden( -name => "LAST" ),
             hidden( -name => "EMAIL" ),
             hidden( -name => "PHONE" ),
             hidden( -name => "CONTINENT" ),
             hidden( -name => "OS" ),
             hidden( -name => "HOURS" ),
             hidden( -name => "RATING" ),
             "Is this information correct? ", br(),
             submit( -name => "Yes" ),
             submit( -name => "No" ),
             end_form() );
   }
}
else {
   # Set the variables equal to the user-entered data
   my $first = param( "FIRST" );
   my $last = param( "LAST" );
   my $email = param( "EMAIL" );
   my $phone = param( "PHONE" );
   my $land = param( "CONTINENT" );
   my $os = param( "OS" );
   my $time = param( "HOURS" );
   my $value = param( "RATING" );

   if ( $phone =~ / \( \d{3} \) \d{3} - \d{4} /x ) {
      # Open the database
      my $dbh = DBI->connect( "DBI:mysql:USERDB", "root", "",
                              { RaiseError => 1 } );

      # Insert the information into the database
      my $statement = "INSERT INTO Users VALUES
                       ( '$first', '$last', '$email', '$phone',
                         '$land', '$os', '$time', '$value' )";

      $dbh->do( $statement );

      print( "Thank you for completing the ",
             "registration form $first", br(),
             "The following information has been recorded:",
             br(), br(),
             table( { -border => 3, -cellspacing => 3 },
                Tr( th( [ "Name", "E-mail", "Phone Number",
                          "Continent", "OS", "Hours",
                          "Rating" ] ) ),
                Tr( td( { -align => "center" },
                        [ "$first $last", $email, $phone,
                          $land, $os, $time, $value ] ) ) ) );
   }
   else {
      print( "Please enter your phone number in the ",
             " correct format.", br() );
      registrationForm();
   }
}

print end_html();

# Has the user fill in several fields, which will be entered
# into the database
sub registrationForm {
   print(
      h3( "Please fill in all fields and then click Proceed." ),
      start_form(),
      table( { -cellpadding => "3" },
         Tr( { -valign => "top" },
            td( { -width => '300' }, strong( "First Name:" ), br(),
                textfield( -name => "FIRST", -size => 15 ) ),

            td( strong( "Last Name:" ), br(),
                textfield( -name => "LAST", -size => 15 ) ) ),

         Tr( { -valign => "top" },
            td( strong( "E-mail Address:" ), br(),
                textfield( -name => "EMAIL", -size => 25 ) ),

            td( strong( "Phone Number" ), br(),
                textfield( -name => "PHONE", -size => 20 ), br(),
                "Must be of the form (555)555-5555" ) ) ),

      h4( "What Continent do you live on? " ),
      popup_menu( -name => "CONTINENT",
                  -value => [ "North America", "South America",
                              "Asia", "Europe", "Australia",
                              "Africa", "Antarctica" ] ), br(),
      h4( "Which Operating System are you currently running?" ),
      radio_group( -name => 'OS',
                   -value => [ "Windows 98", "Windows NT",
                               "Macintosh", "Linux", "Other" ] ),
      br(),
      h4( "How many hours a day do you use our product? ",
          textfield( -name => 'HOURS', -size => 3 ) ),
      h4( "How would you rate our product on a scale of 1 - 5" ),
      radio_group( -name => "RATING",
                   -value => [ '1', '2', '3', '4', '5' ] ), br(),
      submit( "Proceed" ),
      end_form() );
}

Program Output
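The phone and hours patterns in the registration program are not anchored, so an input that merely contains a well-formed value somewhere inside it passes the check. A sketch of stricter checks, anchoring the same patterns with \A and \z; valid_phone and valid_hours are hypothetical helper names.

```perl
use strict;
use warnings;

# True only if the whole string has the form (555)555-5555.
sub valid_phone {
    my ($phone) = @_;
    return $phone =~ / \A \( \d{3} \) \d{3} - \d{4} \z /x ? 1 : 0;
}

# True only if the whole string is an integer between 0 and 24.
sub valid_hours {
    my ($hours) = @_;
    return ( $hours =~ / \A \d+ \z /x && $hours <= 24 ) ? 1 : 0;
}

for my $input ( '(555)555-5555', 'call (555)555-5555 now' ) {
    print "$input: ", ( valid_phone($input) ? 'ok' : 'rejected' ), "\n";
}
```

Because the digits-only test runs first, valid_hours never compares a non-numeric string with <=, which also avoids a warning under use warnings.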

#!/usr/bin/perl
# Fig. 15.26:
# Makes a webpage of statistics from the database.

use warnings;
use strict;
use DBI;
use DBD::mysql;
use CGI qw( :standard );

# Connect to the database
my $dbh = DBI->connect( "DBI:mysql:USERDB", "root", "",
                        { RaiseError => 1 } );

# Extract the data using fetchall_arrayref
my $sth = $dbh->prepare(
   "SELECT Continent, OpSys, Hours, Rating FROM Users" );
$sth->execute();

my $results = $sth->fetchall_arrayref();
my $total = scalar( @$results );

# Total the user input hours and ratings
my ( $rating, $hours, %lands, %op );

foreach my $row ( @$results ) {
   $lands{ $row->[ 0 ] }++;
   $op{ $row->[ 1 ] }++;
   $hours += $row->[ 2 ];
   $rating += $row->[ 3 ];
}

# Obtain an average
$hours /= $total;
$rating /= $total;

print header, start_html( "User Stats" ),
      h1( "User Statistics" );

printf "You have a total of %d users spending an average of " .
       "%.2f hours using your product. They rate it an " .
       "average of %.2f out of 5.", $total, $hours, $rating;

# Create an HTML table to display the results
my $landrows = Tr( th( { width => "100" }, "Continent" ),
                   th( { width => "50" }, "Total Users" ),
                   th( "Percent Of Users" ) );

# Create a bar graph to display the data
foreach ( sort { $lands{ $b } <=> $lands{ $a } } keys( %lands ) ) {
   my $percent = int( $lands{ $_ } * 100 / $total );
   $landrows .= Tr( td( $_ ),
                    td( $lands{ $_ } ),
                    td( table( { -width => "100%" },
                       Tr( td( { -width => "$percent%",
                                 -bgcolor => "#0000FF" }, br ),

                           td( br ) ) )
                     ) );
}

print h3( { -align => "center" }, "Users by Continent" ),
      table( { -border => 1, -width => "100%" }, $landrows );

# Create the same table, only for the OS stats
my $oprows = Tr( th( { width => "100" }, "Operating System" ),
                 th( { width => "50" }, "Total Users" ),
                 th( "Percent Of Users" ) );

foreach ( sort { $op{ $b } <=> $op{ $a } } keys( %op ) ) {
   my $percent = int( $op{ $_ } * 100 / $total );
   $oprows .= Tr( td( $_ ),
                  td( $op{ $_ } ),
                  td( table( { -width => "100%" },
                     Tr( td( { -width => "$percent%",
                               -bgcolor => "#0000FF" }, br ),
                         td( br )
                       ) ) ) );
}

print h3( { -align => "center" }, "Operating System statistics" ),
      table( { -border => 1, -width => "100%" }, $oprows );

$dbh->disconnect();

Program Output
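The tallying logic of the statistics program can be tried without a database. The following is a minimal sketch, assuming made-up sample rows in the shape that fetchall_arrayref returns (one array reference per row, columns in SELECT order). The sub name summarize and the sample data are hypothetical. Unlike the listing, it guards against an empty table, since dividing by a $total of zero would stop the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample rows: Continent, OpSys, Hours, Rating.
my $results = [
   [ "Asia",   "Linux",      4, 5 ],
   [ "Asia",   "Windows NT", 2, 3 ],
   [ "Europe", "Linux",      6, 4 ],
   [ "Africa", "Macintosh",  4, 4 ],
];

# Count users per continent and per OS, and average hours and rating.
sub summarize {
   my ( $rows ) = @_;
   my $total = scalar( @$rows );
   my ( %lands, %op );
   my ( $hours, $rating ) = ( 0, 0 );

   foreach my $row ( @$rows ) {
      $lands{ $row->[ 0 ] }++;
      $op{ $row->[ 1 ] }++;
      $hours  += $row->[ 2 ];
      $rating += $row->[ 3 ];
   }

   # Guard against an empty table before dividing.
   my $avg_hours  = $total ? $hours / $total  : 0;
   my $avg_rating = $total ? $rating / $total : 0;

   return ( $total, $avg_hours, $avg_rating, \%lands, \%op );
}

my ( $total, $avg_hours, $avg_rating, $lands, $op ) =
   summarize( $results );

printf "%d users, %.2f hours on average, average rating %.2f\n",
       $total, $avg_hours, $avg_rating;

# Most common continent first; ties are broken alphabetically.
foreach my $land ( sort { $lands->{ $b } <=> $lands->{ $a }
                          || $a cmp $b } keys( %$lands ) ) {
   my $percent = $total ? int( $lands->{ $land } * 100 / $total ) : 0;
   print "$land: $lands->{ $land } users ($percent%)\n";
}
```

The same loop works for the operating-system counts by iterating over %$op instead of %$lands.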