
Introduction to Perl Programming

( perl 5 )

Contents
Basics
Variables and Operators
Branching
Looping
File Test Operators
Regular Expressions
Input and Output
Processing files mentioned on the Command line
Get Filenames
Pipe input and output from/to Unix Commands
Execute Unix Commands
The Perl built-in Functions
Subroutines
Some of the special Variables
Forking
Building Pipes for forked Children
Building a Socket Connection to another Computer
Get User and Network Information
Arithmetics
Formatting Output with "format"
Commandline Switches
Basics
Scripts
Perl is a scripting language; a script is compiled each time it is run. So that Unix
recognizes a file as a perl script, the first line of every perl script must be the header:
#!/usr/bin/perl where the path to perl has to be correct. On some older Unix systems this
line must not exceed 32 characters.

Comments and Commands


After the header line #!/usr/bin/perl there are empty lines (which have no effect),
command lines or comment lines. Everything from a "#" up to the end of the line is a
comment and has no effect on the program. Commands start with the first non-space
character on a line and end with a ";". So a command can continue over several lines
and is terminated only by the semicolon.

Direct commands and subroutines


Normal commands are executed in the order written in the script. Subroutines can be
placed anywhere and are only executed when called from a normal command line.
Perl knows it is a subroutine if the code is preceded by "sub" and enclosed in a block,
like: sub name { command; }

Other special lines


Perl can include other programming code with: require something or with use
something.

Quotations
Single quote: '' or: q//
Double quote: "" or: qq//
Quote for execution: `` or: qx//
Quote a list of words: ('term1','term2','term3') or: qw/term1 term2 term3/
Quote a quoted string: qq/"$name" is $name/;
Quote something which contains "/": qq!/usr/bin/$file is ready!;

Scalar and list context


Perl distinguishes between scalar and list context. This is a central feature that sets it
apart from most other scripting languages.

A subroutine can return a list, not just a single scalar as in C. An array yields the
number of its elements in scalar context and the elements themselves in list context.

The same expression can therefore answer "how many?" or "which ones?", depending on
where it is used.
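
The difference between the two contexts can be sketched as follows. The subroutine name minmax and the sample data are made up for illustration; wantarray() is the built-in that tells a subroutine which context it was called in.

```perl
use strict;
use warnings;

my @names = ('Fred', 'John', 'Mary');

# List context: the array yields its elements.
my @copy = @names;

# Scalar context: the same array yields its element count.
my $count = @names;

# A subroutine can inspect its calling context with wantarray().
sub minmax {
    my @sorted = sort { $a <=> $b } @_;
    return wantarray ? ($sorted[0], $sorted[-1])   # list context: both ends
                     : $sorted[-1] - $sorted[0];   # scalar context: the range
}

my ($min, $max) = minmax(7, 3, 9);   # called in list context
my $range       = minmax(7, 3, 9);   # called in scalar context

print "count=$count min=$min max=$max range=$range\n";
```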

Variables and Operators


General
There are scalar variables, one- and two-dimensional arrays, and associative arrays
(hashes). Instead of declaring a variable's type, one precedes its name with a special
character: $variable is a normal scalar variable, @variable is an array and %variable is
an associative array. The user of perl does not have to distinguish between a number and
a string in a variable; perl converts between the two as necessary.

Scalars
Fill in a scalar with: $price = 300; $name = "JOHN";
Calculate with it like: $price *= 2; $price = $oldprice * 4; $count++; $worth--;
Print out the value of a scalar with: print $price,"\n";

Arrays
Fill in a value: $arr[0] = "Fred"; $arr[1] = "John";
Print out this array: print join(' ',@arr),"\n";
If two dimensional: $arr[0][0] = 5; $arr[0][1] = 7;

Hashes (Associative Arrays)


Fill in a single element with: $hash{'fred'} = "USA"; $hash{'john'} = "CANADA";

Fill in the entire hash:


%a = (
'r1', 'this is val of r1',
'r2', 'this is val of r2',
'r3', 'this is val of r3',
);
or with:
%a = (
r1 => 'this is val of r1',
r2 => 'this is val of r2',
r3 => 'this is val of r3',
);
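
A minimal sketch of filling a hash and walking through it in sorted key order. The names and values are sample data only, and the variable %country is an assumption for this example.

```perl
use strict;
use warnings;

my %country = (
    fred => 'USA',
    john => 'CANADA',
);

# Add one more element after the initial fill.
$country{mary} = 'FRANCE';

# keys() returns the keys in no particular order, so sort them first.
my @lines;
for my $name (sort keys %country) {
    push @lines, "$name lives in $country{$name}";
}
print "$_\n" for @lines;
```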

Assignments
Put something into a variable with a "=" or with some combined operator which assigns
and does something at the same time:

$var = "string";    Puts the string into $var
$var = 5;           Puts a number into $var
$var .= "string";   Appends string to $var
$var += 5;          Adds 5 to $var
$var *= 5;          Multiplies $var by 5
$var ||= 5;         If $var is false (0, empty or undefined) make it 5
$var x= 3;          Repeats $var three times as a string: from a to aaa

Modify and assign with:

($new = $old) =~ s/pattern/replacement/;

Comparisons
Compare strings with: eq ne like in: $name eq "mary".
Compare numbers with: == != >= <= <=> like in: $price == 400.
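
Because perl converts freely between numbers and strings, choosing the wrong operator family changes the result. The following sketch uses made-up values; cmp, the string counterpart of <=>, is not listed above but is a standard operator.

```perl
use strict;
use warnings;

my ($x, $y) = ('10', '10.0');

my $same_number = ($x == $y) ? 1 : 0;   # 1: both are numerically 10
my $same_string = ($x eq $y) ? 1 : 0;   # 0: the strings differ

# <=> and cmp return -1, 0 or 1, which makes them suitable for sort.
my @numeric = sort { $a <=> $b } (10, 9, 100);
my @textual = sort { $a cmp $b } (10, 9, 100);

print "numeric: @numeric\n";
print "textual: @textual\n";
```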

And/Or/Not
Act on success or failure of an expression: $yes or die; means exit if $yes is false or not set.
For AND we have: && and "and"; for OR we have: || and "or". NOT is "!" or "not".

AND,OR and NOT are regularly used in if() statements:


if($first && $second){....;}
if($first || $second){....;}
if($first && ! $second){....;} means that $first must be true (non zero) but $second
must not be.
Many NOTs can be handled more readably with the unless() statement. Instead of:
print if ! $noway; one uses: print unless $noway;

Branching
if
if(condition){
command;
}elsif(condition){
command;
}else{
command;
}

command if condition;

unless (just the opposite of if)


unless(condition){
command;
}else{
command;
}

command unless condition;

Looping
while
while(condition){
command;
}

# Go prematurely to the next iteration


while(condition){
command;
next if condition;
command;
}

# Prematurely abort the loop with last


while(condition){
command;
last if condition;
}

# Go prematurely to the next iteration but do continue{} in any case


while(condition){
command;
next if condition;
command;
}continue{
command;
}
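
In perl it is next that jumps to the continue{} block; the block also runs after every normal pass. A small sketch with made-up variable names shows why the loop counter is safest in continue{}:

```perl
use strict;
use warnings;

my $i = 0;
my @odd;
my $passes = 0;

while ($i < 6) {
    next if $i % 2 == 0;    # skip even numbers, but still run continue{}
    push @odd, $i;
} continue {
    $i++;                   # executed after every pass, also after next
    $passes++;
}

print "odd: @odd, passes: $passes\n";
```

If $i++ stood at the end of the loop body instead, next would skip it and the loop would never end.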

# Redo the current iteration without evaluating while(condition) again


while(condition){
command;
redo if condition;
}

command while condition;

until (just the opposite of while)


until(condition){
command;
}

until(condition){
command;
next if condition;
command;
}

until(condition){
command;
last if condition;
}

until(condition){
command;
next if condition;
command;
}continue{
command;
}

command until condition;

for (=foreach)
# Iterate over @data and have each value in $_
for(@data){
print $_,"\n";
}

# Get each value into $info iteratively


for $info (@data){
print $info,"\n";
}

# Iterate over a range of numbers


for $num (1..100){
next if $num % 2;
print $num,"\n";
}

# Endless loop with (;;)


for (;;){
$num++;
last if $num > 100;
}

map
# syntax
map (command,list);
map {comm1;comm2;comm3;} list;
# example
map(rename($_,lc($_)),<*>);
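
Both forms of map can be sketched without touching any files. The file names below are sample strings only, and the variable names are assumptions for this example.

```perl
use strict;
use warnings;

my @files = ('README.TXT', 'Main.C', 'notes');

# Expression form: one result per element.
my @lower = map(lc($_), @files);

# Block form: may return several values per element, here a key/value pair.
my %length_of = map { ($_ => length $_) } @lower;

print join(' ', @lower), "\n";
```

The parentheses inside the block keep perl from mistaking the braces for an anonymous hash.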

File Test Operators


File test operators check the status of a file. Some examples:
-f $file        It's a plain file
-d $file        It's a directory
-r $file        Readable file
-x $file        Executable file
-w $file        Writable file
-o $file        We are the owner
-l $file        File is a symbolic link
-e $file        File exists
-z $file        File exists but has zero size
-s $file        File has a size greater than zero (returns the size in bytes)
-t FILEHANDLE   This filehandle is connected to a tty
-T $file        Text file
-B $file        Binary file
-M $file        Returns the age of the file in days since its last modification
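
A short sketch of some of these operators in use. It creates its own scratch file with the core module File::Temp, which is an assumption of this example and not part of the list above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not depend on existing paths.
my ($fh, $file) = tempfile(UNLINK => 1);
print $fh "hello\n";
close $fh;

my $is_file = -f $file ? 1 : 0;   # plain file
my $is_dir  = -d $file ? 1 : 0;   # not a directory
my $size    = -s $file;           # size in bytes, false for an empty file
my $age     = -M $file;           # age in days, measured from script start

print "$file: file=$is_file dir=$is_dir size=$size\n";
```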

Regular Expressions
What it is
A regular expression is an abstract description of a set of strings. Usually one has a
search pattern and a match, which is the string that was found. If a substitution is made,
there is also a replacement for the match.

Patterns
A pattern stands for one, any number, several, a particular number or no occurrences of
a character or a character set, given literally, abstractly or in octal.
PATTERN         MATCH
.               any character except newline (dot)
.*              any number of any character (dot asterisk)
a*              as many consecutive a's as possible (also none)
a*?             as few consecutive a's as possible
.?              one or none of any character
.+              one or more of any character
.{3,7}          three up to seven of any character, but as many as possible
.{3,7}?         three up to seven, but the fewest number possible
.{3,}           at least 3 of any character
.{3}            exactly 3 times any character
[ab]            a or b
[^ab]           not a and also not b
[a-z]           any of a through z
^a  \Aa         a at beginning of string
a$  a\Z         a at end of string
A|bb|CCC        A or bb or CCC
tele(f|ph)one   telefone or telephone
\w              A-Z or a-z or 0-9 or _
\W              none of the above
\d              0-9
\D              none of 0-9
\s              space or \t or \n (white space)
\S              non space
\t              tabulator
\n              newline
\r              carriage return
\b              word boundary
\bkey           matches key but not housekey
(?#.......)     Comment
(?i)            Case insensitive match. This can be inside a pattern variable.
(?:a|b|c)       a or b or c, but without storing the match in $n
(?=.....)       Match only if followed by ....., but do not include it in $&
(?!.....)       Match only if not followed by ....., and do not include it in $&
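
Some of the patterns above, tried on a made-up sample string. This is only a sketch; the variable names are assumptions.

```perl
use strict;
use warnings;

my $text = 'key=<a><b> housekey=7';

my ($greedy) = $text =~ /(<.+>)/;    # as many characters as possible
my ($lazy)   = $text =~ /(<.+?>)/;   # as few characters as possible

# \b keeps "key" from matching inside "housekey".
my @keys = $text =~ /\b(key)\b/g;

# (?:...) groups without capturing, so the first capture is the digit.
my ($digit) = $text =~ /(?:house)key=(\d)/;

print "greedy=$greedy lazy=$lazy keys=", scalar(@keys), " digit=$digit\n";
```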

Substitutions
One can replace found matches with the s/pattern/replacement/; statement.
The "s" is the command. Then follow three delimiters, with the search pattern between the
first and the second and the replacement between the second and the third. If there is a
"/" within the pattern or the replacement, one chooses a delimiter other than "/", for
instance "!".

To change the content of a variable do: $var =~ s/pattern/replacement/;


To put the changed value into another variable, without changing the original variable,
do:
($name = $line) =~ s/^(\w+).*$/$1/;
COMMAND                          WHAT IT DOES
s/A/B/;                          substitute the first A in a string with B
s/A/B/g;                         substitute every A with a B
s/A+/A/g;                        substitute any run of A's with one A
s/^#//;                          substitute a leading # with nothing, i.e. remove it
s/^/#/;                          prepend a # to the string
s/A(\d+)/B$1/g;                  substitute A followed by a number with B followed by the same number
s/(\d+)/$1*3/e;                  substitute the found number with 3 times its value
Use two "e" to get an eval effect:
perl -e '$aa = 4; $bb = q($aa); $bb =~ s/(\$\w+)/$1/ee; print $bb,"\n";'
s/here goes date/$date/g;        substitute "here goes date" with the value of $date
s/(Masumi) (Nakatomi)/$2 $1/g;   switch the two terms
s/\000//g;                       remove null characters
s/$/\015/;                       append a ^M (carriage return) to make it readable for DOS
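
A few of these substitutions applied to a sample line, each on a copy so that the original stays intact. The text and the variable names are made up for this sketch.

```perl
use strict;
use warnings;

my $line = 'Masumi Nakatomi owes 14 dollars';

# Copy first, then modify the copy; $line stays untouched.
(my $swapped = $line) =~ s/(Masumi) (Nakatomi)/$2 $1/;

# /e evaluates the replacement as perl code.
(my $tripled = $line) =~ s/(\d+)/$1*3/e;

# /g replaces every match; s/// returns the number of replacements made.
my $n = ((my $dashed = $line) =~ s/ /-/g);

print "$swapped\n$tripled\n$dashed ($n replacements)\n";
```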

Input and Output


Output a value from a variable
print $var,"\n";

Output a formatted string


printf("%-20s%10d",$user,$wage);

Read a value into a variable and remove the newline


chomp() (perl5) removes a trailing newline if one is there. chop() (perl4) removes the
last character, whatever it is.

chomp($var = <STDIN>);

Read in a file and process it linewise


open(IN,"<filename") || die "Cannot open filename for input\n";
while(<IN>){
command;
}
close IN;

Read a file into an array


open(AAA,"<infile") || die "Cannot open infile\n";
@bigarray = <AAA>;
close AAA;

Output into a file


open(OUT,">file") || die "Cannot open file for output\n";
while(condition){
print OUT $mystuff;
}
close OUT;

Check whether an open file would yield something (eof)


open(IN,"<file") || die "Cannot open file\n";
if(eof(IN)){
print "File is empty\n";
}else{
while(<IN>){
print;
}
}
close IN;
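
The write and read steps above can be combined into one round trip. This sketch deliberately uses a different style from the examples above: lexical filehandles and the three-argument form of open, both available in perl 5.6 and later. The scratch file comes from the core module File::Temp; all names are assumptions for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $file) = tempfile(UNLINK => 1);
close $tmp;

# Write three lines. The three-argument open keeps the mode apart from the name.
open(my $out, '>', $file) or die "Cannot open $file for output: $!\n";
print $out "line $_\n" for 1 .. 3;
close $out;

# Read the file back line by line.
open(my $in, '<', $file) or die "Cannot open $file for input: $!\n";
my @lines;
while (my $line = <$in>) {
    chomp $line;
    push @lines, $line;
}
close $in;

print scalar(@lines), " lines, last one: $lines[-1]\n";
```

Including $! in the message reports why the open failed, for example a missing file or missing permissions.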

Process Files mentioned on the Commandline
The empty filehandle "<>" reads each file named on the command line in turn, line by
line, and opens the files itself. The name of the file currently being processed is in
$ARGV. For example, print each line of several files prepended with its filename:
while(<>){
$file = $ARGV;
print $file,"\t",$_;
....commands for this file....
}

Get Filenames
Get current directory at once
@dir = <*>;

Use current directory iteratively


while(<*>){
...commands...
}

Select files with <>


@files = </longpath/*.c>;

Select files with glob()


This is the official way of globbing:
@files = glob("$mypath/*$suffix");

Readdir()
Perl can also read a directory itself, without a globbing shell. This is faster and more
controllable, but one has to use opendir() and closedir(). The result of readdir must be
assigned explicitly; older perls do not put it into $_.
opendir(DIR,".") or die "Cannot open dir.\n";
while(defined($file = readdir DIR)){
rename $file,lc($file);
}
closedir(DIR);
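
readdir also returns the entries "." and "..", which usually have to be skipped. A sketch that lists a scratch directory; the directory is created with the core module File::Temp, and the file names are sample data.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a scratch directory with two empty files in it.
my $dir = tempdir(CLEANUP => 1);
for my $name ('A.TXT', 'b.txt') {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $name: $!\n";
    close $fh;
}

opendir(my $dh, $dir) or die "Cannot open $dir: $!\n";
# In list context readdir returns all entries at once.
my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir $dh;

@entries = sort @entries;
print "@entries\n";
```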

Pipe Input and Output from/to Unix Commands
Process Data from a Unix Pipe
open(IN,"unixcommand|") || die "Could not execute unixcommand\n";
while(<IN>){
command;
}
close IN;
Output Data into a Unix Pipe
open(OUT,"|more") || die "Could not open the pipe to more\n";
for $name (@names){
$length = length($name);
print OUT "The name $name consists of $length characters\n";
}
close OUT;
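
Reading from a command can also be written with the list form of open, which passes the arguments to the program without going through a shell (perl 5.8 or later on Unix). In this sketch the running perl itself, found in $^X, stands in for "unixcommand"; that choice is an assumption made only to keep the example self-contained.

```perl
use strict;
use warnings;

# '-|' means: run the command and read its standard output.
open(my $cmd, '-|', $^X, '-e', 'print "one\ntwo\n"')
    or die "Could not start command: $!\n";
chomp(my @lines = <$cmd>);

# close() on a pipe waits for the command and sets $? to its status.
close($cmd) or warn "Command exited with status ", $? >> 8, "\n";

print "got: @lines\n";
```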

Execute Unix Commands


Execute a Unix Command and forget about the Output
system("someprog -auexe -fv $filename");

Execute a Unix Command and store the Output into a Variable
If it's just one line or a string:

chomp($date = qx!/usr/bin/date!); The chomp() (perl5) removes the trailing "\n". $date
gets the date.

If it gives a series of lines, one puts the output into an array:

chomp(@alllines = qx!/usr/bin/who!);

Replace the whole perl program by a unix program


exec "anotherprog"; After this the perl program is gone; exec does not return.
The Perl built-in Functions
String Functions
Get all upper case with: $name = uc($name);
Get only first letter uppercase: $name = ucfirst($name);
Get all lowercase: $name = lc($name);
Get only first letter lowercase: $name = lcfirst($name);
Get the length of a string: $size = length($string);
Extract 5-th to 10-th characters from a string: $part = substr($whole,4,6);
Remove line ending: chomp($var);
Remove last character: chop($var);
Crypt a string: $code = crypt($word,$salt);
Execute a string as perl code: eval $var;
Show position of substring in string: $pos = index($string,$substring);
Show position of last substring in string: $pos = rindex($string,$substring);
Quote all metacharacters: $quote = quotemeta($string);

Array Functions
Get the elements for which a command returned true: @found = grep(/[Jj]ohn/,@users);
Apply a command to each element of an array: @new = map(lc($_),@start);
Put all array elements into a single string: $string = join(' ',@arr);
Split a string and make an array out of it: @data = split(/&/,$ENV{'QUERY_STRING'});
Sort an array: sort(@salary);
Reverse an array: reverse(@salary);
Get the keys of a hash (associative array): keys(%hash);
Get the values of a hash: values(%hash);
Get key and value of a hash iteratively: each(%hash);
Delete an array: @arr = ();
Delete an element of a hash: delete $hash{$key};
Check for a hash key: if(exists $hash{$key}){;}
Check whether a hash has elements: scalar %hash;
Cut off last element of an array and return it: $last = pop(@IQ_list);
Cut off first element of an array and return it: $first = shift(@topguy);
Append an array element at the end: push(@waiting,$name);
Prepend an array element to the front: unshift(@nowait,$name);
Remove first 2 elements and replace them with $var: splice(@arr,0,2,$var);
Get the number of elements of an array: scalar @arr;
Get the last index of an array: $lastindex = $#arr;
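
The effect of the most common array functions, traced step by step on made-up data. The comments show the array contents after each call.

```perl
use strict;
use warnings;

my @queue = ('b', 'c');
push @queue, 'd';             # (b c d)
unshift @queue, 'a';          # (a b c d)
my $last  = pop @queue;       # returns d, queue is (a b c)
my $first = shift @queue;     # returns a, queue is (b c)

# splice removes 2 elements starting at index 0 and inserts 'x' instead.
my @removed = splice(@queue, 0, 2, 'x');   # removed (b c), queue is (x)

# sort compares as strings by default; use <=> for numbers.
my @salary = sort { $a <=> $b } (3000, 250, 12000);

# grep keeps the elements for which the expression is true.
my @found = grep(/[Jj]ohn/, 'john', 'Mary', 'Johnny');

print "queue=@queue removed=@removed salary=@salary found=@found\n";
```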

File Functions
Open a file for input: open(IN,"</path/file") || die "Cannot open file\n";
Open a file for output: open(OUT,">/path/file") || die "Cannot open file\n";
Open for appending: open(OUT,">>$file") || &myerr("Couldn't open $file");
Close a file: close OUT;
Set permissions: chmod 0755, $file;
Delete a file: unlink $file;
Rename a file: rename $file, $newname;
Make a hard link: link $existing_file, $link_name;
Make a symbolic link: symlink $existing_file, $link_name;
Make a directory: mkdir $dirname, 0755;
Delete a directory: rmdir $dirname;
Reduce a file's size: truncate $file, $size;
Change owner- and group-ID: chown $uid, $gid, $file;
Find the real file of a symlink: $file = readlink $linkfile;
Get all the file infos: @stat = stat $file;
Conversion Functions
Number to character: chr $num;
Character to number: ord($char);
Hex to decimal: hex("0x4F");
Octal to decimal: oct("0700");
Get localtime from time: localtime(time);
Get greenwich meantime: gmtime(time);
Pack variables into string: $string = pack("C4",split(/\./,$IP));
Unpack the above string: @arr = unpack("C4",$string);

Subroutines (=functions in C++)


Define a Subroutine
sub mysub {
command;
}
Example:
sub myerr {
print "The following error occurred:\n";
print $_[0],"\n";
&cleanup;
exit(1);
}

Call a Subroutine
&mysub;

Give Arguments to a Subroutine


&mysub(@data);
Receive Arguments in the Subroutine
As global variables:
sub mysub {
@myarr = @_;
}
sub mysub {
($dat1,$dat2,$dat3) = @_;
}
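As local variables (in perl 5, my() makes variables private to the subroutine; local()
only saves a global variable and restores it afterwards):
sub mysub {
my($dat1,$dat2,$dat3) = @_;
}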
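
A complete subroutine that receives its arguments into private variables and returns a value. The name average and the data are made up for this sketch.

```perl
use strict;
use warnings;

sub average {
    my @numbers = @_;               # copy of the arguments, private to the sub
    return unless @numbers;         # no input: return undef in scalar context
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;         # @numbers in scalar context is the count
}

my $avg  = average(2, 4, 9);
my $none = average();

print "average: $avg\n";
```

With strict in force, a variable that was not declared with my is reported at compile time, which catches most misspelled names.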

Some of the Special Variables


SYNTAX MEANING
$_ Default variable, e.g. the current element of a loop: for(@arr){ $field = $_ . " ok"; }
$. Current line number of the file being read with: while(<XX>){
$0 Program name
$$ Process id of current program
$< The real uid of current program
$> Effective uid of current program
$| For flushing output: select XXX; $| = 1;
$& The match of the last pattern search
$1.... The ()-embraced matches of the last pattern search
$` The string to the left of the last match
$' The string to the right of the last match

Forking
Forking is easy: just call fork. It returns the child's process id to the parent, 0 to the
child and undef on failure. One puts the fork in a three-way if(){} to separate the
parent, the child and the error case.
if($pid = fork){
# Parent
command;
}elsif(defined $pid){
# Child
command;
# The child must end with an exit!!
exit;
}else{
# Error
die "Fork did not work\n";
}
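
The parent usually waits for its child and collects the exit status. A minimal sketch, with an arbitrary exit status of 3 chosen for illustration:

```perl
use strict;
use warnings;

my $pid = fork;
die "Fork did not work: $!\n" unless defined $pid;   # undef signals failure

if ($pid == 0) {
    # Child: do some work, then leave with an exit status.
    exit 3;
}

# Parent: wait for the child; its exit status is in the high byte of $?.
waitpid($pid, 0);
my $status = $? >> 8;
print "child $pid exited with status $status\n";
```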

Building Pipes for forked Children


Building a Pipe
pipe(READHANDLE,WRITEHANDLE);

Flushing the Pipe


select(WRITEHANDLE); $| = 1; select(STDOUT);

Setting up two Pipes between the Parent and a Child


pipe(FROMCHILD,TOPARENT); select(TOPARENT); $| = 1; select(STDOUT);
pipe(FROMPARENT,TOCHILD); select(TOCHILD); $| = 1; select(STDOUT);

if($pid = fork){
# Parent
close FROMPARENT;
close TOPARENT;
command;
}elsif(defined $pid){
# Child
close FROMCHILD;
close TOCHILD;
command;
exit;
}else{
# Error
command;
exit;
}
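
A runnable sketch of one direction of this setup: the child writes a line, the parent reads it. It uses lexical filehandles instead of the bareword handles above; the handle names are assumptions for this example. What is written to the second handle of pipe() is read from the first.

```perl
use strict;
use warnings;

pipe(my $from_child, my $to_parent) or die "Cannot create pipe: $!\n";

my $pid = fork;
die "Fork did not work: $!\n" unless defined $pid;

if ($pid == 0) {
    # Child: keep only the write end.
    close $from_child;
    print $to_parent "hello from child $$\n";
    close $to_parent;
    exit 0;
}

# Parent: keep only the read end; otherwise the read never sees end of file.
close $to_parent;
my $message = <$from_child>;
close $from_child;
waitpid($pid, 0);

chomp $message;
print "parent got: $message\n";
```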

Building a Socket Connection to another Computer
# Somewhere at the beginning of the script
require 5.002;
use Socket;
use sigtrap;

# Prepare infos
$port = 80;
$remote = 'remotehost.domain';
$proto = getprotobyname('tcp');
$iaddr = inet_aton($remote);
$paddr = sockaddr_in($port,$iaddr);

# Socket
socket(S,AF_INET,SOCK_STREAM,$proto) or die $!;

# Flush socket
select(S); $| = 1; select(STDOUT);

# Connect
connect(S,$paddr) or die $!;

# Print to socket
print S "something\n";

# Read from socket


$gotit = <S>;

# Or read a single character only


read(S,$char,1);

# Close the socket


close(S);

Get Unix User and Network Information


Get the password entry for a particular user with: @entry = getpwnam($user);
Or by user ID: @entry = getpwuid($uid);

One can get information for group, host, network, services and protocols in the same way
with the commands: getgrnam, getgrgid, gethostbyname, gethostbyaddr, getnetbyname,
getnetbyaddr, getservbyname, getservbyport, getprotobyname, getprotobynumber.

If one wants to get all the entries of a particular category one can loop through them by:
setpwent;
while(@he = getpwent){
commands...
}
endpwent;

For example: Get a list of all users with their home directories:
setpwent;
while(@he = getpwent){
printf("%-20s%-30s\n",$he[0],$he[7]);
}
endpwent;
The same principle works for all the above data categories. But most of them need a
"stayopen" argument after the set command, e.g. sethostent(1);

Arithmetics
Addition: +
Subtraction: -
Multiplication: *
Division: /
Raise to the power of: **
Raise e to the power of: exp()
Modulus: %
Square root: sqrt()
Absolute value: abs()
Arc tangent of y/x: atan2(y,x)
Sine: sin()
Cosine: cos()
Random number: rand()
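
Perl has no built-in tan() or constant for pi; both can be derived from the functions above. A sketch with sample values; int(), which truncates to an integer, is a standard built-in not listed above.

```perl
use strict;
use warnings;

my $pi      = 4 * atan2(1, 1);               # atan2(y, x) is the arc tangent of y/x
my $tangent = sin($pi / 4) / cos($pi / 4);   # tangent as sine divided by cosine
my $power   = 2 ** 10;                       # 1024
my $rest    = 17 % 5;                        # 2
my $root    = sqrt(144);                     # 12
my $dice    = 1 + int(rand(6));              # random integer from 1 to 6

printf("pi=%.5f tan=%.3f power=%d rest=%d root=%d\n",
       $pi, $tangent, $power, $rest, $root);
```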

Formatting Output with "format"


This is meant as a simplification of printf formatting. One defines a format only once, and
it is then used for every write to the specified filehandle. Prepare a format somewhere in
the program, with one variable for each field of the picture line:

format FILEHANDLE =
@<<<<<<<<<<@###.#####@>>>>>>>>>>@||||||||||
$var1, $var2, $var3, $var4
.

Now use write to print to that filehandle according to the format:

write FILEHANDLE;
The @<<< does left adjustment, the @>>> right adjustment, @##.## is for numbers
and @||| centers.
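
The picture syntax can be tried without declaring a format by calling the built-in formline(), which fills the accumulator variable $^A. The helper name swrite and the sample values are assumptions for this sketch; the approach follows the one described in the perlform documentation.

```perl
use strict;
use warnings;

# Format the values according to a picture line and return the result.
sub swrite {
    my ($picture, @values) = @_;
    $^A = '';                       # clear the format accumulator
    formline($picture, @values);
    return $^A;
}

# Single quotes keep perl from treating @ as the start of an array name.
my $picture = '@<<<<<<<<< @##.## @>>>>>>>>> @|||||||||' . "\n";
my $row = swrite($picture, 'apple', 3.14159, 'right', 'mid');

print $row;
```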

Command line Switches


Show the version number of perl: perl -v;
Check a new program without running it: perl -wc <file>;
Run an editing command given on the command line: perl -e 'command';
Automatically print while processing lines: perl -pe 'command' <file>;
Remove line endings and add them again: perl -lpe 'command' <file>;
Edit a file in place: perl -i -pe 'command' <file>;
Autosplit the lines while editing: perl -ane 'print if $F[3] =~ /ETH/;' <file>;
Have an input loop without printing: perl -ne 'command' <file>;