Practical Extraction and Report Language

(PERL)
Introduction
• What is PERL?
– Practical Extraction and Report Language.
– It is an interpreted language optimized for scanning
arbitrary text files, extracting information from them, and
printing reports based on that information.
– Very powerful string handling features.
– Available on most platforms.

Internet & Web Based Technology 2


Main Advantages
• Speed of development
– You can enter the program in a text file, and just run it. Perl is an
interpreted language; no separate compilation step is needed.
• It is powerful
– The regular expressions of Perl are extremely powerful.
– Uses sophisticated pattern matching techniques to scan large
amounts of data very quickly.
• Portability
– Perl is a standard language and is available on most platforms.
– Free versions are available on the Internet.
• Editing Perl programs
– No sophisticated editing tool is needed.
– Any simple text editor like Notepad or vi will do.


• Flexibility
– Perl does not limit the size of your data.
– If memory is available, Perl can handle the whole file as a single
string.
– Allows one to write simple programs to perform complex tasks.



How to run Perl?
• Perl can be downloaded from the Internet.
– Available on almost all platforms.
• Assumptions:
– For Windows operating system, you can run Perl programs
from the command prompt.
• Run “cmd” to get command prompt window.
– For Unix/Linux, you can run directly from the shell prompt.



Working through an example
• Recommended steps:
– Create a directory/folder where you will be storing the Perl
files.
– Using any text editor, create a file “test.pl” with the
following content:

print "Good day\n";
print "This is my first Perl program\n";

– Execute the program by typing the following at the
command prompt:
perl test.pl


• On Unix/Linux, a “shebang” line can be given at the
beginning of a Perl program, so that the file can be run
directly as an executable (after chmod +x).

#!/usr/bin/perl
print "Good day\n";
print "This is my first Perl program \n";


Variables
• Scalar variables
– A scalar variable holds a single value.
– Other variable types are also available (array and
associative array) – to be discussed later.
– A ‘$’ is used before the name of a variable to indicate that it
is a scalar variable.
$xyz = 20;



• Some examples:
$a = 10;
$name = "Indranil Sen Gupta";
$average = 28.37;

– Variables do not have any fixed types.

– Variables can be printed as:

print "My name is $name, the average temperature is $average\n";


• Data types:
– Perl does not require the types of variables to be declared.
• It is a loosely (dynamically) typed language.
• Languages like C or Java are statically typed.


Variable Interpolation

• A powerful feature
– Variable names are automatically replaced by values when
they appear in double-quoted strings.
• An example:
$stud = "Rupak";
$marks = 75;
print "Marks obtained by $stud is $marks\n";
print 'Marks obtained by $stud is $marks\n';


– The program will give the following output:

Marks obtained by Rupak is 75
Marks obtained by $stud is $marks\n

– What do we see:
• Inside single quotes nothing is interpolated: variable names
and the \n are printed literally.
• If we need to do variable interpolation, use double
quotes; otherwise, use single quotes.


• Another example:

$Expense = '$100';
print "The expenditure is $Expense.\n";
# prints: The expenditure is $100.


Expressions with Scalars

• Illustrated through examples (syntax similar to C)

$abc = 10;
$abc++;
$total--;
$a = $b ** 10; # exponentiation
$a = $b % 10; # modulus
$balance = $balance + $deposit;
$balance += $deposit;


• Operations on strings:
– Concatenation: the dot (.) is used.
$a = "Good";
$b = " day";
$c = "\n";
$total = $a.$b.$c; # concatenate the strings

$a .= " day\n"; # add to the string $a


– Increment operation on strings
$a = "bat";
$b = $a;
$b++;
print $a, " and ", $b;

will print bat and bau

– The ++ operator increments a purely alphanumeric string as a
string (“magic” increment). Note that $a + 1 would instead
treat “bat” as the number 0 and give 1.
• May not always be meaningful.


– String repetition operator (x).
$a = $b x 3;
will concatenate three copies of $b and assign it to $a.

print "Ba" . "na" x 2;

will print the string “Banana”.


String as a Number
• A string can be used in an arithmetic expression.
– How is the value evaluated?
– When converting a string to a number, Perl skips any leading
spaces, then takes an optional minus sign and as many digits as
it can find (with a decimal point) at the beginning of the
string, and ignores everything else.

"23.54" evaluates to 23.54
"123Hello25" evaluates to 123
"banana" evaluates to 0
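The conversion rule above can be checked with a small sketch. It assumes the ‘use strict’ / ‘use warnings’ pragmas and ‘my’ variables (introduced later in these slides); the input strings other than the three from the slide are made up for illustration.

```perl
use strict;
use warnings;

# Adding 0 forces numeric context. The "numeric" warning is silenced
# only inside this block so that the partial conversions run quietly.
my @inputs = ("23.54", "123Hello25", "banana", "  -7 apples");
my @values;
{
    no warnings 'numeric';
    @values = map { $_ + 0 } @inputs;
}

for my $i (0 .. $#inputs) {
    print "\"$inputs[$i]\" evaluates to $values[$i]\n";
}
```

With warnings enabled and not silenced, Perl would report that a string such as "banana" is not numeric, which is usually a helpful hint that the data is not what was expected.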


Escaping
• The character ‘\’ is used as the escape character.
– It removes the special meaning of characters such as $, @ and \ inside strings.

$num = 20;
print "Value of \$num is $num\n";

print "The windows path is c:\\perl\\";


Line Oriented Quoting
• Perl supports specification of a string spanning
multiple lines.
– Use the marker ‘<<’.
– Follow it immediately (no space in between) by a string, which is
used to terminate the quoted material.
• Example:
print <<terminator;
Hello, how are you?
Good day.
terminator


• Another example:

print "<HTML>\n";
print "<HEAD><TITLE>Test page </TITLE></HEAD>\n";
print "<BODY>\n";
print "<H2>This is a test document.</H2>\n";
print "</BODY></HTML>";


print <<EOM;
<HTML>
<HEAD><TITLE>Test page </TITLE></HEAD>
<BODY>
<H2>This is a test document.</H2>
</BODY></HTML>
EOM


Lists and Arrays
Basic Difference

• A list is an ordered collection of scalars.
• An array is a variable that holds a list.
• Each element of an array is a scalar.
• The size of an array:
– Lower limit: 0
– Upper limit: no specific limit; depends on virtual memory.


List Literal

• Examples:
(10, 20, 50, 100)
('red', "blue", "green")
("a", 1, 2, 3, 'b')

($a, 12)
() # empty list
(10..20) # range operator: the list 10, 11, ..., 20
('A'..'Z') # same, for letters


Specifying Array Variable
• We use the special character ‘@’.
@months # denotes an array

The individual elements of the array are scalars, and can be
referred to as:
$months[0] # first element of @months
$months[1] # second element of @months
……


Initializing an Array
• Two ways:
– Specify values, separated by commas.
@color = ('red', 'green', "blue", "black");

– Use the quote words (qw) operator, which uses whitespace as the
delimiter:
@color = qw(red green blue black);


Array Assignment
– Assign from a list of literals
@numbers = (1, 2, 3);
@colors = ("red", "green", "blue");

– From the contents of another array.
@array1 = @array2;

– Using the qw operator:
@word = qw(Hello good morning);

– Combination of above:
@allcolors = ("white", @colors, "brown");


– Some other examples:

@xyz = (2..5); # (2, 3, 4, 5)
@xyz = (1, @xyz); # (1, 2, 3, 4, 5)
@xyz = (@xyz, 6); # (1, 2, 3, 4, 5, 6)


Multiple Assignments
($x, $y, $z) = (10, 20, 30);

($x, $y) = ($y, $x); # swap elements

($a, @col) = ('red', 'green', 'blue');
# $a gets the value 'red'
# @col gets the value ('green', 'blue')

($first, @val, $last) = (1, 2, 3, 4);
# $first gets the value 1
# @val gets the value (2, 3, 4)
# $last is undefined


Number of Elements in Array

• Two ways:
$size = scalar @colors;
$size = @colors;



Accessing Elements
@list = (1, 2, 3, 4);
$first = $list[0];
$fourth = $list[3];
$list[1]++; # array becomes (1, 3, 3, 4)
$x = $list[5]; # $x gets the value undef
$list[2] = "Go"; # array becomes (1, 3, "Go", 4)


• $#value is the index of the last element of the array @value.
@value = (1, 2, 3, 4, 5);

print "$#value \n"; # prints 4

• For an empty array, $#value is -1.


shift and unshift
• They operate on the front of the array.
– ‘shift’ removes the first element of the array.
– ‘unshift’ adds an element at the start of the array.


• Example:
@color = qw(red blue green black);

$first = shift @color;
# $first gets "red", and @color becomes
# (blue, green, black)

unshift (@color, "white");
# @color becomes (white, blue, green, black)


pop and push
• They operate on the end of the array.
– ‘pop’ removes the last element of the array.
– ‘push’ adds an element at the end of the array.


• Example:
@color = qw(red blue green black);

$last = pop @color;
# $last gets "black", and @color becomes
# (red, blue, green)

push (@color, "white");
# @color becomes (red, blue, green, white)


Reversing an Array

• By using the ‘reverse’ function.

@names = ("Mina", "Tina", "Rina");

@rev = reverse @names;
# Reversed list stored in @rev.

@names = reverse @names;
# Original array is reversed.


Printing an Array

• Example:

@colors = qw(red green blue);

print @colors;
# prints without spaces – redgreenblue

print "@colors";
# prints with spaces – red green blue


Sort the Elements of an Array
• Using the ‘sort’ function, by default we can sort the
elements of an array lexicographically.
– Elements considered as strings.

@colors = qw(red blue green black);
@sort_col = sort @colors;
# Array @sort_col is (black blue green red)


– Another example:

@num = qw(10 2 5 22 7 15);
@new = sort @num;
# @new will contain (10 15 2 22 5 7)

– How do we sort numerically?

@num = qw(10 2 5 22 7 15);
@new = sort {$a <=> $b} @num;
# @new will contain (2 5 7 10 15 22)
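The comparison block given to ‘sort’ can express other orderings as well. The following is a minimal sketch, not part of the original slides; it assumes ‘use strict’ and ‘my’ variables, and the variable names are chosen for illustration.

```perl
use strict;
use warnings;

my @num = qw(10 2 5 22 7 15);

# Swapping $a and $b reverses the order: largest number first.
my @descending = sort { $b <=> $a } @num;
print "@descending\n";

my @colors = qw(red blue green black);

# Sort by string length; ties are broken alphabetically with cmp.
my @by_length = sort { length($a) <=> length($b) or $a cmp $b } @colors;
print "@by_length\n";
```

Inside the block, $a and $b are the two elements being compared; <=> compares them as numbers and cmp compares them as strings.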


The ‘splice’ function
• Arguments to the ‘splice’ function:
– The first argument is an array.
– The second argument is an offset (index number of the list
element to begin splicing at).
– Third argument is the number of elements to remove.

@colors = ("red", "green", "blue", "black");
@middle = splice (@colors, 1, 2);
# @middle contains the elements removed: (green, blue)
# @colors is now (red, black)
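‘splice’ also accepts an optional list after the third argument, which is inserted in place of the removed elements. A short sketch under that assumption (the colour names added here are illustrative):

```perl
use strict;
use warnings;

my @colors = qw(red green blue black);

# Replace two elements, starting at index 1, with three new ones.
my @removed = splice(@colors, 1, 2, qw(cyan magenta yellow));
print "removed: @removed\n";
print "colors:  @colors\n";

# A length of 0 removes nothing, so this is a pure insertion at index 1.
splice(@colors, 1, 0, 'white');
print "colors:  @colors\n";
```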


File Handling
Interacting with the user
• Read from the keyboard (standard input).
– Use the file handle STDIN with the line input operator: <STDIN>.
– Very simple to use.

print "Enter your name: ";
$name = <STDIN>; # Read from keyboard
print "Good morning, $name. \n";

– $name also contains the newline character.
• Need to chop it off.


The ‘chop’ Function
• The ‘chop’ function removes the last character of
whatever it is given to chop.
• In the following example, it chops the newline.

print "Enter your name: ";
chop ($name = <STDIN>);
# Read from keyboard and chop newline
print "Good morning, $name. \n";

• ‘chop’ removes the last character irrespective of
whether it is a newline or not.
– Sometimes dangerous.


Safe chopping: ‘chomp’
• The ‘chomp’ function works similarly to ‘chop’, with
the difference that it chops off the last character
only if it is a newline.

print "Enter your name: ";
chomp ($name = <STDIN>);
# Read from keyboard and chomp newline
print "Good morning, $name. \n";
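The difference between the two functions shows up when the string has no trailing newline. A small sketch, using fixed strings instead of keyboard input so that it runs unattended (the variable names are illustrative):

```perl
use strict;
use warnings;

my $with_newline    = "Rupak\n";
my $without_newline = "Rupak";

# chomp removes a trailing newline and leaves other text alone.
chomp(my $a1 = $with_newline);
chomp(my $a2 = $without_newline);

# chop always removes the last character, whatever it is.
chop(my $b1 = $with_newline);
chop(my $b2 = $without_newline);

print "chomp: [$a1] [$a2]\n";
print "chop:  [$b1] [$b2]\n";
```

In the last case ‘chop’ removes the final letter of the name, which is why ‘chomp’ is normally preferred for cleaning up input lines.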


File Operations
• Opening a file
– The ‘open’ function opens a file and associates it with a file
handle; it returns a true value on success.
– For standard input, we have a predefined handle STDIN.

$fname = "/home/isg/report.txt";
open XYZ, $fname;
while (<XYZ>) {
    print "Line number $. : $_";
}


– Checking the error code:

$fname = "/home/isg/report.txt";
open XYZ, $fname or die "Error in open: $!";
while (<XYZ>) {
    print "Line number $. : $_";
}

– $. holds the current line number (starting at 1)
– $_ holds the line just read
– $! holds the error code/message


• Reading from a file:
– The last example also illustrates file reading.
– The angle brackets (<>) are the line input operator.
• In a while condition, the line read goes into $_


• Writing into a file:

$out = "/home/isg/out.txt";
open XYZ, ">$out" or die "Error in write: $!";
for $i (1..20) {
    print XYZ "$i :: Hello, the time is ",
        scalar(localtime), "\n";
}


• Appending to a file:

$out = "/home/isg/out.txt";
open XYZ, ">>$out" or die "Error in write: $!";
for $i (1..20) {
    print XYZ "$i :: Hello, the time is ",
        scalar(localtime), "\n";
}


• Closing a file:
close XYZ;
where XYZ is the file handle of the file being closed.



• Printing a file:
– This is very easy to do in Perl.

$input = "/home/isg/report.txt";
open IN, $input or die "Error in open: $!";
while (<IN>) {
    print;
}
close IN;
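The same open / write / read / close cycle is often written with the three-argument form of ‘open’ and a file handle held in a ‘my’ variable. This style is not used in the slides; the sketch below assumes it, and uses the core File::Temp module to create a scratch file so that no real file is touched.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not touch real data.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write three lines: mode '>' truncates, '>>' would append.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} "line $_\n" for 1 .. 3;
close $out or die "Cannot close $path: $!";

# Read the lines back, numbering them with $. as in the slides.
my @seen;
open(my $in, '<', $path) or die "Cannot read $path: $!";
while (my $line = <$in>) {
    chomp $line;
    push @seen, "$.: $line";
}
close $in;

print "$_\n" for @seen;
```

Keeping the mode separate from the file name avoids surprises when a file name itself begins with a character such as ‘>’.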


Command Line Arguments
• Perl uses a special array called @ARGV.
– List of arguments passed along with the script name on the
command line.
– Example: if you invoke Perl as:
perl test.pl red blue green
then @ARGV will be (red blue green).
– Printing the command line arguments:

foreach (@ARGV) {
    print "$_ \n";
}


Standard File Handles
• STDIN
– Read from standard input (keyboard), written as <STDIN> when reading a line.
• STDOUT
– Print to standard output (screen).
• STDERR
– For outputting error messages.
• ARGV
– Reading from <ARGV> takes the names of the files from the command line and
opens them all.


– @ARGV array contains the text after the program’s name in
command line.
• <ARGV> takes each file in turn.
• If there is nothing specified on the command line, it
reads from the standard input.
– Since this is very commonly used, Perl provides an
abbreviation for <ARGV>, namely, <> (written without a space).
– An example is shown.


$lineno = 1;
while (<>) {
    print "$lineno: $_";
    $lineno++;
}

– In this program, the names of the input files are given on the
command line.
perl list_lines.pl file1.txt
perl list_lines.pl a.txt b.txt c.txt


Control Structures
Introduction
• There are many control constructs in Perl.
– Similar to those in C.
– Would be illustrated through examples.
– The available constructs:
• for
• foreach
• if/elsif/else
• while
• do, etc.


Concept of Block
• A statement block is a sequence of statements
enclosed in matching pair of { and }.

if ($year == 2000) {
    print "You have entered new millennium.\n";
}

• Blocks may be nested within other blocks.


Definition of TRUE in Perl
• In Perl, only a few things are considered as FALSE:
– The number 0 and the string "0"
– The empty string ("")
– undef
– An empty list or array
• Everything else in Perl is TRUE.
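The rule can be tried out on a few borderline values. This is a sketch with made-up test values; it assumes ‘my’ variables and array references, which the slides do not cover.

```perl
use strict;
use warnings;

# Each pair is a label and the value to test.
my @cases = (
    [ 'number 0',     0     ],
    [ 'string "0"',   '0'   ],
    [ 'empty string', ''    ],
    [ 'undef',        undef ],
    [ 'string "0.0"', '0.0' ],
    [ 'string "00"',  '00'  ],
    [ 'single space', ' '   ],
);

my %truth;
for my $case (@cases) {
    my ($label, $value) = @$case;
    $truth{$label} = $value ? 'TRUE' : 'FALSE';
    printf "%-14s %s\n", $label, $truth{$label};
}
```

Strings such as "0.0", "00" and a single space are TRUE, because only the exact string "0" and the empty string count as FALSE.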


if .. else
• General syntax:

if (test expression) {
# if TRUE, do this
}
else {
# if FALSE, do this
}



• Examples:

if ($name eq 'isg') {
    print "Welcome Indranil. \n";
} else {
    print "You are somebody else. \n";
}

if ($flag == 1) {
    print "There has been an error. \n";
}
# The else block is optional


elsif
• Example:

print "Enter your id: ";
chomp ($name = <STDIN>);
if ($name eq 'isg') {
    print "Welcome Indranil. \n";
} elsif ($name eq 'bkd') {
    print "Welcome Bimal. \n";
} elsif ($name eq 'akm') {
    print "Welcome Arun. \n";
} else {
    print "Sorry, I do not know you. \n";
}


while

• Example: (Guessing the correct word)

$your_choice = '';
$secret_word = 'India';
while ($your_choice ne $secret_word) {
    print "Enter your guess: \n";
    chomp ($your_choice = <STDIN>);
}

print "Congratulations! Mera Bharat Mahan.";


for
• Syntax same as in C.
• Example:

for ($i=1; $i<10; $i++) {
    print "Iteration number $i \n";
}


foreach
• Very commonly used loop construct that iterates over a
list.
• Example:

@colors = qw(red blue green);
foreach $name (@colors) {
    print "Color is $name. \n";
}

• We can use ‘for’ in place of ‘foreach’.


• Example: Counting odd numbers in a list
@xyz = qw(10 15 17 28 12 77 56);
$count = 0;

foreach $number (@xyz) {
    if (($number % 2) == 1) {
        print "$number is odd. \n";
        $count ++;
    }
}
print "Number of odd numbers is $count. \n";


Breaking out of a loop
• The statement ‘last’, if it appears in the body of a
loop, will cause Perl to immediately exit the loop.

– Used with a conditional.

last if ($i > 10);


Skipping to end of loop
• For this we use the statement ‘next’.
– When executed, the remaining statements in the loop will
be skipped, and the next iteration will begin.
– Also used with a conditional.

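A short sketch showing ‘next’ and ‘last’ together in one loop; the list of numbers and the limit of 20 are made up for illustration.

```perl
use strict;
use warnings;

my @numbers = (3, 8, 12, 5, 40, 7, 2);
my @kept;

foreach my $n (@numbers) {
    next if $n % 2;      # odd: skip the rest of this iteration
    last if $n > 20;     # first even number above 20: leave the loop
    push @kept, $n;
}

print "Kept: @kept\n";
```

The loop stops at 40, so the later even number 2 is never examined.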


Relational Operators
The Operators Listed
Comparison          Numeric   String
Equal               ==        eq
Not equal           !=        ne
Greater than        >         gt
Less than           <         lt
Greater or equal    >=        ge
Less or equal       <=        le


Logical Connectives
• If $a and $b are logical expressions, then the
following connectives are supported by Perl:
– $a and $b      $a && $b
– $a or $b       $a || $b
– not $a         ! $a
• The two forms have the same meaning but different precedence:
‘and’, ‘or’ and ‘not’ bind more loosely than &&, || and !. The
word forms are often more readable.
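The precedence difference matters mostly in combination with assignment. The sketch below is illustrative only; the parentheses in the second statement spell out how Perl would group ‘$first_b = $empty[0] or ...’ anyway.

```perl
use strict;
use warnings;

my @empty = ();

# || binds tighter than =, so the default is part of the assignment.
my $first_a = $empty[0] || 'default';

# 'or' binds looser than =, so the assignment happens first and the
# right-hand side only runs when the assigned value is false.
my $first_b;
($first_b = $empty[0]) or print "assignment produced a false value\n";

print "first_a is '$first_a'\n";
print "first_b is ", defined $first_b ? "'$first_b'" : 'undef', "\n";
```

This is also why ‘open ... or die ...’ in the file-handling examples uses ‘or’: the whole ‘open’ call is evaluated before the ‘die’ is considered.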


String Functions
The Split Function
• ‘split’ is used to split a string into multiple pieces using a
delimiter, and create a list out of it.

$_ = 'Red:Blue:Green:White:255';
@details = split /:/, $_;
foreach (@details) {
    print "$_\n";
}

– The first parameter to ‘split’ is a regular expression that
specifies what to split on.
– The second specifies what to split.


• Another example:

$_ = "Indranil isg\@iitkgp.ac.in 283496";
($name, $email, $phone) = split / /, $_;

• By default (with no arguments), ‘split’ breaks the string in $_
using whitespace as delimiter.
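Two further forms of ‘split’ are often useful: the special pattern ' ' and a limit on the number of pieces. A sketch with made-up input strings:

```perl
use strict;
use warnings;

my $record = "  Indranil   isg\@iitkgp.ac.in\t283496  ";

# The special pattern ' ' (a single-space string) skips leading
# whitespace and splits on any run of whitespace.
my @fields = split ' ', $record;
print scalar(@fields), " fields: ", join('|', @fields), "\n";

# A third argument limits the number of pieces; the last piece
# keeps the rest of the string unsplit.
my ($key, $rest) = split /:/, 'path:/usr/bin:/bin', 2;
print "$key => $rest\n";
```

By contrast, the pattern / / in the slide splits on every single space, so repeated spaces would produce empty fields.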


The Join Function
• ‘join’ is used to concatenate several elements into a
single string, with a specified delimiter in between.

$new = join ' ', $x1, $x2, $x3, $x4, $x5, $x6;

$sep = '::';
$new = join $sep, $x1, $x2, $x3, @abc, $x4, $x5;


Regular Expressions
Introduction
• One of the most useful features of Perl.
• What is a regular expression (RegEx)?
– A pattern, written in a special syntax, that describes a set of strings.
– Basically specifies a chunk of text.
– Very powerful way to specify string patterns.


An Example: without RegEx

$found = 0;
$_ = "Hello good morning everybody";
$search = "every";
foreach $word (split) {
    if ($word eq $search) {
        $found = 1;
        last;
    }
}
if ($found) {
    print "Found the word 'every' \n";
}


Using RegEx
$_ = "Hello good morning everybody";

if ($_ =~ /every/) {
    print "Found the text 'every' \n";
}

• Very easy to use.
• The text between the forward slashes defines the
regular expression.
• The pattern may match anywhere in the string: here it finds
“every” inside “everybody”, which a word-by-word comparison
with ‘eq’ would not.
• If we use “!~” instead of “=~”, the test is true when the
pattern is not present in the string.


• The previous example illustrates literal texts as
regular expressions.
– Simplest form of regular expression.
• Point to remember:
– When performing the matching, all the characters in the
string are considered to be significant, including
punctuation and white spaces.
• For example, /every / will not match in the previous
example.



Another Simple Example
$_ = "Welcome to IIT Kharagpur, students";

if (/IIT K/) {
    print "'IIT K' is present in the string\n";
}

if (/Kharagpur students/) {
    print "This will not match\n";
}


Types of RegEx

• Basically two types:
– Matching
• Checking if a string contains a substring.
• The operator ‘m’ is used (optional if forward slash used
as delimiter).
– Substitution
• Replacing a substring by another substring.
• The operator ‘s’ is used.


Matching
The =~ Operator
• Tells Perl to apply the regular expression on the
right to the value on the left.
• The regular expression is contained within
delimiters (forward slash by default).
– If some other delimiter is used, then a preceding ‘m’ is
essential.



Examples
$string = "Good day";

if ($string =~ m/day/) {
    print "Match successful \n";
}

if ($string =~ /day/) {
    print "Match successful \n";
}

• Both forms are equivalent.
• The ‘m’ in the first form is optional.


$string = "Good day";

if ($string =~ m@day@) {
    print "Match successful \n";
}

if ($string =~ m[day]) {
    print "Match successful \n";
}

• Both forms are equivalent.
• The character following ‘m’ is the delimiter; bracket
delimiters are used in matching pairs.


Character Class
• Use square brackets to specify “any value in the list
of possible values”.
my $string = "Some test string 1234";
if ($string =~ /[0123456789]/) {
    print "Found a number \n";
}
if ($string =~ /[aeiou]/) {
    print "Found a vowel \n";
}
if ($string =~ /[0123456789ABCDEF]/) {
    print "Found a hex digit \n";
}


Character Class Negation
• Use ‘^’ at the beginning of the character class to
specify “any single element that is not one of these
values”.

my $string = "Some test string 1234";
if ($string =~ /[^aeiou]/) {
    # also true for spaces, digits and uppercase letters
    print "Found a character that is not a lowercase vowel\n";
}


Pattern Abbreviations
• Useful in common cases

.    Anything except newline (\n)
\d   A digit, same as [0-9]
\w   A word character, [0-9a-zA-Z_]
\s   A space character (tab, space, etc)
\D   Not a digit, same as [^0-9]
\W   Not a word character
\S   Not a space character


$string = "Good and bad days";

if ($string =~ /d..s/) {
    print "Found something like days\n";
}

if ($string =~ /\w\w\w\w\s/) {
    print "Found a four-letter word!\n";
}


Anchors
• Three ways to define an anchor:
^ :: anchors to the beginning of string
$ :: anchors to the end of the string
\b :: anchors to a word boundary



if ($string =~ /^\w/)
:: does string start with a word character?

if ($string =~ /\d$/)
:: does string end with a digit?

if ($string =~ /\bGood\b/)
:: Does string contain the word “Good”?



Multipliers

• There are three multiplier characters.
* :: Find zero or more occurrences
+ :: Find one or more occurrences
? :: Find zero or one occurrence
• Some example usages:
$string =~ /^\w+/;
$string =~ /\d?/;
$string =~ /\b\w+\s+/;
$string =~ /\w+\s?$/;
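What each multiplier actually matches is easier to see when the matched text is captured with parentheses (captures are covered later in these slides). A sketch with a made-up test string:

```perl
use strict;
use warnings;

my $string = "Good  day 42";

# + needs at least one character: the leading word.
my ($first_word) = $string =~ /^(\w+)/;

# \s+ absorbs the run of two spaces between the first two words.
my ($second_word) = $string =~ /^\w+\s+(\w+)/;

# ? makes the preceding item optional, so both spellings match.
my @spellings = grep { /^colou?r$/ } qw(color colour colr);

# * may match nothing at all, so this succeeds with an empty capture.
my ($digits) = $string =~ /^(\d*)/;

print "first word:  $first_word\n";
print "second word: $second_word\n";
print "spellings:   @spellings\n";
print "digits:      '$digits'\n";
```

The last case is a common surprise: a pattern built only from * or ? items always matches, because matching nothing is allowed.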


Substitution
Basic Usage

• Uses the ‘s’ operator.
• Basic syntax is:
$new =~ s/pattern_to_match/replacement_text/;

What does this do?
• Looks for pattern_to_match in $new and, if found,
replaces it with replacement_text.
• It looks for the pattern once. That is, only the first
occurrence is replaced.
• There is a way to replace all occurrences (to be
discussed shortly).


Examples

$xyz = "Rama and Lakshman went to the forest";

$xyz =~ s/Lakshman/Bharat/;
# Rama and Bharat went to the forest
$xyz =~ s/R\w+a/Bharat/;
# Bharat and Bharat went to the forest
$xyz =~ s/[aeiou]/i/;
# Bhirat and Bharat went to the forest

$abc = "A year has 11 months \n";

$abc =~ s/\d+/12/;
$abc =~ s/\n$//; # remove the trailing newline


Common Modifiers

• Two commonly used modifiers:
/i :: ignore case
/g :: match/substitute all occurrences

$string = "Ram and Shyam are very honest";
if ($string =~ /RAM/i) {
    print "Ram is present in the string";
}

$string =~ s/m/j/g;
# Ram -> Raj, Shyam -> Shyaj


Use of Memory in RegEx
• We can use parentheses to capture a piece of
matched text for later use.
– Perl memorizes the matched texts.
– Multiple sets of parentheses can be used.
• How to recall the captured text?
– Use \1, \2, \3, etc. if still in RegEx.
– Use $1, $2, $3 if after the RegEx.



Examples

$string = "Ram and Shyam are honest";

$string =~ /^(\w+)/;
print $1, "\n"; # prints "Ram\n"

$string =~ /(\w+)$/;
print $1, "\n"; # prints "honest\n"

$string =~ /^(\w+)\s+(\w+)/;
print "$1 $2\n";
# prints "Ram and\n"


$string = "Ram and Shyam are very poor";

if ($string =~ /(\w)\1/) {
    print "found 2 in a row\n";
}

if ($string =~ /(\w+).*\1/) {
    print "found repeat\n";
}

$string =~ s/(\w+) and (\w+)/$2 and $1/;
# $string becomes "Shyam and Ram are very poor"
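Captures are typically used to pull fields out of a line of text. The sketch below uses a made-up log line and the {4} repeat count, which is not covered in these slides; it is meant as an illustration rather than a recommended date parser.

```perl
use strict;
use warnings;

my $line = "Report generated on 2024-03-15 by isg";

# Three pairs of parentheses capture year, month and day.
if ($line =~ /(\d{4})-(\d\d)-(\d\d)/) {
    my ($year, $month, $day) = ($1, $2, $3);
    print "day=$day month=$month year=$year\n";
}

# In list context a match returns the captured groups directly.
my ($user) = $line =~ /by (\w+)$/;
print "user=$user\n";

# Captures in the replacement part are written as $1, $2, ...
(my $swapped = $line) =~ s/(\d{4})-(\d\d)-(\d\d)/$3\/$2\/$1/;
print "$swapped\n";
```

Copying $1, $2, ... into named variables right after a successful match is a useful habit, since the next successful match overwrites them.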


Example 1
• validating user input

print "Enter age (or 'q' to quit): ";
chomp (my $age = <STDIN>);

exit if ($age =~ /^q$/i);

if ($age =~ /\D/) {
    print "$age is a non-number!\n";
}


Example 2: validation contd.
• File has 2 columns, name and age, delimited by one
or more spaces. Can also have blank lines or
commented lines (start with #).
open IN, $file or die "Cannot open $file: $!";
while (my $line = <IN>) {
    chomp $line;
    next if ($line =~ /^\s*$/ or $line =~ /^\s*#/);
    my ($name, $age) = split /\s+/, $line;
    print "The age of $name is $age. \n";
}


Some Special Variables
$&, $` and $'
• What is $&?
– It represents the string matched by the last successful
pattern match.
• What is $`?
– It represents the string preceding whatever was matched
by the last successful pattern match.
• What is $'?
– It represents the string following whatever was matched by
the last successful pattern match.


– Example:

$_ = 'abcdefghi';
/def/;
print "$`:$&:$'\n";
# prints abc:def:ghi


• So actually ….
– $` represents pre match
– $& represents present match
– $' represents post match


Associative Arrays
Introduction
• Associative arrays, also known as hashes.
– Similar to a list
• Every list element consists of a pair, a hash key and a
value.
• Hash keys must be unique.
– Accessing an element
• Unlike an array, an element value can be found out by
specifying the hash key value.
• Associative search.
– A hash array name must begin with a ‘%’.



Specifying Hash Array
• Two ways to specify:

– Specifying hash keys and values, in proper sequence.

%directory = (
    "Rabi", "258345",
    "Chandan", "325129",
    "Atul", "445287",
    "Sruti", "237221"
);


– Using the => operator.

%directory = (
    Rabi => "258345",
    Chandan => "325129",
    Atul => "445287",
    Sruti => "237221"
);

– A bare word (identifier) on the left hand side of ‘=>’ is
automatically treated as a quoted string.


Conversion Array <=> Hash
• An array can be converted to hash.

@list = qw(Rabi 258345 Chandan 325129 Atul
           445287 Sruti 237221);
%directory = @list;

• A hash can be converted to an array:

@list = %directory;
# the order of the (key, value) pairs is not guaranteed


Accessing a Hash Element
• Given the hash key, the value can be accessed
using ‘{ }’.
• Example:

@list = qw(Rabi 258345 Chandan 325129 Atul
           445287 Sruti 237221);
%directory = @list;
print "Atul's number is $directory{Atul} \n";


Modifying a Value
• By simple assignment:

@list = qw(Rabi 258345 Chandan 325129 Atul
           445287 Sruti 237221);
%directory = @list;

$directory{Sruti} = "453322";
$directory{'Chandan'} ++;


Deleting an Entry
• A (hash key, value) pair can be deleted from a hash
array using the “delete” function.
– Hash key has to be specified.

@list = qw(Rabi 258345 Chandan 325129 Atul
           445287 Sruti 237221);
%directory = @list;
delete $directory{Atul};


Swapping Keys and Values
• Why needed?
– Suppose we want to search for a person, given the phone
number.

@list = qw(Rabi 258345 Chandan 325129 Atul
           445287 Sruti 237221);
%directory = @list;

%revdir = reverse %directory;
print "$revdir{237221} \n";


Using Functions ‘keys’, ‘values’
• ‘keys’ returns all the hash keys as a list.
• ‘values’ returns all the values as a list.

@list = qw(Rabi 258345 Chandan 325129 Atul
           445287 Sruti 237221);
%directory = @list;

@all_names = keys %directory;
@all_phones = values %directory;


An Example
• List all person names and telephone numbers.

@list = qw(Rabi 258345 Chandan 325129 Atul
           445287 Sruti 237221);
%directory = @list;

foreach $name (keys %directory) {
    print "$name \t $directory{$name} \n";
}
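Since ‘keys’ returns the names in no particular order, listings are usually sorted. The sketch below also uses ‘exists’ and ‘keys’ in scalar context, which are not covered in these slides; the name “Mina” is only there as an example of a missing key.

```perl
use strict;
use warnings;

my %directory = (
    Rabi    => '258345',
    Chandan => '325129',
    Atul    => '445287',
    Sruti   => '237221',
);

# keys returns names in no particular order, so sort them first.
my @report = map { "$_\t$directory{$_}" } sort keys %directory;
print "$_\n" for @report;

# exists tests for a key without creating it.
for my $name (qw(Atul Mina)) {
    my $status = exists $directory{$name} ? 'listed' : 'not listed';
    print "$name is $status\n";
}

my $count = keys %directory;   # keys in scalar context gives the size
print "$count entries\n";
```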


Subroutines
Introduction
• A subroutine …..
– Is a user-defined function.
– Allows code reuse.
– Define once, use multiple times.


How to use?
• Defining a subroutine
sub test_sub {
# the body of the subroutine goes here
# ……..
}

• Calling a subroutine
– Use the ‘&’ prefix to call a subroutine.
&test_sub;
&gcd ($val1, $val2); # Two parameters
– However, the ‘&’ is optional.



Subroutine Return Values
• Use the ‘return’ statement.
– This is also optional.
– If the keyword ‘return’ is omitted, Perl functions return the
last value evaluated.
• A subroutine can also return a non-scalar.
• Some examples are given next.



Example 1
$name = 'Indranil';
welcome(); # call the first sub
welcome_name(); # call the second sub
exit;

sub welcome {
    print "hi there\n";
}

sub welcome_name {
    print "hi $name\n";
    # uses global $name variable
}


Example 2
# Return a non-scalar
sub return_alpha_and_beta {
return ($alpha, $beta);
}

$alpha = 15;
$beta = 25;

@c = return_alpha_and_beta();
# @c gets (15, 25)
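Both points about return values (the implicit return of the last value, and returning a non-scalar) can be seen in one short sketch. The subroutine names ‘square’ and ‘min_max’ are hypothetical and chosen only for this illustration.

```perl
use strict;
use warnings;

# No 'return': the value of the last expression is returned.
sub square {
    my ($n) = @_;
    $n * $n;
}

# Returns a list; the caller can unpack it into separate scalars.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

my $sq = square(7);
my ($min, $max) = min_max(5, 10, -12, 7, 40);

print "square: $sq\n";
print "min: $min, max: $max\n";
```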


Passing Arguments
• All arguments are passed into a Perl function
through the special array @_.
– Thus, we can send as many arguments as we want.
• Individual arguments can also be accessed as $_[0],
$_[1], $_[2], etc.


Example 3

# Two different ways to write a subroutine to add two numbers


sub add_ver1 {
($first, $second) = @_;
return ($first + $second);
}

sub add_ver2 {
return $_[0] + $_[1];
# $_[0] and $_[1] are the first two
# elements of @_
}



Example 4

$total = find_total (5, 10, -12, 7, 40);

sub find_total {
# adds all numbers passed to the sub
$sum = 0;
for $num (@_) {
$sum += $num;
}
return $sum;
}



‘my’ variables
• We can define local variables using the ‘my’
keyword.
– Confines a variable to a region of code (within a block { } ).
– ‘my’ variable’s storage is freed whenever the variable goes
out of scope.
– All variables in Perl are by default ‘global’.


Example 5

$sum = 7;
$total = add_any (20, 10, -15);
# $total gets 15

sub add_any {
# local variable, won't interfere
# with global $sum
my $sum = 0;

for my $num (@_ ) {


$sum += $num;
}
return $sum;
}

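The scoping rule for ‘my’ applies to any block, not only to subroutine bodies. A sketch, assuming the ‘use strict’ pragma (not used in the slides), which makes Perl reject variables that were never declared:

```perl
use strict;
use warnings;

my $sum = 7;                 # file-level variable

sub add_any {
    my $sum = 0;             # a separate variable, visible only in this sub
    $sum += $_ for @_;
    return $sum;
}

my $total = add_any(20, 10, -15);

{
    my $sum = 'inner';       # a third variable, local to this bare block
    print "inside block: $sum\n";
}

print "total: $total, outer sum still: $sum\n";
```

Each ‘my $sum’ creates a new variable that hides the outer one until its block ends, so the file-level $sum keeps its value throughout.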


Writing CGI Scripts in Perl
Introduction
• Perl provides a number of facilities for
writing CGI scripts.
– The CGI.pm library module.
• Included as part of the Perl distribution up to Perl 5.20.
• From Perl 5.22 onwards, it has to be installed separately (from CPAN).

#!/usr/bin/perl
use CGI qw (:standard);


• Some of the functions provided by the CGI module (the file
CGI.pm) are:
– header
• This prints out the “Content-type” header.
• With no arguments, the type is assumed to be
“text/html”.
– start_html
• This prints out the <html>, <head>, <title> and <body>
tags.
• Accepts optional arguments.



– end_html
• This prints out the closing HTML tags, </body>, </html>.

• Typical usages and arguments would be illustrated
through examples.


Example 1 (without using CGI.pm)

#!/usr/bin/perl
print <<TO_END;
Content-type: text/html

<HTML> <HEAD> <TITLE> Server Details </TITLE>
</HEAD>
<BODY>
Server name: $ENV{SERVER_NAME} <BR>
Server port number: $ENV{SERVER_PORT} <BR>
Server protocol: $ENV{SERVER_PROTOCOL}
</BODY> </HTML>
TO_END


Example 2 (using CGI.pm)

#!/usr/bin/perl -wT
use CGI qw(:standard);

print header ("text/html");
print start_html ("Hello World");
print "<h2>Hello, world!</h2>\n";
print end_html;


Example 3: Decoding Form Input

sub parse_form_data {
    my %form_data;
    my $name_value;
    my @nv_pairs = split /&/, $ENV{QUERY_STRING};

    if ( $ENV{REQUEST_METHOD} eq 'POST' ) {
        my $query = "";
        read (STDIN, $query, $ENV{CONTENT_LENGTH});
        push @nv_pairs, split /&/, $query;
    }


    foreach $name_value (@nv_pairs) {
        my ($name, $value) = split /=/, $name_value;

        $name =~ tr/+/ /;
        $name =~ s/%([\da-f][\da-f])/chr (hex($1))/egi;
        $value =~ tr/+/ /;
        $value =~ s/%([\da-f][\da-f])/chr (hex($1))/egi;

        $form_data{$name} = $value;
    }
    return %form_data;
}


Using CGI.pm
• The decoded form value can be directly accessed
as:
$value = param ('fieldname');

• Perl code equivalent to the last example, using
CGI.pm:
– Shown in next slide.


Example 4

#!/usr/bin/perl -wT
use CGI qw(:standard);

my %form_data;
foreach my $name (param() ) {
$form_data {$name} = param($name);
}



Example 5: sending mail

#!/usr/bin/perl -wT
use CGI qw(:standard);

print header;
print start_html ("Response to Guestbook");
$ENV{PATH} = "/usr/sbin"; # to locate sendmail
open (MAIL, "| /usr/sbin/sendmail -oi -t")
    or die "Cannot run sendmail: $!";
# open the pipe to sendmail
my $recipient = 'xyz@hotmail.com';
print MAIL "To: $recipient\n";
print MAIL "From: isg\@cse.iitkgp.ac.in\n";
print MAIL "Subject: Submitted data\n\n";


foreach my $xyz (param()) {
    print MAIL "$xyz = ", param($xyz), "\n";
}

close (MAIL);

print <<EOM;
<h2>Thanks for the comments</h2>
<p>Hope you visit again.</p>
EOM

print end_html;
