Perl

The Crash Course

 

Perl

•      Free scripting language, downloadable from www.perl.com

•      Runs on Unix, Windows and Macintosh, and works with most Web servers

What is Perl

•      Practical Extraction and Report Language

•      Pathologically Eclectic Rubbish Lister

History

•      Originally developed as a text-processing utility for Unix systems

•      Since HTML pages are just marked-up text files, Perl has taken off with the growth of the World Wide Web

Features

•      Syntax. Similar to C and to Unix utilities such as sed, awk and the shell.

•      No limits on size of data. Unlike many utilities, Perl does not arbitrarily limit the size of your data — if you've got the memory, Perl can slurp in your whole file as a single string
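As a minimal sketch of the slurping idea, the following reads a whole file into one scalar by temporarily undefining Perl's input record separator, $/. The sample text and variable names are hypothetical, and an in-memory filehandle stands in for a real file so the example runs on its own. It is written with use strict and my, which the slides do not cover.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of a real file.
my $sample = "line one\nline two\nline three\n";

# Opening a reference to a scalar gives an in-memory filehandle.
open( my $fh, '<', \$sample ) or die "Cannot open in-memory file: $!";

my $contents;
{
    local $/;             # no record separator inside this block
    $contents = <$fh>;    # so a single read returns everything
}
close($fh);

# The whole file is now one string; split it only if lines are needed.
my @lines = split( /\n/, $contents );
print "Read ", length($contents), " characters on ", scalar(@lines), " lines\n";
```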

Features

•      Perl makes it easy for programmers. You don’t have to worry about heavy-duty things like allocating memory, passing pointers and handles back to subroutines, data typing and so forth

Features

•      Does most everything. Perl programs range from simple one-liners to full-fledged Web servers written entirely in Perl.

Features

•      Associative arrays. Arrays that are keyed on strings, rather than numbers.

•      Regular expressions. One of Perl’s most powerful features, regular expressions are like find-and-replace on steroids

Regular Expression Examples

•      Strip all HTML tags from a document

 

s/<.+?>//gs
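A small sketch of that one-liner in use, assuming the document is already held in a single string. The sample markup and the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample markup; the <a> tag is split across two lines.
my $html = "<p>Hello, <b>World</b>!</p>\n<a\n href=\"index.html\">Home</a>\n";

my $text = $html;
$text =~ s/<.+?>//gs;   # g: remove every tag, s: let . match newlines too

print $text;
```

This is a quick filter rather than an HTML parser: a > inside an attribute value or a comment ends the match early.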

Regular Expression Examples

•      Find e-mail addresses in message headers

 

while (<>) {

  if (m/^From\s+(\S+@\S+)\s+/) {

    $address = $1;

  }

}

Perl and the Web

•      Includes functions for translating input from HTML forms to Perl variables and objects

Cooperation with Other Programs

•      It's "glue" for connecting applications that normally would not talk to one another.

•      On Unix-based systems, where command-line programs proliferate, Perl makes it easy to turn other programs’ output into dynamic Web pages

Searchable Data
Without Data Bases

•      For small data sets -- several thousand records or fewer -- you can put searchable data online without investing in SQL Server or other heavy-duty databases

Searchable Data
Without Data Bases

•      Since Perl's origins are in text processing, it has simple functions for extracting data from text files

Searchable Data
Without Data Bases

•      Combine Perl’s handling of text files, and its regular expressions, and you can take a tab-delimited text file and make it searchable on the Web in very little time
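One way this could look, as a sketch under stated assumptions: the records, the three-column layout and the $query value are all hypothetical, and an in-memory filehandle stands in for the text file. In a real CGI script the query would come from a form field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical tab-delimited records: last name, first name, year.
my $records = "Washington\tGeorge\t1789\n"
            . "Adams\tJohn\t1797\n"
            . "Jefferson\tThomas\t1801\n";

my $query = "jeff";    # stand-in for text typed into a search form

open( my $fh, '<', \$records ) or die "Cannot open records: $!";

my @hits;
while ( my $line = <$fh> ) {
    chomp $line;
    my ( $last, $first, $year ) = split( /\t/, $line );

    # \Q...\E treats the query as literal text; i ignores case
    if ( $last =~ /\Q$query\E/i ) {
        push( @hits, "$first $last ($year)" );
    }
}
close($fh);

print "$_\n" foreach @hits;
```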

SQL Data Bases

•      If you are using heavy-duty databases, free modules are available that let you interface with common ones, including Oracle, Sybase, Informix and SQL Server

ODBC Drivers

•      For Windows versions of Perl, an ODBC module is available

•      The syntax is very similar to what you would use in an Active Server Page with VBScript

Limitations

•      It’s easy for large programs to get unwieldy

•      There’s more than one way to shoot yourself in the foot, especially with CGI scripts

•      To newcomers, Perl can look quite cryptic

•      Quite a bit of overhead for high-traffic sites

Hello World

#!/usr/bin/perl

 

print "Hello World!\n";

 

Syntax

•      Case-sensitive

•      Every statement ends with a ;

•      White space doesn’t matter

•      Comments are marked with #

 

Scalar Variables

Scalars hold a single value. There is no differentiation between strings and numbers. All scalar variable names start with $.

 

$var = "Matt";

$var = 42;
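A short sketch of what "no differentiation" means in practice: the operator, not the variable, decides whether a value is treated as a number or as text. The variable names are illustrative, and use strict with my is an addition the slides do not use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = "42";               # stored as a string
my $sum   = $count + 8;         # + treats it as a number
my $label = $sum . " items";    # . treats the number as a string

print "$label\n";
```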

 

Scalar Variables

#!/usr/bin/perl

 

$name = "World";

print "Hello $name\n";

 

Arrays

Arrays hold a list of values. All array names start with @.

 

@array = ("Matt", "Jane", "Carl");

 

Arrays

When you are referring to a single element of the array, use $ before the variable name, since the individual element is a scalar. Subscripts for an array start at 0, as in C. Square brackets are used for the subscript.

$array[2] = "Carl";

print $array[2];
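A few related array operations, sketched with the same three names. The fourth name is hypothetical, and scalar, $#array, push and negative subscripts go slightly beyond what the slide covers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @array = ( "Matt", "Jane", "Carl" );

my $count = scalar(@array);     # number of elements
my $last  = $array[$#array];    # $#array is the index of the last element

push( @array, "Anne" );         # add a hypothetical fourth name at the end

print "$count names at first, the last was $last\n";
print "Now the last is $array[-1]\n";   # negative subscripts count from the end
```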

 

Arrays

#!/usr/bin/perl

 

@name = ( "Matt", "Jane", "Carl");

print "Hello $name[0]\n";

 

Foreach

•      foreach is used to loop through arrays, one element at a time.

 

Foreach

#!/usr/bin/perl

 

@name = ( "Matt", "Jane", "Carl");

foreach $person ( @name ) {

print "Hello $person\n";

}

 

Associative Arrays

Associative arrays are arrays that are keyed on strings, rather than numbers. Associative arrays start with %.

 

%color = ( "apple"  => "red",

           "grape"  => "purple",

           "banana" => "yellow" );

 

Associative Arrays

So, the following code:

print $color{"banana"};

prints

yellow

 

Curly braces are used for the subscripts.
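Building on the %color example, here is a sketch of adding an entry, looping over the keys and testing whether a key is present. The cherry and kiwi entries are made up, and the keys and exists functions are not introduced on the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %color = ( "apple"  => "red",
              "grape"  => "purple",
              "banana" => "yellow" );

$color{"cherry"} = "red";    # assigning to a new key adds an entry

# keys returns the keys in no particular order, so sort them for display
my @lines;
foreach my $fruit ( sort keys %color ) {
    push( @lines, "$fruit is $color{$fruit}" );
}
print "$_\n" foreach @lines;

# exists tests for a key without creating it
my $has_kiwi = exists $color{"kiwi"} ? "yes" : "no";
print "Is kiwi listed? $has_kiwi\n";
```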

 

Associative Arrays

#!/usr/bin/perl

 

%name = (  "self"    => "Matt",

           "sister"  => "Jane",

           "brother" => "Carl" );

print "Hello $name{'brother'}\n";

 

Files and Filehandles

You read and write data from files, and from the keyboard or screen, by specifying which file handle you want to read from or write to.

The standard file handle for reading from the keyboard is STDIN.

The standard file handle for writing to the screen is STDOUT.

 

Files and Filehandles

#!/usr/bin/perl

 

print "Enter your name: ";

$name = <STDIN>;

chomp $name;

print "Hello $name\n";

 

Files and Filehandles

In the previous example, STDIN is the file handle. The angle brackets around it, < and >, are the line-input operator (with nothing between them, <>, they are often called the diamond operator). The operator tells Perl to read a line of input from the file handle inside it.

By associating a file handle with a text file, we can read data in from a file.

 

Files and Filehandles

#!/usr/bin/perl

 

open ( PREZ, "prez.txt" );

$name = <PREZ>;

chomp $name;

print "Hello $name\n";

 

Files and Filehandles

We can also use the open command to write to a text file. The syntax is slightly different -- we add a > before the file name to tell Perl to open the file for writing.

open ( PREZ, ">prez.txt" );
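A sketch of writing with a basic error check, since open fails quietly unless its return value is tested. Two things here go beyond the slide: or die, which stops the script with the system error in $!, and >>, which appends instead of overwriting. A scratch file from the standard File::Temp module is used so that no real prez.txt is touched.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A scratch file, so the example does not overwrite anything real.
my ( $tmp, $path ) = tempfile();
close($tmp);

# > creates the file, or empties it if it already exists
open( PREZ, ">$path" ) or die "Cannot write to $path: $!";
print PREZ "Washington\n";
close(PREZ);

# >> keeps the existing contents and adds to the end
open( PREZ, ">>$path" ) or die "Cannot append to $path: $!";
print PREZ "Adams\n";
close(PREZ);

# Read the file back to confirm both lines are there
open( PREZ, $path ) or die "Cannot read $path: $!";
my @lines = <PREZ>;
close(PREZ);
unlink($path);

print @lines;
```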

 

Files and Filehandles

#!/usr/bin/perl

 

open ( PREZ, "prez.txt" );

open ( PREZOUT, ">prezout.txt" );

$name = <PREZ>;

chomp $name;

print PREZOUT "Hello $name\n";

 

While

•      while works like the while loop in C: the condition is tested before each pass, and the body executes as long as the condition is true

 

while ( $name = <PREZ> ) {

chomp $name;

print "Hello $name\n";

}
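A small sketch of how the condition controls the loop, using a counter instead of a file. The condition is checked before every pass, so the body may run zero times. Variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = 3;
my @seen;
while ( $count > 0 ) {
    push( @seen, $count );
    $count--;
}
print "Counted down: @seen\n";

# $count is now 0, so the condition is already false at the first check
my $ran = "no";
while ( $count > 0 ) {
    $ran = "yes";
    $count--;
}
print "Second loop ran: $ran\n";
```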

 

While

#!/usr/bin/perl

 

open ( PREZ, "prez.txt" );

while ( $name = <PREZ> ) {

chomp $name;

print "Hello $name\n";

}

 

If-then

if ( $a == 1 ) {
print "Hello, World\n";

}

If-then-elsif-else

if ( $a == 1 ) {
print "Hello, World";

} elsif ( $a == 2 ) {

print "Hello";

} else {

print "Hi";

}

Numeric Comparisons

== Equal (not =, as you might think)

!= Not equal

<  Less than

<= Less than or equal to

>  Greater than

>= Greater than or equal to

String Comparisons

eq Equal

ne Not equal
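Because scalars do not distinguish strings from numbers, the choice of operator matters. A brief sketch with made-up values; lt, the string "less than" operator, is not listed on the slide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = "10";
my $y = "10.0";

my $numeric_equal = ( $x == $y ) ? "yes" : "no";    # compared as numbers
my $string_equal  = ( $x eq $y ) ? "yes" : "no";    # compared as text
my $alphabetical  = ( "apple" lt "banana" ) ? "yes" : "no";

print "== says $numeric_equal, eq says $string_equal\n";
print "apple sorts before banana: $alphabetical\n";
```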

 

Regular Expressions

•      You can use a regular expression to find patterns in strings: for example, to look for a specific name in a phone list or all of the names that start with the letter a. Pattern matching is one of Perl's most powerful and probably least understood features

 

Regular Expressions

•      Regular expressions use the
$string =~ /pattern/
notation for matching strings

•      If $string matches the pattern, the expression returns true

•      There are elaborate rules for forming regular expressions -- consult your cheat sheet here
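As a sketch of the notation, this picks out the names that start with the letter a. The list of names is hypothetical; ^ anchors the pattern to the start of the string and the trailing i ignores case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical list of names to search.
my @names = ( "Adams", "Jefferson", "Arthur", "Madison" );

my @a_names;
foreach my $name (@names) {
    if ( $name =~ /^a/i ) {      # true when the name begins with a or A
        push( @a_names, $name );
    }
}

print "Names starting with a: @a_names\n";
```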

 

Matching

#!/usr/bin/perl

 

open ( PREZ, "prez.txt" );

while ( $name = <PREZ> ) {

  chomp $name;

  if ( $name =~ /Jefferson/ ) {

    print "$name\n";

  }

}

Case-Insensitive Matching

#!/usr/bin/perl

 

open ( PREZ, "prez.txt" );

while ( $name = <PREZ> ) {

  chomp $name;

  if ( $name =~ /Jefferson/i ) {

    print "$name\n";

  }

}

 

Advanced Matching

#!/usr/bin/perl

open ( PREZ, "prez.txt" );

while ( $name = <PREZ> ) {

   # count only lines that matched, so a stale $1 is never reused
   if ( $name =~ /(\w+)/ ) {

     $firstname{$1}++;

   }

}

foreach $n ( keys %firstname ) {

print "$n: $firstname{$n}\n";

}

 

Search and Replace

•      Search-and-replace regular expressions use the
$string =~ s/pattern/replacement/
notation for replacing text in strings.

•      If a replacement is made on $string, it returns a value of true (the number of replacements made)

 

Search and Replace

#!/usr/bin/perl

 

open ( PREZ, "prez.txt" );

while ( $name = <PREZ> ) {

  chomp $name;

  if ( $name =~ s/George/Fred/ ) {

    print "$name\n";

  }

}

 

Search and Replace

•      As with matching, adding an i after the last / does case-insensitive searching

•      The default behavior for s/// is to replace only the first occurrence in the string. To replace every occurrence, add a g after the last /
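The effect of the two modifiers can be seen side by side in this sketch. The sample line and variable names are made up; each substitution works on its own copy so the results can be compared.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "George, george and George";

my $once = $line;
$once =~ s/George/Fred/;          # first match only

my $all = $line;
$all =~ s/George/Fred/g;          # every exact-case match

my $any_case = $line;
my $changed  = ( $any_case =~ s/George/Fred/gi );   # every match, any case

print "$once\n$all\n$any_case\n";
print "Replacements in the last one: $changed\n";
```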

 

Advanced Search-and-Replace

#!/usr/bin/perl

 

open ( PREZ, "prez.txt" );

while ( $name = <PREZ> ) {

  chomp $name;

  if ( $name =~ s/(.+) (.+)/$2, $1/ ) {

    print "$name\n";

  }

}

Splitting

•      split breaks a string into a list of fields, which is usually stored in an array

•      You can choose the pattern it splits the line on
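Two sketches of choosing the pattern, using made-up input. The first splits on commas and swallows any spaces around them; the second uses the special pattern ' ', which splits on runs of white space and ignores leading blanks.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical comma-separated line with untidy spacing.
my $line   = "1998, 163.0 ,Annual";
my @fields = split( /\s*,\s*/, $line );

# Hypothetical free text with extra blanks.
my $words  = "  the quick   brown fox ";
my @tokens = split( ' ', $words );

print join( "|", @fields ), "\n";
print join( "|", @tokens ), "\n";
```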

 

Splitting

#!/usr/bin/perl

 

open ( CPI, "cpi.txt" );

while ( $line = <CPI> ) {

  chomp $line;

  ( $year, $cpi ) = split ( /\t/, $line );

  print "The CPI in $year was $cpi\n";

}