Learning Perl

Most programming tutorials start out with a simple program that prints "Hello World" to the screen. This introduces the student to about the least complex program that is still complete and functional. This document is meant primarily for people who will be using perl in a UNIX environment, and all of the scripts assume that the perl executable is located at /usr/bin/perl. Here is my version of this script:

#!/usr/bin/perl

#hello.pl

print "Howdy World\n";

The first line tells the UNIX system what program should be used to interpret the data that follows. Anybody who is familiar with writing shell scripts should be used to seeing #!/bin/sh or some other shell at the beginning of a script. The second line is a comment, which in this case is the name of the file that the script is stored in. I like to do this so that if you print out the script, you know the name of the file the printout came from. The third line contains the "guts" that do the work. The print command is used to display output, and in its simplest form, displays text to the screen. The format of the rest of the line is text inside of quotes, followed by a semicolon. The \n at the end of the text is a "newline" character (often loosely called a "carriage return"), which means "move to the next line." Without it, anything printed to the screen after this command would appear on the same line. Try running this script without the \n and see if you notice the difference.

(Remember that in the UNIX world you need to make a file executable with the chmod command before you can run it: chmod 755 hello.pl.) One other thing that should be noted is that the print line (like most lines that contain commands in a perl script) ends with a semicolon (;).
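To make the effect of the \n easier to see, here is a small sketch that is not part of hello.pl. The file name newline.pl and the subroutine two_prints are names I made up for illustration, and the script uses a subroutine and "my" variables, which have not been covered yet. The subroutine just builds the text that two print statements in a row would produce, with or without the \n.

```perl
#!/usr/bin/perl
# newline.pl (hypothetical file name, for illustration only)
use strict;
use warnings;

# Build the text that two print statements in a row would produce,
# ending each message with whatever is passed in.
sub two_prints {
    my ($ending) = @_;
    return "Howdy World" . $ending . "Howdy again" . $ending;
}

print two_prints("");      # no \n: both messages run together on one line
print "\n";                # move to the next line before the second test
print two_prints("\n");    # with \n: each message gets its own line
```

Run it and compare the first line of output with the two lines that follow.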

Most scripts need to do something more complex than just displaying static text, and the "work" usually involves retrieving information from a file. As an example, here is a script that gets the password entry for a user named cmrice.

#!/usr/bin/perl

#ex1.pl

$result = `grep cmrice /etc/passwd`;

print "$result\n";

Again the first line tells the system what to use to interpret the contents of the file, and the second line is a comment made up of the name of the file. The third line will look familiar to shell programmers; it runs an external command and stores the results in a variable. $result is the name of a variable, and a variable is just a place to store things that you are going to use later. In perl, the name of a variable like this one starts with a dollar sign ($). The command grep cmrice /etc/passwd is the same command that could have been typed on the command line. This command is surrounded by "back quotes" (`), which tell perl to treat the text inside them as a command, and to run that command. The text that would normally be displayed on the screen is stored in the variable $result.
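Here is a minimal sketch of the same back-quote idea, using echo instead of grep so that it does not depend on what is in your password file. The subroutine name run_command is my own invention, and chomp (which removes a trailing newline) has not been introduced yet; treat this as an illustration rather than the one right way to do it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run an external command and return its output without the final newline.
# run_command is a made-up helper name, not a perl built-in.
sub run_command {
    my ($command) = @_;
    my $output = `$command`;   # back quotes run the command and capture its output
    chomp $output;             # strip the trailing newline, if there is one
    return $output;
}

my $result = run_command("echo Howdy World");
print "$result\n";
```

Because the text in the back quotes is handed to the shell, only use this with commands you trust.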

The fourth line of this script uses the same print statement as the "Hello World" script, but instead of displaying "static" text (text that never changes) it displays the contents of the variable $result, which can vary, depending on what changes have been made to the password file.

This style of script is typically written by somebody who is familiar with shell scripting, but new to perl. In this case, they are writing a perl script that "looks" like a shell script. A person with a background in C programming would probably have done something different. This script is not "wrong" -- it does what we asked it to do, but this script is inefficient because it calls an external program, where it could have performed the same function completely within perl.

There are many ways to write this script, and the example that follows uses a number of features of perl that must be understood by every perl programmer. The first topic to be introduced is that of files: how to open them, how to read from them, and how to close them. Before a file can be read, it must be opened, and when it is no longer needed it should be closed.

#!/usr/bin/perl

#ex2.pl

open (IN,"/etc/passwd");

while ($line = <IN>)

{

    $result = grep(/cmrice/,$line);

    if ($result == 1)

    {

        print "$line\n";

    }

}

close IN;

By now you should understand the first and second lines. The third line is where the file is opened. Opening a file requires a "file handle," which is just the name you are going to use in your script to refer to the file, and a "filename", which is the true location of the file. The next line is the start of a "while loop." The basic meaning of a "while loop" is "while something is true, do the stuff inside the loop." In this case, the "something" that is true is $line=<IN>. What this line means is "take the next line from the input file (IN) and store it in the variable $line." If there are no more lines in the file, then this is false, and the commands inside the while loop will not be executed.

The commands within the while loop are surrounded by "curly brackets" ({ and }). This helps distinguish which commands should be run within the while loop, and they are further distinguished by being indented. The first line within the while loop uses perl’s built-in grep function to look at each line individually. Used this way, on a single line, grep returns the number of matches: 0 (zero) if the pattern did not match, or 1 (one) if there was a match. This value is stored in $result. The next line is an "if statement," which is used to check whether something is true or not. In this script, we are checking to see if the value of $result is 1 (one). If this is true, then perl executes the commands that are within the "curly brackets" that follow the if. (If it is false, it skips these commands.) There is only one command in this script that will be executed if this is true, and it is a simple print statement, just like the one in the last example. The two lines following the print statement are the "curly brackets" that "close" the pairs surrounding the commands for the if and while statements. Note that they are at the same level of indentation as the bracket that "opens" their section of code, but at different levels of indentation than each other. This helps people who are reading the code determine which pairs belong together.
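The grep-and-if test described above can be tried on its own, without opening a file. In this sketch the subroutine names line_matches and line_matches_directly are mine, and the password entry is made up. The second subroutine shows the match operator (=~), which is another common way to ask the same question; it is offered as an alternative, not as a correction.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if the line contains the pattern, 0 if it does not.
sub line_matches {
    my ($pattern, $line) = @_;
    my $result = grep(/$pattern/, $line);   # counts matches in a one-item list
    return $result;
}

# The same question asked with the match operator instead of grep.
sub line_matches_directly {
    my ($pattern, $line) = @_;
    return ($line =~ /$pattern/) ? 1 : 0;
}

# A made-up password entry, used only for this demonstration.
my $line = "cmrice:x:1001:1001:Example User:/home/cmrice:/bin/sh\n";

if (line_matches("cmrice", $line) == 1) {
    print $line;   # $line already ends with a newline
}
```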

The last line of this script is where the file that was opened at the beginning is closed. Although the operating system should automatically close this file when the script ends, there are situations where it does not, which can cause problems. If you always remember to explicitly close any file that you have opened, you should never have a problem (at least not one related to open filehandles).
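The three file steps (open, read, close) can also be written so that the script stops with a message if the file cannot be opened. The sketch below is a different style from ex2.pl: it uses the three-argument form of open, a "my" variable as the file handle instead of IN, and "or die" to report a failure. The subroutine name read_lines is my own, and the script only touches /etc/passwd if that file is readable on your system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every line of a file as a list.
# Stops the script with the reason if the file cannot be opened.
sub read_lines {
    my ($filename) = @_;
    open(my $in, '<', $filename) or die "Cannot open $filename: $!";
    my @lines;
    while (my $line = <$in>) {   # false once there are no more lines
        push @lines, $line;
    }
    close($in);                  # explicitly close what was opened
    return @lines;
}

if (-r "/etc/passwd") {
    my @lines = read_lines("/etc/passwd");
    print "Read ", scalar(@lines), " lines\n";
}
```

The $! in the die message holds the system's explanation of what went wrong, such as a missing file or a permissions problem.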
