
INT420 Notes

To log in to your Zenit account, SSH to it (for example, ssh zenit.senecac.on.ca) and enter
your Zenit account information.
https://scs.senecac.on.ca/~int420/
The course focuses on 3 main things:
Apache
CGI scripts using Perl
HTTP
CGI = Common Gateway Interface. It is, in simplest terms, programming for the web.
A CGI can be written in any language, but we focus on Perl. Perl is a scripting language,
while HTML is a markup language for describing web pages.
CGIs are what make websites like Amazon and Yahoo run: you can use them to process
forms the user fills out, or even to run a simple counter that shows how many visitors the
website has had.
Perl is an interpreted language, meaning that you don't have to compile it. You just write
it and run it. The disadvantage is that most bugs in a script are not found until you run
it.

First week lecture (Refer to the .ppt for a visual explanation)


HTTP - Hypertext Transfer Protocol
Client ------------------------------------------- Server
The client computer sends an HTTP request packet to the server. The packet contains an
HTTP request header (which includes a METHOD) and potentially an HTTP request body. The
header and the body are separated by a blank line. The request says which resource was
requested (the resource URL) and which method was used to request it.
The default method is GET. When you enter a URL into the browser or click on a link in a
page, that's GET. When you complete a form in the browser and submit it to the server by
clicking Submit, it can be GET or POST. The HEAD method is used for testing and
diagnostics: the server sends back only the HTTP response header, without a body. With
GET, form data is sent as part of the request header (in the query string of the URL).
With POST, form data is sent as part of the request body.
httpd (HTTP daemon) is the Apache HTTP process. By default it listens on TCP port 80.

Child processes are responsible for sending back a response. A child process gets the
resource from disk and creates the HTTP response headers. If all child processes are busy,
the daemon will create a new one. The body of the response packet will contain one of four
things: the requested resource (if the request was successful), an error document (if it
was not successful), the output of a script, or a directory index (listing).
The client sends a request to the HTTP server on port 80. The request goes to the first
available child process of the httpd parent process. The child reads the resource and then
sends an HTTP response.
When the server responds, it sends an HTTP response packet. The response headers contain
instructions for the browser: for example, the status (was the request successful or did
it fail), the character encoding used, and what the server OS is. The most important
piece of information is the status code.
The HTTP status codes covered here fall into 4 categories:
2xx - Success. The request was successful. Most common one: 200.
3xx - Redirection. Tells the client to get the resource from somewhere else. For example,
301, 302, 304.
4xx - Client-side error. Most common ones: 401 (authentication required), 403 (forbidden,
a permission problem) and 404 (not found, e.g. a broken link).
5xx - Server-side error. Most common one is 500, which means internal server error.
You'll most likely get this when a script fails to execute properly on the server. 503
means the server is overloaded, i.e. it is very busy.

HTTP is stateless: one request does not remember any information from a previous request.
It is a transaction-based protocol, not a connection-oriented one. Every request is
handled individually.
Headers carry various kinds of information with each request and response; things like
passwords, the HTTP version and the operating system appear here.
GET - Get the required resource
POST - Send form data to the server
HEAD - Return the header only
In an HTTP request packet, the body is only used with the POST method, but the header is
used with all of the methods: HEAD, GET and POST.
The request header can contain the resource, HTTP version, environment variables, cookies,
authorization, caching info and form data.
In an HTTP response packet, the header is not shown by your browser; the body is. The
header contains the status code, content type, content length, cookies, environment
variables and location. The body contains the requested file, an error page, the output of
a script, or a directory listing.
What separates the header and body is always a blank line.
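To make the header/body split concrete, here is a minimal Perl sketch that takes a response as one string and cuts it at the first blank line. The sample response text is made up for illustration; it is not captured from a real server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A hand-written sample response (hypothetical content, for illustration only).
my $response = "HTTP/1.1 200 OK\r\n"
             . "Content-Type: text/html\r\n"
             . "Content-Length: 13\r\n"
             . "\r\n"
             . "<p>Hello</p>\n";

# Split at the first blank line: everything before it is header, the rest is body.
my ($head, $body) = split /\r?\n\r?\n/, $response, 2;

# The first header line is the status line; the others are "Name: value" lines.
my ($status_line, @header_lines) = split /\r?\n/, $head;
my ($version, $code, $reason) = split ' ', $status_line, 3;

print "Status code: $code\n";
print "Header: $_\n" for @header_lines;
print "Body: $body";
```

The browser does the same kind of split: it acts on the status code and headers, and only displays the body.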

Second week lecture


Each Perl script must begin with #!/usr/bin/perl.
Every Perl statement must end with a semicolon (;).
Spacing is not important in Perl.
Perl scripts should end with the extension .pl
To run the script, make it executable (chmod +x) and then do ./nameofscript.pl
You have to make your script a CGI script in order to display it on the web, otherwise you
will get an internal server error.
The first thing the script outputs must be an HTTP response header, the Content-type line:
print "Content-type: text/html\n\n";
This header tells the browser that what follows is text that should be interpreted as
HTML. The \n\n ends the header line and adds the blank line that separates the header from
the body.
In a CGI environment, standard output goes to the browser, while in a terminal environment
such as BASH it goes to the terminal. If you run the CGI script in the terminal, it
displays the HTML code, but if you request it from a browser, the browser interprets the
output as HTML.
Furthermore, if you make the content type plain instead of html, like this:
print "Content-type: text/plain\n\n"; then the browser will not interpret the HTML and
will just display the HTML code.
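A minimal sketch of a complete CGI script, following the rules above. The sub name hello_page and the page text are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the whole response as one string: header line, blank line, then body.
sub hello_page {
    my $header = "Content-type: text/html\n\n";   # \n\n = end of header + blank line
    my $body   = "<html><body><h1>Hello from CGI</h1></body></html>\n";
    return $header . $body;
}

# Under CGI, standard output is sent to the browser.
print hello_page();
```

Run from the terminal, this prints the header and the raw HTML; requested through Apache, the browser shows only the rendered heading.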
3 data types in Perl:
Scalars
Arrays
Hashes
Scalar - Holds a single piece of data
Names start with $. An example of declaring a scalar variable would be $age=45; or
$name="Bob";
Arrays - Hold a list of values
Names of arrays start with @. An example of declaring an array would be
@months=("jan","feb","mar");
@months = qw/January February March/; works the same way: qw adds the quotes for you.
The parentheses are required. @months="jan","feb","mar"; would store only "jan", because
the assignment happens before the commas are applied.
The values in an array are called elements. When we load an array, the first element
is at index 0. Index 1 in our example array would be "feb".
To call out an element, we would do print $months[1]; notice how we have to use $
instead of @ now. The $ sign is used when we are calling out only one element. If we want
to call out multiple elements, we have to use @. For example,
print @months[1,2];
Hashes - Hold pairs of data names (keys) and data values (values)
Names start with %. Example of declaring a hash: %aboutme=("name","Brian","age",45);
You can have as many pairs as you want, but they have to be in the order key, value. In
our example, name is a key and Brian is its value; age is a key and 45 is its value.
print $aboutme{"age"}; would look for the key age and print its value, which is 45.
With $ you call out one key at a time. To get several values at once, use @ as with
arrays: @aboutme{"name","age"}
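The three data types side by side, as a short runnable sketch. It reuses the example values from these notes and adds use strict, so every variable is declared with my.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name    = "Bob";                          # scalar: one value
my @months  = ("jan", "feb", "mar");          # array: ordered list, index starts at 0
my %aboutme = ("name", "Brian", "age", 45);   # hash: key, value, key, value

print "$name\n";            # Bob
print "$months[1]\n";       # feb  ($ because we want a single element)
print "@months[1,2]\n";     # feb mar  (@ because we want several elements)
print "$aboutme{age}\n";    # 45

# A hash slice returns the values for several keys at once.
my @both = @aboutme{"name", "age"};
print "@both\n";            # Brian 45
```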

-w, when put after the shebang line (#!/usr/bin/perl -w), is an option that means display
any warnings.
To obtain input from the user, we would do: $you = <STDIN>;
The function chomp removes the newline character from the end of a piece of data.
We could use it after gathering input from the user like this: chomp($you);
To comment out something in a Perl script, just use the # symbol.
Any calculation put within quotes is printed literally. For example, print "5+3"; prints
5+3, not 8. Calculations outside the quotes are evaluated, so print 5+3; prints 8.
The increment operator is, for example: $a++; This increments the variable $a by 1.
The function localtime returns the system date/time stamp.
To call out each element in an array, we could use a foreach loop.
This is the syntax: foreach (@colours)
{
print "$_\n";
}
$_ is the default variable; it is used if you don't specify a variable in
your foreach statement.
Another example of a foreach loop would be: foreach $i (@colors) { print "$i\n"; }
The function pop removes the last element from an array and returns it, so you can assign
it to a variable. For example, $lastelement=pop(@colors); You could use pop without
assigning to a variable, but it would just remove the element and nothing else.
The function shift removes the first element and returns it. It is the counterpart of pop
at the front of the array. Example: $firstelement=shift(@colors);
The function push adds elements to the end of an array. For example:
@morecolours = ("purple","teal","red");
push (@colors,@morecolours);
The function delete deletes certain elements from an array or keys from a hash.

You can do this by typing delete($colours[2]); where colours is the name of the array
and 2 is the index of the element. On an array this empties the slot without moving the
other elements.
For a hash, write the name of the key like this: delete($colors{"foo"});
http://perl.about.com/od/programmingperl/qt/perldelete.htm
http://perldoc.perl.org/functions/delete.html

The function sort does NOT change the contents of an array; all it does is return the
contents of the array in sorted order. To keep the sorted result, assign it:
@sortedlist=sort(@colors);
To find the index of the last element in an array, you would do:
print "the last element # in the array is ", $#colours, "\n";
To find the actual length of an array (how many elements are in it), use the scalar
function like this: scalar(@colors);

The function reverse returns the elements in reversed order. Example:
reverse(@colors)

The join function joins the elements of an array into a single string, with a separator
you specify. For example, join(", ",@colors) would return "black, magenta, cyan, blue,
green", and
join ("--",@colors); would do the same but put -- instead of a comma and space.
join ("--",keys %courses) joins the keys of a hash the same way.
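The array functions above, run in sequence on one small list so the effect of each call is visible. The colour values are just sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @colors = ("black", "magenta", "cyan");

my $last  = pop(@colors);          # removes "cyan"  -> black magenta
my $first = shift(@colors);        # removes "black" -> magenta
push(@colors, "blue", "green");    # adds to the end -> magenta blue green

my @sorted     = sort(@colors);    # blue green magenta (@colors itself is unchanged)
my @reversed   = reverse(@colors); # green blue magenta
my $count      = scalar(@colors);  # 3 elements
my $last_index = $#colors;         # index of the last element: 2

print join(", ", @sorted), "\n";   # blue, green, magenta
print join("--", @reversed), "\n"; # green--blue--magenta
print "$count elements, last index $last_index\n";
```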
You can do custom quoting in Perl, which helps when you're trying to print a line that
has double quotes in it but you don't want them to get mixed up with the double quotes of
the print statement. You can use the qq function for this, like this:
print qq~blablablablablafd~;
This makes the tilde symbol ~ act as the double quote, so double quotes inside the text
are no longer a problem. Inside the text you can use anything except the tilde.
The qw function quotes each word when you're declaring an array, so you don't have to
manually put quotes around each word and separate them with commas. You can just do
@months = qw(firstelement secondelement thirdelement);
or @months = qw/firstelement secondelement thirdelement/;
When you're using a range in Perl, such as (1..9), it has to be ascending; a descending
range such as (9..1) gives an empty list.

Perl syntax is flexible and you can do something in more than one way. For example,
when making hashes, you can either do %courses = ("int420", "internet II", "ndd430",
"network diagnostics"); OR %courses = ("int420" => "internet II", "ndd430" => "network
diagnostics");

To loop through a hash, do
foreach $key ("fred", "beth", "john")
{ print $key, "'s page: $pages{$key}\n"; }
To add a value to a key in a hash, or create a key with a value, do
$pages{"steve"} = "http://www.steve.com";
To check if a key exists: print "\n\n", (exists $pages{"beth"}), "\n\n"; exists returns
true if the key exists. Any non-zero result is true.
You can also use an if statement like: if (!exists $ingredients{"toast"})
{ print "toast doesn't exist\n"; }
foreach $item (keys %ingredients) { print $item, " ", $ingredients{$item}, "\n"; }
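A short sketch that puts the hash operations together: adding a key, looping over keys, testing with exists, and deleting. The fred and beth URLs are invented sample values; only the steve entry comes from these notes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data (the fred and beth URLs are hypothetical).
my %pages = (
    "fred" => "http://www.fred.com",
    "beth" => "http://www.beth.com",
);

# Create a new key with a value.
$pages{"steve"} = "http://www.steve.com";

# Loop over the keys; sort gives a predictable order.
foreach my $key (sort keys %pages) {
    print $key, "'s page: $pages{$key}\n";
}

# exists tells us whether a key is present.
if (!exists $pages{"john"}) {
    print "john doesn't exist\n";
}

# delete removes a key and its value.
delete $pages{"fred"};
print scalar(keys %pages), " keys left\n";
```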
The directory you install Apache in is called the server root. The root of the operating
system is called the system root. Within the server root, you have a bunch of
subdirectories:
The conf directory is where our configuration files are. Inside it, the extra directory
contains additional configuration files for other services. There is also the original
directory, which contains the original configuration files from when you installed the
server; this acts as a sort of backup. Directives are the settings in the configuration
file that give the server crucial information, such as ServerRoot (the location of the
server root) and Listen (the port to listen on).
The ports.txt file in the home directory of your Zenit account tells you which ports you
have been assigned. This is different for everyone, and each person in the class should
use a unique port. Change the Listen directive to your assigned port.
The bin directory contains scripts and binary files, including the commands to start and
stop the Apache server: bin/apachectl stop, bin/apachectl start, bin/apachectl restart.
EVERY TIME we change the configuration files, we need to restart Apache.
The htdocs directory is the root of your website. This is referred to as the Document
Root. It is where our HTML files go. Inside htdocs, there is a default index.html
file.
The cgi-bin directory contains our CGI scripts.
The logs directory contains 3 files: httpd.pid, error_log and access_log. httpd.pid is
only present while Apache is running; it contains the process ID of the parent process.
If you stop Apache, the file is deleted. error_log contains errors from any
Apache-related problems, such as an unexpected shutdown. access_log contains a record of
every request that was sent and its result.

The error directory contains HTML error documents for HTTP errors such as 404
and 501.
The manual directory contains all Apache documentation in HTML format.

To see if you can access your own Apache server, type zenit.senecac.on.ca:8012 in
your browser, but change the port to your own.
httpd.conf contains comments and directives. ServerRoot tells where Apache was installed.
Listen specifies the port number that the Apache server will listen on.
If you want multiple port numbers, you can add multiple Listen directives, for example
two lines, one being Listen 80 and another being Listen 8083.
The User directive tells you which user is running Apache, and the Group directive which
group.
The ServerAdmin directive is the email address of the server administrator.
The DocumentRoot directive identifies where the document root is.
Next comes a series of directory containers, which give access to various parts of the
file system over HTTP.
The first configured directory is the system root. By default nobody is allowed to use
HTTP to access the system root; never ever change that. 2 directives are used in this
container: Order and Allow/Deny. Order deny,allow (no space after the comma) tells us in
which order to apply the rules, deny rules first or allow rules first. By default, it
applies the deny rules first. So if you have a deny and an allow after it, it will deny
everyone first, then allow one user. If you change it to apply allows first, it will
allow that user first and then deny everything, so in the end no one will have
access.
You can have multiple directory containers.
We can make a new one for our document root. Its rules are Order allow,deny and Allow
from all. This makes everyone able to access the htdocs directory, which is the root of
our website. We want this.
The ErrorLog directive tells you where error logs are stored.
The LogFormat directive allows you to customize the format in which your access logs are
written.
The ScriptAlias directive tells us that if the user puts cgi-bin in the URL, Apache will
go to /home/baabbasi/apachebg/cgi-bin and execute the script, but not show its code. When
scripts are accessed via a ScriptAlias, users can only execute them, not read them. The
alias is for a whole directory, not a file, so once it is set up, users only have to
write the name of the script after the alias in the browser bar to execute it; the
system goes to the directory you specified in the ScriptAlias and grabs the
corresponding script.
You have to add another directory container for the cgi-bin directory because by default
access to it is denied.

To add an alias, add the line Alias /docs /home/brian.gray/newapache/manual to the
configuration file, which can be found at conf/httpd.conf
To restart the server, do bin/apachectl restart
ps x will show your server processes.
When making an alias, you have to open up the directory. Example:
ScriptAlias /cgis/ /home/baabbasi/apache/scripts/
<Directory /home/baabbasi/apache/scripts>
Order allow,deny
Allow from all
</Directory>
^ Here, we made an alias for the scripts directory and opened up that directory using the
directory container.
You can access a script called printenv, for example, in this directory like this:
zenit.senecac.on.ca:14157/cgis/printenv
By default, when we start Apache, it starts 1 parent process and 5 child processes. If
all child processes are busy, the parent process spawns a new one. Spawning takes time
and would slow down the user experience. Having child processes already running before
requests arrive is called prefork mode; we want the child processes running so we can
respond to the user faster.
Linux/Unix Apache servers run in prefork mode.
On other operating systems, such as Windows, this can be different.
In the prefork section of the configuration file conf/extra/httpd-mpm.conf, the
StartServers directive tells the server how many child processes to start with. By default
it's 5. MinSpareServers is how many spare (idle) child processes we want to keep
available. By default it's 5. Say MinSpareServers is 5 and 2 of our 5 child processes are
in use: only 3 are spare, so Apache starts 2 more to have 5 spare processes again.
MaxSpareServers limits how many unused processes are running. By default, at most 10
child processes may sit unused. MaxClients is how many child processes can be started in
total. By default it's 150, which means Apache can start 150 child processes to serve 150
requests at once.
MaxRequestsPerChild is the maximum number of requests that a child process can
serve. By default it's unlimited, which is marked by a 0. This means that the same child
process can be reused for different requests over and over again.
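Written out as configuration, the defaults described above would look roughly like this. This is a sketch assuming the Apache 2.2 directive names used in these notes and the mpm_prefork_module section name; check your own conf/extra/httpd-mpm.conf for the exact layout.

```apache
# conf/extra/httpd-mpm.conf, prefork section (values are the defaults listed above)
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>
```

As with any configuration change, Apache has to be restarted before new values take effect.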

Virtual hosting is when one server hosts two or more websites that are completely
separate. There are 3 ways to configure virtual hosting:

Using Host headers. We have a server running with one IP address on the network; in the
request, the client tells the server which host it's looking for. There's an actual
header (Host), beyond what's included in the rest of the URL, that tells the server which
host the client is looking for. For example, if you are hosting both McDonalds and Burger
King, your server will have the same IP for both. It is up to the Host header to
determine which of these the client will get. The server directs the client to
the document root of that specific host.
IP address based virtual host. In that scenario our server is configured with multiple
IPs via different interfaces. Depending on which IP (and so which interface) the user
connects to, the server decides which document root they will access.
Port based virtual host. Here, the server listens on multiple ports. Depending on the
port the client uses to connect, the server gives them access to the appropriate
document root. This is the one we use at Seneca, because we can't do the other 2 methods
due to Zenit restrictions.
Keep in mind that when you're doing virtual hosting, you should ideally create separate
directories and files for each host, such as a separate cgi-bin directory, document root,
error log and access log, so their information does not get mixed up.
To set up port-based virtual hosts, you have to include the supplemental configuration.
In conf/httpd.conf, uncomment the line that starts with Include and mentions virtual
hosts.
To configure the virtual hosts, edit the conf/extra/httpd-vhosts.conf file.
When we request a directory, Apache looks at the DirectoryIndex directive, looks for the
directory index file (index.html) and gives you that one.
If the directory does not have an index file, Apache checks Options Indexes: if the
Indexes option is allowed, it sends a listing of what's in that directory.
If the Indexes option is not set, it returns a 403 (Forbidden) error.

In order to allow a client to list a directory, add the line Options Indexes to the
directory container in the configuration file.
In conf/httpd.conf, the dir_module section can change the behaviour of Apache when a
client requests a directory.

You can customize error pages via the main configuration file conf/httpd.conf.
First test is based on Week 2 and 3 labs. DO THEM.
When adding aliases in the httpd.conf file, ScriptAlias allows the scripts to be
executed.
When you put Alias instead of ScriptAlias, it allows the scripts to be read by users.

Don't put anything other than Alias or ScriptAlias. NOTE: These are NOT the
alias names, but the alias types.
To set up port-based vhosts:
Comment out the line NameVirtualHost *:80 at the top of the conf/extra/httpd-vhosts.conf
file, then edit the lines below it. Change the document roots and make sure
you open up their directories below.
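A sketch of what a port-based setup could look like once those edits are made. The ports, the /home/youruser/... paths and the log file names are placeholders, and the "common" log format is assumed to be defined in httpd.conf as it is in the stock configuration. The syntax is Apache 2.2 style, matching the Order/Allow directives used in these notes.

```apache
# conf/httpd.conf: one Listen line per assigned port (port numbers are placeholders)
Listen 8012
Listen 8083

# conf/extra/httpd-vhosts.conf: one container per port, each with its own files
<VirtualHost *:8012>
    DocumentRoot "/home/youruser/apache/site1/htdocs"
    ErrorLog "logs/site1-error_log"
    CustomLog "logs/site1-access_log" common
</VirtualHost>

<VirtualHost *:8083>
    DocumentRoot "/home/youruser/apache/site2/htdocs"
    ErrorLog "logs/site2-error_log"
    CustomLog "logs/site2-access_log" common
</VirtualHost>

# Open up each document root so clients are allowed to read it
<Directory "/home/youruser/apache/site1/htdocs">
    Order allow,deny
    Allow from all
</Directory>

<Directory "/home/youruser/apache/site2/htdocs">
    Order allow,deny
    Allow from all
</Directory>
```

Giving each host its own document root and log files keeps their information from getting mixed together.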
When you request zenit.senecac.on.ca/docs, that means you want a file named
docs. If docs doesn't exist as a file, the server will look for a directory named docs.
If you request zenit.senecac.on.ca/docs/ then it looks for a directory named docs
straight away.
Perl stores environment variables in the %ENV hash. For example, if you want to see the
user's browser, print $ENV{"HTTP_USER_AGENT"}.
$ENV{"QUERY_STRING"} contains the element names and values that the user entered when
filling out a form. This is the data that is displayed in the address bar of the browser
after a question mark. For example,
www.blabla.com/about.me.cgi?person=Brian&sport=Basketball
This is only useful when data is sent via the GET method. We have to use another
approach when the data is sent via the POST method.
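As an illustration of what can be done with the query string, here is a minimal sketch that splits it into a hash of names and values. The sub name parse_query is made up, and the fallback string is the example from these notes so the script also runs from a terminal, where QUERY_STRING is not set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn "name=value&name=value" into a hash of decoded names and values.
sub parse_query {
    my ($qstring) = @_;
    my %form;
    foreach my $pair (split /&/, $qstring) {
        next unless length $pair;                  # skip empty pieces
        my ($name, $value) = split /=/, $pair, 2;
        $value = "" unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                               # + stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX is a hex-encoded character
        }
        $form{$name} = $value;
    }
    return %form;
}

# Under CGI the data comes from the environment; otherwise use the sample string.
my $qstring = defined $ENV{QUERY_STRING}
            ? $ENV{QUERY_STRING}
            : "person=Brian&sport=Basketball";

my %form = parse_query($qstring);
print "$_ = $form{$_}\n" foreach sort keys %form;
```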
When you put use strict; at the beginning of your script, you have to declare your
variables (with my), and variable scope is enforced: a variable declared inside a routine
(aka function) is local, which means it is only visible inside that routine. If you don't
put use strict; this isn't enforced and undeclared variables are global.

Only open directories for certain virtual hosts in the virtual host containers.
To parse form data sent using POST:
sub parseform
{
read(STDIN, $qstring, $ENV{CONTENT_LENGTH});
# ... split $qstring into %form here
}

sub verifyform
{ $missing = 0;
foreach (keys %form)
{ if ($form{$_} eq "") { $missing = 1; } }
}

Last name: <input type="text" name="lname" value="$form{lname}">$errors{lname}


Use GET to get (display) the form and POST to process the form.
LogFormat "format string" name defines a log format and gives it a name.
To apply it: CustomLog logs/access name

If the user is using the GET method, it means they are viewing the page for the first
time.
If they're using POST, it means they've filled out the form.
So, use an if statement to check the request method: if it's GET, show the form; if it's
POST, process it.
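The GET-or-POST decision can be sketched as below. The sub name handle_request and the action labels are made up for illustration; REQUEST_METHOD and CONTENT_LENGTH are the standard CGI environment variables. The environment and the input handle are passed in as arguments only so the logic can be tried outside a web server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decide what to do from the request method and fetch the raw form data.
# $env is a hash reference such as \%ENV; $in is the handle the POST body is read from.
sub handle_request {
    my ($env, $in) = @_;
    my $method = $env->{REQUEST_METHOD} || "GET";

    if ($method eq "POST") {
        my $data = "";
        read($in, $data, $env->{CONTENT_LENGTH} || 0);   # body is name=value&name=value
        return ("process", $data);                       # the form was submitted
    }
    return ("show_form", "");                            # first visit: display the form
}

my ($action, $data) = handle_request(\%ENV, \*STDIN);
print "Content-type: text/plain\n\n";
print "action: $action\n";
```

Run from a terminal, where REQUEST_METHOD is not set, this takes the GET branch and reports show_form.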
Taint mode prevents data from external sources (such as form input) from being used in
system calls. To untaint variables you need to take a few steps:
Recreate the tainted variables in your script by first checking them against
regular expressions and, if you get a match, making a new variable that contains the
matched text. Then use that new variable, which has now been verified against a
regular expression, throughout the rest of your script. The system will now trust
this variable because it was built internally.
Make sure you put the absolute path (full location) of your system commands.
For example, instead of having the line `mail -s "message" $email <
message.txt`; , you would put `/usr/bin/mail -s "message" $email <
message.txt`;
Make sure you clear the PATH environment variable by doing the following:
$ENV{PATH}="";
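The regular-expression step could be sketched like this. The sub name untaint_email is made up, and the pattern is deliberately simple; it is an illustration of the technique, not a complete email validator. The sketch also runs without taint mode switched on, in which case it just acts as an input check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Clear PATH so system commands must be given by absolute path.
$ENV{PATH} = "";

# Return a new, trusted copy of the address if it matches the pattern,
# or undef if it does not. The captured text ($1) is what becomes untainted.
sub untaint_email {
    my ($input) = @_;
    return unless defined $input;
    if ($input =~ /\A([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)\z/) {
        return $1;
    }
    return;
}

my $email = untaint_email('brian@example.com');   # sample input (hypothetical address)
print defined $email ? "ok: $email\n" : "rejected\n";
```

Only the value returned by the sub should be used in later system calls; the original input stays untrusted.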
The crypt function encrypts a password. When doing authentication, we have to encrypt
the password the user entered and validate it against the encrypted entry in the SQL
database. That means that all passwords in the SQL database have to be encrypted and
not readable by people who see your database. You're basically going to be comparing
one encrypted password against another encrypted password.
A cookie is a piece of text that carries certain pieces of information. For example: the
domain the cookie is for (e.g. senecac.on.ca), a path (e.g. /), a UID (e.g. 647) and an
expiry date (___). If no expiry date is supplied, the cookie expires when the browser is
closed.
Cookies allow us to make HTTP remember who we are. HTTP is a transactional protocol,
meaning each request and response stands alone and nothing is remembered from past
requests; cookies make it remember. When a server sends you a cookie, the browser stores
it and sends it back with every subsequent request to that domain.
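A minimal sketch of both directions of the cookie exchange in a CGI script: reading the cookies the browser sent (available in the HTTP_COOKIE environment variable) and sending one back with a Set-Cookie header. The sub name parse_cookies is made up; the uid and path values are the examples from these notes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a cookie string such as "uid=647; theme=dark" into a hash.
sub parse_cookies {
    my ($raw) = @_;
    my %cookies;
    return %cookies unless defined $raw;
    foreach my $pair (split /;\s*/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $cookies{$name} = defined $value ? $value : "";
    }
    return %cookies;
}

# Cookies sent by the browser with this request (none when run from a terminal).
my %cookies = parse_cookies($ENV{HTTP_COOKIE});

# Send a cookie back in the response header. With no expiry date it lasts
# until the browser is closed. Headers come before the blank line.
print "Set-Cookie: uid=647; path=/\n";
print "Content-type: text/plain\n\n";
print "cookies received: ", scalar(keys %cookies), "\n";
```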

bin/htpasswd -c conf/.htpasswd brian


This should prompt you for a password. Once you enter the password, htpasswd creates a
hidden file called .htpasswd in the conf directory (that is what -c does), containing a
user named brian with the password you supplied. You can configure Apache to use this
password file to enable authentication on certain directories.
To enable authentication on a directory, in the httpd.conf file, where you open up the
directory using the <Directory> container, put in the directives:
AuthUserFile /location/to/.htpasswd
AuthName "Message To Appear In Dialog Box"
AuthType Basic
Require valid-user
To give customers the right to configure certain things about a directory themselves, use
the AllowOverride directive in httpd.conf. You could create the file
httpd/htdocs/private/.htaccess and put the directives above in it. But for this file to
be used and allowed to override the authentication configuration, you have to edit
httpd.conf and put the line AllowOverride AuthConfig in the corresponding directory
container.
You can allow other things too, like AllowOverride Limit, to be able to put directives
like Order allow,deny and Allow from all in the .htaccess file.
HTTPS (HTTP over SSL, which is Secure Sockets Layer)
Encryption is taking your data and converting it with an encryption algorithm into
encrypted data to send over the network. On the receiving end, a key is used to decrypt
the encrypted data and give us the original data.
Encryption falls under two categories: symmetrical and asymmetrical.
With symmetrical encryption, the same key is used to encrypt and decrypt.
Symmetrical encryption is fast and efficient, and strong encryption is available with it.
It is better if a new key is used each time. So for example, today, when I'm
communicating with Royal Bank I'd have one key, but tomorrow for the same connection
I'd have a different key.
We refer to symmetric encryption as a shared secret. Both sides must know that secret,
aka the key.
With asymmetrical encryption, a pair of keys is used. When you encrypt with one,
you can only decrypt with its partner. We refer to this pair as the public and private
keys. The public key can be given to anyone, so the clients can have it. Assume that we
have A and B, where B is the server. B generates a pair of keys, so now it has a public
and a private key. It gives the public key to A. If A wants to send data to B, it
encrypts it using the public key that B provided, and only B can decrypt it, using its
private key. From B to A it isn't confidential, but it allows for a method of
authentication. It's not confidential because B encrypts using its private key, so ANYONE
who has B's public key can read it; but only B could have produced it. This is called a
digital signature.

For asymmetrical encryption, you cannot encrypt and decrypt using the same key; you
have to use the other key of the pair.
Asymmetrical encryption is slower and more resource intensive.
Steps for the server:
First step: Generate a pair of keys (we will use OpenSSL to do this).
Second step: Create a CSR document (Certificate Signing Request).
The CSR document contains the public key, the domain, the name of the administrator and
contact information for the server.
Third step: Submit the CSR to a certificate authority (CA). CAs are businesses that
make their money validating keys. They will try to validate the authenticity of companies
and their keys.
Fourth step: The CA validates the request and signs the certificate.
Fifth step: The admin installs the certificate.
So now on our server, we have our private key and a certificate which contains our
public key.
Steps for the client:
Step 1: The client requests an SSL connection. We do this by requesting an HTTPS
connection.
Step 2: The server sends its certificate to the client. This certificate contains the
server's public key.
Step 3: The client generates a one-time symmetric session key and encrypts it using the
server's public key. Then that encrypted symmetric session key is sent to the server.
So we're using both symmetrical and asymmetrical encryption, but we use asymmetrical
encryption only to encrypt the symmetric key.

Country Name (2 letter code) [AU]:CA


State or Province Name (full name) [Some-State]:ON
Locality Name (eg, city) []:Toronto
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Seneca College
Organizational Unit Name (eg, section) []:
Common Name (eg, YOUR name) []:Babak Abbasi
Email Address []:baabbasi@myseneca.ca
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

Exam:
Multiple Choice (12 Questions)

SSL/Encryption
- Taint Mode/Tainted Variables
- Cookies
HTTP Protocol
HTTP+Apache Config (17 Marks)
CGI scripting using Perl (13) - Gives us a script and asks about this script
CGI Script Debugging - A script containing 8 errors. We would have to identify those
errors. Could be logic errors or simple errors like a missing semicolon.
