Supplements
Published 2005-08-17
Table of Contents
1. Advanced Shell Scripting
    Discussion
        Shell Scripting
        Branches: if ... then ... [else ...] fi
        Loops: for ... in ... do ... done
    Examples
        Example 1. A Script for "Packing" Directories
    Online Exercises
        Specification
        Deliverables
    Questions
2. Character Encoding and Internationalization
    Files
        What are Files?
        What is a Byte?
        Data Encoding
    Text Encoding
        ASCII
        ISO 8859 and Other Character Sets
        Unicode (UCS)
        Unicode Transformation Format (UTF-8)
        Text Encoding and the Open Source Community
    Internationalization (i18n)
        The LANG environment variable
        Do I Really Have to Know All of This?
3. The RPM Package Manager
    Discussion
        RPM: The Red Hat Package Manager
        RPM Components
        Querying the RPM database
    Online Exercises
        Specification
        Deliverables
    Questions
Linux uses a general scripting mechanism, where executable text script files can be executed by an
interpreter specified on the initial line.
Within a bash script, any arguments provided when the script was invoked are available as positional
parameters (i.e., the variables $1, $2, ...).
The read builtin command can be used to read input from the keyboard ("standard in").
The bash shell uses an if ... then ... [else ...] fi syntax to implement conditional
branches.
The test command is often used as the conditional command in if ... then branches.
The bash shell uses a for ... in ... do ... done syntax to implement loops.
Discussion
Shell Scripting
Earlier chapters of this workbook discussed the creation of simple shell scripts. These scripts did little
more than execute a series of commands, optionally accepting user input to define variables.
However, shell scripts are capable of much, much more than this. This chapter will add some valuable tools
to your arsenal, allowing your scripts to make basic if/then/else decisions and to repeat a set of actions
in a loop.
if condition
then
command(s)
...
fi
or
if condition
then
command(s)
...
else
command(s)
...
fi
When using this syntax, line breaks are important (i.e., the if and then must occur on separate
lines, or be separated by a semicolon), but indentation is not.
What does bash expect as a condition? Unlike most programming languages, bash has no internal
syntax for making comparisons (such as $A == apple, or $B > 25). Instead, bash focuses on what shells
were designed to do: run commands. Any command can be used for the condition. The bash shell will
execute the command, and examine its return value. If the command "succeeds" (returns a return value of
0), the first stanza of commands is executed. If the command fails (returns a return value not equal to
zero), the second stanza of commands is executed (if any).
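As a small illustration of how any command can act as a condition, consider the following sketch. The function report is a hypothetical helper written only for this illustration (shell functions are not covered in this chapter); it runs whatever command it is handed, using "$@" to pass the words along unchanged, and announces which stanza an if branch would choose.

```shell
#!/bin/bash
# report is a hypothetical helper: it runs the command given as its
# arguments and prints which stanza of the if branch is chosen.
report() {
    if "$@"
    then
        echo "first stanza"
    else
        echo "second stanza"
    fi
}

report true          # the true command always returns 0
report false         # the false command always returns a nonzero value
report test -d /     # succeeds, because / is always a directory
```

The three calls should print first stanza, second stanza, and first stanza, in that order.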
The following modification to elvis's script shut serves as an example.
[elvis@station elvis]$ ls
example.sh  shut
[elvis@station elvis]$ cat shut
#!/bin/bash
# the first argument should be the name of the file to shut.
if ls "$1"
then
    chmod 600 "$1"
else
    echo "The file $1 does not exist."
fi
[elvis@station elvis]$ ./shut example.sh
example.sh
[elvis@station elvis]$ ./shut foo
In the first case, the ls command "succeeds" (because the file example.sh exists, the return value from
the ls command is 0). As a result, the first stanza of the if ... then ... else ... fi clause is
executed. In the second case, the file foo does not exist, so the second (else) stanza of the clause is
executed.
rha030-3.0-0-en-2005-08-17T07:23:17-0400
Copyright (c) 2003-2005 Red Hat, Inc. All rights reserved. For use only by a student enrolled in a Red Hat Academy course taught at a Red Hat Academy. Any other
use is a violation of U.S. and international copyrights. No part of this publication may be photocopied, duplicated, stored in a retrieval system, or otherwise
duplicated whether in electronic or print format without prior written consent of Red Hat, Inc. If you believe Red Hat course materials are being used, copied, or
otherwise improperly distributed please email training@redhat.com or phone toll-free (USA) +1 866 626 2994 or +1 (919) 754 3700.
#!/bin/bash
# the first argument should be the name of the file to shut.
if test -e "$1"
then
    chmod 600 "$1"
else
    echo "The file $1 does not exist."
fi
[elvis@station elvis]$ ./shut example.sh
[elvis@station elvis]$ ./shut foo
Notice that the test command tests for the existence of the file, but does not generate any messages to
distract the user.
The following table lists some of the more commonly used switches for testing file attributes.
Table 1-1. test Expressions for Examining File Attributes
Expression    Condition
-d FILE       FILE exists and is a directory.
-e FILE       FILE exists.
-f FILE       FILE exists and is a regular file.
-r FILE       FILE exists and is readable.
-w FILE       FILE exists and is writable.
-x FILE       FILE exists and is executable.
The test command can also compare strings, using the following expressions.
Expression            Condition
[-n] STRING           The length of STRING is greater than zero.
-z STRING             The length of STRING is zero.
STRING1 = STRING2     STRING1 and STRING2 are equal.
STRING1 != STRING2    STRING1 and STRING2 are not equal.
Lastly, the following table lists expressions that allow the test command to use compound logic.
Table 1-3. Logic Expressions for the test Command
Expression                    Condition
EXPRESSION1 -a EXPRESSION2    Both EXPRESSION1 and EXPRESSION2 are true.
EXPRESSION1 -o EXPRESSION2    Either EXPRESSION1 or EXPRESSION2 is true.
! EXPRESSION                  EXPRESSION is false.
These tables are meant to provide the student with a usable working set of expressions. For a complete
listing, consult the test(1) man page.
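The following sketch combines a few of these expressions. The function describe is a hypothetical helper invented for this illustration; it is not part of elvis's scripts, and the nested branches are only one of several ways the checks could be arranged.

```shell
#!/bin/bash
# describe is a hypothetical helper that classifies its first argument
# using string, file attribute, and logic expressions.
describe() {
    if test -z "$1"
    then
        echo "no name given"
    else
        if test -d "$1"
        then
            echo "directory"
        else
            # -a requires both expressions to be true
            if test -f "$1" -a -r "$1"
            then
                echo "readable regular file"
            else
                echo "something else"
            fi
        fi
    fi
}

describe /
describe /etc/passwd
describe ""
```

On a typical system the second call reports a readable regular file, but that depends on the file existing and on the permissions of the user running the script.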
#!/bin/bash
# the first argument should be the name of the file to shut.
if [ -e "$1" ]
then
    chmod 600 "$1"
else
    echo "The file $1 does not exist."
fi
Notice that the test command has been replaced with the alternate [ ... ] syntax.
When using the alternate syntax, care must be taken to include a space after the opening bracket, and
before the closing bracket. [1] For example, the following two constructions of the test command are
wrong.
[-e foo.sh ]
[ -e foo.sh]
#!/bin/bash
for PET in kitty doggy gerbil newt
do
echo "nice $PET."
done
[elvis@station elvis]$ ./nice
nice kitty.
nice doggy.
nice gerbil.
nice newt.
In this script, the shell variable PET is used as the iterator. With each iteration of the loop, the variable
takes on a different value.
More formally, for ... in ... do ... done loops in bash use the following syntax.
for iterator in list
do
command(s)
...
done
For each repetition of the loop, the variable iterator will evaluate to the individual words listed in the
expression list.
For a more practical example, we revisit elvis's script shut. The user elvis would now like to modify his
script, so that he can specify multiple files on the command line. To implement this change, he
essentially takes his previous script, and wraps it inside a for ... in ... do ... done loop. Rather
than using the first positional parameter ($1) directly, elvis uses an iterator to iterate through all
arguments supplied on the command line.
[elvis@station elvis]$ cat shut
#!/bin/bash
# each argument should be the name of a file to shut.
for FILE in $*
do
    if [ -e "$FILE" ]
    then
        chmod 600 "$FILE"
    else
        echo "The file $FILE does not exist."
    fi
done
In the following, elvis uses the script to modify the permissions on the files example.sh and nice.
[elvis@station elvis]$ ls -l
total 12
-rwxr-xr-x    1 elvis    elvis         212 Sep  3 10:56 example.sh
-rwxr-xr-x    1 elvis    elvis          77 Sep  4 12:16 nice
-rwxrwxr-x    1 elvis    elvis         188 Sep  4 12:31 shut
[elvis@station elvis]$ ./shut example.sh nice
[elvis@station elvis]$ ls -l
total 12
-rw-------    1 elvis    elvis         212 Sep  3 10:56 example.sh
-rw-------    1 elvis    elvis          77 Sep  4 12:16 nice
-rwxrwxr-x    1 elvis    elvis         188 Sep  4 12:31 shut
Notice the use of the $* variable to generate the list. The following table suggests other commonly used
tricks of the trade.
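Table 1-4. Common Techniques for Generating Iteration Lists
When you use...         ...the loop iterates over
for i in $*             all arguments supplied on the command line
for i in /etc/*.conf    all files matching the glob /etc/*.conf
for i in $(command)     each word in the output of command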
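The techniques named in Table 1-4 can be tried out with a sketch such as the following. The temporary directory, the file names, and the variables COUNT and WORDS are assumptions made for this illustration, as is the use of $(( )) arithmetic, which this chapter does not cover.

```shell
#!/bin/bash
# Build a scratch directory holding two .conf files and one other file.
DEMO=$(mktemp -d)
touch "$DEMO/a.conf" "$DEMO/b.conf" "$DEMO/notes.txt"

# Iterate over the files matched by a glob.
COUNT=0
for i in "$DEMO"/*.conf
do
    COUNT=$((COUNT + 1))
done
echo "$COUNT files end in .conf"

# Iterate over the words produced by a command.
WORDS=""
for i in $(echo one two three)
do
    WORDS="$WORDS[$i]"
done
echo "$WORDS"

# Clean up the scratch directory.
rm -fr "$DEMO"
```

The first loop sees only the two .conf files, and the second loop runs once for each of the three words printed by echo.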
Examples
Example 1. A Script for "Packing" Directories
The user elvis finds that he is often "tarring up" (archiving) directories he is not actively using. He
decides to create a script called pack which will help him archive directories more quickly.
The pack script expects one or more directories to be listed as arguments. For each directory, the script
will create an archive named after the directory, with the extension .tgz appended. If, and only if, the
creation of the archive is successful, the script will then remove the original directory.
As elvis thinks through the directories that users could specify, he realizes that the directories . and ..
could cause problems (why?), so he adds an exclusion for them.
[elvis@station elvis]$ cat pack
#!/bin/bash
for DIR in $*
do
    if [ -d "$DIR" ]
    then
        if [ "$DIR" == "." -o "$DIR" == ".." ]
        then
            echo "skipping directory $DIR"
        else
            tar cvzf "$DIR.tgz" "$DIR" && rm -fr "$DIR"
        fi
    else
        echo "skipping non directory $DIR"
    fi
done
The script loops through all of the provided command line arguments.
The script confirms that the argument exists, and that it refers to a directory.
The script here checks that the user has not specified the directory . or the directory .. as an
argument. In practice, there are still some directory names that can cause problems. Can you think of any?
Finally, here is the line that does the hard work. Notice that the original directory is removed only if
the tar command succeeds.
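The behavior of && can be seen in isolation with a sketch like the following, in which the variable LOG is an assumption introduced only to record which commands actually ran.

```shell
#!/bin/bash
# The command after && runs only when the command before it succeeds.
LOG=""
true && LOG="$LOG first"      # true succeeds, so the assignment runs
false && LOG="$LOG second"    # false fails, so the assignment is skipped
echo "ran:$LOG"
```

Only the word first is recorded, which is the same reason the pack script leaves a directory in place whenever tar reports a failure.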
[elvis@station elvis]$ ls -R
.:
pack  test1  test2

./test1:
four  one  three  two

./test2:
four  one  three  two
[elvis@station elvis]$ ./pack test1 test2
test1/
test1/one
test1/two
test1/three
test1/four
test2/
test2/one
test2/two
test2/three
test2/four
[elvis@station elvis]$ ls -R
.:
pack  test1.tgz  test2.tgz
Online Exercises
Lab Exercise
Objective: Use shell scripting to automate the rotation of images.
Estimated Time: 30 mins.
Specification
Create a script called rotate_cw, which can be used to rotate images 90 degrees. In order to perform
the rotation, you should use the convert command (examine the convert(1) man page, paying particular
attention to the rotate option). The following provides an example of using the convert command to
rotate an image.
[elvis@station elvis]$ cp /usr/share/pixmaps/redhat-main-menu.png .
[elvis@station elvis]$ convert -rotate 90 redhat-main-menu.png /tmp/redhat-mainmenu.png
Deliverables
1. An executable bash script called ~/rotate_cw, which will rotate images in the local directory whose
filenames (without directory components) are passed as arguments.
Questions
1. In order to execute a script, what permission(s) must a user have?
( ) a. read permission
( ) b. write permission
( ) c. execute permission
( ) d. A and C
( ) e. All of the above
2. In order to write an executable bash script, what must the first word of the first line look like?
( ) a. ##bash
( ) b. #!bash
( ) c. !!/bin/bash
( ) d. crunch-bang /bin/bash
( ) e. None of the above
3. When using the Linux scripting mechanism, what may be used as an interpreter?
( ) a. Any executable file within the /usr/bin/ directory
( ) b. Any executable file
( ) c. Only executable files which are listed in the file /etc/interpreters
( ) d. Only executables that ignore lines beginning with #.
( ) e. Only files that meet conditions C and D
4. Which of the following are mechanisms for passing information into shell scripts?
( ) a. Invoking the script with command line arguments.
( ) b. Configuring environment variables before invoking the script.
( ) c. Designing the script to read input from the keyboard (standard in).
( ) d. A and B
( ) e. All of the above
#!/bin/bash
for i in $*
do
if [ -r $i -a -f $i ]
5. Which of the following lines could replace the line labeled 1 above, with no effect on script execution?
( ) a. if test -r $i -a -f $i
( ) b. test -r $i -a -f $i
( ) c. if [ -r $i -o -f $i ]
( ) d. if [ -e $i ]
( ) e. None of the above
6. What syntax error exists in the script?
( ) a. The words for and do must occur on the same line.
( ) b. There must be no spaces between [ and -r on the line starting if.
( ) c. The gzip command must be specified using an absolute reference.
( ) d. The last line contains the misspelled word fi.
( ) e. The last two lines (containing done and fi) need to be transposed.
7. What does the variable i iterate through (assuming the syntax error mentioned above is fixed)?
( ) a. All (non-hidden) files in the local directory
( ) b. All files in the local directory
( ) c. All files which were previously defined in the environment variable named *.
( ) d. All of the command line arguments provided when the script was invoked.
( ) e. None of the above
The following text is found in the file /etc/bashrc.
if [ "`id -gn`" = "`id -un`" -a `id -u` -gt 99 ]; then
umask 002
else
umask 022
fi
Notes
1. In order to explore why this is the case, note that there is actually a file called /usr/bin/[. What
does this imply?
When storing text, computers transform characters into a numeric representation. This process is
referred to as encoding the text.
In order to accommodate the demands of a variety of languages, several different encoding techniques
have been developed. These techniques are represented by a variety of character sets.
The most sophisticated encoding technique is known as the Universal Character Set (UCS), or
Unicode.
The default encoding technique in Red Hat Enterprise Linux is referred to as UTF-8, which allows the
flexibility of Unicode but retains ASCII compatibility.
The LANG environment variable is used to specify a user's preferred language and character encoding.
Files
What are Files?
Linux, like most operating systems, stores information that needs to be preserved outside of the context
of any individual process in files. (In this context, and for most of this Workbook, the term file is meant in
the sense of regular file.) Linux (and Unix) files store information using a simple model: information is
stored as a single, ordered array of bytes, starting at the first byte and ending at the last. The number of
bytes in the array is the length of the file. [1]
What type of information is stored in files? Here are but a few examples.
The characters that compose the book report you want to store until you can come back and finish it
tomorrow are stored in a file called (say) ~/bookreport.txt.
The individual colors that make up the picture you took with your digital camera are stored in the file
(say) /mnt/camera/dcim/100nikon/dscn1203.jpg.
The characters which define the usernames of users on a Linux system (and their home directories,
etc.) are stored in the file /etc/passwd.
The specific instructions which tell an x86 compatible CPU how to use the Linux kernel to list the files
in a given directory are stored in the file /bin/ls.
What is a Byte?
At the lowest level, computers can only answer one type of question: is it on or off? What is it? When
dealing with disks, it is a magnetic domain which is oriented up or down. When dealing with memory
chips, it is a transistor which either has current or doesn't. Both of these are too difficult to mentally
picture, so we will speak in terms of light switches that can either be on or off. To your computer, the
contents of your file are reduced to what can be thought of as an array of (perhaps millions of) light
switches. Each light switch can be used to store one bit of information (is it on, or is it off).
Using a single light switch, you cannot store much information. To be more useful, an early convention
was established: group the light switches into bunches of 8. Each series of 8 light switches (or magnetic
domains, or transistors, ...) is a byte. More formally, a byte consists of 8 bits. Each permutation of ons and
offs for a group of 8 switches can be assigned a number. All switches off, we'll assign 0. Only the first
switch on, we'll assign 1; only the second switch on, 2; the first and second switch on, 3; and so on. How
many numbers will it take to label each possible permutation for 8 light switches? A mathematician will
quickly tell you the answer is 2^8, or 256. After grouping the light switches into groups of eight, your
computer views the contents of your file as an array of bytes, each with a value ranging from 0 to 255.
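The numbering of switch permutations can be checked with bash arithmetic, as in the sketch below. The function byte_value is a hypothetical helper written for this illustration; it assumes the eight switches are written as a string of 0s (off) and 1s (on), with the first switch as the rightmost digit.

```shell
#!/bin/bash
# byte_value is a hypothetical helper: it prints the number assigned to
# a permutation of eight switches, written as 0 (off) or 1 (on).
byte_value() {
    # 2#... tells bash to read the digits as a base 2 number
    echo $((2#$1))
}

byte_value 00000000    # all switches off
byte_value 00000001    # only the first switch on
byte_value 00000010    # only the second switch on
byte_value 00000011    # the first and second switches on
byte_value 11111111    # all switches on

# The number of distinct permutations of 8 switches.
echo $((2**8))
```

The calls print 0, 1, 2, 3, and 255, and the last line prints 256, the count of values from 0 through 255.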
Data Encoding
In order to store information as a series of bytes, the information must be somehow converted into a
series of values ranging from 0 to 255. Converting information into such a format is called data encoding.
What's the best way to do it? There is no single best way that works for all situations. Developing the
right technique to encode data, which balances the goals of simplicity, efficiency (in terms of CPU
performance and on disk storage), resilience to corruption, etc., is much of the art of computer science.
As one example, consider the picture taken by a digital camera mentioned above. One encoding
technique would divide the picture into pixels (dots), and for each pixel, record three bytes of
information: the pixel's "redness", "greenness", and "blueness", each on a scale of 0 to 255. The first
three bytes of the file would record the information for the first pixel, the second three bytes the second
pixel, and so on. A picture format known as "PNM" does just this (plus some header information, such as
how many pixels are in a row). Many other encoding techniques for images exist, some just as simple,
many much more complex.
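To make the three-bytes-per-pixel idea concrete, the following sketch writes a tiny image by hand. It assumes the binary color variant of the PNM family (tagged P6 in its header, and often called PPM); the temporary file and the variable names are inventions for this illustration.

```shell
#!/bin/bash
# Write a picture 2 pixels wide and 1 pixel high.
IMAGE=$(mktemp)

# Header: the format tag, the width and height, and the maximum color value.
printf 'P6\n2 1\n255\n' > "$IMAGE"

# Pixel data: one byte each of redness, greenness, and blueness per pixel.
# The escapes are octal: \377 is the byte 255 and \000 is the byte 0.
printf '\377\000\000' >> "$IMAGE"    # first pixel: pure red
printf '\000\000\377' >> "$IMAGE"    # second pixel: pure blue

# The file length is the 11 byte header plus 3 bytes for each pixel.
SIZE=$(( $(wc -c < "$IMAGE") ))
echo "the image occupies $SIZE bytes"
rm -f "$IMAGE"
```

The header accounts for 11 bytes and the two pixels for 6 more, so the file is 17 bytes long.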
Text Encoding
Perhaps the most common type of data which computers are asked to store is text. As computers have
developed, a variety of techniques for encoding text have been developed, from the simple in concept
(which could encode only the Latin alphabet used in Western languages) to complicated but powerful
techniques that attempt to encode all forms of human written communication, even attempting to include
historical languages such as Egyptian hieroglyphics. The following sections discuss many of the
encoding techniques commonly used in Red Hat Enterprise Linux.
ASCII
One of the oldest, and still most commonly used, techniques for encoding text is called ASCII encoding.
ASCII encoding simply takes the 26 lowercase and 26 uppercase letters which compose the Latin
alphabet, 10 digits, and common English punctuation characters (those found on a keyboard), and maps
each of them to an integer between 0 and 127, as outlined in the following table.
Table 2-1. ASCII Encoding of Printable Characters
Integer Range    Character
33-47            Punctuation: !"#$%&'()*+,-./
48-57            Numbers: 0-9
58-64            Punctuation: :;<=>?@
65-90            Uppercase letters: A-Z
91-96            Punctuation: [\]^_`
97-122           Lowercase letters: a-z
123-126          Punctuation: {|}~
What about the integers 0 - 32? These integers are mapped to special keys on early teletypes, many of
which have to do with manipulating the spacing on the page being typed on. The following characters are
commonly called "whitespace" characters.
Table 2-2. ASCII Encoding of Whitespace Characters
Integer    Character    Common Name        Common Representation
8          BS           Backspace          \b
9          HT           Tab                \t
10         LF           Line Feed          \n
12         FF           Form Feed          \f
13         CR           Carriage Return    \r
32         SPACE        Space Bar
127        DEL          Delete
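The integer values in these tables can be observed directly with the od command, as in the following sketch. The sample text and the variable CODES are assumptions chosen for this illustration.

```shell
#!/bin/bash
# od -An -tu1 prints each byte of its input as an unsigned decimal
# integer, without the usual column of file offsets.
CODES=$(printf 'A z\t\n' | od -An -tu1)

# Leaving $CODES unquoted lets the shell squeeze od's padding
# down to single spaces.
echo $CODES
```

The five bytes are an uppercase A (65), a space (32), a lowercase z (122), a tab (9), and a line feed (10).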
Others of the first 32 integers are mapped to keys which did not directly influence the "printed page", but
instead sent "out of band" control signals between two teletypes. Many of these control signals have
special interpretations within Linux (and Unix).
Table 2-3. ASCII Encoding of Control Signals
Integer    Character    Common Name            Common Representation
4          EOT          End of Transmission
7          BEL          Bell                   \a
27         ESC          Escape
What about the values 128-255? ASCII encoding does not use them. The ASCII standard only defines
the first 128 values of a byte, leaving the remaining 128 values to be defined by other schemes.
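Name       Formal Name    Description
Latin-1    ISO 8859-1     Western European languages
Latin-2    ISO 8859-2     Central and Eastern European languages
Arabic     ISO 8859-6     Latin/Arabic
Greek      ISO 8859-7     Latin/Greek
Latin-9    ISO 8859-15    Western European languages, with the Euro sign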
All of these character encoding schemes use a common technique. They reserve the first 128 values of a
byte to encode traditional ASCII, and use the remaining 128 values to encode glyphs unique to the
particular encoding. For example, ISO 8859-1 (Latin-1) uses the value 196 to encode a Latin capital A
with an umlaut (Ä), while ISO 8859-7 (Greek) uses the value 196 to encode the Greek capital letter
Delta (Δ), but both use the value 101 to encode a Latin lowercase e.
Notice a couple of implications about ISO 8859 encoding.
1. Each of the alternate encodings maps a single glyph to a single byte, so that the number of letters
encoded in a file equals the number of bytes which are required to encode them.
2. Choosing a particular character set extends the range of characters that can be encoded, but you
cannot encode characters from different character sets simultaneously. For example, you could not
encode both a Latin capital A with a grave and a Greek letter Delta simultaneously.
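The way a single byte changes meaning with the character set can be demonstrated with the iconv command, as sketched below. The sketch assumes that iconv is installed and that the terminal displays UTF-8, which is used here only as a common target so that both results can be shown side by side; the variable names are inventions for this illustration.

```shell
#!/bin/bash
# The byte with value 196 is written below as the octal escape \304,
# and the byte with value 101 as the octal escape \145.
LATIN=$(printf '\304' | iconv -f ISO-8859-1 -t UTF-8)
GREEK=$(printf '\304' | iconv -f ISO-8859-7 -t UTF-8)
E_LATIN=$(printf '\145' | iconv -f ISO-8859-1 -t UTF-8)
E_GREEK=$(printf '\145' | iconv -f ISO-8859-7 -t UTF-8)

echo "196 read as ISO 8859-1: $LATIN"
echo "196 read as ISO 8859-7: $GREEK"
echo "101 read as either:     $E_LATIN $E_GREEK"
```

The same byte comes out as a capital A with an umlaut under one character set and as a capital Delta under the other, while the byte 101 is a lowercase e in both.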
Unicode (UCS)
In order to overcome the limitations of ASCII and ISO 8859 based encoding techniques, a Universal
Character Set has been developed, commonly referred to as UCS, or Unicode. The Unicode standard
acknowledges the fact that one byte of information, with its ability to encode 256 different values, is
simply not enough to encode the variety of glyphs found in human communication. Instead, the Unicode
standard, in its simplest form (UCS-4), uses 4 bytes to encode each character. Think of 4 bytes as 32 light
switches. If we were to again label each permutation of on and off for 32 switches with integers, the
mathematician would tell you that you would need 4,294,967,296 (over 4 billion) integers. Thus, a 4 byte
encoding has room for over 4 billion glyphs (nearly enough for every person on the earth to have their
own unique glyph; the user prince would approve).
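The light switch arithmetic can be checked with shell arithmetic. This is only an illustrative sketch; the variable names are arbitrary, and it assumes a shell whose integers are wider than 32 bits (true of bash on current systems).

```shell
# Each additional switch (bit) doubles the number of on/off permutations,
# so n switches give 2 to the power n distinct values.
one_byte=$(( 1 << 8 ))      # 8 switches
four_bytes=$(( 1 << 32 ))   # 32 switches

echo "1 byte : $one_byte values"
echo "4 bytes: $four_bytes values"
```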
What are some of the features and drawbacks of Unicode encoding?
Scale
The Unicode standard will easily be able to encode the variety of glyphs used in human
communication for a long time to come.
Simplicity
The Unicode standard does have the simplicity of a sledgehammer. The number of bytes required to
encode a set of characters is simply the number of characters multiplied by 4.
Waste
While the Unicode standard is simple in concept, it is also very wasteful. The ability to encode 4
billion glyphs is nice, but in reality, much of the communication that occurs today uses less than a
few hundred glyphs. Of the 32 bits (light switches) used to encode each character, the first 20 or so
would always be "off".
ASCII Non-compatibility
For better or for worse, a huge amount of existing data is already ASCII encoded. In order to convert
fully to Unicode, that data, and the programs that expect to read it, would have to be converted.
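The Simplicity and Waste points above can be made concrete with a small sketch. Both functions are hypothetical helpers written for this illustration, and the first assumes its argument contains only ASCII characters so that the shell's length operator counts characters reliably.

```shell
# Bytes needed to store a string at a fixed 4 bytes per character.
# Assumes the argument contains only ASCII characters.
fixed_width_size() {
    echo $(( ${#1} * 4 ))
}

# Show a character value (decimal) as the 4 bytes, written as 8 hex digits,
# that a fixed 4 byte encoding would use to store it.
four_byte_hex() {
    printf '%08x\n' "$1"
}

fixed_width_size "hello"   # 5 characters, so 20 bytes
four_byte_hex 101          # lowercase e: 00000065
```

The leading zeros printed by four_byte_hex are the switches that stay "off": for a lowercase e, only the last of the four bytes carries any information.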
The Unicode standard is an effective standard in principle, but in many respects it is ahead of its time,
and perhaps forever will be. In practice, other techniques have been developed which attempt to preserve
the scale and versatility of Unicode, while minimizing waste and maintaining ASCII compatibility. What
must be sacrificed? Simplicity.
Internationalization (i18n)
As this Workbook continues to discuss many tools and techniques for searching, sorting, and
manipulating text, the topic of internationalization cannot be avoided. In the open source community,
internationalization is often abbreviated as i18n, a shorthand for saying "i-n with 18 letters in between".
Applications which have been internationalized take into account different languages. In the Linux (and
Unix) community, most applications look for the LANG environment variable to determine which
language to use.
At the simplest, this implies that programs will emit messages in the user's native language.
[elvis@station elvis]$ echo $LANG
chmod: Beim Setzen der Zugriffsrechte für /etc/passwd: Die Operation ist nicht erlaubt
More subtly, the choice of a particular language has implications for sorting orders, numeric formats, text
encoding, and other issues.
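The LANG environment variable is expected to be set to a string of the form LL_CC.enc, whose components
play the following roles.
Component   Role
LL          Two letter ISO 639 language code
CC          Two letter country code
enc         Code set (character encoding) specification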
The locale command can be used to examine your current configuration (as can echo $LANG), while
locale -a will list all settings currently supported by your system. The extent of the support for any given
language will vary.
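Locale names such as de_DE.utf8 conventionally take the form LL_CC.enc: a language code, a country code, and a code set. As a rough sketch of how such a string breaks down, the following uses only shell parameter expansion. The function split_locale is a hypothetical helper, not a standard command, and the sample values are chosen for illustration.

```shell
# Split a locale name of the form LL_CC.enc into its three parts.
# split_locale is a hypothetical helper, not a standard command.
split_locale() {
    locale_name=$1
    lang_country=${locale_name%%.*}         # text before the first "."
    case $locale_name in
        *.*) enc=${locale_name#*.} ;;       # text after the first "."
        *)   enc="" ;;                      # no code set was given
    esac
    language=${lang_country%%_*}            # text before the first "_"
    case $lang_country in
        *_*) country=${lang_country#*_} ;;  # text after the first "_"
        *)   country="" ;;                  # no country was given
    esac
    echo "language=$language country=$country encoding=$enc"
}

split_locale "de_DE.utf8"
split_locale "fr_FR.iso885915"
```

The country and code set parts are optional in practice, which is why the sketch tolerates names such as en that omit them.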
The following tables list some selected language codes, country codes, and code set specifications.
Table 2-6. Selected ISO 639 Language Codes
Code   Language
de     German
el     Greek
en     English
es     Spanish
fr     French
ja     Japanese
zh     Chinese
Code   Country
CA     Canada
CN     China
DE     Germany
ES     Spain
FR     France
GB     Britain (UK)
GR     Greece
JP     Japan
NG     Nigeria
US     United States

Code        Code Set
utf8        UTF-8
iso88591    ISO 8859-1
iso885915   ISO 8859-15
iso88596    ISO 8859-6
iso88592    ISO 8859-2
See the gettext info pages (info gettext, or pinfo gettext) for a complete listing.
Notes
1. While this may seem an obvious way to do things, some operating systems take more elaborate
approaches. The Macintosh operating system, for example, stores files using two arrays of
information, a data fork and a resource fork.
The rpm command is used to add or remove software from your system.
You must have root privileges to add and remove software with rpm.
Discussion
RPM: The Red Hat Package Manager
The Red Hat Package Manager is probably the element that most defines the Red Hat Enterprise Linux
distribution. The package manager allows developers a way to build and distribute software,
administrators a way to install and maintain software, and all users a way to query for information about
and verify the integrity of installed software.
RPM Components
When people speak of RPM, they are speaking of three components collectively: the RPM database, the
rpm executable, and package files.
Package Files
Package files are the means by which software is distributed. Package files are generally named using the
following convention.
name-version-release.arch.rpm
For example, Red Hat's first release of the package file for version 4.0.7 of the open source application
zsh compiled for the Intel x86 (and compatible) architecture would conventionally be named
zsh-4.0.7-1.i386.rpm.
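The naming convention can be taken apart mechanically. The following sketch uses shell parameter expansion to peel the fields off from the right, since a package name may itself contain hyphens. The function split_package_file is a hypothetical helper written for this illustration; it inspects only the file name, not the package contents.

```shell
# Split a package file name of the form name-version-release.arch.rpm.
# split_package_file is a hypothetical helper; it looks only at the name.
split_package_file() {
    base=${1%.rpm}          # drop the trailing .rpm
    arch=${base##*.}        # text after the last "."
    nvr=${base%.*}          # what remains: name-version-release
    release=${nvr##*-}      # text after the last "-"
    nv=${nvr%-*}            # what remains: name-version
    version=${nv##*-}       # text after the last "-"
    name=${nv%-*}           # everything before the version
    echo "name=$name version=$version release=$release arch=$arch"
}

split_package_file zsh-4.0.7-1.i386.rpm
```

For the example above, this prints name=zsh version=4.0.7 release=1 arch=i386.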
Package files are essentially tar archives (though they more closely resemble the less familiar cpio
archives) combined with header information which names, versions, and states dependencies for the package.
When people refer to the Red Hat distribution, they are generally referring to the collection of RPM
package files which compose the software installed on a machine.
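Specification          Packages queried
-a                     all installed packages
package-name           the installed package named package-name
-f filename            the installed package which owns the file filename
-p package-file-name   the package file package-file-name, queried directly

Specification          Information returned
(default)              the package name
-i                     the package's information header
-l                     the list of files which the package contains
--queryformat str      the fields named in the query format string str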
By choosing one option from the first table, and zero or more options from the second table, users can
formulate specific questions for the RPM database.
Query Examples
General Queries
For example, the -a command line switch performs a query against all installed packages. If no other
question is asked, by default rpm returns the package name. Thus rpm -qa will return a list of all
installed packages.
[prince@station prince]$ rpm -qa
basesystem-8.0-2
expat-1.95.5-2
libacl-2.2.3-1
popt-1.8-0.69
rootfiles-7.2-6
cpio-2.5-3
gzip-1.3.3-9
...
If a package name is specified, rpm will query only that package. What information is returned?
Again, by default, the package name.
[prince@station prince]$ rpm -q bash
bash-2.05b-20.1
Name        : bash                         Relocations: /usr
Version     : 2.05b                             Vendor: Red Hat, Inc.
Release     : 20.1                          Build Date: Wed 09 Apr 2003 09:02:36 AM EDT
Install Date: Tue 08 Jul 2003 09:29:33 AM EDT      Build Host: stripples.devel.redhat.com
Group       : System Environment/Shells     Source RPM: bash-2.05b-20.1.src.rpm
Size        : 1619204                          License: GPL
Signature   : DSA/SHA1, Mon 09 Jun 2003 06:45:19 PM EDT, Key ID 219180cddb42a60e
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Summary     : The GNU Bourne Again shell (bash).
Description :
The GNU project Bourne Again shell (bash) is a shell or command
language interpreter that is compatible with the Bourne shell
(sh). Bash incorporates useful features from the Korn shell (ksh) and
the C shell (csh) and most sh scripts can be run by bash without
modification. Bash is the default shell for Red Hat Linux.

/bin/bash
/bin/bash2
/bin/sh
/etc/skel/.bash_logout
/etc/skel/.bash_profile
/etc/skel/.bashrc
/usr/bin/bashbug
/usr/lib/bash
/usr/share/doc/bash-2.05b
/usr/share/doc/bash-2.05b/CHANGES
...
hwcrypto-1.0-14

Name        : hwcrypto                     Relocations: (not relocateable)
Version     : 1.0                               Vendor: Red Hat, Inc.
Release     : 14                            Build Date: Tue 04 Feb 2003 06:20:37 AM EST
Install Date: Tue 01 Apr 2003 11:27:43 AM EST      Build Host: sylvester.devel.redhat.com
Is there any documentation on the system that could tell you more about it? Add a -l to list the files
related to /etc/aep.conf.
[prince@station etc]$ rpm -qf /etc/aep.conf -l
/etc/aep
/etc/aep.conf
/etc/aep/aeptarg.bin
/etc/aeplog.conf
...
/usr/sbin/aepversion
/usr/share/doc/hwcrypto-1.0
/usr/share/doc/hwcrypto-1.0/hwcrypto.txt
/usr/share/doc/hwcrypto-1.0/readme.snmp
/usr/share/snmp/mibs/cnStatTrap.mib
In this case, not much, but maybe /usr/share/doc/hwcrypto-1.0/hwcrypto.txt will provide some help. Many
packages include man pages that can be read, or info pages that can be browsed. At least you can locate
some configuration files you might want to peruse to find out more.
Investigating an Unfamiliar Package File
What if you come across a package file which is not yet installed on your system? The rpm command
allows package files to be queried directly with the -p command line switch.
[prince@station RPMS]$ rpm -qil -p xsri-2.1.0-5.i386.rpm
Name        : xsri                         Relocations: (not relocateable)
Version     : 2.1.0                             Vendor: Red Hat, Inc.
Release     : 5                             Build Date: Sat 25 Jan 2003 03:37:15 AM EST
Install Date: (not installed)               Build Host: porky.devel.redhat.com
Group       : Amusements/Graphics           Source RPM: xsri-2.1.0-5.src.rpm
Size        : 27190                            License: GPL
Signature   : DSA/SHA1, Mon 24 Feb 2003 12:40:17 AM EST, Key ID 219180cddb42a60e
Packager    : ...
Summary     : A program for displaying images on the background for X.
Description :
The xsri program allows the display of text, patterns, and images in
the root window, so users can customize the XDM style login screen
and/or the normal X background.
Install xsri if you would like to change the look of your X login
screen and/or X background. It is also used to display the default
background (Red Hat logo).
/usr/bin/xsri
/usr/share/doc/xsri-2.1.0
As mentioned in the table above, this is a fundamentally different type of query. The package file, which
might or might not be installed, is providing the information, not the RPM database.
Formatting Specific Information
What if you would like to generate a list of the 10 largest packages installed on your system? With rpm
-qai, the information header for every package would be displayed, which could be grepped down for
just the sizes, but then you'd need names. You could add names, but then the name and size would be on
separate lines. You get the idea.
Fortunately, the rpm command allows users to compose very specific questions by specifying a query
format string. The string is composed of any ASCII text, but tokens of the form %{fieldname} will be
replaced with the relevant information field. What can be used as field names? For starters, any field
found in a package's information header, but there's more. The command rpm --querytags will return a
complete (and intimidating) list of available fields.
For the specific task at hand, prince performs the following query. (Note that he needs to explicitly
specify a newline with \n.)
[prince@station RPMS]$ rpm -qa --queryformat "%{size} %{name}\n"
0 basesystem
156498 expat
19248 libacl
111647 popt
1966 rootfiles
67679 cpio
162449 gzip
...
Just the information prince wanted. With a syntax of %width{fieldname}, an optional field width can be
specified. Using this to clean up his output, and piping to sort and head, prince generates a list of the 10
largest packages on his system fairly easily.
[prince@station RPMS]$ rpm -qa --queryformat "%10{size} %{name}\n" | sort -rn | head
 170890527 kernel-source
 131431309 openoffice-i18n
 100436356 openoffice-libs
  84371104 gnucash
  80018678 openoffice
  75838208 rpmdb-redhat
  55166532 Omni
  54674111 tetex-doc
  41939971 glibc-common
  36762653 xorg
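The sorting half of this pipeline can be tried on a machine without an RPM database. The sketch below feeds the seven sample "size name" lines from the earlier unsorted query through the same sort and head commands. The function sample_sizes is a hypothetical stand-in for the rpm query, and head -n 3 is used instead of plain head only because the sample is short.

```shell
# Stand-in for the rpm query: the sample "size name" lines shown earlier.
# sample_sizes is a hypothetical helper used only for this illustration.
sample_sizes() {
    printf '%s\n' \
        "0 basesystem" \
        "156498 expat" \
        "19248 libacl" \
        "111647 popt" \
        "1966 rootfiles" \
        "67679 cpio" \
        "162449 gzip"
}

# sort -rn compares the leading numbers and lists the largest first;
# head -n 3 keeps only the first three lines of the sorted output.
sample_sizes | sort -rn | head -n 3
```

Of the sample packages, gzip, expat, and popt come out on top, in that order. The -n switch matters: without it, sort would compare the sizes as text, character by character, rather than as numbers.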
Online Exercises
Lab Exercise
Objective: Become familiar with RPM queries
Estimated Time: 15 mins.
Specification
1. Create the file ~/bash_files, which contains a list of all files which belong to the bash package,
listing one file per line using absolute references.
2. Create the file ~/sshd_man, which lists the three files which contain man pages associated with the
openssh-server package, one file per line using absolute references.
3. In the file ~/whatis_libcap, include the single word which best completes the following
sentence: The /lib/libcap.so.1.* library is used for getting and setting POSIX.1e __________.
(Do not be concerned if you do not fully understand the answer).
4. Create the file ~/license_counts, which tables the number of occurrences of packages which are
licensed under a given license, for the top 5 most commonly used licenses, sorted in numerically
descending order. If performed correctly, your file should be formatted similarly to the following.
(Do not be concerned if the actual counts or license names are different. Also, you might notice
logically similar licenses, such as LGPL/GPL and GPL/LGPL. Do not make any attempt to combine
them into a single entry.)
[prince@station prince]$ cat license_counts
    355 GPL
    147 LGPL
     53 BSD
     47 distributable
     18 xorg
Deliverables
1. The file ~/bash_files, which contains a list of all files owned by the bash package, one file per line, using
absolute references.
2. The file ~/sshd_man, which contains a list of the three files which provide man pages for the openssh-server
package, one file per line, using absolute references.
3. The file ~/whatis_libcap, which contains the one word answer for what the library gets and sets.
4. The file ~/license_counts, which tables the various licenses under which packages are distributed,
Questions
1. How does almost every RPM query command line begin?
( ) a. rpmquery ...
( ) b. rpm -q ...
( ) c. qpackage ...
( ) d. lsrpm ...
( ) e. None of the above
2. Where is the RPM database located?
( ) a. /tmp/.rpmdb
( ) b. /usr/share/rpm
( ) c. At http://rpmdb.redhat.com
( ) d. /var/lib/rpm
( ) e. None of the above
3. What would be the conventional name of the package file for release 7 of version 2.0.8 of the bash package
compiled for the x86 architecture?
( ) a. bash.i386-2.0.8.7.rpm
( ) b. rpm-bash-2.0.8-7.i386
( ) c. bash-2.0.8-7.i386.rpm
( ) d. bash-2.0.8-i386.rpm
( ) e. None of the above
4. Which of the following command lines would list the package names for all installed packages?
( ) a. rpm -q --dump
( ) b. rpm -qa
( ) c. lsrpm -a
( ) d. rpm -q --name
( ) e. None of the above
5. Which of the following would generate an information header and file list for (only) the package xsnow?
( ) a. rpm -q --list -l xsnow
( ) b. rpm -qa -i -l
( ) c. rpm -q -i xsnow
( ) d. rpm -qil xsnow
( ) e. None of the above
6. Which of the following command lines would list all files which are contained in the same package as
/etc/pwdb.conf?
( ) a. rpm -ql /etc/pwdb.conf
( ) b. rpm -fql /etc/pwdb.conf
( ) c. rpm -qlf /etc/pwdb.conf
( ) d. rpm -qif /etc/pwdb.conf
( ) e. None of the above
7. Which of the following command lines would query the xsane-0.89-3.i386.rpm package file for a list of files
that it contains?
( ) a. rpm -q -p xsane-0.89-3.i386.rpm -l
( ) b. rpm -ql xsane-0.89-3.i386.rpm
( ) c. rpm -qp xsane-0.89-3.i386.rpm
( ) d. rpm -qip xsane-0.89-3.i386.rpm
( ) e. None of the above
8. Which of the following could be used to determine how much disk space the xscreensaver package consumes?
( ) a. rpm -q -i xscreensaver
( ) b. rpm -q -s xscreensaver
( ) c. rpm -qa xscreensaver
( ) d. rpm -q -l xscreensaver
( ) e. None of the above