
Introduction to Python

David M. Beazley http://www.dabeaz.com

Edition: Thu May 28 19:33:40 2009

Copyright (C) 2009 David M Beazley All Rights Reserved

Introduction to Python : Table of Contents

Introduction to Python
0. Course Setup
1. Introduction
2. Working with Data
3. Program Organization and Functions

Objects and Software Development
4. Modules and Libraries
5. Classes
6. Inside the Python Object Model
7. Documentation, Testing, and Debugging

Python Systems Programming
8. Iterators and Generators
9. Working with Text
10. Binary Data Handling and File I/O
11. Working with Processes
12. Python Integration Primer

Section 0

Course Setup

Copyright (C) 2008, http://www.dabeaz.com


Required Files
Where to get Python (if not installed)
http://www.python.org

Exercises for this class


http://www.dabeaz.com/python/pythonclass.zip


Setting up Your Environment


Extract pythonclass.zip on your machine

This folder is where you will do your work



Class Exercises
Exercise descriptions are found in
PythonClass/Exercises/index.html

All exercises have solution code

Look for the link at the bottom!



Class Exercises
Please follow the filename suggestions given in each exercise
That is the filename you should be using
Using different names will break later exercises

General Tips
We will be writing a lot of programs that access data files in PythonClass/Data
Make sure you save your programs in the "PythonClass/" directory so that the names of these files are easy to access.
Some exercises are more difficult than others. Please copy code from the solution and study it if necessary.

Using IDLE
For this class, we will be using IDLE.
IDLE is an integrated development environment that comes prepackaged with Python
It's not the most advanced tool, but it works
Follow the instructions on the next two slides to start it in the correct environment

Running IDLE (Windows)


Find RunIDLE in the PythonClass/ folder
Double-click to start the IDLE environment

Running IDLE (Mac/Unix)


Go into the PythonClass/ directory
Type the following command in a command shell
    % python RunIDLE.pyw
Note: Typing 'idle' at the shell might also work.

Section 1

Introduction to Python


Where to Get Python?


http://www.python.org

Downloads
Documentation and tutorial
Community Links
Third party packages
News and more

What is Python?
An interpreted high-level programming language.
Similar to Perl, Ruby, Tcl, and other so-called "scripting languages."
Created by Guido van Rossum around 1990.
Named in honor of Monty Python

Why was Python Created?


"My original motivation for creating Python was the perceived need for a higher level language in the Amoeba [Operating Systems] project. I realized that the development of system administration utilities in C was taking too long. Moreover, doing these things in the Bourne shell wouldn't work for a variety of reasons. ... So, there was a need for a language that would bridge the gap between C and the shell." - Guido van Rossum

Python Influences
C (syntax, operators, etc.)
ABC (syntax, core data types, simplicity)
Unix ("Do one thing well")
Shell programming (but not the syntax)
Lisp, Haskell, and Smalltalk (later features)

Some Uses of Python


Text processing/data processing
Application scripting
Systems administration/programming
Internet programming
Graphical user interfaces
Testing
Writing quick "throw-away" code

Python Non-Uses
Device drivers and low-level systems
Computer graphics, visualization, and games
Numerical algorithms/scientific computing
Comment: Python is still used in these application domains, but only as a high-level control language. Important computations are actually carried out in C, C++, Fortran, etc. For example, you would not implement matrix multiplication in Python.

Getting Started
In this section, we will cover the absolute basics of Python programming
How to start Python
Python's interactive mode
Creating and running simple programs
Basic calculations and file I/O.

Running Python
Python programs run inside an interpreter
The interpreter is a simple "console-based" application that normally starts from a command shell (e.g., the Unix shell)
    shell % python
    Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
    [GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
    Type "help", "copyright", "credits" or "license"
    >>>
Expert programmers usually have no problem using the interpreter in this way, but it's not so user-friendly for beginners

IDLE
Python includes a simple integrated development environment called IDLE (which is another Monty Python reference)
It's not the most sophisticated environment, but it's already installed and it works
Most working Python programmers tend to use something else, but it is fine for this class.

IDLE on Windows
Look for it in the "Start" menu


IDLE on other Systems


Launch a terminal or command shell

Type the following command to launch IDLE


shell % python -m idlelib.idle

The Python Interpreter


When you start Python, you get an "interactive" mode where you can experiment
If you start typing statements, they will run immediately
No edit/compile/run/debug cycle
In fact, there is no "compiler"

Interactive Mode
The interpreter runs a "read-eval" loop
    >>> print "hello world"
    hello world
    >>> 37*42
    1554
    >>> for i in range(5):
    ...     print i
    ...
    0
    1
    2
    3
    4
    >>>
Executes simple statements typed in directly
Very useful for debugging, exploration

Interactive Mode
Some notes on using the interactive shell
>>> is the interpreter prompt for starting a new statement
... is the interpreter prompt for continuing a statement (it may be blank in some tools)
    >>> print "hello world"
    hello world
    >>> 37*42
    1554
    >>> for i in range(5):
    ...     print i
    ...
    0
    1
    2
    3
    4
    >>>
Enter a blank line at the ... prompt to finish typing and to run

Interactive Mode in IDLE


Interactive shell plus added help features (syntax highlighting, usage information)

Getting Help
help(name) command
    >>> help(range)
    Help on built-in function range in module __builtin__:

    range(...)
        range([start,] stop[, step]) -> list of integers

        Return a list containing an arithmetic progression of integers.
        range(i, j) returns [i, i+1, i+2, ..., j-1]; start (!) defaults to 0.
        ...
    >>>
Type help() with no name for interactive help
Documentation at http://docs.python.org

Exercise 1.1

Time: 10 minutes


Creating Programs
Programs are put in .py files
    # helloworld.py
    print "hello world"
Source files are simple text files
Create with your favorite editor (e.g., emacs)
Note: There may be special editing modes
Can also edit programs with IDLE or other Python IDE (too many to list)

Creating Programs
Creating a new program in IDLE


Creating Programs
Editing a new program in IDLE


Creating Programs
Saving a new program in IDLE

Running Programs (IDLE)


Select "Run Module" (F5)

Will see output in IDLE shell window



Running Programs
In production environments, Python may be run from command line or a script
Command line (Unix)
    shell % python helloworld.py
    hello world
    shell %
Command shell (Windows)
    C:\SomeFolder>helloworld.py
    hello world
    C:\SomeFolder>c:\python25\python helloworld.py
    hello world

A Sample Program
The Sears Tower Problem

You are given a standard sheet of paper which you fold in half. You then fold that in half and keep folding. How many folds do you have to make for the thickness of the folded paper to be taller than the Sears Tower? A sheet of paper is 0.1mm thick and the Sears Tower is 442 meters tall.

A Sample Program
    # sears.py
    # How many times do you have to fold a piece of paper
    # for it to be taller than the Sears Tower?

    height = 442              # Meters
    thickness = 0.1*0.001     # Meters (0.1 millimeter)

    numfolds = 0
    while thickness <= height:
        thickness = thickness * 2
        numfolds = numfolds + 1
        print numfolds, thickness

    print numfolds, "folds required"
    print "final thickness is", thickness, "meters"

A Sample Program
Output
    % python sears.py
    1 0.0002
    2 0.0004
    3 0.0008
    4 0.0016
    5 0.0032
    ...
    20 104.8576
    21 209.7152
    22 419.4304
    23 838.8608
    23 folds required
    final thickness is 838.8608 meters
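The fold count can be cross-checked without a loop, since each fold doubles the thickness: the answer is the smallest n for which thickness * 2**n exceeds the height. The following is a minimal sketch, not part of the original course material. The function names folds_needed and folds_closed_form are hypothetical, and print is written with parentheses so that the code runs under both Python 2 and Python 3.

```python
import math

def folds_needed(height, thickness):
    """Count doublings until thickness exceeds height (same loop as sears.py)."""
    numfolds = 0
    while thickness <= height:
        thickness = thickness * 2
        numfolds = numfolds + 1
    return numfolds

def folds_closed_form(height, thickness):
    """Smallest n with thickness * 2**n > height, computed with a base-2 logarithm."""
    return int(math.floor(math.log(height / thickness, 2))) + 1

looped = folds_needed(442, 0.1 * 0.001)
closed = folds_closed_form(442, 0.1 * 0.001)
print("loop: %d folds, closed form: %d folds" % (looped, closed))
```

Both approaches agree with the 23 folds shown in the program output. The logarithm version is subject to floating point rounding when the ratio is extremely close to an exact power of two, so the loop remains the safer reference.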

Exercise 1.2

Time: 10 minutes


An IDLE Caution
The first rule of using IDLE is that you will always open files using the "File > Open" or "File > Recent Files" menu options
Do not right click on .py files to open

An IDLE Caution
The second rule of using IDLE is that you will always open files using the "File > Open" or "File > Recent Files" menu options
Ignore this rule at your own peril (IDLE will operate strangely and you'll be frustrated)

Python 101 : Statements


A Python program is a sequence of statements
Each statement is terminated by a newline
Statements are executed one after the other until you reach the end of the file.
When there are no more statements, the program stops

Python 101 : Comments


Comments are denoted by #
    # This is a comment
    height = 442          # Meters
Extend to the end of the line
There are no block comments in Python (e.g., /* ... */).

Python 101: Variables


A variable is just a name for some value
Variable names follow same rules as C
    [A-Za-z_][A-Za-z0-9_]*
You do not declare types (int, float, etc.)
    height = 442              # An integer
    height = 442.0            # Floating point
    height = "Really tall"    # A string
Differs from C++/Java where variables have a fixed type that must be declared.

Python 101: Keywords


Python has a basic set of language keywords
    and       assert    break     class     continue
    def       del       elif      else      except
    exec      finally   for       from      global
    if        import    in        is        lambda
    not       or        pass      print     raise
    return    try       while     with      yield
Variables cannot have one of these names
These are mostly C-like and have the same meaning in most cases (later)

Python 101: Case Sensitivity


Python is case sensitive
These are all different variables:
    name = "Jake"
    Name = "Elwood"
    NAME = "Guido"
Language statements are always lower-case
    print "Hello World"     # OK
    PRINT "Hello World"     # ERROR
    while x < 0:            # OK
    WHILE x < 0:            # ERROR
So, no shouting please...

Python 101: Looping


The while statement executes a loop
    while thickness <= height:
        thickness = thickness * 2
        numfolds = numfolds + 1
        print numfolds, thickness
Executes the indented statements underneath while the condition is true

Python 101 : Indentation


Indentation is used to denote blocks of code
Indentation must be consistent
    while thickness <= height:
        thickness = thickness * 2
        numfolds = numfolds + 1
        print numfolds, thickness
(ok)
    while thickness <= height:
        thickness = thickness * 2
          numfolds = numfolds + 1
      print numfolds, thickness
(error)
Colon (:) always indicates start of new block
    while thickness <= height:

Python 101 : Indentation


There is a preferred indentation style
Always use spaces
Use 4 spaces per level
Avoid tabs
Always use a Python-aware editor

Python 101 : Conditionals


If-else
    if a < b:
        print "Computer says no"
    else:
        print "Computer says yes"
If-elif-else
    if a == '+':
        op = PLUS
    elif a == '-':
        op = MINUS
    elif a == '*':
        op = TIMES
    else:
        op = UNKNOWN

Python 101 : Relations


Relational operators
    <   >   <=   >=   ==   !=
Boolean expressions (and, or, not)
    if b >= a and b <= c:
        print "b is between a and c"
    if not (b < a or b > c):
        print "b is still between a and c"
Non-zero numbers, non-empty objects also evaluate as True.
    x = 42
    if x:
        # x is nonzero
        ...
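The rule that non-zero numbers and non-empty objects count as true can be explored with a small helper. This is an illustrative sketch only: the function describe and the list of sample values are made up for the example, and print is parenthesized so the code runs on Python 2 and Python 3 alike.

```python
def describe(value):
    """Report how `value` behaves when used directly as an if-condition."""
    if value:
        return "true"
    else:
        return "false"

# Numbers, strings, and lists: zero and empty values are treated as false
samples = [42, 0, -1, 0.0, "hello", "", [1, 2], []]

results = []
for v in samples:
    results.append(describe(v))
    print("%r is treated as %s" % (v, describe(v)))
```

Note that a negative number such as -1 is still non-zero, so it is treated as true.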

Python 101 : Printing


The print statement
    print x
    print x,y,z
    print "Your name is", name
    print x,                       # Omits newline
Produces a single line of text
Items are separated by spaces
Always prints a newline unless a trailing comma is added after last item

Python 101 : pass statement


Sometimes you will need to specify an empty block of code
    if name in namelist:
        # Not implemented yet (or nothing)
        pass
    else:
        statements
pass is a "no-op" statement
It does nothing, but serves as a placeholder for statements (possibly to be added later)

Python 101 : Long Lines


Sometimes you get long statements that you want to break across multiple lines
Use the line continuation character (\)
    if product=="game" and type=="pirate memory" \
                       and age >= 4 and age <= 8:
        print "I'll take it!"
However, not needed for code in (), [], or {}
    if (product=="game" and type=="pirate memory"
                        and age >= 4 and age <= 8):
        print "I'll take it!"

Exercise 1.3

Time: 15 minutes


Basic Datatypes
Python only has a few primitive types of data
Numbers
Strings (character text)

Copyright (C) 2007, http://www.dabeaz.com

Numbers
Python has 5 types of numbers
Booleans
Integers
Long integers
Floating point
Complex (imaginary numbers)

Booleans (bool)
Two values: True, False
    a = True
    b = False
Evaluated as integers with value 1,0
    c = 4 + True       # c = 5
    d = False
    if d == 0:
        print "d is False"
Although doing that in practice would be odd

Integers (int)
Signed integers up to machine precision
    a = 37
    b = -299392993
    c = 0x7fa8          # Hexadecimal
    d = 0253            # Octal
Typically 32 bits
Comparable to the C long type

Long Integers (long)


Arbitrary precision integers
    a = 37L
    b = -126477288399477266376467L
Integers that overflow promote to longs
    >>> 3 ** 73
    67585198634817523235520443624317923L
    >>> a = 72883988882883812
    >>> a
    72883988882883812L
    >>>
Can almost always be used interchangeably with integers

Integer Operations
    +               Add
    -               Subtract
    *               Multiply
    /               Divide
    //              Floor divide
    %               Modulo
    **              Power
    <<              Bit shift left
    >>              Bit shift right
    &               Bit-wise AND
    |               Bit-wise OR
    ^               Bit-wise XOR
    ~               Bit-wise NOT
    abs(x)          Absolute value
    pow(x,y[,z])    Power with optional modulo (x**y)%z
    divmod(x,y)     Division with remainder

Integer Division
Classic division (/) - truncates
    >>> 5/4
    1
    >>>
Floor division (//) - truncates (same)
    >>> 5//4
    1
    >>>
Future division (/) - Converts to float
    >>> from __future__ import division
    >>> 5/4
    1.25
Will change in some future Python version
If truncation is intended, use //
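One detail worth checking when relying on //: for negative operands, Python's floor division rounds toward negative infinity rather than toward zero, so it is not the same as C-style truncation. The sketch below illustrates the difference. The helper names floor_div and truncating_div are hypothetical, and the code avoids the / operator on integers so that it behaves identically on Python 2 and Python 3.

```python
def floor_div(x, y):
    """Floor division: rounds the quotient down toward negative infinity."""
    return x // y

def truncating_div(x, y):
    """C-style division: computes a float quotient, then drops the fraction."""
    return int(float(x) / y)

for x, y in [(5, 4), (-5, 4), (7, 2), (-7, 2)]:
    print("%d // %d = %d    truncated = %d    divmod = %s"
          % (x, y, floor_div(x, y), truncating_div(x, y), divmod(x, y)))
```

For positive operands the two agree (5 // 4 is 1 either way); they differ only when the exact quotient is negative and not a whole number.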

Floating point (float)
Use a decimal or exponential notation
    a = 37.45
    b = 4e5
    c = -1.345e-10
Represented as double precision using the native CPU representation (IEEE 754)
    17 digits of precision
    Exponent from -308 to 308
Same as the C double type

Floating point
Be aware that floating point numbers are inexact when representing decimal values.
    >>> a = 3.4
    >>> a
    3.3999999999999999
    >>>
This is not Python, but the underlying floating point hardware on the CPU.
When inspecting data at the interactive prompt, you will always see the exact representation (print output may differ)
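A practical consequence of this inexactness is that comparing floats with == can give surprising results. A common workaround is to compare within a small tolerance. The sketch below is one way to do that: the function name nearly_equal and the default tolerance of 1e-9 are assumptions chosen for illustration, not something prescribed by the course. print is parenthesized so the code runs on Python 2 and Python 3.

```python
def nearly_equal(x, y, tolerance=1e-9):
    """Compare two floats, allowing for a small rounding error."""
    return abs(x - y) <= tolerance

total = 0.1 + 0.2

exact_match = (total == 0.3)            # False: neither value is stored exactly
close_match = nearly_equal(total, 0.3)  # True: they differ only by rounding error

print("exact comparison:     %s" % exact_match)
print("tolerance comparison: %s" % close_match)
print("formatted for output: %.2f" % 3.4)
```

For display purposes, formatting with a fixed number of digits (as in the last line) hides the rounding noise.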

Floating Point Operators


    +               Add
    -               Subtract
    *               Multiply
    /               Divide
    %               Modulo (remainder)
    **              Power
    pow(x,y [,z])   Power modulo (x**y)%z (the z argument is only accepted when all arguments are integers)
    abs(x)          Absolute value
    divmod(x,y)     Division with remainder
Additional functions are in the math module
    import math
    a = math.sqrt(x)
    b = math.sin(x)
    c = math.cos(x)
    d = math.tan(x)
    e = math.log(x)

Converting Numbers
Type name can be used to convert
    a = int(x)       # Convert x to integer
    b = long(x)      # Convert x to long
    c = float(x)     # Convert x to float
Example:
    >>> a = 3.14159
    >>> int(a)
    3
    >>>
Also work with strings containing numbers
    >>> a = "3.14159"
    >>> float(a)
    3.1415899999999999
    >>> int("0xff",16)       # Optional integer base
    255
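When converting strings, int() only accepts text that looks like an integer; a string such as "3.14159" makes it raise ValueError. The sketch below shows one possible way to handle both cases. The helper name to_number is hypothetical, and the try-except statement it uses is covered later in this section.

```python
def to_number(text):
    """Convert text to an int when possible, otherwise to a float.

    Raises ValueError if the text is not numeric at all.
    """
    try:
        return int(text)
    except ValueError:
        return float(text)

whole = to_number("42")           # an integer
fraction = to_number("3.14159")   # falls back to float
from_hex = int("0xff", 16)        # optional base, as on the slide
from_binary = int("101", 2)       # base 2
truncated = int(3.99)             # float -> int drops the fractional part

print("%r %r %r %r %r" % (whole, fraction, from_hex, from_binary, truncated))
```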

Strings
Written in programs with quotes
    a = "Yeah but no but yeah but..."
    b = 'computer says no'
    c = '''
    Look into my eyes, look into my eyes,
    the eyes, the eyes, the eyes,
    not around the eyes,
    don't look around the eyes,
    look into my eyes, you're under.
    '''
Standard escape characters work (e.g., '\n')
Triple quotes capture all literal text enclosed

String Escape Codes


In quotes, standard escape codes work
    '\n'      Line feed
    '\r'      Carriage return
    '\t'      Tab
    '\xhh'    Hexadecimal value
    '\''      Literal quote
    '\\'      Backslash
Raw strings (don't interpret escape codes)
    a = r"c:\newdata\test"     # String exactly as specified (note the leading r)

String Representation
An ordered sequence of bytes (characters)
Stores 8-bit data (ASCII)
May contain binary data, control characters, etc.
Strings are frequently used for both text and for raw-data of any kind

String Representation
Strings work like an array : s[n]
    a = "Hello world"
    b = a[0]         # b = 'H'
    c = a[4]         # c = 'o'
    d = a[-1]        # d = 'd' (Taken from end of string)
Slicing/substrings : s[start:end]
    d = a[:5]        # d = "Hello"
    e = a[6:]        # e = "world"
    f = a[3:8]       # f = "lo wo"
    g = a[-5:]       # g = "world"
Concatenation (+)
    a = "Hello" + "World"
    b = "Say " + a
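Two properties of slicing are easy to verify by experiment: the end index is not included in the result, and splitting a string at any position and joining the halves gives back the original. The following sketch uses the same "Hello world" string as the slide; the variable names are made up for the example, and print is parenthesized so the code runs on Python 2 and Python 3.

```python
a = "Hello world"

first_word = a[:5]       # characters 0..4; index 5 (the space) is excluded
last_word = a[6:]        # from index 6 to the end
middle = a[3:8]          # characters 3..7
tail = a[-5:]            # the last five characters

# Splitting at any position and concatenating restores the string
rejoined = a[:5] + a[5:]

# Slices that run past the end are clipped instead of raising an error
clipped = a[6:100]

print("%r %r %r %r" % (first_word, last_word, middle, tail))
print("%r %r" % (rejoined, clipped))
```

By contrast, indexing a single character past the end (for example a[100]) raises IndexError.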

More String Operations


Length (len)
    >>> s = "Hello"
    >>> len(s)
    5
    >>>
Membership test (in)
    >>> 'e' in s
    True
    >>> 'x' in s
    False
    >>> "ello" in s
    True
Replication (s*n)
    >>> s = "Hello"
    >>> s*5
    'HelloHelloHelloHelloHello'
    >>>

String Methods
Strings have "methods" that perform various operations with the string data.
Stripping any leading/trailing whitespace
    t = s.strip()
Case conversion
    t = s.lower()
    t = s.upper()
Replacing text
    t = s.replace("Hello","Hallo")

More String Methods


    s.endswith(suffix)     # Check if string ends with suffix
    s.find(t)              # First occurrence of t in s (-1 if not found)
    s.index(t)             # First occurrence of t in s (ValueError if not found)
    s.isalpha()            # Check if characters are alphabetic
    s.isdigit()            # Check if characters are numeric
    s.islower()            # Check if characters are lower-case
    s.isupper()            # Check if characters are upper-case
    s.join(slist)          # Joins lists using s as delimiter
    s.lower()              # Convert to lower case
    s.replace(old,new)     # Replace text
    s.rfind(t)             # Search for t from end of string (-1 if not found)
    s.rindex(t)            # Search for t from end of string (ValueError if not found)
    s.split([delim])       # Split string into list of substrings
    s.startswith(prefix)   # Check if string starts with prefix
    s.strip()              # Strip leading/trailing space
    s.upper()              # Convert to upper case
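A few of these methods are easiest to understand side by side. The sketch below contrasts find() with index() when the text is missing, and shows that join() is the inverse of split(). The sample string and variable names are made up for illustration; print is parenthesized so the code runs on Python 2 and Python 3.

```python
s = "Hello World"

pos_found = s.find("World")       # index where the match starts
pos_missing = s.find("Python")    # find() signals "not found" with -1

# index() reports a missing substring by raising ValueError instead
try:
    s.index("Python")
    index_raised = False
except ValueError:
    index_raised = True

words = s.split()                 # split on whitespace
joined = "-".join(words)          # the delimiter string does the joining

print("%d %d %s" % (pos_found, pos_missing, index_raised))
print("%r %r" % (words, joined))
```

Which of find() or index() is more convenient depends on whether a missing substring is an expected case or a genuine error in the program.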

String Mutability
Strings are "immutable" (read only)
Once created, the value can't be changed
    >>> s = "Hello World"
    >>> s[1] = 'a'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'str' object does not support item assignment
    >>>
All operations and methods that manipulate string data always create new strings

String Conversions
To convert any object to string
    s = str(obj)
Produces the same text as print
Actually, print uses str() for output
    >>> x = 42
    >>> str(x)
    '42'
    >>>

Exercise 1.4

Time: 10 minutes


String Splitting
Strings often represent fields of data
To work with each field, split into a list
    >>> line = 'GOOG 100 490.10'
    >>> fields = line.split()
    >>> fields
    ['GOOG', '100', '490.10']
    >>>
Example: When reading data from a file, you might read each line and then split the line into columns.
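Note that split() always produces strings, so numeric fields still need to be converted before doing arithmetic with them. A minimal sketch of that step follows, using the same sample line as the slide. The function name parse_holding and the meaning assigned to each column (name, shares, price) are assumptions made for the example.

```python
def parse_holding(line):
    """Split a 'NAME SHARES PRICE' line and convert each field to a useful type."""
    fields = line.split()
    name = fields[0]              # stays a string
    shares = int(fields[1])       # '100'    -> 100
    price = float(fields[2])      # '490.10' -> 490.1
    return name, shares, price

name, shares, price = parse_holding('GOOG 100 490.10')
cost = shares * price
print("%s: %d shares, total cost %.2f" % (name, shares, cost))
```

Without the conversions, an expression like fields[1] * fields[2] would fail, because strings cannot be multiplied together.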

Lists
An array of arbitrary values
    names = [ "Elwood", "Jake", "Curtis" ]
    nums = [ 39, 38, 42, 65, 111]
Can contain mixed data types
    items = [ "Elwood", 39, 1.5 ]
Adding new items (append, insert)
    items.append("that")         # Adds at end
    items.insert(2,"this")       # Inserts in middle
Concatenation : s + t
    s = [1,2,3]
    t = ['a','b']
    s + t                        # [1,2,3,'a','b']

Lists (cont)
Lists are indexed by integers (starting at 0)
    names = [ "Elwood", "Jake", "Curtis" ]
    names[0]         # "Elwood"
    names[1]         # "Jake"
    names[2]         # "Curtis"
Negative indices are from the end
    names[-1]        # "Curtis"
Changing one of the items
    names[1] = "Joliet Jake"

More List Operations


Length (len)
    >>> s = ['Elwood','Jake','Curtis']
    >>> len(s)
    3
    >>>
Membership test (in)
    >>> 'Elwood' in s
    True
    >>> 'Britney' in s
    False
    >>>
Replication (s*n)
    >>> s = [1,2,3]
    >>> s*3
    [1, 2, 3, 1, 2, 3, 1, 2, 3]
    >>>

List Removal
Removing an item
    names.remove("Curtis")
Deleting an item by index
    del names[2]
Removal results in items moving down to fill the space vacated (i.e., no "holes").
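The difference between removing by value and deleting by index can be seen in a short experiment. In this sketch the list contents are made up for illustration (one name is deliberately repeated), and print is parenthesized so the code runs on Python 2 and Python 3.

```python
names = ["Elwood", "Jake", "Curtis", "Jake"]

# remove() deletes only the first item equal to the given value
names.remove("Jake")
after_remove = list(names)        # copy of the current contents

# del removes whatever is at a position; later items shift down
del names[0]
after_del = list(names)

# Removing a value that is not present raises ValueError
try:
    names.remove("Britney")
    missing_raised = False
except ValueError:
    missing_raised = True

print("%r %r %s" % (after_remove, after_del, missing_raised))
```

Because items shift down after each removal, indices noted before a removal may no longer refer to the same items afterwards.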

Exercise 1.5

Time: 10 minutes


File Input and Output


Opening a file
    f = open("foo.txt","r")       # Open for reading
    g = open("bar.txt","w")       # Open for writing
To read data
    line = f.readline()           # Read a line of text
    data = f.read()               # Read all data
To write text to a file
    g.write("some text")
To print to a file
    print >>g, "Your name is", name

Looping over a file
Reading a file line by line
    f = open("foo.txt","r")
    for line in f:
        # Process the line
        ...
    f.close()
Alternatively
    for line in open("foo.txt","r"):
        # Process the line
        ...
This reads all lines until you reach the end of the file
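Putting the pieces together, a typical "process the line" step splits each line into fields and accumulates a result. The sketch below is self-contained: it first writes a small temporary file and then reads it back line by line. The file format (name, shares, price separated by spaces), the function name total_cost, and the use of the tempfile module are all assumptions made for the example; they are not taken from the course data files. print is parenthesized so the code runs on Python 2 and Python 3.

```python
import os
import tempfile

def total_cost(filename):
    """Sum shares * price over every line of a 'NAME SHARES PRICE' file."""
    total = 0.0
    f = open(filename, "r")
    for line in f:
        fields = line.split()
        if len(fields) != 3:          # skip blank or malformed lines
            continue
        total += int(fields[1]) * float(fields[2])
    f.close()
    return total

# Create a small sample file to read (hypothetical data layout)
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
g = open(path, "w")
g.write("GOOG 100 490.10\n")
g.write("IBM 50 91.10\n")
g.close()

result = total_cost(path)
os.remove(path)                       # clean up the temporary file
print("Total cost: %0.2f" % result)
```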

Exercise 1.6

Time: 10 minutes


Type Conversion
In Python, you must be careful about converting data to an appropriate type
    x = '37'          # Strings
    y = '42'
    z = x + y         # z = '3742' (concatenation)

    x = 37
    y = 42
    z = x + y         # z = 79 (integer +)
This differs from Perl where "+" is assumed to be numeric arithmetic (even on strings)
    $x = '37';
    $y = '42';
    $z = $x + $y;     # $z = 79

Simple Functions
Use functions for code you want to reuse
    def sumcount(n):
        total = 0
        while n > 0:
            total += n
            n -= 1
        return total
Calling a function
    a = sumcount(100)
A function is just a series of statements that perform some task and return a result
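One way to gain confidence in a function like sumcount() is to compare it against an independent calculation. The sum 1 + 2 + ... + n has the well-known closed form n*(n+1)/2, so the two should always agree. In the sketch below, sumcount is copied from the slide, while sumcount_formula is a hypothetical helper added for the comparison. print is parenthesized so the code runs on Python 2 and Python 3.

```python
def sumcount(n):
    """Add up n + (n-1) + ... + 1 with a loop (as on the slide)."""
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total

def sumcount_formula(n):
    """The same sum from the closed form n*(n+1)/2 (integer division)."""
    return n * (n + 1) // 2

a = sumcount(100)
b = sumcount_formula(100)
print("loop: %d, formula: %d" % (a, b))
```

Note that assigning to n inside the function does not affect any variable in the caller; the function works on its own local name.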

Library Functions
Python comes with a large standard library
Library modules accessed using import
    import math
    x = math.sqrt(10)

    import urllib
    u = urllib.urlopen("http://www.python.org/index.html")
    data = u.read()
Will cover in more detail later

Exception Handling
Errors are reported as exceptions
An exception causes the program to stop
    >>> f = open("file.dat","r")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IOError: [Errno 2] No such file or directory: 'file.dat'
    >>>
For debugging, message describes what happened, where the error occurred, along with a traceback.

Exceptions
Exceptions can be caught and handled
To catch, use try-except statement
    try:
        f = open(filename,"r")
    except IOError:
        print "Could not open", filename
Name must match the kind of error you're trying to catch
    >>> f = open("file.dat","r")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IOError: [Errno 2] No such file or directory: 'file.dat'
    >>>
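The point about matching the exception name can be demonstrated directly: an except IOError clause handles a missing file, but it does nothing for an unrelated error such as a failed number conversion. The sketch below is illustrative only; the function names read_or_default and mismatched, and the file name used, are made up for the example. print is parenthesized so the code runs on Python 2 and Python 3.

```python
def read_or_default(filename, default=""):
    """Return the contents of a file, or `default` if it cannot be opened."""
    try:
        f = open(filename, "r")
    except IOError:
        return default
    data = f.read()
    f.close()
    return data

def mismatched():
    """The except clause names IOError, but int() raises ValueError."""
    try:
        return int("not a number")
    except IOError:
        return "caught"

contents = read_or_default("no_such_file_for_this_example.dat", "missing")

# The ValueError is not handled inside mismatched(), so it reaches the caller
try:
    mismatched()
    escaped = False
except ValueError:
    escaped = True

print("%r %s" % (contents, escaped))
```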

Exceptions
To raise an exception, use the raise statement
    raise RuntimeError("What a kerfuffle")
Will cause the program to abort with an exception traceback (unless caught by try-except)
    % python foo.py
    Traceback (most recent call last):
      File "foo.py", line 21, in <module>
        raise RuntimeError("What a kerfuffle")
    RuntimeError: What a kerfuffle

dir() function
dir(obj) returns all names defined in obj
    >>> import sys
    >>> dir(sys)
    ['__displayhook__', '__doc__', '__excepthook__',
     '__name__', '__stderr__', '__stdin__', '__stdout__',
     '_current_frames', '_getframe', 'api_version', 'argv',
     'builtin_module_names', 'byteorder', 'call_tracing',
     'callstats', 'copyright', 'displayhook', 'exc_clear',
     'exc_info', 'exc_type', 'excepthook', 'exec_prefix',
     'executable', 'exit', 'getcheckinterval',
     ...
     'version_info', 'warnoptions']
Useful for exploring, inspecting objects, etc.

dir() Example
dir() will also tell you what methods/attributes are available on an object.
    >>> a = "Hello World"
    >>> dir(a)
    ['__add__','__class__', ..., 'capitalize', 'center',
     'count', 'decode', 'encode', 'endswith', 'expandtabs',
     'find', 'index', 'isalnum', 'isalpha', 'isdigit',
     'islower', 'isspace', 'istitle', 'isupper', 'join',
     ...
     'rstrip', 'split', 'splitlines', 'startswith', 'strip',
     'swapcase', 'title', 'translate', 'upper', 'zfill']
    >>>
Note: you may see a lot of names with leading/trailing underscores. These have special meaning to Python--ignore for now.

Summary
This has been an overview of simple Python
Enough to write basic programs
Just have to know the core datatypes and a few basics (loops, conditions, etc.)

Exercise 1.7

Time: 15 minutes


Section 2

Working with Data

Overview
Most programs work with data
In this section, we look at how Python programmers represent and work with data
Common programming idioms
How to (not) shoot yourself in the foot

Primitive Datatypes
Python has a few primitive types of data
Integers
Floating point numbers
Strings (text)
Obviously, all programs use these

None type
Nothing, nil, null, nada
    logfile = None
This is often used as a placeholder for optional program settings or values
    if logfile:
        logfile.write("Some message")
If you don't assign logfile to something, the above code would crash (undefined variable)

Data Structures
Real programs have more complex data
Example: A holding of stock
    100 shares of GOOG at $490.10
An "object" with three parts
    Name ("GOOG", a string)
    Number of shares (100, an integer)
    Price (490.10, a float)

Tuples
A collection of values grouped together
Example:
    s = ('GOOG',100,490.10)
Sometimes the () are omitted in syntax
    s = 'GOOG', 100, 490.10
Special cases (0-tuple, 1-tuple)
    t = ()                # An empty tuple
    w = ('GOOG',)         # A 1-item tuple

Tuple Use
Tuples are usually used to represent simple records and data structures
    contact = ('David Beazley','dave@dabeaz.com')
    stock = ('GOOG', 100, 490.10)
    host = ('www.python.org', 80)
Basically, a simple "object" of multiple parts

Tuples (cont)
Tuple contents are ordered (like an array)
    s = ('GOOG',100, 490.10)
    name = s[0]          # 'GOOG'
    shares = s[1]        # 100
    price = s[2]         # 490.10
However, the contents can't be modified
    >>> s[1] = 75
    TypeError: object does not support item assignment
Although you can always create a new tuple
    t = (s[0], 75, s[2])

Tuple Packing
Tuples are really focused on packing and unpacking data into variables, not storing distinct items in a list
Packing multiple values into a tuple
    s = ('GOOG', 100, 490.10)
The tuple is then easy to pass around to other parts of a program as a single object

Tuple Unpacking
To use the tuple elsewhere, you typically unpack its parts into variables
Unpacking values from a tuple
    (name, shares, price) = s
    print "Cost", shares*price
Note: The () syntax is sometimes omitted
    name, shares, price = s
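Unpacking requires the number of names on the left to match the number of items in the tuple; otherwise Python raises ValueError. The sketch below shows both the normal case and the mismatch, using the same stock tuple as the slides. The variable names beyond those on the slide are made up for the example, and print is parenthesized so the code runs on Python 2 and Python 3.

```python
s = ('GOOG', 100, 490.10)

# Normal case: three names for three items
name, shares, price = s
cost = shares * price

# Mismatch: two names for three items raises ValueError
try:
    first, second = s
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Packing and unpacking in a single statement swaps two variables
x, y = 1, 2
x, y = y, x

print("%s cost %.2f, mismatch detected: %s, x=%d y=%d"
      % (name, cost, mismatch_raised, x, y))
```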

Tuple Commentary
Tuples are a fundamental part of Python
Used for critical parts of the interpreter
Highly optimized
Key point : Compound data, but not used to represent a list of distinct "objects"

Tuples vs. Lists


Can't you just use a list instead of a tuple?
    s = ['GOOG', 100, 490.10]
Well, yes, but it's not quite as efficient
The implementation of lists is optimized for growth using the append() method.
There is often a little extra space at the end
So using a list requires a bit more memory

Dictionaries
A hash table or associative array
A collection of values indexed by "keys"
The keys are like field names
Example:
    s = {
        'name' : 'GOOG',
        'shares' : 100,
        'price' : 490.10
    }

Dictionaries
Getting values: Just use the key names
    >>> print s['name'],s['shares']
    GOOG 100
    >>> s['price']
    490.10
    >>>
Adding/modifying values : Assign to key names
    >>> s['shares'] = 75
    >>> s['date'] = '6/6/2007'
    >>>
Deleting a value
    >>> del s['date']
    >>>

Dictionaries
When to use a dictionary as a data structure
Data has many different parts
The parts will be modified/manipulated
Example: If you were reading data from a database and each row had 50 fields, a dictionary could store the contents of each row using descriptive field names.

Exercise 2.1

Time : 10 minutes


Containers
Programs often have to work with many objects
A portfolio of stocks
Spreadsheets and matrices
Three choices:
    Lists (ordered data)
    Dictionaries (unordered data)
    Sets (unordered collection)

Lists as a Container
Use a list when the order of data matters
Lists can hold any kind of object
Example: A list of tuples

portfolio = [
    ('GOOG',100,490.10),
    ('IBM',50,91.10),
    ('CAT',150,83.44)
]

portfolio[0]      # ('GOOG',100,490.10)
portfolio[1]      # ('IBM',50,91.10)

Dicts as a Container
Dictionaries are useful if you want fast random lookups (by key name)
Example: A dictionary of stock prices

prices = {
    'GOOG' : 513.25,
    'CAT' : 87.22,
    'IBM' : 93.37,
    'MSFT' : 44.12,
    ...
}

>>> prices['IBM']
93.37
>>> prices['GOOG']
513.25
>>>

Dict : Looking For Items


To test for existence of a key

if key in d:
    # Yes
else:
    # No

Looking up a value that might not exist

name = d.get(key,default)

Example:

>>> prices.get('IBM',0.0)
93.37
>>> prices.get('SCOX',0.0)
0.0
>>>
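
The two lookup styles above can be compared side by side. This is a minimal sketch using the prices from the slides; the function names lookup_with_test and lookup_with_get are made up for illustration, and the code is written to run on Python 3 as well.

```python
prices = {'GOOG': 513.25, 'CAT': 87.22, 'IBM': 93.37, 'MSFT': 44.12}

def lookup_with_test(name):
    # Explicit membership test before indexing
    if name in prices:
        return prices[name]
    else:
        return 0.0

def lookup_with_get(name):
    # get() returns the default instead of raising KeyError
    return prices.get(name, 0.0)

print(lookup_with_test('IBM'))      # 93.37
print(lookup_with_get('SCOX'))      # 0.0
```

Both functions return the same result for every key; get() just says it in one line.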

Dicts and Lists


You often get input data as a list of pairs

File:

"GOOG",513.25
"CAT",87.22
"IBM",93.37
"MSFT",44.12
...

Tuples:

[ ('GOOG',513.25), ('CAT',87.22), ('IBM',93.37), ('MSFT',44.12), ... ]

dict() promotes a list of pairs to a dictionary

pricelist = [('GOOG',513.25),('CAT',87.22),
             ('IBM',93.37),('MSFT',44.12)]
prices = dict(pricelist)
print prices['IBM']
print prices['MSFT']
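
The step from file lines to tuples to a dictionary might look like the sketch below. The lines list stands in for the contents of a file (an assumption, so the example runs without any data files), and the code is written to run on Python 3 as well.

```python
# Lines as they might appear in a prices file (sample data from the slide)
lines = [
    '"GOOG",513.25',
    '"CAT",87.22',
    '"IBM",93.37',
    '"MSFT",44.12',
]

pricelist = []
for line in lines:
    fields = line.split(',')
    name = fields[0].strip('"')          # Remove the surrounding quotes
    pricelist.append((name, float(fields[1])))

prices = dict(pricelist)                 # List of pairs -> dictionary
print(prices['IBM'])                     # 93.37
```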

Sets
Sets

a = set([2,3,4])

Holds collection of unordered items
No duplicates, support common set ops

>>> a = set([2,3,4])
>>> b = set([4,8,9])
>>> a | b               # Union
set([2,3,4,8,9])
>>> a & b               # Intersection
set([4])
>>> a - b               # Difference
set([2,3])
>>>

Exercise 2.2

Time : 20 minutes


Formatted Output
When working with data, you often want to produce structured output (tables, etc.).

      Name     Shares      Price
---------- ---------- ----------
        AA        100      32.20
       IBM         50      91.10
       CAT        150      83.44
      MSFT        200      51.23
        GE         95      40.37
      MSFT         50      65.10
       IBM        100      70.44

String Formatting
Formatting operator (%)

>>> "The value is %d" % 3
'The value is 3'
>>> "%5d %-5d %10d" % (3,4,5)
'    3 4              5'
>>> "%0.2f" % (3.1415926,)
'3.14'

Requires single item or a tuple on right
Commonly used with print

print "%d %0.2f %s" % (index,val,label)

Format codes are same as with C printf()

Format Codes
%d       Decimal integer
%u       Unsigned integer
%x       Hexadecimal integer
%f       Float as [-]m.dddddd
%e       Float as [-]m.dddddde+-xx
%g       Float, but selective use of E notation
%s       String
%c       Character
%%       Literal %

%10d     Decimal in a 10-character field (right align)
%-10d    Decimal in a 10-character field (left align)
%0.2f    Float with 2 digit precision
%40s     String in a 40-character field (right align)
%-40s    String in a 40-character field (left align)
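
As an illustration, the field-width codes can be combined to produce a table like the one on the Formatted Output slide. This is a sketch: format_table is a made-up helper name, the three rows are sample data from the slides, and the code is written to run on Python 3 as well.

```python
portfolio = [
    ('AA', 100, 32.20),
    ('IBM', 50, 91.10),
    ('CAT', 150, 83.44),
]

def format_table(rows):
    # Build the report as a list of lines:
    # header, separator, then one line per row
    lines = []
    lines.append("%10s %10s %10s" % ('Name', 'Shares', 'Price'))
    lines.append(" ".join(["-" * 10] * 3))
    for name, shares, price in rows:
        lines.append("%10s %10d %10.2f" % (name, shares, price))
    return lines

for line in format_table(portfolio):
    print(line)
```

Every line comes out 32 characters wide (three 10-character fields plus two separating spaces), which is what keeps the columns aligned.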

String Formatting
Formatting with fields in a dictionary

>>> stock = {
...     'name' : 'GOOG',
...     'price' : 490.10,
...     'shares' : 100 }
>>> print "%(name)8s %(shares)10d %(price)10.2f" % stock
    GOOG        100     490.10
>>>

Useful if performing replacements with a large number of named values
This is Python's version of "string interpolation" ($variables in other langs)

Exercise 2.3

Time : 20 minutes


Working with Sequences


Python has three "sequence" datatypes

a = "Hello"                  # String
b = [1,4,5]                  # List
c = ('GOOG',100,490.10)      # Tuple

Sequences are ordered : s[n]

a[0]       # 'H'
b[-1]      # 5
c[1]       # 100

Sequences have a length : len(s)

len(a)     # 5
len(b)     # 3
len(c)     # 3

Working with Sequences


Sequences can be replicated : s * n

>>> a = 'Hello'
>>> a * 3
'HelloHelloHello'
>>> b = [1,2,3]
>>> b * 2
[1, 2, 3, 1, 2, 3]
>>>

Similar sequences can be concatenated : s + t

>>> a = (1,2,3)
>>> b = (4,5)
>>> a + b
(1,2,3,4,5)
>>>

Sequence Slicing
Slicing operator : s[start:end]

a = [0,1,2,3,4,5,6,7,8]

a[2:5]      # [2,3,4]
a[-5:]      # [4,5,6,7,8]
a[:3]       # [0,1,2]

Indices must be integers
Slices do not include end value
If indices are omitted, they default to the beginning or end of the list.

Extended Slices
Extended slicing: s[start:end:step]

a = [0,1,2,3,4,5,6,7,8]

a[:5]        # [0,1,2,3,4]
a[-5:]       # [4,5,6,7,8]
a[0:5:2]     # [0,2,4]
a[::-2]      # [8,6,4,2,0]
a[6:2:-1]    # [6,5,4,3]

step indicates stride and direction
end index is not included in result
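
A few more slices, as a small sketch of the rules above. The variable names are illustrative only, and the code is written to run on Python 3 as well.

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8]

first_three = a[:3]         # [0, 1, 2]
last_five = a[-5:]          # [4, 5, 6, 7, 8]
every_other = a[::2]        # [0, 2, 4, 6, 8]
reversed_copy = a[::-1]     # [8, 7, 6, 5, 4, 3, 2, 1, 0]

# Slicing works the same way on strings and tuples
word = "Hello"
print(word[1:4])            # ell
print(word[::-1])           # olleH

# A slice is a new object; changing it leaves the original alone
first_three.append(99)
print(a)                    # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```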

Sequence Reductions
sum(s)

>>> s = [1, 2, 3, 4]
>>> sum(s)
10
>>>

min(s), max(s)

>>> min(s)
1
>>> max(s)
4
>>> t = ['Hello','World']      # t is not defined on the slide; assumed here
>>> max(t)
'World'
>>>

Iterating over a Sequence


The for-loop iterates over sequence data

>>> s = [1, 4, 9, 16]
>>> for i in s:
...     print i
...
1
4
9
16
>>>

On each iteration of the loop, you get a new item of data to work with.

Iteration Variables
Each time through the loop, a new value is placed into an iteration variable

for x in s:          # x is the iteration variable
    statements

Overwrites the previous value (if any)
After the loop finishes, the variable has the value from the last iteration of the loop

x = 42
for x in s:          # Overwrites any previous x
    statements
print x              # Prints value from last iteration

break and continue


Breaking out of a loop (exiting)

for name in namelist:
    if name == username:
        break

Jumping to the start of the next iteration

for line in lines:
    if not line:
        continue
    # More statements
    ...

These statements only apply to the innermost loop that is active

Looping over integers


If you simply need to count, use xrange()

xrange([start,] end [,step])

for i in xrange(100):
    # i = 0,1,...,99
for j in xrange(10,20):
    # j = 10,11,...,19
for k in xrange(10,50,2):
    # k = 10,12,...,48

Note: The ending value is never included (this mirrors the behavior of slices)

Caution with range()


range([start,] end [,step])

x = range(100)          # x = [0,1,...,99]
y = range(10,20)        # y = [10,11,...,19]
z = range(10,50,2)      # z = [10,12,...,48]

range() creates a list of integers
You will sometimes see this code

for i in range(N):
    statements

If you are only looping, use xrange() instead. It computes its values on demand instead of creating a list.

enumerate() Function
Provides a loop counter value

names = ["Elwood","Jake","Curtis"]
for i,name in enumerate(names):
    # Loops with i = 0, name = 'Elwood'
    #            i = 1, name = 'Jake'
    #            i = 2, name = 'Curtis'
    ...

Example: Keeping a line number

for linenumber,line in enumerate(open(filename)):
    ...

enumerate() Function
enumerate() is a nice shortcut

for i,x in enumerate(s):
    statements

Compare to:

i = 0
for x in s:
    statements
    i += 1

Less typing and enumerate() runs slightly faster
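
One common use of the counter is reporting which lines of input are bad. In the sketch below, the lines list is made-up sample data standing in for a file, and the optional second argument to enumerate() (the starting count, available in Python 2.6 and later) makes the numbering start at 1. The code is written to run on Python 3 as well.

```python
lines = ["GOOG 100 490.10", "", "IBM fifty 91.10", "CAT 150 83.44"]

bad_lines = []
# Passing 1 as the start value gives human-style line numbers
for lineno, line in enumerate(lines, 1):
    fields = line.split()
    # Flag lines without exactly 3 fields or with a non-numeric share count
    if len(fields) != 3 or not fields[1].isdigit():
        bad_lines.append(lineno)

print(bad_lines)            # [2, 3]
```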

for and tuples


Looping with multiple iteration variables

points = [ (1,4),(10,40),(23,14),(5,6),(7,8) ]
for x,y in points:
    # Loops with x = 1, y = 4
    #            x = 10, y = 40
    #            x = 23, y = 14
    #            ...

Here, each tuple is unpacked into a set of iteration variables.

zip() Function
Combines multiple sequences into tuples

a = [1,4,9]
b = ['Jake','Elwood','Curtis']
x = zip(a,b)      # x = [(1,'Jake'),(4,'Elwood'), ...]

One use: Looping over multiple lists

a = [1,4,9]
b = ['Jake','Elwood','Curtis']
for num,name in zip(a,b):
    # Statements
    ...

zip() Function
zip() always stops with shortest sequence

a = [1,2,3,4,5,6]
b = ['Jake','Elwood']
x = zip(a,b)      # x = [(1,'Jake'),(2,'Elwood')]

You may combine as many sequences as needed

a = [1, 2, 3, 4, 5, 6]
b = ['a', 'b', 'c']
c = [10, 20, 30]
x = zip(a,b,c)          # x = [(1,'a',10),(2,'b',20),...]
y = zip(a,zip(b,c))     # y = [(1,('a',10)),(2,('b',20)),...]

Using zip()
zip() is also a nice programming shortcut

for a,b in zip(s,t):
    statements

Compare to:

i = 0
while i < len(s) and i < len(t):
    a = s[i]
    b = t[i]
    statements
    i += 1

Caveat: zip() constructs a list of tuples. Be careful when working with large data
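
zip() also combines well with dict() when column names and values arrive as separate lists. This is a sketch with made-up sample data. Note that in Python 3, zip() produces its tuples on demand rather than building a list, which is why list() is applied before printing; the code runs on both versions.

```python
headers = ['name', 'shares', 'price']
row = ['GOOG', '100', '490.10']

# Pair each column name with its value, then promote the pairs to a dict
record = dict(zip(headers, row))
print(record['name'])                # GOOG

# zip() stops at the shortest input, so extra items are silently dropped
pairs = list(zip([1, 2, 3, 4], ['a', 'b']))
print(pairs)                         # [(1, 'a'), (2, 'b')]
```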

Exercise 2.4

Time : 10 minutes


List Sorting
Lists can be sorted "in-place" (sort method)

s = [10,1,7,3]
s.sort()                   # s = [1,3,7,10]

Sorting in reverse order

s = [10,1,7,3]
s.sort(reverse=True)       # s = [10,7,3,1]

Sorting works with any ordered type

s = ["foo","bar","spam"]
s.sort()                   # s = ["bar","foo","spam"]

List Sorting
Sometimes you need to perform extra processing while sorting
Example: Case-insensitive string sort

>>> s = ["hello","WORLD","test"]
>>> s.sort()
>>> s
['WORLD','hello','test']
>>>

Here, we might like to fix the order

List Sorting
Sorting with a key function:

>>> def tolower(x):
...     return x.lower()
...
>>> s = ["hello","WORLD","test"]
>>> s.sort(key=tolower)
>>> s
['hello','test','WORLD']
>>>

The key function is a "callback function" that the sort() method applies to each item
The value returned by the key function determines the sort order
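
Key functions are just as useful for sorting records by one of their fields. The sketch below sorts the portfolio tuples from the earlier slides; by_shares is a made-up name, and the code is written to run on Python 3 as well.

```python
portfolio = [('GOOG', 100, 490.10), ('IBM', 50, 91.10), ('CAT', 150, 83.44)]

def by_shares(holding):
    # Key function: sort() calls this once per item and compares the results
    return holding[1]

portfolio.sort(key=by_shares)
names_by_shares = [name for name, shares, price in portfolio]
print(names_by_shares)               # ['IBM', 'GOOG', 'CAT']

# sorted() accepts the same key and reverse options and returns a new list
by_price_desc = sorted(portfolio, key=lambda h: h[2], reverse=True)
print(by_price_desc[0][0])           # GOOG
```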

Sequence Sorting
sorted() function
Turns any sequence into a sorted list

>>> sorted("Python")
['P', 'h', 'n', 'o', 't', 'y']
>>> sorted((5,1,9,2))
[1, 2, 5, 9]

This is not an in-place sort--a new list is returned as a result.
The same key and reverse options can be supplied if necessary

Exercise 2.5

Time : 10 Minutes


List Processing
Working with lists is very common
Python is very adept at processing lists
Have already seen many examples:

>>> a = [1, 2, 3, 4, 5]
>>> sum(a)
15
>>> a[0:3]
[1, 2, 3]
>>> a * 2
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
>>> min(a)
1
>>>

List Comprehensions
Creates a new list by applying an operation to each element of a sequence.

>>> a = [1,2,3,4,5]
>>> b = [2*x for x in a]
>>> b
[2,4,6,8,10]
>>>

Another example:

>>> names = ['Elwood','Jake']
>>> a = [name.lower() for name in names]
>>> a
['elwood','jake']
>>>

List Comprehensions
A list comprehension can also filter

>>> a = [1, -5, 4, 2, -2, 10]
>>> b = [2*x for x in a if x > 0]
>>> b
[2,8,4,20]
>>>

Another example

>>> f = open("stockreport","r")
>>> goog = [line for line in f if 'GOOG' in line]
>>>

List Comprehensions
General syntax

[expression for x in s if condition]

What it means

result = []
for x in s:
    if condition:
        result.append(expression)

Can be used anywhere a sequence is expected

>>> a = [1,2,3,4]
>>> sum([x*x for x in a])
30
>>>

List Comp: Examples


List comprehensions are hugely useful
Collecting the values of a specific field

stocknames = [s['name'] for s in stocks]

Performing database-like queries

a = [s for s in stocks if s['price'] > 100
                       and s['shares'] > 50 ]

Quick mathematics over sequences

cost = sum([s['shares']*s['price'] for s in stocks])
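
To make the examples above concrete, here is a sketch with a small stocks list (sample data assumed for illustration) that also shows the equivalent explicit loop. The code is written to run on Python 3 as well.

```python
stocks = [
    {'name': 'GOOG', 'shares': 100, 'price': 490.10},
    {'name': 'IBM', 'shares': 50, 'price': 91.10},
    {'name': 'CAT', 'shares': 150, 'price': 83.44},
]

# Comprehension form: names of holdings with at least 100 shares
big = [s['name'] for s in stocks if s['shares'] >= 100]

# Equivalent explicit loop
big_loop = []
for s in stocks:
    if s['shares'] >= 100:
        big_loop.append(s['name'])

# Total cost of the portfolio
cost = sum([s['shares'] * s['price'] for s in stocks])

print(big)                  # ['GOOG', 'CAT']
print(big == big_loop)      # True
```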

Historical Digression
List comprehensions come from Haskell

a = [x*x for x in s if x > 0]      # Python
a = [x*x | x <- s, x > 0]          # Haskell

And this is motivated by sets (from math)

a = { x² | x ∈ s, x > 0 }

But most Python programmers would probably just view this as a "cool shortcut"

List Comp. and Awk


For Unix hackers, there is a certain similarity between list comprehensions and short one-line awk commands

# A Python List Comprehension
totalcost = sum([shares*price for name, shares, price in portfolio])

# A Unix awk command
totalcost = `awk '{ total += $2*$3 } END { print total }' portfolio.dat`

Applying an operation to every line of a file

Big Idea: Being Declarative


List comprehensions encourage a more "declarative" style of programming when processing sequences of data.
Data can be manipulated by simply "declaring" a series of statements that perform various operations on it.

Exercise 2.6

Time : 15 Minutes


More details on objects


So far: a tour of the most common types
Have skipped some critical details
Memory management
Copying
Type checking

Variable Assignment
Variables in Python are names for values
A variable name does not represent a fixed memory location into which values are stored (like C, C++, Fortran, etc.)
Assignment is just a naming operation

Variables and Values


At any time, a variable can be redefined to refer to a new value

a = 42            # "a" refers to 42
...
a = "Hello"       # "a" now refers to "Hello"

Variables are not restricted to one data type
Assignment doesn't overwrite the previous value (e.g., copy over it in memory)
It just makes the name point elsewhere

Names, Values, Types


Names do not have a "type"--it's just a name
However, values do have an underlying type

>>> a = 42
>>> b = "Hello World"
>>> type(a)
<type 'int'>
>>> type(b)
<type 'str'>

type() function will tell you what it is
The type name is usually a function that creates or converts a value to that type

>>> str(42)
'42'

Reference Counting
Variable assignment never copies anything!
Instead, it just updates a reference count

a = 42
b = a
c = [1,2]
c.append(b)       # "a", "b", and c[2] all refer to the same 42 (ref = 3)

So, different variables might be referring to the same object (check with the is operator)

>>> a is b
True
>>> a is c[2]
True

Reference Counting
Reassignment never overwrites memory, so you normally don't notice any of the sharing

a = 42
b = a             # "a" and "b" both refer to 42 (ref = 2)

a = 37            # "a" now refers to 37 (ref = 1); "b" still refers to 42 (ref = 1)

When you reassign a variable, the name is just made to point to the new value.

The Hidden Danger
"Copying" mutable objects such as lists and dicts

>>> a = [1,2,3,4]
>>> b = a
>>> b[2] = -10
>>> a
[1,2,-10,4]

Changes affect both variables!
Reason: Different variable names are referring to exactly the same object
Yikes!

Making a Copy
You have to take special steps to copy data

>>> a = [2,3,[100,101],4]
>>> b = list(a)              # Make a copy
>>> a is b
False

It's a new list, but the list items are shared

>>> a[2].append(102)
>>> b[2]
[100,101,102]
>>>

This inner list is still being shared
Known as a "shallow copy"

Deep Copying
Sometimes you need to make a copy of an object and all objects contained within it
Use the copy module

>>> a = [2,3,[100,101],4]
>>> import copy
>>> b = copy.deepcopy(a)
>>> a[2].append(102)
>>> b[2]
[100,101]
>>>
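
The three cases from the last few slides (no copy, shallow copy, deep copy) can be put side by side. This is a minimal sketch; the names alias, shallow, and deep are chosen for illustration, and the code is written to run on Python 3 as well.

```python
import copy

a = [2, 3, [100, 101], 4]

alias = a                     # Same object, no copy at all
shallow = list(a)             # New outer list, inner list still shared
deep = copy.deepcopy(a)       # New outer list and new inner list

a[2].append(102)              # Mutate the inner list through a
a[0] = -1                     # Replace an item in the outer list

print(alias[0])               # -1
print(shallow[0])             # 2
print(shallow[2])             # [100, 101, 102]
print(deep[2])                # [100, 101]
```

Only the deep copy is fully isolated from later changes made through a.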

Everything is an object
Numbers, strings, lists, functions, exceptions, classes, instances, etc...
All objects are said to be "first-class"
Meaning: All objects that can be named can be passed around as data, placed in containers, etc., without any restrictions.
There are no "special" kinds of objects

First Class Objects


A simple example:
A list containing a function, a module, and an exception.

>>> import math
>>> items = [abs, math, ValueError ]
>>> items
[<built-in function abs>,
 <module 'math' (builtin)>,
 <type 'exceptions.ValueError'>]
>>> items[0](-45)
45
>>> items[1].sqrt(2)
1.4142135623730951
>>> try:
...     x = int("not a number")
... except items[2]:
...     print "Failed!"
...
Failed!
>>>

You can use items in the list in place of the original names

Type Checking
How to tell if an object is a specific type

if type(a) is list:
    print "a is a list"

if isinstance(a,list):            # Preferred
    print "a is a list"

Checking for one of many types

if isinstance(a,(list,tuple)):
    print "a is a list or tuple"

Summary
Have looked at basic principles of working with data in Python programs
Brief look at part of the object-model
A big part of understanding most Python programs.

Exercise 2.7

Time : 15 Minutes


Section 3

Program Organization and Functions

Overview
How to organize larger programs
More details on program execution
Defining and working with functions
Exceptions and Error Handling

Observation
A large number of Python programmers spend most of their time writing short "scripts"
One-off problems, prototyping, testing, etc.
Python is good at this!
And it is what draws many users to Python

What is a "Script?"
A "script" is a program that simply runs a series of statements and stops

# program.py
statement1
statement2
statement3
...

We've been writing scripts to this point

Problem
If you write a useful script, it will grow features
You may apply it to other related problems
Over time, it might become a critical application
And it might turn into a huge tangled mess
So, let's get organized...

Program Structure
Recall: programs are a sequence of statements

height = 442                   # Meters
thickness = 0.1*(0.001)        # Meters (0.1 millimeter)
numfolds = 0
while thickness <= height:
    thickness = thickness * 2
    numfolds = numfolds + 1
    print numfolds, thickness
print numfolds, "folds required"
print "final thickness is", thickness, "meters"

Programs run by executing the statements in the same order that they are listed.

Defining Things
You must always define things before they get used later on in a program.

a = 42
b = a + 2             # Requires that a is already defined

def square(x):
    return x*x

z = square(b)         # Requires square to be defined

The order is important
You almost always put the definitions of variables and functions near the beginning

Defining Functions
It is a good idea to put all of the code related to a single "task" all in one place

def read_prices(filename):
    pricesDict = {}
    for line in open(filename):
        fields = line.split(',')
        name = fields[0].strip('"')
        pricesDict[name] = float(fields[1])
    return pricesDict

A function also simplifies repeated operations

oldprices = read_prices("oldprices.csv")
newprices = read_prices("newprices.csv")

What is a function?
A function is a sequence of statements

def funcname(args):
    statement
    statement
    ...
    statement

Any Python statement can be used inside

def foo():
    import math
    print math.sqrt(2)
    help(math)

There are no "special" statements in Python

Function Definitions
Functions can be defined in any order

def foo(x):
    bar(x)

def bar(x):
    statements

is equivalent to

def bar(x):
    statements

def foo(x):
    bar(x)

Functions must only be defined before they are actually used during program execution

foo(3)        # foo must be defined already

Stylistically, it is more common to see functions defined in a "bottom-up" fashion

Bottom-up Style
Functions are treated as building blocks
The smaller/simpler blocks go first

# myprogram.py
def foo(x):
    ...

def bar(x):
    ...
    foo(x)
    ...

def spam(x):
    ...
    bar(x)
    ...

spam(42)          # Call spam() to do something

Later functions build upon earlier functions
Code that uses the functions appears at the end

A Definition Caution
Functions can be redefined!

def foo(x):
    return 2*x

print foo(2)          # Prints 4

def foo(x,y):         # Redefine foo(). This replaces
    return x*y        # foo() above.

print foo(2,3)        # Prints 6
print foo(2)          # Error : foo takes two arguments

A repeated function definition silently replaces the previous definition
No overloaded functions (vs. C++, Java).

Exercise 3.1

Time : 15 minutes


Function Design
Functions should be easy to use
The whole point of a function is to define code for repeated operations
Some things to think about:
Function inputs and output
Side-effects

Function Arguments
Functions operate on passed arguments

def square(x):        # x is the argument
    return x*x

Argument variables receive their values when the function is called

a = square(3)

The argument names are only visible inside the function body (are local to function)

Default Arguments
Sometimes you want an optional argument

def read_prices(filename,delimiter=','):
    ...

If a default value is assigned, the argument is optional in function calls

d = read_prices("prices.csv")
e = read_prices("prices.dat",' ')

Note : Arguments with defaults must appear at the end of the argument list (all nonoptional arguments go first)

Calling a Function
Consider a simple function

def read_prices(filename,delimiter):
    ...

Calling with "positional" args

prices = read_prices("prices.csv",",")

This places values into the arguments in the same order as they are listed in the function definition (by position)

Keyword Arguments
Calling with "keyword" arguments

prices = read_prices(filename="price.csv",
                     delimiter=",")

Here, you explicitly name and assign a value to each argument
Common use case : functions with many optional features

def sort(data,reverse=False,key=None):
    statements

sort(s,reverse=True)

Mixed Arguments
Positional and keyword arguments can be mixed together in a function call

read_prices("prices.csv",delimiter=',')

The positional arguments must go first.
Basic rules:
All required arguments get values
No duplicate argument values
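
The calling styles can be tried out together. In the sketch below, read_prices is a simplified stand-in for the function on the slides: it takes a list of lines instead of a filename (an assumption, so the example runs without data files), and the quote argument is added only to show a second default. The code is written to run on Python 3 as well.

```python
def read_prices(lines, delimiter=',', quote='"'):
    # lines: any iterable of text lines (hypothetical stand-in for a file)
    prices = {}
    for line in lines:
        fields = line.split(delimiter)
        prices[fields[0].strip(quote)] = float(fields[1])
    return prices

csv_lines = ['"GOOG",513.25', '"IBM",93.37']
tab_lines = ['GOOG\t513.25', 'IBM\t93.37']

a = read_prices(csv_lines)                          # Defaults used
b = read_prices(tab_lines, '\t')                    # All positional
c = read_prices(tab_lines, delimiter='\t')          # Positional, then keyword
d = read_prices(delimiter='\t', lines=tab_lines)    # All keywords, any order

print(a == b == c == d)                             # True
```

Supplying the same argument twice, as in read_prices(csv_lines, lines=csv_lines), raises a TypeError, which is the "no duplicate argument values" rule in action.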

Design Tip
Always give short, but meaningful names to function arguments
Someone using a function may want to use the keyword calling style

d = read_prices("prices.csv", delimiter=",")

Python development tools will show the names in help features and documentation

Design Tip
Don't write functions that take a huge number of input parameters
You are not going to remember how to call your 20-argument function (nor will you probably want to).
Functions should only take a few inputs (fewer than 4 as a rule of thumb)

Design Tip
Don't transform function inputs

def read_data(filename):
    f = open(filename)         # Transforms filename into an open file
    for line in f:
        ...

It limits the flexibility of the function
Here is a better version

def read_data(f):
    for line in f:
        ...

# Sample use - pass in a different kind of file
import gzip
f = gzip.open("portfolio.gz")
portfolio = read_data(f)
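
One benefit of the second version is that it accepts anything that yields lines, not just files on disk. The sketch below assumes whitespace-separated name, shares, and price fields (the slides leave the loop body elided) and is written to run on Python 3 as well.

```python
import io

def read_data(f):
    # f: any iterable that yields text lines (open file, list, StringIO, ...)
    records = []
    for line in f:
        fields = line.split()
        if fields:                           # Skip blank lines
            records.append((fields[0], int(fields[1]), float(fields[2])))
    return records

# The same function works on an in-memory "file" and on a plain list
from_buffer = read_data(io.StringIO(u"GOOG 100 490.10\nIBM 50 91.10\n"))
from_list = read_data(["GOOG 100 490.10", "IBM 50 91.10"])

print(from_buffer == from_list)              # True
```

This also makes the function easy to test, since no temporary files are needed.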

Return Values
return statement returns a value

def square(x):
    return x*x

If no return value, None is returned

def bar(x):
    statements
    return

a = bar(4)        # a = None

Return value is discarded if not assigned/used

square(4)         # Calls square() but discards result

Multiple Return Values


A function may return multiple values by returning a tuple

def divide(a,b):
    q = a // b        # Quotient
    r = a % b         # Remainder
    return q,r        # Return a tuple

Usage examples:

x = divide(37,5)      # x = (7,2)
x,y = divide(37,5)    # x = 7, y = 2

Side Effects
When you call a function, the arguments are simply names for passed values
If mutable data types are passed (e.g., lists, dicts), they can be modified "in-place"

def square_list(s):
    for i in xrange(len(s)):
        s[i] = s[i]*s[i]        # Modifies the input object

a = [1, 2, 3]
square_list(a)
print a                         # [1, 4, 9]

Such changes are an example of "side effects"

Design Tip
Don't modify function inputs
When data moves around in a program, it always does so by reference.
If you modify something, it may affect other parts of the program you didn't expect
And it will be really hard to debug
Caveat: It depends on the problem
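
One way to follow this tip is to return a new object instead of changing the one passed in. The sketch below contrasts the two approaches; squared is a made-up name, and range() is used in place of xrange() so the code runs on Python 3 as well.

```python
def square_list_inplace(s):
    # Modifies the caller's list (side effect); returns None
    for i in range(len(s)):
        s[i] = s[i] * s[i]

def squared(s):
    # Leaves the input alone and returns a new list instead
    return [x * x for x in s]

a = [1, 2, 3]
b = squared(a)
print(a)                     # [1, 2, 3]
print(b)                     # [1, 4, 9]

square_list_inplace(a)
print(a)                     # [1, 4, 9]
```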

Understanding Variables
Programs assign values to variables

x = value            # Global variable

def foo():
    y = value        # Local variable

Variable assignments occur outside and inside function definitions
Variables defined outside are "global"
Variables inside a function are "local"

Local Variables
Variables inside functions are private

def read_portfolio(filename):
    portfolio = []
    for line in open(filename):
        fields = line.split()
        s = (fields[0],int(fields[1]),float(fields[2]))
        portfolio.append(s)
    return portfolio

Values not retained or accessible after return

>>> stocks = read_portfolio("stocks.dat")
>>> fields
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'fields' is not defined
>>>

Don't conflict with variables found elsewhere

Global Variables
Functions can access the values of globals

delimiter = ','
def read_portfolio(filename):
    ...
    for line in open(filename):
        fields = line.split(delimiter)
        ...

This is a useful technique for dealing with general program settings and other details
However, you don't want to go overboard (otherwise the program becomes fragile)

Modifying Globals
One quirk: Functions can't modify globals

delimiter = ','
def set_delimiter(newdelimiter):
    delimiter = newdelimiter

Example:

>>> delimiter
','
>>> set_delimiter(':')
>>> delimiter              # Notice no change
','
>>>

All assignments in functions are local

Modifying Globals
If you want to modify a global variable you must declare it as such in the function

delimiter = ','
def set_delimiter(newdelimiter):
    global delimiter
    delimiter = newdelimiter

global declaration must appear before use
Only necessary for globals that will be modified (globals are already readable)
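
The difference between the two versions of set_delimiter() can be checked directly. In this sketch the functions are given distinct names (set_delimiter_local and set_delimiter_global, chosen for illustration) so both can live in one file. The code is written to run on Python 3 as well.

```python
delimiter = ','

def set_delimiter_local(newdelimiter):
    delimiter = newdelimiter         # Creates a local name; the global is untouched

def set_delimiter_global(newdelimiter):
    global delimiter
    delimiter = newdelimiter         # Rebinds the module-level name

set_delimiter_local(':')
after_local = delimiter              # Still ','

set_delimiter_global(':')
after_global = delimiter             # Now ':'

print(after_local)
print(after_global)
```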

Design Tip
If you use the global declaration a lot, you are probably making things difficult
It's generally a bad idea to write programs that modify lots of global variables
Better off using a user defined class (later)

Exercise 3.2

Time : 20 minutes


More on Functions
There are additional details
Variable argument functions
Error checking
Callback functions
Anonymous functions
Documentation strings

Variable Arguments
Function that accepts any number of args

def foo(x,*args):
    ...

Here, the arguments get passed as a tuple

foo(1,2,3,4,5)        # Inside foo: x = 1, args = (2,3,4,5)

Variable Arguments
Example:

def print_headers(*headers):
    for h in headers:
        print "%10s" % h,
    print
    print ("-"*10 + " ")*len(headers)

Usage:

>>> print_headers('Name','Shares','Price')
      Name     Shares      Price
---------- ---------- ----------
>>>

Variable Arguments
Function that accepts any keyword args

def foo(x,y,**kwargs):
    ...

Extra keywords get passed in a dict

foo(2,3,flag=True,mode="fast",header="debug")

# Inside foo:
# kwargs = { 'flag' : True, 'mode' : 'fast', 'header' : 'debug' }

Variable Keyword Use


Accepting any number of keyword args is useful when there are a lot of complicated configuration features

def plot_data(data, **opts):
    color = opts.get('color','black')
    symbols = opts.get('symbol',None)
    ...

plot_data(x,
          color="blue",
          bgcolor="white",
          title="Energy",
          xmin=-4.0, xmax=4.0,
          ymin=-10.0, ymax=10.0,
          xaxis_title="T", yaxis_title="KE",
          ...
          )

Variable Arguments
A function that takes any arguments

def foo(*args,**kwargs):
    statements

This will accept any combination of positional or keyword arguments
Sometimes used when writing wrappers or when you want to pass arguments through to another function
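
A pass-through wrapper might look like the sketch below. The names add, logged_add, and calls are made up for illustration; the wrapper records each call and forwards every argument unchanged. The code is written to run on Python 3 as well.

```python
calls = []

def add(x, y=0):
    return x + y

def logged_add(*args, **kwargs):
    # Record how the function was called, then pass everything through
    calls.append((args, kwargs))
    return add(*args, **kwargs)

print(logged_add(3, 4))              # 7
print(logged_add(3, y=10))           # 13
print(calls[1])                      # ((3,), {'y': 10})
```

The wrapper never needs to know what arguments add() takes, so it keeps working if add() gains new parameters later.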

Passing Tuples and Dicts


Tuples can be expanded into function args

args = (2,3,4)
foo(1, *args)           # Same as foo(1,2,3,4)

Dictionaries can expand to keyword args

kwargs = {
    'color' : 'red',
    'delimiter' : ',',
    'width' : 400 }
foo(data, **kwargs)
# Same as foo(data,color='red',delimiter=',',width=400)

These are not commonly used except when writing library functions.

Error Checking
Python performs no checking or validation of function argument types or values
A function will work on any data that is compatible with the statements in the function
Example:

def add(x,y):
    return x + y

add(3,4)                     # 7
add("Hello","World")         # "HelloWorld"
add([1,2],[3,4])             # [1,2,3,4]

Error Checking
If there are errors in a function, they will show up at run time (as an exception)
Example:

def add(x,y):
    return x+y

>>> add(3,"hello")
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>>

To verify code, there is a strong emphasis on testing (covered later)

Error Checking
Python also performs no checking of function return values
Inconsistent use does not result in an error
Example:

def foo(x,y):
    if x:
        return x + y
    else:
        return            # Inconsistent use of return (not checked)

Callback Functions
Sometimes functions are written to rely on a user-supplied function that performs some kind of special processing.
Example: Case-insensitive sort

def lowerkey(s):
    return s.lower()

names.sort(key=lowerkey)

Here, the sort() operation calls the lowerkey function as part of its processing (a callback).

Anonymous Functions
Since callbacks are often short expressions, there is a shortcut syntax for it
lambda expression

names.sort(key=lambda s: s.lower())

Creates a function that evaluates an expression

lowerkey = lambda s: s.lower()

# Same as
def lowerkey(s):
    return s.lower()

lambda is highly restricted. It can only be a single Python expression.

Documentation Strings
First line of function may be string

def divide(a,b):
    """Divides a by b and returns a quotient and
    remainder. For example:

    >>> divide(9,5)
    (1,4)
    >>> divide(15,4)
    (3,3)
    >>>
    """
    q = a // b
    r = a % b
    return (q,r)

Including a docstring is considered good style
Why?

Docstring Benefits
Online help

    >>> help(divide)
    Help on function divide in module foo:

    divide(a, b)
        Divides a by b and returns a quotient and
        remainder. For example:

        >>> divide(9,5)
        (1, 4)
        >>> divide(15,4)
        (3, 3)
        >>>

Many IDEs/tools look at docstrings

Docstring Benefits
Testing

    def divide(a,b):
        """Divides a by b and returns a quotient and
        remainder. For example:

        >>> divide(9,5)
        (1, 4)
        >>> divide(15,4)
        (3, 3)
        >>>
        """

Testing tools look at interaction session

Example: doctest module (later)
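A hedged sketch of how a testing tool can use the interactive session in a docstring. It uses the standard doctest module directly and assumes Python 3. Note that the expected output must match what the interpreter really prints, including the space in (1, 4).

```python
import doctest

def divide(a, b):
    """Divides a by b and returns a quotient and remainder.

    >>> divide(9, 5)
    (1, 4)
    >>> divide(15, 4)
    (3, 3)
    """
    q = a // b
    r = a % b
    return (q, r)

# Extract the examples from the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(divide.__doc__, {"divide": divide}, "divide", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

In everyday use, doctest.testmod() on the whole module is the more common entry point; the explicit parser/runner here just keeps the sketch self-contained.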

Exercise 3.3

Time : 10 minutes

Copyright (C) 2008, http://www.dabeaz.com

3- 49

Exceptions
Used to signal errors

Raising an exception (raise)

    if name not in names:
        raise RuntimeError("Name not found")

Catching an exception (try)

    try:
        authenticate(username)
    except RuntimeError,e:
        print e

Exceptions
Exceptions propagate to first matching except

    def foo():
        try:
            bar()
        except RuntimeError,e:
            ...

    def bar():
        try:
            spam()
        except RuntimeError,e:
            ...

    def spam():
        grok()

    def grok():
        ...
        raise RuntimeError("Whoa!")
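A runnable sketch of the propagation rule, using the same function names as the slide. It assumes Python 3, where the handler is written `except RuntimeError as e` rather than `except RuntimeError,e`; the return strings are made up for illustration.

```python
def grok():
    raise RuntimeError("Whoa!")

def spam():
    grok()                          # no try/except here: the exception passes through

def bar():
    try:
        spam()
    except RuntimeError as e:       # first matching except on the way up
        return "bar caught: %s" % e

def foo():
    try:
        return bar()
    except RuntimeError as e:       # not reached: bar() already handled it
        return "foo caught: %s" % e

print(foo())
```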

Builtin-Exceptions
About two-dozen built-in exceptions

    ArithmeticError
    AssertionError
    EnvironmentError
    EOFError
    ImportError
    IndexError
    KeyboardInterrupt
    KeyError
    MemoryError
    NameError
    ReferenceError
    RuntimeError
    SyntaxError
    SystemError
    TypeError
    ValueError

Consult reference

Exception Values
Most exceptions have an associated value
More information about what's wrong

    raise RuntimeError("Invalid user name")

Passed to variable supplied in except

    try:
        ...
    except RuntimeError,e:
        ...

It's an instance of the exception type, but often looks like a string

    except RuntimeError,e:
        print "Failed : Reason", e

Catching Multiple Errors


Can catch different kinds of exceptions

    try:
        ...
    except LookupError,e:
        ...
    except RuntimeError,e:
        ...
    except IOError,e:
        ...
    except KeyboardInterrupt,e:
        ...

Alternatively, if handling is same

    try:
        ...
    except (IOError,LookupError,RuntimeError),e:
        ...

Catching All Errors


Catching any exception

    try:
        ...
    except Exception:
        print "An error occurred"

Ignoring an exception (pass)

    try:
        ...
    except RuntimeError:
        pass

Exploding Heads
The wrong way to use exceptions:

    try:
        go_do_something()
    except Exception:
        print "Couldn't do something"

This swallows all possible errors that might occur.

May make it impossible to debug if code is failing for some reason you didn't expect at all (e.g., uninstalled Python module, etc.)

A Better Approach
This is a somewhat more sane approach

    try:
        go_do_something()
    except Exception, e:
        print "Couldn't do something. Reason : %s\n" % e

Reports a specific reason for the failure

It is almost always a good idea to have some mechanism for viewing/reporting errors if you are writing code that catches all possible exceptions

finally statement
Specifies code that must run regardless of whether or not an exception occurs

    lock = Lock()
    ...
    lock.acquire()
    try:
        ...
    finally:
        lock.release()     # release the lock

Commonly used to properly manage resources (especially locks, files, etc.)
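A sketch that combines the two ideas above: a catch-all handler that still reports the reason, and a finally block that releases a lock either way. The failing go_do_something() and the log list are hypothetical stand-ins, and the code assumes Python 3.

```python
import threading

lock = threading.Lock()
log = []

def go_do_something():
    raise ValueError("bad input")       # hypothetical failure for the demo

def guarded():
    lock.acquire()
    try:
        go_do_something()
    except Exception as e:
        # Report the specific reason instead of swallowing it
        log.append("Couldn't do something. Reason : %s" % e)
    finally:
        lock.release()                  # runs on success and on failure
        log.append("lock released")

guarded()
print(log)
print(lock.locked())
```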

Program Exit
Program exit is handled through exceptions

    raise SystemExit
    raise SystemExit(exitcode)

An alternative sometimes seen

    import sys
    sys.exit()
    sys.exit(exitcode)

Catching keyboard interrupt (Control-C)

    try:
        statements
    except KeyboardInterrupt:
        statements

Exercise 3.4

Time : 10 minutes

Copyright (C) 2008, http://www.dabeaz.com

3- 60

114

Section 4

Modules and Libraries

Overview
How to place code in a module
Useful applications of modules
Some common standard library modules
Installing third party libraries

Modules
Any Python source file is a module

    # foo.py
    def grok(a):
        ...
    def spam(b):
        ...

import statement loads and executes a module

    import foo

    a = foo.grok(2)
    b = foo.spam("Hello")
    ...

Namespaces
A module is a collection of named values
It's said to be a "namespace"
The names correspond to global variables and functions defined in the source file
You use the module name to access

    >>> import foo
    >>> foo.grok(2)
    >>>

Module name is tied to source (foo -> foo.py)

Module Execution
When a module is imported, all of the statements in the module execute one after another until the end of the file is reached

If there are scripting statements that carry out tasks (printing, creating files, etc.), they will run on import

The contents of the module namespace are all of the global names that are still defined at the end of this execution process

Globals Revisited
Everything defined in the "global" scope is what populates the module namespace

    # foo.py                # bar.py
    x = 42                  x = 37
    def grok(a):            def spam(a):
        ...                     ...

These definitions of x are different

Different modules can use the same names and those names don't conflict with each other (modules are isolated)

Running as "Main"
Source files can run in two "modes"

    shell % python foo.py      # Running as main
    import foo                 # Loaded as a module

Sometimes you want to check

    if __name__ == '__main__':
        print "Running as the main program"
    else:
        print "Imported as a module using import"

__name__ is the name of the module
The main program (and interactive interpreter) module is __main__

Module Loading
Each module loads and executes once
Repeated imports just return a reference to the previously loaded module
sys.modules is a dict of all loaded modules

    >>> import sys
    >>> sys.modules.keys()
    ['copy_reg', '__main__', 'site', '__builtin__',
     'encodings', 'encodings.encodings', 'posixpath', ...]
    >>>
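A small experiment that shows the load-once behavior. It writes a throwaway module to a temporary directory; the module name demo_once_mod is made up for this sketch, and the code assumes Python 3.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module (hypothetical name) to a temporary directory.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, "demo_once_mod.py"), "w") as f:
    f.write("x = 42\n")

sys.path.insert(0, moddir)          # make the directory importable
importlib.invalidate_caches()

import demo_once_mod                # first import: the file is executed
demo_once_mod.x = 99                # change a value in the loaded module

import demo_once_mod as again       # repeated import: no re-execution

print(again is demo_once_mod)
print(again.x)                      # still the changed value, not 42
print(sys.modules["demo_once_mod"] is again)
print(demo_once_mod.__name__)       # the module name, not __main__
```

If the second import had executed the file again, x would have been reset to 42.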

Locating Modules
When looking for modules, Python first looks in the current working directory of the main program

If a module can't be found there, an internal module search path is consulted

    >>> import sys
    >>> sys.path
    ['',
     '/Library/Frameworks/Python.framework/Versions/2.5/lib/python25.zip',
     '/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5',
     ...
    ]

Module Search Path


sys.path contains search path
Can manually adjust if you need to

    import sys
    sys.path.append("/project/foo/pyfiles")

Paths also added via environment variables

    % env PYTHONPATH=/project/foo/pyfiles python
    Python 2.4.3 (#1, Apr 7 2006, 10:54:33)
    [GCC 4.0.1 (Apple Computer, Inc. build 5250)] on darwin
    >>> import sys
    >>> sys.path
    ['','/project/foo/pyfiles', '/Library/Frameworks ...]

Import Process
Import looks for specific kinds of files

    name.pyc      # Compiled Python
    name.pyo      # Compiled Python (optimized)
    name.py       # Python source file
    name.dll      # Dynamic link library (C/C++)

If module is a .py file, it is first compiled into bytecode and a .pyc/.pyo file is created

.pyc/.pyo files regenerated as needed should the original source file change

Exercise 4.1

Time : 15 Minutes

Copyright (C) 2008, http://www.dabeaz.com

4- 12

120

Modules as Objects
When you import a module, the module itself is a kind of "object"

You can assign it to variables, place it in lists, change its name, and so forth

    import math
    m = math              # Assign to a variable
    x = m.sqrt(2)         # Access through the variable

You can even store new things in it

    math.twopi = 2*math.pi

What is a Module?
A module is just a thin layer over a dictionary (which holds all of the contents)

    >>> import foo
    >>> foo.__dict__.keys()
    ['grok','spam','__builtins__','__file__',
     '__name__', '__doc__']
    >>> foo.__dict__['grok']
    <function grok at 0x69230>
    >>> foo.grok
    <function grok at 0x69230>
    >>>

Any time you reference something in a module, it's just a dictionary lookup
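The two ideas above (a module is an object, and attribute access is a dictionary lookup) can be checked with the standard math module. A minimal sketch, assuming Python 3:

```python
import math

m = math                            # a module can be assigned to a variable
root = m.sqrt(16)                   # and used through that variable

# Attribute access on a module is a lookup in its __dict__
same_function = math.__dict__["sqrt"] is math.sqrt

math.twopi = 2 * math.pi            # storing a new name adds a dictionary entry
has_twopi = "twopi" in math.__dict__

print(root, same_function, has_twopi)
```

Storing new names in a library module is shown only to illustrate the mechanism; it changes the module for every other piece of code that imports it.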

import as statement
Changing the name of the loaded module

    # bar.py
    import fieldparse as fp

    fields = fp.split(line)

This is identical to import except that a different name is used for the module object

The new name only applies to this one source file (other modules can still import the library using its original name)

import as statement
There are other practical uses of changing the module name on import

It makes it easy to load and seamlessly use different versions of a library module

    if debug:
        import foodebug as foo      # Load debugging version
    else:
        import foo

    foo.grok(2)                     # Uses whatever module was loaded

You can make a program extensible by simply importing a plugin module as a common name

from module import


Selected symbols can be lifted out of a module and placed into the caller's namespace

    # bar.py
    from foo import grok

    grok(2)

This is useful if a name is used repeatedly and you want to reduce the amount of typing

And, it also runs faster if it gets used a lot (the name is found directly, with no extra module attribute lookup)

from module import *


Takes all symbols from a module and places them into the caller's namespace

    # bar.py
    from foo import *

    grok(2)
    spam("Hello")
    ...

However, it only applies to names that don't start with an underscore (_)

_name often used when defining non-imported values in a module.

from module import *


Using this form of import requires some care and attention

It will overwrite any previously defined objects that share the same name as an imported symbol

It may pollute the namespace with symbols that you don't really want (or expect)

Commentary
As a general rule, it is better to make proper use of namespaces and to use the normal import statement

    import foo

It is more "pythonic" and reduces the amount of clutter in your code.

"Namespaces are one honking great idea -- let's do more of those!" - import this

Exercise 4.2

Time : 15 Minutes

Copyright (C) 2008, http://www.dabeaz.com

4- 21

Standard Library
Python includes a large standard library
Several hundred modules
System, networking, data formats, etc.
All accessible via import
A quick tour of most common modules

Builtin Functions
About 70 functions/objects
Always available in global namespace
Contained in a module __builtins__
Have seen many of these functions already

Builtins: Math
Built-in mathematical functions

    abs(x)
    divmod(x,y)
    max(s1, s2, ..., sn)
    min(s1, s2, ..., sn)
    pow(x,y [,z])
    round(x [,n])
    sum(s [,initial])

Examples

    >>> max(2,-10,40,7)
    40
    >>> min([2,-10,40,7])
    -10
    >>> sum([2,-10,40,7],0)
    39
    >>> round(3.141592654,2)
    3.1400000000000001
    >>>

Builtins: Conversions
Conversion functions and types

    str(s)              # Convert to string
    repr(s)             # String representation
    int(x [,base])      # Integer
    long(x [,base])     # Long
    float(x)            # Float
    complex(x)          # Complex
    hex(x)              # Hex string
    oct(x)              # Octal string
    chr(n)              # Character from number
    ord(c)              # Character code

Examples

    >>> hex(42)
    '0x2a'
    >>> ord('c')
    99
    >>> chr(65)
    'A'
    >>>

Builtins: Iteration
Functions commonly used with for statement

    enumerate(s)
    range([i,] j [,stride])
    reversed(s)
    sorted(s [,cmp [,key [,reverse]]])
    xrange([i,] j [,stride])
    zip(s1,s2,...)

Examples

    >>> x = [1,-10,4]
    >>> for i in reversed(x): print i,
    ...
    4 -10 1
    >>> for i,v in enumerate(x): print i,v
    ...
    0 1
    1 -10
    2 4
    >>> for i in sorted(x): print i,
    ...
    -10 1 4

sys module
Information related to environment
Version information
System limits
Command line options
Module search paths
Standard I/O streams

sys: Command Line Opts


Parameters passed on Python command line

    sys.argv

Example:

    # opt.py
    import sys
    print sys.argv

    % python opt.py -v 3 "a test" foobar.txt
    ['opt.py', '-v', '3', 'a test', 'foobar.txt']
    %

Command Line Args


For simple programs, can manually process

    import sys
    if len(sys.argv) == 2:
        filename = sys.argv[1]
        out = open(filename,"w")
    else:
        out = sys.stdout

For anything more complicated, processing becomes tediously annoying

Solution: optparse module

optparse Module
A whole framework for processing command line arguments

Might be overkill, but implements most of the annoying code you would have to write yourself

A relatively recent addition to Python

Impossible to cover all details, will show example

optparse Example
    import optparse

    p = optparse.OptionParser()
    p.add_option("-d",action="store_true",dest="debugmode")
    p.add_option("-o",action="store",type="string",dest="outfile")
    p.add_option("--exclude",action="append",type="string",
                 dest="excluded")

    p.set_defaults(debugmode=False,outfile=None,excluded=[])

    opt, args = p.parse_args()

    debugmode = opt.debugmode
    outfile = opt.outfile
    excluded = opt.excluded
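To see what the parser above produces, it can be given an explicit argument list instead of reading sys.argv. The argument values below are made up for illustration, and the sketch assumes Python 3, where optparse is still part of the standard library.

```python
import optparse

p = optparse.OptionParser()
p.add_option("-d", action="store_true", dest="debugmode")
p.add_option("-o", action="store", type="string", dest="outfile")
p.add_option("--exclude", action="append", type="string", dest="excluded")
p.set_defaults(debugmode=False, outfile=None, excluded=[])

# Hypothetical command line: prog.py -d -o out.txt --exclude a --exclude b data.csv
opt, args = p.parse_args(["-d", "-o", "out.txt",
                          "--exclude", "a", "--exclude", "b",
                          "data.csv"])

print(opt.debugmode)    # flag was given
print(opt.outfile)      # value of -o
print(opt.excluded)     # one entry per --exclude
print(args)             # leftover positional arguments
```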

sys: Standard I/O


Standard I/O streams

    sys.stdout
    sys.stderr
    sys.stdin

By default, print is directed to sys.stdout
Input read from sys.stdin
Can redefine or use directly

    sys.stdout = open("out.txt","w")
    print >>sys.stderr, "Warning. Unable to connect"

math module
Contains common mathematical functions

    math.sqrt(x)
    math.sin(x)
    math.cos(x)
    math.tan(x)
    math.atan(x)
    math.log(x)
    ...
    math.pi
    math.e

Example:

    import math
    c = 2*math.pi*math.sqrt((x1-x2)**2 + (y1-y2)**2)

random Module
Generation of random numbers

    random.randint(a,b)       # Random integer a<=x<=b
    random.random()           # Random float 0<=x<1
    random.uniform(a,b)       # Random float a<=x<b
    random.seed(x)

Example:

    >>> import random
    >>> random.randint(10,20)
    16
    >>> random.random()
    0.53763273926379385
    >>> random.uniform(10,20)
    13.62148074612832
    >>>

copy Module
Used to make shallow/deep copies of objects

    copy.copy(obj)          # Make shallow copy
    copy.deepcopy(obj)      # Make deep copy

Example:

    >>> a = [1,2,3]
    >>> b = ['x',a]
    >>> import copy
    >>> c = copy.copy(b)
    >>> c
    ['x', [1, 2, 3]]
    >>> c[1] is a           # a only copied by reference.
    True                    # b and c contain the same a.
    >>> d = copy.deepcopy(b)
    >>> d
    ['x', [1, 2, 3]]
    >>> d[1] is a           # a is also copied.
    False                   # b and d contain different lists
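The practical consequence of the shallow/deep distinction shows up when the shared inner list is modified. A short sketch, assuming Python 3:

```python
import copy

a = [1, 2, 3]
b = ['x', a]

c = copy.copy(b)        # shallow: new outer list, same inner list
d = copy.deepcopy(b)    # deep: the inner list is copied too

a.append(4)             # mutate the original inner list

print(c[1])             # sees the change (shares a)
print(d[1])             # unaffected (has its own copy)
```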

os Module
Contains operating system functions

Example: Executing a system command

    >>> import os
    >>> os.system("mkdir temp")
    >>>

Walking a directory tree

    >>> for path,dirs,files in os.walk("/home"):
    ...     for name in files:
    ...         print "%s/%s" % (path,name)

Environment Variables
Environment variables (typically set in shell)

    % setenv NAME dave
    % setenv RSH ssh
    % python prog.py

os.environ dictionary contains values

    import os
    home = os.environ['HOME']
    os.environ['HOME'] = '/home/user/guest'

Changes are reflected in Python and any subprocesses created later

Getting a Directory Listing


os.listdir() function

    >>> files = os.listdir("/some/path")
    >>> files
    ['foo','bar','spam']
    >>>

glob module

    >>> txtfiles = glob.glob("*.txt")
    >>> datfiles = glob.glob("Dat[0-5]*")
    >>>

glob understands Unix shell wildcards (on all systems)

os.path Module
Portable management of path names and files

Examples:

    >>> import os.path
    >>> os.path.basename("/home/foo/bar.txt")
    'bar.txt'
    >>> os.path.dirname("/home/foo/bar.txt")
    '/home/foo'
    >>> os.path.join("home","foo","bar.txt")
    'home/foo/bar.txt'
    >>>

File Tests
Testing if a file exists

    >>> os.path.exists("foo.txt")
    True
    >>>

Testing if a filename is a regular file

    >>> os.path.isfile("foo.txt")
    True
    >>> os.path.isfile("/usr")
    False
    >>>

Testing if a filename is a directory

    >>> os.path.isdir("foo.txt")
    False
    >>> os.path.isdir("/usr")
    True
    >>>

File Metadata
Getting the file size

    >>> os.path.getsize("foo.txt")
    1344L
    >>>

Getting the last modification/access time

    >>> os.path.getmtime("foo.txt")
    1175769416.0
    >>> os.path.getatime("foo.txt")
    1175769491.0
    >>>

Note: To decode times, use time module

    >>> time.ctime(os.path.getmtime("foo.txt"))
    'Thu Apr 5 05:36:56 2007'
    >>>
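The os.path functions above can be tried safely on a scratch file. This sketch creates its own temporary directory so it does not depend on any existing foo.txt; it assumes Python 3.

```python
import os
import os.path
import tempfile
import time

workdir = tempfile.mkdtemp()
filename = os.path.join(workdir, "foo.txt")
with open(filename, "wb") as f:
    f.write(b"hello\n")                     # 6 bytes

print(os.path.basename(filename))           # file name without the directory
print(os.path.dirname(filename) == workdir)
print(os.path.exists(filename))
print(os.path.isfile(filename), os.path.isdir(filename))
print(os.path.isdir(workdir))
print(os.path.getsize(filename))            # size in bytes
print(time.ctime(os.path.getmtime(filename)))   # readable modification time
```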

Shell Operations (shutil)


Copying a file

    >>> shutil.copy("source","dest")

Moving a file (renaming)

    >>> shutil.move("old","new")

Copying a directory tree

    >>> shutil.copytree("srcdir","destdir")

Removing a directory tree

    >>> shutil.rmtree("dir")

Exercise 4.3

Time : 10 Minutes

Copyright (C) 2008, http://www.dabeaz.com

4- 43

pickle Module
A module for serializing objects

Saving an object to a file

    import pickle
    ...
    f = open("data","wb")
    pickle.dump(someobj,f)

Loading an object from a file

    f = open("data","rb")
    someobj = pickle.load(f)

This works with almost all objects except for those involving system state (open files, network connections, threads, etc.)

pickle Module
Pickle can also turn objects into strings

    import pickle

    # Convert to a string
    s = pickle.dumps(someobj)
    ...
    # Load from a string
    someobj = pickle.loads(s)

Useful if you want to send an object elsewhere

Examples: Send over network, store in a database, etc.
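A round trip through pickle.dumps()/pickle.loads(), as a minimal sketch. The portfolio dictionary is made up for illustration. The code assumes Python 3, where dumps() returns a bytes object rather than a str.

```python
import pickle

portfolio = {"name": "GOOG", "shares": 100, "prices": [490.1, 492.5]}

data = pickle.dumps(portfolio)      # serialize to a byte string
restored = pickle.loads(data)       # rebuild an equal, independent object

print(type(data).__name__)
print(restored == portfolio)
print(restored is portfolio)
```

One caution when sending pickles elsewhere: loading a pickle can execute arbitrary code, so pickle.loads() should only be used on data from a trusted source.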

shelve module
Used to create a persistent dictionary

    import shelve
    s = shelve.open("data","c")

    # Put an object in the shelf
    s['foo'] = someobj

    # Get an object out of the shelf
    someobj = s['foo']

    s.close()

Supports many dictionary operations

    s[key] = obj        # Store an object
    data = s[key]       # Get an object
    del s[key]          # Delete an object
    s.has_key(key)      # Test for key
    s.keys()            # Return list of keys

shelve module
Shelve open() flags

    s = shelve.open("file",'c')   # RW. Create if not exist
    s = shelve.open("file",'r')   # Read-only
    s = shelve.open("file",'w')   # Read-write
    s = shelve.open("file",'n')   # RW. Force creation

Keys must be strings

    s = shelve.open("data","w")
    s['foo'] = 4         # okay
    s[2] = 'Hello'       # Error. bad key

Stored values may be any object compatible with the pickle module

ConfigParser Module
A module that can read parameters out of configuration files based on the .INI format

    # A Comment
    ; A Comment (alternative)
    [section1]
    var1 = 123
    var2 = abc

    [section2]
    var1 = 456
    var2 = def
    ...

File consists of named sections each with a series of variable declarations

Parsing INI Files


Creating and parsing a configuration file

    import ConfigParser
    cfg = ConfigParser.ConfigParser()
    cfg.read("config.ini")

Retrieving the value of an option

    v = cfg.get("section","varname")

This returns the value as a string or raises an exception if not found.

Parsing Example
Sample .INI file

    ; simple.ini
    [files]
    infile = stocklog.dat
    outfile = summary.dat

    [web]
    host = www.python.org
    port = 80

Interactive session

    >>> cfg = ConfigParser.ConfigParser()
    >>> cfg.read("simple.ini")
    ['simple.ini']
    >>> cfg.get("files","infile")
    'stocklog.dat'
    >>> cfg.get("web","host")
    'www.python.org'
    >>>
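The same example as a self-contained sketch. It assumes Python 3, where the module is named configparser (lowercase), and it parses the INI text from a string with read_string() so that no simple.ini file is needed.

```python
import configparser   # Python 3 name of the ConfigParser module

ini_text = """
; simple.ini
[files]
infile = stocklog.dat
outfile = summary.dat

[web]
host = www.python.org
port = 80
"""

cfg = configparser.ConfigParser()
cfg.read_string(ini_text)

infile = cfg.get("files", "infile")     # values come back as strings
port = cfg.getint("web", "port")        # getint() converts for you

# A missing option raises an exception rather than returning None
try:
    cfg.get("web", "missing")
    missing_error = None
except configparser.NoOptionError as e:
    missing_error = str(e)

print(infile, port)
print(missing_error)
```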

Tkinter
A Library for building simple GUIs

    >>> from Tkinter import *
    >>> def pressed():
    ...     print "You did it!"
    >>> b = Button(text="Do it!",command=pressed)
    >>> b.pack()
    >>> b.mainloop()

Clicking on the button....

    You did it!
    You did it!
    ...

More on Tkinter
The only GUI packaged with Python itself
Based on Tcl/Tk. Popular open-source scripting language/GUI widget set developed by John Ousterhout (90s)
Tk used in a wide variety of other languages (Perl, Ruby, etc.)
Cross-platform (Unix/Windows/MacOS)
It's small (~25 basic widgets)

Sample Tkinter Widgets

Copyright (C) 2008, http://www.dabeaz.com

4- 53

Tkinter Programming
GUI programming is a big topic

Frankly, it's mostly a matter of looking at the docs (google 'Tkinter' for details)

Execution model is based entirely on an event loop with callback functions

    >>> from Tkinter import *
    >>> def pressed():                      # Callback
    ...     print "You did it!"
    >>> b = Button(text="Do it!",command=pressed)
    >>> b.pack()
    >>> b.mainloop()                        # Run the event loop

Tkinter Commentary
Your mileage with Tkinter will vary
It comes with Python, but it's also dated
And it's a lot lower-level than many like
Some other GUI toolkits:
    wxPython
    PyQt

Exercise 4.4

Time : 30 Minutes

Copyright (C) 2008, http://www.dabeaz.com

4- 56

142

Useful New Datatypes


The standard library also contains new kinds of datatypes useful for certain kinds of problems

Examples:
    Decimal
    datetime
    deque

Will briefly illustrate

decimal module
A module that implements accurate base-10 decimals of arbitrary precision

    >>> from decimal import Decimal
    >>> x = Decimal('3.14')
    >>> y = Decimal('5.2')
    >>> x + y
    Decimal('8.34')
    >>> x / y
    Decimal('0.6038461538461538461538461538')
    >>>

Controlling the precision

    >>> from decimal import getcontext
    >>> getcontext().prec = 2
    >>> x / y
    Decimal('0.60')
    >>>

datetime module
A module for representing and manipulating dates and times

    >>> from datetime import datetime
    >>> inauguration = datetime(2009,1,20)
    >>> inauguration
    datetime.datetime(2009, 1, 20, 0, 0)
    >>> today = datetime.today()
    >>> d = inauguration - today
    >>> d
    datetime.timedelta(92, 54189, 941608)
    >>> d.days
    92
    >>>

There are many more features not shown

collections module
A module with a few useful data structures

collections.deque - A double-ended queue

    >>> from collections import deque
    >>> q = deque()
    >>> q.append("this")
    >>> q.appendleft("that")
    >>> q
    deque(['that', 'this'])
    >>>

A deque is like a list except that it's highly optimized for insertion/deletion on both ends

Implemented in C and finely tuned
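A brief sketch of two of the datatypes above in use: a deque taking items at both ends, and Decimal avoiding the binary rounding that float shows. It assumes Python 3; the sample values are made up.

```python
from collections import deque
from decimal import Decimal

q = deque()
q.append("this")            # add on the right
q.appendleft("that")        # add on the left
left = q.popleft()          # remove from the left end
right = q.pop()             # remove from the right end

float_sum = 0.1 + 0.2                             # binary floating point
decimal_sum = Decimal('0.1') + Decimal('0.2')     # exact base-10 arithmetic

print(left, right, len(q))
print(float_sum == 0.3)
print(decimal_sum == Decimal('0.3'))
```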

Standardized APIs
API (Application Programming Interface)

Some parts of the Python library are focused on providing a common programming interface for accessing third-party systems

Example : The Python Database API

A standardized way to access relational database systems (Oracle, Sybase, MySQL, etc.)

Database Example
Connecting to a DB, executing a query

    from somedbmodule import connect
    connection = connect("somedb",
                         user="dave",password="12345")
    cursor = connection.cursor()
    cursor.execute("select name,address from people")
    for row in cursor:
        print row
    connection.close()

Output

    ('Elwood','1060 W Addison')
    ('Jack','4402 N Broadway')
    ...

It's actually not much more than this...

Forming Queries
There is one tricky bit concerning the formation of query strings

    select address from people where name='Elwood'

As a general rule, you should not use Python string operators to create queries

    query = """select address from people
               where name='%s'""" % (name,)

Leaves code open to an SQL injection attack

Forming Queries
select address from people where name='Elwood'

(http://xkcd.com/327/)

Copyright (C) 2008, http://www.dabeaz.com

4- 64

146

Value Substitutions
DB modules also provide their own mechanism for substituting values in queries

    cur.execute("select address from people where name=?",
                (name,))

    cur.execute("select address from people where name=? and state=?",
                (name,state))

This is somewhat similar to string formatting

Unfortunately, the exact mechanism varies slightly by DB module (must consult docs)
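The difference between string formatting and value substitution can be demonstrated with the standard sqlite3 module, which uses the ? placeholder style shown above. The in-memory table and the hostile input string are made up for this sketch, which assumes Python 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table people (name text, address text)")
cur.executemany("insert into people values (?, ?)",
                [("Elwood", "1060 W Addison"), ("Jack", "4402 N Broadway")])

name = "Elwood' or '1'='1"      # hostile input posing as a name

# Unsafe: string formatting splices the input into the SQL text,
# so the quote characters change the meaning of the query.
unsafe_query = "select address from people where name='%s'" % (name,)
unsafe_rows = cur.execute(unsafe_query).fetchall()

# Safe: the value is passed separately and is only ever treated as data.
safe_rows = cur.execute("select address from people where name=?",
                        (name,)).fetchall()
conn.close()

print(len(unsafe_rows))     # every row leaks
print(safe_rows)            # no person has that literal name
```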

Exercise 4.5

Copyright (C) 2008, http://www.dabeaz.com

4- 66

147

Third Party Modules


Python has a large library of built-in modules ("batteries included")

There are even more third party modules

Python Package Index (PyPI)

    http://pypi.python.org/

Google

    (Fill in with whatever you're looking for)

Installing Modules
Installation of a module is likely to take three different forms (depends on the module)

Platform-native installer
Manual Installation
Python Eggs

Platform Native Install


You downloaded a third party module as an .exe (PC) or .dmg (Mac) file

Just run this file and follow installation instructions like you do for installing other software

You are only likely to see these installers for the more major third-party extensions

Manual Installation
You downloaded a Python module or package using a standard file format such as a .gz, .tar.gz, .tgz, .bz2, or .zip file

Unpack the file and look for a setup.py file in the resulting folder

Run Python on that setup.py file

    % python setup.py install
    Installation messages
    ...
    %

Python Eggs
An emerging packaging format for Python modules based on setuptools

Currently requires a third-party install

    http://pypi.python.org/pypi/setuptools

Adds a command-line tool easy_install

    % easy_install packagename
    (or)
    % easy_install packagename-1.2.3-py2.5.egg

In theory, this is just "automatic"

Commentary
Installing third party modules is always a delicate matter

More advanced modules may involve C/C++ code which has to be compiled to native code on your platform.

May have dependencies on other modules

In theory, Eggs are supposed to solve these problems.... in theory.

Summary
Have looked at module/package mechanism
Some of the very basic built-in modules
How to install third party modules
We will focus on more of the built-in modules later in the course

Section 5

Classes and Objects

Overview
How to define new objects
How to customize objects (inheritance)
How to combine objects (composition)
Python special methods and customization features

OO in a Nutshell
A programming technique where code is organized as a collection of "objects"

An "object" consists of
    Data (attributes)
    Methods (functions applied to object)

Example: A "Circle" object
    Data: radius
    Methods: area(), perimeter()

The class statement


A definition for a new object

    class Circle(object):
        def __init__(self,radius):
            self.radius = radius
        def area(self):
            return math.pi*(self.radius**2)
        def perimeter(self):
            return 2*math.pi*self.radius

What is a class?

It's a collection of functions that perform various operations on "instances"

Instances
Created by calling the class as a function

    >>> c = Circle(4.0)
    >>> d = Circle(5.0)
    >>>

Each instance has its own data

    >>> c.radius
    4.0
    >>> d.radius
    5.0
    >>>

You invoke methods on instances to do things

    >>> c.area()
    50.26548245743669
    >>> d.perimeter()
    31.415926535897931
    >>>
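The Circle class and the session above, put together as one runnable sketch. The only additions are the math import the class needs and Python 3 print calls (an assumption; the slides use Python 2).

```python
import math

class Circle(object):
    def __init__(self, radius):
        self.radius = radius            # per-instance data
    def area(self):
        return math.pi * (self.radius ** 2)
    def perimeter(self):
        return 2 * math.pi * self.radius

c = Circle(4.0)         # calling the class creates an instance
d = Circle(5.0)         # each instance holds its own radius

print(c.radius, d.radius)
print(c.area())
print(d.perimeter())
```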

__init__ method
This method initializes a new instance
Called whenever a new object is created

    >>> c = Circle(4.0)

    class Circle(object):
        def __init__(self,radius):      # self is the newly created object
            self.radius = radius

__init__ is example of a "special method"
Has special meaning to Python interpreter

Instance Data
Each instance has its own data (attributes)

    class Circle(object):
        def __init__(self,radius):
            self.radius = radius

Inside methods, you refer to this data using self

    def area(self):
        return math.pi*(self.radius**2)

In other code, you just use the variable that you're using to name the instance

    >>> c = Circle(4.0)
    >>> c.radius
    4.0

Methods
Functions applied to instances of an object

    class Circle(object):
        ...
        def area(self):
            return math.pi*self.radius**2

The object is always passed as first argument

    >>> c.area()

    def area(self):
        ...

By convention, the instance is called "self"

The name is unimportant; the object is always passed as the first argument. It is simply Python programming style to call this argument "self." C++ programmers might prefer to call it "this."

Calling Other Methods


Methods call other methods via self

    class Circle(object):
        def area(self):
            return math.pi*self.radius**2
        def print_area(self):
            print self.area()

A caution : Code like this doesn't work

    class Circle(object):
        ...
        def print_area(self):
            print area()      # ! Error

This merely calls a global function area()

Exercise 5.1

Time : 15 Minutes

Copyright (C) 2008, http://www.dabeaz.com

5- 10

156

Inheritance
A tool for specializing objects

    class Parent(object):
        ...

    class Child(Parent):
        ...

New class called a derived class or subclass
Parent known as base class or superclass
Parent is specified in () after class name

Inheritance
What do you mean by "specialize?"

Take an existing class and ...
    Add new methods
    Redefine some of the existing methods
    Add new attributes to instances

Inheritance Example
In bill #246 of the 1897 Indiana General Assembly, there was text that dictated a new method for squaring a circle, which if adopted, would have equated π to 3.2.

Fortunately, it was never adopted because an observant mathematician took notice...

But, let's make a special Indiana Circle anyways...

Inheritance Example
Specializing a class

    class INCircle(Circle):
        def area(self):
            return 3.2*self.radius**2

Using the specialized version

    >>> c = INCircle(4.0)          # Calls Circle.__init__
    >>> c.radius
    4.0
    >>> c.area()                   # Calls INCircle.area
    51.20
    >>> c.perimeter()              # Calls Circle.perimeter
    25.132741228718345
    >>>

It's the same as Circle except for area()

Using Inheritance
Inheritance often used to organize objects

    class Shape(object):
        ...

    class Circle(Shape):
        ...

    class Rectangle(Shape):
        ...

Think of a logical hierarchy or taxonomy

object base class


If a class has no parent, use object as base

    class Foo(object):
        ...

object is the parent of all objects in Python

Note : Sometimes you will see code where classes are defined without any base class. That is an older style of Python coding that has been deprecated for almost 10 years. When defining a new class, you always inherit from something.

Inheritance and __init__


With inheritance, you must initialize parents

class Shape(object):
    def __init__(self):
        self.x = 0.0
        self.y = 0.0
    ...
class Circle(Shape):
    def __init__(self,radius):
        Shape.__init__(self)     # init base
        self.radius = radius
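To see why the parent must be initialized, here is a small sketch. BrokenCircle is a hypothetical class added only for contrast; it skips Shape.__init__ and so never receives the x and y attributes.

```python
class Shape(object):
    def __init__(self):
        self.x = 0.0
        self.y = 0.0

class Circle(Shape):
    def __init__(self, radius):
        Shape.__init__(self)      # sets self.x and self.y
        self.radius = radius

class BrokenCircle(Shape):        # hypothetical, for contrast only
    def __init__(self, radius):
        self.radius = radius      # Shape.__init__ is never called

good = Circle(2.0)
bad = BrokenCircle(2.0)
print(hasattr(good, "x"))         # attribute was created by Shape.__init__
print(hasattr(bad, "x"))          # attribute is missing
```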

Inheritance and methods

Calling the same method in a parent

class Foo(object):
    def spam(self):
        ...
    ...
class Bar(Foo):
    def spam(self):
        ...
        r = Foo.spam(self)
        ...

Useful if subclass only makes slight change to method in base class (or wraps it)

Calling Other Methods


With inheritance, the correct method gets called if overridden (depends on the type of self)

class Circle(object):
    def area(self):
        return math.pi*self.radius**2
    def print_area(self):
        print self.area()      # INCircle.area if self is an instance of INCircle

class INCircle(Circle):
    def area(self):
        return 3.2*self.radius**2

Example:

>>> c = INCircle(4)
>>> c.print_area()
51.2
>>>

Multiple Inheritance
You can specify multiple base classes

class Foo(object):
    ...
class Bar(object):
    ...
class Spam(Foo,Bar):
    ...

The new class inherits features from both parents
But there are some really tricky details (later)
Rule of thumb: Avoid multiple inheritance

Exercise 5.2

Time : 30 Minutes

Special Methods
Classes may define special methods

class Foo(object):
    def __init__(self):
        ...
    def __del__(self):
        ...

Have special meaning to Python interpreter
Always preceded/followed by __
There are several dozen special methods
Will show a few examples

Methods: String Conv.
Converting object into string representation

str(x)        __str__(x)
repr(x)       __repr__(x)

__str__ used by print statement
__repr__ used by repr() and interactive mode

class Foo(object):
    def __str__(self):
        s = "some string for Foo"
        return s
    def __repr__(self):
        s = "Foo(args)"
        return s

Note: The convention for __repr__() is to return a string that, when fed to eval(), will recreate the underlying object. If this is not possible, some kind of easily readable representation is used instead.
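A short sketch of the __repr__() convention. Point is a hypothetical class chosen for illustration; its __repr__ output is a valid constructor call, so eval() can rebuild an equivalent object.

```python
class Point(object):              # hypothetical class for illustration
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __str__(self):
        # Readable form, used by print and str()
        return "(%s, %s)" % (self.x, self.y)
    def __repr__(self):
        # Form that recreates the object when passed to eval()
        return "Point(%r, %r)" % (self.x, self.y)

p = Point(2, 3)
print(str(p))
print(repr(p))

# Round trip: evaluate the repr string with Point available by name
q = eval(repr(p), {"Point": Point})
print(q.x)
print(q.y)
```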
Methods: Item Access
Methods used to implement containers

len(x)          __len__(x)
x[a]            __getitem__(x,a)
x[a] = v        __setitem__(x,a,v)
del x[a]        __delitem__(x,a)

Use in a class

class Foo(object):
    def __len__(self):
        ...
    def __getitem__(self,a):
        ...
    def __setitem__(self,a,v):
        ...
    def __delitem__(self,a):
        ...

Methods: Containment
Containment operators

a in x          __contains__(x,a)
b not in x      not __contains__(x,b)

Example:

class Foo(object):
    def __contains__(self,a):
        # if a is in self, return True
        # otherwise, return False
        ...
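The container methods (item access and containment) can be tried out together. This is a sketch only: Portfolio and its _holdings dictionary are hypothetical names, and the class simply forwards each operation to an internal dict.

```python
class Portfolio(object):          # hypothetical container
    def __init__(self):
        self._holdings = {}       # name -> number of shares
    def __len__(self):
        return len(self._holdings)
    def __getitem__(self, name):
        return self._holdings[name]
    def __setitem__(self, name, shares):
        self._holdings[name] = shares
    def __delitem__(self, name):
        del self._holdings[name]
    def __contains__(self, name):
        return name in self._holdings

p = Portfolio()
p["GOOG"] = 100           # __setitem__
p["AAPL"] = 50
del p["AAPL"]             # __delitem__
print(len(p))             # __len__
print(p["GOOG"])          # __getitem__
print("GOOG" in p)        # __contains__
print("AAPL" not in p)
```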

Methods: Mathematics
Mathematical operators

a + b         __add__(a,b)
a - b         __sub__(a,b)
a * b         __mul__(a,b)
a / b         __div__(a,b)
a // b        __floordiv__(a,b)
a % b         __mod__(a,b)
a << b        __lshift__(a,b)
a >> b        __rshift__(a,b)
a & b         __and__(a,b)
a | b         __or__(a,b)
a ^ b         __xor__(a,b)
a ** b        __pow__(a,b)
-a            __neg__(a)
~a            __invert__(a)
abs(a)        __abs__(a)

Consult reference for further details
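As an illustration of a few of these methods, here is a sketch of a hypothetical 2-D vector class (Vec2 is not part of the course material). It avoids division on purpose: the slides list __div__, which is the Python 2 name, while Python 3 uses __truediv__ for the / operator.

```python
class Vec2(object):               # hypothetical 2-D vector
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __add__(self, other):     # a + b
        return Vec2(self.x + other.x, self.y + other.y)
    def __sub__(self, other):     # a - b
        return Vec2(self.x - other.x, self.y - other.y)
    def __neg__(self):            # -a
        return Vec2(-self.x, -self.y)
    def __abs__(self):            # abs(a), here the vector length
        return (self.x ** 2 + self.y ** 2) ** 0.5
    def __repr__(self):
        return "Vec2(%r, %r)" % (self.x, self.y)

a = Vec2(3, 4)
b = Vec2(1, 2)
print(a + b)
print(a - b)
print(-a)
print(abs(a))
```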


Odds and Ends
Defining new exceptions
Bound and unbound methods
Alternative attribute lookup

Defining Exceptions
User-defined exceptions are defined by classes

class NetworkError(Exception):
    pass

Exceptions always inherit from Exception
For simple exceptions, you can just make an empty class as shown above
If you are storing special attributes in the exception, there are additional details (consult a reference)
Method Invocation
Invoking a method is a two-step process
    Lookup: The . operator
    Method call: The () operator

class Foo(object):
    def bar(self,x):
        ...

>>> f = Foo()
>>> b = f.bar            # Lookup
>>> b
<bound method Foo.bar of <__main__.Foo object at 0x590d0>>
>>> b(2)                 # Method call

Unbound Methods
Methods can be accessed directly through the class

class Foo(object):
    def bar(self,a):
        ...

>>> Foo.bar
<unbound method Foo.bar>

To use it, you just have to supply an instance

>>> f = Foo()
>>> Foo.bar(f,2)
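A runnable sketch of the two access paths. The body of bar() is an assumption (the slides elide it). Note that Python 3 no longer has a separate "unbound method" type, so Foo.bar is a plain function there, but passing the instance explicitly works the same way in both versions.

```python
class Foo(object):
    def bar(self, x):
        return x * 2          # assumed body, for illustration

f = Foo()
b = f.bar                     # lookup only: a bound method that remembers f
print(b(2))                   # the call happens later
print(Foo.bar(f, 2))          # access through the class, instance passed explicitly
print(b.__self__ is f)        # the bound method carries its instance
```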

Attribute Access
These functions may be used to manipulate attributes given an attribute name string

getattr(obj,"name")           # Same as obj.name
setattr(obj,"name",value)     # Same as obj.name = value
delattr(obj,"name")           # Same as del obj.name
hasattr(obj,"name")           # Tests if attribute exists

Example: Probing for an optional attribute

if hasattr(obj,"x"):
    x = getattr(obj,"x")
else:
    x = None

Note: getattr() has a useful default value arg

x = getattr(obj,"x",None)

Summary
A high-level overview of classes
How to define classes
How to define methods
Creating and using objects
Special python methods
Overloading

Exercise 5.3

Time : 15 Minutes


Section 6

Inside the Python Object Model

Overview
A few more details about how objects work
How objects are represented
Details of attribute access
Data encapsulation
Memory management

Dictionaries Revisited
A dictionary is a collection of named values

stock = {
    'name' : 'GOOG',
    'shares' : 100,
    'price' : 490.10
}

Dictionaries are commonly used for simple data structures (shown above)
However, they are used for critical parts of the interpreter and may be the most important type of data in Python

Dicts and Functions


When a function executes, a dictionary holds all of the local variables

def read_prices(filename):
    prices = { }
    for line in open(filename):
        fields = line.split()
        prices[fields[0]] = float(fields[1])
    return prices

locals()

{
    'filename' : 'prices.dat',
    'line'     : 'GOOG 523.12',
    'prices'   : { ... },
    'fields'   : ['GOOG', '523.12']
}

Dicts and Modules


In a module, a dictionary holds all of the global variables and functions

# foo.py
x = 42
def bar():
    ...
def spam():
    ...

foo.__dict__ or globals()

{
    'x' : 42,
    'bar' : <function bar>,
    'spam' : <function spam>
}

Dicts and Objects


User-defined objects also use dictionaries
    Instance data
    Class members
In fact, the entire object system is mostly just an extra layer that's put on top of dictionaries
Let's take a look...

Dicts and Instances


A dictionary holds instance data (__dict__)

>>> s = Stock('GOOG',100,490.10)
>>> s.__dict__
{'name' : 'GOOG','shares' : 100, 'price': 490.10 }

You populate this dict when assigning to self

class Stock(object):
    def __init__(self,name,shares,price):
        self.name = name
        self.shares = shares
        self.price = price

self.__dict__ (instance data)

{
    'name' : 'GOOG',
    'shares' : 100,
    'price' : 490.10
}

Dicts and Instances


Critical point: Each instance gets its own private dictionary

s = Stock('GOOG',100,490.10)
t = Stock('AAPL',50,123.45)

s:  { 'name' : 'GOOG', 'shares' : 100, 'price' : 490.10 }
t:  { 'name' : 'AAPL', 'shares' : 50, 'price' : 123.45 }

So, if you created 100 instances of some class, there are 100 dictionaries sitting around holding data

Dicts and Classes


A dictionary holds the members of a class

class Stock(object):
    def __init__(self,name,shares,price):
        self.name = name
        self.shares = shares
        self.price = price
    def cost(self):
        return self.shares*self.price
    def sell(self,nshares):
        self.shares -= nshares

Stock.__dict__ (methods)

{
    'cost' : <function>,
    'sell' : <function>,
    '__init__' : <function>,
}

Instances and Classes


Instances and classes are linked together
__class__ attribute refers back to the class

>>> s = Stock('GOOG',100,490.10)
>>> s.__dict__
{'name':'GOOG','shares':100,'price':490.10 }
>>> s.__class__
<class '__main__.Stock'>
>>>

The instance dictionary holds data unique to each instance whereas the class dictionary holds data collectively shared by all instances

Instances and Classes


[Diagram: three instances, each with its own .__dict__ holding {attrs} and a .__class__ link to the same class; the class has a .__dict__ holding {methods}]

Attribute Access
When you work with objects, you access data and methods using the (.) operator

x = obj.name          # Getting
obj.name = value      # Setting
del obj.name          # Deleting

These operations are directly tied to the dictionaries sitting underneath the covers

Modifying Instances
Operations that modify an object always update the underlying dictionary

>>> s = Stock('GOOG',100,490.10)
>>> s.__dict__
{'name':'GOOG', 'shares':100, 'price':490.10 }
>>> s.shares = 50
>>> s.date = "6/7/2007"
>>> s.__dict__
{ 'name':'GOOG', 'shares':50, 'price':490.10, 'date':'6/7/2007'}
>>> del s.shares
>>> s.__dict__
{ 'name':'GOOG', 'price':490.10, 'date':'6/7/2007'}
>>>

A Caution
Setting an instance attribute will silently override class members with the same name

>>> s = Stock('GOOG',100,490.10)
>>> s.cost()
49010.0
>>> s.cost = "a lot"
>>> s.cost()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: 'str' object is not callable
>>> s.cost
'a lot'
>>>

The value in the instance dictionary is hiding the one in the class dictionary
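The hiding can be observed directly in the two dictionaries. The sketch below uses a trimmed copy of the Stock class from the earlier slides (sell() is omitted) and shows that deleting the instance entry makes the class method visible again.

```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price
    def cost(self):
        return self.shares * self.price

s = Stock('GOOG', 100, 490.10)
print(sorted(s.__dict__))             # instance data only
print('cost' in s.__dict__)           # the method is not stored on the instance
print('cost' in Stock.__dict__)       # it lives in the class dictionary

s.cost = "a lot"                      # stored in s.__dict__, hides Stock.cost
print(s.cost)
del s.cost                            # removes only the instance entry
print(callable(s.cost))               # lookup falls through to the class again
```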
Modifying Instances
It may be surprising that instances can be extended after creation

>>> s = Stock('GOOG',100,490.10)
>>> s.blah = "some new attribute"
>>> del s.name
>>>

You can freely change attributes at any time
Again, you're just manipulating a dictionary
Very different from C++/Java where the structure of an object is rigidly fixed

Reading Attributes
Suppose you read an attribute on an instance

x = obj.name

Attribute may exist in two places
    Local instance dictionary
    Class dictionary
So, both dictionaries may be checked

Reading Attributes
First check in local __dict__
If not found, look in __dict__ of class

>>> s = Stock(...)
>>> s.name
'GOOG'
>>> s.cost()
49010.0
>>>

[Diagram: (1) s.__dict__ = {'name': 'GOOG', 'shares': 100 } is checked first; (2) s.__class__ leads to Stock, whose __dict__ = {'cost': <func>, 'sell':<func>, '__init__':..} is checked next]

This lookup scheme is how the members of a class get shared by all instances

Exercise 6.1

Time : 10 Minutes


How Inheritance Works


Classes may inherit from other classes

class A(B,C):
    ...

Bases are stored as a tuple in each class

>>> A.__bases__
(<class '__main__.B'>,<class '__main__.C'>)
>>>

This provides a link to parent classes
This link simply extends the search process used to find attributes

Reading Attributes
First check in local __dict__
If not found, look in __dict__ of class
If not found in class, look in base classes

>>> s = Stock(...)
>>> s.name
'GOOG'
>>> s.cost()
49010.0
>>>

[Diagram: (1) s.__dict__ = {'name': 'GOOG', 'shares': 100 }; (2) s.__class__ leads to Stock, whose __dict__ = {'cost': <func>, 'sell':<func>, '__init__':..}; (3) look in Stock.__bases__]

Single Inheritance
In inheritance hierarchies, attributes are found by walking up the inheritance tree

class A(object): pass
class B(A): pass
class C(A): pass
class D(B): pass
class E(D): pass

[Diagram: object at the top; A below it; B and C below A; D below B; E below D; e is an instance of E]

With single inheritance, there is a single path to the top
You stop with the first match

e = E()
e.attr

Multiple Inheritance
Consider this hierarchy

class A(object): pass
class B(object): pass
class C(A,B): pass
class D(B): pass
class E(C,D): pass

[Diagram: object at the top; A and B below it; C inherits from A and B; D inherits from B; E inherits from C and D]

What happens here?

e = E()
e.attr

A similar search process is carried out, but there is an added complication in that there may be many possible search paths

Multiple Inheritance
For multiple inheritance, Python determines a "method resolution order" that sets the order in which base classes get checked

>>> E.__mro__
(<class '__main__.E'>, <class '__main__.C'>,
 <class '__main__.A'>, <class '__main__.D'>,
 <class '__main__.B'>, <type 'object'>)
>>>

The MRO is described in a 20 page math paper (C3 Linearization Algorithm)
Note: This complexity is one reason why multiple inheritance is often avoided
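The order can be checked for the hierarchy on the previous slide. In this sketch the attr values are made up for illustration; an attribute defined on both A and B is found on A first because A precedes B in the MRO of E.

```python
class A(object): pass
class B(object): pass
class C(A, B): pass
class D(B): pass
class E(C, D): pass

# Class names in the order they are searched for attributes
order = [cls.__name__ for cls in E.__mro__]
print(order)

# Define the same attribute in two places and see which one wins
A.attr = "from A"
B.attr = "from B"
print(E().attr)
```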
Classes and Encapsulation
One of the primary roles of a class is to encapsulate data and internal implementation details of an object
However, a class also defines a "public" interface that the outside world is supposed to use to manipulate the object
This distinction between implementation details and the public interface is important

A Problem
In Python, almost everything about classes and objects is "open"
    You can easily inspect object internals
    You can change things at will
    There's no strong notion of access-control (i.e., private class members)
If you're trying to cleanly separate the internal "implementation" from the "interface" this becomes an issue

Python Encapsulation
Python relies on programming conventions to indicate the intended use of something
Typically, this is based on naming
There is a general attitude that it is up to the programmer to observe the rules as opposed to having the language enforce rules

Private Attributes
Any attribute name with leading __ is "private"
Example

class Foo(object):
    def __init__(self):
        self.__x = 0

>>> f = Foo()
>>> f.__x
AttributeError: 'Foo' object has no attribute '__x'
>>>

This is actually just a name mangling trick

>>> f = Foo()
>>> f._Foo__x
0
>>>

Private Methods
Private naming also applies to methods
Example:

class Foo(object):
    def __spam(self):
        print "Foo.__spam"
    def callspam(self):
        self.__spam()        # Uses Foo.__spam

>>> f = Foo()
>>> f.callspam()
Foo.__spam
>>> f.__spam()
AttributeError: 'Foo' object has no attribute '__spam'
>>> f._Foo__spam()
Foo.__spam
>>>
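The mangled names are visible in the instance and class dictionaries. This sketch combines the private attribute and private method examples into one class; __spam returns its string instead of printing it so the result is easy to inspect.

```python
class Foo(object):
    def __init__(self):
        self.__x = 0              # stored as _Foo__x
    def __spam(self):             # stored as _Foo__spam
        return "Foo.__spam"
    def callspam(self):
        return self.__spam()      # mangled inside the class body too

f = Foo()
print(sorted(f.__dict__))         # shows the mangled attribute name
print(hasattr(f, "__x"))          # the unmangled name does not exist
print(f._Foo__x)
print(f.callspam())
print(f._Foo__spam())
```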

Performance Commentary
Python name mangling in classes occurs at the time a class is defined, not at run time
Thus, there is no performance overhead for using this feature
It's actually a somewhat ingenious approach since it avoids all runtime checks for "public" vs. "private" (so the interpreter runs faster overall)

Accessor Methods
Consider controlling access to internals using method calls.

class Foo(object):
    def __init__(self,name):
        self.__name = name
    def getName(self):
        return self.__name
    def setName(self,name):
        if not isinstance(name,str):
            raise TypeError("Expected a string")
        self.__name = name

Methods give you more flexibility and may become useful as your program grows
May make it easier to integrate the object into a larger framework of objects

Properties
Accessor methods can optionally be turned into "property" attributes:

class Foo(object):
    def __init__(self,name):
        self.__name = name
    def getName(self):
        return self.__name
    def setName(self,name):
        if not isinstance(name,str):
            raise TypeError("Expected a string")
        self.__name = name
    name = property(getName,setName)

Properties look like normal attributes, but implicitly call the accessor methods.

Properties
Example use:

>>> f = Foo("Elwood")
>>> f.name                 # Calls f.getName()
'Elwood'
>>> f.name = 'Jake'        # Calls f.setName('Jake')
>>> f.name = 45            # Calls f.setName(45)
TypeError: Expected a string
>>>

Comment: With properties, you can hide extra processing behind access to data attributes, something that is useful in certain settings
Example: Type checking
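The type-checking property can be run end to end. The class is the one from the slides; the try/except wrapper and the error variable are additions so the rejected assignment can be inspected without stopping the script.

```python
class Foo(object):
    def __init__(self, name):
        self.__name = name
    def getName(self):
        return self.__name
    def setName(self, name):
        if not isinstance(name, str):
            raise TypeError("Expected a string")
        self.__name = name
    name = property(getName, setName)

f = Foo("Elwood")
print(f.name)             # goes through getName()
f.name = "Jake"           # goes through setName()
try:
    f.name = 45           # rejected by the type check in setName()
except TypeError as e:
    error = str(e)
print(f.name)             # unchanged by the failed assignment
print(error)
```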


Properties
Properties are also useful if you are creating objects where you want to have a very consistent user interface
Example: Computed data attributes

class Circle(object):
    def __init__(self,radius):
        self.radius = radius
    def area(self):
        return math.pi*self.radius**2
    area = property(area)
    def perimeter(self):
        return 2*math.pi*self.radius
    perimeter = property(perimeter)

Properties
Example use:

>>> c = Circle(4)
>>> c.radius               # Instance variable
4
>>> c.area                 # Computed property
50.26548245743669
>>> c.perimeter            # Computed property
25.132741228718345

Commentary: Notice how there is no obvious difference between the attributes as seen by the user of the object

Uniform Access
The last example shows how to put a more uniform interface on an object. If you don't do this, an object might be confusing to use:

>>> c = Circle(4.0)
>>> a = c.area()        # Method
>>> r = c.radius        # Data attribute
>>>

Why is the () required for the area, but not for the radius?

Properties
It is very common for a property to replace a method of the same name

class Circle(object):
    ...
    def area(self):                       # Notice the identical names
        return math.pi*self.radius**2
    area = property(area)
    def perimeter(self):
        return 2*math.pi*self.radius
    perimeter = property(perimeter)

When you define a property, you usually don't want the user to explicitly call the method
So, this trick hides the method and only exposes the property attribute

Decorators
There is an alternative way to make properties
Use a "decorator"

class Circle(object):
    ...
    def area(self):
        return math.pi*self.radius**2
    area = property(area)
    ...

class Circle(object):
    ...
    @property
    def area(self):
        return math.pi*self.radius**2
    ...

An advanced topic, but a "decorator" is basically a modifier applied to functions
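A complete version of the decorator form, as a sketch. The __init__ method is an assumption filled in from the earlier Circle slides; because no setter is defined, assigning to area raises AttributeError.

```python
import math

class Circle(object):
    def __init__(self, radius):
        self.radius = radius
    @property
    def area(self):
        return math.pi * self.radius ** 2
    @property
    def perimeter(self):
        return 2 * math.pi * self.radius

c = Circle(4)
print(c.radius)
print(c.area)             # no parentheses: computed on each access
c.radius = 1
print(c.area)             # reflects the new radius
try:
    c.area = 10           # no setter was defined
except AttributeError:
    print("area is read-only")
```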


__slots__ Attribute
You can restrict the set of attribute names

class Foo(object):
    __slots__ = ['x','y']
    ...

Produces errors for other attributes

>>> f = Foo()
>>> f.x = 3
>>> f.y = 20
>>> f.z = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'Foo' object has no attribute 'z'

Prevents errors, restricts usage of objects
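A short sketch of the restriction in action. It also shows one consequence not stated on the slide: an instance of a class that defines __slots__ (and inherits only from object) has no per-instance __dict__.

```python
class Foo(object):
    __slots__ = ['x', 'y']

f = Foo()
f.x = 3
f.y = 20
try:
    f.z = 1                       # not listed in __slots__
except AttributeError:
    print("attribute 'z' is not allowed")
print(hasattr(f, '__dict__'))     # slotted instances carry no instance dict
```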



Exercise 6.2

Time : 10 Minutes

Memory Management
Python has automatic memory management and garbage collection
As a general rule, you do not have to worry about cleaning up when you define your own classes
However, there are a couple of additional details to be aware of

Instance Creation
Instances are actually created in two steps

>>> a = Stock('GOOG',100,490.10)

Performs these steps

a = Stock.__new__(Stock,'GOOG',100,490.10)
a.__init__('GOOG',100,490.10)

__new__() constructs the raw instance
__init__() initializes it
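The two steps can be carried out by hand. In this sketch __new__ is called with only the class (the extra arguments shown on the slide are left out to keep the example simple), and the before variable is an addition that captures the empty state between the two steps.

```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

a = Stock.__new__(Stock)           # step 1: raw instance, __init__ not yet run
before = dict(a.__dict__)          # snapshot: no attributes have been set
a.__init__('GOOG', 100, 490.10)    # step 2: initialize it
print(before)
print(a.name)
```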


__new__() method
Creates a new (empty) object
If you see this defined in a class, it almost always indicates the presence of heavy wizardry
    Inheritance from an immutable type
    Definition of a metaclass
Both topics are beyond the scope of this class
Bottom line: You don't see this in "normal" code

Instance Deletion
All objects are reference counted
Destroyed when refcount=0
When an instance is destroyed, the reference count of all instance data is also decreased
Normally, all of this just "works" and you don't worry about it

Object Deletion: Cycles


There are some issues with cycles

class A(object): pass
class B(object): pass
a = A()
b = B()
a.b = b
b.a = a

[Diagram: the two objects refer to each other through .b and .a; each has ref=2]

Deletion

del a
del b

[Diagram: the two objects still refer to each other through .b and .a; each has ref=1]

Objects live, but no way to access

Garbage Collection
An extra garbage collector looks for cycles
It runs automatically, but you can control it

import gc
gc.collect()      # Run full garbage collection now
gc.disable()      # Disable garbage collection
gc.enable()       # Enable garbage collection

gc also has some diagnostic functions
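The cycle from the previous slide can be watched being collected. This is a sketch that assumes CPython's reference counting; the weakref probe is an addition that lets the script ask whether the object is still alive without keeping it alive.

```python
import gc
import weakref

class A(object): pass
class B(object): pass

gc.disable()                  # keep the automatic collector out of the way
a = A()
b = B()
a.b = b
b.a = a
probe = weakref.ref(a)        # does not count as a reference to a

del a
del b
alive_before = probe() is not None   # the cycle keeps both objects alive
gc.collect()                         # the cycle collector finds and frees them
alive_after = probe() is not None
gc.enable()

print(alive_before)
print(alive_after)
```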


__del__ method
Classes may define a destructor method

class Stock(object):
    ...
    def __del__(self):
        # Cleanup

Called when the reference count reaches 0
A common confusion: __del__() is not necessarily triggered by the del operator.

s = Stock('GOOG',100,490.10)
t = s
del s         # Does not call s.__del__()
t = 42        # Calls s.__del__() (refcount=0)

__del__ method
Don't define __del__ without a good reason
Typical uses:
    Proper shutdown of system resources (e.g., network connections)
    Releasing locks (e.g., threading)
Avoid defining it for any other purpose

Exercise 6.3

Optional

Section 7

Documentation, Testing, and Debugging

Overview
Documenting programs
Testing (doctest, unittest)
Error handling/diagnostics
Debugging
Profiling

Documentation
Python has both comments and docstrings

# Compute the greatest common divisor. Uses clever
# tuple hack to avoid an extra temporary variable.
def gcd(x,y):
    """Compute the greatest common divisor of x and y.
    For example:

    >>> gcd(40,16)
    8
    >>>
    """
    while x > 0: x,y = y%x,x
    return y

Documentation
Comments should be used to provide notes to developers reading the source code
Doc strings should be used to provide documentation to end-users.
Emphasize: Documentation strings are considered good Python style.

Uses of Docstrings
Most Python IDEs use doc strings to provide user information and help
Online help

Testing: doctest module


A module that runs tests from docstrings
Look at docstrings for interactive sessions

# Compute the greatest common divisor. Uses clever
# tuple hack to avoid an extra temporary variable.
def gcd(x,y):
    """Compute the greatest common divisor of x and y.
    For example:

    >>> gcd(40,16)
    8
    >>>
    """
    while x > 0: x,y = y%x,x
    return y

Using doctest
Create a separate file that loads the module

# testgcd.py
import gcd
import doctest

Add the following code to test a module

doctest.testmod(gcd)

To run the tests, do this

% python testgcd.py
%

If successful, will get no output

Using doctest
Test failures produce a report

% python testgcd.py
********************************************************
File "/Users/beazley/pyfiles/gcd.py", line 7, in gcd.gcd
Failed example:
    gcd(40,16)
Expected:
    8
Got:
    7
********************************************************
1 items had failures:
   1 of   1 in gcd.gcd
***Test Failed*** 1 failures.
%

Self-testing
Modules may be set up to test themselves

# gcd.py
def gcd(x,y):
    """
    ...
    """

if __name__ == '__main__':
    import doctest
    doctest.testmod()

Will run tests if executed as main program

% python gcd.py
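The same docstring tests can also be run on a single function and the result inspected in code. This sketch uses doctest's DocTestFinder and DocTestRunner classes rather than testmod(); the second docstring example (gcd(7, 13)) is an addition.

```python
import doctest

def gcd(x, y):
    """Compute the greatest common divisor of x and y.

    >>> gcd(40, 16)
    8
    >>> gcd(7, 13)
    1
    """
    while x > 0:
        x, y = y % x, x
    return y

# Collect the examples found in gcd's docstring and run them
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(gcd, "gcd", globs={"gcd": gcd}):
    runner.run(test)
results = runner.summarize(verbose=False)
print(results.failed)         # number of failing examples
print(results.attempted)      # number of examples that were run
```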

Exercise 7.1

Time : 10 Minutes

Testing: unittest
unittest module
A more formal testing framework
Class-based
Better suited for exhaustive testing
Can deal with more complex test cases

Using unittest
First, you create a separate file

# testgcd.py
import gcd
import unittest

Then you define a testing class

class TestGCDFunction(unittest.TestCase):
    ...

Must inherit from unittest.TestCase

Using unittest
Define testing methods

class TestGCDFunction(unittest.TestCase):
    def testsimple(self):
        # Test with simple integer arguments
        g = gcd.gcd(40,16)
        self.assertEqual(g,8)
    def testfloat(self):
        # Test with floats. Should be error
        self.assertRaises(TypeError,gcd.gcd,3.5,4.2)
    def testlong(self):
        # Test with long integers
        g = gcd.gcd(23748928388L, 6723884L)
        self.assertEqual(g,4L)
        self.assert_(type(g) is long)

Each method must start with "test..."

Using unittest
Each test works through assertions

# Assert that expr is True
self.assert_(expr)

# Assert that x == y
self.assertEqual(x,y)

# Assert that x != y
self.assertNotEqual(x,y)

# Assert that callable(*args,**kwargs) raises a given
# exception
self.assertRaises(exc,callable,*args,**kwargs)

There are several other tests, but these are the main ones

Running unittests
To run tests, you add the following code

# testgcd.py
class TestGCDFunction(unittest.TestCase):
    ...

unittest.main()

Then run Python on the test file

% python testgcd.py
...
---------------------------------------------------
Ran 3 tests in 0.000s

OK
%

Setup and Teardown
Two additional methods can be defined

# testgcd.py
class TestGCDFunction(unittest.TestCase):
    def setUp(self):
        # Perform setup before running a test
        ...
    def tearDown(self):
        # Perform cleanup after running a test
        ...

Can be used to set up environment prior to running a test
Called before and after each test method
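A self-contained sketch that puts the pieces together in one file. Unlike the slides, gcd() is defined inline rather than imported, the test data in setUp() is made up, and the tests are run through a loader and runner instead of unittest.main() so the interpreter does not exit afterwards.

```python
import unittest

def gcd(x, y):
    while x > 0:
        x, y = y % x, x
    return y

class TestGCDFunction(unittest.TestCase):
    def setUp(self):
        # Runs before every test method
        self.cases = [(40, 16, 8), (7, 13, 1), (12, 18, 6)]
    def tearDown(self):
        # Runs after every test method
        self.cases = None
    def testsimple(self):
        for x, y, expected in self.cases:
            self.assertEqual(gcd(x, y), expected)
    def testbadtype(self):
        # A string argument cannot take part in the arithmetic
        self.assertRaises(TypeError, gcd, "40", 16)

# Load and run the tests programmatically
suite = unittest.TestLoader().loadTestsFromTestCase(TestGCDFunction)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun)
print(result.wasSuccessful())
```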


unittest comments
There is an art to effective unit testing
Can grow to be quite complicated for large applications
The unittest module has a huge number of options related to test runners, collection of results, and other aspects of testing (consult documentation for details)

Exercise 7.2

Time : 15 Minutes

Assertions
assert statement

assert expr [, "diagnostic message" ]

If expression is not true, raises AssertionError exception
Should not be used to check user-input
Use for internal program checking

Contract Programming
Consider assertions on all inputs and outputs
Checking inputs will immediately catch callers who aren't using appropriate arguments

def gcd(x,y):
    assert x > 0                  # Pre-assertions
    assert y > 0
    while x > 0: x,y = y%x,x
    assert y > 0                  # Post-assertion
    return y

Checking outputs may catch buggy algorithms
Both approaches prevent errors from propagating to other parts of the program
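A runnable sketch of the contract-style gcd(). The diagnostic messages are additions, and the example assumes Python is running in normal mode, since assert statements are removed under -O.

```python
def gcd(x, y):
    assert x > 0, "x must be positive"      # pre-assertions
    assert y > 0, "y must be positive"
    while x > 0:
        x, y = y % x, x
    assert y > 0                            # post-assertion
    return y

print(gcd(40, 16))
try:
    gcd(0, 16)                              # violates the first pre-assertion
except AssertionError as e:
    message = str(e)
print(message)
```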
Optimized mode
Python has an optimized run mode

python -O foo.py

This strips all assert statements
Allows debug/release mode development
    Normal mode for full debugging
    Optimized mode for faster production runs

__debug__ variable
Global variable checked for debugging
By default, __debug__ is True
Set False in optimized mode (python -O)

if __debug__:
    # Perform some kind of debugging code
    ...

The implementation is efficient. The if statement is stripped in both cases and in -O mode, the debugging code is stripped entirely.

How to Handle Errors


Uncaught exceptions

% python blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in ?
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", line 4, in spam
    x.append(3)
AttributeError: 'int' object has no attribute 'append'

Program prints traceback, exits

Creating Tracebacks
How to create a traceback yourself

import traceback
try:
    ...
except:
    traceback.print_exc()

Sending a traceback to a file

traceback.print_exc(file=f)

Getting traceback as a string

err = traceback.format_exc()
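A small sketch of capturing a traceback as a string. The spam() function is modeled on the failing call in the earlier traceback listing; its body is an assumption.

```python
import traceback

def spam():
    x = 3
    x.append(3)               # an int has no append(); raises AttributeError

try:
    spam()
except AttributeError:
    err = traceback.format_exc()      # the traceback text as a string

print(err)
```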
Error Handling
Keeping Python alive upon termination

% python -i blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in ?
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", line 4, in spam
    x.append(3)
AttributeError: 'int' object has no attribute 'append'
>>>

Python enters normal interactive mode
Can use to examine global data, objects, etc.

The Python Debugger
pdb module
Entering the debugger after a crash

% python -i blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in ?
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", line 4, in spam
    x.append(3)
AttributeError: 'int' object has no attribute 'append'
>>> import pdb
>>> pdb.pm()
> /Users/beazley/python/blah.py(4)spam()
-> x.append(3)
(Pdb)

The Python Debugger


Launching the debugger inside a program

import pdb
def some_function():
    statements
    ...
    pdb.set_trace()        # Enter the debugger
    ...
    statements

This starts the debugger at the point of the set_trace() call

Python Debugger
Common debugger commands

(Pdb) help           # Get help
(Pdb) w(here)        # Print stack trace
(Pdb) d(own)         # Move down one stack level
(Pdb) u(p)           # Move up one stack level
(Pdb) b(reak) loc    # Set a breakpoint
(Pdb) s(tep)         # Execute one instruction
(Pdb) c(ontinue)     # Continue execution
(Pdb) l(ist)         # List source code
(Pdb) a(rgs)         # Print args of current function
(Pdb) !statement     # Execute statement

For breakpoints, location is one of

(Pdb) b 45            # Line 45 in current file
(Pdb) b file.py:45    # Line 45 in file.py
(Pdb) b foo           # Function foo() in current file
(Pdb) b module.foo    # Function foo() in a module

Debugging Example
Obtaining a stack trace

(Pdb) w
  /Users/beazley/Teaching/python/blah.py(13)?()
-> foo()
  /Users/beazley/Teaching/python/blah.py(10)foo()
-> bar()
  /Users/beazley/Teaching/python/blah.py(7)bar()
-> spam()
> /Users/beazley/Teaching/python/blah.py(4)spam()
-> x.append(3)
(Pdb)

Examining a variable

(Pdb) print x
-1
(Pdb)

Debugging Example
Getting a source listing

(Pdb) list
  1     # Compute the greatest common divisor. Uses clever
  2     # tuple hack to avoid an extra temporary variable
  3     def gcd(x,y):
  4  ->     while x > 0: x,y = y%x,x
  5         return y
[EOF]
(Pdb)

Setting a breakpoint

(Pdb) b 5
Breakpoint 1 at /Users/beazley/Teaching/gcd.py:5
(Pdb)

Running until break or completion

(Pdb) cont
> /Users/beazley/Teaching/NerdRanch/Testing/gcd.py(5)gcd()
-> return y

Debugging Example
Debugging a function call
    >>> import gcd
    >>> import pdb
    >>> pdb.runcall(gcd.gcd,42,14)
    -> while x > 0: x,y = y%x,x
    (Pdb)

Single stepping
    (Pdb) print x,y
    42 14
    (Pdb) s
    > /Users/beazley/Teaching/python/gcd.py(4)gcd()
    -> while x > 0: x,y = y%x,x
    (Pdb) print x,y
    14 42
    (Pdb) s

Python Debugger
Running entire program under debugger
    % python -m pdb someprogram.py

Automatically enters the debugger before the first statement (allowing you to set breakpoints and change the configuration)

Profiling
profile module
Collects performance statistics and prints a report
    import profile
    profile.run("command")

Uses exec to execute the given command
A command line alternative
    % python -m profile someprogram.py

Profile Sample Output
    % python -m profile cparse.py
    447981 function calls (446195 primitive calls) in 5.640 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
         2    0.000    0.000    0.000    0.000  :0(StringIO)
    101599    0.470    0.000    0.470    0.000  :0(append)
        56    0.000    0.000    0.000    0.000  :0(callable)
         4    0.000    0.000    0.000    0.000  :0(close)
      1028    0.010    0.000    0.010    0.000  :0(cmp)
         4    0.000    0.000    0.000    0.000  :0(compile)
         1    0.000    0.000    0.000    0.000  :0(digest)
         2    0.000    0.000    0.000    0.000  :0(exc_info)
         1    0.000    0.000    5.640    5.640  :0(execfile)
         4    0.000    0.000    0.000    0.000  :0(extend)
        50    0.000    0.000    0.000    0.000  :0(find)
     83102    0.430    0.000    0.430    0.000  :0(get)
    ...
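The profiling steps above can also be driven from inside a program. The following is a minimal sketch, assuming Python 3 and using cProfile (the C-accelerated sibling of the profile module) together with pstats; the busy() workload is hypothetical and exists only to give the profiler something to measure.

```python
import cProfile
import io
import pstats

def busy(n):
    """Hypothetical workload: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Profile only the region between enable() and disable()
profiler = cProfile.Profile()
profiler.enable()
result = busy(100000)
profiler.disable()

# Collect the report in a string instead of printing straight to stdout
out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats("cumulative").print_stats(5)   # show the top 5 entries
report = out.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the high-level functions responsible for most of the run time, which is usually the first thing worth looking at.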

Summary
Documentation strings
Testing with doctest
Testing with unittest
Debugging (pdb)
Profiling

Exercise 7.3

Time : 10 Minutes


Section 8

Iterators and Generators

Overview
Iteration protocol and iterators
Generator functions
Generator expressions

Iteration
A simple definition: Looping over items
    a = [2,4,10,37,62]
    # Iterate over a
    for x in a:
        ...

A very common pattern: loops, list comprehensions, etc.
Most programs do a huge amount of iteration

Iteration: Protocol
An inside look at the for statement
    for x in obj:
        # statements

Underneath the covers
    _iter = obj.__iter__()        # Get iterator object
    while 1:
        try:
            x = _iter.next()      # Get next item
        except StopIteration:     # No more items
            break
        # statements
        ...

Any object that supports __iter__() and next() is said to be "iterable."

Iteration: Protocol
Manual iteration over a list
    >>> x = [1,2,3]
    >>> it = x.__iter__()
    >>> it
    <listiterator object at 0x590b0>
    >>> it.next()
    1
    >>> it.next()
    2
    >>> it.next()
    3
    >>> it.next()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    StopIteration
    >>>
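The protocol above can be written out as an ordinary function. This is a sketch assuming Python 3, where the built-ins iter() and next() call __iter__() and __next__() (the method is named next() in Python 2); manual_for is a hypothetical helper name.

```python
def manual_for(obj):
    """Collect items the way a for-loop would, using the iteration protocol."""
    items = []
    it = iter(obj)                # calls obj.__iter__()
    while True:
        try:
            x = next(it)          # calls it.__next__() (it.next() in Python 2)
        except StopIteration:     # no more items
            break
        items.append(x)           # stands in for the loop body
    return items

result = manual_for([1, 2, 3])
print(result)
```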

Iterators Everywhere
Most built in types support iteration
    a = "hello"
    for c in a:          # Loop over characters in a
        ...
    b = { 'name': 'Dave', 'password':'foo'}
    for k in b:          # Loop over keys in dictionary
        ...
    c = [1,2,3,4]
    for i in c:          # Loop over items in a list/tuple
        ...
    f = open("foo.txt")
    for x in f:          # Loop over lines in a file
        ...

Exercise 8.1

Time : 5 Minutes


Supporting Iteration
User-defined objects can support iteration
Example: Counting down...
    >>> for x in countdown(10):
    ...     print x,
    ...
    10 9 8 7 6 5 4 3 2 1
    >>>

To make this work with your own object, you just have to provide a few methods

Supporting Iteration
Sample implementation
    class countdown(object):
        def __init__(self,start):
            self.start = start
        def __iter__(self):
            return countdown_iter(self.start)

    class countdown_iter(object):
        def __init__(self,start):
            self.count = start
        def next(self):
            if self.count <= 0:
                raise StopIteration
            r = self.count
            self.count -= 1
            return r

Supporting Iteration
Notes on the sample implementation
countdown is the object that defines an iterator: it must define __iter__(), which creates an iterator object
countdown_iter is the class which is the iterator object
The iterator must define next() to return items

Iteration Example
Example use:
    >>> c = countdown(5)
    >>> for i in c:
    ...     print i,
    ...
    5 4 3 2 1
    >>> for i in c:
    ...     for j in c:
    ...         print "(%d,%d)" % (i,j)
    ...
    (5,5)
    (5,4)
    (5,3)
    ...
    (1,2)
    (1,1)
    >>>

Iteration Example
Why two classes?
Iterator is object that operates on another object (usually)
May be iterating in more than one place (nested loops)

[Diagram: one countdown object (.start = 5) whose __iter__() has produced two separate countdown_iter objects, one with .count = 4 and one with .count = 2]
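The reason for two classes shows up in nested loops: each loop needs its own position. Below is a sketch of the same design, assuming Python 3 (so the iterator method is __next__() instead of next()); the capitalized class names are renamed for illustration only.

```python
class Countdown:
    """Iterable: holds the starting value, hands out fresh iterators."""
    def __init__(self, start):
        self.start = start
    def __iter__(self):
        return CountdownIter(self.start)

class CountdownIter:
    """Iterator: holds the state of one pass over the countdown."""
    def __init__(self, start):
        self.count = start
    def __iter__(self):
        return self               # iterators conventionally return themselves
    def __next__(self):           # named next() in Python 2
        if self.count <= 0:
            raise StopIteration
        r = self.count
        self.count -= 1
        return r

c = Countdown(3)
# Inner and outer loops each get an independent CountdownIter
pairs = [(i, j) for i in c for j in c]
print(pairs)
```

If Countdown kept the counter itself and returned self from __iter__(), the inner loop would exhaust the shared counter and the outer loop would stop after one pass.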

Exercise 8.2

Time : 15 Minutes


Defining Iterators
Is there an easier way to do this?
Iteration is extremely useful
Annoying to have to define __iter__(), create iterator objects, and manage everything

Generators
A function that defines an iterator
    def countdown(n):
        while n > 0:
            yield n
            n -= 1

    >>> for i in countdown(5):
    ...     print i,
    ...
    5 4 3 2 1
    >>>

Any function that uses yield is a generator

Generator Functions
Behavior is totally different from a normal function
Calling a generator function creates a generator object. It does not start running the function.
    def countdown(n):
        print "Counting down from", n
        while n > 0:
            yield n
            n -= 1

    >>> x = countdown(10)        # Notice that no output was produced
    >>> x
    <generator object at 0x58490>
    >>>

Generator Functions
Function only executes on next()
    >>> x = countdown(10)
    >>> x
    <generator object at 0x58490>
    >>> x.next()
    Counting down from 10        # Function starts executing here
    10
    >>>

yield produces a value, but suspends function
Function resumes on next call to next()
    >>> x.next()
    9
    >>> x.next()
    8
    >>>

Generator Functions
When the generator returns, iteration stops
    >>> x.next()
    1
    >>> x.next()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    StopIteration
    >>>
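The start/suspend/stop behavior described above can be observed directly. A minimal sketch, assuming Python 3 (next(x) instead of x.next()); the log list is a hypothetical stand-in for the print statement so the timing of execution can be checked.

```python
log = []

def countdown(n):
    log.append("started")         # runs only when the first value is requested
    while n > 0:
        yield n                   # produce a value, then suspend here
        n -= 1

x = countdown(2)                  # creates the generator; no code has run yet
started_before = list(log)

first = next(x)                   # function starts executing here
second = next(x)                  # resumes just after the yield

try:
    next(x)                       # function returns, so iteration stops
    finished = False
except StopIteration:
    finished = True

print(started_before, log, first, second, finished)
```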

Generator Example
Follow a file, yielding new lines as added
    import time

    def follow(f):
        f.seek(0,2)                  # Seek to end of a file
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.1)      # Sleep for a bit
                continue             # and try again
            yield line

Observe an active log file
    f = open("access-log")
    for line in follow(f):
        # Process the line
        print line

Generators vs. Iterators
A generator function is slightly different than an object that supports iteration
A generator is a one-time operation. You can iterate over the generated data once, but if you want to do it again, you have to call the generator function again.
An object that supports iteration can be re-used over and over again (e.g., a list)
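The one-time nature of a generator is easy to demonstrate. A short sketch, assuming Python 3, reusing the countdown generator from the earlier slides:

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
first_pass = list(gen)            # consumes the generator
second_pass = list(gen)           # nothing left the second time

items = [3, 2, 1]
list_first = [x for x in items]   # a list can be iterated again and again
list_second = [x for x in items]

fresh = list(countdown(3))        # call the function again for a new pass

print(first_pass, second_pass, list_second, fresh)
```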

Exercise 8.3

Time : 30 minutes


Generator Expressions
A generator version of a list comprehension
    >>> a = [1,2,3,4]
    >>> b = (2*x for x in a)
    >>> b
    <generator object at 0x58760>
    >>> for i in b: print i,
    ...
    2 4 6 8
    >>>

Important differences
    Does not construct a list.
    Only useful purpose is iteration
    Once consumed, can't be reused

Generator Expressions
General syntax
    (expression for i in s if conditional)

Can also serve as a function argument
    sum(x*x for x in a)

Can be applied to any iterable
    >>> a = [1,2,3,4]
    >>> b = (x*x for x in a)
    >>> c = (-x for x in b)
    >>> for i in c: print i,
    ...
    -1 -4 -9 -16
    >>>

Generator Expressions
Example: Sum a field in a large input file
    823.1838823 233.128883 14.2883881 44.1787723 377.1772737 123.177277 143.288388 3884.78772 ...

Solution
    f = open("datfile.txt")
    # Strip all lines that start with a comment
    lines = (line for line in f if not line.startswith('#'))
    # Split the lines into fields
    fields = (s.split() for s in lines)
    # Sum up one of the fields
    print sum(float(f[2]) for f in fields)

Generator Expressions
Each generator expression in the solution only evaluates data as needed (lazy evaluation)
Example: Running the solution on a 6GB input file only consumes about 60K of RAM
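The pipeline can be tried without a real data file. The sketch below assumes Python 3 and uses io.StringIO with made-up sample lines in place of datfile.txt; sum_third_field is a hypothetical wrapper name.

```python
import io

# Made-up sample data standing in for a large input file
SAMPLE = """# comment line
a b 1.5
c d 2.5
# another comment
e f 4.0
"""

def sum_third_field(f):
    """Sum the third whitespace-separated field of every non-comment line."""
    # Each stage is lazy: no stage builds a list of all lines
    lines = (line for line in f if not line.startswith('#'))
    fields = (s.split() for s in lines)
    return sum(float(fld[2]) for fld in fields)

total = sum_third_field(io.StringIO(SAMPLE))
print(total)
```

Because every stage pulls one line at a time from the stage before it, memory use stays roughly constant regardless of the size of the input.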

Exercise 8.4

Time : 15 Minutes


Why Use Generators?
Many problems are much more clearly expressed in terms of iteration
Looping over a collection of items and performing some kind of operation (searching, replacing, modifying, etc.)
Generators allow you to create processing "pipelines"

Why Use Generators?
Generators encourage code reuse
Separate the "iteration" from code that uses the iteration.
Means that various iteration patterns can be defined more generally.

Why Use Generators?
Better memory efficiency
"Lazy" evaluation
Only produce values when needed
Contrast to constructing a big list of values first
Can operate on infinite data streams

The itertools Module
A library module with various functions designed to help with iterators/generators
    itertools.chain(s1,s2)
    itertools.count(n)
    itertools.cycle(s)
    itertools.dropwhile(predicate, s)
    itertools.groupby(s)
    itertools.ifilter(predicate, s)
    itertools.imap(function, s1, ... sN)
    itertools.repeat(s, n)
    itertools.tee(s, ncopies)
    itertools.izip(s1, ... , sN)

All functions process data iteratively.
Implement various kinds of iteration patterns
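A few of these functions in action, as a sketch assuming Python 3. Note that ifilter, imap and izip exist only in Python 2; in Python 3 the built-in filter(), map() and zip() are already lazy, so they are used here instead. itertools.islice is not in the list above but is used to take a finite slice of an infinite stream.

```python
import itertools

chained = list(itertools.chain([1, 2], [3]))

# count() is infinite, so only take the first three values
first_counts = list(itertools.islice(itertools.count(10), 3))

dropped = list(itertools.dropwhile(lambda x: x < 3, [1, 2, 3, 1]))

repeated = list(itertools.repeat("x", 3))

# groupby() groups consecutive equal items
groups = [(key, list(grp)) for key, grp in itertools.groupby("aabbbc")]

# Python 3 replacements for imap/izip
squares = list(map(lambda x: x * x, [1, 2, 3]))
zipped = list(zip("ab", [1, 2]))

print(chained, first_counts, dropped, repeated, groups, squares, zipped)
```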

More Information
"Generator Tricks for Systems Programmers" tutorial from PyCon'08
    http://www.dabeaz.com/generators
More examples and more generator tricks

Section 9

Working With Text

Overview
This section expands upon text processing
Some common programming idioms
Simple text parsing
Regular expression pattern matching
Text generation
Text I/O

Text Parsing
Problem : Converting text to data
This is an almost universal problem. Programs need to process various sorts of file formats, extract data from files, etc.
Example : Reading column-oriented fields
Example : Extracting data from XML/HTML

Text Splitting
String Splitting
    s.split([separator [, maxsplit]])
    s.rsplit([separator [, maxsplit]])

    separator : Text separating elements
    maxsplit  : Number of splits to perform

Examples:
    line = "foo/bar/spam"
    line.split('/')         ['foo','bar','spam']
    line.split('/',1)       ['foo','bar/spam']
    line.rsplit('/',1)      ['foo/bar','spam']

Text Stripping
The following methods strip text from the beginning or end of a string
    s.strip([chars])        # Strip begin/end
    s.lstrip([chars])       # Strip on left
    s.rstrip([chars])       # Strip on right

Examples:
    s = "==Hello=="
    s.strip('=')            "Hello"
    s.lstrip('=')           "Hello=="
    s.rstrip('=')           "==Hello"
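Splitting and stripping are often combined to turn one line of text into clean fields. A small sketch, assuming Python 3; the record line is made-up sample data.

```python
line = "foo/bar/spam"
parts = line.split('/')
head_rest = line.split('/', 1)     # split once, from the left
rest_tail = line.rsplit('/', 1)    # split once, from the right

s = "==Hello=="
stripped = s.strip('=')

# Made-up comma-separated record with stray whitespace
record = "  GOOG, 100, 490.10 \n"
fields = [f.strip() for f in record.split(',')]

print(parts, head_rest, rest_tail, stripped, fields)
```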

Text Searching
Searching : s.find(text [,start])
    s = "<html><head><title>Hello World</title><head>..."
    title_start = s.find("<title>")
    if title_start >= 0:
        title_end = s.find("</title>",title_start)

Returns index of starting character or -1 if text wasn't found.

Use a slice to extract text fragments
    title_text = s[title_start+7:title_end]
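The find-then-slice idiom generalizes to any pair of markers. A sketch, assuming Python 3; extract_between is a hypothetical helper name, and len(start_tag) replaces the hard-coded offset of 7.

```python
def extract_between(s, start_tag, end_tag):
    """Return the text between start_tag and end_tag, or None if either is missing."""
    start = s.find(start_tag)
    if start < 0:
        return None
    start += len(start_tag)            # skip past the opening marker
    end = s.find(end_tag, start)       # search only after the opening marker
    if end < 0:
        return None
    return s[start:end]

page = "<html><head><title>Hello World</title></head>..."
title_text = extract_between(page, "<title>", "</title>")
print(title_text)
```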

Text Replacement
Replacement : s.replace(text, new [, count])
    >>> s = "Is Chicago not Chicago?"
    >>> s.replace('Chicago','Peoria')
    'Is Peoria not Peoria?'
    >>> s.replace('Chicago','Peoria',1)
    'Is Peoria not Chicago?'

Reminder: This always returns a new string

Commentary
Simple text parsing problems can often be solved by just using various combinations of string splitting/stripping/finding operations
These low level operations are implemented in C in the Python interpreter
Depending on what is being parsed, this kind of approach may be the fastest implementation

Exercise 9.1

Time : 15 Minutes


re Module
Regular expression pattern matching
Searching for text patterns
Extracting text from a document
Replacing text
Example: Extracting URLs from text
    Go to http://www.python.org for more information on Python

re Module
A quick review of regular expressions
    "foo"          # Matches the text "foo"
    "(foo|bar)"    # Matches the text "foo" or "bar"
    "(foo)*"       # Match 0 or more repetitions of foo
    "(foo)+"       # Match 1 or more repetitions of foo
    "(foo)?"       # Match 0 or 1 repetitions of foo
    "[abcde]"      # Match one of the letters a,b,c,d,e
    "[a-z]"        # Match one letter from a,b,...,z
    "[^a-z]"       # Match any character except a,b,...z
    "."            # Match any character except newline
    "\*"           # Match the * character
    "\+"           # Match the + character
    "\d"           # Match a digit
    "\s"           # Match whitespace
    "\w"           # Match alphanumeric character

re Module
Patterns supplied as strings
Usually specified using raw strings
Example
    pat = r'(http://[\w-]+(?:\.[\w-]+)*(?:/[\w?#=&.,%_-]*)*)'

Raw strings don't interpret escapes (\)

re Usage
You start with some kind of pattern string
    pat = r'<title>(.*?)</title>'

You then compile the pattern string
    patc = re.compile(pat,re.IGNORECASE)

You then use the compiled pattern to perform various matching/searching operations

re: Matching
How to match a string against a pattern
    m = patc.match(somestring)
    if m:
        # Matched
    else:
        # No match

Example
    >>> m = patc.match("<title>Python Introduction</title>")
    >>> m
    <_sre.SRE_Match object at 0x5da68>
    >>> m = patc.match("<h1>Python Introduction</h1>")
    >>> m
    >>>

re: Searching
How to search for a pattern in a string
    m = patc.search(somestring)
    if m:
        # Found
    else:
        # Not found

Example
    >>> m = patc.search("<body><title>Python Introduction</title>...")
    >>> m
    <_sre.SRE_Match object at 0x5da68>
    >>>

re: Match Objects
How to get the text that was matched
    m = patc.search(s)
    if m:
        text = m.group()

How to get the location of the text matched
    first = m.start()          # Index of pattern start
    last = m.end()             # Index of pattern end
    first,last = m.span()      # Start,end indices together

Example
    >>> m = patc.search("<title>Python Introduction</title>")
    >>> m.group()
    '<title>Python Introduction</title>'
    >>> m.start()
    0
    >>> m.end()
    34
    >>>

re: Groups
Regular expressions may define groups
    pat = r'<title>(.*?)</title>'
    pat = r'([\w-]+):(.*)'

Groups are assigned numbers
    pat = r'<title>(.*?)</title>'      # group 1 is (.*?)
    pat = r'([\w-]+):(.*)'             # group 1 is ([\w-]+), group 2 is (.*)

Number determined left-to-right

re: Groups
When matching, groups can be extracted
    >>> m = patc.match("<title>Python Introduction</title>")
    >>> m.group()
    '<title>Python Introduction</title>'
    >>> m.group(1)
    'Python Introduction'
    >>>

Used to easily extract text for various parts of a regex pattern
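The difference between match() and search(), and the use of groups, can be pulled together in one short session. A sketch assuming Python 3; the input strings are made-up examples.

```python
import re

patc = re.compile(r'<title>(.*?)</title>', re.IGNORECASE)
doc = "<body><TITLE>Python Introduction</TITLE>..."

m_match = patc.match(doc)     # None: match() only looks at the start of the string
m = patc.search(doc)          # search() scans the whole string

whole = m.group()             # entire matched text
title = m.group(1)            # text captured by group 1
span = m.span()               # (start, end) indices of the match

# Two groups, numbered left to right
hm = re.match(r'([\w-]+):(.*)', "Content-Type: text/html")
name, value = hm.group(1), hm.group(2).strip()

print(whole, title, span, name, value)
```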

re: Search Example
Find all occurrences of a pattern
    matches = patc.finditer(s)
    for m in matches:
        print m.group()

This returns an iterator that produces match objects for later processing

re: Pattern Replacement
Find all patterns and replace
    t = patc.sub(make_link,s)

Example:
    def make_link(m):
        url = m.group()
        return '<a href="%s">%s</a>' % (url,url)

    >>> patc.sub(make_link,"Go to http://www.python.org")
    'Go to <a href="http://www.python.org">http://www.python.org</a>'
    >>>
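For a complete, runnable version of the replacement example, the sketch below (assuming Python 3) compiles the URL pattern shown on the earlier "re Module" slide and applies both finditer() and sub() to the sample sentence.

```python
import re

# URL pattern from the earlier slide
pat = r'(http://[\w-]+(?:\.[\w-]+)*(?:/[\w?#=&.,%_-]*)*)'
patc = re.compile(pat)

def make_link(m):
    """Replacement callback: receives a match object, returns the new text."""
    url = m.group()
    return '<a href="%s">%s</a>' % (url, url)

text = "Go to http://www.python.org for more information on Python"

urls = [m.group() for m in patc.finditer(text)]   # every match, lazily produced
linked = patc.sub(make_link, text)                # every match replaced

print(urls)
print(linked)
```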

re: Comments
re module is very powerful
I have only covered the essential basics
Strongly influenced by Perl
However, regexes are not an operator
Reference:
    Jeffrey Friedl, "Mastering Regular Expressions", O'Reilly & Associates, 2006.

Exercise 9.2

Time : 15 Minutes


Generating Text
Programs often need to generate text
Reports
HTML pages
XML
Endless possibilities

String Concatenation
Strings can be concatenated using +
    s = "Hello"
    t = "World"
    a = s + t          # a = "HelloWorld"

Although (+) is fine for just a few strings, it has horrible performance if you are concatenating many small chunks together to create a large string
Should not be used for generating output

String Joining
The fastest way to join many strings
    chunks = ["chunk1","chunk2",..."chunkN"]
    result = separator.join(chunks)

Example:
    chunks = ["Is","Chicago","Not","Chicago?"]
    " ".join(chunks)        "Is Chicago Not Chicago?"
    ",".join(chunks)        "Is,Chicago,Not,Chicago?"
    "".join(chunks)         "IsChicagoNotChicago?"

String Joining Example
Don't do this:
    s = ""
    for x in seq:
        ...
        s += "some text being produced"
        ...

Better:
    chunks = []
    for x in seq:
        ...
        chunks.append("some text being produced")
        ...
    s = "".join(chunks)
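The append-then-join pattern looks like this in a complete function. A sketch assuming Python 3; build_with_join and the sample values are hypothetical.

```python
def build_with_join(seq):
    """Build one large string by collecting chunks and joining once at the end."""
    chunks = []
    for x in seq:
        chunks.append("item %s\n" % x)
    return "".join(chunks)

report = build_with_join(range(3))

# join() needs strings, so convert other values first
csv_line = ",".join(str(v) for v in ["GOOG", 100, 490.1])

print(report)
print(csv_line)
```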

Printing to a String
StringIO module
Provides a "file-like" object you can print to
    import StringIO
    out = StringIO.StringIO()
    for x in seq:
        ...
        print >>out, "some text being produced"
        ...
    s = out.getvalue()

In certain situations, this object can be used in place of a file

String Interpolation
In languages like Perl and Ruby, programmers are used to string interpolation features
    $name = "Dave";
    $age = 39;
    print "$name is $age years old\n";

Python doesn't have a direct equivalent
However, there are some alternatives

Dictionary Formatting
String formatting against a dictionary
    fields = {
        'name' : 'Dave',
        'age' : 39
    }
    print "%(name)s is %(age)s years old\n" % fields

The above requires a dictionary, but you can easily get a dictionary of variables
    name = "Dave"
    age = 39
    print "%(name)s is %(age)s years old\n" % vars()

Template Strings
A special string that supports $substitutions
    import string
    s = string.Template("$name is $age years old\n")
    fields = {
        'name' : 'Dave',
        'age' : 39
    }
    print s.substitute(fields)

An alternative method for missing values
    s.safe_substitute(fields)
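The difference between substitute() and safe_substitute() only appears when a value is missing. A sketch, assuming Python 3:

```python
import string

s = string.Template("$name is $age years old")

full = s.substitute({'name': 'Dave', 'age': 39})

# safe_substitute() leaves unknown placeholders in the output untouched
partial = s.safe_substitute({'name': 'Dave'})

# substitute() raises KeyError for a missing value
try:
    s.substitute({'name': 'Dave'})
    raised = False
except KeyError:
    raised = True

print(full)
print(partial)
```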

Advanced Formatting
Python 2.6/3.0 have new string formatting
    print "{0} is {1:d} years old".format("Dave",39)
    print "{name} is {age:d} years old".format(name="Dave",age=39)

    stock = {
        'name' : 'GOOG',
        'shares' : 100,
        'price' : 490.10
    }
    print "{s[name]:10s} {s[shares]:10d} {s[price]:10.2f}"\
          .format(s=stock)

Allows much more flexible substitution and lookup than the % operator
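The format specifiers control width and alignment: strings are left-aligned and numbers right-aligned by default. A sketch assuming Python 3, reusing the stock dictionary from the slide:

```python
stock = {'name': 'GOOG', 'shares': 100, 'price': 490.10}

by_position = "{0} is {1:d} years old".format("Dave", 39)
by_name = "{name} is {age:d} years old".format(name="Dave", age=39)

# Three 10-character columns separated by single spaces
row = "{s[name]:10s} {s[shares]:10d} {s[price]:10.2f}".format(s=stock)

print(by_position)
print(repr(row))
```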

Exercise 9.3

Time : 20 Minutes


Text Input/Output
You frequently read/write text from files
Example: Reading line-by-line
    f = open("something.txt","r")
    for line in f:
        ...

Example: Writing a line of text
    f = open("something.txt","w")
    f.write("Hello World\n")
    print >>f, "Hello World\n"

There are still a few issues to worry about

Line Handling
Question: What is a text line?
It's different on different operating systems
    some characters .......\n       (Unix)
    some characters .......\r\n     (Windows)

By default, Python uses the system's native line ending when writing text files
However, it can get messy when reading text files (especially cross platform)

Line Handling
Example: Reading a Windows text file on Unix
    >>> f = open("test.txt","r")
    >>> f.readlines()
    ['Hello\r\n', 'World\r\n']
    >>>

Notice how the lines include the extra Windows '\r' character
This is a potential source of problems for programs that only expect '\n' line endings

Universal Newline
Python has a special "Universal Newline" mode
    >>> f = open("test.txt","U")
    >>> f.read()
    'Hello World\n'
    >>>

Converts all endings to standard '\n' character
f.newlines records the actual newline character that was used in the file
    >>> f.newlines
    '\r\n'
    >>>

Universal Newline
Example: Reading a Windows text file on Unix
    >>> f = open("test.txt","r")
    >>> f.readlines()
    ['Hello\r\n', 'World\r\n']
    >>> f = open("test.txt","U")
    >>> f.readlines()
    ['Hello\n', 'World\n']
    >>> f.newlines
    '\r\n'
    >>>

Notice how non-native Windows newline '\r\n' is translated to standard '\n'
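In Python 3 the "U" mode is no longer needed: universal newline translation is the default for text files, and the newline argument of open() turns it off. The sketch below assumes Python 3 and writes a small temporary file (a made-up test.txt) with Windows line endings to show both behaviors.

```python
import os
import tempfile

# Create a file with Windows line endings
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "wb") as f:
    f.write(b"Hello\r\nWorld\r\n")

# Default text mode (newline=None): endings are translated to '\n'
with open(path, "r") as f:
    translated = f.readlines()
    seen = f.newlines            # the ending actually found in the file

# newline="" keeps the original endings untranslated
with open(path, "r", newline="") as f:
    raw = f.readlines()

print(translated, repr(seen), raw)
```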

Text Encoding
Question : What is a character?
In Python 2, text consists of 8-bit characters
    "Hello World"
    48 65 6c 6c 6f 20 57 6f 72 6c 64

Characters are usually encoded in ASCII

International Characters
Problem : How to deal with characters from international character sets?
    "That's a spicy Jalapeño!"

Question: What is the character encoding?
    ñ = 0x96     (MacRoman)
    ñ = 0xf1     (CP1252 - Windows)
    ñ = 0xa4     (CP437 - DOS)

Historically, everyone made a different encoding
Bloody hell!

Unicode
For international characters, use Unicode
In Python, there is a special syntax for literals
    t = u"That's a spicy Jalape\u00f1o!"      # Unicode

Unicode strings are just like regular strings except that they hold Unicode characters
What is a Unicode character?

Unicode Characters
Unicode defines a standard numerical value for every character used in all languages (except for fictional ones such as Klingon)
The numeric value is known as "code point"
There are a lot of code points (>100,000)
    ñ = U+00F1
    ε = U+03B5
    ઇ = U+0A87
    ㌄ = U+3304

Unicode Charts
http://www.unicode.org/charts


Using Unicode Charts
    t = u"That's a spicy Jalape\u00f1o!"

\uxxxx - Embeds a Unicode code point in a string
Code points specified in hex by convention

Using Unicode Charts
All code points also have descriptive names
\N{name} - Embeds a named character
    t = u"Spicy Jalape\N{LATIN SMALL LETTER N WITH TILDE}o!"

Unicode Representation
Internally, Unicode characters are 16 bits (on the common "narrow" builds of Python 2)
    t = u"Jalape\u00f1o"
    004a 0061 006c 0061 0070 0065 00f1 006f

Normally, you don't worry about this
Except when you have to perform I/O
How do characters get encoded in the file?
    u'J' --> 00 4a     (Big Endian)
    u'J' --> 4a 00     (Little Endian)

Unicode I/O
Unicode does not define a standard file encoding--it only defines character code values
There are many different file encodings
Examples: UTF-8, UTF-16, etc.
Most popular: UTF-8 (ASCII is a subset)
So, how do you deal with these encodings?

Unicode File I/O
Unicode I/O handled using codecs module
codecs.open(filename,mode,encoding)
    >>> import codecs
    >>> f = codecs.open("data.txt","w","utf-8")
    >>> f.write(u"Hello World\n")
    >>> f.close()
    >>> f = codecs.open("data.txt","w","utf-16")
    >>> f.write(data)
    >>>

Several hundred character codecs are provided
Consult documentation for details

Unicode Encoding
Explicit encoding via strings
    >>> a = u"Jalape\u00f1o"
    >>> enc_a = a.encode("utf-8")
    >>>

Explicit decoding from bytes
    >>> enc_a = 'Jalape\xc3\xb1o'
    >>> a = enc_a.decode("utf-8")
    >>> a
    u'Jalape\xf1o'
    >>>
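The same round trip in Python 3, where str is always Unicode and encode() produces a separate bytes object. This is a sketch under that assumption; the string is the example from the slides.

```python
text = "Jalape\u00f1o"                 # 8 characters

utf8 = text.encode("utf-8")            # ñ becomes two bytes: c3 b1
utf16 = text.encode("utf-16-le")       # two bytes per character, no byte-order mark

back = utf8.decode("utf-8")            # decoding recovers the original text

print(utf8, len(utf8), len(utf16), back == text)
```

The byte counts differ between encodings even though the text is identical, which is why the encoding has to be known before bytes can be turned back into characters.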

Encoding Errors
Encoding/Decoding text is often messy
May encounter broken/invalid data
The default behavior is to raise a UnicodeError exception
    >>> a = u"Jalape\xf1o"
    >>> b = a.encode("ascii")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 6: ordinal not in range(128)
    >>>

Encoding Errors
Encoding/Decoding can use an alternative error handling policy
    s.decode("encoding",errors)
    s.encode("encoding",errors)

Errors is one of
    'strict'               Raise exception (the default)
    'ignore'               Ignore errors
    'replace'              Replace with replacement character
    'backslashreplace'     Use escape code
    'xmlcharrefreplace'    Use XML character reference

Encoding Errors
Example: Ignore bad characters
    >>> a = u"Jalape\xf1o"
    >>> a.encode("ascii",'ignore')
    'Jalapeo'
    >>>

Example: Encode Unicode into ASCII
    >>> a = u"Jalape\xf1o"
    >>> a.encode("us-ascii","xmlcharrefreplace")
    'Jalape&#241;o'
    >>>
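The error policies can be compared side by side. A sketch assuming Python 3, where the results are bytes objects; the policy names are the ones listed on the earlier slide.

```python
a = "Jalape\u00f1o"

policies = ("ignore", "replace", "backslashreplace", "xmlcharrefreplace")
results = {policy: a.encode("ascii", policy) for policy in policies}

# 'strict' is the default and raises an exception
try:
    a.encode("ascii")
    strict_error = None
except UnicodeEncodeError as e:
    strict_error = e

for policy in policies:
    print(policy, results[policy])
print("strict failed at position", strict_error.start)
```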

Finding the Encoding
How do you determine the encoding of a file?
Might be known in advance (in the manual)
May be indicated in the file itself
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

Depends on the data source, application, etc.

Unicode Everywhere
Unicode is the modern standard for text
In Python 3, all text is Unicode
Here are some basic rules to remember:
    All text files are encoded (even ASCII)
    When you read text, you always decode
    When you write text, you always encode

A Caution
Unicode may sneak in when you don't expect it
    Database integration
    XML Parsing
Unicode silently propagates through string-ops
    s = "Spicy"              # Standard 8-bit string
    t = u"Jalape\u00f1o"     # Unicode string
    w = s + t                # Unicode : u'SpicyJalape\u00f1o'

This propagation may break your code if it's not expecting to receive Unicode text

Exercise 9.4

Time : 10 Minutes


Section 10

Binary Data Handling and File I/O

Introduction
A major use of Python is to interface with foreign systems and software
Example : Software written in C/C++
In this section, we look at the problem of data interchange
Using Python to decode/encode binary data encodings and other related topics

Overview
Representation of binary data
Binary file I/O
Structure packing/unpacking
Binary structures and ctypes
Low-level I/O interfaces

Binary Data
Binary data - low-level machine data
Examples : 16-bit integers, 32-bit integers, 64-bit double precision floats, packed strings, etc.
Raw low-level data that you typically encounter with typed programming languages such as C, C++, Java, etc.

Common Scenarios
You might encounter binary encodings in a number of settings
    Special file formats (images, video, audio, etc.)
    Low-level network protocols
    Control of HW (e.g., over serial ports)
    Reading/writing data meant for use by software in other languages (C, C++, etc.)

Binary Data Representation
To store binary data, you can use strings
In Python 2, strings are just byte sequences
    bytes = '\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00'

    \xhh - Encodes an arbitrary byte (hh)

All of the normal string operations work except that you may have to specify a lot of non-text characters using \xhh escape codes

Binary Data Representation
In Python 2.6 and newer, a special syntax should be used if you are writing byte literal strings in your program
    header = b'\x89PNG\r\n'        # b is the byte string prefix

This has been added to disambiguate text (Unicode) strings and raw byte strings
A caution : This is optional in Python 2, but required in Python 3

byte arrays
Python 2.6 introduces a new bytearray type
    # Initialized from a byte string
    b = bytearray(b"Hello World")
    # Preinitialized to a specific size
    buf = bytearray(1024)

A bytearray supports almost all of the usual string operations, but it is mutable
You can make in-place assignments to characters and slices
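In-place assignment on a bytearray looks like this. A sketch assuming Python 3, where individual elements are integers (0-255), so ord() is used for a single character.

```python
buf = bytearray(b"Hello World")

buf[0] = ord("J")          # assign one element (an int in Python 3)
buf[6:] = b"Python"        # assign a slice; the length may change

zeros = bytearray(4)       # preinitialized to four zero bytes

print(buf, zeros)
```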

Binary File I/O
To obtain binary data, you'll typically read it from a file, pipe, network socket, etc.
For files, there are special modes
    f = open(filename,"rb")      # Read, binary mode
    f = open(filename,"wb")      # Write, binary mode
    f = open(filename,"ab")      # Append, binary mode

Disables all newline translation (reads/writes)
Required for binary data on Windows
Optional on Unix (a portability gotcha)

Binary Data Packing
A common method for dealing with binary data is to pack or unpack values from byte strings
    Python values:      37, 42
    (pack converts values to bytes, unpack converts bytes back to values)
    raw byte string:    b'%\x00\x00\x00*\x00\x00\x00'

Packing/unpacking is about type conversion
Converting low-level data to/from built-in Python types such as ints, floats, strings, etc.

struct module
Packs/unpacks binary records and structures
Often used when interfacing Python to foreign systems or when reading non-text files (images, audio, video, etc.)
    import struct

    # Unpack two raw 32-bit integers from a string
    x,y = struct.unpack("ii",s)

    # Pack a set of fields
    r = struct.pack("8sif", "GOOG",100, 490.10)
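A round trip through pack() and unpack() makes the type conversion visible. This sketch assumes Python 3 (so the string field is a bytes object) and adds an explicit "<" little-endian prefix so the result is the same on every machine; the slide's examples use the native default.

```python
import struct

# Two 32-bit integers, little-endian
raw = struct.pack("<ii", 37, 42)
x, y = struct.unpack("<ii", raw)

# An 8-byte string, an int and a 32-bit float
record = struct.pack("<8sif", b"GOOG", 100, 490.10)
name, shares, price = struct.unpack("<8sif", record)
name = name.rstrip(b"\x00")        # 's' pads short strings with null bytes

size = struct.calcsize("<8sif")    # 8 + 4 + 4 bytes

print(raw, (x, y), name, shares, price, size)
```

The unpacked price is only approximately 490.10 because 'f' stores a 32-bit float.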

struct module
Packing/unpacking codes (based on C)
'c' 'b' 'B' 'h' 'H' 'i' 'I' 'l' 'L' 'q' 'Q' 'f' 'd' 's' 'p' 'P'
Copyright (C) 2008, http://www.dabeaz.com

char (1 byte string) signed char (8-bit integer) unsigned char (8-bit integer) short (16-bit integer) unsigned short (16-bit integer) int (32-bit integer) unsigned int (32-bit integer) long (32 or 64 bit integer) unsigned long (32 or 64 bit integer) long long (64 bit integer) unsigned long long (64 bit integer) float (32 bit) double (64 bit) char[] (String) char[] (String with 8-bit length) void * (Pointer)

10- 12

struct module

Each code may be preceded by repetition count

    '4i'     4 integers
    '20s'    20-byte string

Integer alignment modifiers

    '@'    Native byte order, native sizes and alignment
    '='    Native byte order, standard sizes, no padding
    '<'    Little-endian, standard sizes, no padding
    '>'    Big-endian, standard sizes, no padding
    '!'    Network (big-endian), standard sizes, no padding

Native alignment

Data is aligned to start on a multiple of its size. For example, an integer (4 bytes) is aligned to start on a 4-byte boundary. This padding applies only to '@' (the default); the other modifiers pack fields back to back.

10- 13
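The effect of the byte-order and alignment prefix can be checked with `struct.calcsize()`. A small sketch, assuming Python 3; the native size is platform-dependent, so no specific value is claimed for it.

```python
import struct

# A 1-byte integer followed by a 4-byte integer
native_size = struct.calcsize("@bi")     # may include padding before the int
standard_size = struct.calcsize("=bi")   # 1 + 4 bytes, no padding

# Network byte order: most significant byte first, no padding
packed = struct.pack("!bi", 1, 2)
```

On many platforms `native_size` is 8 because three padding bytes are inserted so that the integer starts on a 4-byte boundary.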

struct Example

An IP packet header has this structure

    |<------------- 32 bits ------------->|
    | ver | hlen |  tos   |    length     |
    |      ident          |   fragment    |
    |    ttl    | proto   |   checksum    |
    |             sourceaddr              |
    |              destaddr               |

To unpack in Python, might do this:

    (vhlen,tos,length,
     ident,fragment,
     ttl,proto,checksum,
     sourceaddr,
     destaddr) = struct.unpack("!BBHHHBBHII", pkt)

    ver = (vhlen & 0xf0) >> 4
    hlen = vhlen & 0x0f

10- 14
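To try the unpacking code without a captured packet, a header can first be built with `struct.pack()`. This is a sketch in Python 3 syntax; all field values below are made-up sample data, not taken from the slides.

```python
import struct

# Build a 20-byte header from hypothetical sample values
pkt = struct.pack("!BBHHHBBHII",
                  (4 << 4) | 5,    # version 4, header length 5 (32-bit words)
                  0,               # tos
                  40,              # length
                  1,               # ident
                  0,               # fragment
                  64,              # ttl
                  6,               # proto (6 is TCP)
                  0,               # checksum (left as 0 in this sketch)
                  0x7f000001,      # sourceaddr (127.0.0.1)
                  0x7f000001)      # destaddr (127.0.0.1)

# Unpack it again, as on the slide
(vhlen, tos, length,
 ident, fragment,
 ttl, proto, checksum,
 sourceaddr,
 destaddr) = struct.unpack("!BBHHHBBHII", pkt)

ver = (vhlen & 0xf0) >> 4     # upper 4 bits
hlen = vhlen & 0x0f           # lower 4 bits
```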

Exercise 10.1

Time : 15 Minutes

10- 15

Binary Type Objects

Instead of unpacking/packing, an alternative approach is to define new kinds of Python objects that internally represent low-level binary data in its native format

These objects then provide an interface for manipulating the internal data from Python

Instead of converting data, this is just "wrapping" binary data with a Python layer

10- 16

ctypes library

A library that has objects for defining low-level C types, structures and arrays

Can create objects that look a lot like normal Python objects except that they are represented using a C memory layout

A companion/alternative to struct

10- 17

ctypes types

There is a predefined set of type objects

    ctypes type      C Datatype
    -----------      ----------------
    c_byte           signed char
    c_char           char
    c_char_p         char *
    c_double         double
    c_float          float
    c_int            int
    c_long           long
    c_longlong       long long
    c_short          short
    c_uint           unsigned int
    c_ulong          unsigned long
    c_ushort         unsigned short
    c_void_p         void *

10- 18

ctypes types

Types are objects that can be instantiated

    >>> from ctypes import *
    >>> x = c_long(42)
    >>> x
    c_long(42)
    >>> print x.value
    42
    >>> x.value = 23
    >>> x
    c_long(23)
    >>>

Unlike many Python types, the values are mutable (access/modify through .value)

Accessing the value is touching raw memory

10- 19

ctypes arrays

Defining a C array type

    >>> long4 = c_long * 4
    >>> long4
    <class '__main__.c_long_Array_4'>
    >>>

Creating and using an instance of this type

    >>> a = long4(1,2,3,4)
    >>> len(a)
    4
    >>> a[0]
    1
    >>> a[1]
    2
    >>> for x in a: print x,
    1 2 3 4
    >>>

10- 20

ctypes structures

Defining a C structure

    >>> class Point(Structure):
    ...     _fields_ = [ ('x', c_double),
    ...                  ('y', c_double) ]
    ...
    >>> p = Point(2,3)
    >>> p.x
    2.0
    >>> p.y
    3.0
    >>>

Looks kind of like a normal Python object, but is really just a layer over a C struct

10- 21
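The array and structure slides can be combined into one runnable sketch. It assumes Python 3; `Point` and `long4` are the names used on the slides, and `sizeof()` is included only to show the C memory layout.

```python
from ctypes import Structure, c_double, c_long, sizeof

class Point(Structure):
    # Two C doubles laid out exactly as in a C struct
    _fields_ = [("x", c_double),
                ("y", c_double)]

p = Point(2, 3)
p.x = 4.5                   # fields are mutable

long4 = c_long * 4          # a C array type: long[4]
a = long4(1, 2, 3, 4)
a[0] = 10                   # in-place assignment into raw memory
items = list(a)

point_size = sizeof(Point)  # two 8-byte doubles
```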

Direct I/O

Few Python programmers know this, but ctypes objects support direct I/O

Can directly write ctypes objects onto files

    p = Point(2,3)
    f = open("somefile","wb")
    f.write(p)
    ...

Can directly read into existing ctypes objects

    f = open("somefile","rb")
    p = Point()           # Create an empty point
    f.readinto(p)         # Read into it
    ...

10- 22
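A round trip shows the direct I/O idea end to end. This sketch assumes Python 3 and uses an in-memory `io.BytesIO` object as a stand-in for a file opened in binary mode; `Point` is redefined here so the block is self-contained.

```python
import io
from ctypes import Structure, c_double

class Point(Structure):
    _fields_ = [("x", c_double),
                ("y", c_double)]

buf = io.BytesIO()        # stand-in for open("somefile", "wb")
buf.write(Point(2, 3))    # writes the raw bytes of the C struct

buf.seek(0)               # rewind, as if reopening the file for reading
q = Point()               # create an empty point
nread = buf.readinto(q)   # fill its memory directly from the stream
```

No packing or unpacking step is involved; the bytes in the stream are the bytes of the structure.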

Exercise 10.2

Time : 10 Minutes

10- 23

Binary Arrays

array module used to define binary arrays

    >>> from array import array
    >>> a = array('i',[1,2,3,4,5])      # Integer array
    >>> a.append(6)
    >>> a
    array('i',[1,2,3,4,5,6])
    >>> a.append(37.5)
    TypeError: an integer is required
    >>>

An array is like a list except that all elements are constrained to a single type

    a.typecode        # Array type code
    a.itemsize        # Item size in bytes

10- 24

Array Initialization

    array(typecode [, initializer])

    Typecode    Meaning
    --------    -----------------
    'c'         8-bit character
    'b'         8-bit integer (signed)
    'B'         8-bit integer (unsigned)
    'h'         16-bit integer (signed)
    'H'         16-bit integer (unsigned)
    'i'         Integer (int)
    'I'         Unsigned integer (unsigned int)
    'l'         Long integer (long)
    'L'         Unsigned long integer (unsigned long)
    'f'         32-bit float (single precision)
    'd'         64-bit float (double precision)

Initializer is just a sequence of input values

10- 25

Array Operations

Arrays work a lot like lists

    a.append(item)          # Append an item
    a.index(item)           # Index of item
    a.remove(item)          # Remove item
    a.count(item)           # Count occurrences
    a.insert(index,item)    # Insert item

Key difference is the uniform data type

Underneath the covers, data is stored in a contiguous memory region--just like an array in C/C++

10- 26

Array Operations

Arrays also have I/O functions

    a.fromfile(f,n)      # Read and append n items
                         # from file f
    a.tofile(f)          # Write all items to f

Arrays have string packing functions

    a.fromstring(s)      # Appends items from packed
                         # binary string s
    a.tostring()         # Convert to a packed string

10- 27
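The list-like operations and the packing functions can be tried together. This sketch assumes Python 3, where the packing methods are named `tobytes()` and `frombytes()`; the `tostring()`/`fromstring()` names shown on the slide were removed in Python 3.9.

```python
from array import array

a = array("i", [1, 2, 3, 4, 5])   # integer array
a.append(6)                       # [1, 2, 3, 4, 5, 6]
a.insert(0, 0)                    # [0, 1, 2, 3, 4, 5, 6]
a.remove(3)                       # [0, 1, 2, 4, 5, 6]

packed = a.tobytes()              # contiguous native-format bytes

b = array("i")
b.frombytes(packed)               # append items decoded from the bytes
```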

Some Facts about Arrays

Arrays are more space efficient

    1000000 integers in a list  : 16 Mbytes
    1000000 integers in an array : 4 Mbytes

List item access is about 70% faster. Arrays store values as native C datatypes so every access involves creating a Python object (int, float) to view the value

Arrays are not mathematical vectors or matrices. The (+) operator concatenates.

10- 28

numpy arrays

For mathematics, consider the numpy library
It provides arrays that behave like vectors and matrices (element-wise operations)

    >>> from numpy import array
    >>> a = array([1,2,3,4])
    >>> b = array([5,6,7,8])
    >>> a + b
    array([6, 8, 10, 12])
    >>>

Many scientific packages build upon this

10- 29

Section 11

Working with Processes

Overview

Using Python to launch other processes, control their environment, and collect their output

Detailed coverage of subprocess module

11- 2

Python Interpreter

Python comes from the Unix/C tradition
Interpreter is command-line/console based

    % python foo.py

Input/output is from a terminal (tty)
Programs may optionally use command line arguments and environment variables

Python can control other programs that use the same Unix conventions

11- 3

Subprocesses

A program can create a new process
This is called a "subprocess"
The subprocess often runs under the control of the original process (which is known as the "parent" process)

Parent often wants to collect output or the status result of the subprocess

11- 4

Simple Subprocesses

There are a number of functions for simple subprocess control

Executing a system shell command

    os.system("rm -rf /tmpdata")

Capturing the output of a command (Unix)

    import commands
    s = commands.getoutput("ls -l")

This would be the closest Python equivalent to backticks (`ls -l`) in the shell, Perl, etc.

11- 5

Advanced Subprocesses

Python has several modules for subprocesses: os, popen2, subprocess

Historically, this has been a bit of a moving target. In older Python code, you will probably see extensive use of the popen2 and os modules

Currently, the official "party line" is to use the subprocess module

11- 6

subprocess Module

A high-level module for launching subprocesses
Cross-platform (Unix/Windows)
Tries to consolidate the functionality of a wide assortment of low-level system calls (system(), popen(), exec(), spawn(), etc.)

Will illustrate with some common use cases

11- 7

Executing Commands

Problem: You want to execute a simple command or run a separate program. You don't care about capturing its output.

    import subprocess
    p = subprocess.Popen(['mkdir','temp'])
    q = subprocess.Popen(['rm','-f','tempdata'])

Executes a command string
Returns a Popen object (more in a minute)

11- 8

Specifying the Command

Popen() accepts a list of command args

    subprocess.Popen(['rm','-f','tempdata'])

These are the same as the args in the shell

    shell % rm -f tempdata

Note: Each "argument" is a separate item

    subprocess.Popen(['rm','-f','tempdata'])     # Good
    subprocess.Popen(['rm','-f tempdata'])       # Bad

Don't merge multiple arguments into a single string like this.

11- 9

Environment Vars

How to set up environment variables in child

    env_vars = {
        'NAME1' : 'VALUE1',
        'NAME2' : 'VALUE2',
        ...
    }
    p = subprocess.Popen(['cmd','arg1',...,'argn'],
                         env=env_vars)

Note : If this is supplied and there is a PATH environment variable, it will be used to search for the command (Unix)

11- 10

Current Directory

If you need to change the working directory

    p = subprocess.Popen(['cmd','arg1',...,'argn'],
                         cwd='/some/directory')

Note: This changes the working directory for the subprocess, but does not affect how Popen() searches for the command

11- 11
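The `env` and `cwd` options can be checked by launching a child Python interpreter that inspects its own environment. This is a sketch in Python 3 syntax; `NAME1` and `VALUE1` are the placeholders from the slide, the child script is made up for the demonstration, and `wait()` simply blocks until the child exits. Copying `os.environ` first is a design choice made here so the child keeps the variables it needs to start up.

```python
import os
import subprocess
import sys
import tempfile

workdir = tempfile.gettempdir()

# Start from the parent's environment and add one variable
env_vars = dict(os.environ, NAME1="VALUE1")

# Child exits with 0 only if it sees the variable and the directory
child_code = (
    "import os, sys\n"
    "ok = (os.environ.get('NAME1') == 'VALUE1' and\n"
    "      os.path.realpath(os.getcwd()) == os.path.realpath(sys.argv[1]))\n"
    "sys.exit(0 if ok else 1)\n"
)

p = subprocess.Popen([sys.executable, "-c", child_code, workdir],
                     env=env_vars,
                     cwd=workdir)
status = p.wait()
```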

Collecting Status Codes

When subprocess terminates, it returns a status
An integer code of some kind

    C:       exit(status);
    Java:    System.exit(status);
    Python:  raise SystemExit(status)

Convention is for 0 to indicate "success." Anything else is an error.

11- 12

Collecting Status Codes

When you launch a subprocess, it runs independently from the parent

To wait and collect status, use wait()

    p = subprocess.Popen(['cmd','arg1',...,'argn'])
    ...
    status = p.wait()

Status will be the integer return code (which is also stored)

    p.returncode     # Exit status of subprocess

11- 13
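Status codes also make it easy to see why each argument must be a separate list item. In this Python 3 sketch the child is a made-up one-line script that exits with the number of arguments it received, so the parent can read that count back through `wait()`.

```python
import subprocess
import sys

# Child: exit status = number of arguments after the -c script
child_code = "import sys; sys.exit(len(sys.argv) - 1)"

# Good: two separate arguments
good = subprocess.Popen([sys.executable, "-c", child_code, "-f", "tempdata"])

# Bad: the child sees one argument containing a space
bad = subprocess.Popen([sys.executable, "-c", child_code, "-f tempdata"])

good_status = good.wait()
bad_status = bad.wait()
```

After `wait()` returns, the same value is also available as `good.returncode`.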

Polling a Subprocess

poll() - Checks status of subprocess

    p = subprocess.Popen(['cmd','arg1',...,'argn'])
    ...
    if p.poll() is None:
        pass                     # Process is still running
    else:
        status = p.returncode    # Get the return code

Returns None if the process is still running, otherwise the returncode is returned

11- 14

Killing a Subprocess

In Python 2.6 or newer, use terminate()

    p = subprocess.Popen(['cmd','arg1',...,'argn'])
    ...
    p.terminate()

In older versions, you have to hack it yourself

    # Unix
    import os, signal
    os.kill(p.pid, signal.SIGTERM)

    # Windows
    import win32api
    win32api.TerminateProcess(int(p._handle),-1)

11- 15
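Polling and termination can be tried on a child that would otherwise run for a long time. A sketch in Python 3 syntax; the sleeping child is a made-up stand-in for a real long-running command.

```python
import subprocess
import sys

# Child that would sleep for 30 seconds if left alone
p = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

still_running = p.poll() is None   # None means it has not exited yet

p.terminate()                      # ask the child to stop
status = p.wait()                  # collect the exit status
```

The exact status of a terminated process is platform-dependent (on Unix it is the negative signal number), so code should only rely on it being nonzero.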

Exercise 11.1

Time : 20 Minutes

11- 16

Capturing Output

Problem: You want to execute another program and capture its output

Use additional options to Popen()

    import subprocess
    p = subprocess.Popen(['cmd'],
                         stdout=subprocess.PIPE)
    data = p.stdout.read()

This works with both Unix and Windows
Captures any output printed to stdout

11- 17
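A runnable version of the capture pattern, assuming Python 3 (where the captured data is a byte string). The child command is a made-up one-line Python script so the example does not depend on any particular system utility.

```python
import subprocess
import sys

p = subprocess.Popen([sys.executable, "-c", "print('hello from child')"],
                     stdout=subprocess.PIPE)
data = p.stdout.read()     # read everything the child printed
p.stdout.close()
status = p.wait()          # reap the child once its output is consumed
```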

Sending/Receiving Data

Problem: You want to execute a program, send it some input data, and capture its output

Set up pipes using Popen()

    p = subprocess.Popen(['cmd'],
                         stdin = subprocess.PIPE,
                         stdout = subprocess.PIPE)
    p.stdin.write(data)          # Send data
    p.stdin.close()              # No more input
    result = p.stdout.read()     # Read output

    python: p.stdin   --->  cmd: stdin
    python: p.stdout  <---  cmd: stdout

11- 18

Sending/Receiving Data

Problem: You want to execute a program, send it some input data, and capture its output

Set up pipes using Popen()

    p = subprocess.Popen(['cmd'],
                         stdin = subprocess.PIPE,
                         stdout = subprocess.PIPE)
    p.stdin.write(data)          # Send data
    p.stdin.close()              # No more input
    result = p.stdout.read()     # Read output

p.stdin and p.stdout: Pair of files that are hooked up to the subprocess

    python: p.stdin   --->  cmd: stdin
    python: p.stdout  <---  cmd: stdout

11- 19
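The write/close/read sequence can be run against a small filter. This sketch assumes Python 3, so the data sent and received are byte strings; the upper-casing child is a made-up stand-in for `cmd`.

```python
import subprocess
import sys

# Child: copy stdin to stdout, converted to upper case
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

p = subprocess.Popen([sys.executable, "-c", child_code],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE)
p.stdin.write(b"hello world")   # Send data
p.stdin.close()                 # No more input
result = p.stdout.read()        # Read output
p.stdout.close()
status = p.wait()
```

For larger amounts of data, `p.communicate(data)` performs the same exchange while avoiding the deadlock that can occur when both pipes fill up.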

Sending/Receiving Data

How to capture stderr

    p = subprocess.Popen(['cmd'],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)

    python: p.stdin   --->  cmd: stdin
    python: p.stdout  <---  cmd: stdout
    python: p.stderr  <---  cmd: stderr

Note: stdout/stderr can also be merged

    p = subprocess.Popen(['cmd'],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT)

11- 20

I/O Redirection

Connecting input to a file

    f_in = open("somefile","r")
    p = subprocess.Popen(['cmd'], stdin=f_in)

Connecting the output to a file

    f_out = open("somefile","w")
    p = subprocess.Popen(['cmd'], stdout=f_out)

Basically, stdin and stdout can be connected to any open file object

Note : Must be a real file in the OS

11- 21

Subprocess I/O

Subprocess module can be used to set up fairly complex I/O patterns

    import subprocess
    p1 = subprocess.Popen("ls -l", shell=True,
                          stdout=subprocess.PIPE)
    p2 = subprocess.Popen("wc", shell=True,
                          stdin=p1.stdout,
                          stdout=subprocess.PIPE)
    out = p2.stdout.read()

Note: this is the same as this in the shell

    shell % ls -l | wc

11- 22
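The same two-stage pipeline can be built without a shell. This Python 3 sketch replaces `ls -l` and `wc` with two made-up one-line Python children so that it runs the same way on Unix and Windows.

```python
import subprocess
import sys

# Stage 1: print three lines
p1 = subprocess.Popen([sys.executable, "-c",
                       "print('a'); print('b'); print('c')"],
                      stdout=subprocess.PIPE)

# Stage 2: count the lines arriving on stdin
p2 = subprocess.Popen([sys.executable, "-c",
                       "import sys; print(len(sys.stdin.readlines()))"],
                      stdin=p1.stdout,
                      stdout=subprocess.PIPE)

p1.stdout.close()          # parent no longer needs its copy of the pipe
out = p2.stdout.read()
p2.stdout.close()
p1.wait()
p2.wait()
```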

I/O Issues

subprocess module does not work well for controlling interactive processes

Buffering behavior is often wrong (may hang)
Pipes don't properly emulate terminals
Subprocess may not operate correctly
Does not work for sending keyboard input typed into any kind of GUI.

11- 23

Interactive Subprocesses

Problem: You want to launch a subprocess, but it involves interactive console-based user input

Must launch subprocess and make its input emulate an interactive terminal (TTY)

Best bet: Use a third-party "Expect" library (e.g., pexpect)
Modeled after Unix "expect" tool (Don Libes)

11- 24

pexpect Example

http://pexpect.sourceforge.net

Sample of controlling an interactive session

    import pexpect
    child = pexpect.spawn("ftp ftp.gnu.org")
    child.expect('Name .*:')             # Expected output
    child.sendline('anonymous')          # Send responses
    child.expect('ftp> ')
    child.sendline('cd gnu/emacs')
    child.expect('ftp> ')
    child.sendline('ls')
    child.expect('ftp> ')
    print child.before
    child.sendline('quit')

11- 25

Odds and Ends

Python has a large number of other modules and functions related to process management

    os module (fork, wait, exec, etc.)
    signal module (signal handling)
    time module (system time functions)
    resource (system resource limits)
    locale (internationalization)
    _winreg (Windows Registry)

11- 26

Exercise 11.2

Time : 10 Minutes

11- 27

Section 12

Python Integration Primer

12- 1

Python Integration

People don't use Python in isolation
They use it to interact with other software
Software not necessarily written in Python
This is one of Python's greatest strengths!

12- 2

Overview

A brief tour of how Python integrates with the outside world

    Support for common data formats
    Network programming
    Accessing C libraries
    COM Extensions
    Jython (Java) and IronPython (.NET)

12- 3

Data Interchange

Python is adept at exchanging data with other programs using standard data formats

Example : Processing XML

12- 4

XML Overview

XML documents use structured markup

    <contact>
       <name>Elwood Blues</name>
       <address>1060 W Addison</address>
       <city>Chicago</city>
       <zip>60616</zip>
    </contact>

Documents made up of elements

    <name>Elwood Blues</name>

Elements have starting/ending tags
May contain text and other elements

12- 5

XML Example

    <?xml version="1.0" encoding="iso-8859-1"?>
    <recipe>
       <title>Famous Guacamole</title>
       <description>
       A southwest favorite!
       </description>
       <ingredients>
          <item num="2">Large avocados, chopped</item>
          <item num="1">Tomato, chopped</item>
          <item num="1/2" units="C">White onion, chopped</item>
          <item num="1" units="tbl">Fresh squeezed lemon juice</item>
          <item num="1">Jalapeno pepper, diced</item>
          <item num="1" units="tbl">Fresh cilantro, minced</item>
          <item num="3" units="tsp">Sea Salt</item>
          <item num="6" units="bottles">Ice-cold beer</item>
       </ingredients>
       <directions>
       Combine all ingredients and hand whisk to desired consistency.
       Serve and enjoy with ice-cold beers.
       </directions>
    </recipe>

12- 6

XML Parsing

Parsing XML documents is easy
Use the xml.etree.ElementTree module

12- 7

etree Parsing

How to parse an XML document

    >>> import xml.etree.ElementTree
    >>> doc = xml.etree.ElementTree.parse("recipe.xml")

This creates an ElementTree object

    >>> doc
    <xml.etree.ElementTree.ElementTree instance at 0x6e1e8>

Supports many high-level operations

12- 8

etree Parsing

Obtaining selected elements

    >>> title = doc.find("title")
    >>> firstitem = doc.find("ingredients/item")
    >>> firstitem2 = doc.find("*/item")
    >>> anyitem = doc.find(".//item")

find() always returns the first element found
Note: Element selection syntax above

12- 9

etree Parsing

To loop over all matching elements

    >>> for i in doc.findall("ingredients/item"):
    ...     print i
    ...
    <Element item at 7a6c10>
    <Element item at 7a6be8>
    <Element item at 7a6d50>
    <Element item at 7a6d28>
    <Element item at 7a6b20>
    <Element item at 7a6d00>
    <Element item at 7a6af8>
    <Element item at 7a6d78>

12- 10

etree Parsing

Obtaining the text from an element

    >>> e = doc.find("title")
    >>> e.text
    'Famous Guacamole'
    >>>

Searching and text extraction combined

    >>> doc.findtext("title")
    'Famous Guacamole'
    >>> doc.findtext("directions")
    '\n Combine all ingredients and hand whisk to desired consistency.\n Serve and enjoy with ice-cold beers.\n '
    >>> doc.findtext(".//item")
    'Large avocados, chopped'
    >>>

12- 11

etree Parsing

To obtain the element tag

    >>> elem.tag
    'item'
    >>>

To obtain element attributes

    >>> item.get('num')
    '6'
    >>> item.get('units')
    'bottles'

12- 12

etree Example

Print out recipe ingredients

    for i in doc.findall("ingredients/item"):
        unitstr = "%s %s" % (i.get("num"),i.get("units",""))
        print "%-10s %s" % (unitstr,i.text)

Output

    2          Large avocados, chopped
    1          Tomato, chopped
    1/2 C      White onion, chopped
    1 tbl      Fresh squeezed lemon juice
    1          Jalapeno pepper, diced
    1 tbl      Fresh cilantro, minced
    3 tsp      Sea Salt
    6 bottles  Ice-cold beer

12- 13
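The parsing steps above can be run without a `recipe.xml` file by parsing a string instead. This sketch assumes Python 3 and a shortened copy of the recipe; `ET.fromstring()` returns the root element directly, so the search paths are the same ones used on the slides.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the recipe document from the slides
recipe_xml = """<recipe>
  <title>Famous Guacamole</title>
  <ingredients>
    <item num="2">Large avocados, chopped</item>
    <item num="1/2" units="C">White onion, chopped</item>
    <item num="3" units="tsp">Sea Salt</item>
  </ingredients>
</recipe>"""

doc = ET.fromstring(recipe_xml)          # root <recipe> element
title = doc.findtext("title")            # search + text extraction

rows = []
for i in doc.findall("ingredients/item"):
    # Missing attributes fall back to the default passed to get()
    unitstr = "%s %s" % (i.get("num"), i.get("units", ""))
    rows.append("%-10s %s" % (unitstr, i.text))

print("\n".join(rows))
```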

Exercise 12.1

Time : 10 Minutes

12- 14

Network Programming

Python has very strong support for network programming applications

    Low-level socket programming
    Server side/client side modules
    High-level application protocols

Will look at just a few simple examples

12- 15

Low-level Sockets

A simple TCP server (Hello World)

    from socket import *
    s = socket(AF_INET,SOCK_STREAM)
    s.bind(("",9000))
    s.listen(5)
    while True:
        c,a = s.accept()
        print "Received connection from", a
        c.send("Hello %s\n" % a[0])
        c.close()

socket module provides low-level networking
Programming API is almost identical to socket programming in C, Java, etc.

12- 16
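To see both sides of the conversation in one script, the server can accept a single connection in a background thread while a client connects to it. This is a sketch in Python 3 syntax (data sent over a socket must be bytes); the helper name `serve_once` is made up, and port 0 asks the operating system for any free port instead of the fixed 9000 used on the slide.

```python
import socket
import threading

def serve_once(server):
    # Accept one connection, greet the client, and close
    conn, addr = server.accept()
    with conn:
        conn.sendall(("Hello %s\n" % addr[0]).encode("ascii"))

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(5)
port = server.getsockname()[1]

t = threading.Thread(target=serve_once, args=(server,))
t.start()

# Client side: connect and read until the server closes the connection
client = socket.create_connection(("127.0.0.1", port))
chunks = []
while True:
    data = client.recv(1024)
    if not data:
        break
    chunks.append(data)
client.close()

t.join()
server.close()
reply = b"".join(chunks)
```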

High-Level Sockets

A simple TCP server (Hello World)

    import SocketServer
    class HelloHandler(SocketServer.BaseRequestHandler):
        def handle(self):
            print "Connection from", self.client_address
            self.request.sendall("Hello World\n")

    serv = SocketServer.TCPServer(("",8000),HelloHandler)
    serv.serve_forever()

SocketServer module hides details
Makes it easy to implement network servers

12- 17

A Simple Web Server

Serve files from a directory

    from BaseHTTPServer import HTTPServer
    from SimpleHTTPServer import SimpleHTTPRequestHandler
    import os
    os.chdir("/home/docs/html")
    serv = HTTPServer(("",8080),SimpleHTTPRequestHandler)
    serv.serve_forever()

This creates a minimal web server
Connect with a browser and try it out

12- 18

XML-RPC

Remote Procedure Call
Uses HTTP as a transport protocol
Parameters/Results encoded in XML
An RPC standard that is supported across a variety of different programming languages (Java, Javascript, C++, etc.)

12- 19

Simple XML-RPC

How to create a stand-alone server

    from SimpleXMLRPCServer import SimpleXMLRPCServer

    def add(x,y):
        return x+y

    s = SimpleXMLRPCServer(("",8080))
    s.register_function(add)
    s.serve_forever()

How to test it (xmlrpclib)

    >>> import xmlrpclib
    >>> s = xmlrpclib.ServerProxy("http://localhost:8080")
    >>> s.add(3,5)
    8
    >>> s.add("Hello","World")
    "HelloWorld"
    >>>

12- 20

Simple XML-RPC

Adding multiple functions

    from SimpleXMLRPCServer import SimpleXMLRPCServer
    s = SimpleXMLRPCServer(("",8080))
    s.register_function(add)
    s.register_function(foo)
    s.register_function(bar)
    s.serve_forever()

Registering an instance (exposes all methods)

    from SimpleXMLRPCServer import SimpleXMLRPCServer
    s = SimpleXMLRPCServer(("",8080))
    obj = SomeObject()
    s.register_instance(obj)
    s.serve_forever()

12- 21
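Server and client can be exercised together in one process by running the server in a thread. This sketch assumes Python 3, where the modules are named `xmlrpc.server` and `xmlrpc.client`; port 0 and the loopback address are choices made here so the example does not collide with anything already listening on 8080.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def add(x, y):
    return x + y

# Port 0 lets the OS choose a free port; request logging is switched off
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add)
port = server.server_address[1]

t = threading.Thread(target=server.serve_forever)
t.start()

try:
    proxy = ServerProxy("http://127.0.0.1:%d" % port)
    total = proxy.add(3, 5)                # integers travel as XML <int>
    joined = proxy.add("Hello", "World")   # strings work with the same function
finally:
    server.shutdown()                      # stop serve_forever()
    server.server_close()
    t.join()
```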

Exercise 12.2

Time : 15 Minutes

12- 22

ctypes Module

A library module that allows C functions to be executed in arbitrary shared libraries/DLLs

One of several approaches that can be used to access foreign C functions

12- 23

ctypes Example

Consider this C code:

    int fact(int n) {
        if (n <= 0) return 1;
        return n*fact(n-1);
    }

    int cmp(char *s, char *t) {
        return strcmp(s,t);
    }

    double half(double x) {
        return 0.5*x;
    }

Suppose it was compiled into a shared lib

    % cc -shared example.c -o libexample.so

12- 24

ctypes Example

Using C types

    >>> import ctypes
    >>> ex = ctypes.cdll.LoadLibrary("./libexample.so")
    >>> ex.fact(4)
    24
    >>> ex.cmp("Hello","World")
    -1
    >>> ex.cmp("Foo","Foo")
    0
    >>>

It just works (heavy wizardry)
However, there is a catch...

12- 25

ctypes Caution

C libraries don't contain type information
So, ctypes has to guess...

    >>> import ctypes
    >>> ex = ctypes.cdll.LoadLibrary("./libexample.so")
    >>> ex.fact("Howdy")
    1
    >>> ex.cmp(4,5)
    Segmentation Fault

And unfortunately, it usually gets it wrong
However, you can help it out.

12- 26

ctypes Types

You just have to provide type signatures

    >>> ex.half.argtypes = (ctypes.c_double,)
    >>> ex.half.restype = ctypes.c_double
    >>> ex.half(5.0)
    2.5
    >>>

Creates a minimal prototype

    .argtypes       # Tuple of argument types
    .restype        # Return type of a function

12- 27
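The same prototype technique can be tried against a library that is already installed. This sketch substitutes the C math library's `sqrt()` for the slide's `half()`, since `libexample.so` is not available here; it assumes Python 3 on a Unix-like system where the math library can be located.

```python
import ctypes
import ctypes.util

# Locate and load the C math library (assumption: Unix-like system)
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Without these two lines ctypes would assume int arguments and an int result
libm.sqrt.argtypes = (ctypes.c_double,)
libm.sqrt.restype = ctypes.c_double

root = libm.sqrt(16.0)
```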

ctypes Types

Sampling of datatypes available

    ctypes type      C Datatype
    -----------      ----------------
    c_byte           signed char
    c_char           char
    c_char_p         char *
    c_double         double
    c_float          float
    c_int            int
    c_long           long
    c_longlong       long long
    c_short          short
    c_uint           unsigned int
    c_ulong          unsigned long
    c_ushort         unsigned short
    c_void_p         void *

12- 28

ctypes Cautions

Requires detailed knowledge of underlying C library and how it operates

    Function names
    Argument types and return types
    Data structures
    Side effects/Semantics
    Memory management

12- 29

ctypes and C++

Not really supported
This is more the fault of C++

C++ creates libraries that aren't easy to work with (non-portable name mangling, vtables, etc.)

C++ programs may use features not easily mapped to ctypes (e.g., templates, operator overloading, smart pointers, RTTI, etc.)

12- 30

Extension Commentary

There are more advanced ways of extending Python with C and C++ code

    Low-level extension API
    Code generator tools (e.g., Swig, Boost, etc.)

More details in an advanced course

12- 31

Exercise 12.3

Time : 20 minutes

12- 32

COM Extensions

On Windows, Python can interact with COM
Allows Python to script applications on Windows (e.g., Microsoft Office)

12- 33

Pythonwin and COM

To work with COM, use Pythonwin extensions
An extension to Python that includes additional modules for Windows

A separate download

    http://sourceforge.net/projects/pywin32

The book: "Python Programming on Win32", by Mark Hammond and Andy Robinson (O'Reilly)

12- 34

Disclaimer

Covering all of COM is impossible here
There are a lot of tricky details
Will provide a general idea of how it works in Python

Consult a reference for gory details

12- 35

Python COM Client

Example : Control Microsoft Word

First step: Get a COM object for Word

    >>> import win32com.client
    >>> w = win32com.client.Dispatch("Word.Application")
    >>> w
    <COMObject Word.Application>
    >>>

12- 36

Python as a client

    >>> import win32com.client
    >>> w = win32com.client.Dispatch("Word.Application")
    >>> w
    <COMObject Word.Application>
    >>>

"Word.Application": This refers to an object in the registry

12- 37

Using a COM object

Once you have a COM object, you access its attributes to do things

Example:

    >>> doc = w.Documents.Add()
    >>> doc.Range(0,0).Select()
    >>> sel = w.Selection
    >>> sel.InsertAfter("Hello World\n")
    >>> sel.Style = "Heading 1"
    >>> sel.Collapse(0)
    >>> sel.InsertAfter("This is a test\n")
    >>> sel.Style = "Normal"
    >>> doc.SaveAs("HelloWorld")
    >>> w.Visible = True
    >>>

12- 38

Sample Output

12- 39

Commentary

Obviously, there are many more details
Must know method names/arguments of objects you're using

May require Microsoft documentation
Python IDEs (IDLE) may not help much

12- 40

Jython

A pure Java implementation of Python
Can be used to write Python scripts that interact with Java classes and libraries

Official Location:

    http://www.jython.org

12- 41

Jython Example

Jython runs like normal Python

    % jython
    Jython 2.2.1 on java1.5.0_13
    Type "copyright", "credits" or "license" for more information.
    >>>

And it works like normal Python

    >>> print "Hello World"
    Hello World
    >>> for i in xrange(5):
    ...     print i,
    ...
    0 1 2 3 4
    >>>

12- 42

Jython Example

Many standard library modules work

    >>> import urllib
    >>> u = urllib.urlopen("http://www.python.org")
    >>> page = u.read()
    >>>

And you get access to Java libraries

    >>> import java.util
    >>> d = java.util.Date()
    >>> d
    Sat Jul 26 13:31:49 CDT 2008
    >>> dir(d)
    ['UTC', '__init__', 'after', 'before', 'class', 'clone', 'compareTo', 'date', 'day', 'equals', 'getClass', 'getDate', ... ]
    >>>

12- 43

Jython Example

A Java Class

    public class Stock {
        public String name;
        public int shares;
        public double price;
        public Stock(String n, int s, double p) {
            name = n;
            shares = s;
            price = p;
        }
        public double cost() { return shares*price; }
    };

In Jython

    >>> import Stock
    >>> s = Stock('GOOG',100,490.10)
    >>> s.cost()
    49010.0
    >>>

12- 44

IronPython

Python implemented in C#/.NET
Can be used to write Python scripts that control components in .NET

Official Location:

    http://www.codeplex.com/IronPython

12- 45

IronPython Example

IronPython runs like normal Python

    % ipy
    IronPython 1.1.2 (1.1.2) on .NET 2.0.50727.42
    Copyright (c) Microsoft Corporation. All rights reserved.
    >>>

And it also works like normal Python

    >>> print "Hello World"
    Hello World
    >>> for i in xrange(5):
    ...     print i,
    ...
    0 1 2 3 4
    >>>

12- 46

IronPython Example

You get access to .NET libraries

    >>> import System.Math
    >>> dir(System.Math)
    ['Abs', 'Acos', 'Asin', 'Atan', 'Atan2', 'BigMul', 'Ceiling', 'Cos', 'Cosh', ...]
    >>> System.Math.Cos(3)
    -0.9899924966
    >>>

Can script classes written in C#
Same idea as with Jython

12- 47

Commentary

Jython and IronPython have limitations

Features lag behind those found in the main Python distribution (for example, Jython doesn't have generators)

Small set of standard library modules work (many of the standard libraries depend on C code which isn't supported)

No reason to use unless you are working with Java or .NET

12- 48

Some Final Words

Integration is the real reason why Python is used by smart programmers and why they consider it to be a "secret weapon"

With Python, you don't give up your code in C, C++, Java, etc.
Instead, Python makes that code better

Python makes difficult problems easier and impossible problems more feasible.

12- 49

Intro Course Wrap-up

First 12 sections of this class are just the start

    Basics of the Python language
    How to work with data
    How to organize Python software
    How Python interacts with the environment
    How it interacts with other programs

12- 50

Further Topics

    Network/Internet Programming (Day 4)
    Graphical User Interfaces
    Frameworks and Components
    Functional Programming
    Metaprogramming
    Extension programming

12- 51


Python Slide Index


Symbols and Numbers
!= operator, 1-40
#, program comment, 1-32
%, floating point modulo operator, 1-54
%, integer modulo operator, 1-50
%, string formatting operator, 2-25
&, bit-wise and operator, 1-50
&, set intersection operator, 2-22
() operator, 5-29
(), tuple, 2-6
**, power operator, 1-54
*, list replication, 1-69
*, multiply operator, 1-50, 1-54
*, sequence replication operator, 2-30
*, string replication operator, 1-60
+, add operator, 1-50, 1-54
+, list concatenation, 1-67
+, String concatenation, 1-59, 9-24
-, set difference operator, 2-22
-, subtract operator, 1-50, 1-54
-i option, Python interpreter, 7-26
-O option, Python interpreter, 7-22, 7-23
. operator, 5-29
... interpreter prompt, 1-15
.NET framework, and IronPython, 12-45
.pyc files, 4-11
.pyo files, 4-11
/ division operator, 1-54
/, division operator, 1-50
//, floor division operator, 1-50, 1-51
< operator, 1-40
<<, left shift operator, 1-50
<= operator, 1-40
== operator, 1-40
> operator, 1-40
>= operator, 1-40
>>, right shift operator, 1-50
>>> interpreter prompt, 1-15
[:], string slicing, 1-59
[:], slicing operator, 2-31
[], sequence indexing, 2-29
[], string indexing, 1-59
\ Line continuation character, 1-43
^, bit-wise xor operator, 1-50
{}, dictionary, 2-13
|, bit-wise or operator, 1-50
|, set union operator, 2-22
~, bit-wise negation operator, 1-50

A
ABC language, 1-5
abs() function, 1-50, 1-54, 4-24
Absolute value function, 1-50, 1-54
__abs__() method, 5-26
Accessor methods, 6-30
Adding items to dictionary, 2-14
__add__() method, 5-26
Advanced String Formatting, 9-31
and operator, 1-40
__and__() method, 5-26
Anonymous functions, 3-45
append() method, of lists, 1-67
Argument naming style, 3-20
Arguments, passing mutable objects, 3-25
Arguments, transforming in function, 3-22
argv variable, sys module, 4-28
array module, 10-24
array module typecodes, 10-25
array object I/O functions, 10-27
array object operations, 10-26
arrays vs. lists, 10-28
arrays, creating with ctypes, 10-20
assert statement, 7-20
assert statement, stripping in -O mode, 7-22
assertEqual() method, unittest module, 7-15
AssertionError exception, 7-20
Assertions, and unit testing, 7-15
assertNotEqual() method, unittest module, 7-15
assertRaises() method, unittest module, 7-15
assert_() method, unittest module, 7-15
Assignment, copying, 2-65
Assignment, reference counting, 2-62, 2-64, 2-65
Associative array, 2-13
Attribute access functions, 5-31
Attribute binding, 6-12
Attribute lookup, 6-17, 6-20
Attribute, definition of, 5-3
Attributes, computed using properties, 6-33, 6-36, 6-37
Attributes, modifying values, 6-13
Attributes, private, 6-27
Awk, and list comprehensions, 2-57

B
Base class, 5-11 base-10 decimals, 4-58 __bases__ attribute of classes, 6-19 binary arrays, 10-24 binary data, 10-4 binary data representation, 10-6 Binary files, 10-9 binary type objects, 10-16 Binding of attributes in objects, 6-12 Block comments, 1-32 Boolean type, 1-46, 1-47 Booleans, and integers, 1-47 Boost Python, 12-31 Bottom up programming style, 3-11 break statement, 2-36 Breakpoint, debugger, 7-29 Breakpoint, setting in debugger, 7-31 Built-in exceptions, 3-52 __builtins__ module, 4-23 bytearray type, 10-8 bytes literals, 10-7

C
C extension, example with ctypes, 12-25 C extensions, accessing with ctypes module, 12-23 C Extensions, other tools, 12-31 C3 Linearization Algorithm, 6-23 Callback functions, 3-44 Calling a function, 1-76 Calling other methods in the same class, 5-9 Capturing output of a subprocess, 11-17 Case conversion, 1-61 Catching exceptions, 1-79 Catching multiple exceptions, 3-54 chr() function, 4-25 Class implementation chart, 6-11 class statement, 5-4 class statement, defining methods, 5-8 Class, 6-9 Class, representation of, 6-9 Class, __slots__ attribute of, 6-38 __class__ attribute of instances, 6-10 close() function, shelve module, 4-46 Code blocks and indentation, 1-37 Code formatting, 1-43

Code reuse, and generators, 8-30 collect() function, gc module, 6-45 collections module, 4-60 Colon, and indentation, 1-37 COM, 12-33, 12-34 COM, example of controlling Microsoft Word, 12-38 COM, launching Microsoft Word, 12-36 Command line arguments, manual parsing, 4-29 Command line options, 4-28 Command line options, parsing with optparse, 4-30 Command line, running Python, 1-24 Comments, 1-32, 7-3 Comments, vs. documentation strings, 7-3 Community links, 1-2 Compiling regular expressions, 9-13 Complex type, 1-46 complex() function, 4-25 Computed attributes, 6-33, 6-36, 6-37 Concatenation of strings, 1-59 Concatenation, lists, 1-67 Concatenation, of sequences, 2-30 Conditionals, 1-39 ConfigParser module, 4-48 Conformance checking of functions, 3-41 Container, 2-17 Containers, dictionary, 2-19 Containers, list, 2-18 __contains__() method, 5-25 continue statement, 2-36 Contract programming, 7-21 Conversion of numbers, 1-55 Converting to strings, 1-64 copy module, 2-68, 4-35 copy() function, copy module, 4-35 copy() function, shutil module, 4-42 Copying and moving files, 4-42 copytree() function, shutil module, 4-42 cos() function, math module, 1-54 cPickle module, 4-44 Creating new objects, 5-4 Creating programs, 1-19 ctime() function, time module, 4-41 ctypes direct I/O support, 10-22 ctypes library, 10-17 ctypes module, and C++, 12-30 ctypes module, example of, 12-25 ctypes module, limitations of, 12-29 ctypes module, specifying type signatures, 12-27 ctypes module, supported datatypes, 12-28

ctypes modules, 12-23 ctypes, creating a C datatype instance, 10-19 ctypes, creating arrays, 10-20 ctypes, list of datatypes, 10-18 ctypes, structures, 10-21

D
Data interchange, 12-4 Data structure, dictionary, 6-3 Data structures, 2-5 Database like queries on lists, 2-55 Database, interface to, 4-61 Database, SQL injection attack risk, 4-63 Datatypes, in library modules, 4-57 Date and time manipulation, 4-59 datetime module, 4-59 Debugger, 7-27 Debugger, breakpoint, 7-29, 7-31 Debugger, commands, 7-29 Debugger, launching inside a program, 7-28 Debugger, listing source code, 7-31 Debugger, running at command line, 7-33 Debugger, running functions, 7-32 Debugger, single step execution, 7-32 Debugger, stack trace, 7-30 __debug__ variable, 7-23 decimal module, 4-58 Declarative programming, 2-58 Deep copies, 2-68 deepcopy() function, copy module, 2-68, 4-35 def statement, 1-76, 3-8 Defining a function, 3-8 Defining new functions, 1-76 Defining new objects, 5-4 Definition order, 3-7 del operator, lists, 1-70 delattr() function, 5-31 Deleting items from dictionary, 2-14 __delitem__() method, 5-24 __del__() method, 6-46, 6-47 deque object, collections module, 4-60 Derived class, 5-11 Design by contract, 7-21 Destruction of objects, 6-43 Dictionary, 2-13 Dictionary, and class representation, 6-9 Dictionary, and local function variables, 6-4 Dictionary, and module namespace, 6-5

Dictionary, and object representation, 6-6 Dictionary, creating from list of tuples, 2-21 Dictionary, persistent, 4-46 Dictionary, testing for keys, 2-20 Dictionary, updating and deleting, 2-14 Dictionary, use as a container, 2-19 Dictionary, use as data structure, 6-3 Dictionary, using as function keyword arguments, 3-40 Dictionary, when to use, 2-15 __dict__ attribute, of instances, 6-7, 6-8 __dict__ variable, of modules, 4-14 dir() function, 1-81, 1-82 direct I/O, with ctypes objects, 10-22 Directory listing, 4-38 disable() function, gc module, 6-45 divmod() function, 1-50, 1-54, 4-24 __div__() method, 5-26 doctest module, 3-48, 7-7, 7-8 doctest module, self-testing, 7-10 Documentation, 1-17 Documentation strings, 3-46, 7-3 Documentation strings, and help() command, 7-6 Documentation strings, and help() function, 3-47 Documentation strings, and IDEs, 7-5 Documentation strings, and testing, 3-48, 7-7 Documentation strings, vs. comments, 7-3 Double precision float, 1-52 Double-quoted string, 1-56 Downloads, 1-2 dumps() function, pickle module, 4-45 Duplicating containers, 2-66

E
easy_install command, 4-71 Edit,compile,debug cycle, 1-13 Eggs, Python package format, 4-71 ElementTree module, xml.etree package, 12-7 elif statement, 1-39 else statement, 1-39 Embedded nulls in strings, 1-58 empty code blocks, 1-42 enable() function, gc module, 6-45 Enabling future features, 1-51 Encapsulation, 6-24 Encapsulation, and accessor methods, 6-30 Encapsulation, and properties, 6-31 Encapsulation, challenges of, 6-25 Encapsulation, uniform access principle, 6-35

end() method, of Match objects, 9-16 endswith() method, strings, 1-62 enumerate() function, 2-39, 2-40, 4-26 environ variable, os module, 4-37 Environment variables, 4-37 Error reporting strategy in exceptions, 3-57 Escape codes, strings, 1-57 Event loop, and GUI programming, 4-54 except statement, 1-79, 3-50 Exception base class, 5-28 Exception, defining new, 5-28 Exception, printing tracebacks, 7-25 Exception, uncaught, 7-24 Exceptions, 1-78, 3-50 Exceptions, catching, 1-79 Exceptions, catching any, 3-55 Exceptions, catching multiple, 3-54 Exceptions, caution on use, 3-56 Exceptions, finally statement, 3-58 Exceptions, how to report errors, 3-57 Exceptions, ignoring, 3-55 Exceptions, list of built-in, 3-52 Exceptions, passed value, 3-53 Exceptions, propagation of, 3-51 Executing system commands, 11-5, 11-8 Execution model, 1-31 Execution of modules, 4-5 exists() function, os.path module, 4-40 exit() function, sys module, 3-59 Exploding heads, and exceptions, 3-56 Exponential notation, 1-52 Extended slicing of sequences, 2-32

F
False value, 1-47 File globbing, 4-38 File system, copying and moving files, 4-42 File system, getting a directory listing, 4-38 File tests, 4-40 File, binary, 10-9 File, obtaining metadata, 4-41 Files, and for statement, 1-73 Files, opening, 1-72 Files, reading line by line, 1-73 __file__ attribute, of modules, 4-7 Filtering sequence data, 2-53 finally statement, 3-58 find() method of strings, 9-6

find() method, strings, 1-62 Finding a substring, 9-6 finditer() method, of regular expressions, 9-19 First class objects, 2-69 Float type, 1-46, 1-52 float() function, 1-55, 4-25 Floating point numbers, 1-52 Floating point, accuracy, 1-53 Floor division operator, 1-51 __floordiv__() method, 5-26 For loop, and tuples, 2-41 For loop, keeping a loop counter, 2-39 for statement, 2-34 for statement, and files, 1-73 for statement, and generators, 8-17 for statement, and iteration, 8-3 for statement, internal operation of, 8-4 for statement, iteration variable, 2-35 Format codes, string formatting, 2-26 Format codes, struct module, 10-12 format() method of strings, 9-31 Formatted output, 2-24 format_exc() function, traceback module, 7-25 from module import *, 4-18 from statement, 4-17 from __future__ import, 1-51 Function, and generators, 8-17 Function, running in debugger, 7-32 Functions, 3-8, 3-9 Functions, accepting any combination of arguments, 3-39 Functions, anonymous with lambda, 3-45 Functions, argument passing, 3-15 Functions, benefits of using, 3-8 Functions, Bottom-up style, 3-11 Functions, calling with positional arguments, 3-17 Functions, checking of return statement, 3-43 Functions, conformance checking, 3-41 Functions, default arguments, 3-16 Functions, defining, 1-76 Functions, definition order, 3-10 Functions, design of, 3-14 Functions, design of and global variables, 3-32 Functions, design of and side effects, 3-26 Functions, design of and transformation of inputs, 3-22 Functions, design of argument names, 3-20 Functions, design of input arguments, 3-21 Functions, documentation strings, 3-46 Functions, global variables, 3-29, 3-30

Functions, keyword arguments, 3-18 Functions, local variables, 3-28 Functions, mixing positional and keyword arguments, 3-19 Functions, multiple return values, 3-24 Functions, overloading (lack of), 3-12 Functions, side effects, 3-25 Functions, tuple and dictionary expansion, 3-40 Functions, variable number of arguments, 3-35, 3-36, 3-37, 3-38

G
Garbage collection, and cycles, 6-44, 6-45 Garbage collection, and __del__() method, 6-46, 6-47 Garbage collection, reference counting, 6-43 gc module, 6-45 Generating text, 9-23 Generator, 8-17, 8-18 Generator expression, 8-24, 8-25 Generator expression, efficiency of, 8-27 Generator tricks presentation, 8-33 Generator vs. iterator, 8-22 Generator, and code reuse, 8-30 Generator, and StopIteration exception, 8-20 Generator, example of following a file, 8-21 Generator, use of, 8-29 getatime() function, 4-41 getattr() function, 5-31 __getitem__() method, 5-24 getmtime() function, os.path module, 4-41 getoutput() function, commands module, 11-5 getsize() function, os.path module, 4-41 Getting a list of symbols, 1-81 Getting started, 1-8 glob module, 4-38 global statement, 3-31 Global variables, 3-27, 4-6 Global variables, accessing in functions, 3-29 Global variables, modifying inside a function, 3-30 group() method, of Match objects, 9-16 Groups, extracting text from in regular expressions, 9-18 Groups, in regular expressions, 9-17 GUI programming, event loop model, 4-54 GUI programming, with Tkinter, 4-51 Guido van Rossum, 1-3

H
hasattr() function, 5-31 Hash table, 2-13 Haskell, 1-5 Haskell, list comprehension, 2-56 has_key() method, of dictionaries, 2-20 Heavy wizardry, 6-42 help() command, 1-17 help() command, and documentation strings, 7-6 help() function, and documentation strings, 3-47 hex() function, 4-25 HTTP, simple web server, 12-18

I
I/O redirection, subprocess module, 11-21 Identifiers, 1-33 IDLE, 1-10, 1-11, 1-16 IDLE, creating new program, 1-20, 1-21 IDLE, on Mac or Unix, 1-12 IDLE, running programs, 1-23 IDLE, saving programs, 1-22 IEEE 754, 1-52 if statement, 1-39 Ignoring an exception, 3-55 Immutable objects, 1-63 import statement, 1-77, 4-3 import statement, creation of .pyc and .pyo files, 4-11 import statement, from modifier, 4-17 import statement, importing all symbols, 4-18 import statement, proper use of namespaces, 4-20 import statement, repeated, 4-8 import statement, search path, 4-9, 4-10 import statement, supported file types, 4-11 import, as modifier, 4-15 import, use in extensible programs, 4-16 importing different versions of a library, 4-16 in operator, 5-25 in operator, dictionary, 2-20 in operator, lists, 1-69 in operator, strings, 1-60 Indentation, 1-36, 1-37 Indentation style, 1-38 index() method, strings, 1-62 Indexing of lists, 1-68 Infinite data streams, 8-31 Inheritance, 5-11, 5-14

Inheritance example, 5-13 Inheritance, and object base, 5-16 Inheritance, and polymorphism, 5-19 Inheritance, and __init__() method, 5-17 Inheritance, implementation of, 6-19, 6-21 Inheritance, multiple, 5-20, 6-23 Inheritance, multiple inheritance, 6-22 Inheritance, organization of objects, 5-15 Inheritance, redefining methods, 5-18 Inheritance, uses of, 5-12 INI files, parsing of, 4-48 Initialization of objects, 5-6 __init__() method, 6-41 __init__() method in classes, 5-6 __init__() method, and inheritance, 5-17 insert() method, of lists, 1-67 Inspecting modules, 1-81 Inspecting objects, 1-82 Instance data, 5-7 Instances, and __class__ attribute, 6-10 Instances, creating new, 5-5 Instances, modifying after creation, 6-15 Instances, representation of, 6-7, 6-8 int() function, 1-55, 4-25 Integer division, 1-51 Integer type, 1-46 Integer type, operations, 1-50 Integer type, precision of, 1-48 Integer type, promotion to long, 1-49 Interactive mode, 1-13, 1-14, 1-15 Interactive subprocesses, 11-24 Interpreter execution, 11-3 Interpreter prompts, 1-15 __invert__() method, 5-26 IronPython, 12-45 is operator, 2-62, 2-64 isalpha() method, strings, 1-62 isdigit() method, strings, 1-62 isdir() function, os.path module, 4-39, 4-40 isfile() function, os.path module, 4-39, 4-40 isinstance() function, 2-71 islower() method, strings, 1-62 Item access methods, 5-24 Iterating over a sequence, 2-34 Iteration, 8-3 Iteration protocol, 8-4 Iteration variable, for loop, 2-35 Iteration, design of iterator classes, 8-14 Iteration, user defined, 8-8, 8-9

Iterator vs. generator, 8-22 itertools module, 8-32 __iter__() method, 8-4

J
Java, and Jython, 12-41 join() function, os.path module, 4-39 join() method, of strings, 9-25 join() method, strings, 1-62 Jython, 12-41

K
key argument of sort() method, 2-48 KeyboardInterrupt exception, 3-59 Keys, dictionary, 2-13 Keyword arguments, 3-18 Keywords, 1-34, 1-35 kill() function, os module, 11-15

L
lambda statement, 3-45 Lazy evaluation, 8-31 len() function, 2-29, 5-24 len() function, lists, 1-69 len() function, strings, 1-60 __len__() method, 5-24 Library modules, 1-77 Life cycle of objects, 6-40 Line continuation, 1-43 Lisp, 1-5 List, 2-29 List comprehension, 2-52, 2-53, 2-54 List comprehension uses, 2-55 List comprehensions and awk, 2-57 List concatenation, 1-67 List processing, 2-51 List replication, 1-69 List type, 1-67 List vs. Tuple, 2-12 list() function, 2-66, 2-67 List, extra memory overhead, 2-12 List, Looping over items, 2-34 List, sorting, 2-46 List, use as a container, 2-18 listdir() function, os module, 4-38

lists vs. arrays, 10-28 Lists, changing elements, 1-68 Lists, indexing, 1-68 Lists, removing items, 1-70 Lists, searching, 1-69 loads() function, pickle module, 4-45 Local variables, 3-27 Local variables in functions, 3-28 log() function, math module, 1-54 Long type, 1-46, 1-49 Long type, use with integers, 1-49 long() function, 1-55, 4-25 Looping over integers, 2-37 Looping over items in a sequence, 2-34 Looping over multiple sequences, 2-42 lower() method, strings, 1-61, 1-62 __lshift__() method, 5-26 lstrip() method, of strings, 9-5

M
Main program, 4-7 main() function, unittest module, 7-16 __main__, 4-7 __main__ module, 7-10 Match objects, re module, 9-16 match() method, of regular expression patterns, 9-14 Matching a regular expression, 9-14 math module, 1-54, 1-77, 4-33 Math operators, 5-26 Math operators, floating point, 1-54 Math operators, integer, 1-50 max() function, 2-33, 4-24 Memory efficiency, and generators, 8-31 Memory management, reference counting, 2-62, 2-64 Memory use of tuple and list, 2-12 Method invocation, 5-29 Method, definition of, 5-3 Methods, calling other methods in the same class, 5-9 Methods, in classes, 5-8 Methods, private, 6-28 min() function, 2-33, 4-24 Modules, 4-3 modules variable, sys module, 4-8 Modules, as object, 4-13 Modules, dictionary, 4-14 Modules, execution of, 4-5 Modules, loading of, 4-8 Modules, namespaces, 4-4

Modules, search path, 4-9, 4-10 Modules, self-testing with doctest, 7-10 __mod__() method, 5-26 move() function, shutil module, 4-42 __mro__ attribute, of classes, 6-23 Multiple inheritance, 5-20, 6-22, 6-23 __mul__() method, 5-26

N
Namespace, 4-14 Namespaces, 4-4, 4-20 __name__ attribute, of modules, 4-7 Naming conventions, Python's reliance upon, 6-26 Negative indices, lists, 1-68 __neg__() method, 5-26 Network programming introduction, 12-15 Network programming, example of sockets, 12-16 __new__() method, 6-41, 6-42 next() method, and iteration, 8-4 next() method, of generators, 8-19 no-op statement, 1-42 None type, 2-4 None type, returned by functions, 3-23 not operator, 1-40 Null value, 2-4 Numeric conversion, 1-55 Numeric datatypes, 1-46 numpy, 10-29

O
object base class, 5-16 Object oriented programming, 5-3 Object oriented programming, and encapsulation, 6-24 Objects, attribute binding, 6-17, 6-20 Objects, attributes of, 5-7 Objects, creating containers, 5-24 Objects, creating new instances, 5-5 Objects, creating private attributes, 6-27 Objects, creation steps, 6-41 Objects, defining new, 5-4 Objects, first class behavior, 2-69 Objects, inheritance, 5-11 Objects, invoking methods, 5-5 Objects, life cycle, 6-40 Objects, making a deep copy of, 2-68 Objects, memory management of, 6-43

Objects, method invocation, 5-29 Objects, modifying attributes of instances, 6-13 Objects, modifying instances, 6-15 Objects, multiple inheritance, 6-22 Objects, reading attributes, 6-16 Objects, representation of, 6-11 Objects, representation of instances, 6-7, 6-8 Objects, representation with dictionary, 6-6 Objects, saving with pickle, 4-44 Objects, serializing into a string, 4-45 Objects, single inheritance, 6-21 Objects, special methods, 5-22 Objects, type checking, 2-71 Objects, type of, 2-63 oct() function, 4-25 Old-style classes, 5-16 Online help, 1-17 open() function, 1-72 open() function, shelve module, 4-46, 4-47 Optimized mode, 7-23 Optimized mode (-O), 7-22 Optional features, defining function with, 3-18 Optional function arguments, 3-16 optparse module, 4-30 or operator, 1-40 ord() function, 4-25 __or__() method, 5-26 os module, 4-36, 11-6 os.path module, 4-39, 4-41 Output, print statement, 1-41 Overloading, lack of with functions, 3-12

P
pack() function, struct module, 10-11 Packing binary structures, 10-10, 10-11 Packing values into a tuple, 2-9 Parallel iteration, 2-42 Parsing, 9-3 pass statement, 1-42 path variable, sys module, 4-9, 4-10 Pattern syntax, regular expressions, 9-11 pdb module, 7-27 pdb module, commands, 7-29 Performance of string operations, 9-8 Performance statistics, profile module, 7-34 Perl, difference in string handling, 1-75 Perl, regular expressions and Python, 9-21 Perl, string interpolation and Python, 9-28

Persistent dictionary, 4-46 pexpect library, 11-24 pickle module, 4-44 pickle module, and strings, 4-45 Pipelines, and generators, 8-29 Pipes, subprocess module, 11-18 poll() method, Popen objects, 11-14 Polymorphism, and inheritance, 5-19 Popen() function, subprocess module, 11-8 popen2 module, 11-6 Positional function arguments, 3-17 Post-assertion, 7-21 pow() function, 1-50, 1-54, 4-24 Powers of numbers, 1-50 __pow__() method, 5-26 Pre-assertion, 7-21 Primitive datatypes, 2-3 print statement, 1-41, 4-32, 5-23 print statement, and files, 1-72 print statement, and str(), 1-64 print statement, trailing comma, 1-41 print, formatted output, 2-25 print_exc() function, traceback module, 7-25 Private attributes, 6-27 Private attributes, performance of name mangling, 6-29 Private methods, 6-28 profile module, 7-34 Profiling, 7-34 Program exit, 3-59 Program structure, 3-6 Program structure, definition order, 3-7 Propagation of exceptions, 3-51 Properties, and encapsulation, 6-31 py files, 1-19 Python Documentation, 1-17 Python eggs packages, 4-71 Python influences, 1-5 Python interpreter, 1-13 Python interpreter, keeping alive after execution, 7-26 Python interpreter, optimized mode, 7-22, 7-23 Python package index, 4-67 Python, extending with ctypes, 12-23 Python, reason created, 1-4 Python, running on command line, 1-24 Python, source files, 1-19 Python, starting on Mac, 1-12 Python, starting on Unix, 1-12 Python, starting on Windows, 1-11 Python, starting the interpreter, 1-9

Python, statement execution, 1-31 Python, uses of, 1-6, 1-7 Python, year created, 1-3 python.org website, 1-2 Pythonwin extension, 12-33, 12-34

R
raise statement, 1-80, 3-50 Raising exceptions, 1-80 random module, 4-34 Random numbers, 4-34 range() function, 2-38, 4-26 range() vs. xrange(), 2-38 Raw strings, 1-57 Raw strings, and regular expressions, 9-12 re module, 9-10 re module, compile() function, 9-13 re module, find all occurrences of a pattern, 9-19 re module, pattern syntax, 9-11 Read-eval loop, 1-14 Reading attributes on objects, 6-16 readline() method, files, 1-72 Redefining methods with inheritance, 5-18 Redefining output file, 4-32 Redirecting print to a file, 1-72 Reference counting, 2-62, 2-64, 2-67, 6-43 Reference counting, containers, 2-66 register_function() method, of SimpleXMLRPCServer, 12-21 register_instance() method, of SimpleXMLRPCServer, 12-21 Regular expression syntax, 9-11 Regular expressions (see re), 9-10 Regular expressions, and Perl, 9-21 Regular expressions, compiling patterns, 9-13 Regular expressions, extracting text from numbered groups, 9-18 Regular expressions, Match objects, 9-16 Regular expressions, matching, 9-14 Regular expressions, numbered groups, 9-17 Regular expressions, obtaining matching text, 9-16 Regular expressions, pattern replacement, 9-20 Regular expressions, searching, 9-15 Relational database, interface to, 4-61 Relational operators, 1-40 remove() method, lists, 1-70 Repeated function definitions, 3-12

Repeated imports, 4-8 replace() method of strings, 9-7 replace() method, strings, 1-61, 1-62 Replacing a substring, 9-7 Replacing text, 1-61 Replication of sequences, 2-30 repr() function, 4-25, 5-23 Representation of strings, 1-58 representing binary data, 10-6 __repr__() method, 5-23 Reserved names, 1-34, 1-35 return statement, 3-23 return statement, multiple values, 3-24 returncode attribute, Popen objects, 11-13 reversed() function, 4-26 rfind() method, strings, 1-62 rindex() method, strings, 1-62 rmtree() function, shutil module, 4-42 round() function, 4-24 Rounding errors, floating point, 1-53 __rshift__() method, 5-26 rsplit() method of strings, 9-4 rstrip() method, of strings, 9-5 Ruby, string interpolation and Python, 9-28 run() function, pdb module, 7-33 runcall() function, pdb module, 7-33 runeval() function, pdb module, 7-33 Running Python, 1-9 Runtime error vs. compile-time error, 3-42

S
safe_substitute() method, of Template objects, 9-30 Sample Python program, 1-26 Scope of iteration variable in loops, 2-35 Scripting, 3-3 Scripting language, 1-3 Scripting, defined, 3-4 Scripting, problem with, 3-5 search() method, of regular expression patterns, 9-15 Searching for a regular expression, 9-15 self parameter of methods, 5-7, 5-8 Sequence, 2-29 Sequence sorting, 2-49 Sequence, concatenation, 2-30 Sequence, extended slicing, 2-32 Sequence, indexing, 2-29 Sequence, length, 2-29 Sequence, looping over items, 2-34

Sequence, replication, 2-30 Sequence, slicing, 2-31 Sequence, string, 1-58 Serializing objects into strings, 4-45 Set theory, list comprehension, 2-56 Set type, 2-22 set() function, 2-22 setattr() function, 5-31 __setitem__() method, 5-24 setUp() method, unittest module, 7-17 setup.py file, third party modules, 4-70 setuptools module, 4-71 set_trace() function, pdb module, 7-28 Shallow copy, 2-67 Shell operations, 4-42 shelve module, 4-46 shutil module, 4-42 Side effects, 3-25 SimpleHTTPServer module, 12-18 SimpleXMLRPCServer module, 12-20 sin() function, math module, 1-54 Single-quoted string, 1-56 Single-step execution, 7-32 Slicing operator, 2-31 __slots__ attribute of classes, 6-38 socket module, example of, 12-16 SocketServer module, example of, 12-17 sort() method of lists, 3-44 sort() method, of lists, 2-46, 2-48 sorted() function, 2-49, 4-26 Sorting lists, 2-46 Sorting with a key function, 2-48 Sorting, of any sequence, 2-49 Source files, 1-19 Source files, and modules, 4-3 Source listing, in debugger, 7-31 span() method, of Match objects, 9-16 Special methods, 1-82, 5-22 split() method, of strings, 9-4 split() method, strings, 1-62, 1-66 Splitting strings, 9-4 Splitting text, 1-66 SQL queries, how to form, 4-63 SQL queries, injection attack risk, 4-63 SQL queries, value substitutions in, 4-65 sqlite3 module, 4-62 sqrt() function, math module, 1-54 Stack trace, in debugger, 7-30 Standard I/O streams, 4-32

Standard library, 4-22 start() method, of Match objects, 9-16 startswith() method, strings, 1-62 Statements, 1-31, 3-6 Status codes, subprocesses, 11-12 stderr variable, sys module, 4-32 stdin variable, sys module, 4-32 stdout variable, sys module, 4-32 StopIteration exception, 8-5 StopIteration exception, and generators, 8-20 str() function, 1-64, 4-25, 5-23 String concatenation, performance of, 9-24 String format codes, 2-26 String formatting, 2-25 String formatting, with dictionary, 2-27, 9-29 String interpolation, 2-27, 9-28 String joining, 9-25 String joining vs. concatenation, 9-26 String manipulation, performance of, 9-8 String replacement, 9-7 String searching, 9-6 String splitting, 9-4 String stripping, 9-5 String templates, 9-30 String type, 2-29 StringIO objects, 9-27 Strings, concatenation, 1-59 Strings, conversion to, 1-64 Strings, conversion to numbers, 1-55, 1-75 Strings, escape codes, 1-56, 1-57 Strings, immutability, 1-63 Strings, indexing, 1-59 Strings, length of, 1-60 Strings, literals, 1-56 Strings, methods, 1-61, 1-62 Strings, raw strings, 1-57 Strings, redirecting I/O to, 9-27 Strings, replication, 1-60 Strings, representation, 1-58 Strings, searching for substring, 1-60 Strings, slicing, 1-59 Strings, splitting, 1-66 Strings, triple-quoted, 1-56 Strings, use of raw strings, 9-12 strip() method, of strings, 9-5 strip() method, strings, 1-61, 1-62 Stripping characters, 1-61 Stripping characters from strings, 9-5 struct module, 10-11

struct module, format modifiers, 10-13 struct module, packing codes, 10-12 structures, creating with ctypes, 10-21 sub() method of regular expression patterns, 9-20 Subclass, 5-11 Subprocess, 11-4 subprocess module, 11-6, 11-7 subprocess module, capturing output, 11-17 subprocess module, capturing stderr, 11-20 subprocess module, changing the working directory, 11-11 subprocess module, collecting return codes, 11-13 subprocess module, I/O redirection, 11-21 subprocess module, interactive subprocesses, 11-24 subprocess module, pipes, 11-18 subprocess module, polling, 11-14 subprocess module, sending/receiving data, 11-18 subprocess module, setting environment variables, 11-10 Subprocess status code, 11-12 subprocess, killing with a signal, 11-15 substitute() method, of Template objects, 9-30 __sub__() method, 5-26 sum() function, 2-33, 4-24 Superclass, 5-11 Swig, 12-31 sys module, 4-27 System commands, executing as subprocess, 11-8 system() function, os module, 4-36, 11-5 SystemExit exception, 3-59

T
tan() function, math module, 1-54 TCP, server example, 12-16 tearDown() method, unittest module, 7-17 Template strings, 9-30 TestCase class, of unittest module, 7-13 Testing files, 4-40 testmod() function, doctest module, 7-8 Text parsing, 9-3 Text replacement, 1-61 Text strings, 1-58 Third party modules, 4-67, 4-68 Third party modules, and C/C++ code, 4-72 Third party modules, eggs format, 4-71 Third party modules, native installer, 4-69 Third party modules, setup.py file, 4-70 time module, 4-41

Tkinter module, 4-51 Tkinter module, sample widgets, 4-53 traceback module, 7-25 Traceback, uncaught exception, 7-24 Tracebacks, 1-80 Triple-quoted string, 1-56 True value, 1-47 Truncation, integer division, 1-51 Truth values, 1-40 try statement, 1-79, 3-50 Tuple, 2-6, 2-29 Tuple vs. List, 2-12 Tuple, immutability of, 2-8 Tuple, packing values, 2-9 Tuple, unpacking in for-loop, 2-41 Tuple, unpacking values, 2-10 Tuple, use of, 2-7 Tuple, using as function arguments, 3-40 Type checking, 2-71 Type conversion, 1-75 Type of objects, 2-63 type() function, 2-63, 2-71 typecodes, array module, 10-25 Types, 1-33 Types, numeric, 1-46 Types, primitive, 2-3

U
Unbound method, 5-30 Uniform access principle, 6-35 Unit testing, 7-12 unittest module, 7-12, 7-13 unittest module, example of, 7-14 unittest module, running tests, 7-16 unittest module, setup and teardown, 7-17 unpack() function, struct module, 10-11 Unpacking binary structures, 10-10, 10-11 Unpacking values from tuple, 2-10 upper() method, strings, 1-61, 1-62 urllib module, 1-77 urlopen() function, urllib module, 1-77 User-defined exceptions, 5-28

V
Value of exceptions, 3-53 Variable assignment, 2-61, 2-65

Variable assignment, assignment of globals in function, 3-31 Variable assignment, global vs. local, 3-27 Variable number of function arguments, 3-35, 3-36, 3-37, 3-38 Variables, and modules, 4-6 Variables, examining in debugger, 7-30 Variables, names of, 1-33 Variables, reference counting, 2-62, 2-64 Variables, scope in functions, 3-28, 3-29 Variables, type of, 1-33 vars() function, 9-29

W
wait() method, Popen objects, 11-13 walk() function, os module, 4-36 Walking a directory of files, 4-36 while statement, 1-36 Widgets, and Tkinter, 4-53 Windows, binary files, 10-9 Windows, killing a subprocess, 11-15 Windows, starting Python, 1-11 write() method, files, 1-72

X
XML overview, 12-5 XML parsing, extracting document elements, 12-9 XML parsing, extracting element text, 12-11 XML parsing, getting element attributes, 12-12 XML, ElementTree module, 12-7 XML, parsing with ElementTree module, 12-8 XML-RPC, 12-19 XML-RPC, server example, 12-20 xmlrpclib module, 12-20 __xor__() method, 5-26 xrange() function, 2-37, 4-26 xrange() vs. range(), 2-38

Z
zip() function, 2-42, 2-43, 2-44, 4-26
