Introduction to Python

Ilan Schnell

December 19, 2006 - Madison, WI

http://ilan.schnell-web.net/prog/pythonintro/


What is Python?

Python is an interpreted, interactive language.

Official site: http://www.python.org/

To get started, read the tutorial by Guido van Rossum (creator of the Python programming language).

Multi-paradigm:

Comes with many modules for: XML parsing, operating on tree objects, compression, network programming, regular expressions, URL parsing, hash functions, ...

Because Python is interpreted, it is slower than compiled languages; however, performance-critical parts of a program can be implemented in C, optionally using SWIG to generate the bindings.

Python is pre-installed on many Linux distributions, but it is platform independent and runs on other systems as well.

Installation (as usual): ./configure; make; su; make install

My examples have been tested with Python 2.5 (current version).

From the command line:

$ python -c "print 'Hello World!'"

Create a file hello.py:

print 'Hello World!'

and then (from command line):

$ python hello.py

Add this line at the top of your file (and make the file executable):

#!/usr/bin/python

and then (from command line):

$ ./hello.py

Interactive python shell:

$ python

now you can enter any python code:

>>> print 'Hello World!'
Hello World!
>>> (352+817)*781
912989

Most important Built-in Types

Python is a dynamically typed language, which means that values have types (not variables).

>>> x=3.5
>>> type(x)
<type 'float'>
>>> x='Hello'
>>> type(x)
<type 'str'>
>>> x=[ 37, 'a', 1.24, True, [12, 23], {'a':3,'b':4} ]
>>> type(x)
<type 'list'>
>>> type(10**100)
<type 'long'>

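The same point as a small script: one name is rebound to values of different types. This is only a sketch. It uses Python 3 syntax, which is an assumption (the sessions on these slides were run with Python 2.5), and the helper name type_name is made up for illustration.

```python
def type_name(value):
    """Return the name of the type of value (hypothetical helper)."""
    return type(value).__name__

x = 3.5
first = type_name(x)    # the value 3.5 is a float
x = 'Hello'
second = type_name(x)   # the same name now refers to a str
x = [37, 'a', 1.24]
third = type_name(x)    # ... and now to a list

# In Python 3 there is no separate long type; big integers are plain int.
big = type_name(10**100)

print(first, second, third, big)
```

The type belongs to the value that x currently refers to, not to the name x itself.
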
Interactive (example) session:

>>> p='the five boxing wizards jump quickly'
>>> p
'the five boxing wizards jump quickly'
>>> len(p)
36
>>> len(set(p))
27
>>> p.count(' ')
5
>>> words = p.split()
>>> words
['the', 'five', 'boxing', 'wizards', 'jump', 'quickly']
>>> len(words)
6
>>> words[3]
'wizards'
>>> words[:3]
['the', 'five', 'boxing']
>>> '--'.join(words)
'the--five--boxing--wizards--jump--quickly'
>>> p.replace(' ','--')
'the--five--boxing--wizards--jump--quickly'
>>> p.find('wiz')
16
>>> p.find('fox')
-1

Simple File Parsing:

Suppose we want to parse the /etc/passwd file,

root:x:0:0:root:/root:/bin/bash
sshd:x:71:65:SSH daemon:/var/lib/sshd:/bin/false
ilan:x:500:100:Ilan Schnell:/home/ilan:/bin/tcsh
...

and create a dictionary of the uid mapping to the username.

#!/usr/bin/python

uidmap={}
for line in file('/etc/passwd'):
    lst = line.split(':')
    uidmap[lst[2]] = lst[0]

while True:
    print uidmap[raw_input('Please enter uid: ')]

Program in action:

$ ./uidmap.py 
Please enter uid: 500
ilan
Please enter uid: 71
sshd
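
The lookup loop above raises a KeyError when an unknown uid is entered. A possible variation, sketched here in Python 3 syntax (an assumption, the slides use Python 2.5), works on a list of sample lines instead of the real file and uses dict.get for unknown uids. The function name build_uidmap is hypothetical.

```python
def build_uidmap(lines):
    """Map uid (as a string) to username for passwd-style lines."""
    uidmap = {}
    for line in lines:
        fields = line.strip().split(':')
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        uidmap[fields[2]] = fields[0]
    return uidmap

# Sample data copied from the excerpt above, not read from /etc/passwd.
sample = [
    'root:x:0:0:root:/root:/bin/bash',
    'sshd:x:71:65:SSH daemon:/var/lib/sshd:/bin/false',
    'ilan:x:500:100:Ilan Schnell:/home/ilan:/bin/tcsh',
]

uidmap = build_uidmap(sample)
print(uidmap.get('500', 'unknown uid'))
print(uidmap.get('42', 'unknown uid'))
```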

Whitespace is significant

if 1+2==3:
  print 'Hello'
  if True:
       x=4.2
       print x+1.2
  else:
   msg='Not true'
   print msg

Python forces you to use the indentation that you would use anyway.

Python does not allow you to obfuscate the structure of a program by using bogus indentation.

Have you ever seen code like this in C or C++?

/*  Warning: obfuscated C code! */
if( i == j )
    if( k == 0 )
        B[i][i] = 1.0 / A[i][i];
else
    B[i][j] = 4*k;
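
In the C fragment the else binds to the inner if (k == 0), although the indentation suggests the outer one. For comparison, here is a rough Python counterpart in which the indentation alone decides where the else belongs. The function name branch and its return strings are invented for this sketch, which is written in Python 3 syntax (an assumption).

```python
def branch(i, j, k):
    """Report which branch runs; the else belongs to the outer if."""
    result = 'nothing'
    if i == j:
        if k == 0:
            result = 'diagonal'
    else:
        # Aligned with the outer if, so it runs only when i != j.
        result = 'off-diagonal'
    return result

print(branch(1, 1, 0))
print(branch(1, 1, 5))
print(branch(1, 2, 0))
```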

Handling Exceptions

It is possible to write programs that handle exceptions.

#!/usr/bin/python
import sys

def usage():
    print 'Usage: %s <file>' % sys.argv[0]
    sys.exit(2)

if len(sys.argv) != 2:
    usage()

filename = sys.argv[1]

try:
    f = open(filename)
    print 'Opening `%s` for reading...' % filename
except IOError:
    # open() raises IOError if the file is missing or unreadable
    print 'Could not open `%s` for reading.' % filename
    print 'Using default file instead.'
    try:
        f = open('default.txt')
    except IOError:
        print 'Could not open `default.txt` either.  Exiting'
        sys.exit(1)

print f.read()
f.close()
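
The same fallback idea can be packed into a function that tries a list of candidate files in order. This is a sketch under assumptions: Python 3 syntax, a hypothetical function name read_with_fallback, and made-up file names inside a temporary directory so that the example does not touch real files.

```python
import os
import tempfile

def read_with_fallback(filename, default):
    """Return the contents of filename, or of default if that fails."""
    for name in (filename, default):
        try:
            with open(name) as f:
                return f.read()
        except IOError:
            # In Python 3, IOError is an alias of OSError.
            continue
    return None  # neither file could be opened

# Demonstration: only the default file exists.
tmpdir = tempfile.mkdtemp()
default_path = os.path.join(tmpdir, 'default.txt')
with open(default_path, 'w') as f:
    f.write('fallback contents\n')

text = read_with_fallback(os.path.join(tmpdir, 'missing.txt'), default_path)
print(text)
```

Catching only IOError (rather than everything) keeps unrelated errors, such as a typo in a variable name, visible.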

Importing modules

The many modules that come with Python are easy to import.

>>> import math
>>> math.sqrt(2.0)
1.4142135623730951

Import all objects from a module into current scope:

>>> from math import *
>>> sqrt(2.0)*cos(pi/4)
1.0000000000000002

Only import certain objects from a module:

>>> from math import exp, pi
>>> exp(pi)
23.140692632779267
>>> cos(pi)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'cos' is not defined

Aliasing:

>>> from math import pi, sqrt as rt
>>> 1/rt(2*pi)
0.3989422804014327

Functions

Let's take a look at a simple function:

>>> def sum(x, y):
...     return x+y
... 
>>> sum(5, 3)
8
>>> sum('a', 'bc')
'abc'
>>> sum('a', 5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in sum
TypeError: cannot concatenate 'str' and 'int' objects

What if the function should handle this case as well?

>>> from types import *
>>> def sum(x, y):
...     if type(x)==StringType and type(y)==IntType:
...         return x+str(y)
...     return x+y
... 
>>> sum('a', 5)
'a5'
>>> sum('Hello ', 'World!')
'Hello World!'
>>> sum(1100, 325)
1425
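
The type check above only covers a string followed by an integer. One possible generalization uses isinstance and converts both arguments whenever either one is a string. This is a sketch in Python 3 syntax (an assumption; types.StringType and types.IntType no longer exist there), and the name add is chosen here to avoid shadowing the built-in sum.

```python
def add(x, y):
    """Add two numbers, or concatenate if either argument is a string."""
    if isinstance(x, str) or isinstance(y, str):
        return str(x) + str(y)
    return x + y

print(add('a', 5))
print(add(5, 'a'))
print(add(1100, 325))
```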

Regular Expressions

The re module provides Perl-style regular expression patterns.

For an introduction, I recommend A.M. Kuchling's howto.

Say we wish to have a program that tells us whether an email address (in its first argument) is syntactically valid.

#!/usr/bin/python
import sys, re

pattern = re.compile(r'^[\w.-]+@[\w.-]+\.[A-Z]{2,4}$', re.I)

if pattern.match(sys.argv[1]):
    print "OK"
else:
    print "Something is wrong with this email address."

Here is how you invoke the program:

$ ./regex.py guido@google.com
OK
$ ./regex.py george.whitehouse.gov
Something is wrong with this email address.
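
To try the pattern without the command line, it can be wrapped in a function. The sketch below reuses the exact pattern from the program above; the function name looks_valid is hypothetical and the code uses Python 3 syntax (an assumption). The third address is made up to show one limit of the pattern: the final part may have at most four letters.

```python
import re

# Same pattern as in regex.py above.
PATTERN = re.compile(r'^[\w.-]+@[\w.-]+\.[A-Z]{2,4}$', re.I)

def looks_valid(address):
    """Return True if address matches the simple pattern."""
    return PATTERN.match(address) is not None

for address in ('guido@google.com',
                'george.whitehouse.gov',
                'user@example.museum'):
    print(address, looks_valid(address))
```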

Checking Examples in Source Code

The doctest module allows you to check examples in your source code.

Consider the following code fib.py:

def fib(n):
    """Returns Fibonacci series up to n:
    >>> fib(100)
    [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
    """
    res, a, b = [], 1, 1
    while b <= n:
        res.append(b)
        a, b = b, a+b
    return res

if __name__ == "__main__":
    import doctest
    doctest.testmod()

When you run

$ python fib.py

the example inside the source code will be verified.

>>> from fib import *
>>> fib(100)
[1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
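
Besides doctest.testmod(), the docstring examples of a single function can be run and counted from within a program. The sketch below does this with doctest.DocTestFinder and doctest.DocTestRunner. It is written in Python 3 syntax (an assumption), the helper name check_examples is made up, and the docstring example uses fib(10) to keep the expected list short.

```python
import doctest

def fib(n):
    """Return the Fibonacci series up to n.

    >>> fib(10)
    [1, 2, 3, 5, 8]
    """
    res, a, b = [], 1, 1
    while b <= n:
        res.append(b)
        a, b = b, a + b
    return res

def check_examples(func):
    """Run the docstring examples of func; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # The examples need the function itself in their namespace.
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.failures, runner.tries

failed, attempted = check_examples(fib)
print(failed, attempted)
```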

Basic XML parsing:

Print the src attribute values of all img elements found in a well-formed XHTML document retrieved from the internet.

#!/usr/bin/python
from urllib2 import urlopen
from xml.dom.minidom import parse

dom = parse(urlopen('http://schnell-web.net/'))

for elt in dom.getElementsByTagName('img'):
    print elt.getAttribute('src')
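
The same parsing step can be tried without a network connection by feeding minidom a string. In this sketch the XHTML snippet is invented (it only borrows two of the file names from the output listed here), image_sources is a hypothetical helper, and the code uses Python 3 syntax (an assumption; urllib2 was renamed urllib.request there).

```python
from xml.dom.minidom import parseString

# Made-up, well-formed XHTML fragment for illustration.
XHTML = """<html><body>
<p><img src="icon/welcome.gif" alt="welcome" /></p>
<p><img src="imag/ilan.jpg" alt="ilan" /></p>
</body></html>"""

def image_sources(markup):
    """Return the src attribute of every img element in markup."""
    dom = parseString(markup)
    return [elt.getAttribute('src')
            for elt in dom.getElementsByTagName('img')]

sources = image_sources(XHTML)
for src in sources:
    print(src)
```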

Output:

icon/welcome.gif
imag/tove.jpg
imag/wolf.jpg
imag/arvin.jpg
imag/ilan.jpg

The exec statement

Executes a string of Python code.

Suppose you wish to write a program which reads a config-file ~/.myprogrc:

# Configuration file for my program
verbose  = True
errorlog = "/var/log/myprogram/error.log"
colors   = { "fore" : "#ae2020",
             "back" : "#f7d8c2" }

Instead of scanning, parsing and analyzing the config-file yourself, write the config-file in python syntax and let python do all the work for you:

#!/usr/bin/python
import os.path
exec(file(os.path.expanduser('~/.myprogrc')))
if verbose:
    print 'Welcome to my program!'
    print 'Background-color is:', colors['back']

Running the program in a shell:

$ ./myprog.py
Welcome to my program!
Background-color is: #f7d8c2
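
A possible refinement is to execute the configuration in its own dictionary instead of the program's global namespace, so that config names cannot overwrite the program's own variables. The sketch below assumes Python 3 (where exec is a function), uses an inline string in place of ~/.myprogrc, and the name load_config is hypothetical. Note that exec runs arbitrary code, so this only suits config files you trust.

```python
# Inline stand-in for the contents of the config file.
CONFIG_TEXT = '''
verbose  = True
errorlog = "/var/log/myprogram/error.log"
colors   = { "fore" : "#ae2020",
             "back" : "#f7d8c2" }
'''

def load_config(text):
    """Execute text as Python code and return the names it defines."""
    namespace = {}
    exec(text, namespace)
    # exec adds __builtins__ to the dict; keep only the config names.
    return dict((key, value) for key, value in namespace.items()
                if not key.startswith('_'))

config = load_config(CONFIG_TEXT)
if config['verbose']:
    print('Background-color is:', config['colors']['back'])
```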

Network programming in python

Simple HTTP Server

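What does our server do? It answers every request with an HTML page that shows the client's IP address and port and echoes the received request message.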

[Figure: browser window when connecting to our server]

#!/usr/bin/python
from SocketServer import BaseRequestHandler, TCPServer
import sys

if len(sys.argv) != 2:
    print 'USAGE: %s <port>' % sys.argv[0]
    sys.exit(2)

def htmlSave(s):
    htmlentList = [ ('&','&amp;'), ('<','&lt;'), ('>','&gt;') ]
    return reduce( lambda c, (a,b): c.replace(a,b), htmlentList, s )

class RequestHandler(BaseRequestHandler):
    def handle(self):
        print "Client connected:", self.client_address
        
        content="""<html><head><title>HTTP Server</title></head><body>
<h2>Information about this HTTP/TCP request:</h2>
<p>IP: <b>%s</b><br />
 Port: <b>%s</b></p>
<h4>Message:</h4>
<pre style='border: solid 1px #bdf; background-color: #def;'>%s</pre>
</body></html>
""" % ( self.client_address[0],
        self.client_address[1],
        htmlSave(self.request.recv(2**16).replace('\r','<CR>')) )
        
        self.request.sendall( """HTTP/1.0 200 OK\r
Content-Type: text/html; charset=utf-8\r
Content-Length: %s\r
\r\n%s""" % ( len(content), content ))
        
        self.request.close()

TCPServer(('',int(sys.argv[1])), RequestHandler).serve_forever()
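
The escaping helper can be looked at in isolation. The order of the replacements matters: the ampersand has to be handled first, otherwise the ampersands introduced by the other two replacements would be escaped again. The sketch below is a Python 3 rendering (an assumption): there, reduce lives in functools and a lambda can no longer unpack a tuple parameter, so the pair is indexed instead. The name html_save is a restyled, hypothetical variant of the helper above.

```python
from functools import reduce

# '&' must come first so that '&lt;' and '&gt;' are not escaped twice.
HTML_ENTITIES = [('&', '&amp;'), ('<', '&lt;'), ('>', '&gt;')]

def html_save(s):
    """Replace the characters &, < and > by their HTML entities."""
    return reduce(lambda text, pair: text.replace(pair[0], pair[1]),
                  HTML_ENTITIES, s)

escaped = html_save('<b>a & b</b>')
print(escaped)
```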

Generators

Generators are a special class of functions. Regular functions compute an object and return it, but generators return a stream of objects.

def fib():
    a, b = 1, 1
    while True:
        a, b = b, a+b
        yield a

The above function will return an infinite stream of Fibonacci numbers:

>>> i = fib()
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
5
>>> for n in i:
...     print n,
... 
8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657
46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887
9227465 14930352 24157817 39088169 63245986 102334155 ...
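
Because the stream never ends, a for loop over it has to be interrupted by hand. To take only a finite number of items, itertools.islice can be used, as in this sketch (Python 3 syntax is an assumption; there the built-in next(i) replaces i.next()).

```python
from itertools import islice

def fib():
    """Yield an infinite stream of Fibonacci numbers."""
    a, b = 1, 1
    while True:
        a, b = b, a + b
        yield a

# Take the first ten items without exhausting the generator.
first_ten = list(islice(fib(), 10))
print(first_ten)

# The generator is lazy: values are computed one at a time on demand.
stream = fib()
print(next(stream), next(stream))
```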

More on functions

In Python, functions are ordinary objects. Therefore, a function can take functions as input, or return new functions, or both.

The function integral takes another function as an argument:

def integral(f, a, b):
    N = 1000
    dx = float(b-a)/N   # float() avoids integer division for integer limits
    return sum(f(a+dx*n) for n in xrange(N))*dx
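
A usage sketch for integral, under some assumptions: Python 3 syntax (range instead of xrange), the number of steps exposed as an optional parameter N (not part of the original signature), and an explicit float conversion so that integer limits also work under Python 2, where dividing two integers truncates. The function computes a left Riemann sum, so the results are approximations.

```python
import math

def integral(f, a, b, N=1000):
    """Approximate the integral of f from a to b by a left Riemann sum."""
    dx = float(b - a) / N
    return sum(f(a + dx * n) for n in range(N)) * dx

# The integral of sin from 0 to pi is exactly 2.
area = integral(math.sin, 0, math.pi)
print(area)

# The integral of 2*x from 0 to 1 is exactly 1.
print(integral(lambda x: 2 * x, 0, 1))
```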

The function mkAdder takes an integer and returns a function:

def mkAdder(n):
    def add(i):
        return i+n
    return add

add3 = mkAdder(3)
print add3(2)       #   5
print add3(8)       #  11

print mkAdder(5)(4) #   9

The last line may seem peculiar in that a single function (+) of two arguments (the two numbers being added) is written in terms of two functions which take one argument each.
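
Rewriting a function of two arguments as a chain of one-argument functions is known as currying. It can be done generically, as in the following sketch. The helper name curry2 is made up, and the code is written in Python 3 syntax (an assumption).

```python
import operator

def curry2(f):
    """Turn a function of two arguments into a chain of two functions."""
    def take_first(x):
        def take_second(y):
            return f(x, y)
        return take_second
    return take_first

# operator.add is the function form of +, so this behaves like mkAdder.
mk_adder = curry2(operator.add)
add3 = mk_adder(3)

print(add3(2))
print(mk_adder(5)(4))
```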

Higher order functions

A higher-order function takes functions as input, returns new functions, or both.

def complement(f):
    """
    Returns the complement (which is a function) of function f.
    """
    def res(*args):
        return not f(*args)
    return res

def even(n):
    return n % 2 == 0

odd = complement(even)

Note that the argument of complement is not limited to a function which only takes one argument.

Without the higher-order function complement, we would have to write:

def odd(n):
    return not even(n)
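
Because the wrapper passes on all of its arguments with *args, the same idea also works for predicates of two (or zero) arguments. A short sketch in Python 3 syntax (an assumption); the names negate, divides and does_not_divide are invented for this example, and negate mirrors the higher-order function defined above.

```python
def negate(f):
    """Return a function that gives the logical opposite of f."""
    def res(*args):
        return not f(*args)
    return res

def divides(d, n):
    """Return True if d divides n without remainder."""
    return n % d == 0

does_not_divide = negate(divides)

print(does_not_divide(3, 10))
print(does_not_divide(5, 10))
```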

Summary

Python