Introduction to Python
Ilan Schnell
December 19, 2006 - Madison, WI
Python is an interpreted, interactive language.
Official site: http://www.python.org/
To get started, read the tutorial by Guido van Rossum (creator of the Python programming language).
Multi-paradigm:
Comes with many modules for: XML parsing, operating on tree objects, compression, network programming, regular expressions, URL parsing, hash functions, ...
Because Python is interpreted, it is slower than compiled languages; however, performance-critical parts of a program can be implemented in C, using SWIG if you want.
Python is pre-installed on many Linux distributions, but it is platform independent.
Installation (as usual): ./configure; make; su; make install
My examples have been tested with Python 2.5 (current version).
From the command line:
$ python -c "print 'Hello World!'"
Create a file hello.py:
print 'Hello World!'
and then (from command line):
$ python hello.py
Add at the top of your file (and make the file executable):
#!/usr/bin/python
and then (from command line):
$ ./hello.py
Interactive python shell:
$ python
Now you can enter any Python code:
>>> print 'Hello World!'
Hello World!
>>> (352+817)*781
912989
Python is a dynamically typed language, which means that values have types (not variables).
>>> x=3.5
>>> type(x)
<type 'float'>
>>> x='Hello'
>>> type(x)
<type 'str'>
>>> x=[ 37, 'a', 1.24, True, [12, 23], {'a':3,'b':4} ]
>>> type(x)
<type 'list'>
>>> type(10**100)
<type 'long'>
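As a small illustration of values (not names) carrying the type, the sketch below rebinds one name to several values and records the type name each time. The helper name type_name is hypothetical, not part of the standard library, and print is called with a single parenthesized argument so the code also runs on Python 3.

```python
def type_name(value):
    """Return the name of the type of a value, e.g. 'float'."""
    return type(value).__name__

x = 3.5
first = type_name(x)      # the value 3.5 is a float

x = 'Hello'               # the same name now refers to a string
second = type_name(x)

x = [37, 'a', 1.24]       # ... and now to a list
third = type_name(x)

print(first)    # float
print(second)   # str
print(third)    # list
```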
>>> p='the five boxing wizards jump quickly'
>>> p
'the five boxing wizards jump quickly'
>>> len(p)
36
>>> len(set(p))
27
>>> p.count(' ')
5
>>> words = p.split()
>>> words
['the', 'five', 'boxing', 'wizards', 'jump', 'quickly']
>>> len(words)
6
>>> words[3]
'wizards'
>>> words[:3]
['the', 'five', 'boxing']
>>> '--'.join(words)
'the--five--boxing--wizards--jump--quickly'
>>> p.replace(' ','--')
'the--five--boxing--wizards--jump--quickly'
>>> p.find('wiz')
16
>>> p.find('fox')
-1
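The result len(set(p)) == 27 above (26 letters plus the space) hints that p is a pangram. A minimal sketch of that check, assuming a hypothetical helper named is_pangram:

```python
import string

def is_pangram(text):
    """Return True if text contains every letter from a to z."""
    letters = set(text.lower())
    return letters.issuperset(string.ascii_lowercase)

p = 'the five boxing wizards jump quickly'
print(is_pangram(p))               # True
print(is_pangram('hello world'))   # False
```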
Suppose we want to parse the /etc/passwd file,
root:x:0:0:root:/root:/bin/bash
sshd:x:71:65:SSH daemon:/var/lib/sshd:/bin/false
ilan:x:500:100:Ilan Schnell:/home/ilan:/bin/tcsh
...
and create a dictionary that maps each uid to the username.
#!/usr/bin/python
uidmap={}
for line in file('/etc/passwd'):
    lst = line.split(':')
    uidmap[lst[2]] = lst[0]
while True:
    print uidmap[raw_input('Please enter uid: ')]
Program in action:
$ ./uidmap.py
Please enter uid: 500
ilan
Please enter uid: 71
sshd
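The program above stops with a KeyError when an unknown uid is entered. The following sketch shows one way to handle that with dict.get. It works on the sample lines shown earlier instead of the real /etc/passwd, and the names SAMPLE and build_uidmap are made up for this illustration.

```python
SAMPLE = """root:x:0:0:root:/root:/bin/bash
sshd:x:71:65:SSH daemon:/var/lib/sshd:/bin/false
ilan:x:500:100:Ilan Schnell:/home/ilan:/bin/tcsh
"""

def build_uidmap(lines):
    """Map uid (kept as a string) to username for passwd-style lines."""
    uidmap = {}
    for line in lines:
        fields = line.strip().split(':')
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        uidmap[fields[2]] = fields[0]
    return uidmap

uidmap = build_uidmap(SAMPLE.splitlines())
print(uidmap.get('500', 'unknown uid'))   # ilan
print(uidmap.get('42', 'unknown uid'))    # unknown uid
```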
Is whitespace significant in Python? Not in general; only the indentation level of your statements is significant.
Also, the exact amount of indentation doesn't matter, only the relative indentation of nested blocks.
if 1+2==3:
    print 'Hello'
if True:
    x=4.2
    print x+1.2
else:
    msg='Not true'
    print msg
Python does not allow you to obfuscate the structure of a program with misleading indentation.
Have you ever seen code like this in C or C++?
/* Warning: obfuscated C code! */
if( i == j )
    if( k == 0 )
        B[i][i] = 1.0 / A[i][i];
else
    B[i][j] = 4*k;
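To see how Python treats indentation, the following sketch feeds three small source strings to the built-in compile function. The helper name compiles and the sample strings are made up for this illustration.

```python
def compiles(source):
    """Return True if source is accepted by the Python compiler."""
    try:
        compile(source, '<example>', 'exec')
        return True
    except IndentationError:
        return False

# Two spaces or eight spaces: both blocks are consistent, so both compile.
narrow = "if True:\n  x = 1\n  y = 2\n"
wide = "if True:\n        x = 1\n        y = 2\n"

# The last line dedents to a level that matches no enclosing block.
bogus = "if True:\n    x = 1\n  y = 2\n"

print(compiles(narrow))   # True
print(compiles(wide))     # True
print(compiles(bogus))    # False
```

The first two results show that the amount of indentation is free, and the third shows that an inconsistent indentation level is rejected before the program runs.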
It is possible to write programs that handle exceptions.
#!/usr/bin/python
import sys
def usage():
    print 'Usage: %s <file>' % sys.argv[0]
    sys.exit(2)
if len(sys.argv) != 2:
    usage()
filename = sys.argv[1]
try:
    f = open(filename)
    print 'Opening `%s` for reading...' % filename
except IOError:
    print 'Could not open `%s` for reading.' % filename
    print 'Using default file instead.'
    try:
        f = open('default.txt')
    except IOError:
        print 'Could not open `default.txt` either. Exiting'
        sys.exit(1)
print f.read()
f.close()
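The nested try blocks above can also be written as a loop over candidate file names. This is only a sketch of that idea: the function name open_first is hypothetical, and a temporary file stands in for default.txt so the example is self-contained.

```python
import os
import tempfile

def open_first(filenames):
    """Return the first file in filenames that can be opened for reading.

    Raises IOError if none of them can be opened.
    """
    for name in filenames:
        try:
            return open(name)
        except IOError:
            continue  # try the next candidate
    raise IOError('none of the files could be opened: %r' % (filenames,))

# Demonstration with a temporary file standing in for default.txt.
fd, default_path = tempfile.mkstemp()
os.write(fd, b'default contents\n')
os.close(fd)

f = open_first(['/no/such/file', default_path])
try:
    text = f.read()
finally:
    f.close()
    os.remove(default_path)
print(text)
```

Catching IOError instead of using a bare except lets unrelated errors, such as a KeyboardInterrupt, pass through.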
The many modules that come with Python can easily be imported.
>>> import math
>>> math.sqrt(2.0)
1.4142135623730951
Import all objects from a module into the current scope:
>>> from math import *
>>> sqrt(2.0)*cos(pi/4)
1.0000000000000002
Only import certain objects from a module:
>>> from math import exp, pi
>>> exp(pi)
23.140692632779267
>>> cos(pi)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'cos' is not defined
Aliasing:
>>> from math import pi, sqrt as rt
>>> 1/rt(2*pi)
0.3989422804014327
Let's take a look at a simple function:
>>> def sum(x, y):
...     return x+y
...
>>> sum(5, 3)
8
>>> sum('a', 'bc')
'abc'
>>> sum('a', 5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in sum
TypeError: cannot concatenate 'str' and 'int' objects
What if the function should handle this case as well?
>>> from types import *
>>> def sum(x, y):
...     if type(x)==StringType and type(y)==IntType:
...         return x+str(y)
...     return x+y
...
>>> sum('a', 5)
'a5'
>>> sum('Hello ', 'World!')
'Hello World!'
>>> sum(1100, 325)
1425
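The same idea can be sketched with isinstance, which also accepts subclasses and needs no import from the types module. The function is named add here (a hypothetical name) so that it does not shadow the built-in sum, and handling the case of a number followed by a string is an assumption that goes beyond the example above.

```python
def add(x, y):
    """Add two values; if exactly one is a string, convert the other."""
    if isinstance(x, str) and not isinstance(y, str):
        return x + str(y)
    if isinstance(y, str) and not isinstance(x, str):
        return str(x) + y
    return x + y

print(add('a', 5))        # a5
print(add(5, 'a'))        # 5a
print(add(1100, 325))     # 1425
```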
The re module provides Perl-style regular expression patterns.
For an introduction, I recommend A.M. Kuchling's howto.
Say we wish to have a program that tells us whether an email address (in its first argument) is syntactically valid.
#!/usr/bin/python
import sys, re
pattern = re.compile(r'^[\w.-]+@[\w.-]+\.[A-Z]{2,4}$', re.I)
if pattern.match(sys.argv[1]):
    print "OK"
else:
    print "Something is wrong with this email address."
Here is how you invoke the program:
$ ./regex.py guido@google.com
OK
$ ./regex.py george.whitehouse.gov
Something is wrong with this email address.
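Wrapping the pattern in a function makes it easy to try several addresses at once. This is a sketch only: the function name looks_like_email and the third test address are made up, and the pattern is the one from the program above, so it accepts only top-level domains of two to four letters.

```python
import re

# Same pattern as in the program above; re.I makes [A-Z] case-insensitive.
EMAIL = re.compile(r'^[\w.-]+@[\w.-]+\.[A-Z]{2,4}$', re.I)

def looks_like_email(address):
    """Return True if address matches the simple pattern above."""
    return EMAIL.match(address) is not None

for candidate in ['guido@google.com', 'george.whitehouse.gov', 'a@b.museum']:
    print('%s -> %s' % (candidate, looks_like_email(candidate)))
```

The last address is rejected because its top-level domain has six letters, which shows how quickly such a simple pattern reaches its limits.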
The doctest module allows you to check examples in your source code.
Consider the following code, fib.py:
def fib(n):
    """Returns Fibonacci series up to n:
    >>> fib(100)
    [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
    """
    res, a, b = [], 1, 1
    while b <= n:
        res.append(b)
        a, b = b, a+b
    return res
if __name__ == "__main__":
    import doctest
    doctest.testmod()
When you run
$ python fib.py
the example inside the source code will be verified.
>>> from fib import *
>>> fib(100)
[1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
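Besides doctest.testmod(), the doctest module also exposes the finder and runner classes it is built from. The sketch below uses them to check the examples of a single function and to return the counts; the names square and run_examples are hypothetical.

```python
import doctest

def square(n):
    """Return n squared.

    >>> square(3)
    9
    >>> square(-2)
    4
    """
    return n * n

def run_examples(func):
    """Run the docstring examples of func; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = attempted = 0
    for test in finder.find(func, func.__name__,
                            globs={func.__name__: func}):
        result = runner.run(test)
        failed += result.failed
        attempted += result.attempted
    return failed, attempted

print(run_examples(square))
```

A failed count of zero means that every example in the docstring produced the output written next to it.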
Print the src attribute values of all img elements found in a well-formed XHTML document retrieved from the internet.
#!/usr/bin/python
from urllib2 import urlopen
from xml.dom.minidom import parse
dom = parse(urlopen('http://schnell-web.net/'))
for elt in dom.getElementsByTagName('img'):
    print elt.getAttribute('src')
Output:
icon/welcome.gif
imag/tove.jpg
imag/wolf.jpg
imag/arvin.jpg
imag/ilan.jpg
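The same lookup can be tried without a network connection by parsing a string with parseString from the same module. The document text and image paths below are made up for this sketch, as is the function name image_sources.

```python
from xml.dom.minidom import parseString

# A tiny well-formed XHTML document; the image paths are made up.
PAGE = """<html><body>
<p><img src="imag/one.jpg" alt="one" /></p>
<p><img src="imag/two.jpg" alt="two" /></p>
</body></html>"""

def image_sources(xhtml):
    """Return the src attribute of every img element in the document."""
    dom = parseString(xhtml)
    return [elt.getAttribute('src')
            for elt in dom.getElementsByTagName('img')]

print(image_sources(PAGE))
```

Note that minidom only accepts well-formed XML, so a page with unclosed tags raises a parse error instead of returning a partial result.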
The exec statement executes a string of Python code.
Suppose you wish to write a program that reads a config file ~/.myprogrc:
# Configuration file for my program
verbose = True
errorlog = "/var/log/myprogram/error.log"
colors = { "fore" : "#ae2020",
           "back" : "#f7d8c2" }
Instead of scanning, parsing and analyzing the config file yourself, write the config file in Python syntax and let Python do all the work for you:
#!/usr/bin/python
import os.path
exec(file(os.path.expanduser('~/.myprogrc')))
if verbose:
    print 'Welcome to my program!'
    print 'Background-color is:', colors['back']
Running the program in a shell:
$ ./myprog.py
Welcome to my program!
Background-color is: #f7d8c2
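A variation on this idea is to execute the config text in a dictionary of its own, so that its names do not mix with the program's variables. The sketch below assumes a hypothetical helper load_config and an inline config string in place of ~/.myprogrc. Keep in mind that exec runs whatever code the file contains, so this is only suitable for config files you trust.

```python
CONFIG_TEXT = '''
# Configuration in Python syntax, as in the example above
verbose = True
colors = { "fore" : "#ae2020",
           "back" : "#f7d8c2" }
'''

def load_config(text):
    """Execute config text in its own namespace and return the settings."""
    namespace = {}
    exec(text, namespace)
    # exec adds __builtins__ to the namespace; drop it from the result.
    namespace.pop('__builtins__', None)
    return namespace

config = load_config(CONFIG_TEXT)
print(config['verbose'])          # True
print(config['colors']['back'])   # #f7d8c2
```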
Python has a variety of modules available for network programming.
More information about programming IP Sockets.
We make use of the module SocketServer.
Handle the HTTP request by sending back an HTML page.
This HTML page contains:
IP address and port number of the client connecting.
The original HTTP request.
#!/usr/bin/python
from SocketServer import BaseRequestHandler, TCPServer
import sys
if len(sys.argv) != 2:
    print 'USAGE: %s <port>' % sys.argv[0]
    sys.exit(2)
def htmlSave(s):
    # Escape '&' first so that the entities added for '<' and '>' stay intact.
    htmlentList = [ ('&','&amp;'), ('<','&lt;'), ('>','&gt;') ]
    return reduce( lambda c, (a,b): c.replace(a,b), htmlentList, s )
class RequestHandler(BaseRequestHandler):
    def handle(self):
        print "Client connected:", self.client_address
        content="""<html><head><title>HTTP Server</title></head><body>
<h2>Information about this HTTP/TCP request:</h2>
<p>IP: <b>%s</b><br />
Port: <b>%s</b></p>
<h4>Message:</h4>
<pre style='border: solid 1px #bdf; background-color: #def;'>%s</pre>
</body></html>
""" % ( self.client_address[0],
        self.client_address[1],
        htmlSave(self.request.recv(2**16).replace('\r','<CR>')) )
        self.request.sendall( """HTTP/1.0 200 OK\r
Content-Type: text/html; charset=utf-8\r
Content-Length: %s\r
\r\n%s""" % ( len(content), content ))
        self.request.close()
TCPServer(('',int(sys.argv[1])), RequestHandler).serve_forever()
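The helper htmlSave in the server replaces the characters that are special in HTML before the request is echoed back. The same idea, written as a plain loop under the hypothetical name html_escape, may be easier to follow than the reduce call:

```python
def html_escape(text):
    """Replace the characters &, < and > with HTML entities."""
    # '&' must come first, otherwise the '&' introduced by '&lt;' and
    # '&gt;' would be escaped a second time.
    for char, entity in [('&', '&amp;'), ('<', '&lt;'), ('>', '&gt;')]:
        text = text.replace(char, entity)
    return text

print(html_escape('GET /?a=1&b=<2> HTTP/1.1'))
```

Without this step, a request containing markup would be interpreted by the browser instead of being displayed as text.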
Generators are a special class of functions. Regular functions compute an object and return it, but generators return a stream of objects.
def fib():
    a, b = 1, 1
    while True:
        a, b = b, a+b
        yield a
The above function will return an infinite stream of Fibonacci numbers:
>>> i = fib()
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
5
>>> for n in i:
...     print n,
...
8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657
46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887
9227465 14930352 24157817 39088169 63245986 102334155 ...
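Because the stream never ends, a loop over it has to be interrupted by hand. One way to take only a finite prefix is itertools.islice from the standard library, sketched here with the same generator:

```python
from itertools import islice

def fib():
    """Generate the Fibonacci numbers 1, 2, 3, 5, 8, ... without end."""
    a, b = 1, 1
    while True:
        a, b = b, a+b
        yield a

# islice stops after ten items, so list() terminates.
first_ten = list(islice(fib(), 10))
print(first_ten)   # [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```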
In Python, functions are ordinary objects. Therefore, a function can take functions as input, or return new functions, or both.
The function integral takes another function as an argument:
def integral(f, a, b):
    N = 1000
    # float() avoids integer division when a and b are both integers
    dx = float(b-a)/N
    return sum(f(a+dx*n) for n in xrange(N))*dx
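A short usage sketch of integral follows. It restates the function so the block is self-contained, uses range so that it also runs on Python 3, and turns N into an optional parameter, which is an assumption not present in the original. The name sum must refer to the built-in here, not to the two-argument example function defined earlier.

```python
import math

def integral(f, a, b, N=1000):
    """Approximate the integral of f over [a, b] with a left Riemann sum."""
    dx = float(b - a) / N
    return sum(f(a + dx*n) for n in range(N)) * dx

print(integral(lambda x: x*x, 0, 1))     # close to 1/3
print(integral(math.sin, 0, math.pi))    # close to 2
```

Both a lambda expression and a function from the math module can be passed in, since either one is just an object that can be called.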
The function mkAdder takes an integer and returns a function:
def mkAdder(n):
    def add(i):
        return i+n
    return add
add3 = mkAdder(3)
print add3(2)        # 5
print add3(8)        # 11
print mkAdder(5)(4)  # 9
The last line may seem peculiar in that a single function (+) of two arguments (the two numbers being added) is written in terms of two functions which take one argument each.
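The standard library offers a ready-made way to fix one argument of an existing function: functools.partial (available since Python 2.5). The sketch below rebuilds the adders from operator.add instead of a hand-written closure.

```python
from functools import partial
from operator import add

add3 = partial(add, 3)       # fixes the first argument of add to 3
print(add3(2))               # 5
print(add3(8))               # 11
print(partial(add, 5)(4))    # 9
```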
A higher-order function takes functions as input, returns new functions, or both.
def complement(f):
    """
    Returns the complement (which is a function) of function f.
    """
    def res(*args):
        return not f(*args)
    return res
def even(n):
    return n % 2 == 0
odd = complement(even)
Note that the argument of complement is not limited to a function which only takes one argument.
Without the higher-order function complement, we would have to write:
def odd(n):
    return not even(n)
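To illustrate the remark about functions of more than one argument, here is the same *args wrapper applied to a two-argument predicate. The names negate, divides and does_not_divide are hypothetical and chosen for this sketch only.

```python
def negate(f):
    """Return a function that gives the logical negation of f's result."""
    def res(*args):
        return not f(*args)
    return res

def divides(d, n):
    """Return True if d divides n without remainder."""
    return n % d == 0

does_not_divide = negate(divides)
print(divides(3, 12))           # True
print(does_not_divide(3, 12))   # False
print(does_not_divide(5, 12))   # True
```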
Python:
is an interpreted, mature and powerful high-level language.
is easy to learn.
comes with a large number of useful modules.
has many APIs available, e.g. for connecting Python to databases and web servers.
is very attractive for rapid application development, and for use as a scripting or glue language to connect existing components together.
runs on all major platforms, and can be freely distributed.