Multi dimensional (axial) slicing in Python

(Ilan Schnell, April 2008)

Recently, I came across numpy, which supports working with multidimensional arrays in Python. What exactly is a multidimensional array?

Consider a vector in three dimensional space represented as a list, e.g. v=[8, 5, 11]. This is a one dimensional array, since there is only one index: every element can be accessed as v[i], where i is the index. Even a vector in 11 dimensional space is a one dimensional array.

Obviously, this terminology is confusing, since the term dimension can refer either to the space or the array. Therefore, one uses the term axis when referring to dimensions of an array. The number of axes is called rank. For the example above, we can say that the vector in three dimensional space is represented by an array with one axis, an array with rank one.

Having a precise terminology, we can move on to arrays with rank two. For example, a 3x3 matrix which describes a rotation of a vector in three dimensional space can be represented as an array of rank two. In mathematical notation, we would refer to the matrix elements as Aij. In plain Python, we can represent such a matrix as a list of lists:

>>> A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> i, j = 1, 2
>>> A[i][j]
6

This works as long as we restrict ourselves to individual elements in the matrix, but suppose we wish to extract the 2x2 sub matrix consisting of the elements 5, 6, 8, 9 using slicing. We wish we could simply type:

>>> A[1:3][1:3]   # We want to get [[5, 6], [8, 9]]
[[7, 8, 9]]
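
The result above follows because A[1:3] is itself a list of rows, so the second [1:3] selects rows again instead of columns. As a minimal sketch in plain Python (the names chained and sub are only illustrative), the column slice has to be applied to each selected row separately:

```python
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# A[1:3] is a list of two rows; slicing it again picks rows, not columns.
chained = A[1:3][1:3]

# Slice the rows first, then slice every selected row.
sub = [row[1:3] for row in A[1:3]]
```

This gives the desired sub matrix, but the row slice and the column slice are written in two different places, which does not scale well to more axes.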

Chaining single axis slices cannot produce the desired result, because the second slice is applied to the list of rows rather than to each row. Therefore Python supports a syntax for multi axis slicing, that is, we can write expressions like A[1:3, 1:3] without getting a syntax error. However, Python does not come with multi axis arrays; it only supports the syntax (with a plain list, the expression above raises a TypeError). To understand better how Python supports multi axis slicing syntax, the following tiny class exposes the arguments passed to the __getitem__ method (Python 2 session):

>>> class Foo:
...   def __getitem__(self, *args):
...     print args
>>> x = Foo()
>>> x[1]
(1,)
>>> x[1:]
(slice(1, 2147483647, None),)
>>> x[1:, :]
((slice(1, None, None), slice(None, None, None)),)
>>> x[1:, 20:10:-2, ...]
((slice(1, None, None), slice(20, 10, -2), Ellipsis),)

[Note the different result for the first axis in x[1:] and x[1:, :]. The slice object slice(1, 2147483647, None) only occurs when the method is called using a single axis. My assumption is that this was the behavior when slices were first implemented in Python, and that, for backwards compatibility reasons, it was not possible to change this. I tried the same thing in Python 3.0 and the behavior is consistent for single and multiple axes slices, i.e. the slice object is always slice(1, None, None), as it should be.]
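
For readers on Python 3, the experiment above can be repeated roughly as follows. This is a sketch, not the original session: __getitem__ takes a single key parameter here (instead of *args) and returns it rather than printing it, so the values are not wrapped in an extra one-element tuple.

```python
class Foo:
    def __getitem__(self, key):
        # Hand the subscript back unchanged so it can be inspected.
        return key

x = Foo()
single = x[1:]                        # one axis: a bare slice object
multi = x[1:, :]                      # two axes: a tuple of slice objects
with_ellipsis = x[1:, 20:10:-2, ...]  # slices and Ellipsis in one tuple
```

A single axis subscript arrives as the bare object, while a subscript containing commas arrives as one tuple.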

Now it is the responsibility of the multi axis array library to do something useful with these arguments. numpy uses these arguments to return the desired result:

>>> import numpy as np
>>> B = np.array(A)
>>> B
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> B[1:3, 1:3]
array([[5, 6],
       [8, 9]])
>>> B[1:3, 1:3].tolist()
[[5, 6], [8, 9]]
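
To illustrate what "doing something useful with these arguments" can mean, here is a toy two axis container built on a list of lists. The class Grid is hypothetical and only meant as an illustration; it is not how numpy is implemented, it handles exactly two axes, and it ignores Ellipsis.

```python
class Grid:
    """Hypothetical two axis wrapper around a list of lists."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __getitem__(self, key):
        # g[1:3, 1:3] arrives as a tuple; g[1] arrives as a bare object.
        if not isinstance(key, tuple):
            key = (key, slice(None))
        row_key, col_key = key  # raises ValueError for more than two axes
        if isinstance(row_key, int):
            # A single row: apply the column key to that row only.
            return self.rows[row_key][col_key]
        # A slice of rows: apply the column key to every selected row.
        return [row[col_key] for row in self.rows[row_key]]

g = Grid([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
sub = g[1:3, 1:3]
element = g[1, 2]
first_row = g[0]
```

The point of the sketch is only that the library receives both slices at once and decides itself how to apply them to each axis.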

Since Python does not come with multi axis arrays, slice and Ellipsis objects are almost never used explicitly in Python programs. Nevertheless, in the following example, a list and a slice object are passed to a generator which iterates over the sliced list:

>>> def mygen(lst, s=slice(None)):
...   for e in lst[s]:
...     yield e
>>> for n in mygen(range(10)): print n,
0 1 2 3 4 5 6 7 8 9
>>> for n in mygen(range(10), slice(7)): print n,
0 1 2 3 4 5 6
>>> for n in mygen(range(10), slice(8, 1, -2)): print n,
8 6 4 2
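
The session above uses Python 2 print statements. A rough Python 3 equivalent, collecting the values into lists instead of printing them (the result names are only illustrative), might look like this:

```python
def mygen(seq, s=slice(None)):
    # Iterate over the part of seq selected by the slice object s.
    for e in seq[s]:
        yield e

all_items = list(mygen(range(10)))
first_seven = list(mygen(range(10), slice(7)))
stepped = list(mygen(range(10), slice(8, 1, -2)))
```

In Python 3, range(10) is no longer a list, but it accepts slice objects in the same way, so the generator itself is unchanged.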

Using numpy, an explicit slice object can be very useful. In the following real world example (using numpy), I had to build sub matrices. Instead of writing

A = S[j2:2*j+1+j2, j2:2*j+1+j2]

where j2 and j are integers, one can write

s = slice(j2, 2*j+1+j2)
A = S[s, s]