The NumPy array: a structure for efficient
numerical computation

Stéfan van der Walt, Stellenbosch University, South Africa
S. Chris Colbert, Enthought, USA
Gaël Varoquaux, INRIA Saclay, France
This article is published in IEEE Computing in Science and Engineering,
2011, vol. 13, no. 2, pp. 22-30, doi:10.1109/MCSE.2011.37 (HAL: inria-00564007).
Please refer to the published version if accessible, as it contains the
editor's improvements. (c) 2011 IEEE.
In the Python world, NumPy arrays are the stan-
dard representation for numerical data. Here, we
show how these arrays enable efficient implemen-
tation of numerical computations in a high-level
language. Overall, three techniques are applied
to improve performance: vectorizing calculations,
avoiding copying data in memory, and minimizing
operation counts.
We first present the NumPy array structure, then
show how to use it for efficient computation, and
finally how to share array data with other
libraries.
Introduction
The Python programming language provides a
rich set of high-level data structures: lists for enu-
merating a collection of objects, dictionaries to
build hash tables, etc. However, these structures
are not ideally suited to high-performance numer-
ical computation.
In the mid-90s, an international team of volun-
teers started to develop a data-structure for effi-
cient array computation. This structure evolved
into what is now known as the N-dimensional
NumPy array.
The NumPy package, which comprises the
NumPy array as well as a set of accompanying
mathematical functions, has found widespread
adoption in academia, national laboratories, and
industry, with applications ranging from gaming
to space exploration.
A NumPy array is a multidimensional, uniform
collection of elements. An array is character-
ized by the type of elements it contains and by
its shape. For example, a matrix may be repre-
sented as an array of shape (M × N) that contains
numbers, e.g., floating point or complex numbers.
Unlike matrices, NumPy arrays can have any
dimensionality. Furthermore, they may contain
other kinds of elements (or even combinations of
elements), such as booleans or dates.
Underneath the hood, a NumPy array is really
just a convenient way of describing one or more
blocks of computer memory, so that the numbers
represented may be easily manipulated.
Basic usage
Throughout the code examples in the article,
we assume that NumPy is imported as follows:
import numpy as np
Code snippets are shown as they appear inside an
[IPython] prompt, such as this:

In [3]: np.__version__
Out[3]: '1.4.1'
Elements contained in an array can be indexed
using the [] operator. In addition, parts of an array
may be retrieved using standard Python slicing
of the form start:stop:step. For instance, the
first two rows of an array x are given by x[:2, :]
or columns 1 through 3 by x[:, 1:4]. Similarly,
every second row is given by x[::2, :]. Note
that Python uses zero-based indexing.
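For instance, given a small 3-by-4 example array:

In [4]: x = np.arange(12).reshape((3, 4))
In [5]: x
Out[5]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [6]: x[:2, :]        # the first two rows
Out[6]:
array([[0, 1, 2, 3],
       [4, 5, 6, 7]])
In [7]: x[::2, :]       # every second row
Out[7]:
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])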
The structure of a NumPy array: a view on
memory
A NumPy array (also called an "ndarray", short
for N-dimensional array) describes memory, using
the following attributes:
Data pointer: the memory address of the first
byte in the array.

Data type description: the kind of elements
contained in the array, for example floating point
numbers or integers.

Shape: the shape of the array, for example
(10, 10) for a ten-by-ten array, or (5, 5, 5) for a
block of data describing a mesh grid of x-, y-
and z-coordinates.
Strides: the number of bytes to skip in memory to
proceed to the next element. For a (10, 10)
array of bytes, for example, the strides may be
(10, 1), in other words: proceed one byte to
get to the next column and ten bytes to locate
the next row.

Flags: which define whether we are allowed to
modify the array, whether the memory layout is
C- or Fortran-contiguous (in C, memory is laid
out in "row major" order, i.e., rows are stored
one after another in memory; in Fortran, columns
are stored successively), and so forth.
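Each of these attributes can be inspected from
Python. As an illustrative sketch (the values shown
assume a 64-bit platform with 8-byte floats; the
data pointer will differ from run to run):

x = np.ones((10, 10))
x.dtype        # data-type description: dtype('float64')
x.shape        # (10, 10)
x.strides      # (80, 8): 80 bytes to the next row, 8 to the next column
x.flags        # writeability, C- or Fortran-contiguity, and so forth
x.ctypes.data  # the data pointer: a plain memory address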
NumPy’s strided memory model deserves
particular attention, as it provides a pow-
erful way of viewing the same memory in
different ways without copying data. For
example, consider the following integer array:
# Generate the integers from zero to eight and
# repack them into a 3x3 array
In [1]: x = np.arange(9).reshape((3, 3))
In [2]: x
Out[2]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [3]: x.strides
Out[3]: (24, 8)
On our 64-bit system, the default integer
data-type occupies 64 bits, or 8 bytes, in memory.
The strides therefore describe skipping
3 integers in memory to get to the next row
and one to get to the next column. We can
now generate a view on the same memory
where we only examine every second element.
In [4]: y = x[::2, ::2]
In [5]: y
Out[5]:
array([[0, 2],
[6, 8]])
In [6]: y.strides
Out[6]: (48, 16)
The arrays x and y point to the same memory
(i.e., if we modify the values in y we also modify
those in x), but the strides for y have been
changed so that only every second element is
seen along either axis. y is said to be a view on x:
In [7]: y[0, 0] = 100
In [8]: x
Out[8]:
array([[100, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]])
Views need not be created using slicing only.
By modifying strides, for example, an array
can be transposed or reshaped at zero cost (no
memory needs to be copied). Moreover, the
strides, shape and dtype attributes of an array
may be specified manually by the user (provided
they are all compatible), enabling a plethora of
ways in which to interpret the underlying data.
# Transpose the array, using the shorthand "T"
In [9]: xT = x.T
In [10]: xT
Out[10]:
array([[100, 3, 6],
[ 1, 4, 7],
[ 2, 5, 8]])
In [11]: xT.strides
Out[11]: (8, 24)
# Change the shape of the array
In [12]: z = x.reshape((1, 9))
In [13]: z
Out[13]: array([[100, 1, 2, 3, 4, 5, 6, 7, 8]])
In [14]: z.strides
Out[14]: (72, 8)
# i.e., for the two-dimensional z, 9 * 8 bytes
# to skip over a row of 9 eight-byte integers,
# 8 bytes to skip a single element
# View data as bytes in memory rather than
# 64-bit integers
In [15]: z.dtype = np.dtype('uint8')
In [16]: z
Out[16]:
array([[100, 0, 0, ...,
        0, 0, 0, 0, 0, 0]], dtype=uint8)
In [18]: z.shape
Out[18]: (1, 72)
In [20]: z.strides
Out[20]: (72, 1)
In each of these cases, the resulting array points to
the same memory. The difference lies in the way
the data is interpreted, based on shape, strides
and data-type. Since no data is copied in memory,
these operations are extremely efficient.
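Taking this further, NumPy ships a low-level helper,
np.lib.stride_tricks.as_strided, that constructs a
view with caller-supplied shape and strides. The
sliding-window sketch below illustrates the idea; note
that as_strided performs no bounds checking, so
incorrect values can expose invalid memory:

from numpy.lib.stride_tricks import as_strided

a = np.arange(6)
# Four overlapping windows of length 3; each row
# starts one element after the previous one,
# without copying any data
w = as_strided(a, shape=(4, 3),
               strides=(a.itemsize, a.itemsize))
# array([[0, 1, 2],
#        [1, 2, 3],
#        [2, 3, 4],
#        [3, 4, 5]])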
Numerical operations on arrays: vectorization
In any scripting language, injudicious use of
for-loops may lead to poor performance, particularly
in the case where a simple computation is applied
to each element of a large data-set.
Grouping these element-wise operations together,
a process known as vectorization, allows NumPy
to perform such computations much more rapidly.
Suppose we have a vector a and wish to
multiply its magnitude by 3. A traditional
for-loop approach would look as follows:
In [21]: a = [1, 3, 5]
In [22]: b = [3*x for x in a]
In [23]: b
Out[23]: [3, 9, 15]
The vectorized approach applies this operation to
all elements of an array:
In [24]: a = np.array([1, 3, 5])
In [25]: b = 3 * a
In [26]: b
Out[26]: array([ 3, 9, 15])
Vectorized operations in NumPy are implemented
in C, resulting in a significant speed improvement.
Operations are not restricted to interactions be-
tween scalars and arrays. For example, here
NumPy performs a fast element-wise subtraction
of two arr ays:
In [27]: b - a
Out[27]: array([ 2, 6, 10])
When the shapes of the two arguments are
not the same, but share a common shape dimension,
the operation is broadcast across the
array. In other words, NumPy expands the
arrays such that the operation becomes viable:
In [28]: m = np.arange(6).reshape((2,3))
In [29]: m
Out[29]:
array([[0, 1, 2],
[3, 4, 5]])
In [30]: b + m
Out[30]:
array([[ 3, 10, 17],
[ 6, 13, 20]])
Refer to “Broadcasting Rules” to see when these
operations are viable.
To save memory, the broadcast arrays are never
physically constructed; NumPy simply uses the
appropriate array elements during computation
(under the hood, this is achieved by using strides
of zero).
Broadcasting Rules
Before broadcasting two arrays, NumPy verifies
that all dimensions are suitably matched. Dimen-
sions match when they are equal, or when either
is 1 or None. In the latter case, the dimension of
the output array is expanded to the larger of the
two.
For example, consider arrays x and y with shapes
(2, 4, 3) and (4, 1) respectively. These arrays
are to be combined in a broadcasting operation
such as z = x + y. We match their dimensions
as follows:

x    (2, 4, 3)
y       (4, 1)
z    (2, 4, 3)
Therefore, the dimensions of these arrays are
compatible, and yield an output of shape (2, 4, 3).
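This shape arithmetic is easy to verify directly; a
minimal sketch:

x = np.zeros((2, 4, 3))
y = np.zeros((4, 1))
z = x + y
z.shape   # (2, 4, 3)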
Vectorization and broadcasting examples
Evaluating Functions
Suppose we wish to evaluate a function f over
a large set of numbers, x, stored as an array.
Using a for-loop, the result is produced as follows:
In [31]: def f(x):
...: return x**2 - 3*x + 4
...:
In [32]: x = np.arange(1e5)
In [33]: y = [f(i) for i in x]
On our machine, this loop executes in approximately
500 milliseconds. Applying the function f
on the NumPy array x engages the fast, vectorized
loop, which operates on each element individually:
In [34]: y = f(x)
In [35]: y
Out[35]:
array([ 4.0000e+0, 2.0000e+0, 2.0000e+0, ...,
9.9991e+9, 9.9993e+9, 9.9995e+9])
The vectorized computation executes in 1 millisecond.
As the length of the input array x grows,
however, execution speed decreases due to the
construction of large temporary arrays. For
example, the operation above roughly translates to
a = x**2
b = 3*x
c = a - b
fx = c + 4
Most array-based systems do not provide a
way to circumvent the creation of these tempo-
raries. With NumPy, the user may choose to
perform operations “in-place”—in other words,
in such a way that no new memory is allocated
and all results are stored in the current array.
def g(x):
# Allocate the output array with x-squared
fx = x**2
# In-place operations: no new memory allocated
fx -= 3*x
fx += 4
return fx
Applying g to x takes 600 microseconds, almost
twice as fast as the naive vectorization. Note that
we did not compute 3*x in-place, as it would mod-
ify the original data in x.
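In IPython, such timing comparisons are conveniently
made with the %timeit magic; a sketch (the figures
are those quoted above, from our machine):

%timeit [f(i) for i in x]  # pure-Python loop: ~500 ms
%timeit f(x)               # naive vectorization: ~1 ms
%timeit g(x)               # in-place version: ~600 us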
This example illustrates the ease with which
NumPy handles vectorized array operations,
without relinquishing control over performance-
critical aspects such as memory allocation.
Note that performance may be boosted even further
by using tools such as [Cython], [Theano]
or [numexpr], which lessen the load on the memory
bus. Cython, a Python to C compiler discussed
later in this issue, is especially useful in
cases where code cannot be vectorized easily.
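For instance, assuming the numexpr package is
installed, the function from the previous example can
be evaluated in a single pass over memory, avoiding
the large temporaries altogether (a sketch):

import numexpr as ne

x = np.arange(1e5)
# numexpr compiles the expression and streams over x
y = ne.evaluate("x**2 - 3*x + 4")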
Finite Differencing
The derivative on a discrete sequence is often
computed using finite differencing. Slicing makes
this operation trivial.
Suppose we have an n + 1 length vector and per-
form a forward divided difference.
In [36]: x = np.arange(0, 10, 2)
In [37]: x
Out[37]: array([ 0, 2, 4, 6, 8])
In [38]: y = x**2
In [39]: y
Out[39]: array([ 0, 4, 16, 36, 64])
In [40]: dy_dx = (y[1:]-y[:-1])/(x[1:]-x[:-1])
In [41]: dy_dx
Out[41]: array([ 2, 6, 10, 14])
In this example, y[1:] takes a slice of the y array
starting at index 1 and continuing to the end
of the array. y[:-1] takes a slice which starts at
index 0 and continues to one index short of the
end of the array. Thus y[1:] - y[:-1] has the
effect of subtracting, from each element in the
array, the element directly preceding it. Performing
the same differencing on the x array and dividing
the two resulting arrays yields the forward divided
difference.
If we assume that the vectors are length n + 2,
then calculating the central divided difference
is simply a matter of modifying the slices:

In [42]: dy_dx_c = (y[2:]-y[:-2])/(x[2:]-x[:-2])
In [43]: dy_dx_c
Out[43]: array([ 4, 8, 12])
In pure Python, these operations would be written
using a for-loop, as sketched below. For x containing
1000 elements, the NumPy implementation is 100 times faster.
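For reference, a pure-Python forward difference might
be sketched as:

dy_dx = []
for i in range(len(x) - 1):
    # difference quotient between neighbouring samples
    dy_dx.append((y[i+1] - y[i]) / (x[i+1] - x[i]))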
Creating a grid using broadcasting
Figure 1: Computing dense grid values without and
with broadcasting. Note how, with broadcasting,
much less memory is used.

Suppose we want to produce a three-dimensional
grid of distances R_ijk = sqrt(i^2 + j^2 + k^2)
with i = -100...99, j = -100...99, and
k = -100...99. In most vectorized programming
languages, this would require forming three
intermediate 200 × 200 × 200 arrays, i, j, and k,
as in:
In [44]: i, j, k = np.mgrid[-100:100, -100:100,
...: -100:100]
In [45]: print i.shape, j.shape, k.shape
(200, 200, 200) (200, 200, 200) (200, 200, 200)
In [46]: R = np.sqrt(i**2 + j**2 + k**2)
In [47]: R.shape
Out[47]: (200, 200, 200)
Note the use of the special mgrid object, which
produces a meshgrid when sliced.
In this case we have allocated 4 named arrays,
i, j, k, R, and an additional 5 temporary arrays
over the course of the operation. Each of these
arrays contains roughly 64MB of data, resulting in
a total memory allocation of ~576MB. In total, 48
million operations are performed: 200^3 to square
each array and 200^3 per addition.
In a non-vectorized language, no temporary
arrays need to be allocated when the output
values are calculated in a nested for-loop, e.g. (in C):

#include <math.h>  /* for sqrt */

double R[200][200][200];
int i, j, k;
for (i = -100; i < 100; i++)
  for (j = -100; j < 100; j++)
    for (k = -100; k < 100; k++)
      /* shift the indices to be non-negative */
      R[i + 100][j + 100][k + 100] = sqrt(i*i + j*j + k*k);
We can achieve a similar effect using NumPy's
broadcasting facilities. Instead of constructing
large temporary arrays, we instruct NumPy to
combine three one-dimensional vectors (a row-vector,
a column-vector and a depth-vector) to
form the three-dimensional result. Broadcasting
does not require large intermediate arrays.
First, construct the three coordinate vectors (i
for the x-axis, j for the y-axis and k for the z-axis):
# Construct the row vector: from -100 to +100
i = np.arange(-100, 100).reshape(200, 1, 1)
# Construct the column vector
j = np.reshape(i, (1, 200, 1))
# Construct the depth vector
k = np.reshape(i, (1, 1, 200))
NumPy also provides a short-hand for the above
construction, namely
i, j, k = np.ogrid[-100:100, -100:100, -100:100]
Note how the arrays contain the same number
of elements, but that they have different orientations.
We now let NumPy broadcast i, j and k
to form the three-dimensional result, as shown in
Fig. 1:
In [48]: R = np.sqrt(i**2 + j**2 + k**2)
In [49]: R.shape
Out[49]: (200, 200, 200)
Here, the total memory allocation is only 128MB:
4 named arrays totalling ~64MB (1.6KB * 3 +
64MB) and 5 temporary arrays of ~64MB (1.6KB
* 3 + 320KB + 64MB). A total of approximately
16 million operations are performed: 200
to square each array, 200^2 for the first addition,
and 200^3 each for the second addition as well as
for the square root.
When using naive vectorization, calculating R
takes 410 ms. Broadcasting reduces this time to
182 ms, a factor-of-two speed-up along with
a significant reduction in memory use.
Computer Vision
Consider an n × 3 array of three-dimensional
points and a 3 × 3 camera matrix:
points = np.random.random((100000, 3))
camera = np.array([[500., 0., 320.],
[ 0., 500., 240.],
[ 0., 0., 1.]])
Often, we want to transform the 3D coordinates
into their 2D pixel locations on the image, as
viewed by the camera. This operation involves
taking the matrix dot product of each point with
the camera matrix, and then dividing the result-
ing vector by its third component. With NumPy,
it is written as:
# Perform the matrix product on the coordinates
vecs = camera.dot(points.T).T
# Divide resulting coordinates by their z-value
pixel_coords = vecs/vecs[:, 2, np.newaxis]
The dot function, which leverages accelerated
BLAS implementations when available, implements
the matrix product, in contrast to the
element-wise product *. It can be applied to
one- or two-dimensional arrays.
This code executes in 9 milliseconds, a 70-fold
speed-up over a Python for-loop version (sketched below).
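For comparison, such a for-loop version might be
sketched as follows, performing one small dot product
per point:

pixel_coords = []
for p in points:
    v = camera.dot(p)              # project a single point
    pixel_coords.append(v / v[2])  # normalize by the z-value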
Aside from the optimized NumPy dot product,
we make use of NumPy's array operations
with element-by-element division and the
broadcasting machinery. The code vecs /
vecs[:, 2, np.newaxis] divides each column
of vecs by its third column (in other
words, each row is divided by its third
element). The np.newaxis index is used to change
vecs[:, 2] into a column-vector so that
broadcasting may take place.
The above examples show how vectorization
provides a powerful and efficient means of operating
on large arrays, without compromising clear and
concise code or relinquishing control over aspects
such as memory allocation.
It should be noted that vectorization and broadcasting
are no panacea; for example, when repeated
operations take place on very large chunks of
memory, it may be better to use an outer for-loop
combined with a vectorized inner loop to make
optimal use of the system cache.
Sharing data
As shown above, performance is often improved
by preventing repeated copying of data in mem-
ory. In this section, we show how NumPy may
make use of foreign memory (in other words,
memory that is not allocated or controlled by NumPy)
without copying data.
Efficient I/O with memory mapping
An array stored on disk may be addressed
directly without copying it to memory in its
entirety. This technique, known as memory
mapping, is useful for addressing only a small
portion of a very large array. NumPy supports
memory mapped arrays with the same interface
as any other NumPy array. First, let us
construct such an array and fill it with some data:
In [50]: a = np.memmap('/tmp/myarray.memmap',
   ...:     mode='write', shape=(300, 300),
   ...:     dtype=np.int)
# Pretend "a" is a one-dimensional, 300*300
# array and assign values into it
In [51]: a.flat = np.arange(300* 300)
In [52]: a
Out[52]:
memmap([[ 0, 1, ..., 298, 299],
[ 300, 301, ..., 598, 599],
[ 600, 601, ..., 898, 899],
...,
[89100, 89101, ..., 89398, 89399],
[89400, 89401, ..., 89698, 89699],
[89700, 89701, ..., 89998, 89999]])
When the "flush" method is called, its data is
written to disk:
In [53]: a.flush()
The array can now be loaded and parts of it
manipulated; calling "flush" writes the altered data
back to disk:
# Load the memory mapped array
In [54]: b = np.memmap('/tmp/myarray.memmap',
   ...:     mode='r+', shape=(300, 300),
   ...:     dtype=np.int)
# Perform some operation on the elements of b
In [55]: b[100, :] *= 2
# Store the modifications to disk
In [56]: b.flush()
The array interface for foreign blocks of memory
Often, NumPy arrays have to be created from
memory constructed and populated by foreign
co de, e.g., a result produced by an external C++
or Fortran library.
To facilitate such exchanges without copying the
already allocated memory, NumPy defines an ar-
ray interface that specifies how a given object
exposes a block of memory. NumPy knows how to
view any object with a valid
array interface
dictionary attribute as an array. Its most impor-
tant values are data (address of the data in mem-
ory), shape and typestr (the kind of elements
stored).
The following example defines a MutableString
class that allocates a string s. The MutableString
represents a foreign block of memory,
now made available to NumPy by defining
the __array_interface__ dictionary.

Note the use of the ctypes library, which allows
Python to execute code directly from dynamic C
libraries. In this instance, we use its utility
function, create_string_buffer, to allocate a string
in memory, and addressof to establish its position
in memory.
import ctypes

class MutableString(object):
    def __init__(self, s):
        # Allocate string memory
        self._s = ctypes.create_string_buffer(s)

        self.__array_interface__ = {
            # Shape of the array
            'shape': (len(s),),

            # Address of the data,
            # the memory is not read-only
            'data': (ctypes.addressof(self._s), False),

            # Stores 1-byte unsigned integers.
            # "|" indicates that endianness is
            # irrelevant for this data-type.
            'typestr': '|u1',
            }

    def __str__(self):
        "Convert to a string for printing."
        return str(buffer(self._s))
The above class is instantiated, after which
NumPy is asked to interpret it as an array, which
is possible because of its __array_interface__
attribute.
# Create an instance of our mutable string class
In [57]: m = MutableString('abcde')
# View the character byte values as an array
In [58]: am = np.asarray(m)
In [59]: am
Out[59]: array([ 97, 98, 99, 100, 101],
dtype=uint8)
# Modify the values of the array
In [60]: am += 2
In [61]: am
Out[61]: array([ 99, 100, 101, 102, 103],
dtype=uint8)
# Since the underlying memory was updated,
# the string now has a different value.
In [62]: print m
cdefg
This example illustrates how NumPy is able
to interpret any block of memory, as long as
the necessary information is provided via an
__array_interface__ dictionary.
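As an aside, for objects that already expose Python's
buffer protocol, such as the ctypes string buffer used
above, a similar zero-copy view can be obtained with
np.frombuffer; a minimal sketch:

s = ctypes.create_string_buffer('abcde')
a = np.frombuffer(s, dtype=np.uint8)
# array([97, 98, 99, 100, 101, 0], dtype=uint8)
# (the trailing 0 is the buffer's NUL terminator)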
Structured data-types to expose complex data
NumPy arrays are homogeneous, in other words,
each element of the array has the same data-type.
Traditionally, we think of the fundamental data-types:
integers, floats, etc. However, NumPy arrays
may also store compound elements, such as
the combination (1, 0.5), an integer and a float.
Arrays that store such compound elements are
known as structured arrays.
Imagine an experiment in which measurements of
the following fields are recorded:

- Timestamp in nanoseconds (a 64-bit unsigned integer)
- Position (x- and y-coordinates, stored as floating point numbers)
We can describe these fields using a single data-type:
In [63]: dt = np.dtype([(’time’, np.uint64),
...: (’pos’, [(’x’, float),
...: (’y’, float)])])
An array of measurements can then be con-
structed using this data-type as
In [64]: x = np.array([(1, (0, 0.5)),
...: (2, (0, 10.3)),
...: (3, (5.5, 1.1))],
...: dtype=dt)
The individual fields of such a structured array
can be queried:
# Display all time-stamps
In [65]: x[’time’]
Out[65]: array([1, 2, 3], dtype=uint64)
# Display x-coordinates for all timestamps >= 2
In [66]: times = (x[’time’] >= 2)
In [67]: print times
[False True True]
In [68]: x[times][’pos’][’x’]
Out[68]: array([ 0. , 5.5])
Structured arrays are useful for reading complex
binary files. Suppose we have a file "foo.dat" that
contains binary data structured according to the
data-type dt introduced above: for each record,
the first 8 bytes are a 64-bit unsigned integer time
stamp and the next 16 bytes a position comprised
of a 64-bit floating point x position and a floating
point y position.
Loading such data manually is cumbersome: for
each field in the record, bytes are read from file
and converted to the appropriate data-type, taking
into consideration factors such as endianness.
In contrast, using a structured data-type simplifies
this operation to a single line of code:
In [69]: data = np.fromfile("foo.dat", dtype=dt)
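The complementary write is a one-liner as well; a
sketch of the full round trip using the array x from
above:

# Write the records as raw bytes, then read them back
In [70]: x.tofile("foo.dat")
In [71]: data = np.fromfile("foo.dat", dtype=dt)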
Conclusion
We show that the N-dimensional array introduced
by NumPy is a high-level data structure that
facilitates vectorization of for-loops. Its sophisticated
memory description allows a wide variety of
operations to be performed without copying any data
in memory, bringing significant performance gains
as data-sets grow large. Arrays of multiple dimensions
may be combined using broadcasting to reduce
the number of operations performed during
numerical computation.
When NumPy is skillfully applied, most computation
time is spent on vectorized array operations,
instead of in Python for-loops (which are
often a bottleneck). Further speed improvements
are achieved by optimizing compilers, such as
Cython, which better exploit cache effects.

NumPy and similar projects foster an environment
in which numerical problems may be described
using high-level code, thereby opening the
door to scientific code that is both transparent
and easy to maintain.
References

[IPython] Fernando Perez, Brian E. Granger.
IPython: A System for Interactive Scientific
Computing, Computing in Science and Engineering,
vol. 9, no. 3, pp. 21-29, May/June 2007,
doi:10.1109/MCSE.2007.53.

[Cython] S. Behnel, R. Bradshaw, D. S. Seljebotn,
G. Ewing et al. C-Extensions for Python.
http://www.cython.org.

[Theano] J. Bergstra. Optimized Symbolic Expressions
and GPU Metaprogramming with Theano,
Proceedings of the 9th Python in Science Conference
(SciPy2010), Austin, Texas, June 2010.

[numexpr] D. Cooke, F. Alted, T. Hochberg,
G. Thalhammer. numexpr.
http://code.google.com/p/numexpr/
NumPy's documentation is maintained using
a Wikipedia-like community forum, available
at http://docs.scipy.org/. Discussions take
place on the project mailing list
(http://www.scipy.org/Mailing_Lists).
NumPy is a volunteer effort.