Hagen Fürstenau | 1 Oct 2011 01:31

multidimensional arrays and buffer unpacking

Hi,

I have a two-dimensional numpy array, which I want to pass to a "cdef 
inline" function, but Cython warns me: "Buffer unpacking not optimized 
away". IIUC, this means that although the function gets inlined, the 
buffer is still (unnecessarily) checked and extracted every time. Is 
there a way to avoid this?

I also thought about using (or extracting from the numpy array) a C 
array, but there doesn't seem to be syntax for multi-dimensional C 
arrays (or did I miss something?), so I'd have to use a one-dimensional 
array and the indexing would get ugly.

Any suggestions?

- Hagen

Robert Bradshaw | 1 Oct 2011 02:02

Re: multidimensional arrays and buffer unpacking

On Fri, Sep 30, 2011 at 4:31 PM, Hagen Fürstenau <hagen <at> zhuliguan.net> wrote:
> Hi,
>
> I have a two-dimensional numpy array, which I want to pass to a "cdef
> inline" function, but Cython warns me: "Buffer unpacking not optimized
> away". IIUC, this means that although the function gets inlined, the buffer
> is still (unnecessarily) checked and extracted every time. Is there a way to
> avoid this?
>
> I also thought about using (or extracting from the numpy array) a C array,
> but there doesn't seem to be syntax for multi-dimensional C arrays (or did I
> miss something?), so I'd have to use a one-dimensional array and the
> indexing would get ugly.
>
> Any suggestions?

This is one of the motivations for
https://github.com/cython/cython/pull/61 (not merged yet). Until then,
either inline it manually or use 1-d arrays with manual indexing or
extract it into a multidimensional array yourself (e.g. a double***).

- Robert
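
The "1-d arrays with manual indexing" option mentioned above could look roughly like the following. This is only a minimal sketch in plain Python, with a list standing in for a contiguous C buffer; the helper names `make_flat` and `flat_index` are hypothetical and not part of Cython or NumPy.

```python
# Hypothetical helpers: a plain list stands in for a contiguous C buffer.

def make_flat(nrows, ncols, fill=0.0):
    """Allocate one contiguous 1-d buffer for an nrows x ncols array."""
    return [fill] * (nrows * ncols)


def flat_index(i, j, ncols):
    """Row-major (C-order) offset of element (i, j)."""
    return i * ncols + j


nrows, ncols = 3, 4
buf = make_flat(nrows, ncols)

# Fill the buffer so that element (i, j) holds 10*i + j.
for i in range(nrows):
    for j in range(ncols):
        buf[flat_index(i, j, ncols)] = 10.0 * i + j

print(buf[flat_index(2, 1, ncols)])  # element (2, 1)
```

Keeping the offset arithmetic in one small function is one way to stop the indexing from getting ugly at every call site.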

anirudh | 1 Oct 2011 09:49

Re: Importing an array

Hi!

Boy! What a long delay. I am going to post the entire piece of code. Not too long. You could probably browse over it.

"""waveFn class to solve the Schrödinger equation
Example:
>>> psi = waveFn()
>>> psi.setup(x)
>>> psi.solve()
"""

import numpy as np

class waveFn():
    def __init__(self):
        # Values shown as None were left blank in the original post.
        self.timeStep = None  # the smaller the better
        self.spaceStep = None
        self.timeSize = None
        self.spaceSize = None

        self.currentState = None
        self.nextState = None
        self.solution = None

        self.bndCndns = None

        self.oneRunNo = None
        # Diagonal rows in tridiagonal matrix equation. Refer Wiki for details.
        self.a = None
        self.b = None
        self.c = None
        self.v = None

    def potential(self, x):
        # potential constant in time so generate this just once
        return np.cos(x)

    def setup(self, x):
        self.currentState = np.exp(x)

    def oneRun(self, a, b, c, v):
        # Crank-Nicolson algorithm. Refer Wiki for details.
        spaceSize = self.spaceSize
        nextState = self.nextState
        for t in range(self.oneRunNo):
            # Forward elimination; start at 1 so that i-1 stays in range.
            for i in range(1, spaceSize):
                m = a[i]/b[i-1]
                b[i] = b[i] - m*c[i-1]
                v[i] = v[i] - m*v[i-1]
            # Back substitution; the last valid index is spaceSize-1.
            nextState[spaceSize-1] = v[spaceSize-1]/b[spaceSize-1]
            for i in range(spaceSize-2, -1, -1):
                nextState[i] = (v[i]-c[i]*nextState[i+1])/b[i]

            # Store on self; a local assignment would be lost on return.
            self.currentState = nextState

    def solve(self):
        timeSize = self.timeSize
        # The range argument was blank in the original post; timeSize is assumed.
        for t in range(timeSize):
            self.oneRun(self.a, self.b, self.c, self.v)
            np.save(self.solution, self.nextState)

anirudh | 1 Oct 2011 09:51

Re: Importing an array

This is the Python-only code. I now need to optimize it. Some have told me to give up on Python altogether and move to C, or at least to write the CPU-intensive Crank-Nicolson algorithm in C. Should I stick with Python and optimize with Cython, or will C really give me that much of a speedup?
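
For reference, the CPU-intensive part of the code above is a tridiagonal solve (the Thomas algorithm), which can be isolated into one small function. The following is a minimal pure-Python sketch under that assumption; the name `solve_tridiagonal` and the test system are made up for illustration and are not taken from the posted class. A self-contained kernel like this is the sort of function one would typically try typing in Cython first.

```python
def solve_tridiagonal(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm.

    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[-1] is unused), d: right-hand side.
    The inputs are copied, so the caller's lists are not modified.
    No pivoting is done, so b should be diagonally dominant.
    """
    n = len(d)
    b = list(b)
    d = list(d)
    # Forward elimination.
    for i in range(1, n):
        m = a[i] / b[i - 1]
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    # Back substitution.
    x = [0.0] * n
    x[n - 1] = d[n - 1] / b[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x


# Made-up test system: build the right-hand side from a known solution.
a = [0.0, -1.0, -1.0, -1.0]
b = [4.0, 4.0, 4.0, 4.0]
c = [-1.0, -1.0, -1.0, 0.0]
known = [1.0, 2.0, 3.0, 4.0]

n = len(known)
d = []
for i in range(n):
    row = b[i] * known[i]
    if i > 0:
        row += a[i] * known[i - 1]
    if i < n - 1:
        row += c[i] * known[i + 1]
    d.append(row)

x = solve_tridiagonal(a, b, c, d)
print(x)
```

Copying `b` and `d` inside the function means the same diagonals can be reused for every time step, which the in-place updates in the posted loop would not allow.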


anirudh | 1 Oct 2011 10:44

Re: Importing an array

What I'm trying to do towards the end is save the numpy array currentState after every 300 runs and clear the memory so that I don't run into memory problems. This happened with the earlier pure-Python program, so I'm trying to avoid it. I need an np.save(file, currentState) and then some way, something like malloc/realloc, to free up that space for the new currentState.
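
One possible way to keep memory flat here, sketched under assumptions rather than taken from the posted code, is to allocate two state buffers once, swap them after every step, and write a snapshot every N steps. Python releases an object's memory once nothing refers to it, so no malloc/realloc call is involved. The names `run`, `add_one` and the `save` callback are hypothetical; in real code `save` might wrap `np.save` with a distinct file name per snapshot.

```python
def run(current, scratch, step, n_steps, save_every, save):
    """Advance `current` n_steps times using two preallocated buffers.

    step(src, dst) writes the next state into dst.
    save(index, state) is called after every `save_every` steps.
    Returns the buffer that holds the final state.
    """
    for t in range(1, n_steps + 1):
        step(current, scratch)
        # Swap roles instead of allocating a new array each step.
        current, scratch = scratch, current
        if t % save_every == 0:
            save(t // save_every, current)
    return current


# Toy step function, an assumption for illustration: add 1 to every entry.
def add_one(src, dst):
    for i, value in enumerate(src):
        dst[i] = value + 1.0


saved = []
buf_a = [0.0, 0.0, 0.0]
buf_b = [0.0, 0.0, 0.0]
final = run(buf_a, buf_b, add_one, n_steps=900, save_every=300,
            save=lambda k, state: saved.append((k, list(state))))
print(len(saved), final)
```

Only the two buffers stay alive during the run; each snapshot is handed to `save` and then no longer needs to be kept in memory by the solver.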

Michael Hogg | 1 Oct 2011 13:37

Re: multidimensional arrays and buffer unpacking

Hi Hagen,


If you use the C++ library, you could use a multidimensional vector. For example:

from libcpp.vector cimport vector
cdef vector[vector[double]] vvd
cdef int i, j

# Resize this to a 3x3 vector
cdef vector[double] vd
vd.resize(3, 0.0)
for i in xrange(3): vvd.push_back(vd)

# Then the indexing is normal. For example, printing the contents:
for i in xrange(3):
    for j in xrange(3):
        print vvd[i][j]

-Michael

Dag Sverre Seljebotn | 1 Oct 2011 13:42

Re: Re: multidimensional arrays and buffer unpacking

On 10/01/2011 01:37 PM, Michael Hogg wrote:
> Hi Hagen,
>
> If you use the C++ library, you could use a multidimensional vector. For

Well, not if you care about performance. std::vector would use a 
separate malloc per inner vector ("row"), which could potentially 
scatter memory accesses all over memory.

"Object-oriented" C++ is in general horrible for performance (for 
computational code) with all the pointer chasing going on. Of course, 
one can code in C++ in "Fortran-style"; it's more about programming 
styles than the language.

Dag Sverre

Guillaume Bouchard | 1 Oct 2011 19:22

Re: Undefined __pyx_vtab

Hello,

I'm sorry to reopen this thread, but because I received no answer, I'm
unable to know if you think there is no solution, if it is a known
bug, or if I just missed the obvious answer and that my code is wrong
in some way.

I simplified the example further. Everything is in the same directory.

I have a trivial setup.py file:

-------------
from distutils.core import setup
from Cython.Build import cythonize

setup(name="Test", ext_modules=cythonize('*.pyx'))
--------------

And three files:

math3d.pxd

-----------
from coordsys cimport CoordSyst

cdef class _CObj:
	cdef void dummy(self)

cdef class Position(_CObj):
	cdef CoordSyst _parent
-----------------

math3d.pyx:

--------------------
from coordsys cimport CoordSyst

cdef class _CObj:
	cdef void dummy(self): pass

cdef class Position(_CObj):
	def do_it(self):
		cdef float*value = self._parent._root_matrix()
------------------------

file coordsys.pxd:

------------
from math3d cimport Position

cdef class CoordSyst(Position):
	cdef float* _root_matrix(self)
------------

The issue, when trying to build, is that the struct for CoordSyst data
seems to have no vtable:

-----------------
$ python2 setup.py build_ext --inplace
Compiling math3d.pyx because it changed.
Cythonizing math3d.pyx
running build_ext
building 'math3d' extension
creating build
creating build/temp.linux-i686-2.7
gcc -pthread -fno-strict-aliasing -march=i686 -mtune=generic -O2 -pipe
-DNDEBUG -march=i686 -mtune=generic -O2 -pipe -fPIC
-I/usr/include/python2.7 -c math3d.c -o
build/temp.linux-i686-2.7/math3d.o
math3d.c: In function ‘__pyx_pf_6math3d_8Position_do_it’:
math3d.c:532:127: error: ‘struct __pyx_obj_8coordsys_CoordSyst’ has no
member named ‘__pyx_vtab’
error: command 'gcc' failed with exit status 1
----------------

I'm quoting some random stuff which seems to be interesting:

> a) Position must inherit from something (_CObj) and _CObj must have a
> cdef'ed method (dummy) for the error to appear
> b) Position must have the CoordSyst attribute, and a method (here
> _root_matrix) of this attribute must be called in the .pyx (in fact,
> math3d.c:537 is the self._parent._root_matrix() call)
> c) CoordSyst must inherit from Position, but must be defined in another file.

Thank you for your time.

--

-- 
Guillaume
