Stefan Behnel | 30 May 09:28 2015

PEP 492 implemented (async/await)

Hi,

I invested a couple of days implementing PEP 492 in Cython.

https://www.python.org/dev/peps/pep-0492/

It turned out nicely, so it's now merged into master to become part of
Cython 0.23. I also spent some time testing and debugging it against Python
3.5 so that Yury Selivanov could adapt their side for interoperability. The
second beta of 3.5 will be released tomorrow and it should "just work".
Testing is very welcome.

Note that the language feature is available in Cython for all Python
versions (2.6+), but usage from Python code with async/await is obviously
limited to Python 3.5 where this syntax is available. My guess is that one
of the next asyncio (and trollius) backport package releases will add
support as well, so that you could run Cython coroutines on top of asyncio
also in older Python releases. It's mostly about dropping some explicit
type checks here and there or replacing them with ABC isinstance checks.
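
As a rough illustration of that last point, here is a minimal Python sketch contrasting an explicit type check with an ABC isinstance check. The class and helper names are hypothetical and this is not code from asyncio or Cython; it only uses `types.CoroutineType` and `collections.abc.Awaitable` as they exist in Python 3.5+.

```python
import collections.abc
import types


class CompiledCoroutineLike:
    """Hypothetical stand-in for an awaitable that is not a native coroutine."""

    def __await__(self):
        # __await__ must return an iterator; an empty one is enough here.
        return iter(())


async def native():
    return 1


def accepts_strict(obj):
    # Explicit type check: only native coroutine objects pass.
    return type(obj) is types.CoroutineType


def accepts_abc(obj):
    # ABC check: anything implementing __await__ passes.
    return isinstance(obj, collections.abc.Awaitable)


native_coro = native()
other = CompiledCoroutineLike()

strict_results = (accepts_strict(native_coro), accepts_strict(other))
abc_results = (accepts_abc(native_coro), accepts_abc(other))

# Close the unused coroutine so Python does not warn that it was never awaited.
native_coro.close()

print("strict:", strict_results)
print("abc:", abc_results)
```

The strict check rejects the stand-in object while the ABC check accepts both, which is the kind of relaxation a backport would presumably need.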

Have fun,

Stefan

--

-- 

--- 
You received this message because you are subscribed to the Google Groups "cython-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cython-users+unsubscribe <at> googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Antony Lee | 29 May 10:01 2015

boundscheck=True faster than boundscheck=False

Consider the following minimal example:

#cython: language_level=3, boundscheck=False, wraparound=False, initializedcheck=False, cdivision=True
cimport cython
from libc.stdlib cimport malloc

def main(size_t ni, size_t nt, size_t nx):
    cdef:
        size_t i, j, t, x, y
        double[:, :, ::1] a = <double[:ni, :ni, :nx]>malloc(ni * ni * nx * sizeof(double))
        double[:, :, ::1] b = <double[:nt, :ni, :nx]>malloc(nt * ni * nx * sizeof(double))
        size_t[:, :, ::1] best = <size_t[:nt, :ni, :nx]>malloc(nt * ni * nx * sizeof(size_t))
        size_t mxi
        double s, mxs
    for t in range(nt):
        for j in range(ni):
            for y in range(nx): # this loop does nothing but is needed for the effect below.
                mxs = -1e300
                for i in range(ni):
                    for x in range(nx):
                        with cython.boundscheck(True): # Faster!?!?
                            s = b[t, i, x] + a[i, j, x]
                        if s >= mxs:
                            mxs = s
                            mxi = i
                best[t + 1, j, y] = mxi
    return best[0, 0, 0]

essentially summing two 2D arrays along some specific axes and finding the maximizing index along another axis.
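
For checking results on small inputs, the computation can be sketched in pure Python. This is only a reference sketch under assumptions: it uses nested lists instead of memoryviews, drops the no-op `y` loop, stores the result at index `t`, and the function name `best_indices` is made up for illustration.

```python
def best_indices(a, b):
    """For every (t, j), return the i that maximizes b[t][i][x] + a[i][j][x].

    a has shape [ni][ni][nx], b has shape [nt][ni][nx].
    """
    ni = len(a)
    nt = len(b)
    best = [[0] * ni for _ in range(nt)]
    for t in range(nt):
        for j in range(ni):
            max_sum = float("-inf")
            max_i = 0
            for i in range(ni):
                for x in range(len(a[i][j])):
                    s = b[t][i][x] + a[i][j][x]
                    if s >= max_sum:  # ">=" keeps the last maximal index
                        max_sum = s
                        max_i = i
            best[t][j] = max_i
    return best


# Tiny example with ni=2, nt=1, nx=2.
a = [[[0.0, 1.0], [4.0, 0.0]],
     [[5.0, 0.0], [0.0, 0.0]]]
b = [[[1.0, 1.0], [0.0, 3.0]]]
result = best_indices(a, b)
print(result)
```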

When compiled with gcc -O3 and called with the arguments (1, 2000, 2000), enabling boundscheck=True makes execution twice as fast as with boundscheck=False.

Any hint of why this would be the case?  (Well, I can probably guess this has again to do with GCC autovectorization...)

Thanks in advance,

Antony

Jeff Reback | 29 May 01:26 2015

PERF with inline gil released functions

I am working on releasing the GIL for a lot of pandas Cython extensions (see here) and ran into this odd performance case.

It turns out that if I call a nogil function that has a "with gil:" block inside it, the function gets wrapper code generated to save the GIL state. However, if I inline this exact same code, then this wrapper code is NOT generated.

So I created an extension class and several functions. The timings are below, along with the relevant section of the cython -a output. Is this expected?


# extensions.pyx

cimport cython


cdef class A:

    cdef:
        size_t n

    def __cinit__(self):
        self.n = 0

    cdef inline void incr(self) nogil:
        if self.n % 2 == 0:
            with gil:
                pass
        self.n += 1


cdef class B:

    def run_inline(self):
        cdef:
            A a = A()
            size_t i
        with nogil:
            for i in range(10000):
                if a.n % 2 == 0:
                    with gil:
                        pass
                a.n += 1

    def run_func(self):
        cdef:
            A a = A()
            size_t i
        with nogil:
            for i in range(10000):
                a.incr()


# extension.py
import pyximport
pyximport.install()
import extensions




And perf:

In [3]: import extension

In [4]: b = extension.extensions.B()

In [5]: %timeit b.run_inline()
1000 loops, best of 3: 1.62 ms per loop

In [6]: %timeit b.run_func()
100 loops, best of 3: 8.14 ms per loop
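
To put the two timings in perspective, here is a small back-of-the-envelope calculation in Python. It uses only the numbers reported above; attributing the whole difference to per-call overhead is an assumption, not something the measurements prove.

```python
ITERATIONS = 10000   # loop count used in run_inline() and run_func()
inline_ms = 1.62     # reported time for b.run_inline()
func_ms = 8.14       # reported time for b.run_func()

# How many times slower the function-call version is.
slowdown = func_ms / inline_ms

# Extra time per incr() call, in nanoseconds (1 ms = 1e6 ns).
extra_ns_per_call = (func_ms - inline_ms) * 1e6 / ITERATIONS

print("slowdown: {:.1f}x".format(slowdown))
print("extra per call: {:.0f} ns".format(extra_ns_per_call))
```

That works out to roughly a 5x slowdown, or about 650 ns of extra work per call.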





# cython -a extensions.pyx

+11: cdef inline void incr(self) nogil:
static CYTHON_INLINE void __pyx_f_10extensions_1A_incr(struct __pyx_obj_10extensions_A *__pyx_v_self) {
  #ifdef WITH_THREAD
  PyGILState_STATE __pyx_gilstate_save = PyGILState_Ensure();
  #endif
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("incr", 0);
  #ifdef WITH_THREAD
  PyGILState_Release(__pyx_gilstate_save);
  #endif
  /*try:*/ {
/* … */
  /* function exit code */
  #ifdef WITH_THREAD
  PyGILState_Release(__pyx_gilstate_save);
  #endif
}






Jeff Reback | 28 May 23:34 2015

using nogil on an extension type causes state saving on every call, but inlining doesn't

As part of refactoring lots of pandas routines to release the GIL where appropriate (see here), I ran into an odd performance issue.

We have an extension type that looks like this (I know it does not check for memory errors at the moment; I am just trying to understand a couple of things here):

cdef uint8_t Int64VectorData_needs_resize(Int64VectorData *data) nogil:
    return data.n == data.m

cdef void Int64VectorData_append(Int64VectorData *data, int64_t x) nogil:
    data.data[data.n] = x
    data.n += 1

cdef class Int64Vector:
    cdef:
        Int64VectorData *data
        ndarray ao

    def __cinit__(self):
        self.data = <Int64VectorData *>PyMem_Malloc(sizeof(Int64VectorData))
        self.data.n = 0
        self.data.m = _INIT_VEC_CAP
        self.ao = np.empty(self.data.m, dtype=np.int64)
        self.data.data = <int64_t*> self.ao.data

    cdef resize(self):
        self.data.m = max(self.data.m * 4, _INIT_VEC_CAP)
        self.ao.resize(self.data.m)
        self.data.data = <int64_t*> self.ao.data

    def __dealloc__(self):
        PyMem_Free(self.data)

    def __len__(self):
        return self.data.n

    def to_array(self):
        self.ao.resize(self.data.n)
        self.data.m = self.data.n
        return self.ao

    cdef inline void append(self, int64_t x):

        if Int64VectorData_needs_resize(self.data):
            self.resize()

        Int64VectorData_append(self.data, x)

    cdef inline void append_nogil(self, int64_t x) nogil:

        if Int64VectorData_needs_resize(self.data):
            with gil:
                self.resize()

        Int64VectorData_append(self.data, x)
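
To make the check-then-grow logic of the extension type easier to follow, here is a minimal pure-Python sketch of the same pattern. It is an illustration under assumptions, not the pandas code: the class name is made up, a plain list stands in for the NumPy buffer, and the value 32 for the initial capacity is hypothetical since the post does not show `_INIT_VEC_CAP`.

```python
INIT_VEC_CAP = 32  # hypothetical value; the real _INIT_VEC_CAP is not shown


class GrowableVector:
    def __init__(self):
        self.n = 0                    # number of stored items
        self.m = INIT_VEC_CAP         # current capacity
        self.buf = [0] * self.m       # preallocated storage

    def needs_resize(self):
        # Full when the item count has reached the capacity.
        return self.n == self.m

    def resize(self):
        # Grow the capacity by a factor of 4, as in the resize() method above.
        self.m = max(self.m * 4, INIT_VEC_CAP)
        self.buf.extend([0] * (self.m - len(self.buf)))

    def append(self, x):
        if self.needs_resize():
            self.resize()
        self.buf[self.n] = x
        self.n += 1

    def __len__(self):
        return self.n

    def to_list(self):
        # Unlike to_array() above, this does not shrink the buffer.
        return self.buf[:self.n]


vec = GrowableVector()
resizes = 0
for value in range(100):
    if vec.needs_resize():
        resizes += 1
    vec.append(value)

print(len(vec), vec.m, resizes)
```

With this growth factor the resize branch is taken rarely, so in the nogil variant the "with gil:" block should also only be entered rarely.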
 


If I call it like this (IOW, inlining the append call):

    with nogil:
        for i in range(n):
            val = values[i]
            k = kh_get_int64(self.table, val)
            if k != self.table.n_buckets:
                idx = self.table.vals[k]
                labels[i] = idx
            else:
                k = kh_put_int64(self.table, val, &ret)
                self.table.vals[k] = count

                if Int64VectorData_needs_resize(ud):
                    with gil:
                        uniques.resize()
                Int64VectorData_append(ud, val)
                labels[i] = count
                count += 1



All is well and good. 

However, I originally was making a call like this:

unique.append

Sturla Molden | 28 May 18:27 2015

Re: vectorization of looping on an array from cython

Chris Barker <chris.barker <at> noaa.gov> wrote:

> Did you look at the actual for loop generated? In addlocal, Cython knows
> that i and n are both ints, and can therefore write a plain old C for loop.
> But in add, the argument to the range() function is an arbitrary Python
> type, so I'd expect it to write a less optimized C for loop, and so the C
> compiler may not know for sure that this is a simple loop through an array.

Cython should know the type of the .shape attribute of a typed memoryview.
If it does not, it qualifies as a bug.

Sturla


Antony Lee | 27 May 20:29 2015

vectorization of looping on an array from cython

(crossposted from StackOverflow)

Consider the following example of doing an inplace-add on a Cython memoryview:

    #cython: boundscheck=False, wraparound=False, initializedcheck=False, nonecheck=False, cdivision=True
    from libc.stdlib cimport malloc, free
    from libc.stdio cimport printf
    cimport numpy as np
    import numpy as np


    cdef extern from "time.h":
        int clock()


    cdef void inplace_add(double[::1] a, double[::1] b):
        cdef int i
        for i in range(a.shape[0]):
            a[i] += b[i]


    cdef void inplace_addlocal(double[::1] a, double[::1] b):
        cdef int i, n = a.shape[0]
        for i in range(n):
            a[i] += b[i]


    def main(int N):
        cdef:
            int rep = 1000000, i
            double* pa = <double*>malloc(N * sizeof(double))
            double* pb = <double*>malloc(N * sizeof(double))
            double[::1] a = <double[:N]>pa
            double[::1] b = <double[:N]>pb
            int start
        start = clock()
        for i in range(N):
            a[i] = b[i] = 1. / (1 + i)
        for i in range(rep):
            inplace_add(a, b)
        printf("loop %i\n", clock() - start)
        print(np.asarray(a)[:4])
        start = clock()
        for i in range(N):
            a[i] = b[i] = 1. / (1 + i)
        for i in range(rep):
            inplace_addlocal(a, b)
        printf("loop_local %i\n", clock() - start)
        print(np.asarray(a)[:4])

With these Cython directives, the seemingly equivalent `inplace_add` and `inplace_addlocal` both compile to tight C loops.  But for `N=128` (the approximate size I'm expecting), `inplace_addlocal` is twice(!) as fast as `inplace_add` after compilation with `gcc -Ofast` (and directly writing a C function taking `(int, double*, double*)` is more or less as fast as `inplace_addlocal`, with or without `#pragma omp simd`).  Passing `-fopt-info` to `gcc` shows that `inplace_addlocal` gets vectorized, but `inplace_add` does not.

Is this an issue with the C code that Cython generates (i.e., gcc truly cannot infer whatever guarantees it needs to vectorize the code), or with gcc (i.e., some optimization is missing), or something else?

Thanks.

Antony

Martin Bammer | 26 May 21:12 2015

Current master fails to compile with MinGW g++

Traceback (most recent call last):
  File "C:\Python\test_cython\setup.py", line 220, in Py2C
    setup(ext_modules = cythonize(extensions, gdb_debug = bCythonDebug))
  File "c:\python27\lib\distutils\core.py", line 166, in setup
    raise SystemExit, "error: " + str(msg)
SystemExit: error: command 'C:\\MinGW\\bin\\g++.exe' failed with exit status 1

test_cython.cpp: In function 'void __Pyx_RaiseArgtupleInvalid(const char*, int, Py_ssize_t, Py_ssize_t, Py_ssize_t)':
test_cython.cpp:3959:59: warning: unknown conversion type character 'z' in format [-Wformat=]
                  (num_expected == 1) ? "" : "s", num_found);
                                                           ^
test_cython.cpp:3959:59: warning: format '%s' expects argument of type 'char*', but argument 5 has type 'Py_ssize_t {aka int}' [-Wformat=]
test_cython.cpp:3959:59: warning: unknown conversion type character 'z' in format [-Wformat=]
test_cython.cpp:3959:59: warning: too many arguments for format [-Wformat-extra-args]
test_cython.cpp: In function 'void __Pyx_RaiseTooManyValuesError(Py_ssize_t)':
test_cython.cpp:4478:94: warning: unknown conversion type character 'z' in format [-Wformat=]
                  "too many values to unpack (expected %" CYTHON_FORMAT_SSIZE_T "d)", expected);
                                                                                              ^
test_cython.cpp:4478:94: warning: too many arguments for format [-Wformat-extra-args]
test_cython.cpp: In function 'void __Pyx_RaiseNeedMoreValuesError(Py_ssize_t)':
test_cython.cpp:4484:48: warning: unknown conversion type character 'z' in format [-Wformat=]
                  index, (index == 1) ? "" : "s");
                                                ^
test_cython.cpp:4484:48: warning: format '%s' expects argument of type 'char*', but argument 3 has type 'Py_ssize_t {aka int}' [-Wformat=]
test_cython.cpp:4484:48: warning: too many arguments for format [-Wformat-extra-args]
c:\python27\libs/libpython27.a(dmmes00976.o):(.idata$7+0x0): undefined reference to `_head_C__build27_cpython_PCBuild_libpython27_a'
c:\python27\libs/libpython27.a(dmmes01026.o):(.idata$7+0x0): undefined reference to `_head_C__build27_cpython_PCBuild_libpython27_a'
c:\python27\libs/libpython27.a(dmmes00239.o):(.idata$7+0x0): undefined reference to `_head_C__build27_cpython_PCBuild_libpython27_a'
c:\python27\libs/libpython27.a(dmmes01018.o):(.idata$7+0x0): undefined reference to `_head_C__build27_cpython_PCBuild_libpython27_a'
c:\python27\libs/libpython27.a(dmmes00242.o):(.idata$7+0x0): undefined reference to `_head_C__build27_cpython_PCBuild_libpython27_a'
c:\python27\libs/libpython27.a(dmmes00245.o):(.idata$7+0x0): more undefined references to `_head_C__build27_cpython_PCBuild_libpython27_a' follow
collect2.exe: error: ld returned 1 exit status

Jelle Zijlstra | 26 May 20:17 2015

cpdef'ed generator always causes SystemError to be thrown

Whenever I call a cpdef'ed generator function from another Cython function, I get "SystemError: error return without exception set".

Sample code:

$ cat test.py
def f():
    return [x for x in _f()]

globals()['f'] = f


def _f():
    yield 1
    yield 2
    return

$ cat test.pxd 
cpdef list f()
cpdef _f()
$ python
>>> import test; test.f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: error return without exception set
>>> 
$ ls test.*
test.c  test.pxd  test.py  test.so

Is this a bug in Cython?

Corrin Meyer | 26 May 18:05 2015

Linking issues when building cython

I am having an issue when trying to install Cython on Windows 7.

I have a fresh install of Python 2.7.10, open a 'cmd' window and then try to install cython using the command 'pip install cython'.  It tries to build and apparently link the C-extensions but ends with the following error:

C:\Programs\mingw-w64\i686-4.9.2-posix-dwarf-rt_v4-rev2\mingw32\bin\gcc.exe -shared -s c:\users\corrin~1\appdata\local\temp\pip-build-srrbvr\cython\cython\plex\scanners.o c:\users\corrin~1\appdata\local\temp\pip-build-srrbvr\cython\cython\plex\Scanners.def -LC:\Python27\libs -LC:\Python27\PCbuild -lpython27 -lmsvcr90 -o build\lib.win32-2.7\Cython\Plex\Scanners.pyd
C:\Python27\libs/libpython27.a: error adding symbols: File format not recognized
collect2.exe: error: ld returned 1 exit status
error: command 'C:\\Programs\\mingw-w64\\i686-4.9.2-posix-dwarf-rt_v4-rev2\\mingw32\\bin\\gcc.exe' failed with exit status 1

I am using the 32-bit version of Python 2.7.10 and the 32-bit mingw-builds version of mingw-w64 (GCC 4.9.2 for i686, posix threads, dwarf exception handling, build revision 2).  I have created a 'C:\Python27\Libs\distutils\distutils.cfg' file with the contents:

[build]
compiler=mingw32

I have also tried the vanilla MinGW compiler and several different versions of the mingw-builds compiler for GCC 4.9.2, with varying results.  Of particular interest, vanilla MinGW skips 'libpython27.a' as incompatible and tries 'python27.lib' instead.  At this point I am at a loss as to what is going wrong here.  Additionally, there doesn't seem to be any reasonably current information on building Cython on Windows (or Python C extensions in general).

Adam Callison | 25 May 22:57 2015

include statement

Hi all,

In the documentation for the cython 'include' statement (http://docs.cython.org/src/userguide/language_basics.html#the-include-statement), it is claimed that "the included file can contain any complete statements or declarations that are valid in the context where the include statement appears, including other include statements. The contents of the included file should begin at an indentation level of zero, and will be treated as though they were indented to the level of the include statement that is including the file.".

However, when I attempt to use it:

cdef int init_psi(double * psi, s_params params):
    include "initpsi.pxi"
    return 0

I am told "include statement not allowed here".

The file "initpsi.pxi" begins at 0 indentation, as instructed. What am I doing wrong?

Adam



MinRK | 25 May 22:11 2015

Python 3.5 support

I’ve been testing with the Python 3.5 beta, and there appears to be something preventing Cython modules from being importable.

Compilation runs fine, but when I try to import the compiled module:

python -c 'import Cython.Runtime.refnanny'

I get:

ImportError: dynamic module does not define module export function (PyInit_.refnanny)

which suggests it is related to PEP-489.
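
For comparison with the symbol in the error message, here is a minimal Python sketch of how the init function name is conventionally derived for an ASCII module name: `PyInit_` followed by the last dotted component. The helper name is hypothetical, this is not the CPython implementation, and non-ASCII names (which use a different scheme) are deliberately left out.

```python
def expected_init_symbol(fullname):
    """Return the conventional init function name for an ASCII module name."""
    # Only the last dotted component of the full module name is used.
    last_component = fullname.rpartition(".")[2]
    try:
        last_component.encode("ascii")
    except UnicodeEncodeError:
        raise ValueError("non-ASCII module names are not covered by this sketch")
    return "PyInit_" + last_component


symbol = expected_init_symbol("Cython.Runtime.refnanny")
print(symbol)
```

The name computed this way has no dot in it, unlike the `PyInit_.refnanny` shown in the error.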

When I do a simpler test with tst.pyx (after installing Cython with --no-cython-compile):

# tst.pyx
from libc.stdio cimport printf

def p(int i):
    printf("%d\n", i)

And compile manually, it seems to work:

cython tst.pyx
clang -fpic `python3.5-config --cflags` -c tst.c -o tst.o
clang -shared -o tst.so tst.o `python3.5-config --ldflags`
python -c 'import tst; tst.p(5)'

But all Cython Extensions get the same export error above.

This is on OS X 10.10 with the Python 3.5 beta bdist from Python.org.

Is this a known issue? Is anyone successfully using Cython with Python 3.5?

-MinRK

