Sturla Molden | 1 Jan 04:03

Re: [Cython] Cython 0.12 not working on Windows XP


[This is perhaps OT on the Cython list. Please forgive me the noise.]

Stefan,

CreateProcess in Win32 API and fork in Interix (SUA/SFU) both call the
kernel function ZwCreateProcess in ntdll.dll. If NULL is passed as section
handle to ZwCreateProcess, it will clone the current process. The
copy-on-write optimization actually happens in hardware. Most modern
processors have a paging memory-management unit that can tag pages as
shared and copy-on-write. The reason that early versions of Unix or Linux
did not copy-on-write optimize fork, but modern ones do, is that CPUs have
evolved. You can read about Windows kernel programming in Nebbet's book on
NT kernel internals. It even has example code for a boiler-plate
implementation of fork.
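
As a rough illustration of the fork semantics described above, here is a minimal sketch for a POSIX system (it uses Python's os.fork, not ZwCreateProcess, and the function name is made up for this example). It only shows the behaviour a program can observe: a write in the child is private to the child. It does not prove that the kernel shares pages until that write.

```python
import os


def fork_and_modify():
    """Fork, change a value in the child, and report what each process sees."""
    data = [0]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands in the child's private copy of the page.
        try:
            data[0] = 99
            os.write(write_fd, bytes([data[0]]))
        finally:
            os._exit(0)  # leave immediately, skipping the parent's cleanup code
    # Parent: collect the child's value, then reap the child.
    os.close(write_fd)
    child_value = os.read(read_fd, 1)[0]
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data[0], child_value


if __name__ == "__main__":
    print(fork_and_modify())
```

The parent still sees its original value, while the child reports the value it wrote.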

Cygwin's problem is that hooking up a process created by ZwCreateProcess
to the Win32 or SUA subsystem is undocumented; neither the MS documentation
nor Nebbet covers that. That is why Cygwin does not implement a copy-on-write
fork yet, although most modern hardware supports it.

I recommend using Interix (SFU/SUA) instead of Cygwin. You will need 
Windows XP Professional or Enterprise, or Windows Vista / Windows 7
Ultimate. The codebase of Interix is mostly derived from OpenBSD. This
means it is very safe. The old POSIX subsystem from Windows NT4 is
deprecated in favour of Interix. You'll find more information at
www.interix.com. Prebuilt binaries for common Unix tools can be downloaded
from the Warehouse in SUA community.

Note that SUA is UNIX, not Windows; it just happens to share the kernel.
[...]


[Cython] [OT] Windows and Unix

Sturla Molden wrote:
> [This is perhaps OT on the Cython list. Please forgive me the noise.]

Well, I for one think it is interesting, so continuing OT...

> 
> Stefan,
> 
> CreateProcess in Win32 API and fork in Interix (SUA/SFU) both call the
> kernel function ZwCreateProcess in ntdll.dll. If NULL is passed as section
> handle to ZwCreateProcess, it will clone the current process. The
> copy-on-write optimization actually happens in hardware. Most modern
> processors have a paging memory-management unit that can tag pages as
> shared and copy-on-write. The reason that early versions of Unix or Linux
> did not copy-on-write optimize fork, but modern ones do, is that CPUs have
> evolved. You can read about Windows kernel programming in Nebbet's book on
> NT kernel internals. It even has example code for a boiler-plate
> implementation of fork.
> 
> Cygwin's problem is that hooking up a process created by ZwCreateProcess
> to the Win32 or SUA subsystem is undocumented; neither the MS documentation
> nor Nebbet covers that. That is why Cygwin does not implement a copy-on-write
> fork yet, although most modern hardware supports it.
> 
> I recommend using Interix (SFU/SUA) instead of Cygwin. You will need 
> Windows XP Professional or Enterprise, or Windows Vista / Windows 7
> Ultimate. The codebase of Interix is mostly derived from OpenBSD. This
> means it is very safe. The old POSIX subsystem from Windows NT4 is
> deprecated in favour of Interix. You'll find more information at
> www.interix.com. Prebuilt binaries for common Unix tools can be downloaded
[...]

Sturla Molden | 3 Jan 07:23

Re: [Cython] [OT] Windows and Unix

Dag Sverre Seljebotn wrote:

> In my experience, Cygwin is often used as an
> easy way out for porting open source software to Windows, and SFU/SUA
> seem to exclude at least most of the home users and quite a few laptop
> users.

First, note that Cygwin is GPL unless you buy a commercial license (for a
fee undisclosed by Red Hat).

The Cygwin fork call is not copy-on-write optimized, which makes Cygwin
unsuited for fork-based internet servers. Also, programs that require
inter-process read-only access to huge memory buffers can avoid using
shared memory by forking. This trick will not work on Cygwin.
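
The read-only trick can be sketched as follows, assuming a POSIX system with a working os.fork. The function name is hypothetical. The child reads a buffer that the parent built before forking, without any explicit shared-memory setup, and sends back a checksum through a pipe. CPython's reference counting still writes to object headers, so a few pages get copied anyway; the bulk of the payload is only read.

```python
import os


def checksum_in_child(buffer: bytes) -> int:
    """Fork and let the child compute a checksum over the inherited buffer."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the buffer is already in its address space, inherited by fork.
        status = 1
        try:
            os.close(read_fd)
            total = sum(buffer) % 2**32
            os.write(write_fd, total.to_bytes(4, "little"))
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    # Parent: read the 4-byte result, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        payload = reader.read()
    os.waitpid(pid, 0)
    return int.from_bytes(payload, "little")


if __name__ == "__main__":
    big = bytes(range(256)) * 4096  # about 1 MiB of data built before forking
    print(checksum_in_child(big) == sum(big) % 2**32)
```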

And then there is the security issue. How safe is Cygwin against various
exploits? As far as I can tell, there is no auto-update of system
components.

> Can one compile gcc for SFU/SUA? Does that have less or more problems
> than gcc for Cygwin?

gcc is the system C compiler on Interix. Microsoft is for some reason not
using their own C compiler, but rather relying on gcc. Perhaps Visual C++
did not pass UNIX certification; or perhaps this is a decision to make
porting from Linux easier. I don't know. But in any case, gcc is
preinstalled, you can build your own, or download one here:

http://www.suacommunity.com/tool_warehouse.htm


Zak Stone | 3 Jan 07:58

Re: [Cython] quick novice question about type conversion

Thank you very much for these additional details, Dag! They were
exactly what I needed.

Many thanks to all of you for your inspiring work on Cython, and happy new year!

Zak

On Mon, Dec 28, 2009 at 4:23 AM, Dag Sverre Seljebotn
<dagss@...> wrote:
>> Thanks for the quick reply. Could you say just a little more about how
>> to work around this issue? The numpy.pxd file is close to 1000 lines
>> long, and I don't know enough about Cython or NumPy to dive in and do
>> a hot-fix.
>
> You simply do
>
> cdef ndarray[int] myarray = np.zeros((100,), dtype=np.intc)
>
> "int" on the left is compile-time context and refers to c int, while
> np.intc on the right side is a run-time object (where numpy.pxd doesn't
> really come into play).
>
> What you need to fix NumPy pxd for is so that you can type the completely
> equivalent
>
> cdef ndarray[np.intc_t] myarray = np.zeros((100,), dtype=np.intc)
>
> Dag

Hoyt Koepke | 4 Jan 23:43

[Cython] ANN: TreeDict, a new cython package

Hello,

I'd like to announce a new open source python/cython package,
TreeDict, that the cython community might be interested in.  I've been
using it and testing it for a number of months and have finally
documented it sufficiently to release.  This is my first real open
source software package that I hope can be generally useful,
particularly to the scientific community, and I'd greatly appreciate
feedback on any and all parts of the
design/coding/project/release/documentation from the more experienced
developers here.

--------------------------------------------------------

TreeDict is a dictionary-like, hierarchical python container to
simplify the bookkeeping surrounding parameters, variables and data.
It aims to be fast, lightweight, intuitive, feature-rich and stable.
It's written in python/cython and BSD licensed.

While intended for general python development, it includes a number of
features particularly useful for scientific programming. It is similar
in basic functionality to MATLAB structures in terms of concise syntax
and implicit branch creation. In addition, TreeDict implements all the
methods of regular dictionaries, pickling, fast non-intersecting
hashing for efficient caching, manipulations on the tree structure,
fuzzy key matching for helpful error messages, and a system for
forward-referencing branches to make lists of parameters more
readable.
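
To make "implicit branch creation" concrete, here is a toy sketch in plain Python. It is not TreeDict's implementation or API; the class name AutoTree and the parameter names are invented for illustration only.

```python
class AutoTree:
    """Toy container whose missing attributes become new branches on first access."""

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        if name.startswith("_"):
            # Leave private and special names alone (copy, pickle, etc.).
            raise AttributeError(name)
        branch = AutoTree()
        setattr(self, name, branch)  # remember the branch for later accesses
        return branch


params = AutoTree()
params.model.learning_rate = 0.1   # "model" is created implicitly
params.model.layers.hidden = 64    # so is "layers"

if __name__ == "__main__":
    print(params.model.learning_rate, params.model.layers.hidden)
```

One consequence of this design is that reading a misspelled name silently creates an empty branch, which is presumably part of why fuzzy key matching in error messages is useful.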

Documentation and examples are available at
[...]

Nathaniel Smith | 5 Jan 03:30

Re: [Cython] ANN: TreeDict, a new cython package

On Mon, Jan 4, 2010 at 2:43 PM, Hoyt Koepke <hoytak <at> gmail.com> wrote:
> I'd like to announce a new open source python/cython package,
> TreeDict, that the cython community might be interested in.  I've been
> using it and testing it for a number of months and have finally
> documented it sufficiently to release.  This is my first real open
> source software package that I hope can be generally useful,
> particularly to the scientific community, and I'd greatly appreciate
> feedback on any and all parts of the
> design/coding/project/release/documentation from the more experienced
> developers here.

Neat! From a quick read of the docs, I have a few minor questions:
  -- Why invent a new mechanism for global variables
(getTree/treeExists)? It seems redundant (and violates "there's only
one way to do it", if you care about that prescription for pythonicity
;-)
  -- I'd prefer PEP8-compliant method names, e.g. branch_name rather
than branchName.
  -- What advantage do you get from using cython instead of regular
python? I guess the edit distance calculation for similarity matching
is CPU-bound, but I wouldn't expect even that to be a bottleneck in
regular usage.
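
For reference, the kind of edit distance calculation meant here can be sketched in pure Python as below. This is the textbook Levenshtein dynamic program, not necessarily the metric TreeDict uses, and the function names are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def closest_key(wanted: str, keys):
    """Return the existing key with the smallest edit distance to `wanted`."""
    return min(keys, key=lambda key: edit_distance(wanted, key))


if __name__ == "__main__":
    print(closest_key("branch_nmae", ["branch_name", "branch_count"]))
```

The inner loop runs len(a) * len(b) times per comparison, which is the part that would be CPU-bound in pure Python.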

-- Nathaniel
_______________________________________________
Cython-dev mailing list
Cython-dev <at> codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev

Hoyt Koepke | 5 Jan 05:10

Re: [Cython] ANN: TreeDict, a new cython package

> Neat! From a quick read of the docs, I have a few minor questions:

Thanks for the look & the helpful feedback!  I appreciate it.

>  -- Why invent a new mechanism for global variables
> (getTree/treeExists)? It seems redundant (and violates "there's only
> one way to do it", if you care about that prescription for pythonicity
> ;-)

I see what you mean, but there are a couple, possibly weak, reasons
why I thought those would be useful.  First, my inspiration was the
python logging module with its getLogger function that
instantiates/returns a logger with that given name, and the resulting
logger is common across modules.  Second, it always felt a little
weird to me to "write" important information to an imported module's
namespace without some sort of function that lets you know what you're
doing.  However, a bit of googling indicates that this is an accepted
way of doing things if there's a global config file imported
everywhere and all the config parameters are set in that file.  Now
I'm +0 on keeping them in.
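
The logging behaviour referred to here, and a named registry in the same spirit, can be sketched as follows. The helper names get_config and config_exists are hypothetical and are not TreeDict's getTree/treeExists; only logging.getLogger is a real API.

```python
import logging

# logging.getLogger returns the same object for the same name, so every
# module that asks for "app.db" shares one logger.
first = logging.getLogger("app.db")
second = logging.getLogger("app.db")

_registry = {}  # module-level table of named, shared objects


def get_config(name):
    """Return the shared dict registered under `name`, creating it on first use."""
    return _registry.setdefault(name, {})


def config_exists(name):
    """Report whether `name` has been registered, without creating it."""
    return name in _registry


if __name__ == "__main__":
    get_config("run")["seed"] = 42
    print(first is second, get_config("run")["seed"], config_exists("other"))
```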

>  -- I'd prefer PEP8-compliant method names, e.g. branch_name rather
> than branchName.

Ah... I was not familiar with that PEP, but it makes sense.  I chose
to do it this way since I usually have variable names in that style,
which become attributes with TreeDict, and wanted to avoid conflicts.
However, I think a PEP would overrule that.  Since I'm currently the
only one using it, I can change over to that.


Matthew Bromberg | 6 Jan 02:04

[Cython] Arrays of Python Objects

I need to create a fast hashing function to cache the creation and
deletion of an extension class that uses an external C buffer.  (It's
actually a buffer on a hardware device.)

I can't afford to keep creating and deleting these items so I want to
save them all off and reuse them as needed.

So I thought it would be easy just to create an array of python objects
and use a simple hash into an array written in cython and maybe a little
C.

I was wrong.  A buffer of the type
cdef object[int] myhash
can only be declared locally inside some function or method.
I don't see any suitable python class.  The closest would be set, but
this needs to be minimalistic and as fast as possible.  Set is just too
heavy and probably slow for my needs.

In fact it would be nice if I could override the creation and
destruction of my extension classes so that the class and its python
attribute objects don't get garbage collected.  My best hope was storing
it in some kind of python aware cython array, but I'm not sure this will
work.  If I store objects in C arrays somehow, then the objects will get
garbage collected anyway.

It seems to be a bit of a conundrum.

Any thoughts?


Sturla Molden | 6 Jan 14:01

Re: [Cython] Arrays of Python Objects


> I was wrong.  A buffer of the type
> cdef object[int] myhash
> can only be declared locally inside some function or method.

Then write an extension class that wraps the buffer, and store one on
module scope. Something like this (not tested):

DEF BUF_SIZE = 32

import myclass # your extension class

from python cimport PyObject, Py_INCREF, Py_DECREF

cdef class bufferwrapper:

    cdef PyObject *buffer[BUF_SIZE]

    def __cinit__(bufferwrapper self):
        cdef int n
        for n in range(BUF_SIZE):
            self.buffer[n] = NULL

    def __init__(bufferwrapper self):
        cdef int n
        cdef object tmp
        for n in range(BUF_SIZE):
            tmp = myclass.myclass()
            Py_INCREF(tmp)
            self.buffer[n] = <PyObject*> tmp
[...]

Sturla Molden | 6 Jan 14:22

Re: [Cython] Arrays of Python Objects


Matthew Bromberg wrote:

> I was wrong.  A buffer of the type
> cdef object[int] myhash
> can only be declared locally inside some function or method.

Ignore my last post!!!

You are confusing PEP 3118 buffers with C arrays.

You probably want this:

    DEF BUF_SIZE = 32
    cdef object myhash[BUF_SIZE]

which should work at module scope. This says "myhash is a C array of
object, of length BUF_SIZE."
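
For comparison, the same fixed-size slot table can be sketched in plain Python (not Cython) with a module-level list. The list holds strong references, so the cached objects are not garbage collected while they sit in it. DeviceBuffer and acquire are hypothetical names standing in for the extension class and the lookup function.

```python
BUF_SIZE = 32


class DeviceBuffer:
    """Hypothetical stand-in for the extension class wrapping a hardware buffer."""

    def __init__(self, slot):
        self.slot = slot


# Module-level table; each entry keeps its object alive for reuse.
_slots = [None] * BUF_SIZE


def acquire(key):
    """Return the cached object for `key`, creating it only on first use."""
    index = hash(key) % BUF_SIZE
    obj = _slots[index]
    if obj is None:
        obj = DeviceBuffer(index)
        _slots[index] = obj
    return obj


if __name__ == "__main__":
    print(acquire(5) is acquire(5))
```

Keys that hash to the same slot share one object in this sketch; whether that is acceptable depends on how the buffers are used.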

Your statement

   cdef object[int] myhash

is not a Cython syntax error, but means something else:

"object is a type that exposes a PEP 3118 Py_buffer storing C integers,
and myhash is an instance of PEP 3118 Py_buffer."

A Py_buffer can only be used inside a method because of the auxiliary
variables to which it unboxes for fast array lookup. Hence the error
[...]

