Igor | 24 May 00:46

Re: code using Thrust and pyCUDA

http://dev.math.canterbury.ac.nz/home/pub/14/
http://dev.math.canterbury.ac.nz/home/pub/19/

There are some tricks that took me a while to get right. I've polished
and simplified it since then, but it's buried in some production code,
not easy to extract. See other published worksheets on that server
though.

Igor

On Thu, May 24, 2012 at 10:22 AM, Periwal, Vipul (NIH/NIDDK) [E]
<vipulp@...> wrote:
> I'd be very interested in any example of what you outlined in your email on the PyCUDA email list.
>
> Thanks,
> Vipul Periwal

Bryan Catanzaro | 23 May 18:09

Re: Compiling thrust code in pyCUDA

Right.  As you can see from the example I posted, you have to keep a
separation between host compiler code and nvcc code, which you then
link together.  There are a few things to keep in mind.

1) NVCC cannot see any boost::python.  I'm in the process of filing a
bug against boost::python, which contains some non-standard C++ that
will never be compilable by NVCC.  Consequently, you'll need to do all
the manipulation of Python objects (access, construction) in host code
compiled with the host compiler.
2) The host compiler cannot see any GPU code.  So all your calls to
Thrust, etc. should be done from the device module.  You can include
your own code and link against your own libraries with the appropriate
Codepy calls.
3) As far as templates go, I've used two main strategies.  As you
mentioned, one is to write a wrapper which instantiates the template,
and call that wrapper from the host code.  The other is to use
explicit template instantiation in the device module, and use an
extern template instantiation in the host module.  Both have worked
for me in the past.
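
The first strategy in point 3 (a wrapper that instantiates the template) can be sketched as a small source generator. This is only an illustration under stated assumptions: `make_sort_wrapper` and the generated `sort_<type>` names are hypothetical, and compiling and linking the two pieces with CodePy is not shown.

```python
# Hypothetical helper: emit a non-template wrapper around thrust::sort
# for one concrete element type. The device source is meant for the
# nvcc-compiled module; host code only needs the plain declaration.

WRAPPER_TEMPLATE = """\
#include <thrust/device_ptr.h>
#include <thrust/sort.h>

// Instantiates thrust::sort for {ctype} inside the device module.
void sort_{suffix}({ctype}* data, int n)
{{
    thrust::device_ptr<{ctype}> begin(data);
    thrust::sort(begin, begin + n);
}}
"""


def make_sort_wrapper(ctype):
    """Return (device_source, host_declaration) for one element type."""
    suffix = ctype.replace(" ", "_")
    device_source = WRAPPER_TEMPLATE.format(ctype=ctype, suffix=suffix)
    host_declaration = "void sort_{0}({1}* data, int n);".format(suffix, ctype)
    return device_source, host_declaration


device_src, host_decl = make_sort_wrapper("float")
print(host_decl)
```

Because the wrapper has an ordinary, non-template signature, the host compiler never has to see any Thrust headers; it only needs the one-line declaration.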

- bryan

On Wed, May 23, 2012 at 2:20 AM, Apostolis Glenis
<apostglen46@...> wrote:
> Really cool stuff. I guess I can have my Thrust code in a different file and
> just compile the file at runtime, correct?
> One more thing is templates. If I have a function that requires a template
> argument, do I have to write a wrapper function for initialization at runtime?
>
> Thanks again,

wood | 23 May 16:35

PyCUDA virtual memory usage

Hi everyone,

I'm working on a reasonably large piece of Python software which uses 
PyCUDA for the performance-critical section of the code.  I've been 
experiencing a memory leak, and while trying to track it down I've 
noticed that PyCUDA has a large virtual memory footprint -- somewhere in 
the ballpark of 36GB even when no arrays have yet been allocated.  Is 
this typical for PyCUDA, or is there perhaps something wrong with my 
setup?
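
One way to tell reserved address space apart from memory actually in use on Linux is to compare the `VmSize` and `VmRSS` fields of `/proc/self/status`. The following is a minimal sketch, assuming a Linux system; the helper names are hypothetical and the sample numbers are made up for illustration, not measurements from this report.

```python
# Sketch: compare reserved address space (VmSize) with resident memory
# (VmRSS) using the Linux /proc interface.


def parse_vm_fields(status_text):
    """Return {'VmSize': kB, 'VmRSS': kB} for the fields that are present."""
    fields = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in ("VmSize", "VmRSS"):
            fields[name] = int(rest.split()[0])  # value is reported in kB
    return fields


def current_vm_fields(path="/proc/self/status"):
    """Read the fields for this process; returns {} where /proc is missing."""
    try:
        with open(path) as status_file:
            return parse_vm_fields(status_file.read())
    except IOError:
        return {}


# Made-up sample text in the /proc/<pid>/status layout.
sample = "Name:\tpython\nVmSize:\t 37748736 kB\nVmRSS:\t   102400 kB\n"
print(parse_vm_fields(sample))
```

Calling `current_vm_fields()` before and after importing PyCUDA, and again inside the suspected leaking loop, would show whether resident memory grows or only the virtual size is large.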

Thanks,
Brendan Wood

Thomas Wiecki | 23 May 14:55

'function_param_set_pre_v4' is not defined

Hi,

I get:

Traceback (most recent call last):
  File "sim_drift_gpu.py", line 4, in <module>
    import pycuda.gpuarray as gpuarray
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2011.2.2-py2.7-linux-i686.egg/pycuda/gpuarray.py", line 3, in <module>
    import pycuda.elementwise as elementwise
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2011.2.2-py2.7-linux-i686.egg/pycuda/elementwise.py", line 33, in <module>
    from pycuda.tools import context_dependent_memoize
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2011.2.2-py2.7-linux-i686.egg/pycuda/tools.py", line 30, in <module>
    import pycuda.driver as cuda
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2011.2.2-py2.7-linux-i686.egg/pycuda/driver.py", line 545, in <module>
    _add_functionality()
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2011.2.2-py2.7-linux-i686.egg/pycuda/driver.py", line 525, in _add_functionality
    Function._param_set = function_param_set_pre_v4
NameError: global name 'function_param_set_pre_v4' is not defined

I think there is a typo in line 145 in driver.py when CUDA < 4.0 is used:
function_param_set -> function_param_set_pre_v4

so that it matches line 524:
        Function._param_set = function_param_set_pre_v4
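
The failure mode described above can be reproduced in isolation. This is a stand-in for the pattern, not the actual driver.py source: the class and function bodies here are placeholders, and only the two names are taken from the traceback.

```python
# Stand-in for the naming mismatch, not the real PyCUDA code.


class Function(object):
    pass


def function_param_set(func, *args):   # defined under the short name...
    return args


def _add_functionality():
    # ...but looked up under the longer name, which nothing defines yet.
    Function._param_set = function_param_set_pre_v4


try:
    _add_functionality()
except NameError as exc:
    print("fails as in the traceback: %s" % exc)

# Making both spellings agree resolves the lookup.
function_param_set_pre_v4 = function_param_set
_add_functionality()
```

The name inside `_add_functionality` is only resolved when the function runs, which is why the error appears at import time on the CUDA < 4.0 code path and not on other setups.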


Bryan Catanzaro | 23 May 00:43

Re: Compiling thrust code in pyCUDA

Sure, here's an example of how to call thrust::sort on a PyCUDA gpuarray.
https://gist.github.com/2772091

- bryan

On Tue, May 22, 2012 at 2:36 PM, Apostolis Glenis
<apostglen46@...> wrote:
> That's great. Do you have any example, preferably small, to see exactly how it
> works?
>
> Thanks
>
> Apostolis
>
> 2012/5/23 Bryan Catanzaro <bcatanzaro@...>
>>
>> I do it all the time using another of Andreas' projects: CodePy.
>>
>> - bryan
>>
>> On Tue, May 22, 2012 at 12:58 PM, Apostolis Glenis
>> <apostglen46@...> wrote:
>> > Just curious :
>> > What would it take to compile a thrust function with pyCUDA?
>> >
>> > Apostolis
>> >
>> > _______________________________________________
>> > PyCUDA mailing list
>> > PyCUDA@...

Apostolis Glenis | 22 May 21:58

Compiling thrust code in pyCUDA

Just curious:

What would it take to compile a thrust function with pyCUDA?

Apostolis

Kepler features in PyCUDA

Hi,

I wanted to ask about the status/plans of including the new CUDA
features introduced with Kepler into PyCUDA.

Kind regards,
Ludwig
-- 
:::.: Ludwig Schmidt-Hackenberg
:::.: ludwig <at> iupr.com

:::.: AG Bildverstehen und Mustererkennung
:::.: Technische Universität Kaiserslautern
:::.: www.iupr.com

_______________________________________________
PyCUDA mailing list
PyCUDA <at> tiker.net
http://lists.tiker.net/listinfo/pycuda
Thomas Wiecki | 22 May 19:55

undefined symbol: cuMemAllocPitch_v2

Hi,

I updated various packages (e.g. upgraded CUDA to 4, the most recent
PyCUDA, the Ubuntu nvidia-dev drivers) on Ubuntu 11.10. PyCUDA was working
fine before, but after rebuilding and reinstalling the newest version
I get:

ImportError: /usr/local/lib/python2.7/dist-packages/pycuda-2011.2.2-py2.7-linux-x86_64.egg/pycuda/_driver.so:
undefined symbol: cuMemAllocPitch_v2

This happens when I import pycuda.gpuarray. Any ideas?
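
One possible cause (an assumption, not confirmed in this thread) is that the driver library found at run time is older than the headers the extension was built against, so it does not export the `_v2` entry point. A minimal sketch for checking that with `ctypes` follows; `has_symbol` is a hypothetical helper and `"libcuda.so"` is an assumed library name that may differ on a given system.

```python
import ctypes


def has_symbol(library, symbol):
    """True/False if the library loads; None if it cannot be loaded."""
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        return None
    # ctypes resolves exported names lazily on attribute access and
    # raises AttributeError for names the library does not export.
    return hasattr(lib, symbol)


# "libcuda.so" is an assumed name; adjust it for the local installation.
print(has_symbol("libcuda.so", "cuMemAllocPitch_v2"))
```

A result of `False` would point at a driver that predates the toolkit used for the build; `None` means the library could not be loaded at all from the default search path.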

Thomas

Serge Rey | 18 May 01:28

pycuda install problem

hi all,

we have been trying, unsuccessfully, to get pycuda installed on a new
mac pro with a quadro fx 4800. we have the nvidia toolkit and drivers
installed correctly and can build the test binaries just fine.

our error/log messages are below - if anyone has suggestions we would
be most appreciative.

thanks in advance.

-- 
Sergio (Serge) Rey
Professor, School of Geographical Sciences and Urban Planning
GeoDa Center for Geospatial Analysis and Computation
Arizona State University
http://geoplan.asu.edu/rey

Editor, International Regional Science Review
http://irx.sagepub.com

 pycuda-2011.2.2  python setup.py build
running build
running build_py
creating build
creating build/lib.macosx-10.5-x86_64-2.7
creating build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/__init__.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/_cluda.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/_mymako.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/autoinit.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/characterize.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/compiler.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/cumath.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/curandom.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/debug.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/driver.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/elementwise.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/gpuarray.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/reduction.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/scan.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
copying pycuda/tools.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda
creating build/lib.macosx-10.5-x86_64-2.7/pycuda/gl
copying pycuda/gl/__init__.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/gl
copying pycuda/gl/autoinit.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/gl
creating build/lib.macosx-10.5-x86_64-2.7/pycuda/sparse
copying pycuda/sparse/__init__.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/sparse
copying pycuda/sparse/cg.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/sparse
copying pycuda/sparse/coordinate.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/sparse
copying pycuda/sparse/inner.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/sparse
copying pycuda/sparse/operator.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/sparse
copying pycuda/sparse/packeted.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/sparse
copying pycuda/sparse/pkt_build.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/sparse
creating build/lib.macosx-10.5-x86_64-2.7/pycuda/compyte
copying pycuda/compyte/__init__.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/compyte
copying pycuda/compyte/array.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/compyte
copying pycuda/compyte/dtypes.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/compyte
copying pycuda/compyte/scan.py -> build/lib.macosx-10.5-x86_64-2.7/pycuda/compyte
running build_ext
building '_driver' extension
creating build/temp.macosx-10.5-x86_64-2.7
creating build/temp.macosx-10.5-x86_64-2.7/src
creating build/temp.macosx-10.5-x86_64-2.7/src/cpp
creating build/temp.macosx-10.5-x86_64-2.7/src/wrapper
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/python
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/python/src
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/python/src/converter
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/python/src/object
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/smart_ptr
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/smart_ptr/src
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/thread
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/thread/src
creating build/temp.macosx-10.5-x86_64-2.7/bpl-subset/bpl_subset/libs/thread/src/pthread
/usr/bin/clang -fno-strict-aliasing -fno-common -dynamic -fwrapv -Wall
-g -DPYGPU_PACKAGE=pycuda -DHAVE_CURAND=1 -DBOOST_PYTHON_SOURCE=1
-DPYGPU_PYCUDA=1 -DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION=1
-Dboost=pycudaboost -Isrc/cpp -Ibpl-subset/bpl_subset
-I/usr/local/cuda/include
-I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/include
-I/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7
-c src/cpp/cuda.cpp -o
build/temp.macosx-10.5-x86_64-2.7/src/cpp/cuda.o -arch x86_64 -arch
i386 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.6.sdk
In file included from src/cpp/cuda.cpp:1:
In file included from src/cpp/cuda.hpp:29:
In file included from bpl-subset/bpl_subset/boost/foreach.hpp:78:
In file included from bpl-subset/bpl_subset/boost/range/rend.hpp:19:
In file included from bpl-subset/bpl_subset/boost/range/reverse_iterator.hpp:20:
In file included from
bpl-subset/bpl_subset/boost/iterator/reverse_iterator.hpp:12:
In file included from
bpl-subset/bpl_subset/boost/iterator/iterator_adaptor.hpp:15:
In file included from
bpl-subset/bpl_subset/boost/iterator/iterator_facade.hpp:26:
bpl-subset/bpl_subset/boost/type_traits/is_pod.hpp:45:35: error: 'T'
does not refer to a
      value
            BOOST_INTERNAL_IS_POD(T)
                                  ^
bpl-subset/bpl_subset/boost/type_traits/is_pod.hpp:26:47: note:
expanded from macro
      'BOOST_INTERNAL_IS_POD'
#define BOOST_INTERNAL_IS_POD(T) BOOST_IS_POD(T)
                                              ^
bpl-subset/bpl_subset/boost/type_traits/intrinsics.hpp:139:37: note:
expanded from macro
      'BOOST_IS_POD'
#   define BOOST_IS_POD(T) __is_pod(T)
                                    ^
bpl-subset/bpl_subset/boost/config/suffix.hpp:424:72: note: expanded from macro
      'BOOST_STATIC_CONSTANT'
#     define BOOST_STATIC_CONSTANT(type, assignment) static const type
assignment
                                                                       ^
bpl-subset/bpl_subset/boost/type_traits/is_pod.hpp:38:20: note: declared here
template <typename T> struct is_pod_impl
                   ^
bpl-subset/bpl_subset/boost/type_traits/is_pod.hpp:46:13: error: type
name requires a
      specifier or qualifier
         >::value));
            ^
bpl-subset/bpl_subset/boost/config/suffix.hpp:424:72: note: expanded from macro
      'BOOST_STATIC_CONSTANT'
#     define BOOST_STATIC_CONSTANT(type, assignment) static const type
assignment
                                                                       ^
In file included from src/cpp/cuda.cpp:1:
In file included from src/cpp/cuda.hpp:29:
In file included from bpl-subset/bpl_subset/boost/foreach.hpp:78:
In file included from bpl-subset/bpl_subset/boost/range/rend.hpp:19:
In file included from bpl-subset/bpl_subset/boost/range/reverse_iterator.hpp:20:
In file included from
bpl-subset/bpl_subset/boost/iterator/reverse_iterator.hpp:12:
In file included from
bpl-subset/bpl_subset/boost/iterator/iterator_adaptor.hpp:15:
In file included from
bpl-subset/bpl_subset/boost/iterator/iterator_facade.hpp:26:
bpl-subset/bpl_subset/boost/type_traits/is_pod.hpp:46:13: error: expected ')'
         >::value));
            ^
bpl-subset/bpl_subset/boost/config/suffix.hpp:424:72: note: expanded from macro
      'BOOST_STATIC_CONSTANT'
#     define BOOST_STATIC_CONSTANT(type, assignment) static const type
assignment
                                                                       ^
bpl-subset/bpl_subset/boost/type_traits/is_pod.hpp:42:9: note: to match this '('
        (::boost::type_traits::ice_or<
        ^
bpl-subset/bpl_subset/boost/config/suffix.hpp:424:72: note: expanded from macro
      'BOOST_STATIC_CONSTANT'
#     define BOOST_STATIC_CONSTANT(type, assignment) static const type
assignment
                                                                       ^
In file included from src/cpp/cuda.cpp:1:
In file included from src/cpp/cuda.hpp:29:
In file included from bpl-subset/bpl_subset/boost/foreach.hpp:78:
In file included from bpl-subset/bpl_subset/boost/range/rend.hpp:19:
In file included from bpl-subset/bpl_subset/boost/range/reverse_iterator.hpp:20:
In file included from
bpl-subset/bpl_subset/boost/iterator/reverse_iterator.hpp:12:
In file included from
bpl-subset/bpl_subset/boost/iterator/iterator_adaptor.hpp:15:
In file included from
bpl-subset/bpl_subset/boost/iterator/iterator_facade.hpp:26:
bpl-subset/bpl_subset/boost/type_traits/is_pod.hpp:46:20: error:
expected expression
         >::value));
                   ^
In file included from src/cpp/cuda.cpp:1:
In file included from src/cpp/cuda.hpp:34:
In file included from bpl-subset/bpl_subset/boost/python.hpp:11:
In file included from bpl-subset/bpl_subset/boost/python/args.hpp:8:
In file included from bpl-subset/bpl_subset/boost/python/detail/prefix.hpp:13:
In file included from
bpl-subset/bpl_subset/boost/python/detail/wrap_python.hpp:142:
In file included from
/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7/Python.h:126:
/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7/modsupport.h:27:65:
warning:
      'format' attribute argument not supported: PyArg_ParseTuple
PyAPI_FUNC(int) PyArg_ParseTuple(PyObject *, const char *, ...)
Py_FORMAT_PARSETUPLE(...
                                                                ^
/usr/local/Cellar/python/2.7.3/Frameworks/Python.framework/Versions/2.7/include/python2.7/pyport.h:871:57:
note:
      expanded from macro 'Py_FORMAT_PARSETUPLE'
#define Py_FORMAT_PARSETUPLE(func,p1,p2) __attribute__((format(func,p1,p2)))
                                                        ^
1 warning and 4 errors generated.
error: command '/usr/bin/clang' failed with exit status 1
➜  pycuda-2011.2.2



Eli Stevens (Gmail | 15 May 00:20

LaunchError: cuModuleLoadDataEx failed: launch failed -

I've seen this error a few times, but it's not reproducible.  Can
anyone give any insight into what might be going wrong?

Traceback (most recent call last):
  File "/home/elis/edit/work/dev/mms/common/util/threads.py", line 219, in run
    mod = cudahelper.compileSourceModule(kernel.code_str, kernel.buildOptions_list)
  File "/home/elis/edit/work/dev/mms/common/util/cudahelper.py", line 551, in compileSourceModule
    return SourceModule(const_src + textwrap.dedent(source), options=list(default_build_options | set(build_options)), nvcc=os.path.join(os.getenv('CUDA_HOME', '/usr/local/cuda'), 'bin', 'nvcc'), **kwargs)
  File "/home/elis/venv/dev/local/lib/python2.7/site-packages/pycuda-2011.2.2-py2.7-linux-x86_64.egg/pycuda/compiler.py", line 286, in __init__
    self.module = module_from_buffer(cubin)
LaunchError: cuModuleLoadDataEx failed: launch failed -

This happened in some (but not all, I don't think) of the threads that
got launched at roughly the same time.

A quick google search didn't turn up anything relevant.  Since I can't
reproduce it, I haven't had much traction on debugging it.  Is the
SourceModule threadsafe?  We're using it from a number of python
threads ATM, so if it's not threadsafe that could explain it.  Any
clues would be much appreciated.  :)

Thanks,
Eli

