Per Tunedal | 20 Mar 09:45 2015

Installation on Windows

Hi,
how do I install Numpy on Windows? I've tried the setup.py file, but get
an error message:

setup.py install

gives:
No module named msvccompiler in numpy.distutils; trying from distutils
error: Unable to find vcvarsall.bat

Yours,
Per Tunedal
Jianhong Wang | 20 Mar 03:31 2015

how to optimize numpy code for Markovian path

Below is a Python function to generate a Markov path (the travelling salesman problem).

def generate_travel_path(markov_matrix, n):
    assert markov_matrix.shape[0] == markov_matrix.shape[1]
    assert n <= markov_matrix.shape[0]

    p = markov_matrix.copy()
    path = [0] * n
    for k in range(1, n):
        k1 = path[k-1]
        row_sums = 1 / (1 - p[:, k1])
        p *= row_sums[:, np.newaxis]
        p[:, k1] = 0
        path[k] = np.random.multinomial(1, p[k1, :]).argmax()

    assert len(set(path)) == n
    return path

markov_matrix is a predefined Markov transition matrix. The code generates a path that starts from node zero and visits every node once based on this matrix.

However, I feel the function is quite slow. Below is the line-by-line profile with a 53x53 markov_matrix:

Timer unit: 3.49943e-07 s
Total time: 0.00551195 s
File: <ipython-input-29-37e4c9b5469e>
Function: generate_travel_path at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def generate_travel_path(markov_matrix, n):
     2         1           31     31.0      0.2      assert markov_matrix.shape[0] == markov_matrix.shape[1]
     3         1           12     12.0      0.1      assert n <= markov_matrix.shape[0]
     4
     5         1           99     99.0      0.6      p = markov_matrix.copy()
     6         1           12     12.0      0.1      path = [0] * n
     7        53          416      7.8      2.6      for k in range(1, n):
     8        52          299      5.8      1.9          k1 = path[k-1]
     9        52         3677     70.7     23.3          row_sums = 1 / (1 - p[:, k1])
    10        52         4811     92.5     30.5          p = p * row_sums[:, np.newaxis]
    11        52         1449     27.9      9.2          p[:, k1] = 0
    12        52         4890     94.0     31.0          path[k] = np.random.multinomial(1, p[k1, :]).argmax()
    13
    14         1           51     51.0      0.3      assert len(set(path)) == n
    15         1            4      4.0      0.0      return path

If I run this function 25000 times, it takes more than 125 seconds. Is there any headroom to improve the speed?

Below is a simple function to generate a Markov matrix.

def initial_trans_matrix(n):
    x = np.ones((n, n)) / (n - 1)
    np.fill_diagonal(x, 0.0)
    return x
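
One possible direction, offered as a sketch and not as a measured result: the profile shows most of the time going into rescaling the whole matrix at every step, yet only the row of the current node is ever sampled. Removing the visited columns one at a time and renormalizing gives the same distribution as taking the original row, zeroing the visited entries, and dividing by what remains. The function name `generate_travel_path_fast`, the `rng` parameter, and the use of `np.random.default_rng` (NumPy 1.17 or later) are assumptions introduced for this illustration. It also assumes every row keeps some positive weight on the unvisited nodes.

```python
import numpy as np


def generate_travel_path_fast(markov_matrix, n, rng=None):
    """Sample a path from node 0 that visits n distinct nodes (hypothetical helper)."""
    size = markov_matrix.shape[0]
    assert markov_matrix.shape[0] == markov_matrix.shape[1]
    assert n <= size
    # Assumption: np.random.default_rng is available (NumPy >= 1.17).
    rng = np.random.default_rng() if rng is None else rng

    unvisited = np.ones(size, dtype=bool)
    unvisited[0] = False
    path = [0] * n
    for k in range(1, n):
        # Row of the current node, restricted to nodes not yet visited.
        weights = np.where(unvisited, markov_matrix[path[k - 1]], 0.0)
        cdf = np.cumsum(weights)
        # Inverse-CDF sampling; scaling by cdf[-1] replaces the renormalization.
        u = rng.random() * cdf[-1]
        nxt = int(np.searchsorted(cdf, u, side="right"))
        path[k] = nxt
        unvisited[nxt] = False
    return path


# Small usage example with a uniform off-diagonal transition matrix.
m = np.full((5, 5), 0.25)
np.fill_diagonal(m, 0.0)
example_path = generate_travel_path_fast(m, 5, rng=np.random.default_rng(0))
```

Each step here touches one row of length n instead of the full n x n matrix, so the work per step drops from O(n^2) to O(n). Whether that translates into a worthwhile speedup for a 53x53 matrix would need to be profiled.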
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion <at> scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Saprative Jana | 19 Mar 07:25 2015

Improve Numpy Datetime Functionality for Gsoc

hi,
    I am Saprative, and I am new to NumPy development. I would like to work on the GSoC project to improve NumPy's datetime functionality, and I want to start by fixing some related bugs to learn the basics. Since there is no IRC channel for NumPy, I am having trouble contacting the mentors; moreover, no mentors are listed for this project. If anybody can help me out, please contact me.
from,
Saprative Jana
(Mob: +919477325233)
Marcos . | 19 Mar 02:02 2015

Porting C to Python

Dear colleagues,

My name is Marcos Chaves, I'm an undergraduate student from Brazil. I study at the State University of Campinas, currently at the end of my graduation. 

I'm deeply interested in applying to GSOC this year, specifically to work with NumPy.

A few months ago, my friends at work and I started porting some legacy C/C++ code to Python, code that we used to evaluate images with OpenCV. We made use of NumPy and I liked it. I believe I can be of some help in making it better, particularly with the issue of porting parts of it from C (while maintaining optimal performance).

I've never contributed to NumPy in any way, and I see that it's a requirement for applying to such a task. I would like to know if there's a way I can participate: the end of student registration is coming soon and I'm not sure there is enough time for me to make a contribution first, but I'm interested.

Some input from the potential mentor, Nathaniel Smith, along with some extra info on this task would be very nice for me.

Thank you!
James A. Bednar | 17 Mar 19:37 2015

ANN: HoloViews 1.0 released

We are pleased to announce the first public release of HoloViews, a
Python package for scientific and engineering data visualization:

   http://ioam.github.io/holoviews

HoloViews provides composable, sliceable, declarative data structures
for building even complex visualizations easily.

It's designed to exploit the rich ecosystem of scientific Python tools
already available, using Numpy for data storage, matplotlib and mpld3
as plotting backends, and integrating fully with IPython Notebook to
make your data instantly visible.

If you look at the website for just about any other visualization
package, you'll see a long list of pretty pictures, each one of which
has a page or two of code putting it together.  There are pretty
pictures in HoloViews too, but there is *no* hidden code -- *all* of
the steps needed to build a given figure are shown right before the
HoloViews plot, with just a few lines needed for nearly all of our
examples, even complex multi-figure subplots and animations.  This
concise but flexible specification makes it practical to explore and
analyze your data interactively, while leaving a full record for later
reproducibility in the notebook.

It may sound like magic, but it's not -- HoloViews simply lets you
annotate your data with appropriate metadata, and then the data can
display itself!  HoloViews provides a set of general, compositional,
multidimensional data structures suitable for both discrete and
continuous real-world data, and pairs them with separate customizable
plotting classes to visualize them without extensive coding.  A
large collection of continuously tested IPython Notebook tutorials
accompanies HoloViews, showing you precisely the small number of steps
required to generate any of the plots.

Some of the most important features:

- Freely available under a BSD license
- Python 2 and 3 compatible
- Minimal external dependencies -- easy to integrate into your workflow
- Builds figures by slicing, sampling, and composing your data
- Builds web-embeddable animations without any extra coding
- Easily customizable without obscuring the underlying data objects
- Includes interfaces to pandas and Seaborn
- Winner of the 2015 UK Open Source Award

For the rest, check out ioam.github.io/holoviews!

Jean-Luc Stevens
Philipp Rudiger
James A. Bednar

The University of Edinburgh
School of Informatics

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Shubhankar Mohapatra | 17 Mar 19:00 2015

Mathematical functions in Numpy

Hello all,
I am an undergraduate, and I am planning a GSoC project on NumPy this year. The project is about integrating the vector math libraries SLEEF and Yeppp into NumPy to make the mathematical functions faster. I have already studied the new libraries, but I am unable to find the sin and cos function definitions in the NumPy source code. Can someone please help me find these functions in the source code so that I can integrate the new libraries into NumPy?
Thanking you,
Shubhankar Mohapatra
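
As a starting point for that search, it may help to note that `np.sin` and `np.cos` are ufunc objects and not ordinary Python functions, which is why there is no Python-level `def sin` to find. The short sketch below only inspects them from Python; it does not show where the compiled loops live in the source tree.

```python
import numpy as np

# np.sin and np.cos are instances of the ufunc type, not plain functions.
sin_is_ufunc = isinstance(np.sin, np.ufunc)
cos_is_ufunc = isinstance(np.cos, np.ufunc)

# Each ufunc lists the type signatures of its registered inner loops,
# for example 'd->d' for float64 input and float64 output.
sin_signatures = np.sin.types
print(type(np.sin), sin_signatures)
```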

Allan Haldane | 17 Mar 17:45 2015

should views into structured arrays be reversible?

Hello all,

I've introduced PR 5548 <https://github.com/numpy/numpy/pull/5548>
which, through more careful safety checks, allows views of object
arrays. However, I had to make 'partial views' into structured arrays
irreversible, and I want to check with the list that that's ok.

With the PR, if you only view certain fields of an array you cannot take
a 'reverse' view of the resulting object to get back the original array:

    >>> arr = np.array([(1,2),(4,5)], dtype=[('A', 'i'), ('B', 'i')])
    >>> varr = arr.view({'names': ['A'], 'formats': ['i'],
    ...                  'itemsize': arr.dtype.itemsize})
    >>> varr.view(arr.dtype)
    TypeError: view would access data parent array doesn't own

I.e., with this PR you can only take views into parts of an array that
have fields.

This was necessary in order to guarantee that we never interpret memory
containing a Python object as another type, which could cause a
segfault. I have a more extensive discussion & motivation in the PR,
including an alternative idea.

So does this limitation seem reasonable?

Cheers,
Allan
	
Dieter Van Eessen | 17 Mar 09:11 2015

Re: 3D array and the right hand rule

Hello,

Sorry to disturb again, but the topic still bugs me somehow...
I'll try to rephrase the question:

- What's the influence of the type of N-array representation with respect to TENSOR-calculus?
- Are multiple representations possible?
- I assume that the order of the dimensions plays a major role in for example TENSOR product.
Is this assumption correct?

As I said before, my math skills are lacking in this area...
I hope you consider this a valid question.

kind regards,
Dieter
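
To make the last question concrete, here is a small sketch (not part of the original exchange) using `np.tensordot` with `axes=0`, which computes the outer (tensor) product. The vectors `u` and `v` are made-up illustration values. Swapping the operands swaps the order of the axes of the result, so the two products hold the same numbers but are transposes of each other.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([10.0, 20.0])

# axes=0 requests the tensor (outer) product: uv[i, j] == u[i] * v[j].
uv = np.tensordot(u, v, axes=0)   # shape (3, 2)
vu = np.tensordot(v, u, axes=0)   # shape (2, 3)

# Same entries, different axis order.
same_up_to_transpose = np.array_equal(uv, vu.T)
print(uv.shape, vu.shape, same_up_to_transpose)
```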


On Fri, Jan 30, 2015 at 2:32 AM, Alexander Belopolsky <ndarray <at> mac.com> wrote:

On Mon, Jan 26, 2015 at 6:06 AM, Dieter Van Eessen <dieter.van.eessen <at> gmail.com> wrote:
I've read that numpy.array isn't arranged according to the 'right-hand-rule' (right-hand-rule => thumb = +x; index finger = +y, bent middle finger = +z). This is also confirmed by an old message I dug up from the mailing list archives. (see message below)

Dieter,

It looks like you are confusing dimensionality of the array with the dimensionality of a vector that it might store.  If you are interested in using numpy for 3D modeling, you will likely only encounter 1-dimensional arrays (vectors) of size 3 and 2-dimensional arrays  (matrices) of size 9 or shape (3, 3).

A 3-dimensional array is a stack of matrices and the 'right-hand-rule' does not really apply.  The notion of C/F-contiguous deals with the order of axes (e.g. width first or depth first) while the right-hand-rule is about the direction of the axes (if you "flip" the middle finger right hand becomes left.)  In the case of arrays this would probably correspond to little-endian vs. big-endian: is a[0] stored at a higher or lower address than a[1].  However, whatever the answer to this question is for a particular system, it is the same for all axes in the array, so right-hand - left-hand distinction does not apply. 
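
The point about C/F-contiguity can be sketched as follows. This is an illustration added to the thread, with an arbitrary example shape and an explicit `int64` dtype chosen so the strides are the same on every platform. The two arrays index identically and differ only in which axis varies fastest in memory.

```python
import numpy as np

# A 3-dimensional array in C (row-major) order.
c_arr = np.arange(24, dtype=np.int64).reshape(2, 3, 4)
# The same values stored in Fortran (column-major) order.
f_arr = np.asfortranarray(c_arr)

# Indexing is unaffected by the memory layout.
same_values = np.array_equal(c_arr, f_arr)

# Strides (bytes to step along each axis) show the layout difference:
# the last axis is contiguous in C order, the first in Fortran order.
print(c_arr.strides)   # (96, 32, 8)
print(f_arr.strides)   # (8, 16, 48)
```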





--
gtz,
Dieter VE
Dave Hirschfeld | 16 Mar 16:53 2015

Fastest way to compute summary statistics for a specific axis

I have a number of large arrays for which I want to compute the mean and 
standard deviation over a particular axis - e.g. I want to compute the 
statistics for axis=1 as if the other axes were combined so that in the 
example below I get two values back

In [1]: a = randn(30, 2, 10000)

For the mean this can be done easily like:

In [2]: a.mean(0).mean(-1)
Out[2]: array([ 0.0007, -0.0009])

...but this won't work for the std. Using some transformations we can 
come up with something which will work for either:

In [3]: a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
Out[3]: array([ 0.0007, -0.0009])

In [4]: a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
Out[4]: array([ 0.0007, -0.0009])

If we look at the performance of these equivalent methods:

In [5]: %timeit a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
100 loops, best of 3: 14.5 ms per loop

In [6]: %timeit a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
100 loops, best of 3: 5.05 ms per loop

we can see that the latter version is a clear winner. Investigating 
further, both methods appear to copy the data so the performance is 
likely down to better cache utilisation.

In [7]: np.may_share_memory(a, a.transpose(2,0,1).reshape(-1, 2))
Out[7]: False

In [8]: np.may_share_memory(a, a.transpose(1,0,2).reshape(2, -1))
Out[8]: False

Both methods are however significantly slower than the initial attempt:

In [9]: %timeit a.mean(0).mean(-1)
1000 loops, best of 3: 1.2 ms per loop

Perhaps because it allocates a smaller temporary?

For those who like a challenge: is there a faster way to achieve what 
I'm after?

Cheers,
Dave
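
One option that may be worth timing, sketched here without any performance claim: `mean` and `std` accept a tuple of axes, which reduces over several axes at once and avoids the explicit transpose and reshape. The array below is smaller than the one in the post, and the seeded `np.random.default_rng` generator is an assumption made to keep the example reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)   # assumption: NumPy >= 1.17
a = rng.standard_normal((30, 2, 1000))

# Reduce over axes 0 and 2 together, keeping axis 1.
mean_1 = a.mean(axis=(0, 2))
std_1 = a.std(axis=(0, 2))

# Reference values using the transpose/reshape approach from the post.
ref = a.transpose(1, 0, 2).reshape(2, -1)
print(np.allclose(mean_1, ref.mean(axis=-1)), np.allclose(std_1, ref.std(axis=-1)))
```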
Stephan Hoyer | 16 Mar 06:12 2015

numpy.stack -- which function, if any, deserves the name?

In the past months there have been two proposals for new numpy functions using the name "stack":

1. np.stack for stacking like np.asarray(np.bmat(...))
http://thread.gmane.org/gmane.comp.python.numeric.general/58748/

2. np.stack for stacking along an arbitrary new axis (this was my proposal)

Both functions generalize the notion of stacking arrays from the existing hstack, vstack and dstack, but in two very different ways. Both could be useful -- but we can only call one "stack". Which one deserves that name?

The existing *stack functions use the word "stack" to refer to combining arrays in two similarly different ways:
a. For ND -> ND stacking along an existing dimension (like numpy.concatenate and proposal 1)
b. For ND -> (N+1)D stacking along new dimensions (like proposal 2).

I think it would be much cleaner API design if we had different words to denote these two different operations. Concatenate for "combine along an existing dimension" already exists, so my thought (when I wrote proposal 2), was that the verb "stack" could be reserved (going forward) for "combine along a new dimension." This also has the advantage of suggesting that "concatenate" and "stack" are the two fundamental operations for combining N-dimensional arrays. The documentation on this is currently quite confusing, mostly because no function like that in proposal 2 currently exists.

Of course, the *stack functions have existed for quite some time, and in many cases vstack and hstack are indeed used for concatenate-like functionality (e.g., whenever they are used for 2D arrays/matrices). So the case is not entirely clear-cut. (We'll never be able to remove this functionality from NumPy.)

In any case, I would appreciate your thoughts.

Best,
Stephan
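
For readers following the distinction between operations (a) and (b), a short sketch: it uses `np.stack` as it exists in NumPy 1.10 and later, where it joins arrays along a new axis in the manner of proposal 2. That is a fact about later releases, not something settled in this message. The arrays `a` and `b` are illustration values.

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

# (a) ND -> ND: combine along an existing dimension.
joined = np.concatenate([a, b], axis=0)      # shape (4, 3)

# (b) ND -> (N+1)D: combine along a new dimension.
stacked_first = np.stack([a, b], axis=0)     # shape (2, 2, 3)
stacked_last = np.stack([a, b], axis=-1)     # shape (2, 3, 2)

print(joined.shape, stacked_first.shape, stacked_last.shape)
```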
Robert McGibbon | 16 Mar 05:32 2015

Rewrite np.histogram in c?

Hi,

numpy.histogram is implemented in Python and is a little sluggish. This has been discussed previously on the mailing list [1, 2]. It came up in a project that I maintain, where a new feature is bottlenecked by numpy.histogram, and one developer suggested a faster implementation in Cython [3].

Would it make sense to reimplement this function in C or Cython? Is moving functions like this from Python to C to improve performance within the scope of NumPy's development roadmap? I have started implementing this in C [4], but I figured I should check in here first.

-Robert
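
For context on what a faster path could look like, here is a pure-NumPy sketch for the special case of equal-width bins. It is not the C or Cython implementation referred to above, and `uniform_histogram` is a hypothetical helper name. It computes each bin index arithmetically and counts with `np.bincount`. Values that sit within floating-point rounding of a bin edge may land in a different bin than with `np.histogram`.

```python
import numpy as np


def uniform_histogram(x, bins, lo, hi):
    """Count values of x in `bins` equal-width bins on [lo, hi] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Discard values outside the range; the right edge is included.
    x = x[(x >= lo) & (x <= hi)]
    # Bin index by arithmetic, with no search over bin edges.
    idx = ((x - lo) * (bins / (hi - lo))).astype(np.intp)
    # Values equal to hi belong to the last bin.
    idx = np.minimum(idx, bins - 1)
    return np.bincount(idx, minlength=bins)


data = np.array([0.0, 0.05, 0.15, 0.15, 0.95, 1.0, 1.5])
counts = uniform_histogram(data, bins=10, lo=0.0, hi=1.0)
reference, _ = np.histogram(data, bins=10, range=(0.0, 1.0))
print(counts, reference)
```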

