Colin J. Williams | 1 Mar 2005 20:08

Some comments on the Numeric3 Draft of 1-Mar-05

Abstract

The two stage approach, Array and ufuncs PEP's, seems a reasonable way to proceed.

Name

Array  would be a good choice for the new Type/Class.  Python has an array module with an array constructor and an obsolete ArrayType synonym for array.  Recognizing the case difference, Array would be a suitable choice.  The module might be multiArray.

Basic Types

These are, presumably, intended as the types of the data elements contained in an Array instance.  I would see them as sub-types of Array.  Hopefully some term more descriptive than void can be used to name bit sequences.

It might be useful to have a Table type with a header of some sort that keeps track, for each column, of the column name and the datatype in that column, so that the user could optionally specify validity checks.
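A minimal sketch of what such a Table could look like, in plain Python.  The class name, the header layout (name, type, optional validator) and the sample values are all assumptions made for illustration; nothing here is taken from the Numeric3 draft or from numarray.

```python
class Table:
    """Hypothetical column-oriented table; not part of Numeric or numarray."""

    def __init__(self, header):
        # header: list of (column_name, python_type, validator_or_None)
        self.header = list(header)
        self.rows = []

    def append(self, row):
        if len(row) != len(self.header):
            raise ValueError("expected %d values, got %d"
                             % (len(self.header), len(row)))
        checked = []
        for value, (name, dtype, validator) in zip(row, self.header):
            value = dtype(value)  # coerce to the column's declared type
            if validator is not None and not validator(value):
                raise ValueError("invalid value %r for column %r"
                                 % (value, name))
            checked.append(value)
        self.rows.append(tuple(checked))

    def column(self, name):
        # Look the column up by name in the header, then collect its values.
        index = [h[0] for h in self.header].index(name)
        return [row[index] for row in self.rows]


# Made-up sample data: the second column carries an optional validity check.
table = Table([
    ("city", str, None),
    ("temperature", float, lambda t: -90.0 <= t <= 60.0),
])
table.append(("Paris", 23))
table.append(("New York", 31.5))
```

Here the validity check is just a callable stored in the header, so a column without one simply passes None.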

I wonder why there is a need for 30 new types.  Python itself has about 30 distinct types.  Wouldn't it be more saleable to think in terms of an Array type/class with three built-in sub-classes (Boolean, Numeric and Object) and 25 or so dataElement types?  With this sort of structure, a Matrix class could be implemented as a sub-class of the Numeric sub-class.
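The structure suggested above can be sketched as a small class hierarchy.  This is only an illustration of the shape of the idea: every class name, the element_type attribute and the flat storage are assumptions, not anything the draft specifies.

```python
class Array(object):
    """Hypothetical base class: a sequence of elements of one element type."""

    def __init__(self, data, element_type, convert=None):
        self.element_type = element_type
        convert = convert or element_type
        self.data = [convert(v) for v in data]


class BooleanArray(Array):
    def __init__(self, data):
        Array.__init__(self, data, bool)


class NumericArray(Array):
    def __init__(self, data, element_type=float):
        Array.__init__(self, data, element_type)


class ObjectArray(Array):
    def __init__(self, data):
        # Objects are stored as they are, without conversion.
        Array.__init__(self, data, object, convert=lambda v: v)


class Matrix(NumericArray):
    """Two-dimensional numeric array, stored row by row."""

    def __init__(self, rows, element_type=float):
        rows = [list(r) for r in rows]
        self.shape = (len(rows), len(rows[0]) if rows else 0)
        flat = [v for r in rows for v in r]
        NumericArray.__init__(self, flat, element_type)


m = Matrix([[1, 2], [3, 4]])
```

With this layout the element type is an attribute of the array (m.element_type) rather than a separate array type per element type.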

Regarding the naming of the Numeric dataElementType, there is benefit to the numarray.numerictypes approach.

Suppose one has:
import numarray.numerictypes as _nt

Then, the editor (PythonWin for example) responds to the entry of "_nt." with a drop down menu offering the available types from which the user can select one.

I suggest that Numeric3 offers the opportunity to drop the word rank from its lexicon.  "rank" has an established usage long before digital computers.  See: http://mathworld.wolfram.com/Rank.html

Perhaps some abbreviation for "Dimensions" would be acceptable.

len() seems to be treated as a synonym for the number of dimensions.  Currently, in numarray, it follows the usual sequence of sequences approach of Python and returns the number of rows in a two dimensional array.

Rank-0 arrays and Python Scalars


Regarding Rank-0 Question 2.  I've already, in effect, answered "yes".  I'm sure that a more compelling "Pro" could be written.

The "Con" case is valid but, I suggest, of no great consequence.  In my view, the important considerations are (a) the complexity of training the newcomer and (b) whether the added work should be imposed on the generic code writer or the end user.  I suggest that the aim should be to make things as easy as possible for the end user.

The Proposed Solution appears to create complexity for little benefit.  It would probably add to the difficulty of persuading the Python folk to accept the proposal.

The Python LongType continues to remain outside the pale.  I don't see this as a big problem.

Buffer Behaviour

I gather that a buffer cannot have a hole in it.  I believe MatLab provides for the deletion of rows or columns.  I don't see this as being an important capability.

Mapping Iterator

An example could help here.  I am puzzled by "slicing syntax does not work in constructors.".

Attributes

Why not make flags an instance of a record class?  Then arr.flags.NOTSWAPPED could return a Boolean value.

Presumably there will also be a Fortran flag.  If so, I wonder why the config function needs a fortran parameter.

Methods

numarray also has a useful info() method.
dumps()?

Matrix Class

" A default Matrix class will either inherit from or contain the Python class".  Surely, almost all of the objects above are to be rooted in "new" style classes.  See PEP's 252 and 253 or http://www.python.org/2.2.2/descrintro.html

Colin W.


------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click

konrad.hinsen | 2 Mar 2005 09:01

Re: Some comments on the Numeric3 Draft of 1-Mar-05

On 01.03.2005, at 20:08, Colin J. Williams wrote:

> Basic Types
>  These are, presumably,  intended as the types of the data elements 
> contained in an Array instance.  I would see then as sub-types of 
> Array.

Element types as subtypes???

>  I wonder why there is a need for 30 new types.  Python itself has 
> about 30 distinct types.  Wouldn't it be more saleable to think in 
> terms of an Array

The Python standard library has hundreds of types, considering that the 
difference between C types and classes is an implementation detail.

>  Suppose one has:
>  import numarray.numerictypes as _nt
>
>  Then, the editor (PythonWin for example) responds to the entry of 
> "_nt." with a drop down menu offering the available types from which 
> the user can select one.

That sounds interesting, but it looks like this would require specific 
support from the editor.

>  I suggest that Numeric3 offers the opportunity to drop the word rank 
> from its lexicon.  "rank" has an established usage long before digital 
> computers.  See: http://mathworld.wolfram.com/Rank.html

The meaning of "tensor rank" comes very close and was probably the 
inspiration for the use of this terminology in array systems.

>  Perhaps some abbreviation for "Dimensions" would be acceptable.

The equivalent of "rank" is "number of dimensions", which is a bit long 
for my taste.

>  len() seems to be treated as a synonym for the number of dimensions.  
> Currently, in numarray, it follows the usual sequence of sequences 
> approach of Python and returns the number of rows in a two dimensional 
> array.

As it should. The rank is given by len(array.shape), which is pretty 
much a standard idiom in Numeric code. But I don't see any place in the 
PEP that proposes something different!
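The distinction can be shown with plain nested lists, without any array package.  The helper shape_of below is hypothetical and only stands in for the shape attribute of a real array.

```python
def shape_of(nested):
    """Return the shape of a regular nested list as a tuple (hypothetical helper)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(shape)


a = [[1, 2, 3], [4, 5, 6]]      # 2 rows, 3 columns
rows = len(a)                   # length of the first axis only
rank = len(shape_of(a))         # number of axes

v = [1.0, 2.0, 3.0]             # three elements along a single axis
v_length = len(v)
v_rank = len(shape_of(v))
```

So len() follows the sequence-of-sequences convention, while the rank is the length of the shape tuple.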

> Rank-0 arrays and Python Scalars
>
>  Regarding Rank-0 Question 2.  I've already, in effect, answered 
> "yes".  I'm sure that a more compelling "Pro" could be written

Three "pro" arguments to be added are:

- No risk of user confusion by having two types that are nearly but not
   exactly the same and whose separate existence can only be explained
   by the history of Python and NumPy development.

- No problems with code that does explicit typechecks (isinstance(x, float)
   or type(x) == types.FloatType). Although explicit typechecks are considered
   bad practice in general, there are a couple of valid reasons to use them.

- No creation of a dependency on Numeric in pickle files (though this could
   also be done by a special case in the pickling code for arrays)
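The typecheck point can be illustrated with two made-up scalar types.  Both classes below are hypothetical and exist only for this sketch; the code is written for a current Python, where the built-in float plays the role of types.FloatType.

```python
class Float64Scalar(float):
    """Hypothetical array scalar that subclasses Python's float."""


class Rank0Array(object):
    """Hypothetical rank-0 array that merely converts to float."""

    def __init__(self, value):
        self.value = value

    def __float__(self):
        return float(self.value)


def is_float_loose(x):
    return isinstance(x, float)   # accepts subclasses of float


def is_float_exact(x):
    return type(x) == float       # accepts only the built-in type itself


plain = 1.5
scalar = Float64Scalar(1.5)
rank0 = Rank0Array(1.5)
```

A true Python float passes both checks, the subclass passes only the isinstance check, and the rank-0 array passes neither even though float(rank0) works.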

>  The "Con" case is valid but, I suggest, of no great consequence.  In 
> my view, the important considerations are (a) the complexity of 
> training the newcomer and (b) whether the added work should be imposed 
> on the generic code writer or the end user.  I suggest that the aim 
> should be to make things as easy as possible for the end user.

That is indeed a valid argument.

> Mapping Iterator
>  An example could help here.  I am puzzled by "slicing syntax does not 
> work in constructors.".

Python allows the colon syntax only inside square brackets. x[a:b] and 
x[a:b:c] are fine but it is not possible to write iterator(a:b).  One 
could use iterator[a:b] instead, but this is a bit confusing, as it is 
not the iterator that is being sliced.
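A short illustration of the restriction and of the bracket workaround.  The SliceMaker class and the name make_slice are assumptions made for this sketch; only the built-in slice() and the __getitem__ protocol are standard Python.

```python
class SliceMaker(object):
    """Hypothetical helper: turns bracket syntax into slice objects."""

    def __getitem__(self, key):
        # Python hands x[a:b:c] to __getitem__ as slice(a, b, c).
        return key


make_slice = SliceMaker()

data = list(range(10))
explicit = slice(2, 8, 2)            # the built-in constructor works anywhere
from_brackets = make_slice[2:8:2]    # colon syntax is legal only inside []

selected = data[explicit]
```

Writing make_slice(2:8:2) with parentheses is a syntax error, which is the limitation described above.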

Konrad.

Robert Kern | 1 Mar 2005 21:06

Re: Some comments on the Numeric3 Draft of 1-Mar-05

Colin J. Williams wrote:

> I suggest that Numeric3 offers the opportunity to drop the word /rank/ 
> from its lexicon.  "rank" has an established usage long before digital 
> computers.  See: http://mathworld.wolfram.com/Rank.html

It also has a well-established usage with multi-arrays.

   http://mathworld.wolfram.com/TensorRank.html

> Perhaps some abbreviation for "Dimensions" would be acceptable.

It is also reasonable to say that array([1., 2., 3.]) has 3 dimensions.

>       Matrix Class
> 
> " A default Matrix class will either inherit from or contain the Python 
> class".  Surely, almost all of the objects above are to be rooted in 
> "new" style classes.  See PEP's 252 and 253 or 
> http://www.python.org/2.2.2/descrintro.html

Sure, but just because inheritance is possible does not entail that it 
is a good idea.

--

-- 
Robert Kern
rkern <at> ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter

Sebastian Haase | 1 Mar 2005 18:42

bug in pyfits w/ numarray 1.2

Hi,
After upgrading to the latest numarray we get this error from pyfits:
>>> a = U.loadFits(fn)
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "/jws30/haase/PrLin/Priithon/useful.py", line 1069, in loadFits
    return ff[ slot ].data
  File "/jws30/haase/PrLin/pyfits.py", line 1874, in __getattr__
    raw_data = num.fromfile(self._file, type=code, shape=dims)
  File "/jws30/haase/PrLin0/numarray/numarraycore.py", line 517, in fromfile
    bytesleft=type.bytes*_gen.product(shape)
AttributeError: 'str' object has no attribute 'bytes'
>>>pyfits.__version__
'0.9.3 (June 30, 2004)'

Looks like pyfits uses a typecode-string 'code'
in this line 1874:
raw_data = num.fromfile(self._file, type=code, shape=dims)

Is this supposed to still work in numarray?  Or should pyfits be updated?
I tried  num.fromfile(self._file, typecode=code, shape=dims)
but 'typecode' doesn't seem an allowed keyword for fromfile()
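The failure mode in the traceback can be reproduced in isolation.  The sketch below is not numarray's code: ElementType, TYPECODES, bytes_needed and as_element_type are hypothetical stand-ins, and the two typecodes in the table are only examples.

```python
class ElementType(object):
    """Hypothetical stand-in for an element type that knows its size."""

    def __init__(self, name, nbytes):
        self.name = name
        self.bytes = nbytes


# Hypothetical typecode table; a real package keeps its own mapping.
TYPECODES = {
    "f": ElementType("Float32", 4),
    "d": ElementType("Float64", 8),
}


def bytes_needed(elem_type, shape):
    """Mimics the failing expression: type.bytes * product(shape)."""
    count = 1
    for n in shape:
        count *= n
    return elem_type.bytes * count


def as_element_type(type_or_code):
    """Accept either a typecode string or an element type object."""
    if isinstance(type_or_code, str):
        return TYPECODES[type_or_code]
    return type_or_code


size = bytes_needed(as_element_type("d"), (2, 3))
```

Passing the string "d" straight to bytes_needed raises the same kind of AttributeError as in the traceback, because a str has no bytes attribute; converting the code to a type object first avoids it.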

Thanks,
Sebastian Haase

Colin J. Williams | 2 Mar 2005 18:21

Re: Some comments on the Numeric3 Draft of 1-Mar-05

konrad.hinsen <at> laposte.net wrote:

> On 01.03.2005, at 20:08, Colin J. Williams wrote:
>
>> Basic Types
>>  These are, presumably,  intended as the types of the data elements 
>> contained in an Array instance.  I would see then as sub-types of Array.
>
>
> Element types as subtypes???

Sub-types in the sense that, given an instance a of Array, a.elementType 
gives us the type of the data elements contained in a.

>
>>  I wonder why there is a need for 30 new types.  Python itself has 
>> about 30 distinct types.  Wouldn't it be more saleable to think in 
>> terms of an Array
>
>
> The Python standard library has hundreds of types, considering that 
> the difference between C types and classes is an implementation detail.
>
I was thinking of the objects in the types module.

>>  Suppose one has:
>>  import numarray.numerictypes as _nt
>>
>>  Then, the editor (PythonWin for example) responds to the entry of 
>> "_nt." with a drop down menu offering the available types from which 
>> the user can select one.
>
>
> That sounds interesting, but it looks like this would require specific 
> support from the editor.
>
Yes, it is built into Mark Hammond's PythonWin and is a valuable tool.  
Unfortunately, it is not available for Linux.  However, I believe that 
SciTE and boa-constructor are intended to have the "completion" 
facility.  These open source projects are available both with Linux and 
Windows.

>>  I suggest that Numeric3 offers the opportunity to drop the word rank 
>> from its lexicon.  "rank" has an established usage long before 
>> digital computers.  See: http://mathworld.wolfram.com/Rank.html
>
>
> The meaning of "tensor rank" comes very close and was probably the 
> inspiration for the use of this terminology in array system.

Yes: The total number of contravariant 
<http://mathworld.wolfram.com/ContravariantTensor.html> and covariant 
<http://mathworld.wolfram.com/CovariantTensor.html> indices of a tensor 
<http://mathworld.wolfram.com/Tensor.html>. The rank of a tensor 
<http://mathworld.wolfram.com/Tensor.html> is independent of the number 
of dimensions <http://mathworld.wolfram.com/Dimension.html> of the space 
<http://mathworld.wolfram.com/Space.html>.

I was thinking in terms of linear independence, as with Matrix Rank: The 
rank of a matrix <http://mathworld.wolfram.com/Matrix.html> or a linear 
map <http://mathworld.wolfram.com/LinearMap.html> is the dimension 
<http://mathworld.wolfram.com/Dimension.html> of the range 
<http://mathworld.wolfram.com/Range.html> of the matrix 
<http://mathworld.wolfram.com/Matrix.html> or the linear map 
<http://mathworld.wolfram.com/LinearMap.html>, corresponding to the 
number of linearly independent 
<http://mathworld.wolfram.com/LinearlyIndependent.html> rows or columns 
of the matrix, or to the number of nonzero singular values 
<http://mathworld.wolfram.com/SingularValue.html> of the map.

I guess there has been a tussle between the tensor users and the matrix 
users for some time.

>
>>  Perhaps some abbreviation for "Dimensions" would be acceptable.
>
>
> The equivalent of "rank" is "number of dimensions", which is a bit 
> long for my taste.

Perhaps nDim, numDim or dim would be acceptable.

>
>>  len() seems to be treated as a synonym for the number of 
>> dimensions.  Currently, in numarray, it follows the usual sequence of 
>> sequences approach of Python and returns the number of rows in a two 
>> dimensional array.
>
>
> As it should. The rank is given by len(array.shape), which is pretty 
> much a standard idiom in Numeric code. But I don't see any place in 
> the PEP that proposes something different!

This was probably my misreading of len(T).

>
>> Rank-0 arrays and Python Scalars
>>
>>  Regarding Rank-0 Question 2.  I've already, in effect, answered 
>> "yes".  I'm sure that a more compelling "Pro" could be written
>
>
> Three "pro" argument to be added are:
>
> - No risk of user confusion by having two types that are nearly but not
>   exactly the same and whose separate existence can only be explained
>   by the history of Python and NumPy development.

Thanks, history has a pull in favour of retaining the current approach.

>
> - No problems with code that does explicit typechecks (isinstance(x, float)
>   or type(x) == types.FloatType). Although explicit typechecks are considered
>   bad practice in general, there are a couple of valid reasons to use them.
>
I would see this as supporting the conversion to a scalar.  For example:

     >>> type(type(x))
    <type 'type'>
     >>> isinstance(x, float)
    True
     >>> isinstance(x, types.FloatType)
    True
     >>>

> - No creation of a dependency on Numeric in pickle files (though this could
>   also be done by a special case in the pickling code for arrays)
>
>>  The "Con" case is valid but, I suggest, of no great consequence.  In 
>> my view, the important considerations are (a) the complexity of 
>> training the newcomer and (b) whether the added work should be 
>> imposed on the generic code writer or the end user.  I suggest that 
>> the aim should be to make things as easy as possible for the end user.
>
>
> That is indeed a valid argument.
>
>> Mapping Iterator
>>  An example could help here.  I am puzzled by "slicing syntax does 
>> not work in constructors.".
>
>
> Python allows the colon syntax only inside square brackets. x[a:b] and 
> x[a:b:c] are fine but it is not possible to write iterator(a:b).  One 
> could use iterator[a:b] instead, but this is a bit confusing, as it is 
> not the iterator that is being sliced.

Thanks.  It would be nice if a:b or a:b:c could return a slice object.

>
> Konrad.
>
Colin W.

Stephen Walton | 2 Mar 2005 18:23

Re: bug in pyfits w/ numarray 1.2

Sebastian Haase wrote:

>Hi,
>After upgrading to the latest numarray we get this error from pyfits:
>
>>>> a = U.loadFits(fn)
>Traceback (most recent call last):
>  File "<input>", line 1, in ?
>  File "/jws30/haase/PrLin/Priithon/useful.py", line 1069, in loadFits
>    return ff[ slot ].data
>
Are you sure the values of 'slot' and 'ff' in your code are correct?  
pyfits 0.9.3 and numarray 1.2.2 seem to work fine for me:

In [5]: f=pyfits.open(file)

In [6]: v=f[0].data

In [7]: v?
Type:           NumArray
Base Class:     <class 'numarray.numarraycore.NumArray'>
String Form:
[[ 221  171   67 ...,  112 -136   12]
            [ 125   78  159 ...,  249 -345 -260]
            [ 346   47  250 ..., <...> ...,  206 -106 -127]
            [ 187   16  218 ...,  342 -243  -59]
            [ 156  200  279 ...,  138 -209 -230]]
Namespace:      Interactive
Length:         1024
Docstring:
    Fundamental Numeric Array

    type       The type of each data element, e.g. Int32
    byteorder  The actual ordering of bytes in buffer: "big" or "little".

In [8]: pyfits.__version__
Out[8]: '0.9.3 (June 30, 2004)'

In [9]: numarray.__version__
Out[9]: '1.2.2'

Bruce Southey | 2 Mar 2005 21:13

Re: Some comments on the Numeric3 Draft of 1-Mar-05

Hi, 

>>>  I suggest that Numeric3 offers the opportunity to drop the word rank  
>>> from its lexicon.  "rank" has an established usage long before  
>>> digital computers.  See: http://mathworld.wolfram.com/Rank.html 
>> 
>> 
>> The meaning of "tensor rank" comes very close and was probably the  
>> inspiration for the use of this terminology in array system. 
> 
>Yes: The total number of contravariant  
><http://mathworld.wolfram.com/ContravariantTensor.html> and covariant  
><http://mathworld.wolfram.com/CovariantTensor.html> indices of a tensor  
><http://mathworld.wolfram.com/Tensor.html>. The rank of a tensor  
><http://mathworld.wolfram.com/Tensor.html> is independent of the number  
>of dimensions <http://mathworld.wolfram.com/Dimension.html> of the space  
><http://mathworld.wolfram.com/Space.html>. 
> 
>I was thinking in terms of linear independence, as with Matrix Rank: The  
>rank of a matrix <http://mathworld.wolfram.com/Matrix.html> or a linear  
>map <http://mathworld.wolfram.com/LinearMap.html> is the dimension  
><http://mathworld.wolfram.com/Dimension.html> of the range  
><http://mathworld.wolfram.com/Range.html> of the matrix  
><http://mathworld.wolfram.com/Matrix.html> or the linear map  
><http://mathworld.wolfram.com/LinearMap.html>, corresponding to the  
>number of linearly independent  
><http://mathworld.wolfram.com/LinearlyIndependent.html> rows or columns  
>of the matrix, or to the number of nonzero singular values  
><http://mathworld.wolfram.com/SingularValue.html> of the map. 
> 
>I guess there has been a tussle between the tensor users and the matrix  
>users for some time. 
> 

If you come from linear algebra, rank is the dimension of the column or row 
space, which is not the current usage in numarray, although it is the Matlab 
usage. The matrix rank doesn't exist in numarray (as such, but it can be 
computed), so the only problem is remembering what rank provides and avoiding 
it in numarray.  
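To make the two meanings concrete, here is a small sketch that computes the linear-algebra rank of a matrix by Gaussian elimination in plain Python.  The function name matrix_rank is an assumption for this example, and exact Fraction arithmetic is used only to avoid tolerance questions; a numerical package would normally work with floating point and singular values instead.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix given as a list of rows, by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a pivot row at or below the rows already reduced.
        pivot = None
        for r in range(rank, n_rows):
            if m[r][col] != 0:
                pivot = r
                break
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the column entries below the pivot.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


singular = [[1, 2], [2, 4]]     # second row is twice the first
identity = [[1, 0], [0, 1]]

singular_rank = matrix_rank(singular)
identity_rank = matrix_rank(identity)
```

Both matrices are two-dimensional, so their "rank" in the array sense is 2, while their matrix ranks differ.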

>> 
>>>  Perhaps some abbreviation for "Dimensions" would be acceptable. 
>> 
>> 
>> The equivalent of "rank" is "number of dimensions", which is a bit  
>> long for my taste. 
> 
>Perhaps nDim, numDim or dim would be acceptable. 
> 

There needs to be a clarification that by dimensions, one does not mean the 
number of rows and columns etc. However, taking directly from the numarray 
manual: 
"The rank of an array A is always equal to len(A.getshape())."  

So I would guess the best solution is to find out how people actually use the 
term 'rank' in Numerical Python applications.  

Regards 
Bruce 

Garnet Chan | 2 Mar 2005 22:10

PyObject arrays

Hi All,
Do PyObject arrays work, more specifically Numeric arrays of Numeric arrays?
I've tried:

from Numeric import *
mat = zeros([2, 2], PyObject)
mat[0, 0] = zeros([2, 2])

which gives

ValueError: array too large for destination. It seems to be calling
PyArray_CopyObject; I noticed that there was some special code to make
arrays of strings work, but not for other objects.

This is on Python 2.3.4 and Numeric 23.3

thanks,
Garnet Chan

Travis Oliphant | 2 Mar 2005 23:19

Re: PyObject arrays

Garnet Chan wrote:

>Hi All,
>Do PyObject arrays works, more specifically Numeric arrays of Numeric arrays?
>  
>
They probably don't work when the objects are Numeric arrays.    It 
would be nice if they did, but this could take some effort.

-Travis

Sebastien.deMentendeHorne | 3 Mar 2005 00:23

RE: Some comments on the Numeric3 Draft of 1-Mar-05


> It might be useful to have a Table type where there is a header of some
> sort to keep track, for each column of the column name name and the
> datatype in that column, so that the user could, optionally specify
> validity checks.

Another useful type for arrays representing physical values would be an
array that keeps a vector of index values for each dimension. For instance,
an object representing temperature at a given time in a given location would
consist of
    data = N x M array of Float64 = [ [ 23, 34, 23], [ 31, 28,29] ]
    first_axis = N array of time = [ "01/01/2004", "02/01/2004" ]
    second_axis = M array of location = [ "Paris", "New York" ]

All slicing operations would equivalently slice the corresponding axis.
Assignment between arrays would be axis coherent (assigning "Paris" in one
array to "Paris" in another while putting NaN or 0 if there is no
correspondence).
It would also be useful if indexing could be done via the components of
*_axis.
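A rough sketch of such a labelled array in plain Python, restricted to two dimensions.  The class LabeledArray, its method names and the sample temperatures are all invented for this illustration (the sample is a consistent 2 x 2 case); it is one possible reading of the idea, not a proposal for the actual interface.

```python
class LabeledArray(object):
    """Hypothetical 2-D array whose axes carry index labels."""

    def __init__(self, data, first_axis, second_axis):
        if len(data) != len(first_axis):
            raise ValueError("first_axis does not match the number of rows")
        if any(len(row) != len(second_axis) for row in data):
            raise ValueError("second_axis does not match the row length")
        self.data = [list(row) for row in data]
        self.first_axis = list(first_axis)
        self.second_axis = list(second_axis)

    def __getitem__(self, key):
        # Index by labels, e.g. temps["01/01/2004", "Paris"].
        row_label, col_label = key
        i = self.first_axis.index(row_label)
        j = self.second_axis.index(col_label)
        return self.data[i][j]

    def rows(self, start, stop):
        """Slice the first axis; its labels are sliced along with the data."""
        return LabeledArray(self.data[start:stop],
                            self.first_axis[start:stop],
                            self.second_axis)

    def assign_from(self, other, missing=float("nan")):
        """Axis-coherent assignment: copy values label by label."""
        for i, r in enumerate(self.first_axis):
            for j, c in enumerate(self.second_axis):
                if r in other.first_axis and c in other.second_axis:
                    self.data[i][j] = other[r, c]
                else:
                    self.data[i][j] = missing


# Made-up sample values: rows are dates, columns are locations.
temps = LabeledArray([[23.0, 31.0], [34.0, 28.0]],
                     ["01/01/2004", "02/01/2004"],
                     ["Paris", "New York"])

first_day = temps.rows(0, 1)

target = LabeledArray([[0.0, 0.0]], ["01/01/2004"], ["Paris", "Tokyo"])
target.assign_from(temps)
```

After the assignment, the "Paris" entry of target holds the value from temps and the "Tokyo" entry, which has no counterpart, holds NaN.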

Several fields of application could benefit from this (econometrics, Monte
Carlo simulation, physical simulation, time series,...). In fact, most
real data usually consists of values for tuples of general indices (e.g.
temperature <at> ("01/01/2004","Paris"))

Hmmm, I think I was just thinking aloud :-)
