Ron Adam | 10 Jan 02:11 2011

Re: Module aliases and/or "real names"


On 01/09/2011 12:18 PM, Nick Coghlan wrote:
> On Mon, Jan 10, 2011 at 3:56 AM, Ron Adam<rrr@...>  wrote:
>> On 01/09/2011 12:39 AM, Nick Coghlan wrote:
>>>> Also consider having virtual modules, where objects in it may have come
>>>> from
>>>> different *other* locations. A virtual module would need a way to keep
>>>> track
>>>> of that. (I'm not sure this is a good idea.)
>>
>>> It's too late, code already does that. This is precisely the use case
>>> I am trying to fix (objects like functools.partial that deliberately
>>> lie in their __module__ attribute), so that this can be done *right*
>>> (i.e. without having to choose which use cases to support and which
>>> ones to break).
>>
>> Yes, __builtins__ is a virtual module.
>
> No, it's a real module, just like all the others.

As George pointed out it's "builtins".  But you knew what I was referring 
to. ;-)

I wasn't saying it's not a real module, but there are differences.  Mainly, 
builtins (and other C modules) don't have a file reference after they're 
imported, unlike modules written in Python.

 >>> import dis
 >>> dis
<module 'dis' from '/usr/local/lib/python3.2/dis.py'>
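
A minimal sketch of the difference described above, assuming a standard CPython installation where dis is loaded from a .py file and builtins is compiled into the interpreter. The helper name has_file_reference is made up for illustration.

```python
import builtins
import dis


def has_file_reference(module):
    """Return True if the module object records the file it was loaded from."""
    return getattr(module, "__file__", None) is not None


# dis is written in Python, so it normally records its source path.
dis_has_file = has_file_reference(dis)
# builtins is compiled into the interpreter and has no file to point at.
builtins_has_file = has_file_reference(builtins)

print("dis:", dis_has_file)
print("builtins:", builtins_has_file)
```

C extension modules built as separate shared libraries usually do carry a __file__, so the distinction is really about modules built into the interpreter itself.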

Mark Dickinson | 10 Jan 09:27 2011

Re: Add irange with large integer step support to itertools

On Fri, Jan 7, 2011 at 10:24 AM, Martin Manns <mmanns <at> gmx.net> wrote:
> Hi
>
> I would like to propose an addition of an "irange" function to
> itertools. This addition could reduce testing effort when developing
> applications, in which large integers show up.
>
> Both, xrange (Python 2.x) and range (Python 3.x) have limited support
> for large integer step values, for example:
>
> Python 3.1.3 (r313:86834, Nov 28 2010, 10:01:07)
> [GCC 4.4.5] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>
>>>> range(10**10000, 10**10000+10**1000, 10**900)[5]
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> OverflowError: Python int too large to convert to C ssize_t

This example strikes me as a bug in range (specifically, in
range_subscript in Objects/rangeobject.c).

> Does such an addition make sense in your eyes?

Wouldn't it be better to fix 'range' to behave as expected?

Mark
_______________________________________________
Python-ideas mailing list
Python-ideas <at> python.org

Nick Coghlan | 10 Jan 12:26 2011

Re: Module aliases and/or "real names"

On Mon, Jan 10, 2011 at 11:11 AM, Ron Adam <rrr@...> wrote:
> On the python side of things, the attributes we've been discussing almost
> never have anything to do with what most programs are written to do. Unless
> it's a program written specifically for managing pythons various parts. It's
> kind of like the problem of separating content, context, and presentation in
> web pages.  Sometimes it's hard to do.

Yep - 99.99% of python code will never care if this is ever fixed.
However, the fact that we've started using acceleration modules and
pseudo-packages in the standard library means that "things should just
work" is being broken subtly in the stuff we're shipping ourselves
(either by creating pickling problems, as in unittest, or misleading
introspection results, as in functools and datetime).

And if we're going to fix it at all, we may as well fix it right :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@...   |   Brisbane, Australia
Michael Foord | 10 Jan 12:37 2011

Re: Module aliases and/or "real names"



On 10 January 2011 11:26, Nick Coghlan <ncoghlan <at> gmail.com> wrote:
> On Mon, Jan 10, 2011 at 11:11 AM, Ron Adam <rrr@...> wrote:
>> On the python side of things, the attributes we've been discussing almost
>> never have anything to do with what most programs are written to do. Unless
>> it's a program written specifically for managing pythons various parts. It's
>> kind of like the problem of separating content, context, and presentation in
>> web pages.  Sometimes it's hard to do.
>
> Yep - 99.99% of python code will never care if this is ever fixed.
> However, the fact that we've started using acceleration modules and
> pseudo-packages in the standard library means that "things should just
> work" is being broken subtly in the stuff we're shipping ourselves
> (either by creating pickling problems, as in unittest, or misleading
> introspection results, as in functools and datetime).
>
> And if we're going to fix it at all, we may as well fix it right :)


I certainly don't object to fixing this, and neither do I object to adding a new class / module / function attribute to achieve it.

However... is there anything else that this fixes? (Are there more examples "in the wild" where this would help?)

The unittest problem with pickling is real but likely to only affect a very, very small number of users. The introspection problem (getsource) for functools and datetime isn't a *real* problem because the source code isn't available. If in fact getsource now points to the pure Python version even in the cases where the C versions are being used then "fixing" this seems like a step backwards...


Python 3.2:
>>> import inspect
>>> from datetime import date
>>> inspect.getsource(date)
'class date:\n    """Concrete date type.\n\n ...'

Python 3.1:
>>> import inspect
>>> from datetime import date
>>> inspect.getsource(date)
Traceback (most recent call last):
  ...
IOError: source code not available

With your changes in place would Python 3.3 revert to the 3.1 behaviour here? How is this an advantage?

What I'm really asking is, is the cure (and the accompanying implementation effort and additional complexity to the Python object model) worse than the disease...

All the best,

Michael Foord




 



--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
_______________________________________________
Python-ideas mailing list
Python-ideas@...
http://mail.python.org/mailman/listinfo/python-ideas
Nick Coghlan | 10 Jan 12:52 2011

Re: Add irange with large integer step support to itertools

On Mon, Jan 10, 2011 at 6:27 PM, Mark Dickinson <dickinsm@...> wrote:
>> Python 3.1.3 (r313:86834, Nov 28 2010, 10:01:07)
>> [GCC 4.4.5] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>>
>>>>> range(10**10000, 10**10000+10**1000, 10**900)[5]
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in <module>
>> OverflowError: Python int too large to convert to C ssize_t

Note that the problem isn't actually the step value - it's the overall
length of the resulting sequence.

If you make the sequence shorter, it works (at least in 3.2, I didn't
check earlier versions):

>>> x = range(10**10000, 10**10000+(500*10**900), 10**900)
>>> len(x)
500
>>> x[5]
<snip really big number>

> This example strikes me as a bug in range (specifically, in
> range_subscript in Objects/rangeobject.c).

The main issue is actually in range_item rather than range_subscript -
we invoke range_len() there to simplify the bounds checking logic. To
remove this limitation, the C arithmetic and comparison operations in
that function need to be replaced with their PyLong equivalent,
similar to what has been done for compute_range_length().

There's a related bug where range_subscript doesn't support *indices*
greater than sys.maxsize - given an indexing helper function that can
handle a range length that doesn't fit in sys.maxsize, it would be
easy to call that unconditionally rather than indirectly via
range_item, fixing that problem as well.
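
The arbitrary-precision arithmetic described above can be sketched in pure Python, where every int already behaves like a PyLong. This is only an illustration of the idea, not the C code in Objects/rangeobject.c; the names range_length and range_item_bigint are hypothetical.

```python
def range_length(start, stop, step):
    """Number of items in range(start, stop, step), with no size limit."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0 and start < stop:
        return 1 + (stop - 1 - start) // step
    if step < 0 and start > stop:
        return 1 + (start - 1 - stop) // (-step)
    return 0


def range_item_bigint(start, stop, step, index):
    """Item at position index, bounds-checked against the unbounded length."""
    length = range_length(start, stop, step)
    if index < 0:
        index += length
    if not 0 <= index < length:
        raise IndexError("range object index out of range")
    return start + index * step


# The example from the original report: the length is 10**100,
# far beyond sys.maxsize, yet the bounds check needs no C integers.
start = 10**10000
item = range_item_bigint(start, start + 10**1000, 10**900, 5)
print(item == start + 5 * 10**900)
```

Because the bounds check never converts the length to a fixed-width integer, the same helper also accepts indices larger than sys.maxsize.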

>> Does such an addition make sense in your eyes?
>
> Wouldn't it be better to fix 'range' to behave as expected?

Agreed. It isn't a deliberate design limitation - it's just a
consequence of the fact that converting from C integer programming to
PyLong programming is a PITA, so it has been a process of progressive
upgrades in range's support for values that don't fit in sys.maxsize.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@...   |   Brisbane, Australia
Nick Coghlan | 10 Jan 13:09 2011

Re: Module aliases and/or "real names"

On Mon, Jan 10, 2011 at 9:37 PM, Michael Foord
<fuzzyman@...> wrote:
> I certainly don't object to fixing this, and neither do I object to adding a
> new class / module / function attribute to achieve it.
>
> However... is there anything else that this fixes? (Are there more examples
> "in the wild" where this would help?)
>
> The unittest problem with pickling is real but likely to only affect a very,
> very small number of users. The introspection problem (getsource) for
> functools and datetime isn't a *real* problem because the source code isn't
> available. If in fact getsource now points to the pure Python version even
> in the cases where the C versions are being used then "fixing" this seems
> like a step backwards...

unittest is actually a better example, because there *is* a solution
to your pickling problem: alter __module__ to say "unittest" rather
than "unittest.<whatever>", just as _functools.partial and the
_datetime classes do. However, you've stated you don't want to do that
because it would break introspection. That's a reasonable position to
take, so the idea is to make it so you don't have to make that choice.
Instead, you'll be able to happily adjust __module__ to make pickling
work properly, while introspection will be able to fall back on
__impl_module__ to get the correct information.
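
To see why __module__ matters for pickling: pickle stores a class by reference, recording the module name taken from __module__ together with the class name, and imports that module again when loading. A small check, assuming CPython, where functools.partial reports "functools" as its module:

```python
import functools
import pickle

# Pickling a class stores a reference (module name + class name), not code.
data = pickle.dumps(functools.partial)

print(functools.partial.__module__)
print(b"functools" in data)

# Loading follows the recorded reference back to the very same object.
restored = pickle.loads(data)
print(restored is functools.partial)
```

If __module__ named a private implementation module instead, that private name would be what ends up in every pickle.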

> Python 3.2:
>>>> import inspect
>>>> from datetime import date
>>>> inspect.getsource(date)
> 'class date:\n    """Concrete date type.\n\n ...'
>
> Python 3.1:
>>>> import inspect
>>>> from datetime import date
>>>> inspect.getsource(date)
> Traceback (most recent call last):
>   ...
> IOError: source code not available
>
> With your changes in place would Python 3.3 revert to the 3.1 behaviour
> here? How is this an advantage?

It's an improvement because the current answer is misleading: that
source code is *not* what is currently running. You can change that
source to your heart's content and it will do exactly *squat* when it
comes to changing the interpreter's behaviour.

That said, one of the benefits of this proposal is that we aren't
restricted to the either/or behaviour. Since the interpreter will
provide both pieces of information, we have plenty of opportunity to
make inspect smarter about the situation. (e.g. only looking in
__impl_module__ by default, but offering a flag to also check
__module__ if no source is available from the implementation module).
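
One way the lookup policy in the parenthetical above might look. This is hypothetical: __impl_module__ is only the attribute name proposed in this thread, not an existing Python attribute, and candidate_modules with its include_public flag is invented for the illustration.

```python
def candidate_modules(obj, include_public=False):
    """Return module names to search for obj's source, in priority order."""
    public = obj.__module__
    # __impl_module__ is the attribute proposed in this thread (hypothetical).
    impl = getattr(obj, "__impl_module__", None) or public
    names = [impl]
    if include_public and public != impl:
        names.append(public)
    return names


class Date:
    """Stand-in for a class that advertises a public module name."""


# Simulate a class that reports one module but is implemented in another.
Date.__module__ = "datetime"
Date.__impl_module__ = "_datetime"

print(candidate_modules(Date))
print(candidate_modules(Date, include_public=True))
```

Objects without the proposed attribute simply fall back to __module__, so existing code would see no change.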

> What I'm really asking is, is the cure (and the accompanying implementation
> effort and additional complexity to the Python object model) worse than the
> disease...

Potentially, but I see enough merit in the idea to follow up with a PEP for it.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@...   |   Brisbane, Australia
Zac Burns | 10 Jan 15:55 2011

Re: Add irange with large integer step support to itertools

-1 for any proposal that adds anything differentiating int/long.


-Zac



Nick Coghlan | 10 Jan 16:23 2011

Re: Add irange with large integer step support to itertools

On Tue, Jan 11, 2011 at 12:55 AM, Zac Burns <zac256@...> wrote:
> -1 for any proposal that adds anything differentiating int/long.

It isn't about adding anything - the signature of the length slots at
the C level already uses a Py_ssize_t, so any time you get the length
of a container, you're limited to values that will fit in that size.
This is fine for real containers, as you will run out of memory long
before the container length overflows and throws an exception. It *is*
an issue for a virtual container like range() though - because it
doesn't actually *create* the whole range, it can be created with a
length that exceeds what Py_ssize_t can handle. That's fine, until you
run into one of the operations that directly or indirectly invokes
len() on the object.
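
A short demonstration of the limit described above, assuming CPython. Creating the range succeeds because nothing is materialised; asking for its length goes through the Py_ssize_t length slot and overflows.

```python
import sys

# A "virtual container": no items are stored, so this is cheap to create
# even though it describes far more items than sys.maxsize.
huge = range(10**100)

try:
    len(huge)
except OverflowError as exc:
    length_error = exc
else:
    length_error = None

print(sys.maxsize < 10**100)
print(type(length_error).__name__)
```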

Currently, indexing a range is such an operation (which is why it
fails). While it's a fairly straightforward (albeit somewhat tedious)
change to fix range_subscript and range_item to correctly handle cases
where the index and/or the length exceed sys.maxsize, it still
requires someone to actually create the issue on the tracker and then
propose a patch to fix it.

It's even theoretically possible to upgrade the __len__ protocol to
support lengths that exceed Py_ssize_t, but that's a much more
ambitious (PEP scale) project.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@...   |   Brisbane, Australia
Martin Manns | 10 Jan 18:01 2011

Re: Add irange with large integer step support to itertools

On Mon, 10 Jan 2011 21:52:36 +1000
Nick Coghlan <ncoghlan@...> wrote:

> > Wouldn't it be better to fix 'range' to behave as expected?
> 
> Agreed. It isn't a deliberate design limitation - it's just a
> consequence of the fact that converting from C integer programming to
> PyLong programming is a PITA, so it has been a process of progressive
> upgrades in range's support for values that don't fit in sys.maxsize.

Nick:

So the limitation is not a deliberate design choice.
Looking at the tracker, a fix would probably be covered by issue2690.

I see that you provided a patch there on Dec. 3. However, this
patch either does not address the problem or has not yet been committed
to Py3k. I checked Py3k with an svn dump and it shows the same
behavior as Python 3.1.

original message by Nick Coghlan in issue2690:

> "I brought the patch up to date for the Py3k branch, but realised just
> before checking it in that it may run afoul of the language moratorium
> (since it alters the behaviour of builtin range objects)."

Does the patch address the issue or is it a more complicated problem?

If the former is the case then could issue2690 be re-opened and the
patch committed?

However, if the latter is the case, then I would still like to propose
at least adding a snippet to the itertools docs, because fixing the
issue properly could take some time.
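
A snippet of the kind suggested here might look like the following. It is a sketch built from itertools.count and itertools.takewhile, not code from the thread or from the issue tracker; irange is the name proposed earlier in the thread, and the signature is an assumption.

```python
from itertools import count, islice, takewhile


def irange(start, stop, step=1):
    """Iterate like range(), using only arbitrary-precision arithmetic."""
    if step == 0:
        raise ValueError("irange() step must not be zero")
    if step > 0:
        return takewhile(lambda value: value < stop, count(start, step))
    return takewhile(lambda value: value > stop, count(start, step))


# The example from the original report: fetch the item at position 5.
start = 10**10000
sixth = next(islice(irange(start, start + 10**1000, 10**900), 5, None))
print(sixth == start + 5 * 10**900)
```

Unlike range, this supports only iteration, so reaching position n costs n steps and there is no len() or indexing.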

Cheers

Martin
Stefan Behnel | 10 Jan 18:04 2011

Re: Add irange with large integer step support to itertools

Martin Manns, 10.01.2011 18:01:
> original message by Nick Coghlan in issue2690:
>
>> "I brought the patch up to date for the Py3k branch, but realised just
>> before checking it in that it may run afoul of the language moratorium
>> (since it alters the behaviour of builtin range objects)."

The language moratorium does not apply here because the described behaviour 
is a limitation that is specific to the CPython implementation, not the 
language.

Stefan
