Stephen J. Turnbull | 1 Dec 02:38 2011

Re: Is there a reason why file.readlines() doesn't/can't return an iterator?

Peter Otten writes:
 > Éric Araujo wrote:
 > 
 > > Okay.  You can open a report on bugs.python.org to ask that the doc for
 > > readlines mention list(fp) as an alternative.
 > 
 > That would make sense to me only if readlines() were deprecated.

It makes sense to me.  Something like

    Note that "for line in fp:" (to iterate over the lines of the
    file) and "list(fp)" (to get a list of all lines in a file) are
    idiomatic Python.

to remind readers (especially new users) of the iterator concept,
which is even today not so familiar to beginning programmers.
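
A minimal sketch of the two idioms named in the suggested note. It uses io.StringIO as a stand-in for an open text file (an assumption, made only so the example is self-contained); the name fp follows the note.

```python
import io

# io.StringIO stands in for an open text file so the example is self-contained.
fp = io.StringIO("first\nsecond\nthird\n")

# Idiom 1: iterate over the lines lazily, one at a time.
stripped = []
for line in fp:
    stripped.append(line.rstrip("\n"))

# Idiom 2: rewind, then collect every line into a list in one step.
fp.seek(0)
all_lines = list(fp)

# readlines() builds the same list as list(fp).
fp.seek(0)
same_lines = fp.readlines()
```

Both list(fp) and fp.readlines() hold the whole file in memory, whereas the for loop only needs one line at a time.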

_______________________________________________
Python-ideas mailing list
Python-ideas <at> python.org
http://mail.python.org/mailman/listinfo/python-ideas
Terry Reedy | 2 Dec 04:13 2011

Re: Is there a reason why file.readlines() doesn't/can't return an iterator?

On 11/30/2011 8:38 PM, Stephen J. Turnbull wrote:
> Peter Otten writes:
>   >  Éric Araujo wrote:
>   >
>   >  >  Okay.  You can open a report on bugs.python.org to ask that the doc for
>   >  >  readlines mention list(fp) as an alternative.
>   >
>   >  That would make sense to me only if readlines() were deprecated.
>
> It makes sense to me.  Something like
>
>      Note that "for line in fp:" (to iterate over the lines of the
>      file) and "list(fp)" (to get a list of all lines in a file) are
>      idiomatic Python.
>
> to remind readers (especially new users) of the iterator concept,
> which is even today not so familiar to beginning programmers.

See http://bugs.python.org/issue13510 ,
where I suggested something like that.

-- 
Terry Jan Reedy

Matthew Woodcraft | 2 Dec 22:02 2011

Re: Is there a reason why file.readlines() doesn't/can't return an iterator?

On 2011-11-30 13:59, Peter Otten wrote:
> My observation on the Tutor mailing list is that there are no valid uses of 
> readlines(). It's just easier to discover the readlines() method than to 
> find out that you can iterate over the file directly.

In 2.x, iterating directly can behave unexpectedly if the file object is
something other than a regular file; see eg
http://bugs.python.org/issue1633941
http://bugs.python.org/issue3907
http://utcc.utoronto.ca/~cks/space/blog/python/FileIteratorProblems

readline() and readlines() don't have these problems.
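
One way to keep line-at-a-time processing while still going through readline() is the two-argument form of iter(). This is only a sketch: io.StringIO stands in for a pipe or socket file (an assumption for self-containment), and it does not reproduce the 2.x behaviour described in the links above.

```python
import io

# Stand-in for a file-like object that is not a regular file.
fp = io.StringIO("alpha\nbeta\n")

# iter(callable, sentinel) calls fp.readline() repeatedly and stops
# as soon as it returns the sentinel "" (end of file in text mode).
lines = []
for line in iter(fp.readline, ""):
    lines.append(line)
```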

-M-
Terry Reedy | 2 Dec 22:55 2011

Re: Is there a reason why file.readlines() doesn't/can't return an iterator?

On 12/2/2011 4:02 PM, Matthew Woodcraft wrote:
> On 2011-11-30 13:59, Peter Otten wrote:
>> My observation on the Tutor mailing list is that there are no valid uses of
>> readlines(). It's just easier to discover the readlines() method than to
>> find out that you can iterate over the file directly.
>
> In 2.x, iterating directly can behave unexpectedly if the file object is
> something other than a regular file; see eg
> http://bugs.python.org/issue1633941
> http://bugs.python.org/issue3907

Both are fixed in 3.x

> http://utcc.utoronto.ca/~cks/space/blog/python/FileIteratorProblems
>
> readline() and readlines() don't have these problems.
>
> -M-


-- 
Terry Jan Reedy
T.B. | 3 Dec 01:12 2011

Different bases format specification

I will start by stating that it's not my original idea, but taken from Erlang. See Erlang's io:format documentation here: http://www.erlang.org/doc/man/io.html#format-1 and notice the 'B' control sequence.

I would like to have an easy built-in way to print integers in different bases (radices). There are so many half-baked solutions out there:
http://bugs.python.org/issue6783
http://stackoverflow.com/questions/2267362/convert-integer-to-a-string-in-a-given-numeric-base-in-python
http://stackoverflow.com/questions/2063425/python-elegant-inverse-function-of-intstring-base
http://code.activestate.com/recipes/65212/

I suggest using the precision field in the format specification for integers for that. Examples:

"{:.16d}".format(31) #Prints '1f'

"{:.2d}".format(-19) # Prints '-10011'

"{:.36d}".format(36*5 + 35) #Prints '5Z'

"{:.3d}".format(26) # Prints '222'
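
For comparison, the output these examples ask for can be produced today with a small helper. The following is a sketch only: to_base, its upper flag and the DIGITS table are hypothetical names, not part of the proposal or of the standard library.

```python
import string

# Digit characters for bases up to 36: "0123456789abcdefghijklmnopqrstuvwxyz".
DIGITS = string.digits + string.ascii_lowercase


def to_base(number, base=10, upper=False):
    """Return the integer `number` written in `base` (2 to 36). Hypothetical helper."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if number == 0:
        return "0"
    sign = "-" if number < 0 else ""
    number = abs(number)
    digits = []
    while number:
        # Peel off the least significant digit first.
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    text = sign + "".join(reversed(digits))
    return text.upper() if upper else text


print(to_base(31, 16))                       # 1f
print(to_base(-19, 2))                       # -10011
print(to_base(36 * 5 + 35, 36, upper=True))  # 5Z
print(to_base(26, 3))                        # 222
```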

There are good reasons for doing so:

1. Convenience. Instead of using a function for doing the conversion, it will always be easier to use the built-in format() function.

2. Elegance. There is already an *input* function for numbers in different bases: int(). Symmetry is usually elegant, and so is adding an *output* function for numbers in different bases.

3. It is almost there! There is already an internal function in CPython to convert to decimal base. See long_to_decimal_string() in http://hg.python.org/cpython/file/8d60c1c89105/Objects/longobject.c or the easier to understand int_to_decimal_string() in http://hg.python.org/cpython/file/86f699766016/Objects/intobject.c and notice the comment there.

4. It doesn't break anything. As written in http://docs.python.org/py3k/library/string.html#format-specification-mini-language: "...The precision is not allowed for integer values". No valid older code uses it.
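
Points 2 and 4 can be checked against current behaviour. A short sketch (the variable names are illustrative only): a precision in an integer format spec raises ValueError today, and int() already reads back text in any base from 2 to 36.

```python
# Point 4: a precision in an integer format spec is rejected today.
try:
    "{:.16d}".format(31)
except ValueError as exc:
    precision_error = str(exc)
else:
    precision_error = None

# Point 2: int() is the existing *input* side and accepts bases 2 to 36.
parsed = [int("1f", 16), int("-10011", 2), int("5Z", 36), int("222", 3)]
```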

Some nitty gritty details:

1. The base must be between 2 and 36, as int() allows. The default base is 10.

2. Add a 'D' integer presentation type for uppercase digits in bases larger than 10.

3. Using 'b' instead, standing for 'base', might be a nice mnemonic. Then the default base would be 2.

4. There should be a decision what to do with '#' alternate form option. I think that only for the current bases with a prefix (2, 8 and 16) the prefix would be printed. Notice that it also elegantly solves the problem of printing the '0B' and '0O' prefixes. They are legal in Python, but there is no pythonic way to print them.

5. What happens when you want to decide the base at run time? Maybe something like C's printf() (http://c-faq.com/stdio/printfvwid.html), where a count precedes the actual number to print, so that "{:.*d}".format(16, 256) will print '100'. This opens the door to crazier ideas. If the '*' is followed by a number, the base will be taken from the positional argument at that index. If the '*' is followed by a somehow-quoted valid identifier, the base will be taken from the keyword argument with that name. This is handy for using the same base for several numbers: "{0:.*2d} {1:.*2d} {3:.*{current_base}d} {4:.*{current_base}d}".format(256, 16, 16, 26, 25, current_base=3) will print "100 10 222 221". The 2 in {0:.*2d} refers to the second 16.
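
For reference on item 5, str.format() already lets part of a format spec come from another argument through nested replacement fields. A brief sketch of that existing mechanism (it is not the proposed '*' syntax, and the variable names are illustrative):

```python
# The width is taken from another argument at run time.
by_position = "{0:{1}d}".format(42, 6)
by_keyword = "{value:{width}d}".format(value=42, width=6)

# The presentation type can be supplied the same way.
as_hex = "{0:{1}}".format(255, "x")
```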

What do you think?

Regards,

TB


P.S. Bonus question: What "{:.-909d}".format(42) would print?

MRAB | 3 Dec 02:26 2011

Re: Different bases format specification

On 03/12/2011 00:12, T.B. wrote:
> I will start by stating that it's not my original idea, but taken from
> Erlang. See Erlang's io:format documentation here:
> http://www.erlang.org/doc/man/io.html#format-1 and notice the 'B'
> control sequence.
>
> I would like to have an easy built-in way to print integers in different
> bases (radices). There are so many half baked solutions out there:
> http://bugs.python.org/issue6783
> http://stackoverflow.com/questions/2267362/convert-integer-to-a-string-in-a-given-numeric-base-in-python
> http://stackoverflow.com/questions/2063425/python-elegant-inverse-function-of-intstring-base
> http://code.activestate.com/recipes/65212/
>
> I suggest using the precision field in the format specification for
> integers for that.
[snip]

I think that the precision field should be used only for the precision
and that sometimes using it for something completely different is a bad
idea.
Nick Coghlan | 3 Dec 02:31 2011

Re: Different bases format specification

On Sat, Dec 3, 2011 at 10:12 AM, T.B. <bauertomer@...> wrote:
> I suggest using the precision field in the format specification for integers
> for that.

Supporting arbitrary bases for string formatting has been discussed
and rejected in the past (both in the context of PEP 3101's
introduction of new string formatting and on other occasions).

Nobody has ever produced convincing use cases for natively supporting
formatting with bases other than binary, octal, decimal and
hexadecimal. Accordingly, those 4 are supported explicitly via the
'b', 'o', 'd' and 'x'/'X' formatting codes, while other formats still
require an explicit conversion function.

As for "Why Not?"

1. 'd' stands for decimal. If support for arbitrary bases were added,
it would need to be as a separate format code (e.g. 'i' for integer)

2. The explicit 'b', 'o' and 'x' codes are related to integer literal
notation (0b10, 0o777, 0x1F), not to the second argument to int()

3. The use cases just aren't that strong. When you start dealing with
base36 and base64, you're not talking about formatting numbers for
human readers any more, you're talking about encoding numbers as short
pieces of text. Better to let people decide exactly the behaviour they
want by coding it themselves.
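
A short sketch of the four supported bases and of their link to literal notation described in point 2 (the variable names are illustrative):

```python
n = 31

# The explicitly supported presentation types.
codes = {code: format(n, code) for code in ("b", "o", "d", "x", "X")}
# codes == {'b': '11111', 'o': '37', 'd': '31', 'x': '1f', 'X': '1F'}

# With '#', the output carries the same prefix as an integer literal,
# so int(text, 0) can read it back.
prefixed = [format(n, "#b"), format(n, "#o"), format(n, "#x")]
round_trip = [int(text, 0) for text in prefixed]
```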

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@...   |   Brisbane, Australia
Bruce Leban | 3 Dec 03:19 2011

Re: Different bases format specification


On Fri, Dec 2, 2011 at 4:12 PM, T.B. <bauertomer-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> I suggest using the precision field in the format specification for integers for that. Examples:
>
> "{:.16d}".format(31) #Prints '1f'
>
> "{:.2d}".format(-19) # Prints '-10011'

<snip>

I don't think this belongs in format.

> P.S. Bonus question: What "{:.-909d}".format(42) would print?

Any proposal which includes an inscrutable example doesn't bode well for the usability of the feature. :-) Sure, negative bases are mathematically meaningful but are they useful in Python? And why not complex bases then? Or did you have something else strange in mind?

If there's enough need for encoding in different bases, including a standard version of format_integer_in_base makes a lot more sense. We could write format_integer_in_base(15, 16) to get "F" and format_integer_in_base(64, "A23456789TJQK") to get "4K". But note that standard base 64 encoding is not at all the same -- this function encodes starting at LSB while base 64 encodes at word boundaries.

Finally, note that if you really want to mangle format strings you can do it without changing the library. Just write it this way "{:.16d}".format(arbitrarybase(31)) where you have defined

class arbitrarybase:
    def __format__(self, format_spec):
        return format_integer_in_base(parse format spec etc.) 
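
The outline above could be filled in along the following lines. This is a sketch under stated assumptions, not a definitive implementation: format_integer_in_base is the hypothetical helper named in this message, here assumed to accept either an integer base or a string of digit characters; the upper flag, the int subclassing and the regular expression for the spec are choices made for this example only.

```python
import re
import string


def format_integer_in_base(number, base_or_alphabet, upper=True):
    """Hypothetical helper: takes an int base (2 to 36) or a string of digit characters."""
    if isinstance(base_or_alphabet, str):
        alphabet = base_or_alphabet
    else:
        if not 2 <= base_or_alphabet <= 36:
            raise ValueError("base must be between 2 and 36")
        letters = string.ascii_uppercase if upper else string.ascii_lowercase
        alphabet = (string.digits + letters)[:base_or_alphabet]
    base = len(alphabet)
    if base < 2:
        raise ValueError("alphabet needs at least two digit characters")
    sign = "-" if number < 0 else ""
    number = abs(number)
    digits = []
    while True:
        # Peel off the least significant digit first.
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
        if number == 0:
            break
    return sign + "".join(reversed(digits))


class arbitrarybase(int):
    """int subclass that reads a spec such as '.16d' or '.16D' as 'use base 16'."""

    def __format__(self, format_spec):
        match = re.fullmatch(r"\.(\d+)([dD])", format_spec)
        if match is None:
            # Any other spec keeps ordinary int formatting.
            return super().__format__(format_spec)
        base = int(match.group(1))
        return format_integer_in_base(int(self), base, upper=match.group(2) == "D")


print("{:.16d}".format(arbitrarybase(31)))  # 1f
print("{:.3d}".format(arbitrarybase(26)))   # 222
print(format_integer_in_base(15, 16))       # F
```

Subclassing int means the wrapped value still behaves as a number everywhere else, and only the '.<base>d' family of specs is intercepted.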
T.B. | 3 Dec 03:45 2011

Re: Different bases format specification

Tip for self: No more HTML e-mails.

On 2011-12-03 04:19, Bruce Leban wrote:
>
>>     P.S. Bonus question: What "{:.-909d}".format(42) would print?
>
> Any proposal which includes an inscrutable example doesn't bode well for
> the usability of the feature. :-) Sure, negative bases are
> mathematically meaningful but are they useful in Python? And why not
> complex bases then? Or did you have something else strange in mind?
>
My intention will become clear after reading
http://bugs.python.org/issue2844, although that will also spoil the
surprise of figuring it out on your own.

Regards,
TB
T.B. | 3 Dec 04:16 2011

Re: Different bases format specification


On 2011-12-03 03:31, Nick Coghlan wrote:
> On Sat, Dec 3, 2011 at 10:12 AM, T.B.<bauertomer@...>  wrote:
>> I suggest using the precision field in the format specification for integers
>> for that.
>
> Supporting arbitrary bases for string formatting has been discussed
> and rejected in the past (both in the context of PEP 3101's
> introduction of new string formatting and on other occasions).
>
> Nobody has ever produced convincing use cases for natively supporting
> formatting with bases other than binary, octal, decimal and
> hexadecimal. Accordingly, those 4 are supported explicitly via the
> 'b', 'o', 'd' and 'x'/'X' formatting codes, while other formats still
> require an explicit conversion function.

For weird math scenarios I know there are already many modules and
packages. But what about ternary? The article at
en.wikipedia.org/wiki/Ternary_numeral_system notes, among other things,
that bases 9 and 27 are also used [no citation].

> As for "Why Not?"
>
> 1. 'd' stands for decimal. If support for arbitrary bases were added,
> it would need to be as a separate format code (e.g. 'i' for integer)
>
> 2. The explicit 'b', 'o' and 'x' codes are related to integer literal
> notation (0b10, 0o777, 0x1F), not to the second argument to int()
>
That is one reason I wrote: "It might be a nice mnemonic using 'b' instead,
standing for 'base'. Then the default base will be 2."
Anyway, I think there should be 'B' and 'O' presentation types, that
will be used for outputting '0B' and '0O' prefixes.
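
To show the gap concretely, here is a sketch of current behaviour and of one possible workaround (upper-casing the result, which is an assumption of this example rather than an established idiom):

```python
n = 5

# 'B' is not a valid presentation type for int today.
try:
    format(n, "#B")
except ValueError:
    upper_b_supported = False
else:
    upper_b_supported = True

# Workaround: format with the lowercase code, then upper-case the result.
upper_binary = format(n, "#b").upper()  # '0B101'
upper_octal = format(n, "#o").upper()   # '0O5'

# Both spellings are accepted as integer literals by int(text, 0).
values = [int(upper_binary, 0), int(upper_octal, 0)]
```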

Thanks for your reply,
TB
