Scott Dial | 1 Feb 03:17 2010

Re: PEP 3147: PYC Repository Directories

On 1/31/2010 2:04 PM, Raymond Hettinger wrote:
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).  
> 
> It would be nice if all the compilation files could be tucked
> into one single zipfile per directory to reduce directory clutter.
> 
> It has several benefits besides tidiness. It hides the implementation
> details of when magic numbers get shifted.  And it may allow faster
> start-up times when the zipfile is in the disk cache.
> 

On a whim, I implemented a PEP 302 loader that cached any import that
it could find in sys.path into a zip file.
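
For readers who want to try the idea, here is a minimal sketch of a zip cache. It is not the ZipFileCacheLoader described above: it relies on the interpreter's built-in zipimport support rather than a custom PEP 302 loader, and the names build_zip_cache and import_from_zip_cache are made up for illustration.

```python
import importlib
import os
import sys
import zipfile


def build_zip_cache(source_dir, zip_path):
    """Copy every top-level .py file in source_dir into one zip archive."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name in sorted(os.listdir(source_dir)):
            if name.endswith(".py"):
                archive.write(os.path.join(source_dir, name), arcname=name)
    return zip_path


def import_from_zip_cache(zip_path, module_name):
    """Import module_name with the archive placed first on sys.path."""
    if zip_path not in sys.path:
        # A zip file on sys.path is handled by the built-in zipimport hook.
        sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

A real cache would also have to notice when a source file changes and refresh the archive, which this sketch does not attempt.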

I used running bzr as a startup benchmark, and I did my best to ensure
an empty cache by running "sync; echo 3 > /proc/sys/vm/drop_caches; time
bzr". On my particular machine, the "real" time was at minimum 3.5
seconds without using my ZipFileCacheLoader. With the loader, I found
the same was true. The average performance was all over the place (due
to everything else in the operating system trying to fetch from the disk),
and I lack enough data points to reach statistical significance.
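
A rough way to collect more data points is to repeat the run and look at the minimum, mean, and spread. The sketch below is an assumption about how one might do that, not the procedure used above: the helper names are hypothetical, and it does not drop the disk cache between runs (that step needs root and the drop_caches command quoted above).

```python
import statistics
import subprocess
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return the wall-clock time of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (minimum, mean, sample standard deviation) in seconds."""
    spread = statistics.stdev(timings) if len(timings) > 1 else 0.0
    return min(timings), statistics.mean(timings), spread
```

Calling summarize(time_command(["bzr"], runs=20)) would give the figures for the command used in this test; any other argv list works the same way.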

However, if the ".pyr" zip file is going to contain many versions of the
same module, then the performance impact could be more real, since you
would be forced to pull from disk *all* of the versions of a given module.


Curt Hagenlocher | 1 Feb 03:21 2010

Re: PEP 3147: PYC Repository Directories

On Sun, Jan 31, 2010 at 11:16 AM, Terry Reedy <tjreedy <at> udel.edu> wrote:

> 'pycache' would be pretty clear.

Heh -- without the underscores, I read this as "pyc ache". Seems appropriate.
 
--
Curt Hagenlocher
Terry Reedy | 1 Feb 03:53 2010

Re: PEP 3147: PYC Repository Directories

On 1/31/2010 4:26 PM, Tim Delaney wrote:
>

> The pyc/pyo files are just an optimisation detail, and are essentially
> temporary.

The .pycs for /Lib and similar are *not* temporary in the sense you are 
using. They are effectively permanent for as long as the version is 
installed. They should *not* be routinely trashed as they are not 
obsolete and nearly always will be reused.

Terry Jan Reedy

R. David Murray | 1 Feb 03:53 2010

Re: PEP 3147: PYC Repository Directories

On Sun, 31 Jan 2010 19:48:19 +0100, "Martin v. Löwis"
<martin <at> v.loewis.de> wrote:
> > By the way, the part that caused me the most confusion in the language
> > in the PEP was the emphasized *and their dependencies*, as if a package
> > having dependencies somehow turned the problem into a factorial explosion.
> > But there seems to be nothing special, according to your explanation,
> > about dependencies in this scheme.
> 
> For regular (forward) dependencies, there is indeed nothing special to
> consider - they would have to exist in all versions. In practice, this
> can be (and was) problematic: python-zope.sendmail depends on
> python-pkg-resources, python-transaction, python-zope, and 10 other
> things. Before you could start to provide python27-zope.sendmail,
> all of these dependencies would have to become available in a 2.7
> version first, meaning that ten other Debian developers need to act
> before you can. With the failure rate of Debian developers (who go
> as often on holidays as any other volunteer), upgrading to a new Python
> release could often take many months.

OK, that makes it clearer.  It's an internal (and probably unavoidable)
Debian social problem, not a technical one, and I see why it is an
important issue.

> > It seems like it would be simple enough to enhance the os packaging
> > systems to allow the install path to be specified at install time, if
> > that really is the only difference between the package versions.  And a
> > script that runs through all the installed python packages and installs
> > them for a new Python version when a new version is installed should be
> > as easy for other distributions as it is for Gentoo.
> 
> However, it's also unacceptable. I can't cite the exact piece of Debian
> policy, but I'm fairly sure that "build" activities are not allowed at
> installation time. So actually running setup.py files is out of
> the question. Users who want such a thing would have to switch to Gentoo;
> Debian users just want it to work :-)

I'm less sympathetic to problems created by rigid policies, but that
doesn't mean I'm not sympathetic :)

But I don't understand how this answers the question.  If the
python26-zope.sendmail package doesn't run setup.py, then a
python-zope.sendmail package where you specify at install time which
directory to install the files to isn't going to run setup.py, either.
If the only difference between a packaged python27-zope.sendmail and a
packaged python26-zope.sendmail is the directory to which the files get
written, why can't that be controlled at install time?  Writing files
to a directory must be an install activity, not a build activity.  If the
issue is that *deciding* what directory to install to is a build time
activity...well, maybe I would be less sympathetic to a policy that is
*that* rigid.

> > (The os vendors are going to have
> > to change details of their packaging systems if the PEP is accepted,
> > so it's not as if the PEP saves the vendors work.)
> 
> Again, I'm a little bit unclear on the motivation, also. I think it
> mostly is "after years of experimentation, we have run out of ideas
> how to solve all related problems simultaneously without changing
> Python, so let's look for options that do involve changing Python".
> 
> If you *really* want a list of all the simultaneous problems that
> need to be solved, and an explanation of why each individual solution
> has flaws, prepare for this conversation to take a few more weeks.

Well, I certainly don't want the conversation to take a few more months.
I'm not against the PEP, I'm making my comments and asking my questions
in the spirit of making it a high quality PEP.  If the motivation is
"the Debian devs have concluded, after years of experimentation...",
then I suppose that's what should go in the motivation section.

--
R. David Murray                                      www.bitdance.com
Business Process Automation - Network/Server Management - Routers/Firewalls
Nick Coghlan | 1 Feb 06:34 2010

Re: PEP 3147: PYC Repository Directories

Antoine Pitrou wrote:
> Le Sat, 30 Jan 2010 21:04:14 -0800, Jeffrey Yasskin a écrit :
>> I have a couple bikesheddy or "why didn't you do this" comments. I'll be
>> perfectly satisfied with an answer or a line in the pep.
>>
>> 1. Why the -R flag? It seems like this is a uniform improvement, so it
>> should be the default. Have faith in your design! ;-)
> 
> -1 for making it a default. It is definitely ugly and useless for most 
> cases. It is fine as long as it is optional and merely used by the Debian/
> Ubuntu installers.

Would you still be a -1 on making the new scheme the default if it
used a single cache directory instead? That would actually be cleaner
than the current solution rather than messier.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan <at> gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
_______________________________________________
Python-Dev mailing list
Python-Dev <at> python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org
Chris Rebert | 1 Feb 06:53 2010

subprocess docs patch

Hello mighty Python developers,

I was wondering if someone could take a gander at, and hopefully act
upon, a patch I submitted a while ago for the subprocess module's
docs.
It's been languishing in the bug tracker:

http://bugs.python.org/issue6760

Any help you could provide would be appreciated.

Cheers,
Chris
--
If life seems jolly rotten. There's something you've forgotten.
And that's to laugh and smile and dance and sing.
Nick Coghlan | 1 Feb 07:07 2010

Re: PEP 3147: PYC Repository Directories

Martin v. Löwis wrote:
>> Exactly. How would you define where the pyr folder goes? At the root
>> of a package? What if I delete the __init__.py file there? Will the
>> existing pyr folder be orphaned and a new one created in each
>> subfolder? Unlike VCS working copies, the package / module / script
>> hierarchy is not formally defined in python.
> 
> The module name could guide the location. If you are importing
> xml.dom.minidom, it could put the pyc file into a sibling of the pyc
> folder for xml (under the name xml.dom.minidom.<label>).
> 
> If you then remove __init__, you are no longer able to import xml.dom,
> but you might import dom.minidom (assuming you put the xml folder into
> sys.path). Then, a new pyc file would be created in the pyc folder for
> the dom package.

I see three possible logical locations for the Python cache directories:

1. In each directory containing Python source files.
  Major Pro: easy to keep source files associated with their cached versions
  Major Con: proliferation of cache directories

2. In each top level directory on sys.path, flat file structure
  Major Pro: trivial to separate out all cached files
  Major Con: for path locations like the top of the standard lib, the
cache directory would get a *lot* of entries

3. In each top level directory on sys.path, shadow file hierarchy
  Major Pro: trivial to separate out all cached files
  Major Con: ??? (I got nuthin')

I didn't list a single global cache directory as a viable option as it
would create some nasty naming conflicts due to runs with different
sys.path entries and would make it impossible to create zipfiles with
precached bytecode files.

Note that with option two, creating a bytecode only zipfile would be
trivial: just add the __pycache__ directory as the top-level directory
in the zipfile and leave out everything else (assuming there were no data
files in the package that were still needed).
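
As an illustration of the packaging step only, here is a sketch that copies a cache directory into a zipfile with __pycache__ as the top-level entry. The function name zip_pycache is hypothetical, and whether an importer could load modules from such an archive is part of the proposal, not something this sketch shows.

```python
import os
import zipfile


def zip_pycache(pycache_dir, zip_path):
    """Write every file under pycache_dir into zip_path.

    The directory's own name (normally __pycache__) is kept as the
    top-level directory inside the archive.
    """
    top = os.path.basename(os.path.normpath(pycache_dir))
    with zipfile.ZipFile(zip_path, "w") as archive:
        for root, _dirs, files in os.walk(pycache_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                rel = os.path.relpath(full, pycache_dir)
                # Zip member names always use forward slashes.
                archive.write(full, arcname=top + "/" + rel.replace(os.sep, "/"))
    return zip_path
```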

Packages would still be identifiable by the existence of the cached pyc
file for their __init__ modules.

Going back to my previous example (with one extra source file to show
how a top-level module would be handled), scheme 2 would give:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo
  package.__init__.cpython-27.pyc
  package.__init__.cpython-27.pyo
  package.foo.cpython-27.pyc
  package.foo.cpython-27.pyo
  package.subpackage.__init__.cpython-27.pyc
  package.subpackage.__init__.cpython-27.pyo
  package.subpackage.bar.cpython-27.pyc
  package.subpackage.bar.cpython-27.pyo

While scheme 3 would look like:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo
  package/
    __init__.cpython-27.pyc
    __init__.cpython-27.pyo
    foo.cpython-27.pyc
    foo.cpython-27.pyo
    subpackage/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      bar.cpython-27.pyc
      bar.cpython-27.pyo

For comparison, here is what it would look like under scheme 1:

module.py
package/
  __init__.py
  foo.py
  subpackage/
    __init__.py
    bar.py
    __pycache__/
      __init__.cpython-27.pyc
      __init__.cpython-27.pyo
      bar.cpython-27.pyc
      bar.cpython-27.pyo
  __pycache__/
    __init__.cpython-27.pyc
    __init__.cpython-27.pyo
    foo.cpython-27.pyc
    foo.cpython-27.pyo
__pycache__/
  module.cpython-27.pyc
  module.cpython-27.pyo

And the initial version proposed in the PEP:

module.py
module.pyr/
  cpython-27.pyc
  cpython-27.pyo
package/
  __init__.py
  __init__.pyr/
    cpython-27.pyc
    cpython-27.pyo
  foo.py
  foo.pyr/
    cpython-27.pyc
    cpython-27.pyo
  subpackage/
    __init__.py
    __init__.pyr/
      cpython-27.pyc
      cpython-27.pyo
    bar.py
    bar.pyr/
      cpython-27.pyc
      cpython-27.pyo

My major concern with scheme 2 is the possibility of directory size
limits affecting the caching of files, but scheme 3 looks pretty good to
me (with the higher level cache linked to the directory that is actually
on sys.path, the cache locations aren't as arbitrary as I originally
feared).
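
The four layouts can also be stated as a path-mapping rule. The following is a sketch under the assumptions of the listings above (the cpython-27 tag, paths relative to the sys.path entry); the function cache_path and its parameters are hypothetical names used only for illustration.

```python
import posixpath

TAG = "cpython-27"  # tag used in the listings above


def cache_path(module_name, scheme, is_package=False, ext="pyc", tag=TAG):
    """Return the cache file path relative to the sys.path entry.

    module_name is a dotted name such as "package.subpackage.bar".
    scheme is 1, 2, 3, or "pyr" (the initial PEP proposal).
    """
    parts = module_name.split(".")
    if is_package:
        # A package is cached through its __init__ module.
        parts.append("__init__")
    filename = "%s.%s.%s" % (parts[-1], tag, ext)
    if scheme == 1:
        # One __pycache__ in each directory containing source files.
        return posixpath.join(*(parts[:-1] + ["__pycache__", filename]))
    if scheme == 2:
        # One flat __pycache__ per sys.path entry, keyed by dotted name.
        flat = "%s.%s.%s" % (".".join(parts), tag, ext)
        return posixpath.join("__pycache__", flat)
    if scheme == 3:
        # One __pycache__ per sys.path entry, mirroring the source tree.
        return posixpath.join(*(["__pycache__"] + parts[:-1] + [filename]))
    if scheme == "pyr":
        # A .pyr directory next to each source file.
        pyr_dir = parts[-1] + ".pyr"
        return posixpath.join(*(parts[:-1] + [pyr_dir, "%s.%s" % (tag, ext)]))
    raise ValueError("unknown scheme: %r" % (scheme,))
```

For example, cache_path("package.subpackage.bar", 3) gives the shadow-hierarchy location, while the same call with scheme 2 gives the flat dotted name.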

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan <at> gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
Nick Coghlan | 1 Feb 07:20 2010

Re: PEP 3147: PYC Repository Directories

Silke von Bargen wrote:
> 
>> That still leaves the question of what to do with __file__ (for which
>> even the solution in the PEP isn't particularly clean). Perhaps the
>> thing to do there is to have __file__ always point to the source file
>> and introduce a __file_cached__ that points to the bytecompiled file on
>> disk (set to None if it doesn't exist, as may be the case for __main__
>> or due to writing of bytecode files being disabled).
> And what if there isn't a source file, because I want to deploy the
> byte-code only?
> This is possible now, but would be impossible if there was this kind of
> distinction.

For a bytecode only deployment, __file__ would point to where the source
file *would* be if it was there while __file_cached__ would point to the
precompiled byte code.

Yes, this would be backwards incompatible for some uses of execfile in
conjunction with __file__ but those should be much rarer than uses of
__file__ to locate source code (which break with bytecode only
deployment anyway) and to find colocated resource files (which only care
about the path to the file and not the filename itself).
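
To make the colocated-resource case concrete, here is a small sketch. The helper resource_path and the example paths are hypothetical; the point is only that the lookup depends on the directory part of __file__, so it gives the same answer whether __file__ names the source file or a bytecode file next to it.

```python
import os


def resource_path(module_file, resource_name):
    """Locate a data file stored next to a module, given its __file__."""
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(module_dir, resource_name)
```

Code that does this would keep working under the proposed change, whereas code that passes __file__ itself to execfile depends on the filename and would not.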

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan <at> gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
"Martin v. Löwis" | 1 Feb 07:35 2010

Re: PEP 3147: PYC Repository Directories

> But I don't understand how this answers the question.  If the
> python26-zope.sendmail package doesn't run setup.py, then a
> python-zope.sendmail package where you specify at install time which
> directory to install the files to isn't going to run setup.py, either.
> If the only difference between a packaged python27-zope.sendmail and a
> packaged python26-zope.sendmail is the directory to which the files get
> written, why can't that be controlled at install time?

It certainly would be possible to copy the files into each Python's
site-packages. They have a system that does that in place, except that
it doesn't copy the files, but symlinks them.
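
A minimal sketch of the symlinking idea, for illustration only: this is not the tool Debian uses, and the function name and directory arguments are hypothetical.

```python
import os


def link_into_site_packages(shared_dir, site_packages_dirs):
    """Symlink each entry of shared_dir into every site-packages directory.

    Returns the list of links created; existing entries are left alone.
    """
    created = []
    for site_dir in site_packages_dirs:
        os.makedirs(site_dir, exist_ok=True)
        for name in sorted(os.listdir(shared_dir)):
            link = os.path.join(site_dir, name)
            if not os.path.lexists(link):
                os.symlink(os.path.join(shared_dir, name), link)
                created.append(link)
    return created
```

Each Python version then sees the same source files through its own site-packages, and only the byte-compiled files differ per version.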

> Well, I certainly don't want the conversation to take a few more months.
> I'm not against the PEP, I'm making my comments and asking my questions
> in the spirit of making it a high quality PEP.  If the motivation is
> "the Debian devs have concluded, after years of experimentation...",
> then I suppose that's what should go in the motivation section.

I guess Barry will have to explain what the problem with the current
scheme is.

Regards,
Martin
Ben Finney | 1 Feb 07:56 2010

Re: PEP 3147: PYC Repository Directories

Nick Coghlan <ncoghlan <at> gmail.com> writes:

> Would you still be a -1 on making the new scheme the default if it
> used a single cache directory instead? That would actually be cleaner
> than the current solution rather than messier.

+0 on a default of “store compiled bytecode files in a single cache
directory”. It is indeed cleaner than the current default.

I'm only +0 because I don't know whether that actually addresses the use
case that raised the issue to begin with, so I'm postponing judgement
until those who want this change in the first place chime in.

-- 
 \     “Once consumers can no longer get free music, they will have to |
  `\        buy the music in the formats we choose to put out.” —Steve |
_o__)                                  Heckler, VP of Sony Music, 2001 |
Ben Finney
