Heng Cheong | 30 Mar 16:13 2015

Pypy - Unable to find vcvarsall.bat

I was reluctant to send this email, but after some thought I decided to go for it.
I am writing in relation to installing PyPy.
I was successful in installing numpy, pandas, and matplotlib, but not PyPy.
My problem is that when I install it, I get the error "Unable to find vcvarsall.bat".
A quick Google search shows this problem has been widespread for many years, and as far as I can tell it is still unresolved.
Why doesn't the PyPy development community do anything about this?
pypy-dev mailing list
pypy-dev <at> python.org
Carl Friedrich Bolz | 30 Mar 15:04 2015

CfP ICOOOLPS Workshop: Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems


		10th Workshop on
Implementation, Compilation, Optimization of
Object-Oriented Languages, Programs and Systems
		(Co-located with ECOOP'15)
Prague, Czech Republic, July 6th, 2015

[The deadline has been extended to April 13th, 2015]

The ICOOOLPS workshop series brings together researchers and practitioners working in the field of OO
language implementation and optimization. ICOOOLPS's key goal is to identify current and emerging
issues relating to the efficient implementation, compilation and optimization of such languages, and
to outline future challenges and research directions.


Topics of interest for ICOOOLPS include, but are not limited to:
	- implementation of fundamental OO and OO-like features (e.g. inheritance, parametric types, memory
management, objects, prototypes),
	- runtime systems (e.g. compilers, linkers, virtual machines, garbage collectors),
	- optimizations (e.g. static or dynamic analyses, adaptive virtual machines),
	- resource constraints (e.g. time for real-time systems, space or low power for embedded systems) and
relevant choices and tradeoffs (e.g. constant-time vs. non-constant-time mechanisms,
separate compilation vs. global compilation, dynamic loading vs. global linking,
dynamic checking vs. proof-carrying code).


ICOOOLPS is not a mini-conference; it is a workshop designed to facilitate discussion and the exchange of
ideas between peers. ICOOOLPS therefore welcomes both position papers (1-4 pages) and research papers
(max. 10 pages).

Position papers should outline interesting or unconventional ideas, which need not be fully fleshed out.
Research papers are expected to contain more complete ideas, but these need not be as polished as
at a traditional conference.
Authors will be given the option to publish their papers (short or long) in the ACM Digital Library if they wish.
Submissions must be written in English, formatted according to ACM SIGPLAN Proceedings style. Please
submit via the EasyChair submission site https://easychair.org/conferences/?conf=icooolps15.


Chairs: Floréal Morandat (LaBRI, France) and Olivier Zendra (Inria, Loria, France)
Carl Friedrich Bolz (King's College London)
Eric Jul (Alcatel-Lucent Bell Labs)
Tobias Pape (Hasso-Plattner-Institute, Potsdam)
Jean Privat (Université du Québec à Montréal)
Jeremy Singer (University of Glasgow)
Gaël Thomas (Telecom SudParis)
Laurence Tratt (King's College London)
Jan Vitek (Northeastern University)
Mario Wolczko (Oracle Labs)

Carl Friedrich Bolz
Timothy Baldridge | 27 Mar 23:46 2015

Mod typing error

I have some RPython that does this:

return a % b 

while another function calls the same expression, but the two functions differ in the types of a and b (a mix of ints and floats).

However, during translation I'm getting a blocked block exception.

[translation:ERROR] AnnotatorError:
[translation:ERROR] Blocked block -- operation cannot succeed
[translation:ERROR]     v12 = mod(v10, v11)
[translation:ERROR] In <FunctionGraph of (pixie.vm.numbers:1)_rem_Float_Integer at 0x105b80490>:
[translation:ERROR] no source!
[translation:ERROR] Known variable annotations:
[translation:ERROR]  v10 = SomeFloat()
[translation:ERROR]  v11 = SomeInteger(knowntype=int, nonneg=False, unsigned=False)
[translation:ERROR] Blocked block -- operation cannot succeed
[translation:ERROR]     v15 = mod(v13, v14)
[translation:ERROR] In <FunctionGraph of (pixie.vm.numbers:1)_rem_Integer_Float at 0x1059ac550>:
[translation:ERROR] no source!
[translation:ERROR] Known variable annotations:
[translation:ERROR]  v13 = SomeInteger(knowntype=int, nonneg=False, unsigned=False)
[translation:ERROR]  v14 = SomeFloat()
[translation:ERROR] Blocked block -- operation cannot succeed
[translation:ERROR]     v18 = mod(v16, v17)
[translation:ERROR] In <FunctionGraph of (pixie.vm.numbers:1)_rem_Float_Float at 0x10544a650>:
[translation:ERROR] no source!
[translation:ERROR] Known variable annotations:
[translation:ERROR]  v16 = SomeFloat()
[translation:ERROR]  v17 = SomeFloat()

Do we need another entry in rtyper? Looking at rtyper/rfloat.py I see entries for how to type add, sub, etc., but nothing for mod.
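For what it's worth, one possible workaround sketch (an assumption on my part, not a confirmed fix) is to sidestep the mixed-type mod annotation entirely by coercing both operands to float and using math.fmod, which the float support code does know about:

```python
import math

def rem_mixed(a, b):
    # Hypothetical workaround: instead of relying on mod() being
    # annotated for mixed float/int operands, coerce both to float
    # and use math.fmod.  Note fmod's sign convention differs from
    # Python's % when operands have opposite signs.
    return math.fmod(float(a), float(b))
```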


John Camara | 26 Mar 17:29 2015

vmprof compression

Hi Fijal,

To recap and continue the discussion from irc.

We already discussed that the stack ids are based on a counter, which is good, but I also want to confirm that the ids have locality with respect to the code. That is, similar areas of the code should have similar ids. I just want to make sure they are not random with respect to the code, otherwise compression will not be helpful. If the ids are random, that would need to be corrected first.

Right now the stack traces are written to the file repeating the following sequence


In order to get a high compression ratio it would be better to combine multiple stacktraces and rearrange the data as follows


In order to build the compressed data you will want three pairs of buffers: a pair each for counts, depths, and stacks. Your profiler would be writing to one set of buffers while another thread is responsible for compressing buffers that are full and writing them to the file. Once a set of buffers is full, the profiler starts filling up the other set.

For each set of buffers you need a variable to hold the previous count, depth, and stack id, each initialized to 0 before any data is written to an empty buffer. Instead of writing the actual count value into the counts buffer, you write the difference between the current count and the previous count. The reason is that the delta values will mostly be near 0, which significantly improves the compression ratio without adding much overhead. You would do the same for depths and stack ids.
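The delta scheme above can be sketched in plain Python (illustrative only; the real implementation would live in the vmprof C code):

```python
def delta_encode(values, prev=0):
    # store each value as its difference from the previous one;
    # slowly-changing streams become runs of small numbers
    out = []
    for v in values:
        out.append(v - prev)
        prev = v
    return out

def delta_decode(deltas, prev=0):
    # invert the encoding by accumulating the differences
    out = []
    for d in deltas:
        prev += d
        out.append(prev)
    return out
```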

When you compress the data, compress each buffer individually so that like data is compressed together. Like data compresses better than unlike data, and by saving deltas very few bits are required to represent the data, so you are likely to get long runs of 0s and 1s.

I'm sure now you can see why I don't want stack ids to be random: if they are, the deltas will be all over the place, you won't end up with long runs of 0s and 1s, and random data itself does not compress.

To test this out I wouldn't bother modifying the C code; instead, try it out in Python first to make sure the compression provides large gains and to figure out how to tune the algorithm, without having to mess with the signal handlers, write the code for the separate thread, and deal with issues such as making sure you don't start writing to a buffer before the thread has finished writing its data to the file. I would just read an existing profile file and rewrite it to a different file, rearranging the data and compressing the deltas as I described. You can get away with one set of buffers since you wouldn't be profiling at the same time.

To tune this process you will need to determine the appropriate number of stack traces: small enough to keep memory down but large enough that the overhead of compression stays small. Maybe start off with about 8000 stack traces. I would try gzip, bz2, and lzma and look at their compression ratios and times. Gzip is generally faster than bz2, and lzma is the slowest; on the other hand, lzma provides the best compression and gzip the worst. Since you will be compressing deltas you can most likely get away with the fastest compression options under each compressor without affecting the compression ratio, but I would test this, as it depends on the data being compressed. Also, one option available in lzma is the ability to set the width of the data to look at when searching for patterns. Since you are saving 32- or 64-bit ints depending on the platform, you can set that option to either 4 or 8 bytes accordingly. I don't believe gzip or bz2 have this option. Setting it in lzma will likely improve the compression ratio.
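A throwaway comparison harness along these lines (standard-library compressors only, with made-up buffer contents for illustration) might look like:

```python
import bz2
import gzip
import lzma
import struct

def compare_compressors(deltas):
    # pack the deltas as 32-bit little-endian ints, then report the
    # compressed size under each stdlib compressor at its fastest setting
    raw = b''.join(struct.pack('<i', d) for d in deltas)
    return {
        'raw': len(raw),
        'gzip': len(gzip.compress(raw, compresslevel=1)),
        'bz2': len(bz2.compress(raw, compresslevel=1)),
        'lzma': len(lzma.compress(raw, preset=0)),
    }
```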

You may find that counts and depths give similar compression across the three compressors, in which case just use the fastest, which will likely be gzip. On the other hand, maybe the stack ids will be better off with lzma. This is another reason to separate out like data: it gives you the option to use the fastest compressor for some data types while using others where they provide better compression.

I would not be surprised if this approach achieves a compression ratio better than 100x, but that will be heavily dependent on how local the stack ids are. Also, don't forget simple things like not using 64-bit ints when you can get away with smaller ones.

As a slight variation on the above: if you find most of your deltas are < 128, you could write them out as 1 byte, and write the larger ones as a 4-byte int with the high bit set. If you do this, don't set the lzma width option to 4- or 8-byte boundaries, since your data is now a mixture of 1- and 4-byte values. This can sometimes provide a huge reduction in compression time with little effect on the overall compression ratio.
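That 1-or-4-byte scheme could be sketched like this (a hypothetical encoding; it ignores negative deltas, which would need something like zigzag encoding first):

```python
import struct

def encode_delta(d):
    # deltas that fit in 7 bits take one byte; anything larger is
    # written as a 4-byte big-endian int with the top bit set as a marker
    if 0 <= d <= 127:
        return struct.pack('>B', d)
    return struct.pack('>I', (d & 0x7FFFFFFF) | 0x80000000)

def decode_deltas(buf):
    out, i = [], 0
    while i < len(buf):
        if buf[i] & 0x80:
            # marker bit set: read a 4-byte value, mask off the marker
            out.append(struct.unpack('>I', buf[i:i + 4])[0] & 0x7FFFFFFF)
            i += 4
        else:
            out.append(buf[i])
            i += 1
    return out
```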

Hope you find this helpful.

Matti Picus | 26 Mar 05:57 2015

PyPy 2.5.1 is live, please help get the word out

Tell your friends, your employer, your employees. A Python 2.7.9-compatible PyPy has been released.
Thanks to all who contributed.
Release notice: http://morepypy.blogspot.com/2015/03/pypy-251-released.html
Henry Gomersall | 22 Mar 21:02 2015

Custom types for annotating a flow object space

I'm looking at using PyPy's flow object space for an experimental 
converter for MyHDL (http://www.myhdl.org/), a Python library for 
representing HDL (i.e. hardware) models. By conversion, I mean 
converting the MyHDL model that represents the hardware into either 
Verilog or VHDL that downstream tools will support. Currently, there is 
a converter that works very well in many situations, but there are 
substantial limitations on the code that can be converted (much greater 
restrictions than RPython imposes), as well as somewhat frustrating 
corner cases.

It strikes me that much of the heavy lifting of the conversion problem 
can be handled by the PyPy stack.

My question then regards the following. MyHDL represents certain low 
level structures as python objects. For example, there is a notion of a 
signal, represented by a Signal object, that has a one to one mapping to 
the target HDL language. All the attributes of the Signal object 
describe how it should be converted. So, during annotation, the Signal 
object should be maintained as a base type, rather than the annotator 
digging deeper into the object to infer more about its type (which invariably 
breaks things due to RPython non-conformity). There are probably a few 
other types (though not many) that should be handled similarly.

How does one instruct the translator to do this? Is it a case of writing 
a custom TranslationDriver to handle the custom types?

Thanks for any help,

w0mTea | 22 Mar 15:31 2015

GSoC 2015

Dear developers,

I'm a student interested in the idea of copy-on-write list slicing.

I noticed that on the PSF's GSoC wiki, students are advised to fix a beginner-friendly bug first, but after some searching I failed to find an appropriate one. Could you help me with this?

Another question is about building PyPy. I'm a novice in PyPy's implementation, so I decided to modify its source code to help me understand it better. But building PyPy is really slow. Following the PyPy docs, I use this command to build it:

pypy rpython/bin/rpython --opt=jit pypy/goal/targetpypystandalone.py

It takes about an hour to complete on my computer. Then I modified a file, adding only one line, but rebuilding with this command also takes an hour. Is there a faster way to rebuild PyPy after a small modification?

Your answer and advice will be highly appreciated.

Yule Zhao
Ludovic Gasc | 22 Mar 14:45 2015

How to help PyPy 3 ?


I want to try to help PyPy 3, especially to run AsyncIO on PyPy 3.

For now, I've compiled PyPy from the 3.3 branch and installed AsyncIO.
I had an issue with time.monotonic() and time.get_clock_info() in AsyncIO, because they aren't implemented in PyPy 3.

I've replaced them with time.time() and a hardcoded value for time.get_clock_info()'s return, just to see whether AsyncIO would work on PyPy 3.
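A shim along those lines might look like the following (my own sketch, not what Ludovic actually wrote; it sacrifices the monotonicity guarantee, so it is only good for this kind of experiment):

```python
import time

# Fallback for interpreters lacking time.monotonic (PEP 418):
# use the wall clock instead, accepting that it can jump backwards
# on system clock adjustments.
if not hasattr(time, 'monotonic'):
    time.monotonic = time.time
```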

I've launched several examples from the AsyncIO documentation; they work, except when I launch a TCP client: https://docs.python.org/3/library/asyncio-protocol.html#tcp-echo-client-protocol
I have the same issue with aiotest: https://bitbucket.org/haypo/aiotest All tests pass, except:
$ ./pypy-c -Wd ../tulip/run_aiotest.py 
Run tests in debug mode
ERROR: test_tcp_hello (aiotest.test_network.NetworkTests)
Traceback (most recent call last):
  File "/home/lg/Documents/IDEA/pypy/site-packages/aiotest/test_network.py", line 78, in test_tcp_hello
  File "/home/lg/Documents/IDEA/pypy/site-packages/asyncio/base_events.py", line 317, in run_until_complete
    return future.result()
  File "/home/lg/Documents/IDEA/pypy/site-packages/asyncio/futures.py", line 275, in result
    raise self._exception
  File "/home/lg/Documents/IDEA/pypy/site-packages/asyncio/tasks.py", line 238, in _step
    result = next(coro)
  File "/home/lg/Documents/IDEA/pypy/site-packages/asyncio/coroutines.py", line 79, in __next__
    return next(self.gen)
  File "/home/lg/Documents/IDEA/pypy/site-packages/asyncio/base_events.py", line 644, in create_connection
    sock, protocol_factory, ssl, server_hostname)
TypeError: '_SelectorSocketTransport' object is not iterable

Ran 16 tests in 0.337s

FAILED (errors=1)

Finally, I've tested to launch the aiohttp server example: https://github.com/KeepSafe/aiohttp/blob/master/examples/srv.py
The server starts, but when I send an HTTP request, I get a strange traceback:
Error handling request
Traceback (most recent call last):
  File "/home/lg/Documents/IDEA/pypy/site-packages/aiohttp/server.py", line 240, in start
    yield from handler
  File "/home/lg/tmp/asyncio_examples/7_aiohttp_server.py", line 25, in handle_request
    message.method, message.path, message.version))
AttributeError: 'str' object has no attribute 'method'

After debugging a little with pdb, I see that aiohttp.HttpRequestParser parses the request correctly, but "message = yield from httpstream.read()" at line 226 of aiohttp.server returns the string "GET" instead of a RawRequestMessage object.

I'm a total newbie with PyPy internals; implementing a monotonic timer seems over-complicated to me.
But I could maybe help with tests and fix small issues like the aiohttp.server one, if I get a clue how to fix it.
Would you be interested in me creating issues on the PyPy tracker for each problem, or would I only be adding noise?

By the way, to help the PyPy project with more than just my brain time, I've set up a small recurring donation for PyPy 3.

Ludovic Gasc (GMLudo)
Timothy Baldridge | 21 Mar 01:43 2015

Specialization for app level types

I'd like to add some optimizations for app-level types in Pixie. What I'm thinking of is something like this (in app-level PyPy code):

class Foo(object):
    def __init__(self, some_val):
        self._some_val = some_val

    def set_value(self, val):
        self._some_val = val

In a perfect world the JIT should be able to recognize that ._some_val is only ever an int and therefore store it unboxed in the instance; hopefully this would decrease pressure on the GC if ._some_val is modified often. Also in a perfect world, the value of _some_val should be auto-promoted to an object if someone ever sets it to something besides an int.

How would I go about coding this up in RPython? I can't seem to figure out a way to do this without bloating each instance of the type with three arrays: one of objects, one of ints, and one of floats.

Currently app-level objects in Pixie are just a wrapper around an object array. The type then holds the name->slot_idx lookups.
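For illustration, the promote-on-write policy can be modeled in plain Python (just the idea, not how Pixie or PyPy actually implements it; in RPython one would presumably combine this with type erasure, e.g. rlib.rerased, to avoid the two fields per slot):

```python
class IntOrObjectSlot(object):
    """Toy model of one instance slot: keeps the value unboxed while
    every write is an int, and promotes (one way) to generic object
    storage on the first non-int write."""

    def __init__(self, value):
        self._int_val = 0
        self._obj_val = None
        self._unboxed = isinstance(value, int)
        self.write(value)

    def write(self, value):
        if self._unboxed and isinstance(value, int):
            self._int_val = value      # fast path: stay unboxed
        else:
            self._unboxed = False      # one-way promotion to boxed
            self._obj_val = value

    def read(self):
        return self._int_val if self._unboxed else self._obj_val
```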

Thanks in advance. 

Timothy Baldridge

John Zhang | 18 Mar 23:42 2015

Re: Compiling PyPy interpreter without GC

Hi Carl,
	Great! It worked!
	So the option disables all modules, and IO as well?
	John Zhang
> On 19 Mar 2015, at 4:18 am, Carl Friedrich Bolz <cfbolz <at> gmx.de> wrote:
> On 18/03/15 01:01, John Zhang wrote:
>> Hi all,
>>     I'm working on developing a MicroVM backend for PyPy. It's a
>> virtual machine under active research and development by my colleagues
>> in ANU. It aims to capture GC, threading and JIT in the virtual machine,
>> and frees up the burden of the language implementers.
>>     Since MicroVM provides GC, I need to remove GC from the PyPy
>> interpreter. As I was trying to compile it with the following command:
>>     pypy $PYPY/rpython/bin/rpython \
>>           -O0 \
>>           --gc=none \
>>           --no-translation-rweakref \
>>           --annotate \
>>           --rtype \
>>           --translation-backendopt-none \
>>           $PYPY/pypy/goal/targetpypystandalone.py
> Hey John,
> Try the following:
> rpython -O0 --gc=none --no-translation-rweakref --annotate --rtype
> --translation-backendopt-none targetpypystandalone.py --no-allworkingmodules --withoutmod-_io
> Cheers,
> Carl
黄若尘 | 18 Mar 16:49 2015

Some summary and questions about the 'small function' problem

Hi Fijal, 

   This is Ruochen Huang. I want to begin writing my proposal, and I think there is actually not much time left. I have tried to summarize what I have understood so far and the questions I still have. Please feel free to point out anything incorrect in my summary; as for the questions, if you think a question is meaningless you can just skip it, or provide a document link or source-code path if you think it would take too much time to explain.

   As far as I understand:
  1. The 'small function' problem occurs when one trace tries to call another trace. At the source level, this is the situation where, inside one loop, there is a call to a function which contains another loop.
  2. Take the small example we discussed before: function g() calls function f(a,b,c,d) in a big loop, and there is another loop inside f(a,b,c,d). In the current version of PyPy, two traces are generated:
    1. The trace for the loop in g(), call it T1. g() tries to inline f(a,b,c,d), but since there is a loop in f, T1 inlines only the first iteration of that loop. Say f is split into f1 (the first iteration) and f' (the remaining iterations); then what T1 does is: start the loop in g -> do f1 -> do some allocations of a PyFrame (in order to call f') -> call_assembler for f'.
    2. The trace for the loop in f', call it T2. T2 first unpacks the PyFrame prepared by T1, then does the preamble work, which means f' is again split into f2 (the first iteration of f', i.e. the 2nd iteration of the original f) and f'' (the 3rd iteration onwards). f2 and f'' each have a label at their head, so T2 consists of 3 parts: T2a (PyFrame unpacking), T2b (with label1, doing f2), T2c (with label2, doing f'').
  3. As above, we have T1 -> T2a -> T2b -> T2c. From the viewpoint of the loop in f, f is distributed as T1(f1) -> T2a -> T2b(f2) -> T2c(f''), which means the loop in f was peeled twice, so T2b might not be needed. Furthermore, the PyFrame work before call_assembler in T1 and the unpacking work in T2a are a waste. I can't fully understand why it's a waste, but I guess it's because T2c(f'') does something similar to f1 in T1 (or: T2c is already *inside* the loop). In any case, T2b is not needed, so we want T1 -> T2c; and since the PyFrame work in T2a is eliminated, the PyFrame allocation in T1 can also be eliminated. So ideally we want T1' (without PyFrame allocation) -> T2c.
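The shape under discussion can be illustrated with a minimal, made-up example (plain Python here; the function names and bodies are hypothetical, only the loop nesting matters):

```python
def f(a, b, c, d):
    # inner loop: traced separately (the T2 trace) and entered from
    # the outer trace via call_assembler
    total = 0
    for i in range(a, b):
        total += i * c + d
    return total

def g(n):
    # outer loop: trace T1 inlines only f's first iteration, then
    # allocates a frame and calls the assembler code for the rest
    acc = 0
    for i in range(n):
        acc += f(0, i, 2, 3)
    return acc
```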

Some questions until now:
  1. What's the bridge you mentioned? To be honest I have only a very slight understanding of bridges. I know one is executed when some guard fails, but as far as I knew, in a normal tracing JIT only one path of a loop is traced, and any guard failure makes execution escape from the native code back to the VM. I guess a bridge is a special kind of trace (native code); is that right?
  2. Could you explain more about why T2b is not needed? I guess the answer may be related to the "virtualizable" optimization for PyFrame, so what if PyFrame were not virtualizable? In that situation, does the problem disappear, or become easier to solve?
  3. What are the difficulties in solving this problem? I'm sorry I'm not so familiar with the details of the RPython JIT, but in my opinion we just need to make the JIT know that:
    1. when it tries to inline a function and encounters a loop, so the inlining has to stop, it's time to do optimization O;
    2. what O does is delete the allocation instructions for the PyFrame before call_assembler, and then tell call_assembler to jump to the 2nd label of the target trace (in our example, T2c).
       So it may not be so difficult to solve.

Best Regards,
Ruochen Huang
