William Pietri | 1 Mar 2010 04:19

Re: hiphop! :)

On 02/28/2010 01:33 PM, Domas Mituzas wrote:
>
> not 10x. I did concurrent benchmarks for API requests (e.g. opensearch) on modern boxes, and saw:
>
> HipHop: Requests per second:    1975.39 [#/sec] (mean)
> Zend: Requests per second:    371.29 [#/sec] (mean)
>
> these numbers seriously kick ass. I still can't believe I observe 2000 mediawiki requests/s from a single box ;-)
>    

Bravo! That's fantastic. Thanks for both the work and the testing.

William
Ævar Arnfjörð Bjarmason | 1 Mar 2010 05:49

Re: hiphop! :)

On Sun, Feb 28, 2010 at 21:33, Domas Mituzas <midom.lists <at> gmail.com> wrote:
>>
>> Nevertheless - a process isn't the same process when it's going at 10x
>> the speed. This'll be interesting.
>
> not 10x. I did concurrent benchmarks for API requests (e.g. opensearch) on modern boxes, and saw:
>
> HipHop: Requests per second:    1975.39 [#/sec] (mean)
> Zend: Requests per second:    371.29 [#/sec] (mean)
>
> these numbers seriously kick ass. I still can't believe I observe 2000 mediawiki requests/s from a single box ;-)

Awesome. I did some tryouts with hiphop too before you started overtaking me.

Is this work on SVN yet? Maybe it would be nice to create a branch for
it so that other people can poke it?
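
For scale, the two figures quoted above work out to a bit over 5x rather than 10x. A minimal PHP sketch of that arithmetic follows; the variable names are illustrative and not from the thread, and only the two requests-per-second values come from the benchmark as posted:

```php
<?php
// Mean requests per second as quoted in the thread.
$hiphopRps = 1975.39;
$zendRps   = 371.29;

// Throughput ratio: how many times more requests per second HipHop served.
$speedup = $hiphopRps / $zendRps;

printf("HipHop/Zend throughput ratio: %.2fx\n", $speedup); // prints 5.32x
```

This compares throughput on one request type only (the API requests mentioned in the quote), so it says nothing about other code paths.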

_______________________________________________
Wikitech-l mailing list
Wikitech-l <at> lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Ævar Arnfjörð Bjarmason | 1 Mar 2010 05:58

Re: hiphop! :)

On Sun, Feb 28, 2010 at 21:39, David Gerard <dgerard <at> gmail.com> wrote:
> On 28 February 2010 21:33, Domas Mituzas <midom.lists <at> gmail.com> wrote:
>
>> these numbers seriously kick ass. I still can't believe I observe 2000 mediawiki requests/s from a single box ;-)
>
>
> So ... how restricted is HipHop PHP, and what are the hotspots in
> MediaWiki that would most benefit from it?

Most of the code in MediaWiki works just fine with it (since most of
it is mundane) but things like dynamically including certain files,
declaring classes, eval() and so on are all out.

It should be possible to replace all that at the cost of code that's a
bit more verbose.

Even if it weren't, hotspots like the parser could still be compiled
with hiphop and turned into a PECL extension.

One other nice thing about hiphop is that the compiler output is
relatively readable compared to most compilers. Meaning that if you
need to optimize some particular function it's easy to take the
generated .cpp output and replace the generated code with something
more native to C++ that doesn't lose speed because it needs to
manipulate everything as a php object.
Domas Mituzas | 1 Mar 2010 11:10

hiphop progress

Howdy,

> Most of the code in MediaWiki works just fine with it (since most of
> it is mundane) but things like dynamically including certain files,
> declaring classes, eval() and so on are all out.

There're two types of includes in MediaWiki, ones I fixed for AutoLoader and ones I didn't - HPHP has all
classes loaded, so AutoLoader is redundant. 
Generally, every include that just defines classes/functions is fine with HPHP, it is just some of
MediaWiki's startup logic (Setup/WebStart) that depends on files included in certain order, so we have
to make sure HipHop understands those includes.
There was some different behavior with file including - in Zend you can say require("File.php"), and it
will try current script's directory, but if you do require("../File.php") - it will 

We don't have any eval() at the moment, and actually there's a mode when eval() works, people are just scared
too much of it. 
We had some double class definitions (depending on whether certain components are available), as well as
double function definitions ( ProfilerStub vs Profiler )
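
As a rough illustration of the double-definition pattern (this is not MediaWiki's actual code; every name below is hypothetical), the first half picks one of two function bodies at runtime, while the second half keeps a single definition and moves the decision inside it, which is the kind of slightly more verbose rewrite a static compiler is more likely to accept:

```php
<?php
// Hypothetical flag standing in for "is the real profiler configured?".
$useRealProfiler = false;

// Dynamic style: which definition exists is only known at runtime.
if ($useRealProfiler) {
    function demoProfileIn($name) { return "profiling $name"; }
} else {
    function demoProfileIn($name) { return null; } // stub variant
}

// Static-friendly style: one definition, the branch lives in the body.
function demoProfileInStatic($name, $useRealProfiler) {
    return $useRealProfiler ? "profiling $name" : null;
}

// Both styles give the same result for the same flag.
var_dump(demoProfileIn('parse') === demoProfileInStatic('parse', $useRealProfiler));
```

Whether HipHop needs exactly this shape is an assumption; the thread only says that the double definitions had to be dealt with.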

One of the major problems is simply that the function set we'd need is still incomplete:

* session - though we could surely work around it by setting up our own Session abstraction, the team at Facebook is already busy implementing full support
* xdiff, mhash - the only two calls to them are from DiffHistoryBlob, so getting the feature to work is mandatory for production but not needed for testing :)
* tidy - have to call the binary now

function_exists() is somewhat crippled, as far as I understand, so I had to work around certain issues there.
There're some other crippled functions, which we hit through the testing... 

[...]

Tei | 1 Mar 2010 14:11

Re: hiphop progress

Looks like a lot of fun :-)

On 1 March 2010 11:10, Domas Mituzas <midom.lists <at> gmail.com> wrote:
...
>> Even if it weren't, hotspots like the parser could still be compiled
>> with hiphop and turned into a PECL extension.
>
> hiphop provides major boost for actual mediawiki initialization too - while Zend has to reinitialize objects and data all the time, having all that in core process image is quite efficient.
>
>> One other nice thing about hiphop is that the compiler output is
>> relatively readable compared to most compilers. Meaning that if you
>
> That especially helps with debugging :)
>
>> need to optimize some particular function it's easy to take the
>> generated .cpp output and replace the generated code with something
>> more native to C++ that doesn't lose speed because it needs to
>> manipulate everything as a php object.
>
> Well, that is not entirely true - if it manipulated everything as PHP object (zval), it would be as slow and inefficient as PHP. The major cost benefit here is that it does strict type inference, and falls back to Variant only when it cannot come up with decent type.
> And yes, one can find offending code that causes the expensive paths. I don't see manual C++ code optimizations as way to go though - because they'd be overwritten by next code build.
>

This smells like something that could benefit from metadata.

/* [return  integer] */  function getApparatusId($obj){
[...]

Ævar Arnfjörð Bjarmason | 1 Mar 2010 14:34

Re: hiphop progress

On Mon, Mar 1, 2010 at 10:10, Domas Mituzas <midom.lists <at> gmail.com> wrote:
> Howdy,
>
>> Most of the code in MediaWiki works just fine with it (since most of
>> it is mundane) but things like dynamically including certain files,
>> declaring classes, eval() and so on are all out.
>
> There're two types of includes in MediaWiki, ones I fixed for AutoLoader and ones I didn't - HPHP has all classes loaded, so AutoLoader is redundant.
> Generally, every include that just defines classes/functions is fine with HPHP, it is just some of MediaWiki's startup logic (Setup/WebStart) that depends on files included in certain order, so we have to make sure HipHop understands those includes.
> There was some different behavior with file including - in Zend you can say require("File.php"), and it will try current script's directory, but if you do require("../File.php") - it will
>
> We don't have any eval() at the moment, and actually there's a mode when eval() works, people are just scared too much of it.
> We had some double class definitions (depending on whether certain components are available), as well as double function definitions ( ProfilerStub vs Profiler )
>
> One of the major problems is simply that the function set we'd need is still incomplete:
>
> * session - though we could surely work around it by setting up our own Session abstraction, the team at Facebook is already busy implementing full support
> * xdiff, mhash - the only two calls to them are from DiffHistoryBlob, so getting the feature to work is mandatory for production but not needed for testing :)
> * tidy - have to call the binary now
>
> function_exists() is somewhat crippled, as far as I understand, so I had to work around certain issues there.
> There're some other crippled functions, which we hit through the testing...
[...]

Domas Mituzas | 1 Mar 2010 14:35

Re: hiphop progress

Howdy,

> Looks like a lot of fun :-)

Fun enough to spend my evenings and weekends on it :)

> This smells like something that could benefit from metadata.
> /* [return  integer] */  function getApparatusId($obj){
>  //body
> }

Indeed - type hints can be quite useful, though hiphop is smart enough to figure out it will be an integer
return from code :)
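
A small sketch of the distinction being discussed, using the function name from the quoted snippet with a made-up body (the second function and the array shape are also hypothetical). In the first function every return path yields an integer, which is the sort of case where a return type can be worked out from the code alone; in the second the result depends on runtime data, the sort of case where a generic Variant-style value would be the fallback. How HipHop actually classifies either function is an assumption here:

```php
<?php
// Hypothetical body: every return path is an integer.
function getApparatusId($obj) {
    return (int) $obj['id'];
}

// Hypothetical counter-example: may return a string or an integer,
// so no single concrete return type fits.
function getApparatusLabel($obj) {
    if (isset($obj['name'])) {
        return $obj['name'];
    }
    return (int) $obj['id'];
}

var_dump(getApparatusId(array('id' => '42')));
var_dump(getApparatusLabel(array('id' => '42', 'name' => 'pump')));
```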

It is quite interesting to see the enhancements to PHP that were developed inside Facebook and have now all been released - XHP evolves PHP syntax to fit the web world (
http://www.facebook.com/notes/facebook-engineering/xhp-a-new-way-to-write-php/294003943919
), the XBOX thing allows background/async execution of work without standing in the way of page rendering, etc.

> What can we expect? Will future versions of MediaWiki be "hiphop
> compatible"? Will there be a compatible fork or snapshot? The whole
> experiment looks like it will help to profile and enhance the engine;
> will it generate a MediaWiki.tar.gz file that we (the users) will be
> able to install on our intranets?

Well, the build itself is quite portable (you'd have to have single binary and LocalSettings.php ;-) 

Still, the decision to merge certain changes into MediaWiki codebase (e.g. relative includes, rather
than $IP-based absolute ones) would be quite invasive. 
[...]

Ævar Arnfjörð Bjarmason | 1 Mar 2010 14:59

Re: hiphop progress

On Mon, Mar 1, 2010 at 13:35, Domas Mituzas <midom.lists <at> gmail.com> wrote:
> Still, the decision to merge certain changes into MediaWiki codebase (e.g. relative includes, rather than $IP-based absolute ones) would be quite invasive.
> Also, we'd have to enforce stricter policy on how some of the dynamic PHP features are used.

I might be revealing my lack of knowledge about PHP here but why is
that invasive and why do we use $IP in includes in the first place? I
did some tests here:

    http://gist.github.com/310380

Which show that as long as you set_include_path() with $IP/includes/
at the front PHP will make exactly the same stat(), read() etc. calls
with relative paths that it does with absolute paths.

Maybe that's only on recent versions, I tested on php 5.2.
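
A minimal, self-contained sketch of the include_path approach described above. It uses a throwaway temp directory in place of a real MediaWiki install, and the file name Hello.php is hypothetical. It only shows that the relative and the $IP-based absolute form load the same file; it does not measure the stat()/read() calls from the gist:

```php
<?php
// Throwaway directory standing in for MediaWiki's $IP (hypothetical layout).
$IP = sys_get_temp_dir() . '/' . uniqid('ip_demo_');
mkdir("$IP/includes", 0777, true);
file_put_contents("$IP/includes/Hello.php", "<?php return 'hello from includes';\n");

// Put $IP/includes at the front of the include path.
set_include_path("$IP/includes" . PATH_SEPARATOR . get_include_path());

$viaRelative = require 'Hello.php';               // resolved through include_path
$viaAbsolute = require "$IP/includes/Hello.php";  // $IP-based absolute include

var_dump($viaRelative === $viaAbsolute);

// Clean up the temporary files.
unlink("$IP/includes/Hello.php");
rmdir("$IP/includes");
rmdir($IP);
```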
Marco Schuster | 1 Mar 2010 15:22

Re: hiphop progress

The point of $IP is that you can use multisite environments by just
having index.php and LocalSettings.php (and skin crap) in the
per-vhost directory, and have extensions and other stuff centralized
so you can update the extension once and all the wikis automatically
have it.
However, the Installer could be patched, to resolve $IP automatically
if the user wishes to run a HipHop environment.

Marco

On Mon, Mar 1, 2010 at 2:59 PM, Ævar Arnfjörð Bjarmason
<avarab <at> gmail.com> wrote:
> On Mon, Mar 1, 2010 at 13:35, Domas Mituzas <midom.lists <at> gmail.com> wrote:
>> Still, the decision to merge certain changes into MediaWiki codebase (e.g. relative includes, rather than $IP-based absolute ones) would be quite invasive.
>> Also, we'd have to enforce stricter policy on how some of the dynamic PHP features are used.
>
> I might be revealing my lack of knowledge about PHP here but why is
> that invasive and why do we use $IP in includes in the first place? I
> did some tests here:
>
>    http://gist.github.com/310380
>
> Which show that as long as you set_include_path() with $IP/includes/
> at the front PHP will make exactly the same stat(), read() etc. calls
> with relative paths that it does with absolute paths.
>
> Maybe that's only on recent versions, I tested on php 5.2.

Daniel Kinzler | 1 Mar 2010 15:26

Re: hiphop progress

Marco Schuster schrieb:
> The point of $IP is that you can use multisite environments by just
> having index.php and LocalSettings.php (and skin crap) in the
> per-vhost directory, and have extensions and other stuff centralized
> so you can update the extension once and all the wikis automatically
> have it.

That's a silly multi-host setup. Much easier to have a single copy of
everything, and just use conditionals in LocalSettings.php, based on hostname or path.

-- daniel
