HipHop PHP - Nifty Trick?
In a response to a question from ReadWriteWeb, among other things, I wrote:
My main worry here is that people think this is some kind of magic bullet that will solve their site performance problems. Generating C++ code from PHP code is a nifty trick and people seem to have gotten quite excited about it. I'd love to see those same people get excited about basic profiling and identifying the most costly areas of an application. Speeding up one of the faster parts of your system isn't going to give you anywhere near as much of a benefit as speeding up, or eliminating, one of the slower parts of your overall system.
The "nifty trick" part of that seems to have become the story, and them injecting a "just" in front of it makes it sound more derogatory. Anyone who knows me knows that I am a big fan of nifty tricks that solve the problem. When I first heard about the Facebook effort I assumed they were writing a JIT based on LLVM, V8, or something along those lines. Writing a good JIT is hard. Doing static code analysis and generating compilable C++ from it is indeed a nifty trick. It's not "just" a nifty trick; it is a cool trick that takes advantage of a number of characteristics of PHP, the main one being that you can't overload PHP functions. strlen() is always strlen(), for example. In Python, this would be harder because you can overload everything.
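To make the strlen() point concrete, here is a minimal sketch of the property being described. It only shows that a global-namespace call to strlen() has one possible target; it says nothing about how HipHop itself is implemented, and the sample string is arbitrary.

```php
<?php
// strlen() is supplied by the engine itself, not by user code.
$fn = new ReflectionFunction('strlen');
var_dump($fn->isInternal()); // bool(true)

// User code cannot replace it in the global namespace. Uncommenting the
// next line stops the script with a "Cannot redeclare" fatal error:
// function strlen($s) { return 42; }

// So a static analyzer can resolve this call to one known function
// without running the program.
echo strlen('HipHop'), "\n"; // prints 6
```

A compiler that knows the target of every such call ahead of time can emit a direct C++ call for it, which is the kind of characteristic the static approach relies on.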
I also noted that most sites on the Web have a lot of lower hanging fruit that would provide a much bigger performance improvement, if fixed, than doubling the speed of the PHP execution phase. The ReadWriteWeb site, for example, needs 160 separate HTTP requests and 41 distinct DNS lookups to load the front page. And once you get beyond the frontend inefficiencies you usually find Database issues, inefficient system call issues and general architecture problems that again aren't solved by speeding up PHP execution.
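The comparison with "doubling the speed of the PHP execution phase" can be put into rough numbers with Amdahl's law. The sketch below uses made-up fractions (10% of page load time spent executing PHP, 60% spent on frontend requests); they are assumptions for illustration, not measurements of any real site, and overall_speedup() is a hypothetical helper name.

```php
<?php
// Overall speedup when only one part of the total time gets faster
// (Amdahl's law).
//   $fraction: share of total time spent in the part being sped up (0..1)
//   $factor:   how many times faster that part becomes
function overall_speedup($fraction, $factor)
{
    return 1.0 / ((1.0 - $fraction) + $fraction / $factor);
}

// Hypothetical page: 10% of load time in PHP execution, 60% in frontend requests.
printf("PHP executor 2x faster: %.2fx overall\n", overall_speedup(0.10, 2.0)); // 1.05x
printf("Frontend time halved:   %.2fx overall\n", overall_speedup(0.60, 2.0)); // 1.43x
```

Under these assumed fractions, doubling PHP execution speed improves the whole page load by about 5%, while halving the frontend time improves it by about 43%.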
If you have done your homework and find that your web servers are cpu-bound, you are already using an opcode cache like APC and your Callgrind callgraph shows you that the PHP executor is a significant bottleneck, then HipHop PHP is definitely something you should be looking at.
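As a very rough first check on the "cpu-bound" question, before reaching for a real profiler, you can compare wall-clock time with CPU time for a piece of work. This is only a sketch: cpu_seconds() and measure() are hypothetical helper names, the two workloads are stand-ins, and it assumes a platform where getrusage() is available. It is no substitute for a Callgrind callgraph.

```php
<?php
// CPU seconds (user + system) consumed by this process so far.
function cpu_seconds()
{
    $u = getrusage();
    return $u['ru_utime.tv_sec'] + $u['ru_utime.tv_usec'] / 1e6
         + $u['ru_stime.tv_sec'] + $u['ru_stime.tv_usec'] / 1e6;
}

// Runs $work and returns wall-clock time and CPU time in seconds.
function measure($work)
{
    $wall0 = microtime(true);
    $cpu0  = cpu_seconds();
    $work();
    return array(
        'wall' => microtime(true) - $wall0,
        'cpu'  => cpu_seconds() - $cpu0,
    );
}

// Stand-in workloads: one mostly waits (like a slow query), one computes.
$waiting = measure(function () { usleep(200000); });
$compute = measure(function () {
    for ($i = 0, $x = 0; $i < 2000000; $i++) {
        $x += $i % 7;
    }
});

printf("waiting: wall %.3fs, cpu %.3fs\n", $waiting['wall'], $waiting['cpu']);
printf("compute: wall %.3fs, cpu %.3fs\n", $compute['wall'], $compute['cpu']);
```

When wall time is much larger than CPU time, the request is mostly waiting on something else (database, network, disk), and a faster PHP executor will not help much.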
Comments
Lenin on :
PHP Gangsta on :
Do you think it will be difficult to make HipHop PHP 5.3-ready? Are namespaces etc. problematic?
Rasmus on :
Bertrand on :
Martin on :
Jay R. on :
Mauro on :
Rasmus on :
Sam Shull on :
function x ($a, $b)
{
    return $a * $b;
}
might output an extension skeleton that contained a function like
PHP_FUNCTION(x)
{
    /*I have no idea on how to implement this in the Zend API, but you get the idea*/
    long a, b;

    /* "ll" = two integer arguments, stored in a and b */
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ll", &a, &b) == FAILURE) {
        return;
    }

    /* set the PHP return value to the product and return */
    RETURN_LONG(a * b);
}
or something to that extent.
I think it would be useful, even if it had to require type hinting and return type hinting. But even that could be overcome with the zval struct, couldn't it? Have you heard of anyone trying to do this type of code transformation? I feel like it would make these really complex frameworks so much simpler to implement IMHO.
(Couldn't figure out how to prevent ** from being parsed as bold, sorry)
Rasmus on :
Sam Shull on :
Nifty Future Tips on :
Dan Beam on :
For instance, setting long Expires headers (on the advice of YSlow, from your old employer) was not worth it when I did it. Nobody I worked with understood why files weren't refreshing without changing the URL. Since then I have seen many people not enthralled with the same procedure.
This said, I haven't yet been in a situation where CPU optimization was necessary. If the situation ever arose, though, I would likely first consider APC or another opcode caching system before considering the (relatively drastic) step of compiling to C++ (the only difference is JIT -> binary, though this can be a big one sometimes).
My worry with HipHop is that small shops or developers will pick it up because they think it will solve any scalability problems. But what it may do in reality is just be more confusing, more complex to release / maintain (someone might even have to know C++, =O), and more prone to error. This would be especially angering if their weakest link is more likely to be an unshielded RDBMS or the algorithms themselves.
"Premature optimization is the root of all evil" - Knuth