HHVM Q&A with Paul Tarjan, Facebook Engineer

As announced in our previous post, Box is currently deploying HHVM to our webapp, working closely with the HHVM team at Facebook. Paul Tarjan, a member of the Facebook HHVM team, offers his own insights into HHVM, Hack and a peek into the future.

What initially inspired you to develop HHVM? Tell us about the breaking point that led to its development.

We were using HPHPc (the PHP to C++ transpiler) for many years, but that type of architecture was an enormous pain to maintain. We also had a totally different runtime that engineers used in their sandboxes (hphpi) that was just slightly incompatible. So people would, time and time again, introduce bugs where the code did one thing on a development machine and another in production.

[This article] has some more background.

Obviously Facebook operates at a massive scale. How did you architect HHVM to support something that large from day one? Can you tell us about the rollout and deployment?

Thankfully our web servers don't share state with each other, so all you have to do to scale is buy more machines.

Our initial rollout was very straightforward when converting from HPHPc to HHVM. Put it on a few machines, measure everything (error rates, data fetches, HTML bytes sent, etc.), and make sure the new HHVM is running the site just like the old HPHPc. And most of all, measure performance (CPU cycles per request). Once those looked good, we increased the rollout size a few times, checking the metrics again each time. Once that was good, ship it to the whole fleet.
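The staged rollout described above can be sketched roughly as follows. This is only an illustration, not Facebook's tooling: the function names, the metric names, the stage fractions and the 2% tolerance are all assumptions. It checks only that the new runtime behaves like the old one; the performance comparison (CPU cycles per request) would be a separate check.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]


def metrics_match(baseline: Metrics, candidate: Metrics, tolerance: float = 0.02) -> bool:
    """Return True if every baseline metric is matched within a relative tolerance."""
    for name, expected in baseline.items():
        actual = candidate.get(name)
        if actual is None:
            return False  # the candidate is missing a metric we care about
        # Relative difference; fall back to an absolute difference when expected is 0.
        scale = abs(expected) if expected != 0 else 1.0
        if abs(actual - expected) / scale > tolerance:
            return False
    return True


def staged_rollout(
    stages: List[float],
    baseline: Metrics,
    measure: Callable[[float], Metrics],
) -> float:
    """Walk through rollout fractions, stopping at the first stage whose metrics diverge.

    Returns the largest fraction that passed the check (0.0 if none did).
    """
    reached = 0.0
    for fraction in stages:
        if not metrics_match(baseline, measure(fraction)):
            break
        reached = fraction
    return reached


if __name__ == "__main__":
    # Hypothetical numbers for the old runtime; the new one is pretended identical.
    old_runtime = {"error_rate": 0.010, "data_fetches": 120.0, "html_bytes": 48000.0}
    print(staged_rollout([0.01, 0.10, 0.50, 1.0], old_runtime, lambda f: dict(old_runtime)))
```

In a real deployment `measure` would query a monitoring system for the machines already on the new runtime; here it is just a callable so the logic can be exercised on its own.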

As for updating Facebook.com each day, we take the PHP code, run a pre-compile step from PHP to bytecode, and then send the bytecode down to our fleet using BitTorrent. Once a machine receives the new version of the site, it tells the load balancer in front of it not to send it any requests for a few minutes, kills its HHVM process, restarts the process pointing it at the new bytecode, warms up the server so the code layout in memory is good, then tells the load balancer to start sending traffic again.
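The per-machine restart sequence can be pictured with a small sketch like the one below. It is a toy model under stated assumptions: `FakeLoadBalancer` and `deploy_to_host` are hypothetical names, and the stop, start and warm-up steps are only recorded as strings rather than acting on a real HHVM process.

```python
from typing import List, Set


class FakeLoadBalancer:
    """Stand-in for a real load balancer; it only tracks which hosts are drained."""

    def __init__(self) -> None:
        self.drained: Set[str] = set()

    def drain(self, host: str) -> None:
        self.drained.add(host)

    def enable(self, host: str) -> None:
        self.drained.discard(host)


def deploy_to_host(host: str, bytecode_version: str, lb: FakeLoadBalancer) -> List[str]:
    """Run the restart sequence for one host and return the steps in order."""
    steps: List[str] = []

    # 1. Stop new requests from reaching this host.
    lb.drain(host)
    steps.append("drain")

    # 2. Stop the old server process (placeholder: nothing is actually killed).
    steps.append("stop")

    # 3. Start a new process pointed at the freshly received bytecode.
    steps.append(f"start:{bytecode_version}")

    # 4. Warm up the server before it takes real traffic.
    steps.append("warm_up")

    # 5. Put the host back into rotation.
    lb.enable(host)
    steps.append("enable")
    return steps


if __name__ == "__main__":
    balancer = FakeLoadBalancer()
    print(deploy_to_host("web001", "v2", balancer))
```

The point of the ordering is that the host takes no traffic between the drain and enable steps, so users never hit a process that is stopping or still cold.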

How did knowing the project would be open-sourced affect the way it was designed and architected? How has the OSS community affected ongoing development of the project?

It thankfully didn't affect it much. We built an extension framework similar to PHP's so the only Facebook-specific thing we have is some custom extensions that talk to our various internal services.

As for development, the major change has been openness. Our design decisions happen in a public Facebook group. We publish our roadmaps. We build changelogs and have regular releases. We test open source frameworks for HHVM compatibility. And most of all we care about GitHub issues. Right now we're focused on fixing as many issues for the community as we can.

We also receive many contributions from the community (~700 pull requests last year alone), which we happily incorporate back. It helps us at Facebook (engineers here no longer run into those bugs) and everyone else who uses HHVM.

What's on your wishlist for future development for HHVM?

Our main priority right now is closing GitHub issues. There are many good feature requests in there that we just haven't had time to do yet. Another major project is opening up our code review process so the world can watch us bicker back and forth while we code instead of just seeing the final product when it launches. We're also working on a performance testing suite so we have a good baseline to compare to PHP5, and we're building many extensions to take advantage of our asynchronous data fetching semantics.

HHVM works with both PHP and your own language, Hack. How does the transition from PHP to Hack affect day-to-day development for teams at Facebook and what are some tips for other developers making a similar transition in their code base?

PHP developers pick up Hack with ease, since it is basically a dialect of PHP. You have a few extra type annotations, some more APIs, and a few awful language features are banned. Other than that it feels the same. The beauty comes when you run the type checker and it finds a bug in your code without even having to run it. It saves you hours and hours of painful debugging to work out why a row is null in your database, just by finding the right point in your code where you should have checked the input for null.
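To make the null-check point concrete, here is an analogous sketch in Python with type hints rather than Hack itself (Hack would express the nullable return type as `?string`). The `USERS` table and both function names are made up for illustration.

```python
from typing import Dict, Optional

# Hypothetical in-memory "table" standing in for a database.
USERS: Dict[int, str] = {1: "alice"}


def find_user_name(user_id: int) -> Optional[str]:
    """Return the user's name, or None when no such row exists."""
    return USERS.get(user_id)


def greeting(user_id: int) -> str:
    name = find_user_name(user_id)
    # Without this check, a static checker such as mypy reports an error on
    # name.title(), because name may be None at that point.
    if name is None:
        return "Hello, guest"
    return f"Hello, {name.title()}"


if __name__ == "__main__":
    print(greeting(1))
    print(greeting(2))
```

The annotation on `find_user_name` is what lets the checker point at the exact place where the missing-row case has to be handled, before the code ever runs.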

We've documented how to convert a codebase to Hack here. This is the same process we used ourselves, and our codebase is now 98% Hack.

Another tool we released is for projects that want to use Hack but still want their code to be runnable on the PHP5 runtime. With this you can get the productivity improvements of Hack while not forcing all your users to switch to HHVM.