‘We had to flee’: Thuan Pham discusses his journey to the United States.
As a child, Thuan Pham never imagined he’d become a top executive at one of the world’s most valuable companies. He was more worried about survival.
In late 1979, as a boy of just 11, Pham escaped Vietnam on a crowded and rickety boat with his mother and brother. They encountered pirates and unfriendly authorities in Malaysia and Indonesia before arriving in the United States as refugees in March 1980.
Pham attended MIT, earning a bachelor’s and master’s in computer science and engineering. He worked as an engineer at Hewlett-Packard and Silicon Graphics. He held more senior roles at the large software firm VMware and at NetGravity (a startup that was acquired by the ad-tech giant DoubleClick).
In April 2013, Pham joined Uber. At the time, the entire company had about 200 people, with only 40 engineers, and was providing about 30,000 rides daily. By mid-2017, the company had 12,000 employees and had completed 5 billion rides in total. In his first five years at Uber, Pham felt like he had been through three tours of duty.
The first tour of duty was roughly about my first couple of years where we were just fighting for our survival as a business, as a technology platform. The next two years after that was massive expansion of the business, dealing with all the challenges of hyper-growth such as scaling the technology platform, growing the organization, wrestling with technical debt and organizational debt—all with very little time and margin for errors. And then, the third tour of duty, which was about a year or so short--that was about dealing with the crisis moment of the company.
--Thuan Pham
Fighting for Survival: 2013
Uber had offered its first ride in San Francisco in 2010, and in three years had expanded to a few dozen cities. With about 30,000 rides a day happening on the platform when he joined in April 2013 — but volume projected to grow 10x to 300,000 rides daily by the end of that year — Pham recalled 2013 as a frantic year.
“Essentially, the way I would phrase it is, it was like cheating death,” he recalled. “Just a survival instinct and a survival fight. In order to keep up that fight for survival, you have to predict what’s going to kill you next.”
Pham initially set to work understanding the core dispatch system, which matched riders and drivers. Improving its reliability was job #1 because without a dispatch system working 24/7, there was no Uber business. He recalled:
We had four kids on the dispatch engineering team, all in their early 20s. In the first design review, I asked them, “How did you build this? What is the architecture of this thing?” They didn’t even have an architecture. They had just cobbled a few things together. Then I asked them questions like, “How many machines do you run to power these cities?” They said, “Well, 30, 40 machines.” I said, “Okay, what will happen if I go to a data center and pull the power plug on one of those machines?” They answered, “Oh, of course, whatever city that that machine was powering would go down with it.” Then I said, “Well, that’s not acceptable, is it?”
Next, I asked the question, “What is the biggest city that we have right now, and what’s the trip volume per day for that city?” They said, “New York City.” I then asked, “Well, what kind of box is it running on? What if that city grows larger? Then how do you scale that?” The engineering team said to me, “Oh, of course, we have to move whatever we run for New York City onto a bigger and a faster box.” I entertained them a little bit, and asked another question … “At what point will New York City be at such a scale that you will even outgrow the biggest box you can buy off the shelf?” The answer was “October.” The Uber service was careening toward a brick wall and imminent death just five months away, but the engineering team didn’t recognize that as a problem.
Data and analytics, Pham believed, were crucial to charting the right course. Instinctively, he knew that the team would have to rewrite code and deploy new hardware to keep Uber afloat. But how quickly did they have to move? How much did they need to buy? Only by being able to model and predict the number of rides and amount of data the system would be required to handle could his team save Uber from sinking. He said:
The power of analytics that we used to navigate engineering decision-making to build the system, to maintain the system, in order to power the business, was incredibly important. At the pace that we were growing, actually, we didn’t have a second chance. We make a wrong decision and we are dead in the water. We didn’t have time to go back and re-do any decision, given the pace that we were growing. We had to be dead-on pretty much every single time. … Over time, we worked from a survivability window of a week to a few weeks, to a month, to a few months, to a quarter, etc.
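The “survivability window” math behind that decision-making was simple but unforgiving. The sketch below is a back-of-the-envelope projection in that spirit: the trip volumes are the figures cited in this case (roughly 30,000 trips a day in April 2013, heading toward 300,000 by December), while the single-machine capacity is a hypothetical number chosen purely for illustration, not Uber’s actual limit.

```python
# Back-of-the-envelope "survivability window" projection (illustrative only).
# Trip volumes come from the case: ~30,000 trips/day in April 2013, projected
# to reach ~300,000 trips/day by December 2013. The single-machine capacity is
# a hypothetical figure chosen purely for illustration.

APRIL_TRIPS_PER_DAY = 30_000
DECEMBER_TRIPS_PER_DAY = 300_000
MONTHS = 8  # April through December

# Month-over-month growth factor implied by steady exponential growth.
GROWTH = (DECEMBER_TRIPS_PER_DAY / APRIL_TRIPS_PER_DAY) ** (1 / MONTHS)

SINGLE_BOX_CAPACITY = 150_000  # hypothetical trips/day the biggest off-the-shelf box can dispatch

months = ["Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
trips = float(APRIL_TRIPS_PER_DAY)
for month in months:
    note = "  <-- outgrows the biggest box" if trips > SINGLE_BOX_CAPACITY else ""
    print(f"{month} 2013: ~{trips:,.0f} trips/day{note}")
    trips *= GROWTH
```

With the capacity figure above, which was picked so the crossover lands in October to mirror the answer in Pham’s account, the projection shows demand overtaking the biggest available machine months in advance, which is exactly the kind of brick wall the team had to see coming rather than hit.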
2014
After about a year, Pham and his growing crew of engineers had made Uber’s technology stack a bit more stable. But he knew Uber needed to do much more to get where it wanted to go; it needed to move fast in multiple directions at the same time.
Operationally, Uber was taking a highly decentralized approach. As it expanded its service, local operation teams in each city and region around the world were empowered to make decisions that would work for their geography and its unique habits, customs, laws and regulations, competitive landscape, market dynamics, and more. This decentralization enabled Uber teams to run their regional businesses effectively—onboarding drivers and signing up customers quickly, setting prices, and offering incentives best suited to their business environment. It also gave them flexibility in dealing with regulatory bodies.
On the technical front, Pham and his team realized that to move at speed, they, too, needed to be able to work on multiple fronts at once. There was just one problem: Uber’s software was architected as a centralized system, or monolith. That meant all components of Uber’s service, from signing up riders to billing them, paying drivers, and sending messages, were interconnected and interdependent in the same code base; if developers wanted to make changes or updates, they needed to redeploy the entire stack all at once. As the code base grew, Pham knew, it would become increasingly complex, brittle, hard to understand, and hard to maintain—at an exponential rate. Meanwhile, ride volume just kept surging, also at an exponential rate.
“What was happening at the time was every single Friday night … was Uber's busiest night ever. They had a bunch of single points of failure in both people and in technology. There were certain humans that understood how to basically duct tape and plug the dike as it was exploding every single Friday night. And if they weren't around, you were down. … It was intense, but there was a really strong sense of camaraderie, a really tight bond, and a really strong sense of belonging. Literally if you weren't here, doing [your job], a piece of the system is down, and therefore the system is down. So people felt really, really important. And they were.”
--Ryan Sokol, who joined Uber in fall 2014 as senior manager of dispatch systems
Pham and Sokol knew that they needed to build scalable systems and a scalable organization. They began transitioning Uber’s software architecture from a monolithic system to a microservices structure, which would allow many teams of engineers and developers to work in parallel on different small components.
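To make the contrast concrete, here is a minimal, standard-library-only sketch of what one piece of such a split might look like; the “billing” service name, endpoint, and port are hypothetical illustrations, not Uber’s actual services or code. In the monolithic design described above, this logic would live in the same code base as dispatch and messaging and ship only when the whole stack was redeployed; carved out as its own service, the team that owns it can release and scale it independently.

```python
# Minimal sketch of one service in a microservice-style split (illustrative;
# the service name, endpoint, and port are hypothetical, not Uber's design).
# Other services, such as dispatch, call it over the network instead of
# importing its code, so each service can be deployed and scaled on its own.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class BillingHandler(BaseHTTPRequestHandler):
    """A stand-alone 'billing' service exposing one narrow endpoint."""

    def do_POST(self):
        if self.path != "/charge":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        trip = json.loads(self.rfile.read(length) or b"{}")
        # Pretend to charge the rider for the trip; a real service would talk
        # to a payment provider and its own datastore.
        receipt = {"trip_id": trip.get("trip_id"), "status": "charged"}
        body = json.dumps(receipt).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # A separate dispatch process, owned by a separate team, would POST to
    # http://localhost:8081/charge when a trip ends.
    HTTPServer(("localhost", 8081), BillingHandler).serve_forever()
```

Any other service, or a quick test from the command line such as `curl -X POST localhost:8081/charge -d '{"trip_id": "t-1"}'`, can exercise it through that one HTTP interface, which is what lets many teams work in parallel without stepping on one another’s deploys.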
Over the course of about 15 months, Sokol’s team expanded from about 10 people to nearly 60. Meanwhile, he recalled, “every metric we were tracking from Uber grew about 10 times — from the number of drivers on the platform, to the number of rides taken on the platform, to the amount of money coming to the platform. So everything grew by about 10 times, and my team grew by five times.”
The move to microservices, said Pham, was “the key ingredient for us to grow super fast. When the company grew in volume, but also in locations around the world, and also in the level of complexity of all the features that we kept on adding to it, it takes more teams working on all of these things in parallel without tripping all over each other.”
And yet, he added: “Now, there were problems with doing what we did too. Because nothing comes for free.”