Inside Facebook’s AI Workshop

by Scott Berinato

Within Facebook’s cavernous Building 20, about halfway between the lobby (panoramic views of the Ravenswood Slough) and the kitchen (hot breakfast, smoothies, gourmet coffee), in a small conference room called Lollapalooza, Joaquin Candela is trying to explain artificial intelligence to a layperson.

Candela — bald, compact, thoughtful — runs Facebook’s Applied Machine Learning (AML) group, the engine room of AI at Facebook, which increasingly makes it the engine room of Facebook in general. After some verbal searching, he finally says:

“Look, a machine learning algorithm really is a lookup table, right? Where the key is the input, like an image, and the value is the label for the input, like ‘a horse.’ I have a bunch of examples of something. Pictures of horses. I give the algorithm as many as I can. ‘This is a horse. This is a horse. This isn’t a horse. This is a horse.’ And the algorithm keeps those in a table. Then, if a new example comes along — or if I tell it to watch for new examples — well, the algorithm just goes and looks at all those examples we fed it. Which rows in the table look similar? And how similar? It’s trying to decide, ‘Is this new thing a horse? I think so.’ If it’s right, the image gets put in the ‘This is a horse’ group, and if it’s wrong, it gets put in the ‘This isn’t a horse’ group. Next time, it has more data to look up.

“One challenge is, How do we decide how similar a new picture is to the ones stored in the table? One aspect of machine learning is to learn similarity functions. Another challenge is, What happens when your table grows really large? For every new image, you would need to make a zillion comparisons…. So another aspect of machine learning is to approximate a large stored table with a function instead of going through every image. The function knows how to roughly estimate what the corresponding value should be. That’s the essence of machine learning — to approximate a gigantic table with a function. This is what learning is about.”
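Candela’s framing maps neatly onto code. Here is a minimal sketch in Python of both halves of his description: the exact table scan, and the function that approximates it. The feature vectors and labels are invented toys, not anything Facebook uses.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# The "lookup table": stored examples and their labels. Real keys
# would be image features; these toy vectors are invented.
table_inputs = np.array([
    [0.90, 0.80],   # horse
    [0.85, 0.75],   # horse
    [0.10, 0.20],   # not a horse
    [0.88, 0.90],   # horse
])
table_labels = np.array([1, 1, 0, 1])  # 1 = "this is a horse"

def similarity(a, b):
    # One thing machine learning can learn is the similarity
    # function itself; plain negative distance stands in here.
    return -np.linalg.norm(a - b)

def lookup(new_input):
    # Scan every stored row and return the label of the most
    # similar one: exact, but linear in the size of the table.
    sims = [similarity(new_input, row) for row in table_inputs]
    return table_labels[int(np.argmax(sims))]

print(lookup(np.array([0.87, 0.82])))  # 1: "I think it's a horse"

# When the table grows huge, lookup means "a zillion comparisons."
# So machine learning approximates the table with a function, here
# a tiny logistic model, that estimates the label directly without
# touching the stored rows.
model = LogisticRegression().fit(table_inputs, table_labels)
print(model.predict([[0.87, 0.82]])[0])  # same answer, no table scan
```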

There’s more to it than that, obviously, but it’s a good starting point when talking about AI because it makes it sound real, almost boring. Mechanical. So much of the conversation around AI is awash in mystical descriptions of its power and in reverence for its near-magic capabilities. Candela doesn’t like that and tries to use more-prosaic terms. It’s powerful, yes, but not magical. It has limitations. During presentations, he’s fond of showing a slide with a wizard and a factory, telling audiences that Facebook thinks of AI like the latter, because “wizards don’t scale.”

And that’s what Facebook has done with AI and machine learning: scaled it at a breakneck pace. A few years ago the company’s machine learning group numbered just a few people, and a single experiment took days to run. Now, Candela says, several hundred employees run thousands of experiments a day. AI is woven so intricately into the platform that it would be impossible to separate the products — your feed, your chat, your kid’s finsta — from the algorithms. Nearly everything users see and do is informed by AI and machine learning.

Understanding how and why Facebook has so fully embraced AI can help any organization that’s ready to invest in an algorithmic future. It would be easy to assume that Facebook, with all its resources, would simply get the best talent and write the best algorithms — game over. But Candela took a different approach. Certainly the talent is strong, and the algorithms are good. Some of them are designed to “see” images or automatically filter them. Some understand conversations and can respond to them. Some translate between languages. Some try to predict what you’ll like and buy.

But in some ways the algorithms are not his main focus. Instead, he’s been busy creating an AI workshop in which anyone in the company can use AI to achieve a goal. Basically, Candela built an AI platform for the platform. Whether you’re a deeply knowledgeable programmer or a complete newbie, you can take advantage of his wares.

Here’s how he did it and what you can learn from it.

Soyuz

Candela, a veteran of Microsoft Research, arrived at Facebook in 2012 to work in the company’s ads business. He and a handful of staffers inherited a ranking algorithm for better targeting users with ads.

Candela describes the machine learning code he inherited as “robust but not the latest.” More than once he compares it to Soyuz, the 1960s Soviet spacecraft. Basic but reliable. Gets the job done even if it’s not the newest, best thing. “It’ll get you up there and down. But it’s not the latest convnet [convolutional neural net] of the month.”

You might assume, then, that the first thing Candela set out to do was to upgrade the algorithm. Get rid of Soyuz in favor of a space plane. It wasn’t. “To get more value, I can do three things,” he says. “I can improve the algorithm itself, make it more sophisticated. I can throw more and better data at the algorithm so that the existing code produces better results. And I can change the speed of experimentation to get more results faster.

“We focused on data and speed, not on a better algorithm.”

Candela describes this decision as “dramatic” and “hard.” Computer scientists, especially academic-minded ones, are rewarded for inventing new algorithms or improving existing ones. A better statistical model is the goal. Getting cited in a journal is validation. Wowing your peers gives you cred.

It requires a shift in thinking to get those engineers to focus on business impact before optimal statistical models. He thinks many companies make the mistake of structuring their efforts around building the best algorithms, or hiring developers who claim to have the best algorithms, because that’s how many AI developers think.

But for a company, a good algorithm that improves the business is more valuable than vanguard statistical models. In truth, Candela says, real algorithmic breakthroughs are few and far between — two or three a year at best. If his team focused its energies there, it would take lots of effort to make marginal gains.

He hammers these points home constantly: Figure out the impact on the business first. Know what you’re solving for. Know what business challenge you need to address. “You might look for the shiniest algorithm or the people who are telling you they have the most advanced algorithm. And you really should be looking for people who are most obsessed with getting any algorithm to do a job. That’s kind of a profound thing that I think is lost in a lot of the conversation. I had a conversation with our resident machine learning geek at our office, and we were just talking about different people doing AI. He said, ‘Nobody really thinks their algorithms are very good or whatever.’ It makes me think, maybe that’s fine.

“I’m not saying don’t work on the algorithm at all. I’m saying that focusing on giving it more data and better data, and then experimenting faster, makes a lot more sense.”

So rather than defining success as building the best natural language processing algorithm, he defines it as deploying one that will help users find a restaurant when they ask their friends, “Where can I get a good bite around here?” Instead of being thrilled that some computer vision algorithm is nearing pixel-perfect object recognition, he gets excited if that AI is good enough to notice that you post a lot of pictures of the beach and can help you buy a swimsuit.

The strategy worked when he started at Facebook. Ad revenues rose. Candela’s profile rose. It was suggested that AML become a centralized function for all of Facebook. Candela said no. Twice. “I was concerned about the ‘If you build it, they will come’ phenomenon.” Just creating bits of artificial intelligence in the hope that people would see the value and adopt it wouldn’t work.

But he did pick his spots. He collaborated with the feeds team while saying no to many other groups. Then he worked with the Messenger team. His team grew and took on more projects with other teams.

By 2015 Candela could see that his group would need to centralize, so he turned his attention to how he’d build such an operation. He was still worried about the “build it and they will come” phenomenon, so he focused less on how his team would be structured and more on how the group would connect to the rest of Facebook. “You build a factory that makes amazing widgets, and you forget to design the loading docks into your factory?” He laughs. “Well, enjoy your widgets.”

Only then, about three years in, did Candela think about upgrading some of his algorithms. (Incidentally, even today, the emergency escape spacecraft attached to the International Space Station is a Soyuz.)

H2

Candela goes to a whiteboard to describe how he built his AI factory inside Facebook. The key, he says, was figuring out where on the product development path AI fits. He draws something like this:

[Figure: Where AI fits on the product development path]

H3 — Horizon 3, or three years out from product — is the realm of R&D and science. Often, data scientists who work on AI think of themselves as here, improving algorithms and looking for new ways to get machines to learn. Candela didn’t put his team here for the reasons already mentioned. It’s too far from impact on the business. H1, approaching product delivery, is where the product teams live — the feeds team, the Instagram team, the ads team. AI doesn’t go here either, because it would be difficult to retrofit products this deeply developed. It would be like building a car and then deciding that it should be self-driving after you started to put it together.

That leaves H2, between the science and the product, as the place AML lives at Facebook. AML is a conduit for transferring the science into the product. It does not do research for research’s sake, and it does not build and ship products. As the upward slope in the product’s readiness shows, it’s a dynamic space. Pointing to H2, Candela says, “This needs to feel uncomfortable all the time. The people you need to hire need to be okay with that, and they need to be incredibly selfless. Because if your work is successful, you spin it out. And you need to fail quite a bit. I’m comfortable with a 50% failure rate.”

If the team is failing less, Candela suspects its members are too risk averse, or they’re taking on challenges that are sliding them closer to H1’s product focus. “Maybe we do something like that and it works, but it’s still a failure, because the product teams should be taking that on, not us. If you own a piece of technology that the ads team should operate themselves to generate value, give it to them, and then increase your level of ambition in the machine learning space before something becomes product.”

So Candela’s team is neither earning the glory of inventing new statistical models nor putting products out into the world. It’s a factory of specialists who translate others’ science for others’ products and fail half the time.

Push/Pull

All that being said, the lines between the three realms — H3, H2, and H1 — still aren’t crisp. In some cases Candela’s team does look at the science of machine learning to solve specific problems. And sometimes it does help build the products.

That was especially true as AML got off the ground, because many people in the business hadn’t yet been exposed to AI and what it could do for them. In one case AML built a translation algorithm. The team dipped into the research space to look at how existing translation algorithms worked and could be improved, because bad translations, which either don’t make sense or create a misleading interpretation, are in some ways worse than no translation.

“Early on it was more push, more tenacity on our part,” Candela says. “But it was gentle tenacity. We weren’t going to throw something over the fence and tell the product team, ‘This is great, use it.’” That meant that his team helped write some product code. Doing a little bit of the science and a little bit of the product in addition to its core function was meant to inspire the product team members to see what AML could do for them.

What the two teams built — a product that allowed community pages to instantly translate into several languages — worked. Other projects were similarly pushed out, and now the international team and other product groups at Facebook are pulling from AML, asking to use code in their products themselves.

“Look, it’s nowhere near where I want it to be,” Candela says. “I’d like to have all the product leaders in the company get together quarterly for AI reviews. That will certainly happen. But the conversation in the past two years has completely changed. Now if I walk from one end of this building to the other and I bump into, I don’t know, the video team or the Messenger team, they’ll stop me and say, ‘Hey, we’re excited to try this. We think we can build a product on this.’ That didn’t happen before.”

AML’s success, though, has created a new challenge for Candela. Now that everyone wants a piece of AML, the factory has to scale.

Layer Cake

Candela couldn’t scale just by saying yes to every project and adding bodies to get the work done. So he organized in other ways. First he subdivided his team according to the type of AI its members would focus on:

[Figure: AML org chart, with teams divided by type of AI]

This created common denominators so that one team — say, computer vision — could work on any machine learning application involving parsing images and reuse its work whenever possible.

Next came a large-scale engineering effort to build Facebook’s own AI backbone, called FBLearner Flow. Here algorithms are deployed once and made reusable for anyone who may need them. The time-consuming parts of setting up and running experiments are automated, and past results are stored and easily searchable. And the system runs on a serious hardware array, so many experiments can run simultaneously. (The system supports more than 6 million predictions a second.) All of this increases the velocity of experimentation, at scale.
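FBLearner Flow’s code isn’t public, but the pattern the description implies (register a pipeline once, let anyone rerun it with new parameters, archive every result) is easy to sketch. All names below are hypothetical, not FBLearner’s actual API:

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry-and-results store; none of these names
# come from FBLearner Flow itself.
_REGISTRY: Dict[str, Callable[..., Any]] = {}
_RESULTS: List[Dict[str, Any]] = []   # searchable record of past runs

def workflow(name: str):
    # Deploy an experiment pipeline once, under a stable name, so
    # any team can launch it without touching its internals.
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        _REGISTRY[name] = fn
        return fn
    return decorator

def run(name: str, **params: Any) -> Any:
    # Launch a registered workflow and archive the result so past
    # experiments stay available and comparable.
    output = _REGISTRY[name](**params)
    _RESULTS.append({"workflow": name, "params": params, "output": output})
    return output

@workflow("ads/ranking_experiment")
def ranking_experiment(learning_rate: float = 0.1) -> dict:
    # A real workflow would train and evaluate a model here.
    return {"metric": 0.9 - abs(learning_rate - 0.05)}

# Anyone can now sweep parameters without knowing the internals,
# which is how thousands of experiments a day become possible.
for lr in (0.01, 0.05, 0.10):
    run("ads/ranking_experiment", learning_rate=lr)
print(max(_RESULTS, key=lambda r: r["output"]["metric"]))
```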

The system was also designed to accommodate many kinds of possible users. Candela believes that for AI to work, and to scale even further, he must help people outside AML do the work themselves. He created what he calls a layer cake of artificial intelligence.

[Figure: The layer cake of artificial intelligence]

The bottom layers focus on AML’s work: refining the core engine (with a strong focus on optimizing performance, especially for mobile) and working with machine learning algorithms. The upper layers focus on tools that make it possible for those outside AML to exploit the algorithms with less AML involvement. “It’s all about what you expose to the user,” Candela says. In some cases he’s built systems that developers outside AML can take advantage of to build and run their own models.

Rex

A good example of Candela’s team structure and the push/pull dynamic comes from some AI built to surface content on the basis of what you type. The natural-language machine learning team created an engine to understand conversational typing.

This bit of intelligence first found its way into the Messenger chat client. AML developed the models while the product team developed use cases and “intents” — lingo for the types of tasks you want the engine to learn. For example, training natural language AI to recognize and reliably respond to a phrase like “I’m looking for the best…” is an intent.

The first few such intents were deployed to Messenger through a product called M Suggestions.

If you sent a chat to a friend that said “I’ll meet you there in 30 minutes,” M Suggestions might prompt you with an offer to hire a car.
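Stripped to its essentials, an intent model of this kind can be prototyped as a plain text classifier. A minimal sketch, with invented training phrases and intent labels rather than the production models behind M Suggestions:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training phrases for two intents; production systems
# learn from vastly more data and subtler phrasings.
phrases = [
    "I'll meet you there in 30 minutes",
    "heading over now, be there soon",
    "on my way, see you in ten",
    "I'm looking for the best pizza in town",
    "where can I get a good bite around here",
    "any recommendations for sushi nearby",
]
intents = ["request_ride", "request_ride", "request_ride",
           "find_food", "find_food", "find_food"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(phrases, intents)

# A new message is routed to whichever intent it most resembles;
# the product layer then decides what suggestion to surface.
print(model.predict(["meet me there in 20 minutes"]))
```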

As the tools for building intent models developed and the product team became more conversant with them, AML’s role diminished. Now the Messenger team has improved M Suggestions by building dozens more intents on its own.

Still, this bit of natural language AI wasn’t built just for chat. It’s reusable. It was codified as CLUE, for “conversational learning understanding engine.” It found its way into more Facebook applications. It’s being adapted for status updates and feeds. Social recommendations — or social rex, as everyone calls them — are increasingly driven by AI. If you typed “I’m traveling to Omaha and I really want to find a good steak downtown,” AI might respond as if it were one of your friends, with a comment on your post, rex such as a list of steakhouses, and a tagged map of where they are relative to downtown. If your friend replied to you and said, “It also has some great vegetarian restaurants,” the algorithm might again reply with pertinent data.

Social rex intents are not yet being developed without AML, but the goal is to have them move out of Candela’s group, just as M Suggestions did.

In general, the idea is to make product teams AI-capable themselves. “We’ll teach you to fish,” Candela says, “and you go fish, and we’ll drag up the next thing. We’ll build a fishing boat. And once you’re using the fishing boat, I’m going to build a cannery, right?”

At the moment, about 70% of the AI work on the backbone is done by people outside Candela’s team. That’s possible in part because of the interfaces built on top of the backbone. In some cases, as with a tool called Lumos, machine learning can be used by nondevelopers.

Horseback Riding and Cereal Boxes

Lumos is computer vision AI, a tool that can comb through photos on Facebook or Instagram or other platforms and learn what they contain. You can train it to see anything. It has helped automate the discovery and banning of pornographic or violent content, IP appropriation (improper use of brands and logos), and other unwelcome content. It can also help identify things you like and do (to drive personalized advertising and recommendations) on the basis of photos in your feeds.

I watch a demo in which engineers select “horseback riding” as our intent, the thing we’ll be looking for. The interface is simple: a few clicks, a couple of forms to fill out — What are you looking for? How much data do you want to look at? — and the algorithm gets to work finding pictures of horseback riding. Thumbnails start to fill the page.

The algorithm has searched for horseback riding before, so it’s already quite good at finding it. My guess is that north of 80% of the images that pop up are indeed of horseback riding, and they show remarkable variety. Here’s one with someone posing at a standstill. Here’s one with the horse rearing. Here’s an equestrian jumping. The algorithm finds shapes and boundaries between shapes and builds on previous knowledge of what those interactions mean. It knows things about what combination of pixels is most likely a person, for example, and what’s a horse. It knows when it “sees” a person and a horse together with the person situated close above the horse. And it decides that this looks like horseback riding.

We also find pictures that aren’t horseback riding — one is a person standing next to a horse; another is a person on a mule — and check those off as not matches. They’re framed in red, in case there’s any doubt. The algorithm internalizes that information — adds it to the lookup table — for use next time. A simple chart at the top of the page shows the algorithm’s accuracy and confidence over time. It’s always an S curve, slow to learn at first, then rapidly improving, then tapering off on how much more accurate it can get. It’s very good at seeing horseback riding.
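The loop in that demo, where the model proposes candidates, a human confirms or rejects them, and the corrections feed the next round of training, is a standard human-in-the-loop pattern. A minimal sketch of the cycle follows; the classifier and synthetic features are stand-ins, since Lumos’s internals aren’t public.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def human_label(x):
    # Stand-in for the engineer clicking match / not-match; the
    # "true" concept here is just whether the first feature is large.
    return int(x[0] > 0.5)

# Start from a small seed of labeled examples (features are synthetic).
X = rng.random((20, 5))
y = np.array([human_label(x) for x in X])
clf = SGDClassifier(loss="log_loss", random_state=0).fit(X, y)

# Each round: the model surfaces its most confident candidates, a
# human confirms or rejects them, and the model is updated.
for round_ in range(5):
    candidates = rng.random((50, 5))
    probs = clf.predict_proba(candidates)[:, 1]
    picked = candidates[np.argsort(probs)[-10:]]         # top guesses
    labels = np.array([human_label(x) for x in picked])  # human review
    clf.partial_fit(picked, labels)
    # Accuracy on fresh candidates tends to climb as labels accumulate.
    acc = (clf.predict(candidates) ==
           [human_label(x) for x in candidates]).mean()
    print(f"round {round_}: accuracy {acc:.2f}")
```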

Other potentially valuable pictures are harder for AI to parse. “Receipts” is tricky to suss out because it can look to a computer just like type on a page; but there would be some interesting apps for AI that could recognize and “read” receipts. The engineers show how bowling alleys and escalators often confuse the algorithm because they have similar shapes and visual properties.

I ask, “What about something like ‘food’?” This brings us to an important point about machine learning: It’s only as good as its training.

We call up food as a topic to train. Indeed, we see lots of pictures of fruits and vegetables, a few of plates at restaurants. All food. We also see a cereal box. Is that food?

Well, yes. Or no. It’s a box. But there’s food in it. When we buy it, we’re buying food, not the box. If I asked if there was any food in the cupboard, you wouldn’t say, “No, just a cereal box.” (Or, more pertinent to Facebook, if I posted a picture of a cereal box, should it think I’m posting about food or about a box?) As a picture, as a piece of data, it’s a box.

Should we mark this as a match or a miss? Here’s part of the art of machine learning. When training algorithms, one needs to use clearly definable categories. Food is probably too general in some ways, and the algorithm will either improperly hit or miss on images because it’s hard to know what we mean when we say “Show me pictures of food.” “Vegetable” is a better idea to train on. And when training, everyone must define terms in the same way. Imagine two people training the algorithm when one always marks cereal boxes as food and the other marks them as not food. Now imagine that happening at scale, on terabytes of visual data.
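One way to catch the cereal-box problem before it pollutes training data at scale is to measure how often labelers agree. A minimal sketch using Cohen’s kappa, with invented labels:

```python
from sklearn.metrics import cohen_kappa_score

# Two trainers label the same ten images as food (1) or not (0).
# They disagree exactly where the category is fuzzy: cereal boxes.
annotator_a = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]  # counts boxes as food
annotator_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]  # counts boxes as boxes

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
# A score well below 1.0 flags the disagreement; the fix is a
# tighter category, like "vegetable", plus a shared labeling guide.
```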

The same applies to natural language processing. Humans are very good at interpreting text in context to find sophisticated meaning. For example, I may type, “Gee, I love that movie about the superheroes. It’s so, so original! I hope they make a hundred more of them.” My friends, who know me and know some of the mechanics of sarcasm, may readily understand that my meaning is the opposite of what I’m typing. Artificial intelligence is still learning how to decide the meaning of something like that. To figure out if I’m being sarcastic, it has to go much further than just learning how to parse grammar and vocabulary. It has to see what else I’ve said and posted and try to find other clues that will tell it whether I really loved the movie and I want 100 more or I actually detested it — because getting that wrong is not good for a platform that wants to create affinities with me. If I was being sarcastic and my feed starts filling up with superhero movie ads, I’m probably not enjoying the experience.

Not Magic

It’s details like these — showing where AI is still limited, how humans have such a core role in training it, and how solving problems and creating value are more important than finding great models — that Candela is thinking about near the end of the day, when he’s talking about the mythic status AI has gained. He’s railing against what he perceives as laziness in those who find the idea of AI-as-magic-bullet appealing and don’t apply critical thinking to it.

“What frustrates me,” he says, “is that everybody knows what a statistician is and what a data analyst can do. If I want to know ‘Hey, what age segment behaves in what way?’ I get the data analyst.

“So when people skip that, and they come to us and say, ‘Hey, give me a machine learning algorithm that will do what we do,’ I’m like, ‘What is it that I look like? What problem are you trying to solve? What’s your goal? What are the trade-offs?’” Sometimes they’re surprised that there are trade-offs. “If that person doesn’t have answers to those questions, I’m thinking, ‘What the hell are you thinking AI is?’”

They are thinking it’s magic.

“But it’s not. That’s the part where I tell people, ‘You don’t need machine learning. You need to build a data science team that helps you think through a problem and apply the human litmus test. Sit with them. Look at your data. If you can’t tell what’s going on, if you don’t have any intuition, if you can’t build a very simple, rule-based system — like, Hey, if a person is younger than 20 and living in this geography, then do this thing — if you can’t do that, then I’m extremely nervous even talking about throwing AI at your problem.’

“I’m delighted when other executives come to me and start not from wanting to understand the technology but from a problem they have that they’ve thought very, very deeply about. And sometimes — often, in fact — a simple, good old rule-based system, if you have the right data, will get you 80% of the way to solving the problem.

“And guess what? It’s going to have the benefit that everybody understands it. Exhaust the human brain first.”
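The rule-based baseline Candela keeps recommending is, by design, almost trivially simple to write down. A sketch of his own example; the region name and the action it triggers are placeholders:

```python
def apply_rule(user: dict) -> bool:
    # Candela's example: "if a person is younger than 20 and living
    # in this geography, then do this thing."
    return user["age"] < 20 and user["region"] == "target_geography"

print(apply_rule({"age": 18, "region": "target_geography"}))  # True
print(apply_rule({"age": 35, "region": "elsewhere"}))         # False
```

If a rule like this really does get you 80% of the way, as Candela says it often can, machine learning only has to earn the remaining 20%. And everybody understands it.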