Algorithms to Live By: The Computer Science of Human Decisions. Brian Christian

Чтение книги онлайн.

Читать онлайн книгу Algorithms to Live By: The Computer Science of Human Decisions - Brian Christian страница 10

Algorithms to Live By: The Computer Science of Human Decisions - Brian  Christian

Скачать книгу

tradeoff.

      Explore/Exploit

      In English, the words “explore” and “exploit” come loaded with completely opposite connotations. But to a computer scientist, these words have much more specific and neutral meanings. Simply put, exploration is gathering information, and exploitation is using the information you have to get a known good result.

      It’s fairly intuitive that never exploring is no way to live. But it’s also worth mentioning that never exploiting can be every bit as bad. In the computer science definition, exploitation actually comes to characterize many of what we consider to be life’s best moments. A family gathering together on the holidays is exploitation. So is a bookworm settling into a reading chair with a hot cup of coffee and a beloved favorite, or a band playing their greatest hits to a crowd of adoring fans, or a couple that has stood the test of time dancing to “their song.”

      What’s more, exploration can be a curse.

      Part of what’s nice about music, for instance, is that there are constantly new things to listen to. Or, if you’re a music journalist, part of what’s terrible about music is that there are constantly new things to listen to. Being a music journalist means turning the exploration dial all the way to 11, where it’s nothing but new things all the time. Music lovers might imagine working in music journalism to be paradise, but when you constantly have to explore the new you can never enjoy the fruits of your connoisseurship—a particular kind of hell. Few people know this experience as deeply as Scott Plagenhoef, the former editor in chief of Pitchfork. “You try to find spaces when you’re working to listen to something that you just want to listen to,” he says of a critic’s life. His desperate urges to stop wading through unheard tunes of dubious quality and just listen to what he loved were so strong that Plagenhoef would put only new music on his iPod, to make himself physically incapable of abandoning his duties in those moments when he just really, really, really wanted to listen to the Smiths. Journalists are martyrs, exploring so that others may exploit.

      In computer science, the tension between exploration and exploitation takes its most concrete form in a scenario called the “multi-armed bandit problem.” The odd name comes from the colloquial term for a casino slot machine, the “one-armed bandit.” Imagine walking into a casino full of different slot machines, each one with its own odds of a payoff. The rub, of course, is that you aren’t told those odds in advance: until you start playing, you won’t have any idea which machines are the most lucrative (“loose,” as slot-machine aficionados call it) and which ones are just money sinks.

      Naturally, you’re interested in maximizing your total winnings. And it’s clear that this is going to involve some combination of pulling the arms on different machines to test them out (exploring), and favoring the most promising machines you’ve found (exploiting).

      To get a sense for the problem’s subtleties, imagine being faced with only two machines. One you’ve played a total of 15 times; 9 times it paid out, and 6 times it didn’t. The other you’ve played only twice, and it once paid out and once did not. Which is more promising?

      Simply dividing the wins by the total number of pulls will give you an estimate of the machine’s “expected value,” and by this method the first machine clearly comes out ahead. Its 9–6 record makes for an expected value of 60%, whereas the second machine’s 1–1 record yields an expected value of only 50%. But there’s more to it than that. After all, just two pulls aren’t really very many. So there’s a sense in which we just don’t yet know how good the second machine might actually be.

      Choosing a restaurant or an album is, in effect, a matter of deciding which arm to pull in life’s casino. But understanding the explore/exploit tradeoff isn’t just a way to improve decisions about where to eat or what to listen to. It also provides fundamental insights into how our goals should change as we age, and why the most rational course of action isn’t always trying to choose the best. And it turns out to be at the heart of, among other things, web design and clinical trials—two topics that normally aren’t mentioned in the same sentence.

      People tend to treat decisions in isolation, to focus on finding each time the outcome with the highest expected value. But decisions are almost never isolated, and expected value isn’t the end of the story. If you’re thinking not just about the next decision, but about all the decisions you are going to make about the same options in the future, the explore/exploit tradeoff is crucial to the process. In this way, writes mathematician Peter Whittle, the bandit problem “embodies in essential form a conflict evident in all human action.”

      So which of those two arms should you pull? It’s a trick question. It completely depends on something we haven’t discussed yet: how long you plan to be in the casino.

      Seize the Interval

      “Carpe diem,” urges Robin Williams in one of the most memorable scenes of the 1989 film Dead Poets Society. “Seize the day, boys. Make your lives extraordinary.”

      It’s incredibly important advice. It’s also somewhat self-contradictory. Seizing a day and seizing a lifetime are two entirely different endeavors. We have the expression “Eat, drink, and be merry, for tomorrow we die,” but perhaps we should also have its inverse: “Start learning a new language or an instrument, and make small talk with a stranger, because life is long, and who knows what joy could blossom over many years’ time.” When balancing favorite experiences and new ones, nothing matters as much as the interval over which we plan to enjoy them.

      “I’m more likely to try a new restaurant when I move to a city than when I’m leaving it,” explains data scientist and blogger Chris Stucchio, a veteran of grappling with the explore/exploit tradeoff in both his work and his life. “I mostly go to restaurants I know and love now, because I know I’m going to be leaving New York fairly soon. Whereas a couple years ago I moved to Pune, India, and I just would eat friggin’ everywhere that didn’t look like it was gonna kill me. And as I was leaving the city I went back to all my old favorites, rather than trying out new stuff.… Even if I find a slightly better place, I’m only going to go there once or twice, so why take the risk?”

      A sobering property of trying new things is that the value of exploration, of finding a new favorite, can only go down over time, as the remaining opportunities to savor it dwindle. Discovering an enchanting café on your last night in town doesn’t give you the opportunity to return.

      The flip side is that the value of exploitation can only go up over time. The loveliest café that you know about today is, by definition, at least as lovely as the loveliest café you knew about last month. (And if you’ve found another favorite since then, it might just be more so.) So explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.

      Interestingly, since the interval makes the strategy, then by observing the strategy we can also infer the interval. Take Hollywood, for instance: Among the ten highest-grossing movies of 1981, only two were sequels. In 1991, it was three. In 2001, it was five. And in 2011, eight of the top ten highest-grossing films were sequels. In fact, 2011 set a record for the greatest percentage of sequels among major studio releases. Then 2012 immediately broke that record; the next year would break it again. In December 2012, journalist Nick Allen looked ahead with palpable fatigue to the year to come:

      Audiences will be given a sixth helping of X-Men plus Fast and Furious 6, Die Hard 5, Scary Movie 5 and Paranormal Activity 5. There will also be Iron Man 3, The Hangover 3, and second outings for The Muppets, The Smurfs, GI Joe and Bad Santa.

      From a studio’s perspective, a sequel is a movie with a guaranteed fan base: a cash cow, a sure thing, an exploit. And an overload of sure things signals a short-termist approach, as with

Скачать книгу