Weapons of Math Destruction is a book about the perils of big data, written by Cathy O’Neil, a mathematician and the author of mathbabe.org. It explores how algorithms are used in ways that reinforce pre-existing inequality.
O’Neil was originally a researcher at Barnard College, but she left academia in 2007 to work at the hedge fund D. E. Shaw (the one Jeff Bezos originally worked for).
After the 2007/2008 financial crisis, she became disillusioned with the industry.
“The crash made it all too clear that mathematics, once my refuge, was not only deeply entangled in the world’s problems but also fueling many of them. The housing crisis, the collapse of financial institutions, the rise of unemployment — all had been aided and abetted by mathematicians wielding magic formulas.
What’s more, thanks to the extraordinary powers that I loved so much, math was able to combine with technology to multiply the chaos and misfortune, adding efficiency and scale to systems that I now recognized as flawed.”
She then started her blog and wrote Weapons of Math Destruction.
I’ll be honest. While this book has garnered widespread praise and won numerous awards, I wasn’t a big fan.
The book’s premise is great. It tackles a very important issue. But I didn’t think the execution was well done.
She devoted 95% of the book to raising “big data” issues across multiple industries, including insurance, education, advertising, policing and politics. But as she describes them further, most turn out to be less about algorithms causing issues and more about plain bad actors and malpractice.
And despite her expertise in the field, she didn’t offer many recommendations on how to improve things. She did try to say something in the conclusion, but she squeezed it into only a few pages.
Plus, her suggestions were more idealistic than practical, nuanced advice. For example, she wrote:
“Like doctors, data scientists should pledge a Hippocratic Oath, one that focuses on the possible misuses and misinterpretations of their models.”
Maybe I’m just skeptical of human nature in general, but I don’t really see how this would help. With some research, I could probably find plenty of doctors who swore the oath but still ruined people’s lives.
Plus, as she mentioned in the book, one reason for the proliferation of bad algorithms is misaligned incentives. So I’m not sure how an oath would persuade someone to give up, say, a certain financial incentive in order to do better.
With all that said, I still learnt quite a bit from this book. So, here are my key takeaways:
Algorithms are basically models
What are models?
“A model is nothing more than an abstract representation of some process, be it a baseball game, an oil company’s supply chain, a foreign government’s actions, or a movie theatre’s attendance.
Whether it’s running in a computer program or in our head, the model takes what we know and uses it to predict responses in various situations. All of us carry thousands of models in our heads. They tell us what to expect, and they guide our decisions.”
Models are, by nature, simplistic:
“There would always be mistakes, however, because models are, by their very nature, simplifications. No model can include all of the real world’s complexity or the nuance of human communication. Inevitably, some important information gets left out.”
Models are about choices:
“To create a model, we make choices about what’s important enough to include, simplifying the world into a toy version that can be easily understood and from which we can infer important facts and actions. We expect it to handle only one job and accept that it will occasionally act like a clueless machine, one with enormous blind spots.”
A model reflects the goals and priorities of its creators
“Models, despite their reputation for impartiality, reflect goals and ideology. It’s something we do without a second thought. Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics.”
A model’s definition of success is also set by its creator:
“Whether or not a model works is also a matter of opinion. After all, a key component of every model, whether formal or informal, is its definition of success. In each case, we must ask not only who designed the model but also what that person or company is trying to accomplish.”
Models require constant feedback to work well
“Equally important, statistical systems require feedback — something to tell them when they’re off track. Statisticians use errors to train their models and make them smarter.
If Amazon.com, through a faulty correlation, started recommending lawn care books to teenage girls, the clicks would plummet, and the algorithm would be tweaked until it got it right.
Without feedback, however, a statistical engine can continue spinning out faulty and damaging analysis while never learning from its mistakes.”
However, in real life, this is not always the case.
When a school uses a scoring algorithm to judge the effectiveness of a teacher, it doesn’t get this feedback.
“When Mathematica’s scoring system tags Sarah Wysocki and 205 other teachers as failures, the district fires them. But how does it ever learn if it was right? It doesn’t. The system itself has determined that they were failures, and that is how they are viewed.
Two hundred and six “bad” teachers are gone. That fact alone appears to demonstrate how effective the value-added model is. It is cleansing the district of underperforming teachers. Instead of searching for the truth, the score comes to embody it.”
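The difference between a system that gets feedback and one that doesn’t can be sketched in a few lines. This is my own toy illustration, not from the book: the numbers, thresholds, and the noisy-score setup are all invented for the sake of the sketch.

```python
import random

random.seed(0)

# Toy setup (invented numbers): a noisy score stands in for a teacher's
# true quality, and "failing" means true quality below 3 out of 10.
def noisy_score(true_quality):
    return true_quality + random.gauss(0, 2.0)

teachers = [random.uniform(0, 10) for _ in range(10_000)]
flagged = [q for q in teachers if noisy_score(q) < 3]

# With outcome feedback, we could measure how often the flag was wrong:
false_positive_rate = sum(q >= 3 for q in flagged) / len(flagged)

# Without feedback, the flagged teachers are fired and relabeled "bad",
# so the system reports zero error by construction: the score embodies the truth.
reported_error = 0.0

print(f"actual share of wrongly flagged teachers: {false_positive_rate:.0%}")
print(f"error the closed-loop system reports: {reported_error:.0%}")
```

The gap between the two numbers is the point: the model’s real error rate is substantial, but because the firings remove the evidence, the system never observes it.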
Sometimes, algorithms use poor data as their input
“The folks running weapons of math destruction (WMDs) routinely lack data for the behaviours they’re most interested in. So they substitute stand-in data, or proxies. They draw statistical correlations between a person’s zip code or language patterns and her potential to pay back a loan or handle a job. These correlations are discriminatory, and some of them are illegal.”
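A quick sketch of how a proxy smuggles discrimination into a model that never looks at the protected attribute. All the numbers here are hypothetical, chosen only to make the mechanism visible:

```python
import random

random.seed(1)

# Hypothetical numbers: residential segregation concentrates group B in
# zip code 2, so zip code becomes a proxy for group membership.
def zip_of(group):
    return 2 if random.random() < (0.8 if group == "B" else 0.2) else 1

# The lender's rule never sees `group` -- only the zip-code proxy,
# approving zip 1 applicants far more often than zip 2 applicants.
def approved(zip_code):
    return random.random() < (0.9 if zip_code == 1 else 0.4)

rates = {g: sum(approved(zip_of(g)) for _ in range(10_000)) / 10_000 for g in "AB"}
print(rates)  # group B's approval rate lands far below group A's
```

Even though `group` is never an input, the approval rates differ sharply by group, because the zip code carries the group information in through the back door.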
An example of how companies blend faulty data into their inputs:
“When applying for federal housing assistance, Taylor and her husband met with an employee of the housing authority to complete a background check. This employee, Wanda Taylor, was using information provided by Tenant Tracker, the data broker.
It was riddled with errors and blended identities. It linked Taylor, for example, with the possible alias of Chantel Taylor, a convicted felon who happened to be born on the same day. It also connected her to the other Catherine Taylor, who had been convicted in Illinois of theft, forgery, and possession of a controlled substance.”
Yet, algorithms are treated like gods
“The math-powered applications powering the data economy were based on choices made by fallible human beings. Some of these choices were no doubt made with the best intentions. Nevertheless, many of these models encoded human prejudice, misunderstanding, and bias into the software systems that increasingly managed our lives.
Like gods, these mathematical models were opaque, their workings invisible to all but the highest priests in their domain: mathematics and computer scientists. Their verdicts, even when wrong or harmful, were beyond dispute or appeal. And they tended to punish the poor and the oppressed in our society, while making the rich richer.”
There is a seemingly legitimate reason for this opacity, and an equally legitimate objection to it:
“Verdicts from WMDs land like dictates from the algorithmic gods. The model itself is a black box, its contents a fiercely guarded corporate secret. This allows consultants like Mathematica to charge more, but it serves another purpose as well: if the people being evaluated are kept in the dark, the thinking goes, they’ll be less likely to attempt to game the system.
Instead, they’ll simply have to work hard, follow the rules, and pray that the model registers and appreciates their efforts. But if the details are hidden, it’s also harder to question the score or to protest against it.”
This creates a paradox:
“An algorithm processes a slew of statistics and comes up with a probability that a certain person might be a bad hire, a risky borrower, a terrorist, or a miserable teacher. That probability is distilled into a score, which can turn someone’s life upside down.
And yet when the person fights back, “suggestive” countervailing evidence simply won’t cut it. The case must be ironclad. The human victims of WMDs are held to a far higher standard of evidence than the algorithms themselves.”
One reason for these issues is that the people who create these algorithms have misaligned incentives
The proxy for success used by many companies is revenue or profit. As long as they’re making money, it means that the algorithm is working.
“For many of the businesses running these rogue algorithms, the money pouring in seems to prove that their models are working. Look at it through their eyes and it makes sense.
When they’re building statistical systems to find customers or manipulate desperate borrowers, growing revenue appears to show that they’re on the right track. The software is doing its job. The trouble is that profits end up serving as a stand-in, or proxy, for truth. We’ll see this dangerous confusion crop up again and again.”
It doesn’t help that data scientists don’t regularly get in contact with the people their algorithms affect.
“This happens because data scientists all too often lose sight of the folks on the receiving end of the transaction. They certainly understand that a data-crunching program is bound to misinterpret people a certain percentage of the time, putting them in the wrong groups and denying them a job or a chance at their dream house.
But as a rule, people running the WMDs don’t dwell on those errors. Their feedback is money, which is also their incentive. Their systems are engineered to gobble up more data and fine-tune their analytics so that more money will pour in. Investors feast on these returns and shower WMD companies with more money.”
As a result, these algorithms disproportionately affect two groups of people: minorities and the poor
“This underscores another common feature of WMDs. They tend to punish the poor. This is, in part, because they are engineered to evaluate large numbers of people. They specialize in bulk, and they’re cheap. That’s part of their appeal. The wealthy, by contrast, often benefit from personal input.”
What happens when you use algorithms to judge if a student should be accepted into a university? Same issue — it punishes the poor:
“The victims are the vast majority of Americans, the poor and middle-class families who don’t have thousands of dollars to spend on courses and consultants. They miss out on precious insider knowledge. The result is an education system that favours the privileged. It tilts against needy students, locking out the great majority of them — and pushing them down a path towards poverty. It deepens the social divide.”
What about policing?
“Delving deeper into a person’s life, it’s easy to imagine how inmates from a privileged background would answer one way and those from tough inner-city streets another. Ask a criminal who grew up in comfortable suburbs about “the first time you were ever involved with the police”, and he might not have a single incident to report other than the one that brought him to prison.
Young black males, by contrast, are likely to have been stopped by police dozens of times, even when they’ve done nothing wrong. A 2013 study by the New York Civil Liberties Union found that while black and Latino males between the ages of 14 and 24 made up only 4.7% of the city’s population, they accounted for 40.6% of the stop-and-frisk checks by police. More than 90% of those stopped were innocent.”
Bad algorithms also create perverse incentives for the people subject to them
For example, what happens when you use algorithms to evaluate teachers?
“First, teacher evaluation algorithms are a powerful tool for behavioral modification. That’s their purpose, and in the Washington schools, they featured both a stick and a carrot. Teachers knew that if their students stumbled on the test their own jobs were at risk. This gave teachers a strong motivation to ensure their students passed, especially as the Great Recession battered the labor market.
At the same time, if their students outperformed their peers, teachers and administrators could receive bonuses of up to $8,000. If you add those powerful incentives to the evidence in the case — the high number of erasures and the abnormally high test scores — there were grounds for suspicion that fourth-grade teachers, bowing either to fear or to greed, had corrected their students’ exams.”
Similarly, what happens when you use algorithms (or ranking systems) to judge schools?
“U.S. News’ first data-driven ranking came out in 1988, and the results seemed sensible. However, as the ranking grew into a national standard, a vicious feedback loop materialized. The trouble was that the rankings were self-reinforcing.
If a college fared badly in U.S News, its reputation would suffer, and conditions would deteriorate. Top students would avoid it, as would top professors. Alumni would howl and cut back contributions. The ranking would tumble further. The ranking, in short, was destiny.”
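That self-reinforcing loop is easy to simulate. The dynamics below are my own sketch, not the book’s: three nearly identical colleges, where each year a college’s “quality” drifts with its current rank, and next year’s rank simply re-sorts current quality.

```python
# Toy dynamics (invented for illustration): rank feeds back into quality.
qualities = [50.0, 51.0, 52.0]  # three nearly identical colleges

for year in range(20):
    order = sorted(range(len(qualities)), key=lambda i: -qualities[i])
    for rank, i in enumerate(order):
        # A good rank attracts students, faculty, and donations; a bad one repels them.
        qualities[i] += 1 - rank  # rank 0 gains, rank 1 holds, rank 2 loses

print(qualities)  # the tiny initial gaps harden into a wide, fixed hierarchy
```

After twenty iterations, the initial two-point spread has widened into a gulf, and the ordering is frozen: the ranking has become destiny, exactly as the quote describes.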
Algorithms aren’t bad if they only affect a few people. But scale is how math kills
“As a statistician would put it, can it scale? This might sound like the nerdy quibble of a mathematician. But scale is what turns WMDs from local nuisances into tsunami forces, ones that define and delimit our lives.
A formula, whether it’s a diet or a tax code, might be perfectly innocuous in theory. But if it grows to become a national or global standard, it creates its own distorted and dystopian economy.”
That’s the exact problem with the U.S. News model:
“The problem isn’t the U.S. News model but its scale. It forces everyone to shoot for exactly the same goals, which creates a rat race — and lots of harmful unintended consequences.”
In the end, algorithms are unfair because fairness can’t be (easily) quantified
“While looking at WMDs, we’re often faced with a choice between fairness and efficacy. Our legal traditions lean strongly toward fairness. The Constitution, for example, presumes innocence and is engineered to value it. From a modeler’s perspective, the presumption of innocence is a constraint, and the result is that some guilty people go free, especially those who can afford good lawyers.
Even those found guilty have the right to appeal their verdict, which chews up time and resources. So the system sacrifices enormous efficiencies for the promise of fairness.
WMDs, by contrast, tend to favour efficiency. By their very nature, they feed on data that can be measured and counted. But fairness is squishy and hard to quantify. It is a concept. And computers struggle with concepts.
So fairness isn’t calculated into WMDs. And the result is massive, industrial production of unfairness.”