Artificial intelligence bias can create problems ranging from bad business decisions to injustice. Use these questions to fight off potential biases in your AI systems.
Researchers have found significant racial bias in facial recognition technology, for example, and in particular in its underlying algorithms. That alone is a massive problem.
When you more broadly consider the role AI and ML will play in societal and business contexts, the problem of AI bias becomes seemingly limitless – one that IT leaders and others need to pay close attention to as they ramp up AI and ML implementations.
AI bias often begins with people, which runs counter to the popular narrative that we’ll all soon be controlled by AI robot overlords. Along with people, data becomes a key issue.
“Bias in AI is really a reflection of bias in the training data,” says Rick McFarland, chief data officer at LexisNexis Legal & Professional. “The best place to look for bias in your AI is in your training data.”
Guess who usually has their hands on that training data? People.
In a TED Talk, “How I’m fighting bias in algorithms,” MIT researcher Joy Buolamwini said of training data and bias in AI and ML technologies: “Why isn’t my face being detected? We have to look at how we give machines sight. … You create a training set with examples of faces. However, if the training sets aren’t really that diverse, any face that deviates too much from the established norm will be harder to detect.”
We asked a range of experts to weigh in on the questions IT leaders should be asking to identify and root out potential biases in their own AI systems. These could range from issues such as racial or gender bias to matters of bad analytics and confirmation bias. There’s not much business value in simply training your AI to tell you what you want to hear.
We also asked these experts for their insights on how organizations can be thinking through and formulating their own answers to these questions. Here’s their advice, grouped into three overlapping categories: people, data, and management.
People questions to ask about AI bias
1. Who is building the algorithms?
Perhaps someday the sci-fi stories about AI replacing humans will come true. For now, though, AI bias exposes a fallacy in that narrative: AI is highly dependent on human input, especially in its early phases.
Phani Nagarjuna, chief analytics officer at Sutherland, recommends that IT leaders looking to reduce bias start by examining the teams that work most closely with the company’s AI applications.
“Oftentimes, AI becomes a direct reflection of the people who assemble it,” Nagarjuna says. “An AI system will not only adapt to the same behaviors of its developers but reinforce them. The best way to prevent this is by making sure that the designers and developers who program AI systems incorporate cultural and inclusive dimensions of diversity from the start.”
Consider it another competitive edge for diverse and inclusive IT teams.
“Business and IT leaders should ask themselves: Does my team embody enough diversity in skills, background, and approach?” Nagarjuna says. “If this is something the team is lacking, it’s always best to bring in more team members with different experiences who can help represent a more balanced, comprehensive approach.”
2. Do your AI & ML teams take responsibility for how their work will be used?
Harry Glaser, co-founder and CEO of Periscope Data, notes that bias is less likely to occur when the people programming your AI and ML have a real stake in the outcomes. It’s somewhat like developers taking longer-term ownership of their code in DevOps culture.
Glaser adds a couple of follow-up questions to be asking here:
“Do they have personal ownership over the short- and long-term impact of their technical work? Are they empowered to be business owners for their projects – and not just technicians?”
3. Who should lead an organization’s effort to identify bias in its AI systems?
Rooting out AI bias is not an ad hoc task; it requires people who are actively seeking to find and eliminate it, according to Tod Northman, a partner at the law firm Tucker Ellis.
“It takes a unique background and skill set to identify potential bias in an AI system; even the question may be foreign to AI developers,” Northman says. “An organization that deploys an AI system must identify a lead for evaluating potential AI bias – one who has sufficient gravitas to command respect but who also has the right training and temperament to identify possible bias.”
4. How is my training data constructed?
“Some training data can be made without human involvement, such as data collected from devices, computers, or machines – think of your cell phone,” McFarland at LexisNexis says. “However, much of the AI training data used today is constructed by humans or has some sort of human involvement. Any AI built from any source of training data will amplify any biases in the training data – no matter the source.”
McFarland notes that as AI and ML platforms become easier to use by a wider range of people across an organization, the possibility for bias – or simply a lack of awareness of bias – increases.
“Developers are not really thinking about data requirements. Instead, they’re thinking about building the model and the product,” McFarland says. “That means untrained developers are not performing the critical, time-consuming training data tests. The consequence is that the model that uses the biased training data will amplify the biases 1,000-fold.”
Data questions to ask about AI bias
5. Is the data set comprehensive?
Nagarjuna, the chief analytics officer at Sutherland, cites a Gartner prediction that an estimated 85 percent of all AI projects during the next several years will deliver flawed outcomes because of initial bias in data and algorithms.
This again reveals the fundamental relationship between people and data.
“An AI model is only as good as the data used to train it; it’s vital that decision-makers take a hard look at the external data and ensure it is both comprehensive and representative of all variables,” Nagarjuna says. “Environmental data is going to be instrumental in enabling outputs to be contextually sensitive beyond being just accurate in predictions.”
6. Do you have multiple sources of data?
This has been a hot-button issue in data science and analytics in general: You’re only as good as your data sources. AI is not much different in this sense.
“Have you thought holistically about the biases in your data, and worked to minimize them?” Glaser asks. “You can never eliminate bias in data, but you can be aware of it and work to mitigate it. And you can reduce bias by incorporating multiple sources of data.”
Management questions to ask about AI bias
If you’re actually worried about those robot overlord scenarios, then it’s probably a good idea to have some controls in place to prevent them from coming true. More seriously, ongoing monitoring, auditing, and optimization of AI applications are absolutely necessary to guard against systemic bias. Like security and other IT concerns, this is an indefinite project.
7. What proportion of resources is appropriate for an organization to devote to assessing potential bias?
“A thoughtful organization will scale the resources dedicated to assessing bias based on the potential impact of any bias and the sensitivity to potential bias of the team charged with developing the AI system,” says Northman, the Tucker Ellis attorney.
This is itself an ongoing project of analysis and optimization.
“As a rule, some of the most important assessment will take place following deployment,” Northman says. “It is critical that an organization study and evaluate the results the AI system produces.”
8. Have you thought deeply about what metrics you use to evaluate your work?
AI needs measurement – like any other worthwhile IT endeavor. Glaser from Periscope Data notes that short-term, bottom-line metrics like revenue or traffic are more fertile terrain for AI bias.
“AI systems are less likely to go off the rails if the team is evaluating them using long-term holistic metrics, including pairing metrics so that they balance each other out,” Glaser says.
9. How can we test for bias in training data?
McFarland notes that AI success, as with so many other IT systems, requires ongoing testing, though he acknowledges that teams might encounter challenges here, including costs or population assessment accuracy. But not testing at all is leaving the door wide open for bias.
Frequency bias is a common issue with AI; it’s also one you can test for in your training data and on an ongoing basis.
“Compare counts of different population types in your training data and compare that to the actual distribution of your target population,” McFarland says. “Any skews in the counts can indicate a bias for or against a certain population type.”
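The count comparison McFarland describes can be sketched in a few lines of Python. This is a minimal illustration, not his team's method: the function names, the group labels, the target shares, and the 5-point tolerance are all hypothetical, and in practice the target distribution would come from census data or another trusted estimate of your real population.

```python
from collections import Counter


def representation_gaps(training_labels, target_shares):
    """Return observed share minus expected share for each population group.

    training_labels: one group label per training example (hypothetical input).
    target_shares: expected share of each group in the target population.
    """
    counts = Counter(training_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("training_labels is empty")
    # Include groups that appear in only one of the two inputs.
    groups = set(counts) | set(target_shares)
    return {
        group: counts.get(group, 0) / total - target_shares.get(group, 0.0)
        for group in groups
    }


def flag_skews(gaps, tolerance=0.05):
    """List groups whose share is off by more than the (assumed) tolerance."""
    return sorted(group for group, gap in gaps.items() if abs(gap) > tolerance)


if __name__ == "__main__":
    # Made-up example data: group "a" is over-represented, "c" under-represented.
    labels = ["a"] * 70 + ["b"] * 27 + ["c"] * 3
    target = {"a": 0.5, "b": 0.3, "c": 0.2}
    gaps = representation_gaps(labels, target)
    for group in flag_skews(gaps):
        print(f"{group}: off by {gaps[group]:+.0%}")
```

A positive gap means a group is over-represented in the training data relative to the target population; a negative gap means it is under-represented. Where to set the tolerance is a judgment call that depends on how sensitive the application is.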
Coming full circle, human biases are another common issue. You can test for those, too.
“Have several experts summarize the same document and then compare their results,” McFarland says. “If you have a good summarization framework, these summaries will be mostly interchangeable. Looking for variances across these summaries can help identify biases in the experts.”
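One rough way to look for those variances is to score how much each pair of summaries overlaps and see which expert sits furthest from the rest. The sketch below uses word-set (Jaccard) overlap, which is an assumption chosen for simplicity and not something McFarland specifies; it only measures shared vocabulary, so a low score is a prompt for human review and not proof of bias. The function names and sample summaries are hypothetical.

```python
from itertools import combinations


def token_set(text):
    """Lowercase the text and split it into a set of words without punctuation."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return {w for w in words if w}


def jaccard(a, b):
    """Share of distinct words that two texts have in common (0.0 to 1.0)."""
    sa, sb = token_set(a), token_set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def mean_overlap_by_expert(summaries):
    """Average each expert's overlap with every other expert's summary."""
    scores = {name: [] for name in summaries}
    for a, b in combinations(summaries, 2):
        score = jaccard(summaries[a], summaries[b])
        scores[a].append(score)
        scores[b].append(score)
    return {name: sum(v) / len(v) for name, v in scores.items() if v}


if __name__ == "__main__":
    # Made-up summaries of the same document from three hypothetical experts.
    summaries = {
        "A": "The court ruled for the tenant",
        "B": "The court ruled for the tenant.",
        "C": "Landlord wins appeal",
    }
    means = mean_overlap_by_expert(summaries)
    outlier = min(means, key=means.get)
    print(f"Lowest agreement with peers: {outlier} ({means[outlier]:.2f})")
```

An expert whose average overlap is much lower than the others is summarizing the document differently, which is the kind of variance worth a closer look.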