AI represents a fundamental challenge to anti-discrimination regimes, according to research due for publication in 2020.
In its nature
AI uses training data to find correlations that can then be used to make predictions. It doesn't need a human to guide it with a theory, nor to supply any intuitions. Powerful AI will find correlations that humans can't, which is why it is so valuable and used by the most powerful companies in the world.
However, this creates a new danger. If an AI isn't allowed to use certain data because it's illegal to use, say race or gender, and that data is predictive of a certain outcome, then the AI will naturally find proxies for it without a human knowing. As an AI settles on less and less intuitive proxies, it will be unable to disentangle its predictions from the characteristics that shouldn't be used. And if an AI can't, then a human most certainly won't be able to.
The researchers call this "proxy discrimination" and say that, in many cases, it will be almost impossible for people to figure out that it's happening. The issue is especially pressing now that so many decisions are guided by algorithms. Job listings are a good example of an area where proxy discrimination can occur, because there are many ways an AI can discover correlations that are proxies for race, age or gender.
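The mechanism can be made concrete with a minimal sketch. Everything here is hypothetical toy data invented for illustration, not from the paper: a simple base-rate "model" that is denied the prohibited group field recovers nearly the same split from a neutral-looking variable that happens to correlate with it.

```python
# Hypothetical toy rows: (group, proxy, outcome).
# 'group' is a prohibited characteristic; 'proxy' is a neutral-looking
# variable (think ZIP code) that happens to track it.
rows = [
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 1),
    (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 0),
]

def avg_outcome(field, value):
    """Base-rate 'model': mean outcome among rows whose given field matches."""
    sel = [r[2] for r in rows if r[field] == value]
    return sum(sel) / len(sel)

# Splitting on the prohibited field directly: ~0.83 vs ~0.17.
print(avg_outcome(0, 1), avg_outcome(0, 0))
# Dropping it and splitting on the proxy recovers most of the same
# gap (~0.67 vs ~0.33) even though 'group' was never consulted.
print(avg_outcome(1, 1), avg_outcome(1, 0))
```

A real model would do this automatically and over many variables at once, which is why the effect is so hard for a human reviewer to spot.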
AI as a weaver and builder of relationships
Another example shows how proxy discrimination can have counter-intuitive results. Take a life insurer that uses AI to price its policies. Say the company charges more for applicants who have visited the website of an organization that provides free testing for BRCA mutations, which are highly predictive of certain cancers. In this situation, the insurer would almost certainly be proxy discriminating on genetic information: the AI would latch on to the link between the website visit and genetic history. Yet if the AI had direct access to genetic history, visiting the website would no longer be predictive of any applicant's genetic predisposition to cancer.
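The counter-intuitive part of the insurer example can be checked on hypothetical numbers (again invented for illustration): the website visit predicts claims on its own, but once carrier status is known, the visit adds nothing, because the visit was only ever a stand-in for that status.

```python
# Hypothetical toy rows: (carrier, visited, claim).
# 'carrier' is the genetic status, 'visited' the testing-site visit.
rows = [
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 1, 0), (1, 0, 1), (1, 0, 0),
    (0, 1, 0), (0, 1, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0),
]

def claim_rate(select):
    """Mean claim rate among rows passing the given filter."""
    sel = [claim for (carrier, visited, claim) in rows if select(carrier, visited)]
    return sum(sel) / len(sel)

# On its own, the visit looks predictive (~0.33 vs ~0.17)...
print(claim_rate(lambda c, v: v == 1), claim_rate(lambda c, v: v == 0))
# ...but conditional on carrier status it carries no signal at all:
# among carriers the rate is 0.5 whether or not they visited.
print(claim_rate(lambda c, v: c == 1 and v == 1),
      claim_rate(lambda c, v: c == 1 and v == 0))
```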
The researchers point out that it is this weaving and building of relationships among suspect classifiers, neutral variables and desired outcomes that makes AI so good at what it does, and so dangerous in this context. All sorts of new big-data streams can now serve as proxies, replacing the traditional proxies that were easier to identify. Instead of headgear, hairstyles, height and weight, proxies can theoretically be such things as Netflix viewing preferences and social media posts.
One solution is to teach AI about discrimination
With regulatory interest in AI increasing, businesses need to consider solutions. The authors offer specific suggestions relevant to data collection and use. These include removing access to substitute variables, mandating transparency, requiring causal connections and implementing ethical algorithms.
The last of these, implementing ethical algorithms, is counter-intuitive and comes from an "important, though little appreciated" economics paper from 2011. In simple terms, it involves first teaching the AI what discrimination is, then removing the discriminatory variables at the level of the individual. First, the model is trained on all the variables, including the suspect ones, essentially teaching the AI which outcomes the prohibited characteristics predict. This works because, in a model that explicitly includes all suspect variables, the non-suspect variables retain predictive power only for reasons that have nothing to do with their correlation with the prohibited characteristics. Then, once this predictive power has been separated out, any individualized data associated with prohibited characteristics are removed from the model and replaced with population averages. This step means that the model cannot link discriminatory characteristics to any one individual.
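A minimal sketch of the two-step idea, on hypothetical counts (the numbers and variable names are invented, and the economics paper's models are more general than this saturated cell-mean toy): train with the suspect variable s included, then predict with each individual's s replaced by its population distribution, and compare against a naive model that simply omits s.

```python
# Hypothetical training data summarized as cell counts.
# cells[(s, x)] = (n, n_positive): s is the suspect characteristic,
# x a correlated neutral variable.
cells = {
    (1, 1): (40, 32),  # outcome rate 0.8
    (1, 0): (10, 7),   # outcome rate 0.7
    (0, 1): (10, 3),   # outcome rate 0.3
    (0, 0): (40, 8),   # outcome rate 0.2
}

def cell_rate(s, x):
    n, pos = cells[(s, x)]
    return pos / n

def naive(x):
    """Model trained WITHOUT s: x silently absorbs s's predictive power."""
    n = sum(cells[(s, x)][0] for s in (0, 1))
    pos = sum(cells[(s, x)][1] for s in (0, 1))
    return pos / n

def two_step(x):
    """Train WITH s, then replace each individual's s with the population
    distribution of s, so the score no longer depends on their own s."""
    total = sum(n for n, _ in cells.values())
    p_s1 = sum(cells[(1, x2)][0] for x2 in (0, 1)) / total
    return p_s1 * cell_rate(1, x) + (1 - p_s1) * cell_rate(0, x)

print(naive(1) - naive(0))        # ~0.4: x carries s's effect as a proxy
print(two_step(1) - two_step(0))  # ~0.1: only x's own direct effect remains
```

Note that the two-step model still had to see the suspect data during training, which is exactly the practical objection the authors raise next.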
It's a conceptually simple solution, but the practicalities would be fraught: AI companies would need to collect data on legally prohibited characteristics and then make predictions based on them, which would increase the risk that bad actors could be even more discriminatory. The authors suggest that government regulators would quickly be out of their depth and could easily be manipulated by technically savvy companies.
Designing for what you can’t know
Again, it's the very things we value in AI, the fact that it is non-intuitive and comes up with things humans wouldn't, without being able to explain how or why, that make designing it so much more complex. Reducing discrimination is a core goal of many AI practitioners, but this paper shows that it will not happen without careful design.