Written By: Aaron Schneider, Associate Solutions Engineer – Couchbase
Open your browser and you will see a flood of Artificial Intelligence (AI) tools and AI-generated content, from writing and art to music. Keying in a prompt and watching your text morph into an image, a sound clip or an intelligent-sounding response can feel rather magical. Look closely, however, and flaws begin to appear in the replies. The images and text sometimes do not make sense; they may contain factual inaccuracies or be awkwardly phrased. Despite these limitations, AI has been highly praised on social media.
At the same time, a wave of scepticism has surfaced, much of it focused on copyright and on the biased perspectives critics have uncovered. The legal, social and ethical consequences of AI tools remain largely unaddressed. With such bias and errors occurring, how can users trust AI-based outcomes, however useful the technology may be?
How Can AI Be More Trustworthy?
When the ChatGPT story broke, much of the early reaction centred on how it would affect search engines. If users could obtain answers to their queries from AI, how would companies such as Google, Bing and Yahoo respond? The risk of bias and factual errors could cost businesses revenue and damage their reputations, no matter how much the casual user trusts the AI's outcomes.
A researcher from the University of California found that ChatGPT would rank the value of a human being by gender and skin colour. Responses like these are risky, discriminatory and inflammatory, and they dent the reputation of businesses genuinely harnessing AI for good. To be fair, the fault lies not with the AI itself but with the inputs the technology receives. Even so, it behoves developers to nip bias and errors in the bud early on.
Under the Hood
At the end of the day, AI is built on Machine Learning (ML) models, using well-researched models and techniques for creating predictive systems. AI systems such as ChatGPT and DALL-E must ingest vast volumes of input from the internet before they can match prompts to responses. Additional tools and language models may then be used to help the AI predict the words it uses in each response.
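To see how directly the training data shapes what comes out, consider a deliberately simplified toy: a bigram "language model" that only knows which word followed which in its corpus. Real systems such as ChatGPT use far larger neural models, but the dependence on ingested text is the same. This is a minimal sketch, not how production LLMs are built:

```python
import random
from collections import defaultdict

def train_bigram_model(corpus: list[str]) -> dict[str, list[str]]:
    """Record, for every word, the words that followed it in the training text."""
    followers = defaultdict(list)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            followers[current_word].append(next_word)
    return followers

def generate(model: dict[str, list[str]], start: str, length: int = 8) -> str:
    """Generate text by repeatedly picking a word that followed the previous one."""
    word, output = start, [start]
    for _ in range(length):
        candidates = model.get(word)
        if not candidates:
            break
        word = random.choice(candidates)  # the prediction is driven entirely by the data
        output.append(word)
    return " ".join(output)

# Whatever patterns (or biases) appear in the corpus are exactly what the model reproduces.
corpus = ["the model repeats what the data contains",
          "the data contains whatever the internet says"]
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

If the corpus contains offensive phrasing, this toy model will happily emit it; nothing in the mechanism distinguishes good patterns from bad ones.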
As such, it is unsurprising when those AI systems produce offensive, racist or even sexist remarks. The dataset used could contain potentially millions of pieces of illicit content: unfiltered profanities in song lyrics or disrespectful language on social media may have been included at some point, and that trains the AI to repeat such phrases. The situation is not so different from teaching a toddler to speak using swear words.
It comes as no surprise, then, that tech companies seem hesitant to release their complex AI models. Doing so would be akin to handing users an unpredictable Pandora's box. For the business it would be a highly risky manoeuvre, since it has no control over the AI's outputs, and any unreliable or obscene responses would be damaging to its reputation.
Still, this conundrum is not new. AI bias occurs when AI models reproduce the biases of their human authors through the datasets: what you put in is what you get back. The challenge is how businesses can reduce such biases in their product offerings and avoid putting perilous AI models into production. Successfully reining in such models could herald a new age of information, benefiting technologies in verticals from marketing tools to search engines and automation.
Built-In Limitations?
A quick study of ChatGPT tells us that OpenAI was well aware of its AI's bias, which is why it built limitations into the system from the start. The approach is as straightforward as it gets: a list of prohibited keywords, phrases and guidelines is put in place to stop the model from producing inappropriate responses or going erratic.
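A minimal sketch of such a guardrail is shown below. The `generate_response` callable and the blocklist entries are illustrative placeholders; OpenAI's actual moderation pipeline is more sophisticated and not public.

```python
PROHIBITED_TERMS = {"example_slur", "example_profanity"}  # illustrative placeholders only
REFUSAL = "I can't help with that request."

def moderate(prompt: str, generate_response) -> str:
    """Refuse prompts containing prohibited terms, and re-check the model's own output."""
    if any(term in prompt.lower() for term in PROHIBITED_TERMS):
        return REFUSAL
    response = generate_response(prompt)
    if any(term in response.lower() for term in PROHIBITED_TERMS):
        return REFUSAL  # block the reply even when the prompt looked harmless
    return response

# Example usage with a stand-in model that simply echoes the prompt:
print(moderate("tell me about example_slur", lambda p: f"echo: {p}"))  # -> refusal
```

The weakness is obvious from the sketch itself: the filter only catches terms someone thought to list, which is why such guardrails can be talked around.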
In most scenarios, this method has stopped ChatGPT from going overboard, but it has not stopped AI enthusiasts from drawing the bias out through the DAN jailbreak. This only shows that, underneath, AI models still carry bias from harmful content in their datasets. Over time OpenAI will add more guardrails to keep these biases from escaping. The strategy, while largely successful, is clearly not foolproof, and gatekeeping content and processes is ultimately limiting.
Rethinking an Approach for Reputation
A longer-term and more sustainable strategy is to rethink which datasets are ingested in the first place. If biases are removed before the AI can learn them, the possibility of a biased AI is effectively negated. Filtering datasets the size of the internet, however, is prohibitively expensive and complex. Humans also carry their own biases: what is offensive to one person may be harmless to another, which makes it hard to identify objectionable content consistently. For smaller, well-scoped use cases, though, the approach works very well.
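A rough sketch of that idea follows: filter training records against user-defined rules before they ever reach the model. The rules and field names here are hypothetical; a real curation pipeline would combine many such checks with classifiers and human review.

```python
from typing import Callable

# Hypothetical curation rules; real pipelines use far richer checks
# (toxicity classifiers, source allow-lists, deduplication, etc.).
def no_profanity(record: dict) -> bool:
    banned = {"example_profanity"}
    return not any(word in record["text"].lower() for word in banned)

def trusted_source(record: dict) -> bool:
    return record.get("source") in {"curated_corpus", "licensed_publisher"}

RULES: list[Callable[[dict], bool]] = [no_profanity, trusted_source]

def curate(records: list[dict]) -> list[dict]:
    """Keep only records that pass every rule, before any training happens."""
    return [r for r in records if all(rule(r) for rule in RULES)]

raw = [
    {"text": "A clean, factual sentence.", "source": "curated_corpus"},
    {"text": "Something containing example_profanity.", "source": "web_scrape"},
]
print(curate(raw))  # only the first record survives
```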
A Preemptive Method: Removing Bias Automatically
For technology to remain a force for good, users must be empowered to build unbiased AI models. With a real-time event processing solution, researchers can apply user-defined business logic to automatically remove undesired information from an AI dataset as it arrives. This may call for a cloud solution with a memory-first architecture that delivers strong performance and makes SQL++ queries fast and efficient. AI projects will also benefit from the flexibility and potential of cloud NoSQL databases.
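The pattern looks roughly like the Python sketch below: a handler that runs user-defined business logic against each incoming document and keeps only what the AI dataset should see. This is an illustration of the streaming-filter idea under assumed field names and rules, not the actual Couchbase Eventing API (which uses JavaScript handlers).

```python
from typing import Optional

UNDESIRED_TERMS = {"example_slur"}  # illustrative placeholder list

def contains_undesired_content(text: str) -> bool:
    return any(term in text.lower() for term in UNDESIRED_TERMS)

def redact_personal_data(text: str) -> str:
    # Placeholder; a real pipeline would scrub emails, names, IDs, etc.
    return text

def on_new_document(doc: dict) -> Optional[dict]:
    """User-defined business logic applied to each document as it streams in.

    Returns a cleaned document for the AI dataset, or None to drop it entirely.
    """
    text = doc.get("text", "")
    if contains_undesired_content(text):
        return None                          # never reaches the training set
    doc["text"] = redact_personal_data(text)
    return doc

# Each incoming event passes through the same logic in real time.
incoming = [{"text": "Useful product review."}, {"text": "A post with example_slur."}]
clean_dataset = [d for d in (on_new_document(doc) for doc in incoming) if d is not None]
print(clean_dataset)  # only the clean document remains
```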
By giving businesses and users the ability to develop, manage and deploy data-driven business logic in a seamless environment, AI functions can grow untethered from bias. And by aggregating data from varied sources, businesses can extract the diverse insights needed to run their operations, power AI-based Customer 360 automation, or build intelligent inventories and real-time logistics.