Preventing Bias and Enhancing Trust in AI Requires Better Data Governance and Collaboration

Written By: Remus Lim, Vice President, Asia Pacific & Japan, at Cloudera

 

The age of artificial intelligence (AI) is here. Generative AI models, like ChatGPT and DALL-E 2, have skyrocketed in popularity, partly due to their ease of use and public availability. AI Appreciation Day, held every 16 July, serves as a reminder of how fast AI has broken into the mainstream consciousness in the last few years.

With generative AI able to assist in answering queries, organising data, and executing repetitive tasks, enterprises are increasingly turning to AI to supercharge productivity and drive innovation.

From accelerating creative work to optimising supply chains and designing medical drugs, AI is driving society towards a new industrial revolution. IDC research recently revealed that two out of three organisations in Asia Pacific were exploring or had invested in generative AI technologies in 2023.

Inaccuracies and Bias Are Pressing Issues

As AI enters the mainstream, questions are being raised about how much trust we can place in these tools and whether there are enough safeguards. AI-generated deepfakes are being widely used to spread disinformation, while intellectual property and personal information continue to be compromised on platforms like ChatGPT.

More concerning are AI hallucinations. Users of AI today often encounter instances where generative models produce inaccurate or irrelevant results that appear confident and well-phrased enough to be easily believable.

Data bias is another worrying issue that threatens to magnify AI’s negative impact on society. Humans hold many forms of ingrained bias, such as confirmation bias, and these biases are embedded and amplified in large datasets. When such biased data is used to train AI models, their outputs reflect the same bias.

For example, poorly trained image generation models can reproduce outdated stereotypes, generating images of an older man in a suit when asked for a CEO, or a young woman when asked to visualise a flight attendant.

These concerns have severe implications for enterprises. For example, a medical imaging AI could perform well at diagnosing common conditions, yet, because its training data underrepresents certain groups, end up recommending ineffectual or harmful treatments for patients with rare health conditions.

If left unchecked now, AI could perpetuate unfairness across society, in hiring, loan disbursements, school admissions, and more. This can then lead to loss of trust, financial losses, or worse.

Addressing Inaccuracy and Bias as AI Technologies Continue to Sprawl

Addressing today’s emerging AI issues requires us to take a closer look at AI’s building block: data. Most issues seen in generative AI stem from training models on ‘untrusted’ data that is erroneous, outdated, or not diverse enough for the use case.

As this ‘untrusted’ data is passed through processing pipelines and algorithms, distortions can compound, magnifying inaccuracies and bias in the AI’s outputs. Many enterprises also lack visibility into the lineage of their training data, which prevents users from understanding how it changes over time and how the AI derives its results.

For AI applications, including generative AI, to be successful, they need to be trusted. Enterprises must have confidence in their data and its quality. This starts with implementing good data governance practices, which ensure that data is clean, securely managed, traceable, and ready for movement and analysis across environments.
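
To make this concrete, a basic automated quality gate might look something like the Python sketch below. The thresholds and the dataset layout are illustrative assumptions, not prescriptions; real governance policies would set limits per dataset and enforce them within the data platform itself.

```python
import pandas as pd

# Illustrative thresholds; actual governance policies would set these per dataset.
MAX_NULL_RATIO = 0.02       # at most 2% missing values per column
MAX_DUPLICATE_RATIO = 0.01  # at most 1% fully duplicated rows
MAX_AGE_DAYS = 90           # records must be fresher than 90 days

def validate_training_data(df: pd.DataFrame, timestamp_col: str) -> list[str]:
    """Return a list of governance violations found in the dataset."""
    problems = []

    # Completeness: flag columns with too many missing values.
    for col, ratio in df.isna().mean().items():
        if ratio > MAX_NULL_RATIO:
            problems.append(f"column '{col}' is {ratio:.1%} null")

    # Uniqueness: flag excessive duplicate rows.
    dup_ratio = df.duplicated().mean()
    if dup_ratio > MAX_DUPLICATE_RATIO:
        problems.append(f"{dup_ratio:.1%} of rows are duplicates")

    # Freshness: flag stale records.
    age_days = (pd.Timestamp.now() - pd.to_datetime(df[timestamp_col])).dt.days
    if (age_days > MAX_AGE_DAYS).any():
        problems.append(f"{(age_days > MAX_AGE_DAYS).mean():.1%} of records are stale")

    return problems
```

A check like this would typically run before every training job, blocking datasets that fail until the issues are remediated.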

Tackling bias requires organisations to revisit their data strategies and how they scope their data. Training datasets for AI must contain samples representative enough for their planned use case. Assigning humans to continuously audit training data and the decisions AI makes is essential to making iterative improvements. Organisations must also ensure the diversity of the human teams involved, which helps address issues like confirmation bias.
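
As an illustration of what such an audit could involve, the sketch below compares group shares in a training set against a reference distribution. The attribute names, reference shares, and tolerance here are hypothetical; a production audit would use vetted benchmarks and domain expertise.

```python
from collections import Counter

def audit_representation(records, attribute, reference_shares, tolerance=0.05):
    """Flag groups whose share in the training data deviates from a
    reference distribution (e.g. census figures) by more than `tolerance`."""
    counts = Counter(r[attribute] for r in records)
    total = sum(counts.values())
    flagged = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            flagged[group] = (observed, expected)
    return flagged

# Hypothetical example: gender balance in a hiring dataset.
data = [{"gender": "female"}] * 120 + [{"gender": "male"}] * 380
print(audit_representation(data, "gender", {"female": 0.5, "male": 0.5}))
# -> {'female': (0.24, 0.5), 'male': (0.76, 0.5)}
```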

Enhancing Accuracy

This careful selection and management of datasets also helps enterprises improve the accuracy of AI results. Enterprises seeking to build their own versions of ChatGPT and other large language models must train them on contextually relevant data, rather than rely on models trained only on publicly available data, to ensure the results fit their use case. For example, for AI to help an enterprise improve customer service, its training data must be drawn from or relevant to that exact area of focus.
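
A minimal sketch of that selection step might look like the following; the source names and the 20-word cut-off are assumptions for illustration only.

```python
# Hypothetical filter: keep only records from the domains the model will serve.
RELEVANT_SOURCES = {"support_tickets", "chat_transcripts", "product_faq"}

def select_finetuning_records(corpus):
    """Keep in-domain records with enough substance to learn from."""
    return [
        rec for rec in corpus
        if rec["source"] in RELEVANT_SOURCES and len(rec["text"].split()) >= 20
    ]

corpus = [
    {"source": "support_tickets",
     "text": ("Customer reports the mobile app crashes on login after the "
              "latest update, steps to reproduce attached along with device "
              "logs and version details.")},
    {"source": "news_scrape",
     "text": "General market commentary unrelated to our products."},
]
print(len(select_finetuning_records(corpus)))  # -> 1 (the off-domain record is dropped)
```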

To better handle the vast amounts of data used by AI, organisations can use data management platforms to facilitate the cleaning, governance, tracking, auditing, and movement of data, wherever it resides. Platforms with security, observability, and traceability built in also help enterprises manage their proprietary data sources securely and make AI models more explainable.
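
While each platform implements this differently, the core idea of traceability can be sketched simply: record what was done to the data, by whom, and to which version, at every step. The record structure and hashes below are illustrative and do not represent any particular platform’s API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib

@dataclass
class LineageRecord:
    """One step in a dataset's history: what was done, by whom, to which version."""
    step: str
    actor: str
    input_hash: str
    output_hash: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def content_hash(data: bytes) -> str:
    """Short fingerprint identifying an exact version of the data."""
    return hashlib.sha256(data).hexdigest()[:12]

# Hypothetical pipeline step: record the before/after state of a cleaning pass.
raw = b"customer_id,purchase\n1,59.90\n1,59.90\n2,12.50\n"
cleaned = b"customer_id,purchase\n1,59.90\n2,12.50\n"  # duplicate row dropped
log = [LineageRecord("deduplicate", "etl-job-42",
                     content_hash(raw), content_hash(cleaned))]
print(log[0])
```

With a chain of such records, an enterprise can trace any model output back through every transformation to the original source data.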

Creating Safer and Better AI Requires Open Collaboration

To fight bias, inaccuracy, and misuse in AI at scale, AI players, business sectors, and governments need to work together. Establishing more comprehensive regulations governing AI, common technological and ethical standards, and open, collaborative workflows will be key to success.

For instance, open standards can give financial services organisations the means to detect AI-generated content used in fraudulent, synthetic bank account applications. In turn, finance players can share the information they gather on AI-driven fraud, abuse, bias, and other trends with the wider public, informing the development of better safeguards and regulations.

Recently, 60 industry leaders, including Google, Microsoft, and Meta, formed an alliance with the Singapore authorities to tackle pressing issues in AI such as bias and misuse—and to create a neutral platform for collaboration on governing AI.

Alliances like these are a great start to tackling AI issues, and we hope to see such efforts replicated across regions. However, it takes more than agreements and discussion papers to enact meaningful change in the long run.

Translating plans into action requires organisations to adopt modern data architectures that facilitate open data sharing across environments, better data governance, and faster speed to insight. These data technologies are crucial to forming a foundation for safer and better AI development, and key to building trust in new game-changing AI applications.
