Building trust: Strategies for creating ethical and trustworthy AI systems
Summary
TL;DR: The video discusses the importance of AI governance in businesses, highlighting challenges and risks associated with generative AI such as bias, data privacy, and security. It emphasizes the need for comprehensive governance strategies to ensure legal, ethical, and operational compliance. IBM's watsonx.governance platform is introduced as a solution that automates AI lifecycle governance, manages risks, and ensures regulatory compliance. The video also addresses the evolving AI landscape, the necessity of collaboration between technical and non-technical teams, and the role of governance in ensuring trustworthy AI implementation across enterprises.
Takeaways
- 🧑💼 IBM emphasizes the need for AI governance in business to address ethical concerns and ensure AI is used safely.
- 📈 Generative AI could increase global GDP by 7% within 10 years, with 80% of enterprises planning to adopt it.
- ⚖️ Business leaders are concerned about ethical issues like bias, safety, and lack of transparency in generative AI.
- 💼 Common generative AI use cases include content generation, summarization, entity recognition, and insight extraction.
- 🚨 AI risks involve data bias, legal concerns, and the potential for adversarial attacks or misuse during model training and inference.
- 🧠 IBM's AI governance focuses on lifecycle monitoring, risk management, and regulatory compliance for both predictive and generative models.
- ⚙️ Automating governance processes, such as tracking model performance, metadata, and compliance, is essential for reducing risks and improving efficiency.
- 🔐 Managing sensitive data and ensuring models handle it responsibly is key to maintaining business trust and meeting regulatory standards.
- 📊 IBM's governance platform aims to ensure transparency, automate documentation, and monitor AI models continuously throughout their lifecycle.
- 🚀 IBM's watsonx.governance platform provides end-to-end governance for AI, helping organizations balance performance, risk, and compliance across different environments.
Q & A
What are the key issues AI introduces at the business level?
-AI introduces challenges such as ethical concerns, lack of explainability, safety risks, and biases in generative AI. These issues require careful governance to avoid reputational damage, legal risks, and operational inefficiencies.
Why is AI governance necessary for businesses?
-AI governance ensures that AI models are transparent, accountable, and compliant with legal and ethical standards. It helps businesses mitigate risks, ensure AI models remain fair and accurate, and prevent misuse or harm.
What are the main use cases of generative AI mentioned in the script?
-Generative AI use cases include retrieval-augmented generation, summarization, content generation, named entity recognition, insight extraction, and classification.
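To make the first of these use cases concrete: retrieval-augmented generation pairs a retriever, which finds relevant context, with a generative model that answers from it. The toy retriever below is an illustrative sketch only (the video names the use case but shows no code); real systems use vector embeddings rather than word overlap.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase and split on non-alphanumerics for crude word matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, passages: list[str]) -> str:
    """Return the passage sharing the most words with the question."""
    q = tokens(question)
    return max(passages, key=lambda p: len(q & tokens(p)))

def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a grounded prompt from the retrieved context."""
    context = retrieve(question, passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a real pipeline the assembled prompt would then be sent to a foundation model, grounding its answer in the retrieved text.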
What are the risks associated with the training phase of AI models?
-Training-phase risks include biases present in the training data, data poisoning attacks, and legal restrictions related to the use of sensitive or copyrighted data.
What are adversarial attacks during the inference phase, and how can they affect AI models?
-Adversarial attacks, such as evasion or prompt injection, occur when attackers manipulate input during the inference phase to produce harmful or biased outputs, compromising the AI model’s reliability.
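As a rough illustration of the kind of input screening a guardrail might perform (hypothetical, not something shown in the video), a naive phrase-based check could look like this; production systems use trained classifiers rather than a fixed phrase list:

```python
# Minimal illustration of screening user input for common prompt-injection
# phrases before it reaches a model. The marker list is a made-up example.
INJECTION_MARKERS = (
    "ignore previous instructions",
    "disregard the system prompt",
    "you are now",
)

def looks_like_injection(user_input: str) -> bool:
    """Return True if the input contains a known injection phrase."""
    lowered = user_input.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)
```

A request flagged by such a check would be rejected or routed for review instead of being passed to the model.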
What are some real-world cases illustrating AI model risks?
-Examples include a dealership's AI bot mistakenly selling a Chevy Tahoe for $1 and Microsoft's Twitter chatbot turning offensive due to learning inappropriate behavior from user interactions.
How does IBM's watsonx.governance platform address AI governance needs?
-The watsonx.governance platform automates lifecycle management, risk governance, and regulatory compliance for AI models. It helps businesses ensure model accuracy, fairness, and transparency across development and deployment.
What are the three critical capabilities identified by IBM for AI governance?
-The three capabilities are monitoring and evaluating models, tracking facts and metrics, and managing the life cycle and risks of AI models.
What is 'prompt governance,' and why is it important for foundation models?
-Prompt governance involves tracking and evaluating text-based instructions (prompts) used with foundation models. It is essential to ensure that prompts are properly managed, evaluated for quality, and monitored for safety to avoid generating harmful content.
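The prompt metadata being tracked might look like the following sketch. The schema here (name, template, target model, decoding parameters) is a hypothetical example; the video describes capturing such facts but not an exact format:

```python
from dataclasses import dataclass, field, asdict
import time

@dataclass
class PromptFact:
    """One governance record for a prompt template (hypothetical schema)."""
    name: str
    template: str        # prompt text with {placeholders}
    model_id: str        # foundation model the prompt targets
    parameters: dict     # decoding parameters, e.g. temperature
    created_at: float = field(default_factory=time.time)

    def to_record(self) -> dict:
        """Serialize the fact for a metadata store or fact sheet."""
        return asdict(self)
```

Capturing records like this at design time is what lets auditors later see which prompt, model, and parameters produced a given output.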
What role does monitoring model performance play in AI governance?
-Monitoring model performance ensures that AI models and prompts remain accurate, efficient, and safe over time. It helps detect issues like performance degradation, data drift, and the presence of toxic language or personal information in outputs.
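One piece of the output monitoring described here, detecting personal information in model outputs, can be sketched with simple pattern matching. These regexes are illustrative assumptions; real monitors use trained PII detectors:

```python
import re

# Naive output-safety scan: flag obvious PII patterns in model output.
# The patterns below are illustrative only, not a complete PII detector.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def scan_output(text: str) -> list[str]:
    """Return the names of PII patterns found in a model's output."""
    return [name for name, pat in PII_PATTERNS.items() if pat.search(text)]
```

An empty result lets the output through; any hit would trigger redaction or an alert in a monitoring pipeline.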
Outlines
🤖 Introduction to AI Governance and Its Importance
Igor PV, an AI engineer on IBM's client engineering team, introduces the topic of AI governance, discussing the impact of AI, especially generative AI, on business. He highlights the rapid adoption of AI technologies, noting that 80% of enterprises are either working with or planning to use foundation models. Business leaders' concerns around safety, ethical issues, and biases in AI systems are underscored. Igor covers common use cases for generative AI, such as content generation and summarization, while acknowledging the inherent risks of integrating these models into organizations.
🚗 Real-World Risks in AI Systems
This section presents real-world cases illustrating the risks in AI, such as a dealership's AI bot being tricked into selling a Chevy Tahoe for $1 and Amazon Alexa mistakenly ordering a dollhouse. It also covers Microsoft's Twitter chatbot, which turned offensive due to its training data. These examples underscore the dangers AI poses to reputation, legal compliance, and operational integrity if not properly governed. Igor categorizes AI risks into three main buckets: regulatory, reputational, and operational.
📊 The Need for Comprehensive AI Governance
Igor emphasizes the importance of AI governance to manage AI projects effectively and ensure trustworthiness in their deployment. He introduces IBM’s approach, detailing three critical capabilities needed for AI governance: monitoring model accuracy, tracking key metrics, and managing the entire AI lifecycle. He also points out the challenge companies face in recruiting AI talent and the lack of standardized best practices for AI governance. He stresses that AI governance must be adaptable to different regulatory and organizational needs.
🛠️ watsonx.governance Platform for AI Management
IBM's watsonx.governance platform is introduced as a solution for AI lifecycle governance. It provides tools to automate and streamline processes from model development to deployment, including model approval, risk assessment, and prompt governance. The platform is fully configurable to handle both traditional machine learning models and generative AI, enabling better collaboration between technical and non-technical stakeholders. Igor outlines how the platform ensures compliance with regulatory standards while improving business outcomes and reducing governance costs.
📈 Monitoring and Managing AI and LLM Performance
Igor highlights the capabilities of the watsonx.governance platform in tracking and improving the performance of large language models (LLMs). The platform automates monitoring for technical bottlenecks, prompt quality, and safety concerns such as toxic language. It also tracks performance drift, ensuring models remain effective over time. This automation reduces manual governance effort, helping businesses maintain accurate, reliable, and compliant AI systems.
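A simple form of the bottleneck monitoring described here compares recent average inference latency against a baseline. A minimal sketch, with the baseline and alert factor as assumed parameters rather than anything the platform specifies:

```python
from collections import deque

class LatencyMonitor:
    """Track recent inference latencies and flag slowdowns.
    A hypothetical sketch, not the platform's actual API."""

    def __init__(self, baseline_ms: float, window: int = 100, factor: float = 10.0):
        self.baseline_ms = baseline_ms
        self.samples = deque(maxlen=window)  # keep only recent samples
        self.factor = factor                 # alert on a factor-x slowdown

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def degraded(self) -> bool:
        """True once average recent latency exceeds factor x baseline."""
        if not self.samples:
            return False
        avg = sum(self.samples) / len(self.samples)
        return avg > self.factor * self.baseline_ms
```

The same rolling-window pattern extends naturally to throughput, payload size, or token counts.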
Keywords
💡AI Governance
💡Generative AI
💡Foundation Models
💡Risk Management
💡Bias
💡Regulatory Compliance
💡Prompt Injection
💡Model Drift
💡Explainability
💡Lifecycle Governance
Highlights
Discussion on AI governance and its critical importance at the business level.
80% of enterprises are working with or planning to leverage foundation models and generative AI.
Generative AI could raise global GDP by 7% within 10 years.
Business leaders are worried about ethical concerns like explainability, safety, and bias in generative AI.
Common generative AI use cases include summarization, content generation, and insight extraction.
Risks of using generative AI include biased training data, data poisoning, and legal restrictions.
AI systems can be vulnerable to adversarial attacks like prompt injection, leading to inaccurate outputs.
Real-world AI failures, such as Microsoft’s Twitter chatbot turning offensive and a Chevy Tahoe being sold for $1 due to an AI error.
AI governance is needed to mitigate legal, reputational, and operational risks for businesses.
IBM’s governance platform focuses on monitoring, tracking, and managing AI models across their lifecycle.
Three key aspects of AI governance: lifecycle governance, risk management, and regulatory compliance.
Governance must involve both technical and non-technical members to ensure informed decisions.
IBM's watsonx.governance solution automates governance processes to reduce workload and improve transparency.
The governance solution supports both predictive and generative AI models across different environments.
The platform offers real-time performance monitoring tools, ensuring the safety and effectiveness of deployed AI models.
Transcripts
My name is Igor PV, and I am an AI engineer on the client engineering team at IBM. In the first part of our meeting we will discuss the issues AI introduces at the business level, why AI governance is necessary, and how it can be implemented. In the second part we will conduct a hands-on lab to demonstrate what this process might look like.
ChatGPT created significant interest around the notion of large language models. While ChatGPT technology has found a home in some consumer and business applications, large language models are only part of a broader discussion about using AI technology to produce business results. This slide details the speed, scope, and scale of the impact generative AI will have on the market: 80% of enterprises are working with or planning to leverage foundation models and adopt generative AI; generative AI could raise global GDP by 7% within 10 years; and generative AI is expected to represent 30% of the overall market by 2025.

Business leaders struggle to grow AI in their companies safely. 80% of them are seriously worried about at least one ethical problem: that generative AI is not sufficiently explainable, about the safety and ethical aspects of generative AI, about established biases in generative AI, or they simply do not trust it.

Let's recall the most common generative AI use cases: retrieval-augmented generation, summarization, content generation, named entity recognition, insight extraction, and classification. All of these use cases can be affected by the risks we will discuss on the next slide.

Many risks are inherent in using generative models. Organizations that fail to address them when integrating generative AI can face significant damage to their public reputations, as well as legal and regulatory penalties.

There are risks associated with input at the training phase. Bias present in the training data can often lead to biased outcomes. Data poisoning attacks take place when malicious users try to take advantage of the iterative training features of some large language models by feeding them toxic or hateful content. Legal restrictions on data must be considered when training the LLM, as the owner of the model could be responsible for copyright and intellectual property infringement and for improper use of personal information and sensitive personal information.

There are also risks associated with input at the inference phase. Disclosure of personal information, sensitive personal information, or copyrighted information may occur during the inference phase, in which users ask the model to generate content based on unseen information. Adversarial attacks in the inference phase, such as evasion, prompt injection, and others, can include not only data poisoning but also attempts to steer the model toward certain types of output by tailoring the questions asked during this stage.

And there are risks associated with output. Many of the risks present during model training are also relevant to model output. Models trained on biased material may reflect that in their answers, which can cause both reputational damage for the model owner and legal action. Personal information, sensitive personal information, and copyrighted materials present in the training data may also either appear directly in the output or clearly influence it. Other output risks include performance disparity, in which the quality of the model's results does not meet certain standards, making the model unusable; misuse, in which the model is used to perform unethical tasks; and value alignment issues such as hallucination, in which the model presents factually incorrect answers as the truth. Many other risks should be considered as well.

Here are some real cases related to the risks we discussed on the previous slide. There is the case in which a 2024 Chevy Tahoe was sold for only $1: a California dealership's AI bot, programmed to agree with all customer statements, was exploited by a driver who convinced it to sell a Chevy Tahoe for $1, and the bot confirmed that this was a legally binding deal due to a mistake in how it was programmed. Or the case in which a child ordered a dollhouse and cookies on Amazon: parents were shocked when an expensive dollhouse and cookies were mistakenly ordered by their daughter talking to Amazon Alexa. Or the case of Microsoft's Twitter chatbot, which turned vulgar and offensive: Microsoft launched its AI Twitter chatbot, called Tay, designed to learn from interactions to enhance its conversational capabilities; however, it quickly started mimicking the offensive and inappropriate language from user tweets, leading to a rapid degradation into a vulgar and offensive online presence.

Risks in AI models can come from many sources, like the ones shown on this slide. Some relate to classic machine learning models and are better understood; some are newer concerns arising from generative AI. Three buckets can help categorize these risks. The first is regulatory: with AI regulation progressing in many parts of the world, organizations that do not have this under control risk big fines. The second is reputational: even if something is legal, organizations must consider whether they want to end up in the news as an example of AI gone bad; for example, what happens if the output from AI violates social norms by generating offensive or suggestive content? The third bucket is operational: many AI projects do not make it into production due to lack of trust in the technology, robbing their organizations of the potential benefits of the solution.

So, as we saw, AI needs governance: the process of directing, monitoring, and managing the AI activities of an organization. IBM has identified three critical capabilities necessary for a proper AI governance solution. First, monitor and evaluate: the governance platform should monitor AI models to ensure they remain accurate and fair; it should oversee predictive models and ensure generative models handle sensitive data responsibly; and it should provide clear explanations of how these models make their decisions, helping everyone understand their processes. Second, track facts and metrics: the platform should automatically collect and organize all important model data, making it easily accessible and searchable; this ensures transparency and accountability from the model's creation through its deployment. And lastly, manage life cycle and risk: the platform should allow customization of the AI model development and deployment process, track every aspect to minimize risk, and offer real-time performance monitoring tools, ensuring safe and effective management. In summary, these envisioned capabilities would help manage, track, and govern AI applications effectively, ensuring reliable and trustworthy AI operation within any organization.

But AI governance is complicated. Governing a rapidly evolving field such as AI has always presented problems, which will only become more difficult as organizations seek to incorporate new generative models in addition to the more traditional predictive models they already have. Today, companies struggle to find enough qualified data engineers, data scientists, and AI engineers just to develop and test new models, let alone perform manual tasks like tracking and documenting changes in training data, or performing runtime analysis and additional development on models in production. What's more, there are no standard best practices or tools for deployment or governance, which results in companies using highly fragmented, difficult-to-govern development and deployment environments. While open-source solutions do exist, many open-source governance frameworks are aimed primarily at data scientists and other coders, making them challenging for non-technical stakeholders to understand and blocking collaboration between technical experts and subject matter experts. Finally, the governance needs of each organization can vary widely depending on the industries and countries in which they operate and the regulatory standards they must meet. An AI governance solution must be able to automate routine tracking tasks while being flexible enough to deal with multiple environments for development and deployment. It must also be fully configurable for different regulatory standards and approval workflows, and allow technical and non-technical stakeholders to collaborate as easily as possible.
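One concrete example of the fairness monitoring mentioned earlier is the disparate impact ratio, a widely used metric (the talk does not name a specific one): the rate of favorable outcomes for a protected group divided by the rate for a reference group, with values far below 1.0 suggesting the model disadvantages the protected group.

```python
def disparate_impact(outcomes: list[tuple[str, bool]],
                     protected: str, reference: str) -> float:
    """Ratio of favorable-outcome rates, protected group vs reference group.
    `outcomes` pairs each decision's group label with True for a favorable
    result. A common illustrative fairness metric, not the platform's code."""
    def rate(group: str) -> float:
        results = [ok for g, ok in outcomes if g == group]
        return sum(results) / len(results)
    return rate(protected) / rate(reference)
```

A governance platform would compute such a metric continuously on production decisions and alert when it crosses a configured threshold.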
So IBM introduces watsonx.governance, an end-to-end automated life cycle governance toolkit. It is a single automated, configurable platform for collaboratively managing and monitoring predictive and generative AI models. This platform helps build enduring consumer trust in your brand, boost productivity and accelerate business outcomes, and mitigate risk while minimizing the cost of compliance.

The watsonx.governance platform handles three key aspects necessary for AI governance throughout the entire model life cycle and across the entire enterprise. First is life cycle governance. AI governance involves the entire organization, not just the data science department. It covers everything from the initial model request to deployment, including stages like resource approval, data management, and testing. Effective governance requires the involvement of both technical and non-technical members and aims to automate processes to reduce the data science workload. It also ensures decision makers have the necessary data to make informed decisions. watsonx.governance facilitates this by tracking and cataloging metadata such as training data and model evaluations, providing a complete overview of model development and deployment performance. Second is risk management. Before organizations can trust AI to make business decisions or interact with customers, they must understand and quantify the risks that AI presents and be able to measure AI performance to monitor their risk exposure. And third is regulatory compliance. Increasing government regulation of AI presents serious problems for organizations hoping to adopt AI without a comprehensive, configurable governance system. The watsonx.governance solution allows enterprises to track their models against regulatory standards in areas such as accuracy and fairness. It also provides the ability to explain decisions and automatically collect metadata, so auditors can determine how models were trained and why they generated a given output. Finally, watsonx.governance allows for governance of all predictive and generative models regardless of deployment platform.

watsonx.governance provides three main capabilities, shown in the blue blocks: AI documentation, AI risk governance, and AI evaluation and monitoring. These work together with different AI stacks, which you will see soon in the white blocks.

As an example of an end-to-end governed process: first, a department identifies a business challenge solvable by AI, initiating a new use case. The AI use case undergoes approval, with model documentation being developed and updated in sync. During model development, all metadata is automatically captured and updated using tools from both popular open-source frameworks and watsonx.ai; custom metadata tracking is also supported. The model's pre-production evaluation captures performance data, leading to production approval. The model is then deployed on the preferred platform, and once again the relevant metadata is captured and synced. Lastly, the production model is continuously monitored, with performance data captured and synced as well, and the model owner keeps an eye on the performance metrics in their dashboard.

Here is an example of the end-to-end life cycle for a foundation model. First is model approval: with foundation models, organizations will need to evaluate and approve those multi-purpose models before they are used in any use case; a foundation model is upstream from the use case. Second is use case approval: in this step, much stays the same compared with a use case built on a traditional model; you will still need an accountable owner of the use case, a description of its purpose, a risk assessment, and a decision on the appropriate risk controls and metrics. However, with a foundation model you now also need to specify which tasks are required to deliver the use case: is it summarization, classification, or something else? Next is model selection, a new step with foundation models: organizations will have many foundation models approved for use, including the tasks that are allowed. In this step, the user matches the tasks allowed for approved models against the tasks required for the use case. If multiple models are available, users can make right-sized trade-off decisions considering quality, cost, energy consumption, and so on. Then model fine-tuning: when you fine-tune a model, you change the weights of the model. This will not always be necessary, but if you choose to do it, the fine-tuned model should be considered a new object, distinct from the base model from which it is derived, and it should be governed in its own right in addition to the base model. Then prompt development: with prompts as the primary way to interact with foundation models, organizations need to add prompt governance to their repertoire in addition to model governance. At the development step, this means capturing the prompt metadata you need for your governance activities, including the new model parameters described earlier. Then evaluation and monitoring: as mentioned earlier, many of the tasks supported by foundation models come with new metrics and explainability methods, and organizations should look to adopt these into their frameworks and projects. And last is the change request: with a traditional model, a change request almost automatically meant retraining your use-case-specific model, but with a foundation model there are several things a change request could relate to. It could mean changing the model selection: as new models are developed and approved, there might be a better choice for your use case. Another option is to train or retrain a fine-tuned model: maybe you had some initial success with the base model and are now looking to improve on that by fine-tuning the model with some of your own data, or the business has changed and you want to update your fine-tuned model. And another option is simply to adjust your prompts. Of course, all the steps mentioned here are covered by the watsonx.governance platform, as we will see in the next slides and in the lab.

Let's review the watsonx.governance capabilities; most of them we will see soon and try in the lab. Model risk governance is managing and automating the activities around attestation, review, validation, change management, and issue resolution of your AI models. It is important for meeting the compliance requirements of model-focused regulations across regions and authorities, and for reducing governance costs.
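The approval and change-management activities described here are often implemented as a state machine over the model life cycle. A minimal sketch with hypothetical state names, not the platform's actual workflow:

```python
# Allowed lifecycle transitions: a model/use case can only move along
# these edges, so every state change is auditable and controlled.
# State names are illustrative assumptions.
ALLOWED = {
    "requested": {"approved", "rejected"},
    "approved": {"in_development"},
    "in_development": {"pre_production"},
    "pre_production": {"production", "in_development"},  # eval may send it back
    "production": {"retired"},
}

def advance(state: str, target: str) -> str:
    """Move to the target state, rejecting transitions the workflow forbids."""
    if target not in ALLOWED.get(state, set()):
        raise ValueError(f"cannot move from {state} to {target}")
    return target
```

Hanging approvals, documentation requirements, and notifications off each transition is what turns a simple state machine like this into a governance workflow.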
watsonx.governance brings together all stakeholders in one process with clear roles and responsibilities, and combines a flexible data model with workflow, calculation, questionnaire, and business intelligence capabilities.

AI documentation is tracking the life cycle of your models and prompts from cradle to grave, with a fact sheets view for AI assets that tracks lineage events and facilitates efficient ModelOps governance. It can reduce the manual effort of documenting models and prompts and increase the transparency of models. To do this, we capture facts about use cases, models, and prompts throughout the model life cycle, automatically from common Python frameworks and from the watsonx.ai Prompt Lab; we extend them with custom facts, capture attachments, and generate automated reporting for different stakeholders.

Design-time evaluation of LLM prompt templates: as AI engineers create their prompts, they can evaluate them directly from within their development environment. It is important to identify and mitigate prompt issues as early as possible in the process, and integration into the development environment saves time. To achieve that, Prompt Lab gets an evaluate option directly in its UI, with metrics automatically selected based on the task type, such as text summarization, content generation, extraction, or Q&A. You can test prompts for quality and for safety metrics such as toxic language and personally identifiable information.

Monitoring model health means tracking performance metrics for model and prompt inferencing, such as the number of records, number of tokens, payload size, latency, and throughput. It helps identify technical bottlenecks in predictive model and LLM inferencing. Imagine if at some moment your perfect model started taking ten times longer to answer: of course, you would like to catch this change as soon as possible. It also reduces the effort of monitoring the technical workings of deployed models: watsonx.governance monitors and reports on the number of records, the number of scoring requests, throughput and latency, the number of users, the amount of input and output tokens for LLMs, and the payload size for predictive models.

Monitoring generative quality means monitoring how well your LLM prompts perform. It is important for maintaining the business benefits of your deployed prompts, and it can help reduce the manual effort of tracking model performance. You can think of these model quality metrics as a blood test, with defined normal thresholds and actual values. To do this, we monitor deployed prompts with metrics automatically selected based on the task type; these metrics can include ROUGE, METEOR, similarity, and others. We also keep monitoring for quality drift and for safety metrics, again such as toxic language and personally identifiable information.

Monitoring drift: drift occurs when, over time, the production data or outcomes start to differ from those at the time of training and testing the model. Imagine if your well-tuned prompt and model started providing less accurate results than they did during training or testing; you would definitely want to see this drift. We need drift monitoring because loss of accuracy has a negative impact on business outcomes, because it reduces the manual effort of detecting drift, and because drift reveals changes in real-world behavior you might otherwise not recognize. To catch drift, we automatically monitor production data for different types of drift, such as output drift, model quality drift, feature drift, and input and output metadata drift.

Governing large language model prompts: prompts, as we remember, are text-based instructions to foundation models such as LLMs. They are created and used differently than traditional predictive machine learning models. Why do we need this? Governing both types of assets on one platform reduces cost and effort, applies consistent rules and methods to both types, and reduces the cost of compliance through automation. watsonx.governance tracks prompts throughout their life cycle in an AI use case, automatically captures prompt metadata, evaluates prompts during prompt design, and monitors prompts when deployed into production.
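The drift monitoring covered above is commonly implemented with a statistic such as the Population Stability Index (PSI), which compares the binned distribution of production data against training data; the talk does not name a specific method, so the following is an assumed, simplified sketch:

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 4) -> float:
    """Population Stability Index between training-time ('expected') and
    production ('actual') score distributions. Values above roughly 0.2
    are commonly read as significant drift. Bins are cut at training-set
    quantiles; this is a simplified illustration, not a library routine."""
    srt = sorted(expected)
    edges = [srt[int(len(srt) * i / bins)] for i in range(1, bins)]

    def shares(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            counts[sum(v > e for e in edges)] += 1
        return [max(c / len(values), 1e-6) for c in counts]  # avoid log(0)

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A monitor would recompute this on a schedule over recent production scores and raise an alert once the index crosses its threshold.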