Make Your Applications Smarter with Oracle Machine Learning and AutoML | Oracle DatabaseWorld AI Edition
Summary
TLDR Hi, I'm Marcos Arancibia. Today I'll show you how to make your applications smarter with Oracle Machine Learning and AutoML. Running machine learning models inside the database improves your analysis of customers and processes and cuts the cost of external packages. You can also build and deploy models easily with Oracle's AutoML UI and use them through a variety of APIs. A demo of a customer churn prediction model shows how useful this is. Oracle Machine Learning eliminates the hassle of data movement and enables fast project delivery.
Takeaways
- 😀 As part of the Autonomous Database product management team, this session shows how to make applications smarter with Oracle Machine Learning and AutoML.
- 📈 Infusing machine learning into applications improves your analysis of customers and processes.
- 🛠 Build machine learning models with the AutoML UI and improve application performance through different APIs.
- 🔒 Models that run inside the database increase the reliability and security of machine learning.
- 💡 Machine learning use cases exist in every industry, helping with customer segmentation, loyalty, churn, and more.
- 🔧 AutoML eliminates repetitive tasks, automating algorithm selection, sampling, feature selection, and model tuning.
- 📊 Processing data directly inside the Oracle database cuts the cost and time of data movement.
- 🚀 The Oracle MovieStream use case shows how to predict customer churn and offer promotions.
- 📝 The session explains building models with the AutoML UI and running scoring and deployment in Python and SQL.
- 🔍 Data and model monitoring features help guard against data drift and model degradation.
Q & A
What is automated machine learning (AutoML)?
-AutoML is a technology that automates the process of building, evaluating, and tuning machine learning models. It reduces the manual work data scientists spend selecting algorithms and optimizing models.
What are the main benefits of Oracle Machine Learning?
-The benefits include running machine learning directly inside the database, which reduces data movement and simplifies the solution architecture. It also improves reliability and security and makes models easier to build and deploy.
In what use cases is Oracle Machine Learning applied?
-Oracle Machine Learning is used across industries for use cases such as customer segmentation, loyalty, customer churn, next best offer, cross-selling, and predictive maintenance.
What are the benefits of Oracle Autonomous Database?
-Oracle Autonomous Database runs the Oracle database in the cloud and automates infrastructure management such as patching and backups, so users can focus on analytics and development.
Which algorithms were used to predict customer churn?
-Customer churn prediction used a decision tree, a random forest, and a support vector machine (SVM) with a linear kernel.
How does AutoML work?
-AutoML automates algorithm selection, adaptive sampling, feature selection, and model tuning. It picks the best algorithm and model parameters, making data scientists more efficient.
What are Oracle Machine Learning Notebooks?
-Oracle Machine Learning Notebooks is an interactive notebook environment where data scientists build, evaluate, and deploy machine learning models. Code can be run in languages such as Python and SQL.
What is model tuning in AutoML?
-Model tuning in AutoML is the process of automatically adjusting each model's parameters to improve accuracy, yielding higher-performing machine learning models.
What is Oracle Machine Learning Services?
-Oracle Machine Learning Services deploys machine learning models as REST endpoints and runs model predictions through an API, so application developers can easily integrate machine learning models.
Why is model monitoring important?
-Model monitoring is important for detecting data changes and model degradation over time. It helps maintain model performance and trigger appropriate updates or retraining.
Outlines
😀 How to Make Applications Smarter
Marcos Arancibia, part of the Autonomous Database product management team, introduces how to make applications smarter with Oracle Machine Learning and AutoML. Key points: integrating machine learning into applications improves your analysis of customers and processes, and the AutoML UI lets you build machine learning models and use them through various APIs while improving reliability and security. He also explains that machine learning use cases exist in every industry, how different algorithm classes are used, and the large whitelist of algorithms available inside the Oracle database.
🚀 The Oracle MovieStream Use Case
The Oracle database for the cloud automates infrastructure management such as patching and backups. Using Oracle MovieStream, a fictional Netflix competitor, as the example, he explains how machine learning identifies why customers leave (churn) so they can be retained. He shows the process of cleaning up data from a data lake and building and deploying machine learning models, and how AutoML makes data scientists more efficient.
🛠️ An AutoML Experiment Demo
A demo of running an AutoML experiment inside Oracle Cloud Infrastructure, walking through selecting the data source, reviewing the features, defining the prediction target, and model tuning. A particular algorithm (random forest) is selected, and he explains how to review its feature importance and accuracy, plus how to run scoring with SQL and Python.
💻 Generating a Model Notebook and Deploying
How to turn a model generated by AutoML into a Python notebook saved in a reusable form. The notebook includes the experiment details, the data preparation steps, and the tuned settings. He also shows deploying the model as a REST endpoint so other users can use it immediately.
📊 Scoring and Using the Model
A demonstration of predicting customer churn probability using the generated notebook and the REST endpoint. Scoring runs in Python and SQL, returning results that include customer IDs and predicted probabilities. He also explains using micro-batches for fast scoring.
🧪 Monitoring Models and Data
An introduction to the monitoring features of Oracle Machine Learning Services for guarding against data drift and model degradation. He explains how to monitor data and model changes in the UI and evaluate them with specific metrics, including monitoring feature drift and multiple algorithms at once.
🌟 What's Next for Oracle Machine Learning
The roadmap for Oracle Machine Learning components, mentioning new features coming later this year such as GPU availability, LLM inferencing, and embeddings. The session closes by reiterating the benefits of integrating machine learning into applications and the improved reliability.
Keywords
💡Oracle Machine Learning
💡AutoML
💡Customer Churn
💡Predictive Maintenance
💡Classification Algorithm
💡Algorithm Whitelist
💡Oracle Converged Database
💡AutoML UI
💡Data Mining
💡Model Reliability and Security
Highlights
Introduction to making applications smarter with Oracle machine learning and AutoML.
Key benefits: improved analysis of customers and processes by infusing machine learning into applications.
Building machine learning models using AutoML UI and integrating them through different APIs.
Oracle machine learning models increase reliability and security by running inside the database.
Machine learning use cases span across industries involving customers, products, equipment, and employees.
Examples of use cases include customer segmentation, loyalty, churn, predictive maintenance, and employee attributes.
Different classes of algorithms used for customer lifetime value, loan analysis, probability of default, etc.
Overview of Oracle machine learning algorithms available inside every Oracle database.
Demonstration of Oracle MovieStream use case to predict customer churn using machine learning.
Explanation of AutoML process: algorithm selection, adaptive sampling, feature selection, and model tuning.
AutoML eliminates repetitive tasks, enabling even experienced data scientists to save time.
Detailed steps of using AutoML in Oracle machine learning UI, including data source selection and model configuration.
Comparison of different models and evaluation metrics within Oracle machine learning UI.
Deployment of machine learning models through REST endpoints and scoring with dynamic SQL.
Use of Python and R for building and deploying machine learning models, emphasizing flexibility for data scientists.
Introduction of Oracle machine learning services for data and model monitoring to avoid data drift and ensure model reliability.
Roadmap for Oracle machine learning components, including upcoming features and enhancements.
Summary of session takeaways: critical intelligence for applications, ease of building models with AutoML UI, and database security.
Transcripts
Hi, everyone. Welcome.
I am Marcos Arancibia,
part of the product management team for autonomous database.
And today, you're gonna see
how to make your applications smarter
with the use of Oracle machine learning and AutoML.
So, the key takeaways for today is gonna be,
you're gonna see how infusing machine learning
into your applications
makes them much better
for any of your analysis of customers and processes.
Building also machine learning models
using AutoML UI is gonna allow you
to use it through different APIs
and make that application work better.
And also increasing the reliability
and security of machine learning models.
And that comes because of the models
that are running inside the database.
It basically lowers the cost
and the usage of third party packages or engines
that you need to move data to.
Machine learning use cases are part of every industry.
You see all the industries have customers
and/or products and/or equipment and/or employees.
And there are many, many different types
of use cases you can see.
So, if you have customers,
you are worried about customer segmentation, loyalty,
customer attrition or churn.
And then, if you have products,
you're looking for the next best offer,
cross-selling, things like that.
Looking at equipment, you're trying
to identify predictive maintenance
and things along those lines.
And employees, you're also looking
at all their best attributes
trying to find similar good employees.
But what these use cases have in common
is all of them use different classes of techniques.
So, when you're doing customer lifetime value
or loan analysis or probability of default,
you are using one
or even more of these classes of algorithms.
So, you're using classification or using regression
or anomaly detection, and things like that.
And we're gonna look at today
at a usage of a customer churn.
So, that's part of the classification family of algorithm.
But we're gonna see that technique there.
And every technique
actually has its own algorithms.
So, this is a very large list of algorithms.
This whitelist is actually available
inside every Oracle database,
and it's part of the Oracle machine learning.
So, we have algorithms for classification and regression,
and all of these were created basically looking
at what our customers were using most frequently.
As part of the AI and ML ecosystem across Oracle,
the machine learning and database
is part of the machine learning
for data platforms right there in the middle.
And that's what we're gonna be focusing today.
We're also part of the Oracle converged database.
So, when you think about the database,
we are actually in there.
So, we're part of that process.
If you have an Oracle database, we are in there.
We're part of that process no matter where.
So, you can see that on the left-hand side here,
I have the components listed.
This is basically the different APIs of a SQL,
APIs available everywhere.
And we have Python and R
that are available in the autonomous database side
and also on database on-premises, base database service,
and exadata and things like that.
There are three components that are exclusive to autonomous,
which are the Oracle machine learning notebooks
and OML AutoML UI.
And also the Oracle machine learning services,
our REST endpoint server there.
But then, Oracle Data Miner
is available again against all of these platforms as well.
But we're very flexible here.
The idea for us,
and the reason why we created the Oracle machine learning
in the first place was that there's a gravity on your data,
on the volumes of data, large volumes of data.
Why would you take terabytes of data
out of the database to crunch numbers on a little platform
where you have open source algorithm?
When you can actually do that
directly where the data resides.
So, saving and eliminating all the data movement
and simplifying the solution architecture
is what we were going for.
So, that's where we run
those basically 30 different algorithms there.
At the end of the day, that actually saves a lot of time.
And basically, the data access time
and the time to export the results,
all of those things are eliminated,
but also the crunching time
for the machine learning modeling,
the data preparation and exploration, and things like that.
Because running those things inside the database,
right where your data is, is faster.
So, it gives you better time to production on your project.
So, we're gonna be using the autonomous database today
for demonstrating to you guys this process.
Basically, the autonomous database helps you
with running the Oracle database, but for the cloud
and without requiring you to do any patching,
any backup, any worrying about any of those activities.
We are actually managing the infrastructure.
We're managing the automating,
all the database, process and management.
We're managing all the data center operations,
so we have all of that automated for you,
so you have more time to develop,
more time to work with analytics.
That's the idea for us, freeing you up to do that.
So, I'm going to be using now a use case,
and this use case is gonna be the Oracle MovieStream.
So, this is like your Netflix competitor here
where I have a lot of customers.
So, I have a lot of customers that I need to take care of
and I'm worried about,
and I need to actually acquire more customers,
but keeping the customers that I have
is probably the cheapest way of continuous revenue.
So, I'll be evaluating these customers
and I'll try to identify which customers
are the most important ones
in terms of customers that I can keep.
So, to understand whether I can keep a customer,
I need to understand why they leave.
And the reasons
why they're leaving is what we call the churn.
So, I'm gonna try to identify the probability
of a customer leaving,
so probability to churn using machine learning.
And then, that way, I can potentially offer something
like a pizza promotion to that customer,
to try to make him come back
or make him stay with our service.
But for that, I'm gonna assume that I have a data lake
in our Oracle cloud infrastructure.
I got lots of different sources.
I have enterprise applications,
MovieStream events, third party data.
I also have an object storage
where I have landing zone, gold zone, and sandbox.
And the idea here
is that I'm going to be using the autonomous database
for working with that data, once that data
is loaded in and cleaned up and prepared for me.
I'm gonna be showing you the model portion
and the deployment portion
of the machine learning components there.
There are many other tools
that are available in autonomous database for self-service,
like loading and transformations
and graph and things like that.
I'm gonna be focusing specifically
on the machine learning side.
So, the problem with the traditional machine learning
is that usually what happens
for data scientists like myself,
is you start the process
by selecting one of those algorithms
that we talked about before.
Now, I pick up one of those algorithms at a time.
I then go to evaluate that algorithm against data,
come up with one algorithm or one model.
But then, I have to create and optimize that model.
So, I start playing with all the different components
and the different parameters
that each model has until I get one model
or I get a bunch of versions
that are the best versions for that model.
Those are the things that I'm doing.
But then, I have to rinse and repeat that process.
So, now, I have to restart the entire process again
using the next algorithm.
So, what AutoML or automated machine learning does
is actually eliminates that repetitive task.
All of those process of building
and evaluating every different algorithm.
It's gonna use something
called the auto algorithm selection.
So, it's gonna look into all of these algorithms,
and then it's gonna go
and say, okay, out of all of these algorithms,
which ones are the ones
that actually best work for your problem, for your data?
Next step is gonna go and do an adaptive sampling.
It's gonna identify what is the right size of a sample,
make it proportional,
and do the analysis
that it needs to best perform for each model.
And then, it's gonna do an auto feature selection.
So, things like color of the eyes,
the name of the customer or the shoe size shouldn't matter
to evaluate the process
of whether the customer's gonna stay or not.
So, it will eliminate all of those things automatically.
And then, you can go into auto model tuning.
It's gonna increase the accuracy of each of the models
by tweaking all of the parameters that it has a little bit.
So, that process basically is gonna enable even someone
that is an experienced data scientist
to actually save time as well.
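As a rough illustration of the rinse-and-repeat loop AutoML eliminates, here is a toy Python sketch of trying several candidate algorithms and parameter settings and keeping the best by a metric. The candidate names, parameters, and scores are all invented stand-ins, not Oracle's in-database algorithms.

```python
# Toy sketch of the manual try-every-algorithm loop that AutoML automates.
# The candidates and the scoring function are hypothetical stand-ins.

def evaluate(name, param):
    # Hypothetical "train and validate" step returning balanced accuracy.
    base = {"decision_tree": 0.89, "random_forest": 0.94, "svm_linear": 0.91}
    return base[name] - abs(param - 0.5) * 0.01  # tuning nudges the score

best_name, best_score = None, -1.0
for name in ("decision_tree", "random_forest", "svm_linear"):  # algorithm selection
    for param in (0.1, 0.5, 0.9):                              # model tuning
        score = evaluate(name, param)
        if score > best_score:
            best_name, best_score = name, score

print(best_name)  # the winner of the search: random_forest
```

AutoML performs this search in the database in parallel, and adds adaptive sampling and feature selection between the two loops.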
So, I'm gonna be showing you
and simulating that I am that data scientist
right there in the middle here.
And I'm gonna be working
on deploying this model in several different ways.
You're gonna see the different capabilities
and different UIs that we have available for that.
And then, finally, I'll talk about how that developer,
that person, that app developer person can actually use that
to really at that point, create the application there.
So, I'm gonna start then on my demo.
Basically, here is the Oracle cloud infrastructure.
So, I'm inside the autonomous database console here.
So, I have my autonomous database console.
And in the console, I have something called database actions
where you can create users and things like that.
But I'm gonna click on the View all database actions.
So, this is the database actions then.
And you can see that I have a lot of different tools here.
As I mentioned, tools for data loading
and transformations and things like that
and managing things.
But on top here, I have the development tool.
And specifically on this one,
I have the Oracle machine learning.
So, when you click on that, you get here.
This is the Oracle machine learning UI,
and I'm the user, MovieStream.
And what I'm gonna do here is I'm gonna click on AutoML.
And I'm gonna look at the AutoML experiment.
There is an experiment there already,
but I'm gonna create a brand new one
to show you guys step by step.
So, we want to detect
customers about to leave.
I can put any
comments.
And the first thing I need to do
is I need to get a data source.
If you look at the table, and then on the right,
I have here at the table, I have the schema.
So, all the schemas
that my Oracle database user has access to
and all the tables in there.
So, I have access to this table. I'm just gonna click on it.
And now, what the system is doing
is actually showing me down here,
you can see all of the different features that I have there.
So, I have things like age, average number of transactions.
I have transactions three months ago,
four months ago, five months ago.
Sales, same thing.
Discount, average number of transactions
in the last quarter, CD, credit balance, customer ID.
All of these different discounts. Education, email.
And then, things like gender.
And then, I get to the genres, movie genres.
How many movies of genre action the customer watches,
how many movies of the adventure,
animation, biography, comedy, things like that.
And then, at the end here,
I have even more things like household size, income level.
And the most important here in this case is churner.
That specific feature is telling me
whether a customer left last month or he stayed,
which means he had zero watches on the service last month.
Churner definition is something that all companies,
all businesses do differently.
You could have thought about something like,
wow, the customer reduced their watch levels
by 90% or something.
Which it is an indication
that they're probably watching movies somewhere else.
So, this is just an example,
a definition that all companies do differently.
In this case, I have a zero, one,
a binary target here that says the customer left.
Last month, he didn't watch any movies with me.
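The two churn definitions mentioned here can be sketched as small Python functions. The zero-watches rule is the one the demo table uses; the 90% drop threshold is the hypothetical alternative the speaker describes, and both function names are mine.

```python
# The demo's definition: a churner had zero watches on the service last month.
def is_churner(watches_last_month):
    return 1 if watches_last_month == 0 else 0

# The alternative definition sketched in the talk: watch volume dropped
# by 90% or more versus the customer's previous average.
def is_churner_by_drop(prev_avg_watches, watches_last_month, drop=0.90):
    return 1 if watches_last_month <= prev_avg_watches * (1 - drop) else 0

print(is_churner(0), is_churner_by_drop(20, 1))  # both flag a churner: 1 1
```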
And then, finally, a few other columns like marital status,
the location, latitude and longitude,
and years as customer, and things like that.
Very well. So, I define then the prediction.
What I wanna try to predict
is whether the customer is going to stay or going to leave.
So, it's the IS_CHURNER.
And then, the case ID here, I'm gonna go to customer ID.
Now, because I selected this churner
and the churner is a binary zero, one,
it's going to automatically detect
the prediction type of classification.
I can see that it could have done regression
if I'm trying to predict something else
like customer credit limit or other things like that.
All right. So, back to the features here then.
I can manually remove things
like I don't even want the system to try to use.
So, maybe first name and last name,
I know they have nothing to do with
whether a customer is gonna stay.
And if they did, it probably would've been wrong.
So, you don't want to use that.
Then, in additional settings here,
I will reduce the number of maximum top models to three.
So, let me check the top three models here.
I have the maximum duration, it's okay, the default.
Database service level.
So, autonomous database have low, medium,
and high service levels.
Medium allows me to do parallelism.
So, I'm just gonna turn it to medium.
And then, the model metrics.
So, the data scientist
is gonna choose maybe a different model metric
like accuracy, like area under the ROC curve,
F1, or Precision or Recall,
depending on what they're optimizing for.
And the list of algorithms.
So, this is just something you can uncheck if you want.
Something like companies in the financial services
that actually work with credit scoring,
they are not allowed in the USA
to use neural network algorithms
to define their credit score.
So, they need to use something else, GLM or decision trees.
So, you could potentially uncheck neural network here
and just not make use of that,
but let it run through the other ones
and see which ones are the best.
All right, so having said that, I'm gonna click on here,
I'm gonna click Faster Result.
That's basically, it's going to run faster.
If I selected better accuracy,
it's just gonna do a little more tuning,
a little more fine tuning.
So, it's taking a little more time.
So, I'm gonna just let it run.
So, it's starting my experiment right now.
So, after a basic initialization phase here,
it's going to then start working in the second step here.
That's gonna be the leaderboard
where you're gonna see the selection of the best algorithm.
So, the first step's gonna be now algorithm selection.
That's gonna be a happening here.
So, every step that it takes,
so it's gonna do algorithm selection.
After the algorithms are selected,
remember that I requested three,
it's gonna go automatically and do adaptive sampling.
It's gonna go through feature selection,
and then it's gonna do a final model tuning
and a prediction impact analysis
of every feature for all the models.
So, I can see the first initial balanced accuracy
of that table, of that database.
And now, once it runs through the algorithm selection
like it just did, here in the leaderboard,
you can see it's selected the decision tree,
a random forest,
and a support vector machine that is a linear kernel.
These three algorithms are the first ones,
are the best ones that it identified initially.
Now, it's gonna go through an optimization here for them.
So, I'm gonna just leave this experiment running
and I'm gonna open up another one that just ran.
So, you can see it's running here on top.
So, this other one that I just ran,
actually completed in around three minutes.
And you can see all of the steps that it took.
So, it ended up that the random forest
is the best one for balanced accuracy.
If you click on that model in the link, this is what...
This is showing basically
all of the different features and their weight.
They're important for that model in particular.
So, basically saying, age and gender,
education,
how many thriller movies the customer watches,
average number of transactions in the last quarter,
years of residence, credit balance,
genre family and genre war.
These are the features
that actually make this model important.
And the confusion matrix here tells me the errors.
So, I predicted that the customers were not gonna churn
and they did not.
The model was correct 64% of the time here
or 64% of the population falls into these guys.
And then, 28% of the population falls here
where I predicted they were going to churn, they did.
Where the model missed was 6% of the people
I thought they were gonna churn and they didn't.
It's not terrible, because if I'm offering a pizza,
it's fine, I'm gonna be sending pizza
to people that were not necessarily gonna churn.
But again, it's okay.
Now, this 1% of people here, it might be more important,
more critical to look at later,
because those guys, I could not predict.
They actually churned,
but I did not predict they were going to.
So, again, it's not a large one compared to this.
But it's still okay.
So, these are things that I can do later.
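Using the rounded population shares just described (they don't sum exactly to 100% because of rounding in the talk), the usual classification metrics can be recomputed in a few lines. This is a worked illustration, not output read from the tool.

```python
# Shares of the population from the demo's confusion matrix (rounded):
tn = 0.64  # predicted no churn, did not churn
tp = 0.28  # predicted churn, did churn
fp = 0.06  # predicted churn, did not churn (they get a free pizza anyway)
fn = 0.01  # predicted no churn, but churned -- the costly misses

accuracy = tp + tn                        # overall fraction correct
recall = tp / (tp + fn)                   # churners the model caught
specificity = tn / (tn + fp)              # non-churners correctly kept
balanced_accuracy = (recall + specificity) / 2

print(round(accuracy, 2), round(balanced_accuracy, 2))  # 0.92 0.94
```

Balanced accuracy is the metric this experiment optimized, which is why it averages the per-class rates instead of just counting correct predictions.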
All right. Now that the model is built, I can do a rename.
So, I give the model an interesting name.
I already did that for CHURN_PRED,
so I can reuse that model.
It's already there in the database with these names here.
But I can reuse that model
for other things like scoring immediately.
I can then create a notebook or let me first select metrics,
so you can select different metrics here and compare models.
So, that just shows me different metrics
that I was not optimizing for,
but I can see how they look like after the models are done.
And then, I can create a notebook
that basically takes this model here
and it builds
an entire notebook
that actually is written in Python
using all of the steps that AutoML actually did.
All of these optimizations,
every single thing that the AutoML was doing,
it's gonna be translated into Python.
And then, you're gonna be able to see that notebook
and we're gonna go there soon.
And then, the other thing
that I can do here is I can deploy the model.
So, I can deploy this model to a new URI.
So, if I give it a URI name, a version, and a namespace,
this goes to what we call
the Oracle machine learning services.
That's our server
for autonomous database users or customers.
And with that, you can actually score immediately
using REST against that model there.
So, we're gonna see all of that.
So, now, first things first,
if I come here to the Notebooks then,
I can see that notebook that was generated for me.
So, this notebook that is generated
is going to have some comments
about the model originally, the experiment.
And then, it's gonna talk about the data.
So, step by step here,
it's actually going to import
the Oracle machine learning package for Python.
Inside a Python session,
it's going to create a new, what we call proxy object,
that points to this table with these columns.
So, it's basically creating a view behind the scenes
that is a query based on these columns
that were actually the ones identified by AutoML
to be the best ones,
the ones that were needed by this random forest.
And then, it creates what we call a proxy object.
This is not pulling the data back to Python's memory.
This is just a pointer to the data
that is in the database, that new view that was created.
It prepares the data for building the model.
And then, finally, here, I have the exact settings
that the AutoML model identified
as being the most important settings for that random forest.
So, that I can reproduce exactly
that random forest model that I have.
So, exactly which impurity metric it's using,
the minimum percent split per node,
the tree depth, the max depth.
So, all of these things were the settings
that random forest requires
that actually were identified by AutoML process.
And then, I'm building the model,
so I can build that model and you can take that model
and work with it or rename it, give it a name,
run it on a different server, and things like that.
Now, how can I use that?
So, first things first,
I'm gonna show you here I have a scoring.
So, I have a scoring process and I'm going to show you
where I can actually use that model for scoring.
So, first things first,
another user doesn't have to be my own user.
Another user imports the OML library,
and sets the proxy object
to be the entire table.
It's okay.
But then, I bring in
my random forest model that I call CHURN_PRED.
And I can just run a prediction.
With the output of that prediction,
I have probability to churn now.
So, I have all of the customer ID,
the information about the customer,
I have now probability that this customer is going to leave
and a prediction zero, one.
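The shape of that output can be sketched in plain Python: each row pairs a customer ID with a churn probability, and the zero/one prediction is the probability cut at a threshold. The 0.5 cutoff and the IDs here are assumptions for illustration, not values read from OML.

```python
# Hypothetical scored rows: (customer_id, probability_of_churn).
scored = [("CUST-1001", 0.84), ("CUST-1002", 0.12), ("CUST-1003", 0.51)]

# Derive the 0/1 prediction column with an assumed 0.5 cutoff.
results = [(cid, prob, 1 if prob >= 0.5 else 0) for cid, prob in scored]
print(results[0])  # ('CUST-1001', 0.84, 1)
```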
So, same thing with SQL.
I can actually repeat that process with SQL.
And you can see that right here.
So, I have prediction and prediction probability.
These are functions that are exclusive
to Oracle SQL.
So, I have prediction
and prediction probability on CHURN_PRED.
This is dynamic SQL, so it's SELECT *
and I'm dynamically scoring
every customer available here on that table.
And I'm getting their probability to churn right here.
Now, the deployment that I just did
is actually to the REST endpoint,
to the Oracle machine learning services.
So, I have now that model.
So, I'm gonna just get a token,
a token passing my Oracle database username and password.
And now, I can predict,
remember that I called it CHURN_PRED.
So, this CHURN_PRED REST endpoint now is alive.
I am passing and now as the application developer.
I'm just passing this input records.
I'm passing some data about that customer.
So, I click send, that data goes and comes back.
I can see that the result came back in 47 milliseconds.
So, that's very fast between my home here in Miami
and the server in the northeast.
And I see the probabilities here.
So, this is basically the probability that this customer,
in particular this one
that we sent, is going to churn 84%.
But more than that, more than that,
because the top N details here is five,
what I'm saying is I want
to get the top five prediction details
or the top five reasons
why this model thinks that this guy is going to churn.
And I can see the reasons here and their weight.
So, age, the number
of war movies the guy watches,
the number of thriller movies he watches,
how many years of residence and the genre family.
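The request body for that single-record scoring call can be sketched with the standard library. The field names inputRecords and topNdetails follow the terms used in the demo, and the feature names and values are hypothetical, so check the OML Services documentation for the exact payload.

```python
import json

# One customer's features (hypothetical names and values).
record = {"AGE": 44, "GENRE_WAR": 12, "GENRE_THRILLER": 30,
          "YRS_RESIDENCE": 3, "GENRE_FAMILY": 2}

# Ask for the top five reasons (prediction details) along with the score.
body = {"inputRecords": [record], "topNdetails": 5}
payload = json.dumps(body)  # this string is what gets POSTed with the token
print(payload)
```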
And this can also use micro batches.
So, I can have, in this case,
I have 100 customers as an example,
a micro batch here that I'm running.
Same thing, I can just run that scoring
and I'll get, all of these results, still subsecond.
And I get 100 customers
with all their probabilities to churn here.
So, you can see that my application
can handle all of that, 101 of them,
'cause I started counting at zero.
You can run this process very fast.
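Splitting a larger customer list into micro-batches like the demo's can be sketched as a small helper; the batch size of 25 is arbitrary, not something OML prescribes.

```python
def micro_batches(records, size):
    # Yield successive slices so each scoring call stays small and fast.
    for i in range(0, len(records), size):
        yield records[i:i + size]

customers = [{"CUST_ID": n} for n in range(101)]  # 101 records, IDs 0..100
batches = list(micro_batches(customers, 25))
print(len(batches))  # 5 batches: 25 + 25 + 25 + 25 + 1
```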
Back to my slides then.
What I do then after that process.
So, we saw that we could use Python and R.
We saw Python being used as the engine,
but we can use R as well.
And we basically empower all data scientists
using their own native language that they wanna program in,
but they can actually use all the in-database capabilities.
In addition to that, for the autonomous database,
we have the REST endpoint as well that can be used.
And then, remember that I mentioned
that I was gonna show you guys AutoML,
the UI side was part of the autonomous database.
Well, you have AutoML as well available in Python
and this can run anywhere.
It doesn't need to be the autonomous database.
It can run on-premises, it can run on an exadata.
You have this Python AutoML API that you can run.
And then, just give the model name.
You can score that with dynamic SQL
like you can see here on the right.
You can also use PL/SQL to build a model.
And that's something we have supported
all the way back to Oracle 11g Release.
But basically, this is the in-database model building
using PL/SQL on the left.
And on the right, I can actually reuse that model in Python
and I can use a decision tree model in Python
and run inference with that.
I also can build a decision tree with Python
and score in SQL or I can even use R
and build an in-database decision tree model
like I'm doing here on the left.
And then, again, use SQL.
So, there's a lot of flexibility for data scientists.
Finally, one of the most flexible portions
we have is the embedded execution.
You can actually write your own code, your own R,
or your own Python code,
and run your own third party package,
whatever you are looking for.
And we can actually package it up.
We actually spawn the R or Python session needed.
The database is controlling that.
The database injects the data needed on that session.
Runs that code that you wrote,
and then brings back the results.
We can bring back the result into the database
or to an object storage or to a file service
or anything that you need there.
So, it is again, very, very flexible and very powerful.
All right, so recently,
we released data and model monitoring
for Oracle machine learning services.
So, basically, what we're trying to avoid here is,
data over time changes.
Your customer behavior changes, the economy changes,
and what people are doing changes.
There's something trending up and down.
So, all of these things affect data, the data coming in.
So, we wanna avoid that data drift.
But with model monitoring,
we're also looking at the model itself.
And in addition
to these things in OML services,
we added UIs for those.
So, we have this UI available there for you
where you have data monitoring,
you're looking at any drift over time for that process.
And then, things like looking at the features,
whether they're changing over time.
And then, we're evaluating them with specific metrics,
with some statistics, population stability index,
things like that that you can choose
and evaluate the specific processes that are happening.
And also on the model monitoring side,
you can monitor several algorithms at once.
So, we're looking at all these different algorithms
and some of them are drifting more than the others,
depending on what model it is.
So, you can actually evaluate them, look at feature impact,
look at the predictive impact over time.
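One of the statistics mentioned, the population stability index, is easy to sketch: compare a feature's binned distribution at training time against the distribution in new scoring data. The bin fractions below are invented for illustration, and reading PSI under about 0.1 as "little drift" is a common rule of thumb, not an OML default.

```python
import math

def psi(expected, actual):
    # Population stability index over matching bins; each list of
    # fractions should sum to 1 and contain no zero bins.
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))

baseline = [0.25, 0.25, 0.25, 0.25]  # feature distribution at training time
current = [0.30, 0.25, 0.25, 0.20]   # distribution seen in new scoring data
drift = psi(baseline, current)        # ~0.02 here: little apparent drift
```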
So, last but not least here,
the roadmap for Oracle machine learning components.
So, basically, these are the things
that are being launched and announced
and we're gonna be working on for the rest of the year,
working on different things
and including GPU availability
for things like LLM inferencing
and embeddings and things like that.
And then, I hope you guys
had a good time watching this session.
These are the key takeaways.
Again, machine learning can infuse your applications
with critical intelligence
about your customers and processes.
Building these machine learning models using the AutoML UI
is very easy, and you can deploy them to different APIs.
And then, also, we talked about the reliability
and the security of building these models in database.
They are objects inside the database.
These are some of the links
that you guys can go for more information.
Live labs, you can go into a workspace in Slack
to find us there, LinkedIn, and things like that.
So, with that, thank you very much for watching the session
and appreciate it and have a good day.