Model Autotuning, Generative AI | SAS Viya March 2023 Release
Summary
TLDR: In the April 2023 SAS Viya release highlight show, host Thiago Doza discusses new features with experts. Sasha Karpinsky introduces Microsoft Word integration in SAS for Microsoft 365, which lets users embed and update SAS reports in Word documents. Joe Madden presents machine learning updates, including ASTORE model format support for neural networks and autotuning for LightGBM. Alex Vilan covers enhancements in SAS Studio, such as data engineering steps and custom column steps. The show also features part two of an interview with Mary Osborne on generative AI models, exploring their potential and ethical considerations in various applications.
Takeaways
- The SAS Viya release as of April 2023 includes significant updates and integrations.
- Integration with Microsoft Word allows for seamless sharing of SAS analytical insights within Word documents.
- Machine learning updates add ASTORE model format support for neural networks, improving deployability and scoring speed.
- Autotuning is now available for LightGBM, which is suited to large datasets with many features.
- SAS Studio has been enhanced with new data engineering steps, custom column steps, and an analyst step for ranking data.
- Studio flows can now be deployed or redeployed as jobs, an option that is also supported for SAS and Python programs.
- A BERT-based classifier, a large language model, has been released in SAS Viya to improve natural language processing.
- Generative AI has the potential to revolutionize personalized education and product recommendation systems.
- Synthetic data generation can aid researchers in modeling rare diseases by expanding data sets for better analysis.
- SAS is taking a cautious approach to generative AI, prioritizing ethical considerations and risk mitigation.
- Users are advised to verify the information generated by AI tools like ChatGPT to ensure accuracy and reliability.
Q & A
What is the main topic of the SAS Viya Release Highlight Show for April 2023?
-The main topic is the new features and updates in SAS Viya as of April 2023, including integrations and enhancements in areas like Microsoft Word, machine learning, and SAS Studio.
How does the new Microsoft Word integration in SAS Viya work?
-The integration allows users to view and interact with SAS Visual Analytics reports, insert visual insights directly into Word documents, and update embedded SAS content with the latest data, all without leaving the Word application.
What is the significance of the ASTORE model format for neural networks in SAS Viya?
-The ASTORE format is easily deployable, more portable, and faster than the traditional DATA step method, which significantly improves scoring time for neural networks.
What is the new autotuning capability in LightGBM as discussed in the show?
-Autotuning is now available for the LightGBM implementation in SAS Viya. LightGBM is suited to large datasets with many features, and its autotuning works like other autotune features while adding parameters such as bagging fraction, bagging frequency, lasso, and ridge.
What are the new features introduced in SAS Studio for data engineering?
-SAS Studio introduces advanced data engineering capabilities such as the ability to create outputs as views, a new 'rank data' step for assigning rank values, and extended scheduling functionality for Studio flows.
How can users deploy or redeploy Studio flows using SAS and Python programs?
-Users can deploy Studio flows as jobs without specifying a scheduling trigger and redeploy jobs that have been previously deployed from a specific Studio flow, with this option also supported for SAS and Python programs.
What is the focus of the interview with Mary Osborne in the show?
-The interview focuses on generative AI models, discussing the recent release of a BERT-based classifier in SAS Viya, the concept of generative AI, and its potential impact on everyday life and business.
What are some examples of how generative AI can impact everyday life?
-Generative AI can impact everyday life by providing personalized instruction through 'teacher in the box' concepts, curating product information, and generating synthetic data for research purposes, among other applications.
What are the key technologies behind generative AI as discussed in the interview?
-The key technologies behind generative AI include large language models based on transformer architecture, such as BERT and GPT, and generative adversarial models for generating synthetic tabular data.
What is SAS's approach to the development and deployment of generative AI models?
-SAS is taking a conservative approach to the development and deployment of generative AI models, focusing on trustworthiness, ethical considerations, mitigating risks, and ensuring a strong return on investment for customers.
What advice does Mary Osborne give to casual users of generative AI like ChatGPT?
-Mary Osborne advises casual users to use generative AI with caution, always verify the information it provides, and be aware that the models can generate very plausible but incorrect or non-existent content.
Outlines
Introduction to SAS Viya Release Highlights
The video script introduces the SAS Viya release highlight show hosted by Thiago Doza, focusing on updates as of April 2023. It sets the stage for various segments, including Microsoft Word integration in SAS for Microsoft 365 by Sasha Karpinsky, machine learning updates by Joe Madden featuring a new model format for neural networks and autotuning capabilities in LightGBM, and new features in SAS Studio by Alex Vilan, such as data engineering steps, custom column steps, and the ability to deploy or redeploy Studio flows using SAS and Python programs. The script also mentions the continuation of an interview with Mary Osborne about generative AI models.
Microsoft Word Integration and Machine Learning Updates
Sasha Karpinsky discusses the new integration of SAS with Microsoft Word, enhancing the ability to share analytical insights within an organization directly from Word documents. Joe Madden introduces advancements in machine learning, specifically the support for the analytical store (ASTORE) in SAS for neural networks, which improves deployability, portability, and speed. Additionally, autotuning for LightGBM is presented, which includes unique parameters to enhance efficiency and scalability for large datasets. The segment also previews new output results in Model Studio, including plots for evaluation and iteration history.
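The autotuning idea in this segment can be illustrated with a minimal sketch. It is not SAS's autotune implementation and it does not call LightGBM. It is a plain random search over the four hyperparameters named in the show, written with the open-source LightGBM parameter names (`bagging_fraction`, `bagging_freq`, `lambda_l1`, `lambda_l2`). The search ranges and the `toy_objective` function, which stands in for a validation metric such as KS, are assumptions made up for illustration.

```python
import random

# Hypothetical search ranges for illustration; these are not SAS defaults.
SEARCH_SPACE = {
    "bagging_fraction": (0.5, 1.0),  # fraction of rows sampled when bagging
    "bagging_freq": (0, 10),         # bag every k iterations (integer, 0 disables)
    "lambda_l1": (0.0, 5.0),         # lasso (L1) regularization strength
    "lambda_l2": (0.0, 5.0),         # ridge (L2) regularization strength
}


def sample_params(rng):
    """Draw one random hyperparameter combination from SEARCH_SPACE."""
    params = {}
    for name, (low, high) in SEARCH_SPACE.items():
        if name == "bagging_freq":
            params[name] = rng.randint(low, high)
        else:
            params[name] = rng.uniform(low, high)
    return params


def toy_objective(params):
    """Stand-in for a validation metric (higher is better).

    A real tuner would train a model with `params` and score it on holdout
    data; this made-up function simply peaks at an arbitrary point.
    """
    return -(
        (params["bagging_fraction"] - 0.8) ** 2
        + ((params["bagging_freq"] - 5) / 10) ** 2
        + ((params["lambda_l1"] - 1.0) / 5) ** 2
        + ((params["lambda_l2"] - 2.0) / 5) ** 2
    )


def autotune(n_iterations=5, evals_per_iteration=8, seed=0):
    """Random search that records the best objective after each iteration."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    iteration_history = []
    for _ in range(n_iterations):
        for _ in range(evals_per_iteration):
            candidate = sample_params(rng)
            score = toy_objective(candidate)
            if score > best_score:
                best_params, best_score = candidate, score
        iteration_history.append(best_score)
    return best_params, best_score, iteration_history


if __name__ == "__main__":
    params, score, history = autotune()
    print("best parameters:", params)
    print("best objective per iteration:", history)
```

The per-iteration list plays the role of the iteration history plot described in the show: the best objective can only stay flat or improve as iterations accumulate. The genetic algorithm mentioned in the segment would replace the independent random draws with candidates derived from earlier good ones.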
Advanced Data Engineering and Custom Step Extensions
Alex Vilan presents new features in SAS Studio, focusing on advanced data engineering capabilities. These include the ability to create views as outputs, a new 'rank data' step for assigning rank values, and extended scheduling functionality for SAS Studio flows. The segment also covers enhancements to custom steps, such as making input table and column selector controls read-only and hiding their values at runtime, as well as excluding specific columns from previous selections. These updates aim to empower custom step authors to build more powerful and flexible custom steps.
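To make "assigning rank values" concrete, here is a small sketch in plain Python. It is not the code behind the Rank Data step, and the show does not say how that step treats ties; giving tied values the mean of the ranks they span is an assumption chosen for this example, and `rank_values` is a hypothetical helper name.

```python
def rank_values(values, descending=False):
    """Return 1-based ranks for `values`, giving tied values their mean rank."""
    # Indices of `values` in sorted order.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of sorted positions that share one value.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end are 0-based, so their 1-based ranks average to this.
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


if __name__ == "__main__":
    sales = [10, 20, 20, 5]
    print(rank_values(sales))                   # ascending ranks
    print(rank_values(sales, descending=True))  # largest value ranked first
```

For the input `[10, 20, 20, 5]` the two tied values share rank 3.5 in ascending order, since they occupy positions 3 and 4.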
Generative AI Models and Their Impact
Mary Osborne discusses generative AI models, starting with the release of a BERT-based classifier in SAS Viya. She explains generative AI as a technology for creating content, including images, text, and structured data. Osborne provides examples of generative AI's impact on everyday life, such as personalized education and curated product information. She also touches on business applications like synthetic data generation for rare disease research and the potential for content creation acceleration. The conversation emphasizes the importance of ethical considerations, computational costs, and the potential risks and benefits of generative AI.
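The synthetic data idea can be sketched with a SMOTE-style interpolation, a technique the interview mentions alongside generative adversarial models. The sketch below is not SAS's implementation and it is not a generative adversarial model: it only creates new rows on the line segment between an existing row and one of its nearest neighbors. The function name `smote_like`, its parameters, and the sample values are hypothetical.

```python
import math
import random


def smote_like(samples, n_new, k=3, seed=0):
    """Create `n_new` synthetic rows by interpolating between near neighbors.

    `samples` is a list of equal-length numeric tuples with at least two rows.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(samples))
        base = samples[base_idx]
        # The k nearest neighbors of `base`, excluding `base` itself.
        others = [s for i, s in enumerate(samples) if i != base_idx]
        others.sort(key=lambda s: math.dist(base, s))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbor.
        t = rng.random()
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


if __name__ == "__main__":
    # Made-up (age, lab value) rows standing in for a small patient cohort.
    patients = [(34.0, 1.2), (41.0, 1.9), (29.0, 1.1), (52.0, 2.4)]
    for row in smote_like(patients, n_new=3, k=2):
        print(row)
```

Because every synthetic row lies between two real rows, the output stays inside the range of the original data. That keeps the sketch simple, but it also means it cannot capture structure that a trained generative model might learn.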
Ethical Considerations and Future of Generative AI
The discussion continues with ethical concerns and the future of generative AI at SAS. Osborne emphasizes SAS's commitment to developing trustworthy models and methods, taking a conservative approach to novel text generation to mitigate risks. The conversation highlights the importance of an ethical path, considering data sources, privacy, and the potential for model hallucinations. The segment also addresses the high computational costs and the rapid evolution of generative AI, with a focus on ensuring a strong return on investment and reducing risks for customers. Practical advice for users is provided, urging caution and verification of generative AI outputs.
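The advice to verify generated output can be partly automated for one narrow case from the interview: checking whether links to cited papers resolve at all. The sketch below uses only the Python standard library. The helper names `http_status` and `check_citations` and the example.org URLs are hypothetical, and the status check is injected as a parameter so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for `url`, or None if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # for example 404 when the page does not exist
    except (urllib.error.URLError, ValueError, OSError):
        return None      # unreachable host, malformed URL, or timeout


def check_citations(citations, fetch_status=http_status):
    """Map each cited title to 'reachable', 'broken', or 'unreachable'."""
    report = {}
    for title, url in citations.items():
        status = fetch_status(url)
        if status is None:
            report[title] = "unreachable"
        elif 200 <= status < 400:
            report[title] = "reachable"
        else:
            report[title] = "broken"
    return report


if __name__ == "__main__":
    # Placeholder citations; replace with the links a model actually returned.
    cited = {
        "Paper A": "https://example.org/paper-a",
        "Paper B": "https://example.org/paper-b",
    }
    for title, verdict in check_citations(cited).items():
        print(title, "->", verdict)
```

A reachable link is not proof that a citation is real. In the anecdote from the interview, the second set of links did resolve, but to different papers than the ones cited, so titles and authors still need to be compared by hand.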
Closing Remarks and Invitation for Future Engagement
Thiago Doza concludes the show by summarizing the updates and inviting viewers to try out the new SAS Viya features. He encourages engagement through likes and subscriptions on the SAS Users YouTube channel and asks for comments on topics for future shows. The script ends with a tease for the next show in May, maintaining viewer interest and engagement.
Keywords
SAS Viya
Microsoft Word Integration
Machine Learning
Neural Networks
LightGBM
Model Autotuning
SAS Studio
Generative AI
BERT (Bidirectional Encoder Representations from Transformers)
Synthetic Data
Highlights
SAS Viya release highlights for April 2023 include integration with Microsoft Word, machine learning updates, and new features in SAS Studio.
Sasha Karpinsky discusses the new Microsoft Word integration for SAS, allowing users to share analytical insights directly within Word documents.
Joe Madden introduces a new model format for neural networks, the analytical store (ASTORE), for faster and more portable model deployment.
Autotuning capabilities in LightGBM are now available for improved efficiency and scalability with large datasets.
Alex Vilan covers new features in SAS Studio, including data engineering steps, custom column steps, and the ability to deploy or redeploy Studio flows, an option that is also supported for SAS and Python programs.
Part two of the interview with Mary Osborne explores generative AI models and their potential impact on everyday life.
Generative AI can provide personalized instruction through 'teacher in a box' technology, enhancing education.
Generative AI can curate product information, offering a more general and trustworthy approach to consumer decisions.
Synthetic data generation is highlighted as a way to expand datasets for rare diseases, improving the foundation for modeling and research.
Large language models and generative adversarial models are key technologies behind generative AI, with applications in natural language processing and structured data.
SAS is taking a conservative approach to generative AI, focusing on ethical considerations and ensuring a strong ROI for customers.
The future of generative AI at SAS involves continued research and development, with an emphasis on trustworthiness and risk mitigation.
The rapid growth of generative AI models like GPT is discussed, with newer versions offering improved capabilities.
Users are advised to approach generative AI with caution, verifying the information it provides to avoid inaccuracies.
The show concludes with a reminder to subscribe to the SAS Users YouTube channel for updates on future episodes.
Transcripts
[Music]
hello everyone and welcome to the SAS
Viya release highlight show I'm Thiago
Doza and today we'll be talking about
what's new in SAS Viya as of April
2023 for our rundown segment we have
Sasha Karpinsky highlighting Microsoft
Word integration in SAS for Microsoft
365 then we have Joe Madden with a few
updates on machine learning
first he'll tell us about a new ASTORE
model format for neural networks and
then he'll cover some new model
autotuning capabilities in LightGBM and
in Model Studio Alex Vilan is also back
focusing on different ways to look at
your data Alexi will cover multiple new
features in SAS Studio like data
engineering steps there are two new
custom column steps a new analyst step
for ranking data and the ability to
deploy or redeploy Studio flows using
SAS and Python programs finally in our
release talk segment we have part two of
our interview with Mary Osborne about
generative AI models but that's just an
overview of everything you'll hear about
today why don't we get right into it I
give to you the
[Music]
rundown welcome to the
2023.03 release highlights for SAS for
Microsoft
365 March
2023 today I'm so excited to introduce
SAS's integration with Microsoft Word
available on both web and desktop
versions of Word these new features make
it easier than ever to share critical
analytical insights from SAS with the
rest of your
organization you can now view and
interact with SAS Visual Analytics
reports insert visual insights from SAS
directly into your Word documents and
update embedded SAS content with the
latest SAS data all without ever leaving
your Word
application hi this is Joe Madden and I
have a few exciting machine learning
updates to share with you for the latest
release of SAS Viya we'll kick things
off with a big improvement for neural
networks we are pleased to share that
the analytical store commonly known as
ASTORE is now supported directly in SAS
Studio the SAS Code node or the SWAT
Python library today and it'll soon be
available in Model Studio ASTORE is our
preferred format within Viya for saving
and scoring a model because it's easily
deployable it's more portable and it's
faster than relying upon the traditional
DATA step method in this example we have
a neural net running against some
simulated data scoring time is an order
of magnitude faster with ASTORE
compared to that DATA step and even when
looking at smaller data sets significant
improvement is expected so if you've
used ASTORE for other model types this
will be just like what you're familiar
with and remember scoring can be called
from the PROC NNET and PROC ASTORE
procedures next we have another evolution
of the LightGBM implementation we're
pleased to announce that autotuning is
now ready for the lightGradBoost
action fans of LightGBM know it's
particularly powerful when you need
efficiency and scalability for large
data sets that often include a large
number of features uh so this usage is
very similar to other autotune features
with a couple unique parameters such as
bagging fraction bagging frequency lasso
and
ridge so let's take a look at some of
those results just as you would see with
other autotuning output you're going
to get a nice view into what happened at
each iteration and don't worry if you're
not a programmer this will soon be
brought into Model Studio and speaking
of Model Studio and autotuning we have
new output results that you're going to
want to check out in this example we're
showing gradient boosting enabled for
autotuning set up with the genetic
algorithm for its search
method the two new plots are evaluation
history and iteration history evaluation
history shows a scatter plot across
iterations each point on the plot
represents the various evaluations
performed within an iteration iterations
are denoted by color and represent
unique groupings of hyperparameter
values along with its results and the
solid line represents the best
combination for the objective function
that the autotune process found the
second plot iteration history shows how
time and the objective function spans
with iterations in this case the higher
KS value shows a better model fit as
time increases we'll see that the
objective also
increases while we just have time to
show this in action for grad boost it
works across all model types so make
sure to give it a try
today hello everyone welcome to SAS
Studio release 2023.03 in this
release we introduce a few advanced data
engineering capabilities in SAS
Studio let's take a look first of all
for a table step in SAS Studio we add
support of the ability to create an
output as a view so now you can choose
whether your output should be located in
a physical table or it should be created
in a form of view next for SAS Studio
Analyst we introduce a new step
that is called Rank Data this step will
allow you to calculate and assign rank
values to your data
set next we significantly extend the
functionality of scheduling for SAS Studio
flows we introduce a couple of options
on SAS Studio flows menu first we allow to
deploy a SAS Studio flow as a job without
the need to specify a scheduling trigger
and secondly we now also allow to
redeploy jobs that have been deployed
previously from a specific SAS Studio flow
this option is now also supported for
SAS and Python programs in SAS
Studio next let's take a quick look at
the extensions for custom steps
framework for a couple of dynamic
controls that we have in custom steps
input table and column selector we now
allow to make these controls read only
and also allow to hide the values of
these controls at run
time also for column selector control we
add an additional attribute that allows
to exclude specific columns from the
previous
selection these capabilities will allow
custom step authors build more powerful
custom
steps that was Sasha Karpinsky Joe
Madden and Alex Vilan with this month's
rundown I know that the Microsoft Word
integration is something a lot of users
are excited about because it's such a
great way to share insights from SAS Viya
thanks to the three experts for coming
back and being regulars on our show now
it's time for our interview with Mary
Osborne about generative AI models I had
a chance to talk with Mary last week so
let's take a
look hey Mary thanks for uh returning to
the show for part two of this interview
happy to be
here yeah we're glad to have you um I
was actually out for last show so I
missed the interview entirely could you
give us a a recap on what you talked
about sure uh we recently released a
BERT-based classifier in SAS Viya
BERT is bidirectional encoder
representations from transformers and it
is a large language model so it's a nice
addition to the rules based approaches
that we currently have in SAS Visual
Text
Analytics that's awesome so we're
hearing so much about generative AI in
terms of ChatGPT could you tell me a
little bit more about what generative AI
is sure um in many cases we look at
generative AI as technology that as the
name would imply generates some sort of
content so that could be images like we
see in computer vision it could be text
like we do in my domain which is natural
language processing it could also be
tabular data or data that is more
structured that we see um much more on
the traditional structured machine
learning
standpoint very interesting so what are
some examples of how generative AI can
impact everyday
life there are actually quite a few um
one of the one of my favorites is
actually sort of the idea of teacher in
the Box um I have children and education
is something that is always top of mind
and there are some conversations
happening in the market around being
able to provide more personalized
instruction um through the use of
generative AI so having that teacher in
a box essentially for tutoring some
additional help for uh homework those
types of things and there's also the
idea of more personalized um not really
marketing um because people don't really
love to be marketed to um but there's
the idea of um instead of going out to a
website like Amazon for example and
looking at reviews which we know
sometimes people are paid to do reviews
so it can be kind of difficult to
determine whether or not that five-star
rating and that glowing review really is
accurate being able to curate
information about products so I could go
out and say you know I want to know what
the top
10 ski ski coats are uh for this season
and can you give me a list of those and
tell me what the pros and cons are so
instead of going to one specific vendor
being able to see more of a blanket
approach um more of a general approach
to marketing um and through through the
use of things like curated
lists lot of cool examples there I like
the the ski coat example you gave there
are some some business uses that you're
excited
about there um one of my favorite
examples I actually learned um this
weekend at a conference is being able to
use things like synthetic data
generation so at SAS we don't only do
generative AI through large language
models in the text side but we also do
play in the generative AI space on the
structured tabular side so if you are uh
a researcher and you're researching rare
diseases and maybe you only have a
population of a thousand people with
that rare disease it can be really hard
to model that kind of data so in in many
cases we need to expand the data
traditional approaches of expanding data
random random approaches um traditional
statistical methods don't typically
work as well as we would like um by
introducing the idea of neural networks
and uh generative adversarial models we
have the ability to generate tabular
data that is similar to the original
data so it gives us much a much better
foundation for additional modeling and I
think that's a really interesting use
case um there's also of course all of
the content related uh use cases on the
large language model side so anything we
can do to speed up uh curation of
information um take advantage of the
technology to knock out some of the more
mundane aspects of content creation I
think are
benefits yeah that's super interesting
uh generative AI in general sounds super
complex though could you talk about some
of the key technology that is being uh
played here sure I mentioned large
language models and that we're
supporting one right now in BERT um
large language models that do generative
AI are are predicated on the idea of a
transformer based architecture um that
was a
groundbreaking
uh development that's the right word
groundbreaking development uh in in
terms of modeling and natural language
processing and it gave us a really good
way to not only um do basic things like
classification summarization those types
of things that we often think about in
terms of natural language processing but
it also moved the needle further and
being able to truly generate novel
content uh so you'll hear things about
Transformer models not the Optimus Prime
more than meets the eye variety but in
terms of large language models you'll
also uh hear things like BERT GPT um all
of those are fall into the realm of the
large language model but on the
structured side we also support
generative adversarial models for
generating tabular synthetic data as
well as SMOTE
very cool so what does the future of all
of this look like in terms of natural
language processing at SAS and and
Beyond what does that look
like this is something that we've put a
lot of thought into and we're going to
continue to put a lot of thought into
there are a lot of concerns around
generative AI um there are a lot of pros
and cons and at SAS we've always focused
on developing models that people can
trust and methods that people can trust
so rather than jumping all in
uh to novel generation of text which
we've seen in many use cases uh in the
media about model hallucinations where
the models go off and say things that
aren't necessarily true but they say it
in a way that's believable we want to
make sure that we mitigate those types
of risks before we make technology
available from SAS in that domain so
we're taking a conservative approach um
doing a lot of research we have uh folks
in our R&D area prototyping a variety of
different things um that we hope to
bring to Market but we always want to
keep keep an eye on the feasibility
these models are really expensive to run
um they take a lot of compute power so
we want to make sure that whatever we
generate whatever we make available to
our customers really does um give an RO
a strong ROI because if you're going to
pay for them you want to make sure
you're getting something tangible in
return um we also want to make sure that
we're following an ethical path and
there are a lot of discussions around
ethics when it comes to generative AI
and AI in general uh where does the data
come from in order to train these large
language models you have to have a
tremendous amount of data and there are
a lot of concerns about the way some of
the um off-the-shelf right now uh large
language models are are built and
pre-trained using Wikipedia and other
internet-based sources who owns that
data and are there privacy concerns we
want to make sure that we're following
the most ethical path forward so as far
as generative AI at SAS uh we're always
going to be um keeping an eye on the
ethical side and we want to make sure
that whatever we produce is done um with
an eye on mitigating harm we don't we
definitely don't want to introduce harm
uh and doing things in a way that's
really going to bring True Value to our
customers and reduce risk because
they're all there are a lot of risks uh
with generative
AI for sure lots of points of concern
there about the ethics and cost to
compute but a lot to look forward to as
well um I've been seeing more and more
more uh ChatGPT versions come out
like one month to the next the growth on
that is pretty exponential could you
talk a little bit about the rate of that
growth and what the newest version
offers and the the the timeline for that
sure um ChatGPT is really fun I mean I
think everybody has everybody who's
involved in technology has played with
it in one form or another um I have
friends who have their grandparents Now
using ChatGPT to generate recipes for
Sunday dinners uh so we have our
grandmothers and our great-grandmothers
involved uh in technology which is
really really cool not all the recipes
are really good so you have to take that
for whatever whatever it's worth um but
these models are here to stay so they're
not going to go away they're going to
continue to improve and as OpenAI has
has released um additional versions so
ChatGPT the original was running at
GPT-3.5 and they have released GPT-4 and
that model is showing even more promise
so the the research in this area is
amazing there's there's a lot of work
being put into it by uh by so many uh
people organizations because there's a
lot of interest in it um so it's
technology that's here to stay it's not
going away uh it's going to continue to
improve and as long as we keep an eye on
it from an ethical
standpoint I say the sky's the
limit yeah it's a lot of exciting stuff
you'd have to be a really Brave one to
try a ChatGPT recipe for now some of
the recipes are a little questionable
yeah a little sketchy but I'm sure it'll
get to the point where it'll be just
chef's kiss type of stuff hopefully um
question for you uh for the casual
ChatGPT generative AI user what are some
some things that they should watch out
for just general you know pro tips best
practices that we should keep in mind
I think my favorite because like I said
we're all engaging with ChatGPT there
are people that are using it to generate
a tremendous amount of content be
careful um there is risk uh the the
models do generate really excellent
sounding results and sometimes they're
not really that trustworthy uh my
favorite example recently I asked
ChatGPT about text-to-speech synthesis which
is an area that I find interesting which
is essentially um having a machine speak
like a human so you have to think about
things like inflection and tone and the
rise and fall of the voice depending on
what you're talking about so I thought
it'd be interesting to see what it came
back with and it gave me a really nice
explanation I asked it to cite its sources
and it came back with four papers and
they all sounded believable and it gave
me links to the papers which I thought
though that's really cool so I click on
the links I get 404 not found errors so
the links are bad so I thought okay well
I'm just going to search for the papers
the papers don't actually exist so they
sound plausible but they're not real
papers they were not published works so
I asked ChatGPT where can I find this
type of information on the ACL Anthology
which is a really big repository of
papers around computational linguistics
and natural language processing so it
gave me the same paper names with ACL
Anthology links when I clicked on those
links it returned papers but they
weren't the papers that were cited
because those papers didn't exist uh
so I took it one step further last step
um I asked it for the authors of these
invented papers and it gave me authors
that publish in the space um but the
combinations of researchers didn't match
up uh and they certainly didn't write
those papers because those papers don't
exist so if I had taken that information
at face value it probably wouldn't have
panned out very well because none of it
was real um it sounded plausible though
and the titles seemed very believable
the authors that were generated are real
but not in the right combinations so my
advice is use it with caution so always
check you know always verify the work um
I uh I work with students and I tell
them that in general if you're going to
use generative AI to do your homework
your grade is probably going to reflect
the level of effort you put into it so
use it with
caution yes good point they're very good
at sounding convincing sounding real
sounding human when it's actually not at
all uh break the fourth wall a little
bit this Mary Osborne that we have on
right now is actually generated by
ChatGPT we just threw in a prompt and here she
is convincing
right Mary thank you so much for for
coming on the show again and uh we hope
to have you back for another interview
great it's been my pleasure thank
you just to clarify that was the real
Mary Osborne AI isn't at that level yet
abilities across the entire analytics
life cycle including preparing data
creating and viewing reports building
models automating model deployment and
visualizing event streams so try it out
today go to sas.com/viya for complete
information and to sign up for the trial
well we reached the end of this month's
show but we'll be back at it again next
month with more SAS Viya features and
updates if you're watching this on
YouTube why not give us a like and
subscribe to the SAS Users YouTube
channel click on that Bell so you'll get
notified for new videos and when we go
live for our next show in May until then
comment on what topics you'd like to see
in future shows I've been your host
Thiago Doza that was the word for
today's updates and I'll see you next
time thanks
[Music]
everyone