7. Frank Block, Head of IT Data Science at Roche, Switzerland

Iva Tahova
18 Apr 2022 · 25:15

Summary

TL;DR: Frank's talk covers his personal journey in the AI field, beginning with his training as a physicist and his time at CERN. He discusses the importance of data quality, which remains a challenge even after decades, and stresses the need to focus on the productivity and value of AI applications. He reflects on the evolution of AI from expert systems through neural networks to today's automated models, and underlines the importance of interdisciplinarity, soft skills, quality, and customer-centricity in AI development.

Takeaways

  • 🔬 Data quality is a perennial issue that hurts the productivity of AI teams and the scaling of AI projects.
  • 🧠 Industrial use of AI requires a deep understanding of the business logic and of the value created for stakeholders.
  • 💡 The technical side demands continuous learning and solid skills in data science, cloud architectures, and advanced analytics tools.
  • 🛠️ Technology is an enabler, but the central goal is to solve problems, not to work on technology for its own sake.
  • 🤝 Interdisciplinarity is crucial for successful AI projects because it exploits the complementarity of different perspectives and skills.
  • 💼 Soft skills, such as the ability to network and to tell stories in the stakeholders' language, are important for AI specialists.
  • 👨‍🔬 The role of the data scientist has evolved from pure research toward a production and operations orientation.
  • 🔍 The AI industry has shifted from replacing humans with AI toward a stronger emphasis on augmenting human capabilities.
  • 🏢 Organizational barriers and the reluctance of middle management to adopt AI can slow down the scaling of AI projects.
  • 📈 The data-as-a-product movement and the introduction of DevOps-like principles into AI development are important trends.

Q & A

  • How did the speaker's experience at CERN influence his career?

    -The speaker is a physicist by training and wrote his master's thesis at CERN, home of the largest particle accelerator in the world. This experience shaped him strongly and sparked his interest in working with and analyzing large amounts of data.

  • What is the Large Electron-Positron Collider, and how important was it for research?

    -The Large Electron-Positron Collider (LEP) was a device with a circumference of 27 kilometers, located about 100 meters below the surface. It was a gigantic instrument of great importance for research into the conditions of the Big Bang.

  • How many proton-proton collisions per second were measured in the Large Hadron Collider?

    -In the Large Hadron Collider, about two billion proton-proton collisions took place per second, generating enormous amounts of data.

  • How did the speaker's engagement with artificial intelligence (AI) evolve?

    -The speaker started out working on expert systems and then switched to neural networks after it was proven that non-linear neurons can model almost anything, which led to a renaissance of AI research.

  • What is the significance of data quality for AI applications?

    -Data quality is a constant factor that hampers the scaling of AI. Without high-quality data it is hard to make AI teams productive and to exploit the full potential of AI.

  • How important are soft skills in the field of artificial intelligence?

    -Soft skills are very important because they ease understanding and communication between different stakeholders and teams. They include building networks, telling success stories, and adapting the pace of innovation to the needs of others.

  • What are the challenges in moving from prototype to production in AI development?

    -A major challenge is shortening the time from building a prototype to running it in production. It is important to use platforms that allow a seamless transition in order to reduce time to production.

  • How can the gap between available data scientists and market demand be bridged?

    -By promoting self-service analytics and citizen data scientists, the gap can be closed step by step. This, however, requires high data quality to ensure that the results of the analyses are reliable.

  • How important is interdisciplinarity in AI projects?

    -Interdisciplinarity is very important because it enables collaboration between disciplines such as business, data science, and engineering to solve complex problems and build comprehensive AI solutions.

  • What current developments has the speaker observed in the AI industry?

    -The speaker observes a shift from AI as a replacement for humans toward AI that augments our capabilities and makes our work more interesting. He also stresses the need to improve data quality and governance in order to scale AI successfully.

Outlines

00:00

🔬 Introduction to the world of physics and AI

The speaker, a physicist, recounts his career and how his education and his time at CERN, the world's largest particle accelerator facility, shaped him. He describes his experience with the enormous amounts of data generated by the operation of the Large Hadron Collider and how he became convinced that AI systems are indispensable for analyzing such data. He also talks about his early work on expert systems and the transition to neural networks, which he was later able to apply in particle physics.

05:03

🔧 Automated model management and AI products

The speaker describes his early work on automated model management and the development of AI products. He stresses the importance of data quality and of continuously improving models in order to maximize the value of AI applications. He also discusses the challenges of scaling AI in companies, which is often limited by organizational barriers and by the need to prove the value of AI. He highlights how important a focus on quality and on the customer is for building successful AI products.
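Proving the value of an AI application, mentioned above, is in the talk associated with A/B testing: compare a group served by the model against a control. A minimal sketch using a two-proportion z-test on made-up conversion counts (all numbers here are hypothetical, not from the talk):

```python
from math import erf, sqrt

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic and one-sided p-value for 'variant B converts better than A'."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 1 - 0.5 * (1 + erf(z / sqrt(2)))        # P(Z > z), Z ~ N(0, 1)
    return z, p_value

# Hypothetical experiment: control without the model vs. variant with it.
z, p = two_proportion_z(1200, 10_000, 1350, 10_000)
print(round(z, 2), round(p, 4))
```

Whether such a randomized comparison is even possible depends on the setting; as the talk notes, it is often harder in a B2B context.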

10:05

🤝 Interdisciplinary collaboration and soft skills

The speaker emphasizes the importance of collaboration across disciplines such as business, data science, and engineering for delivering successful AI projects. He talks about the need to adjust the pace of innovation so that everyone involved can keep up. He mentions soft skills such as the ability to build a network, the ability to tell stories, and the importance of leadership in moving AI projects forward. He also discusses the challenges of finding and retaining talent and of building effective teams.

15:07

🌐 Current developments in the AI industry

The speaker reflects on current trends in the AI industry, such as using AI to augment people's capabilities and improve their working conditions. He talks about the importance of data quality and governance, which can still hinder the scaling of AI solutions. He also discusses the need to standardize and industrialize AI technologies to make them more efficient and scalable, and mentions the role of agile methods and DevOps in AI development.

20:08

📊 Data quality and AI implementation

The speaker answers questions about the challenges of implementing AI, particularly around data quality and governance. He discusses the need to strike a balance between cleaning data and delivering AI solutions quickly. He stresses that AI projects should not wait for perfect data but should work with the data available and improve continuously. He also talks about his experience moving from prototypes to production and the importance of an efficient platform for that process.

25:08

⏱️ Time to production and AI learnings

The speaker shares his learnings about moving from prototypes to production in AI development. He emphasizes the importance of shortening the time from building a prototype to running it in production and of using platforms to accelerate that process. He also discusses the challenges that come with scaling AI solutions and how to tackle them effectively.


Keywords

💡Data quality

Data quality refers to the accuracy, completeness, and reliability of data. In the context of the video this is a central topic, as the speaker stresses that he spends 90% of his time resolving data issues. This shows how important data quality is for developing and deploying AI applications and how strongly it affects the productivity of AI teams.

💡Industrialization

Industrialization refers to the process of standardizing and automating technologies and procedures in order to increase efficiency and scalability. In the video it is mentioned to describe the challenges of scaling AI solutions and how this affects the productivity and the value AI can create.

💡Expert systems

Expert systems are AI applications that provide decision support based on knowledge supplied by specialists in a particular domain. In the video the speaker mentions that he worked on expert systems, which points to his early involvement in AI and illustrates the evolution of AI from symbolic AI to neural networks.

💡Neural networks

Neural networks are a type of AI system built from weighted connections between simple units, loosely reminiscent of how the human brain works. The speaker mentions that he switched to neural networks after it was proven that they can solve more complex problems, which points to their importance for AI development and applications.
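The point referenced here, that a single linear perceptron cannot fit XOR while one hidden layer of non-linear neurons can, is easy to demonstrate. The following NumPy sketch is a minimal illustration of that idea, not code from the talk:

```python
import numpy as np

# XOR is not linearly separable, so a single-layer perceptron cannot fit it
# (Minsky's observation). One hidden layer with a non-linear activation is
# enough: a tiny 2-8-1 network trained with full-batch gradient descent.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for _ in range(20000):
    h = np.tanh(X @ W1 + b1)        # non-linear hidden layer
    p = sigmoid(h @ W2 + b2)        # predicted probability of class 1
    dp = p - y                      # cross-entropy gradient w.r.t. the logits
    dW2, db2 = h.T @ dp, dp.sum(0)
    dh = (dp @ W2.T) * (1 - h * h)  # backprop through tanh
    dW1, db1 = X.T @ dh, dh.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

preds = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print(preds.ravel().tolist())       # the four learned XOR outputs
```

Dropping the hidden layer (or replacing `np.tanh` with the identity) leaves a linear model that cannot reproduce all four XOR outputs, which is exactly what stalled perceptron research.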

💡Data analysis

Data analysis is the process of examining data to extract information and insights. The video highlights the speaker's enthusiasm for analyzing large amounts of data, which shows how important this skill is for working with AI and for investigating phenomena such as the Big Bang.

💡CERN

CERN is the European Organization for Nuclear Research, known for its particle accelerators and particle-physics experiments. In the video the speaker mentions that he was at CERN when the Large Electron-Positron Collider went into operation, which illustrates the enormous scale of the data and the challenges of data analysis in scientific research.

💡Data science

Data science is an interdisciplinary field concerned with analyzing and understanding data from different sources and in different forms. The video stresses the need for data-science skills in order to develop and implement AI solutions, and how these skills contribute to problem solving.

💡Artificial intelligence (AI)

Artificial intelligence refers to the technology and research aimed at simulating cognitive functions such as learning, problem solving, and decision making in computer programs and robots. AI is the central topic of the video, and the speaker shares his experience and insights on developing and deploying AI in different application areas.

💡Automation

Automation refers to the process of replacing manual or human activities with machines or AI systems. In the video the speaker mentions early work on automating model operation and the replacement of models by new ones, which points to the importance of automation for the efficiency and scaling of AI solutions.

💡Interdisciplinarity

Interdisciplinarity describes collaboration and exchange between different disciplines or fields. The video emphasizes how important it is that different perspectives and kinds of expertise, such as those of business people, data scientists, and engineers, come together to make AI projects succeed.

Highlights

Constant importance of data quality over time despite advancements in AI.

Observation of recent developments and trends in AI and data analysis.

Background as a physicist influenced the speaker's approach to data and AI.

Experience at CERN working with large-scale data from particle accelerators.

The evolution from expert systems to neural networks in AI applications.

Importance of understanding business models and value generation in AI projects.

The necessity for continuous learning and staying updated with AI advancements.

The shift from automating model production to proving the value of AI applications.

Emphasis on the product focus and customer-centric approach in AI development.

The challenge of data quality as a barrier for scaling AI and its impact on productivity.

The need for interdisciplinary teams in AI projects for a comprehensive approach.

The importance of soft skills in AI, such as the ability to network and communicate effectively.

The shift in AI focus from replacing human work to augmenting human capabilities.

The role of organizational barriers in the scaling of AI and the need for data-driven decision-making.

The potential of self-service analytics to address the shortage of data scientists.

Balancing data quality and governance with the need for timely insights in data-driven businesses.

Learnings from moving AI projects from prototyping to production and the importance of time to production.

Transcripts

play00:00

of constant over time and that may be

play00:02

interesting um to to mention as well as

play00:05

a few observations um more recent

play00:08

developments and and trends uh that i

play00:11

think i'm i'm seeing here on on the

play00:12

horizon

play00:14

um and

play00:15

so if if i continue here let me see oops

play00:18

now the controls are back so um

play00:21

where do i come from and um so

play00:24

one second i'm still having some issues

play00:26

here with my back and forth so now it's

play00:28

good um so where did i come from i'm i'm

play00:31

a physicist by training and that that

play00:33

influenced me a lot so let me tell you a

play00:36

little bit

play00:37

why i like data so much and um

play00:40

analysis of big amounts of data so when

play00:42

i when i started

play00:44

studying towards the part when you do

play00:46

your master thesis so the last two years

play00:48

of my studies i was lucky and could go

play00:50

to cern cern is in geneva it's the

play00:53

biggest accelerator you will find on

play00:55

earth

play00:56

and as a matter of fact it was just

play00:58

about the time when this huge machine

play01:00

was was being launched uh into operation

play01:03

back then under the name of a large

play01:06

electron positron collider and this is a

play01:09

huge 27 kilometer circumference device

play01:12

about 100 meters below surface

play01:15

so everything for me as a young student

play01:17

there was was gigantic right so starting

play01:19

with this huge machine then going on and

play01:22

here i'm showing now the upgrade that

play01:24

that it received like 10 years after it

play01:26

it started

play01:28

working as a lap it was called large

play01:31

hadron collider and there were huge

play01:33

experiments that are placed on the earth

play01:36

under the surface and these huge

play01:38

machines you see a little bit here

play01:39

perhaps

play01:40

the dimensions compared to to us humans

play01:43

so these are huge machines think about

play01:45

it as microscopes

play01:47

and and and they are there to collect

play01:49

tons of data

play01:50

and that data is then what will be

play01:52

analyzed later on and um so

play01:56

what is then of course also gigantic is

play01:58

the amount of data that's generated in

play02:00

such an environment

play02:02

um and and so you have something like

play02:04

two billion

play02:06

of these proton proton collisions head

play02:08

on collisions per second and that just

play02:10

generates crazy amounts of data right so

play02:13

um yeah that was huge

play02:15

of course then all the computing you

play02:17

need is huge

play02:18

um

play02:19

i don't know a few of you may remember

play02:21

that name of the craze super computer

play02:24

today you would probably put a iphone

play02:26

and it would do probably the same job

play02:29

but that was was a big computer back

play02:31

then so that was a certain approach um

play02:34

in an architecture that was invoked

play02:36

and cern was the first place where you

play02:39

would find that in europe and it was

play02:40

also the first place where it would be

play02:42

decommissioned because um they

play02:44

definitely went uh in into other

play02:46

architectures like farms of um you know

play02:49

of unix machines and then the servers

play02:52

and so on and so on and of course

play02:55

overwhelming the physics that's why i

play02:57

was there so understanding more about um

play03:00

the big bang right so what were the

play03:01

conditions

play03:03

we had a model the standard model and so

play03:05

on and so on so this is all the

play03:08

the story and i really just wanted to

play03:09

give you that kind of background to see

play03:11

you know what has motivated me to really

play03:13

work

play03:14

very early on in this ai space so i

play03:17

started off

play03:19

working on expert systems so that was

play03:21

kind of a paradigm

play03:23

kind of from symbolic ai that was

play03:26

on vogue back then

play03:28

after earlier on marvin minsky had shown

play03:30

that

play03:31

using simple what they call perceptron

play03:34

so not yet

play03:35

you know the neural networks as we know

play03:36

them but these were simpler

play03:39

architectures they couldn't even solve

play03:41

simple problems such as the xor problem

play03:44

so

play03:45

that got you know that kind of neural

play03:47

network research into a longer period of

play03:50

darkness or standby so i was happy

play03:52

working on on these um at cern you know

play03:55

doing diagnostics of certain falls of

play03:59

complex machines so that was an

play04:01

interesting approach to learn and then

play04:03

of course very quickly after

play04:07

i switched to neural networks which came

play04:09

in vogue again after someone else proved

play04:11

that well if you have non-linear neurons

play04:14

you can model almost anything you know

play04:16

you just have to have enough enough

play04:18

neurons in your in your

play04:20

deep learning model

play04:21

so

play04:22

there were many applications i could

play04:24

start then playing with you know be it

play04:26

for particle track reconstruction

play04:29

using different setups or identifying

play04:32

certain fundamental particles

play04:35

so all of that

play04:37

was you know back then you had to write

play04:39

your own code

play04:41

um we didn't have to open open source

play04:43

libraries which came a bit later so

play04:46

that takes a bit of time until you get

play04:49

something deployed as i moved out of

play04:52

academia when i finished my phd

play04:55

it was pretty much about

play04:57

already then automating

play04:59

um you know the whole operation of

play05:02

model production and operation

play05:05

and replacement of models by new models

play05:07

and massive amounts of models hundreds

play05:09

of different models um that would be

play05:11

live

play05:12

um and

play05:13

automated so this was something that we

play05:15

started very early on in in the 2000s

play05:18

and i think this is still you know today

play05:20

a topic under

play05:22

you know automl and and these are kind

play05:24

of the the tags that we have today but

play05:27

in fact that was a work uh

play05:29

on my side that started very early on

play05:32

and then it moved more and more into

play05:33

measuring also the value really proving

play05:35

the value of of ai of a application so

play05:40

you can prove it to a b testing

play05:42

depending on the applications where you

play05:44

come from that may be more difficult in

play05:46

a b2b context for instance and then

play05:49

today i think very important this this

play05:51

product focus everything is a product

play05:54

ai product and that brings automatically

play05:56

the customer

play05:58

um the user into the focus and quality

play06:02

with it so i think we we get more and

play06:04

more that um very good focus on

play06:08

quality

play06:09

so

play06:10

now

play06:11

summarizing looking back

play06:13

from from my experience at least what

play06:15

are the things that i think stayed

play06:17

constant over time and these may be

play06:20

interesting for you here and there to

play06:21

you know to have a look into

play06:23

um so one of the

play06:25

unfortunate

play06:27

evergreens and constants i found at

play06:29

least is data quality

play06:31

uh which is a pity because we 20 30

play06:34

years ago i would have told you yeah i

play06:36

waste 90 of my time

play06:38

resolving these data issues and today

play06:40

the answer is more or less the same

play06:42

so i think we still haven't gotten out

play06:45

of that

play06:47

remove that big barrier for

play06:49

scaling up

play06:50

ai

play06:52

this

play06:53

of course limits the productivity of our

play06:56

ai teams our data science teams

play06:58

analytics teams industrialization is

play07:01

slowed down or even impossible and uh of

play07:04

course the value finally that we can get

play07:06

out of

play07:07

ai is greatly limited by that

play07:10

so i'm happy to hear about also others

play07:12

uh here in the audience you know what

play07:14

what your experience is then on the

play07:17

skill side yes um if we want to create

play07:20

um

play07:21

advanced applications that contain some

play07:24

ai components we need to understand what

play07:26

what is that business of course that

play07:28

we're working for so how does that work

play07:30

how does it generate value

play07:32

um what are the

play07:34

pain points uh not always easy to

play07:36

identify

play07:38

how are the stakeholders and

play07:39

incentivized and

play07:41

what are the value metrics that we

play07:43

should be defining and measuring

play07:46

um and i think also not to be

play07:48

underestimated is of course the

play07:50

organizational change that that it

play07:52

implies it may often be a barrier

play07:54

for adopting ai is just the change that

play07:57

it would

play07:59

imply

play08:00

and usually

play08:01

in you know

play08:03

adopting ai-based innovation means

play08:06

change of the current way of working

play08:09

that's

play08:10

very much true in most cases

play08:12

on the technical side of skills

play08:15

of course we expect that um you know

play08:18

the the stakeholder or the the people

play08:21

who develop those solution solutions

play08:23

that they have very good and deep data

play08:25

science skills depending on the area

play08:27

you're working on

play08:28

um so we need to have some continuous

play08:31

learning we need to be up to date with

play08:33

what's going on on the ai front which is

play08:35

developing very quickly but i think also

play08:37

you need that kind of experimental

play08:39

scientific approach to to cut bigger

play08:42

problems into pieces formulate

play08:44

hypothesis and so on and so on

play08:46

um and visualization definitely another

play08:50

important ingredient

play08:51

more on the i call it i.t in quotes

play08:54

let's say skills um there's all the

play08:56

other technical um skills we need

play09:00

you need to have some some good

play09:01

knowledge of programming languages today

play09:04

more and more cloud architectures and so

play09:05

on uh be you know being capable of

play09:08

working with different um advanced

play09:11

analytics tools um but i think it should

play09:14

not um

play09:15

make us blind to

play09:17

you know in the sense of it's not

play09:19

all about orbitating around technology

play09:22

technology is the enabler but what we're

play09:24

really trying to do is solve problems

play09:26

and that is at the center and not the

play09:28

technology

play09:30

from my perspective

play09:32

one thing we may um

play09:34

or i've seen several times is of course

play09:36

that when we

play09:37

you know we built these models we almost

play09:40

fall in love with them and we want them

play09:41

to be perfect

play09:43

that most likely will never happen um

play09:46

so sometimes it's really better to

play09:49

have a reasonable model that works

play09:51

reasonably well let's get it out there

play09:54

it starts generating some value and

play09:56

let's keep optimizing it afterwards

play09:58

so that that would be a recommendation

play10:01

then the other

play10:02

topic um i've been um of course

play10:05

observing

play10:06

um is

play10:07

interdisciplinarity and here i'm just

play10:09

showing a few

play10:11

uh

play10:12

profiles um this

play10:14

may change according to the area you're

play10:16

working in of course you will have

play10:18

different personas participating in this

play10:21

but you know to give an example so we

play10:23

have the business side represented

play10:27

usually

play10:28

wants to minimize the risk while

play10:31

maximizing the return so that's

play10:34

the balance act that they're usually

play10:36

working on

play10:38

on the more like data scientists slash

play10:40

ai side well we do all these experiments

play10:44

we try models you know that very well

play10:46

things can go wrong anytime but that's

play10:49

part of the game

play10:50

and we keep generating knowledge

play10:53

which then

play10:54

is usually put into practice

play10:58

and production by more of the

play11:00

engineering

play11:02

people that that contribute uh to this

play11:05

whole

play11:06

project or or activity

play11:08

that you may be working on so um i think

play11:11

this this complementarity is very

play11:13

important so each have each of these

play11:16

have a different way of working and

play11:18

together that really makes a lot of

play11:20

sense

play11:21

some words about

play11:23

soft skills

play11:25

definitely one topic that i have

play11:27

observed

play11:29

is um you know if you are so

play11:33

um

play11:34

you just see that the innovation is

play11:35

working greatly there is many things you

play11:37

can do you can you can go very far ahead

play11:40

and by doing that you may

play11:43

forget

play11:44

perhaps that other people will not

play11:46

follow you at the same speed

play11:48

and it may be worth slowing down a bit

play11:51

the pace of innovation because otherwise

play11:53

you will find yourself alone running

play11:55

ahead and nobody behind you

play11:57

so um

play11:59

the other thing um i i put here ability

play12:02

to network so i think it's very

play12:03

important that

play12:05

um you know we as data scientists as ai

play12:09

specialist that we

play12:10

we know what's going on we we make

play12:12

ourselves known but that we also

play12:14

understand

play12:15

a network

play12:17

you know with all different areas of

play12:19

the companies of the organizations

play12:21

in a little bit a role of an internal

play12:23

consultant

play12:24

always looking for opportunities usually

play12:29

and

play12:30

trying to identify the right or the real

play12:32

pain points by asking the right

play12:34

questions and that is not always easy

play12:36

it's very often i find even close almost

play12:39

to an art

play12:41

because the real problems

play12:43

from my experience they're behind many

play12:44

layers of apparent problems

play12:47

until you get to them

play12:50

and then finally if you tell a story a

play12:52

success story about

play12:53

ai being employed here and there

play12:56

uh definitely you need to do that in in

play12:59

your stakeholders language and not in a

play13:01

very technical

play13:03

language which is usually not not very

play13:07

much

play13:08

appreciated

play13:10

then

play13:12

of course you will have to to find the

play13:14

people

play13:15

that have the skills to create your

play13:18

your ai applications solutions

play13:22

that will always be a challenge i guess

play13:25

there will never be enough data

play13:26

scientists

play13:27

and among those you still have to find

play13:29

the ones that are the the right talents

play13:31

for for what you're out to do

play13:34

and then of course the next thing is

play13:36

once you have hired a certain

play13:39

you know the talents that you were

play13:41

looking for how do you retain them

play13:42

because that's the next challenge um

play13:45

usually the market being very hot

play13:48

you know what is it that is really

play13:50

retaining them and

play13:52

from my perspective it's very much about

play13:55

the interesting challenges you can

play13:57

provide

play13:59

to be resolved so if these are really

play14:02

challenging high value adding i think

play14:04

then you have a good chance of retaining

play14:06

them

play14:07

and um then the next thing is how do you

play14:10

make that team efficient right so that

play14:12

we don't just do research in any kind of

play14:14

direction but really pointing towards

play14:16

generational value

play14:18

interdisciplinary

play14:20

teams i mentioned that before

play14:22

the last point here i would make is

play14:24

about the kind of leadership

play14:26

that i've seen over the years also

play14:29

myself as i learned

play14:31

i see much more benefits in a kind of a

play14:35

pool leadership which is more kind of a

play14:38

servant

play14:40

manager kind of

play14:43

approach than pushing

play14:46

into the teams

play14:47

what you think they should be doing i

play14:49

think rather better that they follow you

play14:51

because you have a certain proposition

play14:54

to make that sounds interesting

play14:57

Coming towards the end, a few notes on recent developments that I've been observing, and I'm sure you see them as well. One of the things over the years that I find interesting, and absolutely correct, is this development: in the beginning it was very much about AI and machine learning simply replacing people in what they're doing, and I think that focus is now shifting more and more, definitely in the area where I am, into augmenting our capabilities and also making our jobs more interesting with the help of AI that empowers us.

play15:43

Another topic that I really stumbled across over the last years: sometimes we ask why we can't scale up AI in a bigger way. Data quality is certainly one reason, but the others, I think, are purely organizational barriers. Part of that is also middle management, which is largely not yet ready to adopt AI on a large scale. So you may end up doing many great projects, you can show value, but in the end nothing will change; the new ways of working will not be adopted. That I see as a bit of a barrier. So we need to enable many more people to really start using data themselves and become more data-driven. I think the tools are around. The reservation I would have is still the data quality side: as long as we have those issues, I guess you get more trouble than benefit if you have everyone doing AutoML on any kind of suspicious data. So this must go hand in hand before we can really unlock the value of data.
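The point about AutoML on suspicious data can be illustrated as a simple quality gate that refuses to hand data to modelling until basic checks pass. This is only a minimal sketch of the idea, not anything from the talk; the function name, the thresholds, and the row-of-dicts data shape are all assumptions for illustration:

```python
def quality_gate(rows, target, max_missing=0.1, max_duplicates=0.01):
    """Return a list of data-quality problems; an empty list means 'safe to model'.

    rows: list of dicts (one dict per record), None meaning a missing value.
    """
    if not rows:
        return ["dataset is empty"]
    problems = []
    columns = rows[0].keys()
    if target not in columns:
        problems.append(f"target column '{target}' is missing")
    n = len(rows)
    # Missing-value ratio per column (None counts as missing).
    for col in columns:
        ratio = sum(1 for r in rows if r.get(col) is None) / n
        if ratio > max_missing:
            problems.append(f"column '{col}': {ratio:.0%} missing exceeds {max_missing:.0%}")
    # Exact duplicate rows.
    seen, dups = set(), 0
    for r in rows:
        key = tuple(sorted(r.items()))
        if key in seen:
            dups += 1
        seen.add(key)
    if dups / n > max_duplicates:
        problems.append(f"{dups} duplicate rows exceed the {max_duplicates:.0%} threshold")
    return problems

data = [{"x": 1, "y": 0}, {"x": 2, "y": 1}, {"x": None, "y": 0}, {"x": 4, "y": 1}]
print(quality_gate(data, "y"))  # one problem: column 'x' has 25% missing values
```

Only when the returned list is empty would an AutoML run be allowed to start; in practice this role is filled by dedicated data-validation tooling rather than hand-written checks.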

play16:55

And regulations, I guess, will continue to increase. We're already seeing a lot coming up out there, and that will certainly continue growing, so I think we need to be ready and embrace it. And industrialization, standardization and automation of AI, definitely as well.

play17:16

And to close: you've certainly seen some of the MLOps concepts, where we augment the well-known DevOps approach to also contain continuous exploration using ML and AI methods. We also have some architectures and frameworks that are being described and proposed; just one example here from Google, which I find quite useful, without going into any details here. In the ways of working, agile and scaled agile are, I think, also good movements, with a product-centric and team-empowering kind of focus. So all in all, I think, very desirable developments.
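The "continuous exploration" that extends DevOps into MLOps can be pictured as a loop in which a candidate model is retrained on fresh data but only promoted if it beats the deployed one on a held-out set. The following is a deliberately toy sketch, not the Google framework referenced in the talk; the function names and the mean-predictor "model" are invented for illustration:

```python
import statistics

def train(targets):
    # Toy "model": predict the mean of the training targets.
    return statistics.mean(targets)

def evaluate(model, holdout):
    # Mean absolute error of the constant prediction on held-out targets.
    return sum(abs(model - y) for y in holdout) / len(holdout)

def continuous_exploration(deployed, new_data, holdout):
    """Retrain on fresh data; promote the candidate only if it scores better."""
    candidate = train(new_data)
    if evaluate(candidate, holdout) < evaluate(deployed, holdout):
        return candidate  # promote: candidate wins on held-out data
    return deployed       # keep the current production model

deployed = train([1.0, 2.0, 3.0])  # mean = 2.0
holdout = [4.0, 5.0, 6.0]
new_model = continuous_exploration(deployed, [4.0, 5.0, 6.0], holdout)
print(new_model)  # 5.0: the candidate trained on fresher data is promoted
```

In a real MLOps pipeline the same promotion gate sits behind automated training, validation, and deployment stages rather than a single function call.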

play18:14

And finally, on the data front, I see some hope. There is the FAIR data movement, which you may have seen already; I think that is already addressing some data quality dimensions, and hopefully also putting data quality more into the center of attention. Another movement that I'm observing, which is creating that kind of attention on data, is the data-as-a-product approach that we see in the data mesh context since 2019, more or less, and I see some adoption of those principles. I've just brought here the kind of requirements, or product properties, for these data products, and many of them are of course related to data quality. So let's hope that this brings the desired effect and plenty of good data, so that we can generate plenty of good applications using AI in future. Thanks for your attention, this is all I had for you today. Thank you, happy to take some questions.

play19:21

Thanks a lot, thank you so much for sharing the presentation. Now we have time for Q&A. We have enabled microphone and camera for all participants, so anybody who has a question, please just switch on the camera.

We have some feedback from Niraj of Ipsen Pharmaceuticals: "Excellent presentation, Frank, thanks a lot, we agree."

play20:00

Hi Frank, thank you very much for these interesting insights. Based on your long-term experience, you mentioned one point: data scientists are in shortage, there are not enough on the market to fulfill all the demand of companies. How do you see the possibilities of self-service analytics to fill this gap a little, and, let's say, bring the knowledge workers closer to covering the demand left by missing data scientists?

play20:34

Yeah, thanks for the question, Markus, and this is spot on. I mean, we have those talks about the citizen data scientist, and I think the idea is great. The only dependency, or restriction, I would see is the data quality. You have great tools, tools like yours and others, that provide that kind of easy access to data preparation, data analysis and ML; I think the tools are there. Now we need to make sure that the data is also there, and good. Because otherwise you will have many people who start using the data, they will get contradictory results, you get a lot of discussion everywhere, and in the end they will even say it's the tool, which is of course not the case. So that is my only restriction: as soon as you have areas of data where you can say, this is safe, we can guarantee the quality of that, then I think you can open it up for business users to widely start working on that.
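The "safe areas of data" idea can be sketched as a tiny catalog in which only datasets whose quality has been signed off are visible to self-service users. This is a minimal illustration of the concept only; the class, its methods, and the dataset names are all hypothetical:

```python
class DataCatalog:
    """Tiny registry: only quality-certified datasets are exposed for self-service."""

    def __init__(self):
        self._datasets = {}  # name -> {"rows": ..., "certified": bool}

    def register(self, name, rows):
        # Newly registered data is not yet trusted for self-service use.
        self._datasets[name] = {"rows": rows, "certified": False}

    def certify(self, name):
        # Called once a data steward has verified the quality guarantees.
        self._datasets[name]["certified"] = True

    def self_service_view(self):
        # Business users only see datasets whose quality is guaranteed.
        return sorted(n for n, d in self._datasets.items() if d["certified"])

catalog = DataCatalog()
catalog.register("sales_2021", rows=[{"region": "EU", "revenue": 100}])
catalog.register("sensor_raw", rows=[{"t": 0, "value": None}])
catalog.certify("sales_2021")
print(catalog.self_service_view())  # ['sales_2021']
```

The design choice mirrors the answer above: the self-service tooling is not restricted, only the data it can reach, so business users cannot stumble into uncertified sources.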

play21:45

Thank you very much, Frank. Well, thank you. Thanks a lot. Now there was somebody else; Hannah, if you have a question you can switch on the microphone as well. Yes.

play21:55

Yes, thanks Frank, interesting talk. I want to ask a relevant question. What we are facing is a huge amount of data, and of course data quality and governance are important, but some people emphasize this too much, and that slows down, let's say, driving the insights. So we're in this dilemma: with this huge amount of data, what type of insights should we focus on, what type of insights come first? Or should we clean up all the data first? But cleaning up the data is a huge, tedious, very time-consuming work, and deriving the insights is of course more welcomed by the business. What is the balance there?

play22:48

Thanks for the question. I think that's exactly the complicated situation we're in. We as data scientists work in this environment, and we would never say, let's wait until all the data is fixed and then we start working; that's just not feasible. So very often our productivity goes down because we have to fight against these data quality problems. At the same time, as we do this work, we also gain data intelligence, which can then be used to nicely model the data subsequently into a data mesh, a data product, a data lake, whatever. So we also provide help to those people who make the data nice in the end, but very often we're at the forefront of this. So yes, we cannot wait, we need to move on, but wouldn't it be great if we could move on like three times faster?

play23:49

Thanks a lot, thank you. Jack?

Yeah, hi Frank, nice to see you again, thank you. You mentioned at the end moving into production. What are your top two learnings when you move from prototyping and MVPs to production? What do you need to take into account, and what are your learnings on that journey?

play24:14

Well, great question. What we are working on very much is to shorten that time. We're very good at prototyping, and we want to shorten the time from prototype to production, or to product. It can still happen that after three or four months we have a prototype, but then it takes another nine months or so to get the product out, and that's too long. So we are now also trying to work on certain platforms which would allow us a more seamless path from idea to prototype to production. There are many products out there; we definitely looked into some data science workbenches, and we are getting some benefits from there. This is something that we're really ramping up strongly now. So I think time to production is key; I would say that's the major learning.
