Java and AI? - Inside Java Newscast #72

Java
3 Jul 2024 · 09:53

Summary

TLDR: The video script challenges the notion that AI in Java is inferior, arguing that Java is well-positioned for AI's future with ongoing projects like Valhalla, Panama, and Babylon. It categorizes AI development into model development, model execution, and AI as a feature in products. Java's strengths in software development, coupled with emerging features, make it a formidable contender for AI integration and execution, despite not leading in AI-centric product development or model creation.

Takeaways

  • 😀 The speaker initially accepted the common opinion that AI in Java is inferior without much knowledge of the subject.
  • 🤔 The speaker, being a Java enthusiast, was annoyed by the negative view of AI in Java and anticipated improvements with upcoming Java projects like Valhalla, Panama, and Babylon.
  • 🔮 The video aims to challenge the myopic view that AI in Java is currently inadequate, suggesting that Java is well-positioned for the future of AI.
  • 📈 AI development is categorized into three areas: developing a machine learning model, executing a model, and integrating AI as a feature or product.
  • 🛠️ Java is already strong in many development aspects, making it a suitable candidate for projects that include AI-based features.
  • 📚 Java offers several libraries and runtime options for executing machine learning models, such as TornadoVM, ONNX Runtime, DJL, Tribuo, and LangChain4j.
  • 🚀 Upcoming Java projects like Valhalla, Panama, and Babylon are expected to enhance Java's capabilities in AI, especially in executing models with improved performance.
  • 💡 Valhalla's value types and limited operator overloading, Panama's Foreign Function and Memory (FFM) API and vector APIs, and Babylon's code reflection are highlighted as potential game-changers.
  • 🤖 The speaker argues that Java's ecosystem for executing ML models is already robust and will only get stronger with the integration of native code and pure Java implementations.
  • 🏆 While Java may not be the best for developing machine learning models currently, its strengths in other areas make it competitive for AI integration in existing projects.
  • 🔄 The future of AI development may lean towards integration into other applications rather than AI-centric products, positioning Java favorably for this trend.

Q & A

  • What is the common misconception about AI in Java that the speaker initially accepted?

    -The common misconception is that AI in Java is bad, a view that the speaker grudgingly accepted without initially knowing much about the subject.

  • What upcoming Java projects does the speaker believe will significantly improve AI capabilities in Java?

    -The speaker mentions Valhalla, Panama, and Babylon as upcoming Java projects that are expected to make substantial progress in enhancing AI capabilities in Java.

  • How does the speaker categorize AI development?

    -The speaker categorizes AI development into three areas: developing a machine learning model, executing a machine learning model, and integrating a machine learning model as a feature into larger, often pre-existing products.

  • Why might Java be a preferable choice for integrating AI as a feature into an existing project?

    -Java might be preferable because it avoids the complexity of creating a new service in a different language and incorporating it via a REST API or as foreign code. Java's strong ecosystem for other development aspects makes it a strong candidate for integrating AI features.

  • What are some of the Java libraries and runtimes mentioned for executing machine learning models?

    -Some of the Java libraries and runtimes mentioned are TornadoVM, ONNX Runtime, DJL, Tribuo, and LangChain4j.

  • What is Project Valhalla aiming to achieve in the context of AI and Java?

    -Project Valhalla aims to provide the capability to define types that 'code like a class, work like an int', which is relevant for AI as it could allow the use of primitives like half-floats and enable writing performant code without sacrificing good design and maintainability.

  • What is the significance of Panama's vector API for AI in Java?

    -Panama's vector API can dramatically speed up CPU-based computations, which is beneficial for executing machine learning models efficiently in Java.

  • What is the primary goal of Project Babylon in relation to AI and Java?

    -Project Babylon's goal is to allow Java code to parse other Java code and derive new code that can be executed by a GPU, which is directly aimed at improving AI capabilities in Java by enabling GPU-accelerated computation.

  • Why does the speaker believe that Java may not need to be the best for just running an ML model?

    -The speaker believes that Java may not need to be the best for just running an ML model because the larger portion of AI-related development work will be its integration into other projects, where Java is already competitive and will become stronger with the advancements in Java projects like Valhalla, Panama, and Babylon.

  • What challenges does Java face in the area of developing machine learning models?

    -Java faces challenges such as the need for a type system that can easily handle heterogeneous data, operator overloading, ease of use with mathematical functions, libraries designed for large data set analysis, and good visualization tools. Additionally, Java's explicit static typing and checked exceptions can be seen as a downside for those who value simplicity over robustness in the early stages of model development.

  • What is the speaker's view on the future of AI in Java, considering the ongoing and upcoming Java projects?

    -The speaker is optimistic about the future of AI in Java, stating that with the advancements in projects like Valhalla, Panama, and Babylon, Java will not only become more competitive but also strengthen its position in the coming years, especially for integrating AI as a feature into other applications.

Outlines

00:00

🤖 Java's Future in AI Development

The script challenges the notion that AI in Java is inferior, arguing that it's a limited view that doesn't consider upcoming Java advancements. The speaker, Nicolai Parlog, introduces the topic of AI in Java, noting that while Java may not be the best for developing AI currently, it is well-positioned for the future with projects like Valhalla, Panama, and Babylon. The video aims to explore Java's suitability for AI across three development categories: model development, product development, and feature integration.

05:03

🛠️ Java's Strengths and Future Enhancements for AI Execution

This paragraph delves into Java's capabilities for executing machine learning models, both within existing projects and new ones. It highlights Java's strengths in various development aspects, such as strong typing, performance, and security, which make it a strong candidate for projects incorporating AI features. The paragraph also discusses Java's current library and runtime support for AI, including TornadoVM, ONNX Runtime, DJL, and others, and how upcoming features like Valhalla's value types, Panama's Foreign Function and Memory API, and Babylon's code reflection will enhance Java's AI execution capabilities.

Keywords

💡AI in Java

AI in Java refers to the integration and implementation of artificial intelligence functionalities within Java-based applications. The video discusses the common misconception that Java is not well-suited for AI development. It is central to the video's theme, which is to challenge this view and highlight Java's strengths and upcoming improvements for AI capabilities.

💡Machine Learning

Machine Learning is a subset of AI that involves the development of algorithms that can learn from and make predictions or decisions based on data. The video script interchangeably uses 'AI' and 'Machine Learning' due to the current wave of AI being predominantly based on machine learning. It is a key part of the discussion on AI development in Java.

💡Valhalla

Project Valhalla is an initiative within the OpenJDK community aimed at enhancing Java's capabilities to handle new data types and improve performance. The video mentions Valhalla as one of the projects that will potentially make AI in Java more efficient and effective by allowing the definition of types that work like primitives but with the benefits of object-oriented programming.
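
To make "code like a class, work like an int" a bit more concrete, here is a minimal sketch of a custom scalar written as an ordinary Java record. `Complex` and `ComplexDemo` are hypothetical names chosen for illustration and do not appear in the video. Under Valhalla, a type like this could presumably be declared as a value class so the JVM can treat it more like a primitive; the exact syntax is still evolving, so the sketch sticks to what compiles today. It also shows why operator overloading comes up: arithmetic currently has to be spelled as method calls.

```java
class ComplexDemo {
    public static void main(String[] args) {
        Complex a = new Complex(1, 2);
        Complex b = new Complex(3, 4);
        // Today this is a chain of method calls; with limited operator
        // overloading it might one day read more like a * b + a.
        System.out.println(a.times(b).plus(a));
    }
}

// A plain record standing in for a possible future value class.
// Hypothetical example type, not taken from the video.
record Complex(double re, double im) {

    // Component-wise addition.
    Complex plus(Complex other) {
        return new Complex(re + other.re, im + other.im);
    }

    // Complex multiplication: (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
    Complex times(Complex other) {
        return new Complex(
                re * other.re - im * other.im,
                re * other.im + im * other.re);
    }
}
```

The record already gives the class-like side (encapsulation, methods, value-based `equals`); the int-like side (no identity, flat memory layout) is the part Valhalla is expected to add.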

💡Panama

Project Panama is focused on improving foreign function and memory access in Java, facilitating better integration with native libraries and potentially enhancing performance for AI tasks. The video discusses Panama as a project that will contribute to Java's AI capabilities by providing a vector API for accelerated CPU computations.

💡Babylon

Project Babylon is an OpenJDK project aimed at allowing Java code to parse and generate new code, including foreign code that can be executed by GPUs. The video mentions Babylon as a direct contributor to AI in Java by enabling pure Java implementations that can perform at similar levels to native code, which is crucial for executing machine learning models.

💡Model Execution

Model Execution refers to the process of using a trained machine learning model to make predictions or decisions based on input data. The video discusses this as a category of AI development where Java can be particularly strong, especially with the support of various libraries and upcoming features like those from Project Panama and Babylon.

💡Product-Centered AI

Product-Centered AI refers to the development of products where AI is a central feature, such as ChatGPT or AI pins. The video script uses this term to discuss a category of AI development where Java may not be the best choice for executing ML models alone, but could be very competitive when considering the full scope of product development requirements.

💡Feature Integration

Feature Integration in the context of the video refers to incorporating AI capabilities as a feature into larger, pre-existing products, such as auto-tagging in Google Photos. The script argues that Java is well-suited for this category due to its strengths in various development aspects and the potential improvements from ongoing Java projects.

💡ML Libraries

ML Libraries are software libraries that provide tools and functionalities for machine learning tasks. The video mentions several Java libraries and runtimes, such as TornadoVM, ONNX Runtime, DJL, Tribuo, and LangChain4j, which can be used to execute machine learning models and are part of the ecosystem that makes Java a strong contender for AI development.

💡Foreign-function-and-memory API

The foreign-function-and-memory (FFM) API is a feature finalized in Java 22 (JEP 454) that allows Java programs to call foreign functions and access foreign memory regions without writing JNI code. The video script highlights this API as a key improvement for Java's AI capabilities, especially for integrating with native libraries that can accelerate AI computations.
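
As a rough illustration of what calling native code through this API looks like, the sketch below invokes the C standard library's `strlen` from Java. It assumes JDK 22 or later and a 64-bit platform where `size_t` fits a Java `long`; the class name `NativeStrlen` is made up for this example. An ML library would bind to far larger native APIs, but the mechanics should be broadly the same.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

class NativeStrlen {
    public static void main(String[] args) {
        System.out.println(strlen("Hello"));
    }

    // Calls the C function size_t strlen(const char*) through the FFM API.
    static long strlen(String text) {
        Linker linker = Linker.nativeLinker();
        // Look up the native symbol and describe its signature to the linker.
        MethodHandle handle = linker.downcallHandle(
                linker.defaultLookup().find("strlen").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
        // The confined arena frees the off-heap copy of the string on exit.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text);
            return (long) handle.invokeExact(cString);
        } catch (Throwable t) {
            // invokeExact declares Throwable; wrap it so callers stay simple.
            throw new IllegalStateException("native call failed", t);
        }
    }
}
```

Depending on the JDK version, running this may print a warning about restricted native access unless `--enable-native-access` is passed.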

💡Data-Centric Applications

Data-Centric Applications are applications that are designed around the manipulation, analysis, and utilization of data. The video discusses how Java's recent improvements in support for data-centric applications, such as better performance characteristics, make it a strong candidate for AI projects where data handling is a significant part of the execution process.
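
The point that running a model involves more than calling `predictor.predict` can be sketched as follows. Everything here is an assumption made for illustration: `Model` is a hypothetical stand-in interface (not the API of DJL, ONNX Runtime, or any other library), the labels and scores are invented, and the normalization and softmax steps are just one common way to prepare inputs and interpret outputs.

```java
import java.util.List;

class ClassifierDemo {
    public static void main(String[] args) {
        // Stand-in model returning fixed raw scores; a real one would come from an ML library.
        Model fake = input -> new double[] {1.0, 3.0, 2.0};
        Classifier classifier = new Classifier(fake, List.of("cat", "dog", "bird"));
        System.out.println(classifier.classify(new int[] {0, 128, 255}));
    }
}

// Hypothetical interface for illustration only.
interface Model {
    double[] predict(double[] input);
}

// Data-centric result type: an immutable record the rest of the app can consume.
record Prediction(String label, double confidence) { }

record Classifier(Model model, List<String> labels) {

    Prediction classify(int[] pixels) {
        double[] scores = softmax(model.predict(normalize(pixels)));
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) best = i;
        }
        return new Prediction(labels.get(best), scores[best]);
    }

    // Input preparation: scale 0..255 pixel values into the range 0..1.
    static double[] normalize(int[] pixels) {
        double[] out = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) out[i] = pixels[i] / 255.0;
        return out;
    }

    // Output interpretation: turn raw scores into probabilities that sum to 1.
    static double[] softmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logits) max = Math.max(max, v);
        double sum = 0;
        double[] out = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            out[i] = Math.exp(logits[i] - max); // subtract max for numerical stability
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) out[i] /= sum;
        return out;
    }
}
```

Even in this toy version, the single `predict` call is outnumbered by the code around it, which is where Java's records and general performance characteristics come into play.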

Highlights

AI in Java is often seen as inferior, but this view is myopic and only applies to the current state of AI development.

Java is well-positioned for the future of AI with ongoing projects like Valhalla, Panama, and Babylon.

AI development can be categorized into developing, executing, and integrating machine learning models.

Java offers strong options for executing machine learning models with libraries like TornadoVM, ONNX Runtime, DJL, and more.

Project Valhalla aims to enhance Java with value types and potentially limited operator overloading for better performance and design.

Panama's vector API and foreign-function-and-memory API are set to improve Java's computational capabilities.

Project Babylon seeks to enable Java code to parse and derive new code for GPU execution, enhancing AI capabilities.

Java's strengths in performance, memory safety, and development speed make it suitable for AI features in projects.

Java's ecosystem for executing ML models is already strong and will be further improved by upcoming projects.

Java's support for data-centric applications is improving, aiding in AI model input preparation and output interpretation.

Java may not be the best for developing AI-centric products, but it is competitive and improving.

The importance of AI as a standalone product may be diminishing as AI becomes more integrated into other applications.

Java's explicit static typing and checked exceptions can be seen as a downside in the early stages of AI development.

Python remains the preferred platform for data scientists and AI model development due to its simplicity and robust libraries.

Java's integration into other projects and its ongoing improvements position it as a strong contender in the AI field.

The general opinion that 'AI in Java is bad' overlooks Java's competitive advantages and upcoming enhancements.

Subscribe to Inside Java Newscast for updates on Java features and their impact on AI development.

Transcripts

play00:00

"AI in Java is bad" is a commonly held  opinion out there that I, without knowing  

play00:04

much about this space, grudgingly accepted. But, being a Java fanboy I was annoyed by that  

play00:10

and I was waiting for Valhalla, Panama,  and Babylon to make sufficient progress,  

play00:14

so I could make a video about how AI in Java may  suck now, but will be _so good_ in the future. 

play00:20

But that's not this video! When I recently started looking into  

play00:23

the topic, I realized that "AI in Java is bad" is  a pretty myopic view that, if it's correct at all,  

play00:29

really only applies to this very moment  in AI development and that Java is already  

play00:33

well-positioned for the future of AI. And _on top_ of that come Valhalla,  

play00:37

Panama, and Babylon. Let me explain.

play00:42

Welcome everyone to the Inside  Java Newscast, where we cover  

play00:45

recent developments in the OpenJDK community. I'm Nicolai Parlog, Java Developer Advocate at  

play00:50

Oracle, and today we're gonna look at Java and AI. A quick note before we start: 

play00:54

I know that artificial intelligence is more than  just machine learning, but since the current AI  

play00:58

wave is basically exclusively ML-based, I'll  use the terms interchangeably in this video. 

play01:04

Ready? Then let's dive right in!

play01:06

I want to split AI development  into three categories.

play01:09

The first one is _developing_  a machine learning model. 

play01:12

Collecting data, preparing it for learning,  developing and training the model,  

play01:15

evaluating and iterating on it - all  the "original" machine learning tasks. 

play01:20

The output is a trained model that can  classify inputs, generate images or texts,  

play01:24

deny people life choices for inscrutable  reasons, start nuclear war, etc.

play01:28

Then there's _executing_ a machine  learning model based on some inputs. 

play01:32

Note that trained models can be exported  and imported by different languages,  

play01:37

so this can be done on an entirely different  platform than was used to train the model. 

play01:41

And thanks to a distinction MKBHD made me aware  of, I want to split execution into two categories.

play01:46

One (and the second overall category) is  development of a _product_ centered around such  

play01:51

an ML model, like ChatGPT or the Humane AI pin. This is mostly regular greenfield software  

play01:57

development, except that requirements for running  these models, like availability of ML libraries  

play02:02

or ease of pushing computations onto the  GPUs, dominate the overall requirements.

play02:07

The other (and third) category is integration  of a machine learning model as a _feature_  

play02:12

into larger, often pre-existing products. Think of auto-tagging and searching in Google  

play02:16

Photos, auto-subtitling in PowerPoint, and pretty  much everything Apple has just presented at WWDC. 

play02:22

Here, AI is just one of many requirements, one  of many forces acting on the project and in the  

play02:28

case of brownfield development (what a word),  these forces and the path of least resistance  

play02:33

are mostly known and running the model  must fit in with the existing architecture.

play02:37

Let's look at each of these three  categories separately and see how  

play02:40

suitable Java is for them and  what features may improve it. 

play02:44

We'll start with the last one and  work our way backwards from there.

play02:47

The easiest case for AI in Java  is when model execution needs to  

play02:51

be added to an existing Java project. Of course, in many situations you could  

play02:55

create a new service in an arbitrary language,  and incorporate that in your project via a REST  

play02:59

API or as foreign code, but, realistically,  you'd probably try to avoid that due to the  

play03:04

developmental and operational complexity. In this scenario, Java doesn't need to be  

play03:08

the best ecosystem for executing the model, it  just needs to be better than using a different  

play03:12

one minus the additional effort of having  it as a separate service and platform.

play03:17

And similar logic applies when creating a new  project where AI is just one of many features. 

play03:22

Java may not be the best ecosystem just  for model execution but it is really  

play03:26

strong and often top of its class in many other  important development aspects: strong typing,  

play03:31

good abstractions and core library, memory  safety, performance, observability, security,  

play03:36

cloud support, web server and framework  choice, 3rd party library choice in general,  

play03:41

development speed, developer base,  stability, and the list goes on and on.

play03:45

All that puts Java high up on the list for  projects that include AI-based features  

play03:50

assuming its support for model  execution is sufficiently good. 

play03:53

So how good is it? 

play03:55

On the library and runtime front, Java  offers a number of strong options: 

play03:59

TornadoVM, ONNX Runtime, DJL, Tribuo,  LangChain4j, just to name a few. 

play04:05

Many already support multi-CPU, GPU, and  even FPGA-accelerated computation and,  

play04:09

where applicable, we can expect their  integration with native libraries to  

play04:13

improve due to the recent finalization  of the foreign-function-and-memory API.

play04:17

And some OpenJDK projects are  working on features that will  

play04:21

further improve Java's capabilities in  this space, potentially dramatically,  

play04:25

and particularly when it comes  to executing models in pure Java:

play04:28

Project Valhalla aims to give us the capability  to define types that "code like a class,  

play04:33

work like an int", which is relevant here because  models like to use primitives like half-floats  

play04:38

that Java currently doesn't support. Beyond that, Valhalla will allow us to  

play04:42

write performant code that doesn't have to  sacrifice good design and maintainability,  

play04:46

which is essential for every software  project that will run in production for  

play04:49

anything longer than a few months. And another idea Valhalla might,  

play04:54

might (!), MIGHT (!) explore is  limited operator overloading,  

play04:58

which may allow us to define, for example,  multiplication for custom scalar... scalars? 

play05:03

Scalars? Sc... tensors and sca, scalars. That's what you get for  

play05:07

studying math in German. Skalare. Anyway... for custom scalars and tensors. 

play05:13

Then there's Panama's vector API, which can  speed up CPU-based computations dramatically. 

play05:18

And finally, and most directly aimed  at AI, there's Project Babylon. 

play05:21

Its goal is to allow Java code to parse other  Java code and derive new code that could either  

play05:28

be a different Java program or any kind of  foreign code, in this context specifically,  

play05:32

code that can be executed by a GPU. I strongly recommend Inside Java Newscast  

play05:37

#58 for a primer on Project Babylon. As part of his work on the project,  

play05:41

its lead Paul Sandoz explored how to  implement Triton (that's a domain-specific  

play05:45

Python platform for GPU computation) in  pure Java and got really good results.

play05:50

So the Java ecosystem for executing ML models is  already pretty strong and Valhalla's value types,  

play05:56

Panama's FFM and vector APIs, and Babylon's  code reflection will only strengthen it further,  

play06:01

whether by better integrating with native code or  by enabling pure Java implementations with similar  

play06:06

performance, giving projects the benefit of using  just one stack for the entire system or service.

play06:11

Of course even if we ignore the rest of the  application and focus on just running a model,  

play06:16

the code that does that consists of  more than calling `predictor.predict`. 

play06:20

Input data needs to be prepared before it  can be thrown at the model and, likewise,  

play06:24

its output needs to be interpreted and transformed  into something the user can understand. 

play06:28

This is likely to be a considerable portion of  the overall code for model execution and Java's  

play06:32

strengths apply here as well, particularly  its good performance characteristics and  

play06:36

its recently improved support for  designing data-centric applications. 

play06:39

So, yeah, I'm not worried about Java when  it comes to projects using AI as a feature.

play06:45

Most of what we just discussed also applies  to developing an AI-centered product in Java. 

play06:50

But of course, the larger the AI portion, the more  strengths and weaknesses in that area dominate  

play06:55

the overall evaluation of which platform to use. At this moment, is Java the best for just running  

play07:00

an ML model? No. 

play07:02

Is it the best for developing  an AI-centric product once we  

play07:05

factor in the surrounding requirements  we talked about in the previous section? 

play07:09

Maybe, it's definitely up there. Will it be the best once all the projects  

play07:12

I mentioned earlier bear fruit? I think it can be, yes.

play07:17

But here's a more interesting question: Does it matter? 

play07:21

I really liked MKBHD's opinion on this  and, by the way, the link to his video  

play07:24

as well as to everything else I mention  here is of course in the description. 

play07:27

He makes a good argument for "AI  as a product" being mostly a fad. 

play07:32

For AI becoming mostly "just" a feature  in all kinds of other applications. 

play07:37

So as it looks now, I don't think this  category is particularly important.

play07:41

Which leaves us with the last category:  

play07:43

developing machine learning models  in Java and this is a tough one. 

play07:46

It needs everything we  described so far and then some.

play07:50

AI development is often done by people who don't  see themselves as being primarily a software  

play07:55

developer and so they value different things  about a platform than other developers might: 

play07:59

Ease of learning the language, example  code bases are always very important,  

play08:03

how quick you get to the first usable  results, simplicity over choice,  

play08:06

and also (occasionally or maybe even  often) simplicity over robustness.

play08:11

If certain language features only become  beneficial when you maintain a project of  

play08:14

sufficient size for a sufficient time  but appear to be in the way early on,  

play08:19

enforcing their use can  quickly be seen as a downside. 

play08:22

Looking at you, explicit static  typing and checked exceptions. 

play08:25

Thanks to Project Amber's on-ramp efforts,  Java made and will keep making significant  

play08:29

progress in this area, but it will  never be a scripting language.

play08:33

More importantly, though, elegant  model development requires a number  

play08:36

of specific language features and libraries: a type system that can easily handle heterogeneous  

play08:40

data, some degree of operator overloading  is super helpful, ease of use when working  

play08:45

with mathematical functions (for example for  differentiation), libraries that were designed  

play08:49

to classify and analyze large data sets, and  really good and easy-to-use visualization tools.

play08:54

And Python is and will probably remain king here. The language is well-suited to these kinds of  

play08:59

applications and thanks to that has been the  platform of choice for data scientists for  

play09:03

about two decades now, which gives it a big leg  up on libraries and frameworks in that space.

play09:09

So Java isn't competitive when it comes  to developing machine learning models  

play09:12

and isn't top-of-the-class in creating  AI-centered products and this led to  

play09:17

the general opinion that "AI in Java is bad". But this is due to our current place in the AI  

play09:22

timeline and overlooks the already dawning reality  that a big chunk of AI-related development work  

play09:28

will be its integration into other projects and  there Java is already very competitive and will  

play09:33

only become stronger in the coming years, thanks  to projects like Valhalla, Panama, and Babylon.

play09:38

If you want to follow that development along,  make sure to subscribe if you haven't yet, as we  

play09:42

will cover every new Java feature as it hatches. And if you enjoyed this video, you can do me a  

play09:46

favor and leave a like, which also helps  put it in front of more developers. 

play09:50

I'll see you again in two weeks. So long...

Related Tags
Java, AI, Machine Learning, OpenJDK, Valhalla, Panama, Babylon, Software Development, AI Integration, Tech Trends, Future Predictions