TinyML: Getting Started with STM32 X-CUBE-AI | Digi-Key Electronics
Summary
TL;DR: This video introduces the X-CUBE-AI tool from STMicroelectronics, designed to help developers deploy pre-trained machine learning models on STM32 microcontrollers. The tutorial walks through converting a TensorFlow Lite model into a format compatible with X-CUBE-AI and demonstrates how to run inference on an STM32L432KC Nucleo board. The video compares X-CUBE-AI's performance to TensorFlow Lite, showing improvements in flash usage and inference speed. While X-CUBE-AI offers performance benefits, its proprietary nature limits it to STM32 processors, unlike TensorFlow Lite's open-source flexibility.
Takeaways
- 😀 STMicroelectronics launched the STM32Cube.AI toolset in 2019 to help users deploy machine learning models on STM32 microcontrollers.
- 😀 The X-CUBE-AI tool allows importing pre-trained neural network models (e.g., from Keras, TensorFlow, or ONNX) and deploying them on STM32 microcontrollers.
- 😀 Inference is the process of running unseen data through a trained model to make predictions (e.g., recognizing a cat in a photo).
- 😀 The X-CUBE-AI library is proprietary and specific to STM32 microcontrollers, differentiating it from open-source alternatives like TensorFlow Lite.
- 😀 The tutorial demonstrates how to deploy a TensorFlow Lite model (trained to approximate the sine function) on an STM32 microcontroller using X-CUBE-AI.
- 😀 Model quantization (reducing weights and biases to 8-bit precision) can save space but is not covered in this video; a floating-point model is used instead.
- 😀 STM32CubeIDE integrates the X-CUBE-AI tool and provides features such as performance validation and analysis of the model's complexity.
- 😀 The X-CUBE-AI tool also lets users filter the microcontroller selector by AI support to ensure compatibility with the library.
- 😀 When setting up a project in STM32CubeIDE, users can configure model settings, load pre-trained models, and perform inference testing on the microcontroller.
- 😀 Using X-CUBE-AI, the demo model required only about 28,000 bytes of flash and 4,900 bytes of RAM, running inference in about 77 microseconds.
- 😀 TensorFlow Lite needed about 50,000 bytes of flash and 104 microseconds per inference, so X-CUBE-AI cut flash usage by over 40% and ran about 26% faster.
- 😀 Despite its performance advantages, X-CUBE-AI is closed-source and tied to STM32 processors, making it less flexible than open-source alternatives like TensorFlow Lite.
Q & A
What is the primary function of the X-CUBE-AI tool?
-The X-CUBE-AI tool allows users to deploy machine learning models, trained in frameworks like Keras, TensorFlow, or ONNX, onto STM32 microcontrollers. It helps convert trained models into a format that can run on resource-constrained devices and enables inference execution on microcontrollers.
How does inference work in the context of the X-CUBE-AI tool?
-Inference is the process of feeding new, unseen data into a trained machine learning model to make predictions. In the video example, the model trained to predict the sine function takes an input value (e.g., 2), processes it, and outputs a prediction close to the actual sine of 2.
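As a rough desktop-side illustration (not code from the video), the same predict step can be reproduced with the TensorFlow Lite interpreter in Python. The file name `sine_model.tflite` is a hypothetical stand-in for the converted model; on the STM32, the equivalent set-input/invoke/read-output sequence is handled by the C code that X-CUBE-AI generates.

```python
# Minimal sketch: run one inference on the sine model with the TFLite
# interpreter. "sine_model.tflite" is an assumed file name, not from the video.
import math

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="sine_model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed the input value 2.0 through the trained model.
x = np.array([[2.0]], dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], x)
interpreter.invoke()
y_pred = interpreter.get_tensor(output_details[0]["index"])[0][0]

print(f"predicted: {y_pred:.4f}, actual sin(2): {math.sin(2.0):.4f}")
```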
What is the role of the TensorFlow Lite model in this process?
-The TensorFlow Lite model is used as the base model in the demo. It was initially trained using Keras and TensorFlow to predict sine function outputs, then converted into the TensorFlow Lite format, which is compatible with microcontrollers, and imported into the X-CUBE-AI tool for deployment.
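For context, a minimal sketch of that training-and-conversion step might look like the following. The layer sizes, training range, and epoch count are illustrative assumptions; the video does not show this code.

```python
# Hedged sketch: train a tiny sine-approximation model in Keras and
# export it as a .tflite flatbuffer that X-CUBE-AI can import.
import numpy as np
import tensorflow as tf

# Generate training data over one period of the sine function.
x_train = np.random.uniform(0, 2 * np.pi, (1000, 1)).astype(np.float32)
y_train = np.sin(x_train)

# Small dense network; sizes here are assumptions, not from the video.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(x_train, y_train, epochs=100, verbose=0)

# Convert to TensorFlow Lite and write the file for import into X-CUBE-AI.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open("sine_model.tflite", "wb") as f:
    f.write(converter.convert())
```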
What challenges did the speaker face when using X-CUBE-AI?
-The speaker encountered issues when loading Keras model files directly into X-CUBE-AI but found that TensorFlow Lite files worked well. This points to some compatibility issues between different model formats and the X-CUBE-AI tool.
What kind of STM32 microcontroller does the speaker use for the demo?
-The speaker uses an STM32L432KC Nucleo board, which features an ARM Cortex-M4 processor, for the demonstration. This board is compact and provides enough processing power to run the example machine learning model.
How does the X-CUBE-AI tool help manage the limitations of microcontrollers?
-X-CUBE-AI addresses microcontroller resource limitations by optimizing models for size and processing requirements. The tool ensures that models can fit into the available memory (RAM and flash) and can run efficiently on low-power microcontrollers.
What optimizations are mentioned for improving model performance on microcontrollers?
-The speaker mentions quantizing the model to reduce its size by converting the weights and biases to 8-bit precision, although this technique is not explored in the demo. Additionally, the X-CUBE-AI tool offers options for using external memory to handle larger models.
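As a hedged sketch of what that post-training quantization step could look like (not performed in the video; the model and representative dataset here are illustrative stand-ins):

```python
# Sketch: post-training 8-bit quantization of the sine model with TFLite.
import numpy as np
import tensorflow as tf

# Stand-in for the trained sine model; rebuilt here so the snippet stands
# alone (in practice you would load the trained model instead).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

# A representative dataset lets the converter calibrate activation ranges
# so weights and activations can be stored with 8-bit precision.
def representative_dataset():
    for _ in range(100):
        yield [np.random.uniform(0, 2 * np.pi, (1, 1)).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
with open("sine_model_quant.tflite", "wb") as f:
    f.write(converter.convert())
```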
What is the significance of the flash and RAM usage during inference?
-Flash and RAM usage are important metrics for evaluating the efficiency of a machine learning model on a microcontroller. In the demo, X-CUBE-AI uses about 44% less flash memory than TensorFlow Lite (roughly 28,000 vs. 50,000 bytes), while the RAM usage is comparable between both tools.
How do the inference times compare between TensorFlow Lite and X-Cube AI?
-The inference time for X-CUBE-AI is about 77 microseconds, roughly 26% faster than the 104 microseconds required by TensorFlow Lite. This demonstrates the performance improvement achieved by using the X-CUBE-AI library on STM32 microcontrollers.
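The percentages reported in the two answers above follow directly from the measured numbers; a quick back-of-envelope check:

```python
# Verify the reported savings from the video's measurements.
flash_xcube, flash_tflite = 28_000, 50_000   # flash usage in bytes
t_xcube, t_tflite = 77, 104                  # inference time in microseconds

flash_savings = 1 - flash_xcube / flash_tflite
speedup = 1 - t_xcube / t_tflite
print(f"flash savings: {flash_savings:.0%}")  # -> 44%, i.e. "over 40%"
print(f"faster by:     {speedup:.0%}")        # -> 26%
```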
Why might developers choose X-CUBE-AI over TensorFlow Lite for STM32 microcontrollers?
-Developers might choose X-CUBE-AI because it offers better performance (in terms of speed and memory usage) on STM32 microcontrollers. While TensorFlow Lite is open-source and supports a wider range of platforms, X-CUBE-AI is tailored for STM32 devices and provides optimizations that can be valuable in constrained environments.