Introduction
FP-AI-VISION1 is a function pack (FP) that demonstrates how efficiently STM32H7 Series microcontrollers can execute a
Convolutional Neural Network (CNN) for computer vision tasks. FP-AI-VISION1 contains everything needed
to build a CNN-based computer vision application on STM32H7 microcontrollers.
FP-AI-VISION1 also demonstrates several memory allocation configurations for the data involved in the application. Each
configuration addresses a specific requirement on the amount of data the application needs. Accordingly,
FP-AI-VISION1 implements examples showing how to place the different types of data efficiently in both the on-chip and
external memories. These examples help users identify which memory allocation best fits their requirements.
This user manual describes the content of the FP-AI-VISION1 function pack and details the steps required to
build a CNN-based computer vision application on STM32H7 microcontrollers.
Artificial Intelligence (AI) and computer vision function pack
for STM32H7 microcontrollers
UM2611
User manual
UM2611 - Rev 5 - August 2021
For further information contact your local STMicroelectronics sales office.
www.st.com
1 General information
The FP-AI-VISION1 function pack runs on the STM32H7 microcontrollers based on the Arm® Cortex®-M7 processor.
Note: Arm is a registered trademark of Arm Limited (or its subsidiaries) in the US and/or elsewhere.
1.1 FP-AI-VISION1 function pack feature overview
• Runs on the STM32H747I-DISCO board connected with the B-CAMS-OMV camera module bundle (advised) or STM32F4DIS-CAM camera daughterboard (legacy only)
• Includes three image classification application examples based on CNN:
  – One food recognition application operating on color (RGB 24 bits) frame images
  – One person presence detection application operating on color (RGB 24 bits) frame images
  – One person presence detection application operating on grayscale (8 bits) frame images
• Includes one people counting application (counting the number of persons in a scene) based on an object-detection CNN model
• Includes complete application firmware for camera capture, frame image preprocessing, inference execution and output post-processing
• Includes examples of integration of both floating-point and 8-bit quantized C models
• Supports several configurations for data memory placement in order to meet application requirements
• Includes test and validation firmware in order to test, debug and validate the embedded application
• Includes capture firmware enabling dataset collection
• Includes support for file handling (on top of FatFS) on external microSD card
• Includes a USB webcam application, which can be used to create image and video datasets as well as to perform live testing on the host
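The frame formats and model types listed above determine how much data must be placed in memory. The following C fragment is a minimal sketch of that arithmetic, not code taken from the function pack: the helper names `frame_buffer_bytes` and `input_tensor_bytes` are hypothetical, and the 224 x 224 resolution used in the usage note is an assumption chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes per pixel for the two frame formats named in the feature list:
 * color (RGB 24 bits) and grayscale (8 bits). */
enum { BPP_RGB888 = 3, BPP_GRAY8 = 1 };

/* Hypothetical helper: size in bytes of one frame buffer.
 * width and height are expressed in pixels. */
static size_t frame_buffer_bytes(size_t width, size_t height,
                                 size_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return width * height * bytes_per_pixel;
}

/* Hypothetical helper: size in bytes of a network input tensor.
 * element_size is sizeof(float) for a floating-point C model and
 * 1 byte for an 8-bit quantized C model. */
static size_t input_tensor_bytes(size_t width, size_t height,
                                 size_t channels, size_t element_size)
{
    assert(element_size > 0);
    return width * height * channels * element_size;
}
```

Under the assumed 224 x 224 resolution, `frame_buffer_bytes(224, 224, BPP_RGB888)` gives 150528 bytes and `frame_buffer_bytes(224, 224, BPP_GRAY8)` gives 50176 bytes. With a 4-byte `float`, the input tensor of a three-channel floating-point model is four times larger than that of its 8-bit quantized counterpart, which is one reason the choice of model type affects which memory placement configuration is suitable.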