Introduction

FP-AI-VISION1 is a function pack (FP) demonstrating the capability of STM32H7 Series microcontrollers to execute a Convolutional Neural Network (CNN) efficiently for computer vision tasks. FP-AI-VISION1 contains everything needed to build a CNN-based computer vision application on STM32H7 microcontrollers.

FP-AI-VISION1 also demonstrates several memory allocation configurations for the data involved in the application. Each configuration addresses specific requirements in terms of the amount of data required by the application. Accordingly, FP-AI-VISION1 implements examples describing how to place the different types of data efficiently in both the on-chip and external memories. These examples help users understand which memory allocation best fits their requirements.
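The placement idea can be illustrated with a minimal sketch. It is not taken from the function pack: the type mem_region_t, the function place_buffer and the byte budgets are hypothetical names and values chosen for illustration only. The sketch assumes a simple first-fit policy in which a buffer goes on-chip while the remaining on-chip budget allows it, and to external memory otherwise.

```c
#include <stddef.h>

/* Hypothetical memory regions (illustrative names, not from FP-AI-VISION1). */
typedef enum { MEM_ON_CHIP, MEM_EXTERNAL } mem_region_t;

/*
 * Place a buffer on-chip if it fits in the remaining on-chip budget,
 * otherwise place it in external memory.
 * On an on-chip placement, *on_chip_free is reduced by size_bytes.
 */
static mem_region_t place_buffer(size_t size_bytes, size_t *on_chip_free)
{
    if (size_bytes <= *on_chip_free) {
        *on_chip_free -= size_bytes;
        return MEM_ON_CHIP;
    }
    return MEM_EXTERNAL;
}
```

In practice, the order in which buffers are offered to such a policy matters: offering the most frequently accessed buffers first keeps them in the faster on-chip memory. The actual configurations provided by the function pack are fixed at build time and are described later in this manual.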

This user manual describes the content of the FP-AI-VISION1 function pack and details the different steps to be carried out in order to build a CNN-based computer vision application on STM32H7 microcontrollers.

Artificial Intelligence (AI) and computer vision function pack for STM32H7 microcontrollers

UM2611

User manual

UM2611 - Rev 5 - August 2021


1 General information

The FP-AI-VISION1 function pack runs on the STM32H7 microcontrollers based on the Arm Cortex-M7 processor.

Note: Arm is a registered trademark of Arm Limited (or its subsidiaries) in the US and/or elsewhere.

1.1 FP-AI-VISION1 function pack feature overview

• Runs on the STM32H747I-DISCO board connected with the B-CAMS-OMV camera module bundle (advised) or STM32F4DIS-CAM camera daughterboard (legacy only)

• Includes three image classification application examples based on CNN:

– One food recognition application operating on color (RGB 24 bits) frame images

– One person presence detection application operating on color (RGB 24 bits) frame images

– One person presence detection application operating on grayscale (8 bits) frame images

• Includes one people counting application (counting the number of persons in a scene) based on an object-detection CNN model

• Includes complete application firmware for camera capture, frame image preprocessing, inference execution and output post-processing

• Includes examples of integration of both floating-point and 8-bit quantized C models

• Supports several configurations for data memory placement in order to meet application requirements

• Includes test and validation firmware in order to test, debug and validate the embedded application

• Includes capture firmware enabling dataset collection

• Includes support for file handling (on top of FatFS) on external microSD™ card

• Includes a USB webcam application, which can be used to create image and video datasets as well as to perform live testing on the host
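The two frame formats listed above (RGB 24 bits and grayscale 8 bits) differ by a factor of three in buffer size, which bears directly on the memory placement choices. The following minimal sketch illustrates this. It is not the function pack's preprocessing code: frame_bytes and rgb888_to_gray8 are hypothetical helper names, and the grayscale conversion assumes the common integer approximation of the BT.601 luma weights (77, 150, 29 over 256), which may differ from what the firmware actually does.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of a frame buffer: 3 bytes/pixel for RGB888, 1 for 8-bit grayscale. */
static size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/*
 * Convert packed RGB888 pixels to 8-bit grayscale.
 * Assumption: integer BT.601-style luma weights (77 + 150 + 29 = 256),
 * with +128 for rounding before the shift by 8.
 */
static void rgb888_to_gray8(const uint8_t *rgb, uint8_t *gray, size_t num_pixels)
{
    for (size_t i = 0; i < num_pixels; ++i) {
        uint32_t r = rgb[3 * i];
        uint32_t g = rgb[3 * i + 1];
        uint32_t b = rgb[3 * i + 2];
        gray[i] = (uint8_t)((77u * r + 150u * g + 29u * b + 128u) >> 8);
    }
}
```

For a given resolution, frame_bytes(w, h, 3) is three times frame_bytes(w, h, 1), so a grayscale model needs one third of the input buffer space of its color counterpart.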
