The VivadoAccelerator backend of hls4ml leverages the PYNQ software stack to easily deploy models on supported devices.
hls4ml supports a specific set of boards, including the pynq-z2 used in the example below, but, in principle, support can be extended to any board supported by PYNQ.
For the Zynq-based boards, there are two components: an ARM-based processing system (PS) and FPGA-based programmable logic (PL), with various interfaces between the two.
Neural Network Overlay
In the PYNQ project, programmable logic circuits are presented as hardware libraries called overlays.
The overlay can be accessed through a Python API.
In hls4ml, we create a custom neural network overlay, which sends and receives data via AXI stream.
The target device is programmed using a bitfile that is generated by the VivadoAccelerator backend.
This example is taken from part 7 of the hls4ml tutorial.
Specifically, we’ll deploy a model on a pynq-z2 board.
First, we generate the bitfile from a Keras model, model, and a config.
    import hls4ml

    config = hls4ml.utils.config_from_keras_model(model, granularity='name')
    hls_model = hls4ml.converters.convert_from_keras_model(
        model,
        hls_config=config,
        output_dir='hls4ml_prj_pynq',
        backend='VivadoAccelerator',
        board='pynq-z2',
    )
    # build() is a method of the converted model, not of the hls4ml module
    hls_model.build(bitfile=True)
After this command completes, we will need to package up the bitfile, hardware handoff, and Python driver to copy to the PS of the board.
    mkdir -p package
    cp hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.runs/impl_1/design_1_wrapper.bit package/hls4ml_nn.bit
    cp hls4ml_prj_pynq/myproject_vivado_accelerator/project_1.srcs/sources_1/bd/design_1/hw_handoff/design_1.hwh package/hls4ml_nn.hwh
    cp hls4ml_prj_pynq/axi_stream_driver.py package/
    tar -czvf package.tar.gz -C package/ .
Then we can copy this package to the PS of the board and untar it.
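For readers who prefer to script the packaging step from Python, for instance in the same notebook that runs the build, the following is a minimal sketch that mirrors the shell commands above. The helper name make_package and its parameters are hypothetical and not part of hls4ml; the file paths are the ones used in the commands above and assume the default project layout.

```python
import shutil
import tarfile
from pathlib import Path


def make_package(project_dir, package_dir="package", archive="package.tar.gz"):
    """Collect the bitfile, hardware handoff, and driver, then tar them up.

    Hypothetical helper; paths mirror the shell commands in the text.
    """
    project = Path(project_dir)
    package = Path(package_dir)
    package.mkdir(parents=True, exist_ok=True)

    accel = project / "myproject_vivado_accelerator"
    # (source file, name it should have inside the package)
    files = [
        (accel / "project_1.runs/impl_1/design_1_wrapper.bit", "hls4ml_nn.bit"),
        (
            accel / "project_1.srcs/sources_1/bd/design_1/hw_handoff/design_1.hwh",
            "hls4ml_nn.hwh",
        ),
        (project / "axi_stream_driver.py", "axi_stream_driver.py"),
    ]
    for src, name in files:
        # Raises FileNotFoundError if the build did not produce the file
        shutil.copy(src, package / name)

    # Rough equivalent of `tar -czvf package.tar.gz -C package/ .`
    with tarfile.open(archive, "w:gz") as tar:
        for path in sorted(package.iterdir()):
            tar.add(path, arcname=path.name)
    return Path(archive)
```

Calling make_package('hls4ml_prj_pynq') would then produce package.tar.gz in the current directory. The bitfile and the hardware handoff are given the same base name (hls4ml_nn), as in the shell commands, since PYNQ looks for the .hwh file next to the bitfile.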
Finally, on the PS in Python we can create a
NeuralNetworkOverlay object, which will download the bitfile onto the PL of the board.
We also must provide the shapes of our input and output data, X_test.shape and y_test.shape, respectively, to allocate the buffers for the data transfer.
The predict method will send the input data to the PL and return the output data y_hw.
    from axi_stream_driver import NeuralNetworkOverlay

    nn = NeuralNetworkOverlay('hls4ml_nn.bit', X_test.shape, y_test.shape)
    y_hw, latency, throughput = nn.predict(X_test, profile=True)
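The text does not say how the latency and throughput values returned with profile=True are defined. As a rough illustration only, the sketch below times a single batched call and derives both numbers, assuming latency is the wall-clock time for the whole batch in seconds and throughput is samples per second. The helper profile_predict is hypothetical and is not the driver's implementation; the driver may measure these quantities differently.

```python
import time


def profile_predict(predict, batch):
    """Time one call to `predict` over a whole batch.

    Assumption: latency is wall-clock seconds for the batch and throughput
    is samples per second. The real driver may define them differently.
    """
    start = time.perf_counter()
    outputs = predict(batch)
    latency = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers
    throughput = len(batch) / latency if latency > 0 else float("inf")
    return outputs, latency, throughput


if __name__ == "__main__":
    # Stand-in for nn.predict so the sketch runs without a board
    def fake_predict(xs):
        return [2 * x for x in xs]

    y, latency, throughput = profile_predict(fake_predict, [1, 2, 3])
    print(f"latency: {latency:.6f} s, throughput: {throughput:.1f} samples/s")
```

A wrapper like this can also time a software reference (for example, the original Keras model's predict) on the same inputs, which gives a like-for-like comparison with the numbers reported on the board.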