Part 7b: Deployment on PYNQ-Z2

This section contains the code to execute in the PYNQ-Z2 Jupyter notebook to run NN inference.

The following cells are intended to run on a PYNQ-Z2; they will not run on the server used to train and synthesize the models!

First, import our driver Overlay class. We’ll also load the test data.

from axi_stream_driver import NeuralNetworkOverlay
import numpy as np

X_test = np.load('X_test.npy')
y_test = np.load('y_test.npy')

Create a NeuralNetworkOverlay object. This downloads the overlay (bitfile) onto the programmable logic (PL) of the PYNQ-Z2. We provide X_test.shape and y_test.shape so the driver can allocate buffers for the data transfer.

nn = NeuralNetworkOverlay('hls4ml_nn.bit', X_test.shape, y_test.shape)
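For reference, the axi_stream_driver module is the Python driver shipped alongside the bitfile. If you ever need to adapt it, the sketch below shows roughly what such a driver does using the standard pynq API. The DMA instance name axi_dma_0, the float32 data type, and the class name SimpleNNOverlay are illustrative assumptions, not the actual generated code, so check your own driver and block design for the real names.

import numpy as np
from pynq import Overlay, allocate

# Minimal sketch of an AXI-stream NN driver (assumptions: a single AXI DMA named
# 'axi_dma_0' in the block design, float32 inputs and outputs).
class SimpleNNOverlay(Overlay):
    def __init__(self, bitfile, x_shape, y_shape, dtype=np.float32):
        super().__init__(bitfile)                    # downloads the bitfile onto the PL
        self.dma = self.axi_dma_0                    # assumed DMA instance name
        self.input_buffer = allocate(shape=x_shape, dtype=dtype)
        self.output_buffer = allocate(shape=y_shape, dtype=dtype)

    def predict(self, X):
        self.input_buffer[:] = X
        self.dma.sendchannel.transfer(self.input_buffer)   # stream the inputs to the NN IP
        self.dma.recvchannel.transfer(self.output_buffer)  # receive the predictions
        self.dma.sendchannel.wait()
        self.dma.recvchannel.wait()
        return np.array(self.output_buffer)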

Now run the prediction! Setting profile=True makes the function time the inference, print a summary, and return the profiling information (latency and throughput) alongside the predictions. We also save the output to a file so we can do some validation.

y_hw, latency, throughput = nn.predict(X_test, profile=True)

An example printout looks like:

Classified 166000 samples in 0.402568 seconds (412352.6956936468 inferences / s)
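The returned values can be cross-checked against this summary. The snippet below assumes, as the printout suggests, that latency is the total elapsed time in seconds for the batch and throughput is in inferences per second:

n_samples = X_test.shape[0]
# Assumed meanings: latency = total seconds for the batch, throughput = inferences / s
print(f'{n_samples} samples in {latency:.6f} s -> {n_samples / latency:.1f} inferences / s')
print(f'Reported throughput: {throughput:.1f} inferences / s')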

Now let’s save the output so we can transfer it back to the host.

np.save('y_hw.npy', y_hw)
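Optionally, you can run a quick sanity check on the board before transferring the file. This sketch assumes y_test holds one-hot encoded labels and y_hw holds the corresponding class scores, as in the earlier parts of this tutorial:

# Optional sanity check; assumes one-hot labels in y_test and class scores in y_hw
hw_accuracy = np.mean(np.argmax(y_hw, axis=1) == np.argmax(y_test, axis=1))
print(f'Hardware model accuracy: {hw_accuracy:.4f}')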

Now, go back to the host and follow part7c_validation.ipynb.