Part 7b: Deployment on PYNQ-Z2
This section contains the code to execute in the PYNQ-Z2 Jupyter notebook to run NN inference.
The following cells are intended to run on a PYNQ-Z2; they will not run on the server used to train and synthesize the models!
First, import our NeuralNetworkOverlay driver class. We'll also load the test data.
from axi_stream_driver import NeuralNetworkOverlay
import numpy as np
X_test = np.load('X_test.npy')
y_test = np.load('y_test.npy')
Create a NeuralNetworkOverlay object. This downloads the Overlay (bitfile) onto the PL of the PYNQ-Z2. We provide X_test.shape and y_test.shape so that the driver can allocate buffers for the data transfer.
nn = NeuralNetworkOverlay('hls4ml_nn.bit', X_test.shape, y_test.shape)
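For reference, below is a minimal sketch of what a driver like NeuralNetworkOverlay could look like internally, built on the PYNQ Overlay and allocate APIs. It assumes a single AXI DMA named axi_dma_0 in the block design, streaming data to and from the hls4ml IP; the IP names, data types, and the actual axi_stream_driver implementation may differ.

import numpy as np
from pynq import Overlay, allocate

class SimpleNNOverlay(Overlay):
    """Sketch of a driver: one AXI DMA streams data to/from the hls4ml IP."""

    def __init__(self, bitfile, x_shape, y_shape, dtype=np.float32):
        super().__init__(bitfile)  # programs the PL with the bitfile
        # Contiguous buffers for DMA transfers, sized from the dataset shapes
        self.input_buffer = allocate(shape=x_shape, dtype=dtype)
        self.output_buffer = allocate(shape=y_shape, dtype=dtype)

    def predict(self, X):
        self.input_buffer[:] = X  # copy input into the DMA buffer
        # 'axi_dma_0' is an assumed name; it depends on the block design
        self.axi_dma_0.sendchannel.transfer(self.input_buffer)
        self.axi_dma_0.recvchannel.transfer(self.output_buffer)
        self.axi_dma_0.sendchannel.wait()  # block until both transfers complete
        self.axi_dma_0.recvchannel.wait()
        return np.array(self.output_buffer)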
Now run the prediction! When we set profile=True, the function times the inference and prints a summary, as well as returning the profiling information. We also save the output to a file so we can do some validation.
y_hw, latency, throughput = nn.predict(X_test, profile=True)
An example printout looks like:
Classified 166000 samples in 0.402568 seconds (412352.6956936468 inferences / s)
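As a quick sanity check on the board, you can compare the hardware predictions to the labels before going back to the host. This is a sketch that assumes a classification task with one-hot encoded y_test, as in the earlier parts of this tutorial; the full validation is done on the host in part7c.

# Quick on-board accuracy check (assumes one-hot encoded labels)
accuracy = np.mean(np.argmax(y_hw, axis=1) == np.argmax(y_test, axis=1))
print(f'Hardware accuracy: {accuracy:.4f}')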
Now let’s save the output and transfer this back to the host.
np.save('y_hw.npy', y_hw)
Now, go back to the host and follow part7c_validation.ipynb.