oneAPI
The oneAPI backend of hls4ml is designed for deploying NNs on Intel/Altera FPGAs. It will eventually replace the Quartus backend, which targeted Intel HLS. (The Quartus toolchain continues to be used with the IP produced by the oneAPI backend.) This section discusses the details of the oneAPI backend.
The oneAPI code uses SYCL kernels to implement the logic that is deployed on FPGAs, which naturally leads to an accelerator style of programming. In the SYCL HLS (IP Component) flow, currently the only flow supported, the kernel becomes the IP and the “host code” becomes the testbench. An accelerator flow, with easier deployment on PCIe accelerator boards, is planned for the future.
The produced work areas use cmake to build the projects, in a style based on oneAPI-samples. The standard `fpga_emu`, `report`, `fpga_sim`, and `fpga` make targets are supported. Additionally, `make lib` produces the library used for calling the `predict` function from hls4ml. The `compile` and `build` commands in hls4ml interact with the cmake system, so one does not need to use the build system manually, but it is there if desired.
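For illustration, here is a minimal sketch of driving these steps from Python. The model (`model`), test data (`X_test`), and output directory are placeholders, and the `build_type` argument is assumed to accept the same names as the make targets above; check the backend's documentation for the exact signature.

```python
# Minimal sketch of the hls4ml workflow with the oneAPI backend.
# Assumes a trained Keras model `model` and test inputs `X_test`.
import hls4ml

config = hls4ml.utils.config_from_keras_model(model, granularity='model')

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='oneAPI',
    output_dir='my-oneapi-prj',  # placeholder
    io_type='io_parallel',
)

# `compile` drives the cmake `lib` target and loads the resulting
# library so that `predict` can run a bit-accurate emulation.
hls_model.compile()
y_hls = hls_model.predict(X_test)

# `build` drives the other cmake targets; 'fpga_emu' is assumed here,
# with 'report', 'fpga_sim', and 'fpga' as the other target names.
hls_model.build(build_type='fpga_emu')
```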
The oneAPI backend, like the Quartus backend, only implements the `Resource` strategy for the layers. There is no `Latency` implementation of any of the layers.
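As a short sketch of how this surfaces in the standard hls4ml configuration dictionary, the strategy can be stated explicitly (the `ReuseFactor` value below is purely illustrative):

```python
# Only Resource is available for the oneAPI backend, so set it explicitly.
config = hls4ml.utils.config_from_keras_model(model, granularity='model')
config['Model']['Strategy'] = 'Resource'
config['Model']['ReuseFactor'] = 4  # illustrative area/latency trade-off
```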
Note: currently, tracing and external weights (i.e., setting `BramFactor`) are not supported.
`io_parallel` and `io_stream`
As mentioned in the I/O Types section, `io_parallel` is for small models, while `io_stream` is for larger models. In oneAPI, there is an additional difference: `io_stream` implements each layer in its own `task_sequence`. Thus, the layers run in parallel, with pipes connecting the inputs and outputs. This is similar in style to the dataflow implementation in Vitis HLS, but more explicit. It is also a change relative to the Intel HLS-based Quartus backend. On the other hand, `io_parallel` always uses a single task, relying on pipelining within the task for good performance. In contrast, the Vitis backend sometimes uses dataflow with `io_parallel`.
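As a minimal sketch, the choice between the two is made at conversion time via the `io_type` argument (the output directory name here is a placeholder):

```python
# Streaming variant: each layer becomes its own task_sequence,
# with pipes connecting the layer inputs and outputs.
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='oneAPI',
    output_dir='my-oneapi-stream-prj',  # placeholder
    io_type='io_stream',  # or 'io_parallel' for the single-task variant
)
```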