oneAPI Backend
The oneAPI backend of hls4ml is designed for deploying NNs on Intel/Altera FPGAs. It will eventually
replace the Quartus backend, which would more accurately have been called the Intel HLS backend. (The Quartus
program itself continues to be used with IP produced by the oneAPI backend.)
This section discusses details of the oneAPI backend.
The oneAPI code uses SYCL kernels to implement the logic that is deployed on FPGAs. It naturally leads to the
accelerator style of programming. In the IP Component flow, which is currently the only flow supported, the
kernel becomes the IP, and the “host code” becomes the testbench. An accelerator flow, with easier deployment on
PCIe accelerator boards, is planned to be added in the future.
The produced work areas use cmake to build the projects in a style based on the
oneAPI-samples.
The standard make targets fpga_emu, report, fpga_sim, and fpga are supported. Additionally, make lib
produces the library used for calling the predict function from hls4ml. The compile and build commands
in hls4ml interact with the cmake system, so one does not need to use the build system manually, but it is there
if desired.
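As a sketch of the manual route, assuming a work area already generated by hls4ml (the target names follow the list above; the build-directory layout is an assumption based on the oneAPI-samples convention):

```shell
# Assumes an hls4ml-generated oneAPI work area; run from inside it.
mkdir -p build && cd build
cmake ..

make fpga_emu   # fast functional emulation on the host CPU
make report     # generate the HLS reports
make fpga_sim   # simulate the generated design
make fpga       # full hardware compile
make lib        # library used by hls4ml's predict function
```

In normal use, hls4ml's compile and build commands issue the equivalent invocations for you.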
The oneAPI backend, like the Quartus backend, only implements the Resource strategy for the layers. There
is no Latency implementation of any of the layers.
Note: currently tracing and external weights (i.e. setting BramFactor) are not supported.
io_parallel and io_stream
As mentioned in the I/O Types section, io_parallel is for small models, while io_stream is for
larger models. In oneAPI, there is an additional difference: io_stream implements each layer in its
own task_sequence. Thus, the layers run in parallel, with pipes connecting the inputs and outputs. This
is similar in style to the dataflow implementation on Vitis, but more explicit. On the other hand, io_parallel
always uses a single task, relying on pipelining within the task for good performance. In contrast, the Vitis
backend sometimes uses dataflow with io_parallel.
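The choice between the two modes is made at conversion time. A minimal sketch using hls4ml's standard conversion API (it assumes hls4ml and a trained Keras model `model` are available, and the directory name is an arbitrary example):

```python
import hls4ml

# Generate a per-layer configuration for the model.
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='oneAPI',
    io_type='io_stream',       # or 'io_parallel' for small models
    output_dir='my_oneapi_prj',
)

hls_model.compile()            # builds the library used by predict
y = hls_model.predict(X)       # bit-accurate emulation of the IP
```

With io_stream, the generated kernel launches one task_sequence per layer; with io_parallel, the same API produces a single pipelined task.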