oneAPI Backend
The oneAPI backend of hls4ml is designed for deploying NNs on Intel/Altera FPGAs. It will eventually
replace the Quartus backend, which would more accurately have been called the Intel HLS backend. (The Quartus
program itself continues to be used with IP produced by the oneAPI backend.)
This section discusses details of the oneAPI backend.
The oneAPI code uses SYCL kernels to implement the logic that is deployed on FPGAs. It naturally leads to the
accelerator style of programming. In the IP Component flow, which is currently the only flow supported, the
kernel becomes the IP, and the “host code” becomes the testbench. An accelerator flow, with easier deployment on
PCIe accelerator boards, is planned to be added in the future.
The produced work areas use cmake to build the projects in a style based on the
oneAPI-samples.
The standard make targets fpga_emu, report, fpga_sim, and fpga are supported. Additionally, make lib
produces the library used for calling the predict function from hls4ml. The compile and build commands
in hls4ml interact with the cmake system, so one does not need to use the build system manually, but it is there
if desired.
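As a sketch of the manual route, assuming a work area already generated by hls4ml (the target names follow the list above; the build-directory layout is an assumption based on the oneAPI-samples convention):

```shell
# Assumes an hls4ml-generated oneAPI work area; run from inside it.
mkdir -p build && cd build
cmake ..

make fpga_emu   # fast functional emulation on the host CPU
make report     # generate the HLS reports
make fpga_sim   # simulate the generated design
make fpga       # full hardware compile
make lib        # library used by hls4ml's predict function
```

In normal use, hls4ml's compile and build commands issue the equivalent invocations for you.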
The oneAPI backend, like the Quartus backend, only implements the Resource strategy for the layers. There
is no Latency implementation of any of the layers.
Note: currently tracing and external weights (i.e. setting BramFactor) are not supported.
io_parallel and io_stream
As mentioned in the I/O Types section, io_parallel is for small models, while io_stream is for
larger models. In oneAPI, there is an additional difference: io_stream implements each layer in its
own task_sequence. Thus, the layers run in parallel, with pipes connecting the inputs and outputs. This
is similar in style to the dataflow implementation on Vitis, but more explicit. On the other hand, io_parallel
always uses a single task, relying on pipelining within the task for good performance. In contrast, the Vitis
backend sometimes uses dataflow with io_parallel.
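The choice between the two modes is made at conversion time. A minimal sketch using hls4ml's standard conversion API (it assumes hls4ml and a trained Keras model `model` are available, and the directory name is an arbitrary example):

```python
import hls4ml

# Generate a per-layer configuration for the model.
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='oneAPI',
    io_type='io_stream',       # or 'io_parallel' for small models
    output_dir='my_oneapi_prj',
)

hls_model.compile()            # builds the library used by predict
y = hls_model.predict(X)       # bit-accurate emulation of the IP
```

With io_stream, the generated kernel launches one task_sequence per layer; with io_parallel, the same API produces a single pipelined task.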