Software Details

Frontends and Backends

hls4ml distinguishes between a frontend, which parses the input NN into an internal model graph, and a backend, which controls what type of output is produced from that graph. Frontends and backends can be chosen independently. Examples of frontends are the parsers for Keras and ONNX; examples of backends are Vivado HLS, Intel HLS, and Vitis HLS. See Status and Features for the currently supported frontends and backends.

I/O Types

hls4ml supports multiple styles for handling data between layers, known as the io_type.
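As an illustration, the backend and the io_type are both typically chosen when a model is converted. The sketch below assumes the Keras frontend and the hls4ml Python functions config_from_keras_model and convert_from_keras_model. The helper names conversion_kwargs and convert_keras, the default output directory, and the backend string "Vitis" are assumptions made for this example; check Status and Features for the backend names your release accepts.

```python
# Minimal sketch: pick a backend and an io_type when converting a Keras model.
# `conversion_kwargs` and `convert_keras` are hypothetical helpers, not part of hls4ml.

VALID_IO_TYPES = ("io_parallel", "io_stream")


def conversion_kwargs(backend="Vitis", io_type="io_parallel", output_dir="hls4ml_prj"):
    """Collect the backend and io_type choices as keyword arguments."""
    if io_type not in VALID_IO_TYPES:
        raise ValueError(f"io_type must be one of {VALID_IO_TYPES}, got {io_type!r}")
    return {"backend": backend, "io_type": io_type, "output_dir": output_dir}


def convert_keras(model, **choices):
    """Convert a Keras model using the Keras frontend and the chosen backend."""
    # Imported lazily so the helper above can be used without hls4ml installed.
    import hls4ml

    # Frontend step: derive a configuration from the Keras model.
    config = hls4ml.utils.config_from_keras_model(model, granularity="model")

    # Backend step: the same model graph can target a different backend
    # simply by changing the `backend` argument.
    return hls4ml.converters.convert_from_keras_model(
        model, hls_config=config, **conversion_kwargs(**choices)
    )
```

With a Keras model at hand, convert_keras(model, backend="Vitis", io_type="io_stream") would request streaming I/O for the Vitis HLS backend, under the assumptions stated above.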

io_parallel

Data is passed in parallel between the layers. This is good for MLP networks and small CNNs. Synthesis may fail for larger networks.

io_stream

Data is passed one “pixel” at a time. Each pixel is an array of channels, which are always sent in parallel. This method for sending data between layers is recommended for larger CNNs. For Dense layers, all the inputs are streamed in parallel as a single array.
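The difference between the two styles can be pictured with a small conceptual model. This is only a sketch of the data ordering, not the HLS code that hls4ml generates; the function names and the row-major pixel order are assumptions made for illustration.

```python
# Conceptual model of how a height x width x channels feature map is handed
# from one layer to the next. Pure Python, no hls4ml required.

def as_parallel(feature_map):
    """io_parallel picture: the whole feature map is passed as one flat array."""
    return [value for row in feature_map for pixel in row for value in pixel]


def as_pixel_stream(feature_map):
    """io_stream picture: one pixel at a time, each pixel carrying all channels."""
    for row in feature_map:
        for pixel in row:
            # The channels of a single pixel always travel together.
            yield list(pixel)


# A 2 x 2 feature map with 3 channels per pixel.
feature_map = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

flat = as_parallel(feature_map)             # a single array of 12 values
stream = list(as_pixel_stream(feature_map)) # 4 transfers of 3 channels each
```

In this picture a Dense layer input corresponds to a stream with a single transfer that carries every input value.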

With the io_stream IO type, each layer is connected to the subsequent layer through first-in first-out (FIFO) buffers. The implementation of the FIFO buffers contributes to the overall resource utilization of the design, in particular the BRAM or LUT utilization. Because neural networks can have complex architectures, it is hard to know a priori the correct depth of each FIFO buffer. By default hls4ml chooses the most conservative possible depth for each FIFO buffer, which can result in unnecessary overutilization of resources.
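To see why the required depth depends on the surrounding layers, consider a toy cycle-by-cycle model of a single FIFO between a producing and a consuming layer. This is a simplified sketch, not the hls4ml optimization flow: the function name, the fixed read/write intervals, and the start delay are all assumptions chosen for illustration.

```python
from collections import deque


def max_fifo_occupancy(n_items, producer_interval, consumer_interval, consumer_start):
    """Simulate one FIFO and return the peak number of items it holds.

    The producer writes one item every `producer_interval` cycles, starting at
    cycle 0. The consumer reads one item every `consumer_interval` cycles,
    starting at cycle `consumer_start`.
    """
    fifo = deque()
    produced = consumed = peak = cycle = 0

    while consumed < n_items:
        # Producer side: write if it is a write cycle and data remains.
        if produced < n_items and cycle % producer_interval == 0:
            fifo.append(produced)
            produced += 1

        # Record occupancy after the write and before the read.
        peak = max(peak, len(fifo))

        # Consumer side: read if started, on a read cycle, and data is available.
        started = cycle >= consumer_start
        if started and (cycle - consumer_start) % consumer_interval == 0 and fifo:
            fifo.popleft()
            consumed += 1

        cycle += 1

    return peak


matched = max_fifo_occupancy(16, 1, 1, 0)    # consumer keeps pace with producer
delayed = max_fifo_occupancy(16, 1, 1, 8)    # consumer starts 8 cycles late
slow = max_fifo_occupancy(16, 1, 2, 0)       # consumer reads every other cycle
```

In this model the matched pair needs only a very shallow FIFO, while a delayed or slower consumer needs a much deeper one. Sizing every FIFO for the worst case is safe, but it spends buffer resources that a well-matched pair of layers would never use.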

To reduce the resources used by the FIFO buffers, hls4ml provides a FIFO depth optimization flow. It is described in the FIFO Buffer Depth Optimization section.