Optimizer Passes and Flows

Internal Structure

The hls4ml library parses models from Keras, PyTorch, or ONNX into an internal execution graph, which is represented by the ModelGraph class. The nodes in this graph, which correspond to the layers and operations of the input model, are represented by classes derived from the Layer base class.

Layers must define inputs and outputs, which determine how they are connected in the graph and the shape of their output. All information about a layer’s state and configuration is stored in its attributes. All weights, variables, and data types are attributes, and there are mapping views to sort through them. Layers can declare their expected attributes, which can be used to verify a layer for correctness or to produce a list of configurable attributes that the user can tweak.

Optimizer passes

To reach a state from which the code can be generated, the internal model graph undergoes a series of optimizations (transformations), dubbed optimization passes. All transformations of the model and any modification to any layer’s attributes must be implemented through an optimization pass. All optimizer passes derive from the OptimizerPass class. Optimizer passes are usually applied to nodes/layers; however, a special class ModelOptimizerPass exists that is applied on the full model. An example of a layer optimizer is fuse_biasadd, which adds a bias to a Dense, Conv1D, or Conv2D layer, while an example of an optimizer pass that runs on the full model is MakeStamp, which creates a unique number (stamp).

Subclasses of OptimizerPass must provide a criterion in the match function that, if satisfied, triggers the transformation implemented in the transform function. The boolean return value of transform indicates whether the optimizer pass made changes to the model graph that may require running the optimizers again. In that case, the optimizers in the flow are run again.
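The match/transform contract described above can be illustrated with a minimal, self-contained sketch. It does not import hls4ml: the OptimizerPass class here is a toy stand-in, the graph is a plain list of dictionaries, and RemoveIdentity and apply_once are hypothetical names introduced only for illustration. The method signatures are an assumption modeled on the description, not a copy of the library's code.

```python
# Toy stand-ins only: this is NOT hls4ml code. A "node" is a dict and the
# "model graph" is a plain list of nodes.

class OptimizerPass:
    """Toy base class mirroring the match/transform contract."""

    def match(self, node):
        raise NotImplementedError

    def transform(self, model, node):
        raise NotImplementedError


class RemoveIdentity(OptimizerPass):  # hypothetical pass, for illustration
    def match(self, node):
        # Criterion: the pass only applies to Identity nodes.
        return node["class_name"] == "Identity"

    def transform(self, model, node):
        model.remove(node)
        # True signals that the graph changed and optimizers may need to rerun.
        return True


def apply_once(model, opt):
    """Apply opt to the first matching node; return transform's result."""
    for node in list(model):  # iterate over a copy, since transform mutates
        if opt.match(node):
            return opt.transform(model, node)
    return False  # nothing matched, so nothing changed


model = [
    {"class_name": "Dense"},
    {"class_name": "Identity"},
    {"class_name": "Activation"},
]
changed = apply_once(model, RemoveIdentity())
remaining = [n["class_name"] for n in model]
```

After the call, changed is True and only the Dense and Activation nodes remain. A second call finds no match and returns False, which is the signal that no further rerun is needed for this pass.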

Optimizers can be general, independent of the backend, in which case they are located in hls4ml.model.optimizer.passes, or they may be backend-specific, in which case they are located in a folder dependent on the backend, e.g., hls4ml.backends.vivado.passes or hls4ml.backends.quartus.passes. A common set of optimizers used by the FPGA backends is located in hls4ml.backends.fpga.passes.

Certain optimizers are used frequently enough that it makes sense to define special classes, which inherit from OptimizerPass:

GlobalOptimizerPass: An optimizer pass that matches each node. This is useful, for example, to transform the types for a particular backend.

LayerOptimizerPass: An optimizer pass that matches each node of a particular layer type. This is useful, for example, to write out the HLS code for a particular node that remains in the final graph.

ConfigurableOptimizerPass: An optimizer pass that has some configurable parameters.

Template: An optimizer pass that populates a code template and assigns it to an attribute of a given layer. This is commonly used to generate code blocks in later stages of the conversion.

Note that LayerOptimizerPass and ModelOptimizerPass also exist as decorators that wrap a function.

New optimizers can be registered with register_pass(). Optimizers should be assigned to a flow (see below).

Flows

A Flow is an ordered set of optimizers that represents a single stage in the conversion process. The optimizers from a flow are applied in sequence until they no longer make changes to the model graph (controlled by the transform return value), after which the next flow (stage) can start. Flows may require that other flows are applied before them, ensuring the model graph is in a desired state before a flow starts. The function register_flow() is used to register a new flow. Flows are applied on a model graph with apply_flow().
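The "apply until nothing changes" behavior of a flow can be sketched as a fixed-point loop. This is a simplified illustration under stated assumptions, not the library's apply_flow() implementation: the model is a list of layer names, each pass is a plain function that makes at most one change and returns a boolean in the same spirit as transform, and run_flow, fuse_bias, and drop_identity are hypothetical names.

```python
# Simplified sketch, NOT hls4ml's implementation. The "model" is a list of
# layer names; each pass returns True if it changed the model.

def fuse_bias(model):
    """Fold one 'BiasAdd' into a directly preceding 'Dense'."""
    for i in range(len(model) - 1):
        if model[i] == "Dense" and model[i + 1] == "BiasAdd":
            model[i:i + 2] = ["Dense(bias)"]
            return True
    return False


def drop_identity(model):
    """Remove one 'Identity' node."""
    if "Identity" in model:
        model.remove("Identity")
        return True
    return False


def run_flow(model, passes):
    """Run the passes in order, repeating until a full sweep changes nothing."""
    sweeps = 0
    changed = True
    while changed:
        sweeps += 1
        changed = False
        for apply_pass in passes:
            if apply_pass(model):
                changed = True
    return sweeps


model = ["Dense", "Identity", "BiasAdd", "Identity", "Activation"]
sweeps = run_flow(model, [fuse_bias, drop_identity])
```

In this toy example, fuse_bias cannot match during the first sweep because an Identity node sits between Dense and BiasAdd. It only succeeds after drop_identity has changed the graph, which is why a change triggers another sweep. The loop ends after three sweeps, the last of which makes no changes, leaving Dense(bias) followed by Activation.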

There are common model-level flows that can run regardless of the backend, and there are backend-specific flows. The convert and optimize flows do not depend on a backend.

Each backend provides a default flow that defines the default target for that backend. For example, the Vivado backend defaults to an IP flow that requires additional flows and produces an IP. It runs no optimizers itself, but it requires that many other flows (sub-flows) have run first. The convert and optimize flows described above are some of these required sub-flows.
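How a top-level flow with no optimizers of its own can still pull in a chain of sub-flows may be easier to see in a small sketch. The dictionary layout, the resolve_order helper, and the mybackend:* flow names are hypothetical and chosen for illustration; only the convert and optimize names come from the text above. The sketch assumes the requirements contain no cycles.

```python
# Illustrative sketch, NOT hls4ml's registry. Each flow maps to a tuple of
# (its own optimizers, the flows it requires). Names are hypothetical except
# for "convert" and "optimize".
flows = {
    "convert": (["parse_layers"], []),
    "optimize": (["fuse_layers"], ["convert"]),
    "mybackend:specific_types": (["set_types"], ["optimize"]),
    # The default target runs no optimizers itself; it only requires others.
    "mybackend:ip": ([], ["mybackend:specific_types"]),
}


def resolve_order(name, flows, applied=None):
    """Return the flows to apply, with required flows before dependents."""
    if applied is None:
        applied = []
    if name in applied:
        return applied  # already scheduled, do not apply twice
    _optimizers, requires = flows[name]
    for required in requires:
        resolve_order(required, flows, applied)
    applied.append(name)
    return applied


order = resolve_order("mybackend:ip", flows)
```

Requesting the hypothetical mybackend:ip target schedules convert, optimize, and mybackend:specific_types before it, in that order, even though the target flow contributes no optimizers of its own.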

Another example is the FIFO buffer depth optimization, which is explained in the FIFO Buffer Depth Optimization section.