hls4ml.backends.fpga package

Subpackages

hls4ml.backends.fpga.passes package

Submodules

hls4ml.backends.fpga.fpga_backend module

class hls4ml.backends.fpga.fpga_backend.FPGABackend(name)

Bases: Backend

compile(model)

Compile the generated project so that it can be linked into the Python runtime.

Parameters: model (ModelGraph) – Model to compile.
Raises: Exception – If the project failed to compile.
Returns: The name of the compiled library.
Return type: string

compute_conv1d_instructions(in_W, in_C, kernel_size=3, stride=1, pad=0)

compute_conv2d_instructions(in_H, in_W, in_C, kernel_size=3, stride=1, pad=0)

classmethod convert_precision_string(precision)

create_layer_class(layer_class)

Wrap the original layer class into the backend-specific layer class.

Backends should extend base layer classes with new attributes and variables as needed. These new classes are then used within the model.

Parameters:: layer_class (class) – Base class to extend

generate_conv1d_line_buffer_fn(layer_idx, n_partitions, in_W, in_C, kernel=3, stride=1, pad=0, dilation=1)

Generate a C++ function that mimics the im2col algorithm. This function works for 1D convolution.

The HLS compiler produces suboptimal designs for an im2col algorithm implementation, so a trick we use is to generate the result of the im2col transformation explicitly, instead of relying on loops. Since the result depends on the parameters of the convolution layer (the input size, the kernel size, the stride, etc.), we need to do this for every convolution layer.

Parameters:

layer_idx (int) – Index of layer (‘index’ attribute).
n_partitions (int) – Number of partitions to divide the input into. The pixels in each partition will be processed in parallel.
in_W (int) – Width of input.
in_C (int) – Number of channels.
kernel (int, optional) – Size of the kernel. Defaults to 3.
stride (int, optional) – Stride length. Defaults to 1.
pad (int or Iterable, optional) – Padding to apply. Defaults to 0. Specified as either a number or a list [left_pad, right_pad].
dilation (int, optional) – Dilation rate. Defaults to 1.

Returns:

Generated C++ function

Return type:

str
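To make the idea concrete, the following is a minimal Python sketch of what "generating the im2col result explicitly" can mean for the 1D case. It is not the hls4ml implementation and does not emit C++; it only computes, for every output position, which flat input indices the im2col row would read. The function name `im2col_1d_indices`, the channels-last flat layout (`w * in_C + c`), and the use of `None` for padded positions are assumptions made for illustration.

```python
def im2col_1d_indices(in_W, in_C, kernel=3, stride=1, pad=0, dilation=1):
    """Return, per output position, the flat input indices an im2col row reads.

    Hypothetical helper: assumes a channels-last flat input (index = w * in_C + c)
    and marks positions that fall into the padding with None.
    """
    # pad is either a single number or [left_pad, right_pad]
    pad_left, pad_right = (pad, pad) if isinstance(pad, int) else pad
    effective_kernel = dilation * (kernel - 1) + 1
    out_W = (in_W + pad_left + pad_right - effective_kernel) // stride + 1

    rows = []
    for o in range(out_W):
        row = []
        for k in range(kernel):
            # Input column touched by kernel tap k at output position o
            w = o * stride + k * dilation - pad_left
            for c in range(in_C):
                row.append(w * in_C + c if 0 <= w < in_W else None)
        rows.append(row)
    return rows


rows = im2col_1d_indices(in_W=4, in_C=1, kernel=3, stride=1, pad=1)
```

Because every index depends only on the layer parameters, a table like this can be computed once per layer and written out as straight-line code, which is the motivation described above.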

generate_conv2d_line_buffer_fn(layer_idx, n_partitions, in_H, in_W, in_C, kernel=(3, 3), stride=(1, 1), pad=(0, 0, 0, 0), dilation=(1, 1))

Generate a C++ function that mimics the im2col algorithm. This function works for 2D convolution.

The HLS compiler produces suboptimal designs for an im2col algorithm implementation, so a trick we use is to generate the result of the im2col transformation explicitly, instead of relying on loops. Since the result depends on the parameters of the convolution layer (the input size, the kernel size, the stride, etc.), we need to do this for every convolution layer.

Parameters:

layer_idx (int) – Index of layer (‘index’ attribute).
n_partitions (int) – Number of partitions to divide the input into. The pixels in each partition will be processed in parallel.
in_H (int) – Height of input.
in_W (int) – Width of input.
in_C (int) – Number of channels.
kernel (int or Iterable, optional) – Size of the kernel. Defaults to (3,3).
stride (int or Iterable, optional) – Stride length. Defaults to (1,1).
pad (int or Iterable, optional) – Padding to apply. Defaults to (0, 0, 0, 0). Specified as either a number or a list [top_pad, bottom_pad, left_pad, right_pad].
dilation (int or Iterable, optional) – Dilation rate. Defaults to (1,1).

Returns:

Generated C++ function

Return type:

str

get_closest_reuse_factor(valid_rf, chosen_rf)

Returns the value in valid_rf closest to chosen_rf. valid_rf is sorted (obtained from get_valid_reuse_factors()). If two values are equally close, the smaller one is returned.
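The selection rule described here can be sketched in a few lines of Python. This is an illustrative stand-alone function, not the backend's code; the name `closest_reuse_factor` and the use of `bisect` are assumptions, and the input list is assumed to be sorted in ascending order and non-empty.

```python
from bisect import bisect_left


def closest_reuse_factor(valid_rf, chosen_rf):
    """Return the entry of the sorted list valid_rf closest to chosen_rf.

    Hypothetical sketch: when two entries are equally close, the smaller wins.
    """
    pos = bisect_left(valid_rf, chosen_rf)
    if pos == 0:
        return valid_rf[0]        # chosen_rf is at or below the smallest entry
    if pos == len(valid_rf):
        return valid_rf[-1]       # chosen_rf is above the largest entry
    before, after = valid_rf[pos - 1], valid_rf[pos]
    # "<=" makes a tie resolve to the smaller candidate
    return before if chosen_rf - before <= after - chosen_rf else after


print(closest_reuse_factor([1, 2, 4, 8, 16], 3))  # tie between 2 and 4 -> 2
```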

get_layer_mult_size(layer)

get_valid_conv_partition_splits(out_height, out_width)

Generate valid partition splits of a Conv1D/2D layer.

Essentially a list of divisors of the number of pixels of the output image.

Parameters:

out_height (int) – The height of the output image
out_width (int) – The width of the output image

Returns:

List of valid partition splits

Return type:

list
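As a rough illustration of "a list of divisors of the number of pixels of the output image", the sketch below enumerates them directly. The function name `valid_partition_splits` is hypothetical, and treating a Conv1D output as an image of height 1 is an assumption rather than something stated above.

```python
def valid_partition_splits(out_height, out_width):
    """List every divisor of the number of output pixels (hypothetical sketch)."""
    n_pixels = out_height * out_width
    return [d for d in range(1, n_pixels + 1) if n_pixels % d == 0]


print(valid_partition_splits(2, 3))  # 6 output pixels -> [1, 2, 3, 6]
```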

get_valid_reuse_factors(n_in, n_out)

get_writer_flow()

product_type(data_T, weight_T)

Helper function to determine which product implementation to use during inference.

set_closest_reuse_factor(layer, n_in, n_out, attribute='reuse_factor', include_max_rf=True)

set_target_reuse_factor(layer)

write(model)

Write the generated project to disk.

This function converts the model to C++ and writes the generated files in the output directory specified in the config.

Parameters: model (ModelGraph) – Model to write.

write_hls(model)

hls4ml.backends.fpga.fpga_layers module

class hls4ml.backends.fpga.fpga_layers.BatchNormalizationQuantizedTanh(model, name, attributes, inputs, outputs=None, initialize=True)

Bases: Layer

Merged Batch Normalization and quantized (binary or ternary) Tanh layer. The mean, variance, beta, gamma parameters are folded into the threshold(s) at which the sign of the input flips after the quantized (binary or ternary) Tanh activation.
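For the binary case, the folding can be sketched as follows. This is a minimal illustration under stated assumptions, not the layer's actual code: it assumes the usual batch normalization formula, a positive gamma, an output of +1 when the normalized value is non-negative, and a hypothetical `epsilon` argument. The function names are made up for this example.

```python
import math


def binary_tanh_threshold(mean, variance, beta, gamma, epsilon=1e-3):
    """Input value at which the batch-normalized output changes sign.

    Assumes BN(x) = gamma * (x - mean) / sqrt(variance + epsilon) + beta
    with gamma > 0, so BN(x) >= 0 exactly when x >= threshold.
    """
    return mean - beta * math.sqrt(variance + epsilon) / gamma


def bn_then_binary_tanh(x, mean, variance, beta, gamma, epsilon=1e-3):
    """Reference path: apply batch normalization, then take the sign."""
    y = gamma * (x - mean) / math.sqrt(variance + epsilon) + beta
    return 1 if y >= 0 else -1


# With the parameters folded, a single comparison replaces the arithmetic
threshold = binary_tanh_threshold(mean=1.0, variance=4.0, beta=2.0, gamma=1.0, epsilon=0.0)
merged = [1 if x >= threshold else -1 for x in (-3.1, -2.9)]
```

Under these assumptions the merged comparison gives the same result as the two-step reference path. The ternary case would use two thresholds instead of one.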