hls4ml.optimization.dsp_aware_pruning.keras package

Submodules

hls4ml.optimization.dsp_aware_pruning.keras.builder module

class hls4ml.optimization.dsp_aware_pruning.keras.builder.HyperOptimizationModel(model, attributes, optimizer, loss_fn, validation_metric, regularization_range)

Bases: HyperModel

Helper class for Keras Tuner

build(hp)

Builds a model.

Parameters:

hp – A HyperParameters instance.

Returns:

A model instance.

hls4ml.optimization.dsp_aware_pruning.keras.builder.build_optimizable_model(model, attributes, optimizer, loss_fn, validation_metric, increasing, train_dataset, validation_dataset, batch_size, epochs, verbose=False, directory='hls4ml-optimization-keras', tuner='Bayesian', regularization_range=[1e-06, 1.8478497974222906e-06, 3.414548873833601e-06, 6.30957344480193e-06, 1.165914401179831e-05, 2.1544346900318823e-05, 3.9810717055349695e-05, 7.356422544596421e-05, 0.00013593563908785255, 0.00025118864315095795, 0.00046415888336127773, 0.0008576958985908938, 0.001584893192461114, 0.0029286445646252374, 0.0054116952654646375, 0.01])

Function identifying optimizable layers and adding a regularization loss

Notes:
  • In general, the regularization and learning rate ranges do not need to be provided, as the implementation sets a generic enough range. However, if the user has an idea of the possible hyperparameter ranges, the tuning will complete faster.

  • The default tuner is Bayesian; when coupled with the correct hyperparameter ranges, it performs well and quickly. However, older versions of Keras Tuner had a crashing bug with it.

  • In general, the directory does not need to be specified. However, when pruning several models simultaneously, it is useful to specify the directory, to avoid conflicting intermediate results.

Parameters:
  • model (keras.Model) – Model to be optimized

  • attributes (dict) – Layer-wise model attributes, obtained from hls4ml.optimization.get_attributes_from_keras_model()

  • optimizer (keras.optimizers.Optimizer) – Optimizer used during training

  • loss_fn (keras.losses.Loss) – Loss function used during training

  • validation_metric (keras.metrics.Metric) – Validation metric, used as a baseline

  • increasing (boolean) – Whether the validation metric improves with increasing values; e.g. accuracy -> increasing = True, MSE -> increasing = False

  • train_dataset (tf.Dataset) – Training inputs and labels, in the form of an iterable TF Dataset

  • validation_dataset (tf.Dataset) – Validation inputs and labels, in the form of an iterable TF Dataset

  • batch_size (int) – Batch size during training

  • epochs (int) – Maximum number of epochs to fine-tune model, in one iteration of pruning

  • verbose (bool) – Whether to log tuner outputs to the console

  • directory (string) – Directory to store tuning results

  • tuner (str) – Tuning algorithm, choose between Bayesian and Hyperband

  • regularization_range (list) – List of suitable hyperparameters for weight decay

  • learning_rate_range (list) – List of suitable hyperparameters for learning rate

Returns:

Model prepared for optimization

Return type:

keras.Model
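A minimal usage sketch on synthetic data is shown below. The import path of get_attributes_from_keras_model follows the parameter description above and may differ between hls4ml versions; the model, dataset sizes and hyperparameters are placeholders.

    import numpy as np
    import tensorflow as tf

    from hls4ml.optimization import get_attributes_from_keras_model  # assumed path, see note above
    from hls4ml.optimization.dsp_aware_pruning.keras.builder import build_optimizable_model

    # Toy model and synthetic data, for illustration only
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu', input_shape=(16,)),
        tf.keras.layers.Dense(5, activation='softmax'),
    ])
    attributes = get_attributes_from_keras_model(model)

    X_train = np.random.rand(1024, 16).astype('float32')
    y_train = tf.keras.utils.to_categorical(np.random.randint(5, size=1024), 5)
    X_val = np.random.rand(256, 16).astype('float32')
    y_val = tf.keras.utils.to_categorical(np.random.randint(5, size=256), 5)

    train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(128)
    val_ds = tf.data.Dataset.from_tensor_slices((X_val, y_val)).batch(128)

    optimizable_model = build_optimizable_model(
        model, attributes,
        optimizer=tf.keras.optimizers.Adam(),
        loss_fn=tf.keras.losses.CategoricalCrossentropy(),
        validation_metric=tf.keras.metrics.CategoricalAccuracy(),
        increasing=True,                 # accuracy improves as it increases
        train_dataset=train_ds,
        validation_dataset=val_ds,
        batch_size=128,
        epochs=10,
    )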

hls4ml.optimization.dsp_aware_pruning.keras.builder.remove_custom_regularizers(model)

Helper function to remove custom regularizers (DenseRegularizer & Conv2DRegularizer). This makes it possible to load the model in a different environment, without hls4ml installed

Parameters:

model (keras.Model) – Baseline model

Returns:

Model without custom regularizers

Return type:

keras.Model
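A short sketch, continuing from the example above; removing the regularizers lets the tuned model be saved and reloaded without hls4ml:

    from hls4ml.optimization.dsp_aware_pruning.keras.builder import remove_custom_regularizers

    # Strip DenseRegularizer / Conv2DRegularizer before exporting the model
    clean_model = remove_custom_regularizers(optimizable_model)
    clean_model.save('optimized_model.h5')  # loadable without hls4ml installed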

hls4ml.optimization.dsp_aware_pruning.keras.config module

hls4ml.optimization.dsp_aware_pruning.keras.config.SUPPORTED_LAYERS = (<class 'keras.src.layers.core.dense.Dense'>, <class 'keras.src.layers.convolutional.conv2d.Conv2D'>, <class 'qkeras.qlayers.QDense'>, <class 'qkeras.qconvolutional.QConv2D'>)

Layer types supported for DSP-aware pruning and weight sharing

hls4ml.optimization.dsp_aware_pruning.keras.config.SUPPORTED_METRICS = ('l1', 'l2', 'oracle', 'saliency')

Supported ranking metrics, for classifying redundant (groups of) weights

  1. l1 - groups of weights are ranked by their l1 norm

  2. l2 - groups of weights are ranked by their l2 norm

  3. oracle - abs(dL / dw * w), introduced by Molchanov et al. (2016)

    Pruning Convolutional Neural Networks for Resource Efficient Inference

  4. saliency - (d^2L / dw^2 * w)^2, introduced by LeCun et al. (1989) Optimal Brain Damage

The module also defines the default temporary directory ('hls4ml-optimization-keras') for storing best models, tuning results etc.
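The constants can be used to check which layers of a model are candidates for optimization, as in this brief sketch:

    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.keras.config import SUPPORTED_LAYERS, SUPPORTED_METRICS

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu', input_shape=(16,)),
        tf.keras.layers.Dense(5, activation='softmax'),
    ])

    # SUPPORTED_LAYERS is a tuple of classes, so it can be passed directly to isinstance
    optimizable = [layer.name for layer in model.layers if isinstance(layer, SUPPORTED_LAYERS)]
    print('Optimizable layers:', optimizable)
    print('Supported ranking metrics:', SUPPORTED_METRICS)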

hls4ml.optimization.dsp_aware_pruning.keras.masking module

hls4ml.optimization.dsp_aware_pruning.keras.masking.get_model_masks(keras_model, model_attributes, sparsity, objective, metric='l1', local=False, gradients=None, hessians=None, knapsack_solver='CBC_MIP')

Function calculating a binary mask for all optimizable layers. Entries equal to one correspond to weights that are updated during training; entries equal to zero correspond to weights that are frozen during training.

Masking is such that:
  • resource_utilization <= (1 - sparsity) * baseline_utilization OR

  • resource_saving > sparsity * baseline_utilization [equivalent formulation]

Offsets are used for weight sharing: in the case of weight sharing, the mask is set to zero, so the weights are frozen during training; however, they still need to be set to the mean of their group. Offsets represent the mean of each weight-shared group, so it is important to have offsets only for frozen weights, i.e. where the corresponding entry in the mask tensor is zero.

If a layer supports both weight sharing and pruning, both the norm and the variance of each group are calculated and the smaller one is considered: if the norm is smaller, the group is considered for pruning; otherwise, the group is considered for weight sharing. Both the norm and the variance are normalized, to avoid magnitude biases.

Parameters:
  • keras_model (keras.model) – Model to be masked

  • model_attributes (dict) – A layer-wise dictionary of LayerAttributes classes

  • sparsity (float) – Desired sparsity, with respect to the objective

  • objective (ObjectiveEstimator) – Objective to be minimized (e.g. DSP, FLOPs etc.)

  • metric (string) – Weight ranking metric: l1, l2, oracle or saliency

  • local (boolean) – Equal layer-wise sparsity

  • gradients (dict) – A layer-wise dictionary of weight gradients (needed for Oracle ranking)

  • hessians (dict) – A layer-wise dictionary of second gradients (needed for saliency ranking)

  • knapsack_solver (str) – Algorithm for solving the Knapsack problem; the default is recommended, unless dealing with highly dimensional problems, in which case greedy is better

Returns:

tuple containing

  • masks (dict): Layer-wise dictionary of binary tensors

  • offsets (dict): Layer-wise dictionary of offsets for every weight
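A hedged sketch of a typical call is shown below. ParameterEstimator is used as an example objective; its import path, and passing the estimator class directly, are assumptions to verify against your hls4ml version.

    from hls4ml.optimization import get_attributes_from_keras_model     # assumed path
    from hls4ml.optimization.objectives import ParameterEstimator       # assumed objective
    from hls4ml.optimization.dsp_aware_pruning.keras.masking import get_model_masks

    attributes = get_attributes_from_keras_model(model)

    # Rank groups of weights by their l1 norm and mask enough of them to
    # reduce the objective (here, parameter count) by 50%
    masks, offsets = get_model_masks(
        model, attributes,
        sparsity=0.5,
        objective=ParameterEstimator,
        metric='l1',      # magnitude-based ranking; no gradients or hessians needed
        local=False,      # global sparsity, rather than equal per-layer sparsity
    )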

hls4ml.optimization.dsp_aware_pruning.keras.reduction module

hls4ml.optimization.dsp_aware_pruning.keras.reduction.reduce_model(model)

Function for removing zero neurons & filters from a model and rewiring the model graph. This function is built on top of Keras Surgeon, available at: https://github.com/BenWhetton/keras-surgeon. Keras Surgeon is no longer under active development and does not work with TensorFlow 2.3+ or QKeras. The baseline version was forked and updated; it is available at: https://github.com/fastmachinelearning/keras-surgeon

IMPORTANT: To use this functionality, please install the fork from the GitHub repository above separately.

Parameters:

model (keras.model) – Input model

Returns:

Modified model, with redundant structures removed

Return type:

reduced (keras.model)
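A short sketch of typical use, assuming the keras-surgeon fork above is installed and pruned_model is a placeholder for a Keras model containing zeroed-out neurons or filters:

    from hls4ml.optimization.dsp_aware_pruning.keras.reduction import reduce_model

    # Remove zeroed neurons / filters and rewire the graph
    reduced = reduce_model(pruned_model)
    print(pruned_model.count_params(), '->', reduced.count_params())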

hls4ml.optimization.dsp_aware_pruning.keras.regularizers module

class hls4ml.optimization.dsp_aware_pruning.keras.regularizers.Conv2DRegularizer(alpha, beta=0, norm=1, structure_type=SUPPORTED_STRUCTURES.UNSTRUCTURED, pattern_offset=1, consecutive_patterns=1)

Bases: Regularizer

A flexible regularizer for Conv2D layers, simultaneously performing pruning and clustering

Parameters:
  • alpha (float) – Sparse penalty; a higher value pushes more weights towards zero

  • beta (float) – Variance penalty; a higher value reduces variance between a group of weights

  • norm (int) – Norm type (l1 or l2)

  • structure_type (string) – Type of regularization - unstructured, structured, pattern

  • pattern_offset (int) – Length of each pattern if structure_type == pattern

  • consecutive_patterns (int) – How many consecutive patterns should be considered

  • weights (tf.Variable) – Four-dimensional layer weight tensor, dimensionality (filter_width x filter_height x n_chan x n_filt)

Returns:

Penalty associated with layer weights

Return type:

Regularizer penalty (tf.Variable)

Example use cases:
  • structure_type = unstructured: unstructured weight regularization

  • structure_type = structured: filter regularization

    (group weights of dimensionality filt_width x filt_height x n_chan)

  • structure_type = pattern: regularization on groups of every n-th weight in flattened array

    (e.g. grouping by reuse factor in hls4ml)
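A hedged sketch of attaching the regularizer to a Conv2D layer is shown below; the import path of SUPPORTED_STRUCTURES and the STRUCTURED member name are assumptions inferred from the structure types listed above.

    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.config import SUPPORTED_STRUCTURES  # assumed path
    from hls4ml.optimization.dsp_aware_pruning.keras.regularizers import Conv2DRegularizer

    reg = Conv2DRegularizer(
        alpha=1e-3,                                       # sparsity penalty
        beta=1e-3,                                        # variance (clustering) penalty
        structure_type=SUPPORTED_STRUCTURES.STRUCTURED,   # per-filter regularization (assumed member name)
    )

    # The regularizer is a standard Keras Regularizer, so it attaches as kernel_regularizer
    layer = tf.keras.layers.Conv2D(16, (3, 3), kernel_regularizer=reg, input_shape=(28, 28, 1))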

get_config()

Returns the config of the regularizer.

A regularizer config is a Python dictionary (serializable) containing all configuration parameters of the regularizer. The same regularizer can be reinstantiated later (without any saved state) from this configuration.

This method is optional if you are just training and executing models, exporting to and from SavedModels, or using weight checkpoints.

This method is required for Keras model_to_estimator, saving and loading models to HDF5 formats, Keras model cloning, some visualization utilities, and exporting models to and from JSON.

Returns:

Python dictionary.

class hls4ml.optimization.dsp_aware_pruning.keras.regularizers.DenseRegularizer(alpha, beta=0, norm=1, structure_type=SUPPORTED_STRUCTURES.UNSTRUCTURED, block_shape=(1, 1), pattern_offset=1, consecutive_patterns=1)

Bases: Regularizer

A flexible regularizer for Dense layers, simultaneously penalizing high values and variance

Parameters:
  • alpha (float) – Sparse penalty; a higher value pushes more weights towards zero

  • beta (float) – Variance penalty; a higher value reduces variance between a group of weights

  • norm (int) – Norm type (l1 or l2)

  • structure_type (string) – Type of regularization - unstructured, structured, pattern, block

  • block_shape (tuple) – Block shape if structure_type == block

  • pattern_offset (int) – Length of each pattern if structure_type == pattern

  • consecutive_patterns (int) – How many consecutive patterns should be considered

  • weights (tf.Variable) – Two-dimensional layer weight tensor, dimensionality (M x N)

Returns:

Penalty associated with layer weights

Return type:

Regularizer penalty (tf.Variable)

Examples

  • structure_type = unstructured: unstructured weight regularization

  • structure_type = structured: neuron regularization

    (group weights by row)

  • structure_type = pattern: regularization on groups of every n-th weight

    (e.g. grouping by reuse factor in hls4ml)

  • structure_type = block: regularization on blocks within weight matrix

    (e.g. 4x4, 8x1 for certain SIMD processors)

  • consecutive_patterns is commonly encountered when optimizing BRAM utilization -

    e.g. while it is true that each DSP pattern consumes one DSP, it likely uses less than one BRAM block (e.g. if the BRAM width is 36 bits and the weight width is 16). In that case, several patterns need to be grouped together, so that the entire block of patterns can be removed, thus saving both DSP and BRAM
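A hedged sketch of a pattern-regularized Dense layer is shown below; as above, the SUPPORTED_STRUCTURES import path and PATTERN member name are assumptions, and the pattern_offset value is only illustrative of grouping by reuse factor.

    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.config import SUPPORTED_STRUCTURES  # assumed path
    from hls4ml.optimization.dsp_aware_pruning.keras.regularizers import DenseRegularizer

    reg = DenseRegularizer(
        alpha=1e-3,                                     # sparsity penalty
        beta=1e-3,                                      # variance penalty within each group
        structure_type=SUPPORTED_STRUCTURES.PATTERN,    # group every n-th weight (assumed member name)
        pattern_offset=4,                               # illustrative, e.g. reuse factor of 4
        consecutive_patterns=2,                         # group pairs of patterns (BRAM-aware)
    )
    layer = tf.keras.layers.Dense(64, kernel_regularizer=reg)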

get_config()

Returns the config of the regularizer.

A regularizer config is a Python dictionary (serializable) containing all configuration parameters of the regularizer. The same regularizer can be reinstantiated later (without any saved state) from this configuration.

This method is optional if you are just training and executing models, exporting to and from SavedModels, or using weight checkpoints.

This method is required for Keras model_to_estimator, saving and loading models to HDF5 formats, Keras model cloning, some visualization utilities, and exporting models to and from JSON.

Returns:

Python dictionary.

hls4ml.optimization.dsp_aware_pruning.keras.utils module

hls4ml.optimization.dsp_aware_pruning.keras.utils.get_last_layer_with_weights(model)

Finds the last layer with weights

The last layer with weights determines the output shape, so pruning is sometimes not applicable to it. As an example, consider a network with 16 - 32 - 5 neurons - the last layer's neurons (5) cannot be removed, since they map to the data labels

Parameters:

model (keras.model) – Input model

Returns:

Index location of last layer with params

Return type:

idx (int)

hls4ml.optimization.dsp_aware_pruning.keras.utils.get_model_gradients(model, loss_fn, X, y)

Calculate model gradients with respect to weights

Parameters:
  • model (keras.model) – Input model

  • loss_fn (keras.losses.Loss) – Model loss function

  • X (np.array) – Input data

  • y (np.array) – Output data

Returns:

Per-layer gradients of loss with respect to weights

Return type:

grads (dict)
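A short sketch of computing gradients on synthetic data, e.g. as input to oracle ranking in get_model_masks; the model variable is a placeholder for any supported Keras model:

    import numpy as np
    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_model_gradients

    loss_fn = tf.keras.losses.CategoricalCrossentropy()
    X = np.random.rand(256, 16).astype('float32')
    y = tf.keras.utils.to_categorical(np.random.randint(5, size=256), 5)

    # Per-layer dictionary of dL/dw, consumed by get_model_masks when metric='oracle'
    grads = get_model_gradients(model, loss_fn, X, y)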

hls4ml.optimization.dsp_aware_pruning.keras.utils.get_model_hessians(model, loss_fn, X, y)

Calculate the second derivatives of the loss with respect to model weights.

Note that only the diagonal elements of the Hessian are computed.

Parameters:
  • model (keras.model) – Input model

  • loss_fn (keras.losses.Loss) – Model loss function

  • X (np.array) – Input data

  • y (np.array) – Output data

Returns:

Per-layer second derivatives of loss with respect to weights

Return type:

grads (dict)

hls4ml.optimization.dsp_aware_pruning.keras.utils.get_model_sparsity(model)

Calculate total and per-layer model sparsity

Parameters:

model (keras.Model) – Model to be evaluated

Returns:

tuple containing

  • sparsity (float): Model sparsity, as a percentage of zero weights w.r.t. the total number of model weights

  • layers (dict): Key-value dictionary; each key is a layer name and the associated value is the layer’s sparsity

TODO - Extend support for recurrent layers (recurrent_kernel)
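A brief sketch of monitoring sparsity before and after pruning:

    from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_model_sparsity

    total_sparsity, per_layer = get_model_sparsity(model)
    print('Overall sparsity:', total_sparsity)
    for layer_name, layer_sparsity in per_layer.items():
        print(f'  {layer_name}: {layer_sparsity}')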

Module contents

class hls4ml.optimization.dsp_aware_pruning.keras.MaskedBackprop(model, loss_fn, attributes)

Bases: object

A helper class to perform masked backprop (training with frozen weights). The important function is __call__, as it masks gradients based on frozen weights. While this function could exist without a class, taking masks as an input would deplete memory, since a new graph would be created for every call, causing a large run-time. The trick is to set the masks, model etc. as class variables and then pass the sparsity; as the sparsity changes, a new graph of the function is created.

update_masks(masks)

hls4ml.optimization.dsp_aware_pruning.keras.optimize_model(model, model_attributes, objective, scheduler, X_train, y_train, X_val, y_val, batch_size, epochs, optimizer, loss_fn, validation_metric, increasing, rtol, callbacks=None, ranking_metric='l1', local=False, verbose=False, rewinding_epochs=1, cutoff_bad_trials=1, directory='hls4ml-optimization-keras', tuner='Bayesian', knapsack_solver='CBC_MIP', regularization_range=[1e-06, 1.8478497974222906e-06, 3.414548873833601e-06, 6.30957344480193e-06, 1.165914401179831e-05, 2.1544346900318823e-05, 3.9810717055349695e-05, 7.356422544596421e-05, 0.00013593563908785255, 0.00025118864315095795, 0.00046415888336127773, 0.0008576958985908938, 0.001584893192461114, 0.0029286445646252374, 0.0054116952654646375, 0.01])

Top-level function for optimizing a Keras model, given objectives

Parameters:
  • model (keras.Model) – Model to be optimized

  • model_attributes (dict) – Layer-wise model attributes, obtained from hls4ml.optimization.get_attributes_from_keras_model(…)

  • objective (hls4ml.optimization.objectives.ObjectiveEstimator) – Parameter, hardware or user-defined objective of optimization

  • scheduler (hls4ml.optimization.scheduler.OptimizationScheduler) – Sparsity scheduler, choose between constant, polynomial and binary

  • X_train (np.array) – Training inputs

  • y_train (np.array) – Training labels

  • X_val (np.array) – Validation inputs

  • y_val (np.array) – Validation labels

  • batch_size (int) – Batch size during training

  • epochs (int) – Maximum number of epochs to fine-tune model, in one iteration of pruning

  • optimizer (keras.optimizers.Optimizer or equivalent-string description) – Optimizer used during training

  • loss_fn (keras.losses.Loss or equivalent loss description) – Loss function used during training

  • validation_metric (keras.metrics.Metric or equivalent loss description) – Validation metric, used as a baseline

  • increasing (boolean) – If the metric improves with increased values; e.g. accuracy -> increasing = True, MSE -> increasing = False

  • rtol (float) – Relative tolerance; pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric

  • callbacks (list of keras.callbacks.Callback)

  • ranking_metric (string) – Metric used for ranking weights and structures; currently supported: l1, l2, saliency and oracle

  • local (boolean) – Layer-wise or global pruning

  • verbose (boolean) – Display debug logs during model optimization

  • rewinding_epochs (int) – Number of epochs to retrain model without weight freezing, allows regrowth of previously pruned weights

  • cutoff_bad_trials (int) – After how many bad trials (performance below threshold), should model pruning / weight sharing stop

  • directory (string) – Directory to store temporary results

  • tuner (str) – Tuning algorithm, choose between Bayesian, Hyperband and None

  • knapsack_solver (str) – Algorithm to solve Knapsack problem when optimizing; default usually works well; for very large networks, greedy algorithm might be more suitable

  • regularization_range (list) – List of suitable hyperparameters for weight decay

Returns:

Optimized model

Return type:

keras.Model
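A hedged end-to-end sketch is shown below. The objective (ParameterEstimator), the scheduler class and its constructor arguments, and the import path of get_attributes_from_keras_model are assumptions about naming and module layout; verify them against your hls4ml version. The data is synthetic and the hyperparameters are placeholders.

    import numpy as np
    import tensorflow as tf

    from hls4ml.optimization import get_attributes_from_keras_model      # assumed path
    from hls4ml.optimization.objectives import ParameterEstimator        # assumed objective
    from hls4ml.optimization.scheduler import PolynomialScheduler        # assumed scheduler name
    from hls4ml.optimization.dsp_aware_pruning.keras import optimize_model

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(16,)),
        tf.keras.layers.Dense(32, activation='relu'),
        tf.keras.layers.Dense(5, activation='softmax'),
    ])
    attributes = get_attributes_from_keras_model(model)

    X_train = np.random.rand(2048, 16).astype('float32')
    y_train = tf.keras.utils.to_categorical(np.random.randint(5, size=2048), 5)
    X_val = np.random.rand(512, 16).astype('float32')
    y_val = tf.keras.utils.to_categorical(np.random.randint(5, size=512), 5)

    optimized = optimize_model(
        model, attributes,
        objective=ParameterEstimator,
        scheduler=PolynomialScheduler(5, final_sparsity=0.5),   # assumed constructor arguments
        X_train=X_train, y_train=y_train,
        X_val=X_val, y_val=y_val,
        batch_size=128, epochs=10,
        optimizer='adam',
        loss_fn='categorical_crossentropy',
        validation_metric='accuracy',
        increasing=True,   # accuracy improves as it increases
        rtol=0.98,         # stop when accuracy falls below 98% of the baseline
    )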