hls4ml.optimization.dsp_aware_pruning.keras package
Submodules
hls4ml.optimization.dsp_aware_pruning.keras.builder module
- class hls4ml.optimization.dsp_aware_pruning.keras.builder.HyperOptimizationModel(model, attributes, optimizer, loss_fn, validation_metric, regularization_range)
Bases:
HyperModel
Helper class for Keras Tuner
- build(hp)
Builds a model.
- Parameters:
hp – A HyperParameters instance.
- Returns:
A model instance.
- hls4ml.optimization.dsp_aware_pruning.keras.builder.build_optimizable_model(model, attributes, optimizer, loss_fn, validation_metric, increasing, train_dataset, validation_dataset, batch_size, epochs, verbose=False, directory='hls4ml-optimization-keras', tuner='Bayesian', regularization_range=[1e-06, 1.8478497974222906e-06, 3.414548873833601e-06, 6.30957344480193e-06, 1.165914401179831e-05, 2.1544346900318823e-05, 3.9810717055349695e-05, 7.356422544596421e-05, 0.00013593563908785255, 0.00025118864315095795, 0.00046415888336127773, 0.0008576958985908938, 0.001584893192461114, 0.0029286445646252374, 0.0054116952654646375, 0.01])
Function identifying optimizable layers and adding a regularization loss
Notes:
- In general, the regularization and learning rate ranges do not need to be provided, as the implementation sets a generic enough range. However, if the user has an idea of the likely hyperparameter ranges, tuning will complete faster.
- The default tuner is Bayesian; when coupled with suitable hyperparameter ranges it performs well and quickly. However, older versions of Keras Tuner had a crashing bug with it.
- In general, the directory does not need to be specified. However, when pruning several models simultaneously, specifying the directory avoids conflicting intermediate results.
- Parameters:
model (keras.Model) – Model to be optimized
attributes (dict) – Layer-wise model attributes, obtained from hls4ml.optimization.get_attributes_from_keras_model()
optimizer (keras.optimizers.Optimizer) – Optimizer used during training
loss_fn (keras.losses.Loss) – Loss function used during training
validation_metric (keras.metrics.Metric) – Validation metric, used as a baseline
train_dataset (tf.Dataset) – Training inputs and labels, in the form of an iterable TF Dataset
validation_dataset (tf.Dataset) – Validation inputs and labels, in the form of an iterable TF Dataset
batch_size (int) – Batch size during training
epochs (int) – Maximum number of epochs to fine-tune model, in one iteration of pruning
verbose (bool) – Whether to log tuner outputs to the console
directory (string) – Directory to store tuning results
tuner (str) – Tuning algorithm, choose between Bayesian and Hyperband
regularization_range (list) – List of suitable hyperparameters for weight decay
learning_rate_range (list) – List of suitable hyperparameters for learning rate
- Returns:
Model prepared for optimization
- Return type:
keras.Model
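A minimal usage sketch is shown below. The toy model, data and hyperparameters are illustrative assumptions, and the import path of the attribute helper may differ between hls4ml versions:

    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.keras.builder import build_optimizable_model
    # Assumed import path for the attribute helper referenced in the parameter list above
    from hls4ml.optimization.dsp_aware_pruning.attributes import get_attributes_from_keras_model

    # Toy model and data, stand-ins for a real use case
    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(16,)),
        tf.keras.layers.Dense(32, activation='relu'),
        tf.keras.layers.Dense(5, activation='softmax'),
    ])
    X = tf.random.normal((256, 16))
    y = tf.random.uniform((256,), maxval=5, dtype=tf.int32)
    train_ds = tf.data.Dataset.from_tensor_slices((X, y)).batch(64)
    val_ds = tf.data.Dataset.from_tensor_slices((X, y)).batch(64)

    attributes = get_attributes_from_keras_model(model)

    # Adds regularization losses to optimizable layers and tunes their strength
    optimizable_model = build_optimizable_model(
        model,
        attributes,
        tf.keras.optimizers.Adam(),
        tf.keras.losses.SparseCategoricalCrossentropy(),
        tf.keras.metrics.SparseCategoricalAccuracy(),
        increasing=True,  # accuracy improves as it increases
        train_dataset=train_ds,
        validation_dataset=val_ds,
        batch_size=64,
        epochs=2,
    )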
- hls4ml.optimization.dsp_aware_pruning.keras.builder.remove_custom_regularizers(model)
Helper function to remove custom regularizers (DenseRegularizer & Conv2DRegularizer). This makes it possible to load the model in a different environment without hls4ml installed
- Parameters:
model (keras.Model) – Baseline model
- Returns:
Model without custom regularizers
- Return type:
keras.Model
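For example, before exporting an optimized model for use in an environment without hls4ml (continuing the sketch above; the file name is illustrative):

    from hls4ml.optimization.dsp_aware_pruning.keras.builder import remove_custom_regularizers

    # Strip DenseRegularizer / Conv2DRegularizer so the saved model loads without hls4ml
    clean_model = remove_custom_regularizers(optimizable_model)
    clean_model.save('optimized_model.h5')  # illustrative file name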
hls4ml.optimization.dsp_aware_pruning.keras.config module
- hls4ml.optimization.dsp_aware_pruning.keras.config.SUPPORTED_LAYERS = (<class 'keras.src.layers.core.dense.Dense'>, <class 'keras.src.layers.convolutional.conv2d.Conv2D'>, <class 'qkeras.qlayers.QDense'>, <class 'qkeras.qconvolutional.QConv2D'>)
Layer types supported for DSP-aware pruning: Dense, Conv2D and their QKeras equivalents
- hls4ml.optimization.dsp_aware_pruning.keras.config.SUPPORTED_METRICS = ('l1', 'l2', 'oracle', 'saliency')
Supported ranking metrics, for classifying redundant (groups of) weights:
l1 - groups of weights are ranked by their l1 norm
l2 - groups of weights are ranked by their l2 norm
oracle - abs(dL / dw * w), introduced by Molchanov et al. (2016), Pruning Convolutional Neural Networks for Resource Efficient Inference
saliency - (d^2L / dw^2 * w)^2, introduced by LeCun et al. (1989), Optimal Brain Damage
The module also defines the default temporary directory ('hls4ml-optimization-keras') for storing best models, tuning results etc.
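A small sketch of how these constants are typically used when inspecting a model (the model variable continues from the earlier sketch):

    from hls4ml.optimization.dsp_aware_pruning.keras.config import SUPPORTED_LAYERS, SUPPORTED_METRICS

    # Only Dense / Conv2D layers (and their QKeras equivalents) are considered for optimization
    optimizable_layers = [layer.name for layer in model.layers if isinstance(layer, SUPPORTED_LAYERS)]

    # Ranking metrics are passed as lower-case strings, e.g. to get_model_masks(...)
    assert 'oracle' in SUPPORTED_METRICS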
hls4ml.optimization.dsp_aware_pruning.keras.masking module
- hls4ml.optimization.dsp_aware_pruning.keras.masking.get_model_masks(keras_model, model_attributes, sparsity, objective, metric='l1', local=False, gradients=None, hessians=None, knapsack_solver='CBC_MIP')
Function calculating a binary mask for all optimizable layers. Entries equal to one correspond to weights that are updated during training; entries equal to zero correspond to weights that are frozen during training.
- Masking is such that:
resource_utilization <= (1 - sparsity) * baseline_utilization OR
resource_saving > sparsity * baseline_utilization [equivalent formulation]
Offsets are used for weight sharing - in the case of weight sharing, the mask is set to zero, so the weights are frozen during training; however, they still need to equal the mean of their group. Offsets represent the mean of each weight-shared group; therefore, offsets should only be set for frozen weights, that is, where the corresponding entry in the mask tensor is zero.
If a layer supports both weight sharing and pruning, both the norm and the variance of each group are calculated and the smaller one is considered: if the norm is smaller, the group is considered for pruning; otherwise, the group is considered for weight sharing. Both the norm and the variance are normalized, to avoid magnitude biases.
- Parameters:
keras_model (keras.model) – Model to be masked
model_attributes (dict) – A layer-wise dictionary of LayerAttributes classes
sparsity (float) – Desired sparsity, with respect to the objective
objective (ObjectiveEstimator) – Objective to be minimized (e.g. DSP, FLOPs etc.)
metric (string) – Weight ranking metric - l1, l2, Oracle, saliency
local (boolean) – Equal layer-wise sparsity
gradients (dict) – A layer-wise dictionary of weight gradients (needed for Oracle ranking)
hessians (dict) – A layer-wise dictionary of second gradients (needed for saliency ranking)
knapsack_solver (str) – Algorithm for solving the Knapsack problem; the default is recommended, unless dealing with very high-dimensional problems, in which case greedy is better
- Returns:
tuple containing
masks (dict): Layer-wise dictionary of binary tensors
offsets (dict): Layer-wise dictionary of offsets for every weight
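A sketch of masking 50% of the model's parameters with l1 ranking. The ParameterEstimator import path is an assumption and may differ between hls4ml versions; model and attributes continue from the earlier sketches:

    from hls4ml.optimization.dsp_aware_pruning.keras.masking import get_model_masks
    # Assumed import path for a parameter-count objective
    from hls4ml.optimization.dsp_aware_pruning.objectives import ParameterEstimator

    masks, offsets = get_model_masks(
        model,
        attributes,
        sparsity=0.5,               # target: resource_utilization <= (1 - 0.5) * baseline
        objective=ParameterEstimator,
        metric='l1',                # l1 / l2 ranking needs no gradients or Hessians
        local=False,                # global ranking across layers
        knapsack_solver='CBC_MIP',
    )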
hls4ml.optimization.dsp_aware_pruning.keras.reduction module
- hls4ml.optimization.dsp_aware_pruning.keras.reduction.reduce_model(model)
Function for removing zero neurons & filters from a model and rewiring the model graph. This function is built on top of Keras Surgeon, available at: https://github.com/BenWhetton/keras-surgeon. Keras Surgeon is no longer under active development and does not work with TensorFlow 2.3+ or QKeras. The baseline version was forked and updated, and is available at: https://github.com/fastmachinelearning/keras-surgeon
IMPORTANT: To use this functionality, please install the fork separately from the GitHub repository above.
- Parameters:
model (keras.model) – Input model
- Returns:
Modified model, with redundant structures removed
- Return type:
reduced (keras.model)
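A sketch of the intended use, assuming a model whose redundant neurons/filters have already been zeroed out by structured pruning, and that the maintained keras-surgeon fork is installed:

    from hls4ml.optimization.dsp_aware_pruning.keras.reduction import reduce_model

    # Removes zeroed-out neurons / filters and rewires the model graph
    reduced_model = reduce_model(optimizable_model)
    reduced_model.summary()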
hls4ml.optimization.dsp_aware_pruning.keras.regularizers module
- class hls4ml.optimization.dsp_aware_pruning.keras.regularizers.Conv2DRegularizer(alpha, beta=0, norm=1, structure_type=SUPPORTED_STRUCTURES.UNSTRUCTURED, pattern_offset=1, consecutive_patterns=1)
Bases:
Regularizer
A flexible regularizer for Conv2D layers, simultaneously performing pruning and clustering
- Parameters:
alpha (float) – Sparse penalty; a higher value pushes more weights towards zero
beta (float) – Variance penalty; a higher value reduces variance between a group of weights
norm (int) – Norm type (l1 or l2)
structure_type (string) – Type of regularization - unstructured, structured, pattern
pattern_offset (int) – Length of each pattern if structure_type == pattern
consecutive_patterns (int) – How many consecutive patterns should be considered
weights (tf.Variable) – Four-dimensional layer weight tensor, dimensionality (filter_width x filter_height x n_chan x n_filt)
- Returns:
Penalty associated with layer weights
- Return type:
Regularizer penalty (tf.Variable)
- Example use cases:
structure_type = unstructured: unstructured weight regularization
structure_type = structured: filter regularization (group weights of dimensionality filt_width x filt_height x n_chan)
structure_type = pattern: regularization on groups of every n-th weight in a flattened array (e.g. grouping by reuse factor in hls4ml)
- get_config()
Returns the config of the regularizer.
A regularizer config is a Python dictionary (serializable) containing all configuration parameters of the regularizer. The same regularizer can be reinstantiated later (without any saved state) from this configuration.
This method is optional if you are just training and executing models, exporting to and from SavedModels, or using weight checkpoints.
This method is required for Keras model_to_estimator, saving and loading models to HDF5 formats, Keras model cloning, some visualization utilities, and exporting models to and from JSON.
- Returns:
Python dictionary.
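A sketch of attaching the regularizer to a Conv2D layer for filter-level (structured) regularization; the SUPPORTED_STRUCTURES import path and the penalty values are assumptions that may differ between hls4ml versions:

    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.keras.regularizers import Conv2DRegularizer
    from hls4ml.optimization.dsp_aware_pruning.config import SUPPORTED_STRUCTURES  # assumed path

    # Push whole filters (filt_width x filt_height x n_chan groups) towards zero
    regularizer = Conv2DRegularizer(
        alpha=1e-4,                                     # sparsity penalty
        beta=1e-4,                                      # variance penalty within each group
        norm=1,
        structure_type=SUPPORTED_STRUCTURES.STRUCTURED,
    )
    conv = tf.keras.layers.Conv2D(16, (3, 3), kernel_regularizer=regularizer)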
- class hls4ml.optimization.dsp_aware_pruning.keras.regularizers.DenseRegularizer(alpha, beta=0, norm=1, structure_type=SUPPORTED_STRUCTURES.UNSTRUCTURED, block_shape=(1, 1), pattern_offset=1, consecutive_patterns=1)
Bases:
Regularizer
A flexible regularizer for Dense layers, simultaneously penalizing high values and variance
- Parameters:
alpha (float) – Sparse penalty; a higher value pushes more weights towards zero
beta (float) – Variance penalty; a higher value reduces variance between a group of weights
norm (int) – Norm type (l1 or l2)
structure_type (string) – Type of regularization - unstructured, structured, pattern, block
block_shape (tuple) – Block shape if structure_type == block
pattern_offset (int) – Length of each pattern if structure_type == pattern
consecutive_patterns (int) – How many consecutive patterns should be considered
weights (tf.Variable) – Two-dimensional layer weight tensor, dimensionality (M x N)
- Returns:
Penalty associated with layer weights
- Return type:
Regularizer penalty (tf.Variable)
Examples
structure_type = unstructured: unstructured weight regularization
structure_type = structured: neuron regularization (group weights by row)
structure_type = pattern: regularization on groups of every n-th weight (e.g. grouping by reuse factor in hls4ml)
structure_type = block: regularization on blocks within the weight matrix (e.g. 4x4, 8x1 for certain SIMD processors)
consecutive_patterns is commonly encountered when optimizing BRAM utilization - e.g. while each DSP pattern consumes one DSP, it likely uses less than one BRAM block (e.g. if the BRAM width is 36 bits and the weight width is 16). In that case, several patterns need to be grouped together, so that the entire block of patterns can be removed, saving both DSP and BRAM.
- get_config()
Returns the config of the regularizer.
A regularizer config is a Python dictionary (serializable) containing all configuration parameters of the regularizer. The same regularizer can be reinstantiated later (without any saved state) from this configuration.
This method is optional if you are just training and executing models, exporting to and from SavedModels, or using weight checkpoints.
This method is required for Keras model_to_estimator, saving and loading models to HDF5 formats, Keras model cloning, some visualization utilities, and exporting models to and from JSON.
- Returns:
Python dictionary.
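A sketch of pattern-style regularization on a Dense layer, e.g. to group weights by hls4ml reuse factor; the penalty values, pattern length and the SUPPORTED_STRUCTURES import path are illustrative assumptions:

    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.keras.regularizers import DenseRegularizer
    from hls4ml.optimization.dsp_aware_pruning.config import SUPPORTED_STRUCTURES  # assumed path

    # Group every n-th weight of the flattened kernel into patterns and penalize whole patterns
    regularizer = DenseRegularizer(
        alpha=1e-4,                                  # sparsity penalty
        beta=1e-3,                                   # pull weights within a pattern towards their mean
        structure_type=SUPPORTED_STRUCTURES.PATTERN,
        pattern_offset=32,                           # illustrative pattern length, see description above
        consecutive_patterns=2,                      # group adjacent patterns, e.g. to free a whole BRAM block
    )
    dense = tf.keras.layers.Dense(64, kernel_regularizer=regularizer)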
hls4ml.optimization.dsp_aware_pruning.keras.utils module
- hls4ml.optimization.dsp_aware_pruning.keras.utils.get_last_layer_with_weights(model)
Finds the last layer with weights
The last layer with weights determines the output shape, so pruning is sometimes not applicable to it. As an example, consider a network with 16 - 32 - 5 neurons: the last layer's neurons (5) cannot be removed, since they map to the data labels
- Parameters:
model (keras.model) – Input model
- Returns:
Index location of last layer with params
- Return type:
idx (int)
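For example, to exclude the output layer from pruning (a sketch; the model variable continues from the earlier examples):

    from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_last_layer_with_weights

    idx = get_last_layer_with_weights(model)
    # Layers from idx onwards map to the output labels, so they are typically left unpruned
    prunable_layers = model.layers[:idx]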
- hls4ml.optimization.dsp_aware_pruning.keras.utils.get_model_gradients(model, loss_fn, X, y)
Calculate model gradients with respect to weights
- Parameters:
model (keras.model) – Input model
loss_fn (keras.losses.Loss) – Model loss function
X (np.array) – Input data
y (np.array) – Output data
- Returns:
Per-layer gradients of loss with respect to weights
- Return type:
grads (dict)
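A sketch of computing gradients on a representative batch for the 'oracle' ranking metric; the batch is illustrative, and model, attributes and ParameterEstimator continue from the earlier sketches:

    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.keras.masking import get_model_masks
    from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_model_gradients

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
    X_batch = tf.random.normal((64, 16)).numpy()
    y_batch = tf.random.uniform((64,), maxval=5, dtype=tf.int32).numpy()

    gradients = get_model_gradients(model, loss_fn, X_batch, y_batch)
    masks, offsets = get_model_masks(
        model, attributes, 0.5, ParameterEstimator, metric='oracle', gradients=gradients
    )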
- hls4ml.optimization.dsp_aware_pruning.keras.utils.get_model_hessians(model, loss_fn, X, y)
Calculate the second derivatives of the loss with respect to model weights.
Note that only the diagonal elements of the Hessian are computed.
- Parameters:
model (keras.model) – Input model
loss_fn (keras.losses.Loss) – Model loss function
X (np.array) – Input data
y (np.array) – Output data
- Returns:
Per-layer second derivatives of loss with respect to weights
- Return type:
hessians (dict)
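Analogously, the diagonal Hessian approximation feeds the 'saliency' ranking metric (a sketch continuing the example above):

    from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_model_hessians

    hessians = get_model_hessians(model, loss_fn, X_batch, y_batch)
    masks, offsets = get_model_masks(
        model, attributes, 0.5, ParameterEstimator, metric='saliency', hessians=hessians
    )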
- hls4ml.optimization.dsp_aware_pruning.keras.utils.get_model_sparsity(model)
Calculate total and per-layer model sparsity
- Parameters:
model (keras.model) – Model to be evaluated
- Returns:
tuple containing
sparsity (float): Model sparsity, as a percentage of zero weights w.r.t. the total number of model weights
layers (dict): Key-value dictionary; each key is a layer name and the associated value is the layer’s sparsity
TODO - Extend support for recurrent layers (recurrent_kernel)
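For example, to report sparsity after pruning (a sketch; the model variable continues from the earlier examples):

    from hls4ml.optimization.dsp_aware_pruning.keras.utils import get_model_sparsity

    total_sparsity, per_layer_sparsity = get_model_sparsity(model)
    print('Total sparsity:', total_sparsity)
    for layer_name, layer_sparsity in per_layer_sparsity.items():
        print(layer_name, layer_sparsity)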
Module contents
- class hls4ml.optimization.dsp_aware_pruning.keras.MaskedBackprop(model, loss_fn, attributes)
Bases:
object
A helper class to perform masked backpropagation (training with frozen weights). The important function is __call__, as it masks gradients based on the frozen weights. While this function could exist without a class, taking the masks as an input would deplete memory, since a new graph would be created for every call, causing a large run-time. The trick is to set the masks, model etc. as class variables and then pass only the sparsity: as the sparsity changes, a new graph of the function is created.
- update_masks(masks)
Update the stored layer-wise binary masks used when masking gradients.
- hls4ml.optimization.dsp_aware_pruning.keras.optimize_model(model, model_attributes, objective, scheduler, X_train, y_train, X_val, y_val, batch_size, epochs, optimizer, loss_fn, validation_metric, increasing, rtol, callbacks=None, ranking_metric='l1', local=False, verbose=False, rewinding_epochs=1, cutoff_bad_trials=1, directory='hls4ml-optimization-keras', tuner='Bayesian', knapsack_solver='CBC_MIP', regularization_range=[1e-06, 1.8478497974222906e-06, 3.414548873833601e-06, 6.30957344480193e-06, 1.165914401179831e-05, 2.1544346900318823e-05, 3.9810717055349695e-05, 7.356422544596421e-05, 0.00013593563908785255, 0.00025118864315095795, 0.00046415888336127773, 0.0008576958985908938, 0.001584893192461114, 0.0029286445646252374, 0.0054116952654646375, 0.01])
Top-level function for optimizing a Keras model, given objectives
- Parameters:
model (keras.Model) – Model to be optimized
model_attributes (dict) – Layer-wise model attributes, obtained from hls4ml.optimization.get_attributes_from_keras_model(…)
objective (hls4ml.optimization.objectives.ObjectiveEstimator) – Parameter, hardware or user-defined objective of optimization
scheduler (hls4ml.optimization.scheduler.OptimizationScheduler) – Sparsity scheduler, choose between constant, polynomial and binary
X_train (np.array) – Training inputs
y_train (np.array) – Training labels
X_val (np.array) – Validation inputs
y_val (np.array) – Validation labels
batch_size (int) – Batch size during training
epochs (int) – Maximum number of epochs to fine-tune model, in one iteration of pruning
optimizer (keras.optimizers.Optimizer or equivalent-string description) – Optimizer used during training
loss_fn (keras.losses.Loss or equivalent loss description) – Loss function used during training
validation_metric (keras.metrics.Metric or equivalent string description) – Validation metric, used as a baseline
increasing (boolean) – If the metric improves with increased values; e.g. accuracy -> increasing = True, MSE -> increasing = False
rtol (float) – Relative tolerance; pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric
callbacks (list of keras.callbacks.Callback)
ranking_metric (string) – Metric used for ranking weights and structures; currently supported l1, l2, saliency and Oracle
local (boolean) – Layer-wise or global pruning
verbose (boolean) – Display debug logs during model optimization
rewinding_epochs (int) – Number of epochs to retrain model without weight freezing, allows regrowth of previously pruned weights
cutoff_bad_trials (int) – After how many bad trials (performance below threshold), should model pruning / weight sharing stop
directory (string) – Directory to store temporary results
tuner (str) – Tuning algorithm, choose between Bayesian, Hyperband and None
knapsack_solver (str) – Algorithm to solve Knapsack problem when optimizing; default usually works well; for very large networks, greedy algorithm might be more suitable
regularization_range (list) – List of suitable hyperparameters for weight decay
- Returns:
Optimized model
- Return type:
keras.Model
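An end-to-end sketch is shown below. The toy model and data are illustrative, and the import paths for the attribute helper, objective and scheduler (as well as the scheduler constructor argument) are assumptions that may differ between hls4ml versions:

    import numpy as np
    import tensorflow as tf

    from hls4ml.optimization.dsp_aware_pruning.keras import optimize_model
    # Assumed import paths - adjust to your hls4ml version
    from hls4ml.optimization.dsp_aware_pruning.attributes import get_attributes_from_keras_model
    from hls4ml.optimization.dsp_aware_pruning.objectives import ParameterEstimator
    from hls4ml.optimization.dsp_aware_pruning.scheduler import ConstantSparsityScheduler

    # Toy model and data, stand-ins for a real training setup
    model = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(16,)),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(5, activation='softmax'),
    ])
    X = np.random.rand(512, 16).astype('float32')
    y = np.random.randint(0, 5, 512)

    attributes = get_attributes_from_keras_model(model)

    optimized_model = optimize_model(
        model,
        attributes,
        objective=ParameterEstimator,
        scheduler=ConstantSparsityScheduler(0.5),  # assumed constructor: target 50% sparsity
        X_train=X, y_train=y,
        X_val=X, y_val=y,
        batch_size=64,
        epochs=5,
        optimizer=tf.keras.optimizers.Adam(),
        loss_fn=tf.keras.losses.SparseCategoricalCrossentropy(),
        validation_metric=tf.keras.metrics.SparseCategoricalAccuracy(),
        increasing=True,   # accuracy: higher is better
        rtol=0.98,         # stop when accuracy drops below 98% of the baseline
        ranking_metric='l1',
    )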