hls4ml.optimization package

Submodules

hls4ml.optimization.attributes module

class hls4ml.optimization.attributes.LayerAttributes(name, layer_type, inbound_layers, weight_shape, input_shape, output_shape, optimizable, optimization_attributes, args)

Bases: object

A class for storing layer information

Parameters:
  • name (string) – Layer name

  • layer_type (keras.Layer) – Layer type (e.g. Dense, Conv2D etc.)

  • inbound_layers (list) – List of parent nodes, identified by name

  • weight_shape (tuple) – Layer weight shape

  • input_shape (tuple) – Layer input shape

  • output_shape (tuple) – Layer output shape

  • optimizable (bool) – Should optimizations (pruning, weight sharing) be applied to this layer

  • optimization_attributes (OptimizationAttributes) – Type of optimization, pruning or weight sharing, block shape and pattern offset

  • args (dict) – Additional information, e.g. hls4mlAttributes; dictionary so it can be generic enough for different platforms

update_args(updates)
class hls4ml.optimization.attributes.OptimizationAttributes(structure_type=SUPPORTED_STRUCTURES.UNSTRUCTURED, pruning=False, weight_sharing=False, block_shape=(1, 1), pattern_offset=1, consecutive_patterns=1)

Bases: object

A class for storing layer optimization attributes

Parameters:
  • structure_type (enum) – Targeted structure - unstructured, structured, pattern, block

  • pruning (boolean) – Should pruning be applied to the layer

  • weight_sharing (boolean) – Should weight sharing be applied to the layer

  • block_shape (tuple) – Block shape if structure_type == block

  • pattern_offset (int) – Length of each pattern if structure_type == pattern

  • consecutive_patterns (int) – How many consecutive patterns are grouped together if structure_type == pattern

Notes

  • In the case of hls4ml, pattern_offset is equivalent to the number of weights processed in parallel

  • The pattern_offset is n_in * n_out / reuse_factor; default case (=1) is equivalent to no unrolling
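
A short sketch of constructing pattern-pruning attributes for a Dense layer, illustrating the pattern_offset relation above (the layer dimensions and reuse factor are illustrative):

    from hls4ml.optimization.attributes import OptimizationAttributes
    from hls4ml.optimization.config import SUPPORTED_STRUCTURES

    # Illustrative Dense layer dimensions and reuse factor
    n_in, n_out, reuse_factor = 16, 32, 8

    # pattern_offset = n_in * n_out / reuse_factor,
    # i.e. the number of weights processed in parallel
    pattern_offset = (n_in * n_out) // reuse_factor

    opt_attributes = OptimizationAttributes(
        structure_type=SUPPORTED_STRUCTURES.PATTERN,
        pruning=True,
        weight_sharing=False,
        pattern_offset=pattern_offset,
        consecutive_patterns=1,
    )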

hls4ml.optimization.attributes.get_attributes_from_keras_model(model)

Given a Keras model, builds a dictionary of class attributes. Additional arguments (e.g. reuse factor) depend on the target hardware platform and are inserted later. Per-layer pruning type (structured, pattern etc.) depends on the pruning objective and is inserted later.

Parameters:

model (keras.model) – Model to extract attributes from

Returns:

Each key corresponds to a layer name, values are instances of LayerAttributes

Return type:

model_attributes (dict)
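
A minimal usage sketch (the model architecture is illustrative):

    from tensorflow import keras
    from hls4ml.optimization.attributes import get_attributes_from_keras_model

    # Illustrative two-layer model
    model = keras.Sequential([
        keras.Input(shape=(16,)),
        keras.layers.Dense(32, name='dense_1'),
        keras.layers.Dense(5, name='dense_2'),
    ])

    model_attributes = get_attributes_from_keras_model(model)
    for name, attributes in model_attributes.items():
        print(name, attributes.weight_shape, attributes.optimizable)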

hls4ml.optimization.attributes.get_attributes_from_keras_model_and_hls4ml_config(model, config)

Given a Keras model and an hls4ml configuration, builds a dictionary of class attributes. Per-layer pruning type (structured, pattern etc.) depends on the pruning objective and is inserted later.

Parameters:
  • model (keras.model) – Model to extract attributes from

  • config (dict) – hls4ml dictionary

Returns:

Each key corresponds to a layer name, values are LayerAttributes instances

Return type:

model_attributes (dict)
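
A sketch combining this function with hls4ml's config utility (the model and granularity are illustrative):

    import hls4ml
    from tensorflow import keras
    from hls4ml.optimization.attributes import get_attributes_from_keras_model_and_hls4ml_config

    model = keras.Sequential([
        keras.Input(shape=(16,)),
        keras.layers.Dense(32, name='dense_1'),
        keras.layers.Dense(5, name='dense_2'),
    ])

    # Per-layer granularity, so each layer carries its own precision and reuse factor
    config = hls4ml.utils.config_from_keras_model(model, granularity='name')
    model_attributes = get_attributes_from_keras_model_and_hls4ml_config(model, config)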

class hls4ml.optimization.attributes.hls4mlAttributes(n_in, n_out, io_type, strategy, weight_precision, output_precision, reuse_factor, parallelization_factor=1)

Bases: object

A class for storing hls4ml information of a single layer

Parameters:
  • n_in (int) – Number of inputs (rows) for Dense matrix multiplication

  • n_out (int) – Number of outputs (cols) for Dense matrix multiplication

  • io_type (string) – io_parallel or io_stream

  • strategy (string) – Resource or Latency

  • weight_precision (FixedPrecisionType) – Layer weight precision

  • output_precision (FixedPrecisionType) – Layer output precision

  • reuse_factor (int) – Layer reuse factor

  • parallelization_factor (int) – Layer parallelization factor - [applicable to io_parallel Conv2D]
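
An illustrative construction; the FixedPrecisionType import path and all values are assumptions for the sketch:

    from hls4ml.model.types import FixedPrecisionType  # assumed import path
    from hls4ml.optimization.attributes import hls4mlAttributes

    hls_attributes = hls4mlAttributes(
        n_in=16,
        n_out=32,
        io_type='io_parallel',
        strategy='Resource',
        weight_precision=FixedPrecisionType(width=16, integer=6),
        output_precision=FixedPrecisionType(width=16, integer=6),
        reuse_factor=8,
    )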

hls4ml.optimization.config module

class hls4ml.optimization.config.SUPPORTED_STRUCTURES(value)

Bases: Enum

An enumeration.

BLOCK = 'block'
PATTERN = 'pattern'
STRUCTURED = 'structured'
UNSTRUCTURED = 'unstructured'

hls4ml.optimization.knapsack module

hls4ml.optimization.knapsack.solve_knapsack(values, weights, capacity, implementation='CBC_MIP', **kwargs)

A function for solving the Knapsack problem

Parameters:
  • values (np.array) – One-dimensional array, where each entry is the value of an item

  • weights (np.array) – Matrix with one row per knapsack; each row holds the weights of every item in that knapsack

  • capacity (np.array) – One-dimensional array, where each entry is the maximum weight (capacity) of a knapsack

  • implementation (string) – Algorithm used to solve the Knapsack problem: dynamic programming, greedy, branch and bound, or CBC MIP

  • time_limit (float) – Limit (in seconds) after which the CBC or branch & bound solver should stop looking for a solution and return the best one found so far

  • scaling_factor (float) – Scaling factor for floating-point values in CBC or B&B

Returns:

tuple containing

  • optimal_value (float): The optimal value of the knapsack (total value of the selected items)

  • selected_items (list): A list of indices, corresponding to the selected elements

Notes

  • The general formulation of the Knapsack problem for N items and M knapsacks is:

        max  v.T @ x
        s.t. A @ x <= W

    where v ~ (N, 1) holds the item values, x ~ (N, 1) is the selection vector with x_i ∈ {0, 1}, A ~ (M, N) holds the weights of every item in each knapsack, W ~ (M, 1) holds the knapsack capacities, and <= is the generalized, element-wise inequality for vectors

  • Supported implementations:
    • Dynamic programming:
      • Optimal solution

      • Time complexity: O(nW)

      • Suitable for single-dimensional constraints and a medium number of items, with integer weights

    • Branch and bound:
      • Optimal

      • Solved using Google OR-Tools

      • Suitable for multi-dimensional constraints and a large number of items

    • CBC MIP:
      • Solution sub-optimal, but often better than greedy

      • Solved using Google OR-Tools, with the CBC MIP Solver

      • Suitable for multi-dimensional constraints and a very high number of items

    • Greedy:
      • Solution sub-optimal

      • Time complexity: O(mn)

      • Suitable for highly dimensional constraints or a very high number of items

  • Most implementations require integer values of weights and capacities;

    for pruning & weight sharing this is never a problem. If non-integer weights or capacities are required, all of the values should be scaled by an appropriate scaling factor
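
A small worked example with two knapsacks (constraints) over four items; the numbers are arbitrary:

    import numpy as np
    from hls4ml.optimization.knapsack import solve_knapsack

    values = np.array([10, 7, 5, 3])       # value of each item, shape (N,)
    weights = np.array([[4, 3, 2, 1],      # item weights in knapsack 1
                        [2, 2, 2, 2]])     # item weights in knapsack 2, shape (M, N)
    capacity = np.array([6, 4])            # capacity of each knapsack, shape (M,)

    # Default CBC MIP implementation (requires Google OR-Tools)
    optimal_value, selected_items = solve_knapsack(values, weights, capacity)
    # Items 0 and 2 fit both constraints (4 + 2 <= 6 and 2 + 2 <= 4) for a total
    # value of 15, so we expect optimal_value == 15, selected_items == [0, 2]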

hls4ml.optimization.scheduler module

class hls4ml.optimization.scheduler.BinaryScheduler(initial_sparsity=0, final_sparsity=1.0, threshold=0.01)

Bases: OptimizationScheduler

Sparsity updated by binary halving of the search space; lower and upper bounds are updated at every step. In the update step, sparsity is incremented to the midpoint between the previous sparsity and the target sparsity (upper bound). In the repair step, sparsity is decremented to the midpoint between the lower bound and the previous sparsity.

repair_step()

Method used when the neural architecture does not satisfy the performance requirement for a given sparsity. Then, the target sparsity is decreased according to the rule.

Examples

  • ConstantScheduler, sparsity = 0.5, increment = 0.05 -> sparsity = 0.55 [see ConstantScheduler for explanation]

  • BinaryScheduler, sparsity = 0.75, target = 1.0, previous = 0.5 -> sparsity = (0.5 + 0.75) / 2 = 0.625

Returns:

tuple containing

  • updated (boolean) - Has the sparsity changed? If not, the optimization algorithm can stop

  • sparsity (float) - Updated sparsity

update_step()

Increments the current sparsity, according to the rule.

Examples

  • ConstantScheduler, sparsity = 0.5, increment = 0.05 -> sparsity = 0.55

  • BinaryScheduler, sparsity = 0.5, target = 1.0 -> sparsity = 0.75

Returns:

tuple containing

  • updated (boolean) - Has the sparsity changed? If not, the optimization algorithm can stop

  • sparsity (float) - Updated sparsity
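
A sketch of the driver loop an optimizer might run around a scheduler; the performance check is a hypothetical stand-in for pruning and fine-tuning at the given sparsity (in practice this logic lives in the top-level optimization function):

    from hls4ml.optimization.scheduler import BinaryScheduler

    def meets_requirement(sparsity):
        # Hypothetical placeholder: prune to `sparsity`, fine-tune,
        # and compare the validation metric against the baseline
        return sparsity < 0.8

    scheduler = BinaryScheduler(initial_sparsity=0, final_sparsity=1.0, threshold=0.01)

    updated, sparsity = scheduler.update_step()
    while updated:
        if meets_requirement(sparsity):
            updated, sparsity = scheduler.update_step()  # passed: try a higher sparsity
        else:
            updated, sparsity = scheduler.repair_step()  # failed: back off toward the lower bound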

class hls4ml.optimization.scheduler.ConstantScheduler(initial_sparsity=0, final_sparsity=1.0, update_step=0.05)

Bases: OptimizationScheduler

Sparsity updated by a constant term, until
  1. sparsity target reached OR

  2. optimization algorithm stops requesting state updates

repair_step()

Method used when the neural architecture does not satisfy the performance requirement for a given sparsity. Then, the target sparsity is decreased according to the rule.

Examples

  • ConstantScheduler, sparsity = 0.5, increment = 0.05 -> sparsity = 0.55 [see ConstantScheduler for explanation]

  • BinaryScheduler, sparsity = 0.75, target = 1.0, previous = 0.5 -> sparsity = (0.5 + 0.75) / 2 = 0.625

Returns:

tuple containing

  • updated (boolean) - Has the sparsity changed? If not, the optimization algorithm can stop

  • sparsity (float) - Updated sparsity

update_step()

Increments the current sparsity, according to the rule.

Examples

  • ConstantScheduler, sparsity = 0.5, increment = 0.05 -> sparsity = 0.55

  • BinaryScheduler, sparsity = 0.5, target = 1.0 -> sparsity = 0.75

Returns:

tuple containing

  • updated (boolean) - Has the sparsity changed? If not, the optimization algorithm can stop

  • sparsity (float) - Updated sparsity

class hls4ml.optimization.scheduler.OptimizationScheduler(initial_sparsity=0, final_sparsity=1)

Bases: ABC

Base class handling the logic of the target sparsity and its updates at every step

get_sparsity()
abstract repair_step()

Method used when the neural architecture does not satisfy the performance requirement for a given sparsity. Then, the target sparsity is decreased according to the rule.

Examples

  • ConstantScheduler, sparsity = 0.5, increment = 0.05 -> sparsity = 0.55 [see ConstantScheduler for explanation]

  • BinaryScheduler, sparsity = 0.75, target = 1.0, previous = 0.5 -> sparsity = (0.5 + 0.75) / 2 = 0.625

Returns:

tuple containing

  • updated (boolean) - Has the sparsity changed? If not, the optimization algorithm can stop

  • sparsity (float) - Updated sparsity

abstract update_step()

Increments the current sparsity, according to the rule.

Examples

  • ConstantScheduler, sparsity = 0.5, increment = 0.05 -> sparsity = 0.55

  • BinaryScheduler, sparsity = 0.5, target = 1.0 -> sparsity = 0.75

Returns:

tuple containing

  • updated (boolean) - Has the sparsity changed? If not, the optimization algorithm can stop

  • sparsity (float) - Updated sparsity
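
Custom schedules can be implemented by subclassing OptimizationScheduler and providing the two abstract methods. A minimal sketch, assuming the base class stores its state in self.sparsity and self.final_sparsity (as suggested by the constructor and get_sparsity()):

    from hls4ml.optimization.scheduler import OptimizationScheduler

    class GeometricScheduler(OptimizationScheduler):
        '''Illustrative: close a fixed fraction of the gap to the target each step.'''

        def __init__(self, initial_sparsity=0, final_sparsity=1.0, ratio=0.5):
            super().__init__(initial_sparsity, final_sparsity)
            self.ratio = ratio
            self.previous = initial_sparsity

        def update_step(self):
            self.previous = self.sparsity
            self.sparsity += self.ratio * (self.final_sparsity - self.sparsity)
            return self.sparsity > self.previous, self.sparsity

        def repair_step(self):
            # Back off to the midpoint between the previous and current sparsity
            old = self.sparsity
            self.sparsity = 0.5 * (self.previous + self.sparsity)
            return abs(self.sparsity - old) > 1e-3, self.sparsity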

class hls4ml.optimization.scheduler.PolynomialScheduler(maximum_steps, initial_sparsity=0, final_sparsity=1.0, decay_power=3)

Bases: OptimizationScheduler

Sparsity updated by a polynomial decay, until
  1. sparsity target reached OR

  2. optimization algorithm stops requesting state updates

For more information, see Zhu & Gupta (2016) -

‘To prune, or not to prune: exploring the efficacy of pruning for model compression’

Note that this implementation is slightly different, since the TensorFlow pruning API depends on the total number of epochs and the update frequency.

In certain cases, a model might underperform at the current sparsity level, but perform better at a higher sparsity. In this case, the polynomial scheduler will simply jump to the next sparsity level. The model's performance over several sparsity levels is tracked and the optimization is stopped after high loss over several trials (see the top-level pruning/optimization function).
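
The decay rule from Zhu & Gupta computes the sparsity at step t of T maximum steps as s_t = s_f + (s_i - s_f) * (1 - t / T)^p; a sketch of the schedule (the scheduler's internal bookkeeping may differ):

    def polynomial_sparsity(t, maximum_steps, initial_sparsity=0.0, final_sparsity=1.0, decay_power=3):
        # s_t = s_f + (s_i - s_f) * (1 - t / T)^p, from Zhu & Gupta (2016)
        frac = 1.0 - t / maximum_steps
        return final_sparsity + (initial_sparsity - final_sparsity) * frac ** decay_power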

repair_step()

Method used when the neural architecture does not satisfy the performance requirement for a given sparsity. Then, the target sparsity is decreased according to the rule.

Examples

  • ConstantScheduler, sparsity = 0.5, increment = 0.05 -> sparsity = 0.55 [see ConstantScheduler for explanation]

  • BinaryScheduler, sparsity = 0.75, target = 1.0, previous = 0.5 -> sparsity = (0.5 + 0.75) / 2 = 0.625

Returns:

tuple containing

  • updated (boolean) - Has the sparsity changed? If not, the optimization algorithm can stop

  • sparsity (float) - Updated sparsity

update_step()

Increments the current sparsity, according to the rule.

Examples

  • ConstantScheduler, sparsity = 0.5, increment = 0.05 -> sparsity = 0.55

  • BinaryScheduler, sparsity = 0.5, target = 1.0 -> sparsity = 0.75

Returns:

tuple containing

  • updated (boolean) - Has the sparsity changed? If not, the optimization algorithm can stop

  • sparsity (float) - Updated sparsity

Module contents

hls4ml.optimization.optimize_keras_model_for_hls4ml(keras_model, hls_config, objective, scheduler, X_train, y_train, X_val, y_val, batch_size, epochs, optimizer, loss_fn, validation_metric, increasing, rtol, callbacks=None, ranking_metric='l1', local=False, verbose=False, rewinding_epochs=1, cutoff_bad_trials=3, directory='hls4ml-optimization', tuner='Bayesian', knapsack_solver='CBC_MIP', regularization_range=[1e-06, 1.8478497974222906e-06, 3.414548873833601e-06, 6.30957344480193e-06, 1.165914401179831e-05, 2.1544346900318823e-05, 3.9810717055349695e-05, 7.356422544596421e-05, 0.00013593563908785255, 0.00025118864315095795, 0.00046415888336127773, 0.0008576958985908938, 0.001584893192461114, 0.0029286445646252374, 0.0054116952654646375, 0.01])

Top-level function for optimizing a Keras model, given an hls4ml configuration and hardware objective(s)

Parameters:
  • keras_model (keras.Model) – Model to be optimized

  • hls_config (dict) – hls4ml configuration, obtained from hls4ml.utils.config.config_from_keras_model(…)

  • objective (hls4ml.optimization.objectives.ObjectiveEstimator) – Parameter, hardware or user-defined objective of optimization

  • scheduler (hls4ml.optimization.scheduler.OptimizationScheduler) – Sparsity scheduler; choose between constant, polynomial and binary

  • X_train (np.array) – Training inputs

  • y_train (np.array) – Training labels

  • X_val (np.array) – Validation inputs

  • y_val (np.array) – Validation labels

  • batch_size (int) – Batch size during training

  • epochs (int) – Maximum number of epochs to fine-tune model, in one iteration of pruning

  • optimizer (keras.optimizers.Optimizer or equivalent-string description) – Optimizer used during training

  • loss_fn (keras.losses.Loss or equivalent loss description) – Loss function used during training

  • validation_metric (keras.metrics.Metric or equivalent string description) – Validation metric, used as a baseline

  • increasing (boolean) – If the metric improves with increased values; e.g. accuracy -> increasing = True, MSE -> increasing = False

  • rtol (float) – Relative tolerance; pruning stops when pruned_validation_metric < (or >) rtol * baseline_validation_metric

  • callbacks (list of keras.callbacks.Callback) –

  • ranking_metric (string) – Metric used for ranking weights and structures; currently supported: l1, l2, saliency and Oracle

  • local (boolean) – Layer-wise or global pruning

  • verbose (boolean) – Display debug logs during model optimization

  • rewinding_epochs (int) – Number of epochs to retrain model without weight freezing, allows regrowth of previously pruned weights

  • cutoff_bad_trials (int) – After how many bad trials (performance below threshold), should model pruning / weight sharing stop

  • directory (string) – Directory to store temporary results

  • tuner (str) – Tuning algorithm, choose between Bayesian, Hyperband and None

  • knapsack_solver (str) – Algorithm to solve Knapsack problem when optimizing; default usually works well; for very large networks, greedy algorithm might be more suitable

  • regularization_range (list) – List of suitable hyperparameters for weight decay

Returns:

Optimized model

Return type:

keras.Model
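
An end-to-end sketch; the ParameterEstimator objective, the model and the randomly generated data are illustrative assumptions (any hls4ml.optimization.objectives.ObjectiveEstimator can be substituted):

    import hls4ml
    import numpy as np
    from tensorflow import keras
    from hls4ml.optimization import optimize_keras_model_for_hls4ml
    from hls4ml.optimization.scheduler import PolynomialScheduler
    # Assumed objective; substitute any ObjectiveEstimator
    from hls4ml.optimization.objectives import ParameterEstimator

    model = keras.Sequential([
        keras.Input(shape=(16,)),
        keras.layers.Dense(64, activation='relu'),
        keras.layers.Dense(5, activation='softmax'),
    ])
    hls_config = hls4ml.utils.config_from_keras_model(model, granularity='name')

    # Illustrative random data; use a real dataset in practice
    X_train, y_train = np.random.rand(512, 16), np.random.randint(0, 5, 512)
    X_val, y_val = np.random.rand(128, 16), np.random.randint(0, 5, 128)

    optimized_model = optimize_keras_model_for_hls4ml(
        model, hls_config,
        objective=ParameterEstimator,
        scheduler=PolynomialScheduler(maximum_steps=10, final_sparsity=0.75),
        X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val,
        batch_size=64, epochs=5,
        optimizer='adam', loss_fn='sparse_categorical_crossentropy',
        validation_metric='accuracy', increasing=True, rtol=0.95,
    )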