Release Notes
v1.0.0: foxglove 1.0.0
Released on 2024-12-09 - GitHub - PyPI
What's Changed
hls4ml v1.0.0 "foxglove" introduces several significant improvements:
- A new QONNX frontend by @jmitrevs introduced in #979
- The ability for hls4ml to automatically infer the precision of data types by @vloncar introduced in #855 (see the example after this list)
- The addition of an experimental backend for Intel oneAPI by @jmitrevs introduced in #955
- The addition of a backend for Siemens Catapult by @dgburnette in #956
- Added support for HGQ proxy models by @calad0i in #914
- An API for hardware-aware optimization by @bo3z in #768 and #809
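As an illustration of the new automatic precision inference, here is a minimal sketch using the Python API. The toy model, backend choice, and output directory are placeholders, not taken from these notes:

```python
import hls4ml
from tensorflow import keras

# A toy stand-in for a trained model; any Keras model works here.
model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(5, activation='softmax'),
])

# With granularity='name', per-layer precisions default to 'auto', letting the
# precision-inference pass derive fixed-point types from the weights instead of
# applying one blanket default.
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='Vitis',      # placeholder backend choice
    output_dir='my_prj',  # placeholder output directory
)
hls_model.compile()
```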
The full list of other improvements and fixes is:
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #949
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #953
- hls4ml Optimization API [Part 1] by @bo3z in #768
- QKeras support for RNN layers by @laurilaatu in #856
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #962
- Try to fix sphinx problem by restricting tensorflow-model-optimization by @jmitrevs in #967
- Bump pre-commit/action from 3.0.0 to 3.0.1 by @dependabot in #968
- Change fractional (and others) to be a property, move quantizers by @jmitrevs in #964
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #969
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #971
- vitis backend tarball fix by @calad0i in #972
- remove special vitis version of nnet_dense_resource.h by @jmitrevs in #975
- Allow Vitis synthesis tests by @jmduarte in #927
- Fix cleanup of synthesis tests (leftover from 927) by @vloncar in #989
- Fix sphinx by pinning tensorflow<=2.15 by @jmduarte in #992
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #984
- add clock uncertainty configuration option by @jmitrevs in #870
- Stage initial set of changes for the Catapult backend by @dgburnette in #956
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #999
- fix unwanted tested file change in #956 by @calad0i in #1000
- Fix SR backend synth missing variables by @bo3z in #993
- Upsampling support for PyTorch models by @vloncar in #977
- Split fpga_types into separate files by @vloncar in #998
- Support negative_slope in quantized_relu by @vloncar in #987
- Group more tests per YAML to reduce the number of envs created by @vloncar in #996
- Automatic precision inference by @vloncar in #855
- Remove unnecessary transposes related to conversion to channels_last format by @vloncar in #976
- Update pytest docker image to 0.5.4 by @jmitrevs in #1005
- Fix pre-commit warning and change '.h5' to '.keras' for written output by @jmitrevs in #1006
- Fix extension test for Keras v3 by @vloncar in #1009
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1007
- updated pytest docker image by @jmitrevs in #1017
- SepConv1d/2d for io_parallel with Latency strategy by @vloncar in #1012
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1021
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1023
- Latency Pooling Header Updates by @calad0i in #973
- Make im2col default option for quartus by @calad0i in #1010
- add protection for when kernel_quantizer is None by @jmitrevs in #997
- prevent test directory overwrites for activation by @jmitrevs in #1031
- Update Jenkinsfile to use new Docker image and Python 3.10 environment by @vloncar in #1033
- clean-up test ci yaml generator by @calad0i in #1036
- Add View to layer name map for pytorch parser by @JanFSchulte in #1039
- Add RNN support for Pytorch by @JanFSchulte in #850
- Add Vitis to pytorch API tests by @JanFSchulte in #1040
- clean up multi-dimensional dense by @jmitrevs in #1042
- Add namespaces and optional writer config by @vloncar in #986
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1044
- Add support for HGQ proxy model by @calad0i in #914
- Bug Fix for Operand Shape Mismatch in BatchNorm Fusion (PyTorch) by @sei-rquartiano in #1045
- remove precision settings that make pytest for batchnorm in pytorch fail by @JanFSchulte in #1053
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1047
- rm slow mnist training in test by @calad0i in #1018
- Add an optimizer to replace SeparableConv by Depthwise + Conv (pointwise) by @jmitrevs in #1022
- Add functionality to use granularity option also for pytorch models by @JanFSchulte in #1051
- Update pooling logic for Vivado, Vitis, and Catapult backends by @jmitrevs in #1056
- remove padding attribute by @jmitrevs in #1061
- Run long-running pytests out of the batch by @vloncar in #1062
- Fix tanh activation in pytorch parser by @JanFSchulte in #1055
- make auto the default for layer config by @jmitrevs in #1016
- remove checks on 'padding' that were missed in previous PR by @jmitrevs in #1064
- Remove extras flow by @vloncar in #1067
- Expose alpha and theta type for parametrized activations by @jmitrevs in #1069
- Raise exception on compile errors by @vloncar in #1068
- update qkeras in Jenkinsfile by @jmitrevs in #1072
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1075
- hls4ml Optimization API [Part 2] by @bo3z in #809
- Hardcode weight txt path by @vloncar in #1089
- quote the ${WEIGHT_DIR} to handle special characters by @jmitrevs in #1091
- Beginnings of the oneAPI backend by @jmitrevs in #955
- update keras activation parsing, especially leaky relu by @jmitrevs in #1085
- Fix softmax parsing in pytorch and add test by @JanFSchulte in #1086
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1098
- Change indexing in filling result for io_parallel convolutions, Vitis by @jmitrevs in #1102
- Update QONNX parsing for 1.0 by @jmitrevs in #979
- remove incorrect input from Constant nodes by @jmitrevs in #1119
- add max_precision to onnx parser by @jmitrevs in #1113
- Add RF to config templates for "Merge" layers by @vloncar in #1121
- Add doc for HGQ by @calad0i in #1117
- Multi output fix 2 by @calad0i in #1103
- Make auto default precision for pytorch parser by @JanFSchulte in #1112
- remove incorrect setting of result_t by @jmitrevs in #1130
- Fix problem with scale being a multidimensional array. by @jurevreca12 in #1132
- Added support for QONNX `Resize` node ingestion and tested with tiny UNet model by @nghielme in #1122
- Update install_requires for 1.0.0 by @vloncar in #1136
- Pointwise Conv1D with code generation for "Latency" strategy (update of #811) by @jmduarte in #881
- Introduce optional description to layer attributes by @vloncar in #1127
- Qonnx warnings by @jmitrevs in #1142
- Fixes to parsing of pytorch models when using torch functionals by @JanFSchulte in #1143
- Update README.md for v1.0.0 by @bo3z in #1100
- Temporary workaround for QKeras installation by @vloncar in #1145
New Contributors
- @laurilaatu made their first contribution in #856
- @dgburnette made their first contribution in #956
- @sei-rquartiano made their first contribution in #1045
- @jurevreca12 made their first contribution in #1132
Full Changelog: v0.8.1...v1.0.0
v0.8.1: edelweiss 0.8.1
Released on 2023-12-19 - GitHub - PyPI
What's Changed
- Fix for #905 by @calad0i in #906
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #921
- Fix logos in README.md by @vloncar in #930
- Fix writer precision when fp bits >= 14 by @calad0i in #909
- Let repack_stream optimizer inherit original precision by @calad0i in #907
- Update A3D3 grant no. by @schsu in #941
- Add precision inheritance for when generating stream clone by @calad0i in #911
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #942
- Quartus multi out with stream fix by @calad0i in #908
- Fix profiling for Keras LSTM layers. by @Landay7 in #940
- Fix for multiple inputs that may get out of order by @jmduarte in #937
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #944
- Bump actions/upload-artifact from 3 to 4 by @dependabot in #943
- better replace_node fn by @calad0i in #934
- bump to 0.8.1 by @jmitrevs in #945
New Contributors
- @schsu made their first contribution in #941
- @Landay7 made their first contribution in #940
Full Changelog: v0.8.0...v0.8.1
v0.8.0: edelweiss 0.8.0
Released on 2023-11-16 - GitHub - PyPI
What's Changed
- Decouple pipeline style from strategy by @vloncar in #781
- Don't use reader in ModelGraph and layers by @vloncar in #770
- Remove tf_to_hls by @vloncar in #795
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #796
- Fix parsing of QConv2DBatchnorm weights by @vloncar in #802
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #801
- Discussion - Inlined Conv slows down latency significantly (up to x15 - x20) by @bo3z in #800
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #807
- Fix over-allocation of bits for quantised po2 by @bo3z in #806
- Propagate zeros from Conv layers to multiplication config by @bo3z in #797
- Fix Vitis Conv1D/2D latency strategy by @vloncar in #815
- Improved parsing of pytorch models using torch.FX - Clean by @JanFSchulte in #799
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #816
- Support for parsing nested models by @vloncar in #794
- Fix loading weights in n-dim dense -> 1x1 conv by @vloncar in #821
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #828
- Fix loading weights in GarNetStacked and GarNet internal array precisions by @joshlerner in #827
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #830
- Fix profiling for GRU/LSTM by @drankincms in #833
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #835
- remove obsolete and unused docker directory by @jmitrevs in #836
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #842
- Remove obsolete parameter mapping between pytorch and keras by @JanFSchulte in #847
- Make binary CNN match between Keras and hls4ml by @jmitrevs in #804
- No longer make ExponentPrecisionType and XnorPrecisionType inherit from IntegerPrecisionType by @jmitrevs in #845
- Add support for flattening to the pytorch parser by @JanFSchulte in #852
- Add option to configure IP version by @AdrianAlan in #851
- Bug fix for named nn.Sequential in pytorch parser by @JanFSchulte in #848
- Add QDepthwiseConv2D, DepthwiseConv2D, DepthwiseConv1D support by @jmitrevs in #834
- Symbolic expressions in hls4ml by @vloncar in #660
- Update dependencies, add testing extras by @jmitrevs in #837
- Bump actions/checkout from 3 to 4 by @dependabot in #866
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #869
- try to use new runners for gitlab CI by @jmitrevs in #879
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #880
- Fix weight precision format string by @vloncar in #877
- add acknowledgments by @jmduarte in #862
- Support for quantized SeparableConv1D/2D by @vloncar in #861
- Speed up Keras profiling by @AdrianAlan in #863
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #882
- Fix profiling SeparableConv1D and SeparableConv2D by @qberthet in #891
- Add support for filt_height==1 for streaming quartus conv2d by @jmitrevs in #886
- Fix config structure name in pragma for SeparableConv1D by @qberthet in #884
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #895
- Fix bit overflow with softmax by @calad0i in #887
- bump 0.8.0rc1 by @jmitrevs in #915
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #902
- Add funding acknowledgements by @jmduarte in #918
- Fix fetching models from example-models repo by @vloncar in #919
- add blank line to make rst format correct by @jmitrevs in #923
- Update default FPGA part number from KU115 to VU13P by @jmduarte in #924
- update to 0.8.0 by @jmitrevs in #925
New Contributors
- @pre-commit-ci made their first contribution in #796
- @joshlerner made their first contribution in #827
- @qberthet made their first contribution in #891
Full Changelog: v0.7.1...v0.8.0
v0.8.0rc1: edelweiss 0.8.0rc1
Released on 2023-11-08 - GitHub - PyPI
What's Changed
- Decouple pipeline style from strategy by @vloncar in #781
- Don't use reader in ModelGraph and layers by @vloncar in #770
- Remove tf_to_hls by @vloncar in #795
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #796
- Fix parsing of QConv2DBatchnorm weights by @vloncar in #802
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #801
- Discussion - Inlined Conv slows down latency significantly (up to x15 - x20) by @bo3z in #800
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #807
- Fix over-allocation of bits for quantised po2 by @bo3z in #806
- Propagate zeros from Conv layers to multiplication config by @bo3z in #797
- Fix Vitis Conv1D/2D latency strategy by @vloncar in #815
- Improved parsing of pytorch models using torch.FX - Clean by @JanFSchulte in #799
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #816
- Support for parsing nested models by @vloncar in #794
- Fix loading weights in n-dim dense -> 1x1 conv by @vloncar in #821
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #828
- Fix loading weights in GarNetStacked and GarNet internal array precisions by @joshlerner in #827
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #830
- Fix profiling for GRU/LSTM by @drankincms in #833
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #835
- remove obsolete and unused docker directory by @jmitrevs in #836
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #842
- Remove obsolete parameter mapping between pytorch and keras by @JanFSchulte in #847
- Make binary CNN match between Keras and hls4ml by @jmitrevs in #804
- No longer make ExponentPrecisionType and XnorPrecisionType inherit from IntegerPrecisionType by @jmitrevs in #845
- Add support for flattening to the pytorch parser by @JanFSchulte in #852
- Add option to configure IP version by @AdrianAlan in #851
- Bug fix for named nn.Sequential in pytorch parser by @JanFSchulte in #848
- Add QDepthwiseConv2D, DepthwiseConv2D, DepthwiseConv1D support by @jmitrevs in #834
- Symbolic expressions in hls4ml by @vloncar in #660
- Update dependencies, add testing extras by @jmitrevs in #837
- Bump actions/checkout from 3 to 4 by @dependabot in #866
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #869
- try to use new runners for gitlab CI by @jmitrevs in #879
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #880
- Fix weight precision format string by @vloncar in #877
- add acknowledgments by @jmduarte in #862
- Support for quantized SeparableConv1D/2D by @vloncar in #861
- Speed up Keras profiling by @AdrianAlan in #863
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #882
- Fix profiling SeparableConv1D and SeparableConv2D by @qberthet in #891
- Add support for filt_height==1 for streaming quartus conv2d by @jmitrevs in #886
- Fix config structure name in pragma for SeparableConv1D by @qberthet in #884
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #895
- Fix bit overflow with softmax by @calad0i in #887
- bump 0.8.0rc1 by @jmitrevs in #915
New Contributors
- @pre-commit-ci made their first contribution in #796
- @joshlerner made their first contribution in #827
- @qberthet made their first contribution in #891
Full Changelog: v0.7.1...v0.8.0rc1
v0.7.1: delphinium 0.7.1
Released on 2023-05-13 - GitHub - PyPI
What's Changed
- bump version to v0.7.0 by @jmduarte in #778
- Fix for 2D conv layers in the special case of io_parallel with full parallelization by @drankincms in #760
- Fix RNN layers when strategy=resource by @vloncar in #780
- Update Jenkins test environment to avoid dependency hell by @vloncar in #786
- Explicitly set strategy for pointwise conv by @vloncar in #785
- Minor docs fixes for 0.7.1 by @vloncar in #788
- bump 0.7.1 by @jmitrevs in #791
Full Changelog: v0.7.0...v0.7.1
v0.7.0: delphinium
Released on 2023-04-26 - GitHub - PyPI
What's Changed
- fix conv1d io_parallel resource by @jmitrevs in #403
- Speed up CI tests by @thesps in #407
- Fix GlobalPooling1D Layers by @jmduarte in #399
- Fix batched multiple inputs by @jmduarte in #414
- Fixed 'qkeras_mnist_dense' example build problem #423 by @siorpaes in #424
- Update for pyyaml 6.0 by @thesps in #435
- `axi_stream_driver` update by @nicologhielmetti in #420
- Reshape fixes: don't repack stream for flatten; remove final reshape by @jmduarte in #443
- Fix Conv2D with `io_type = io_parallel` & `Strategy: Resource` by @thesps in #448
- Support applying Softmax over multidimensional tensors by @vloncar in #384
- Disable some unsupported layers by @thesps in #447
- Fixes: quantized_relu & unsigned profiling part II by @thesps in #441
- GarNet and GarNetStack in config.py by @yiiyama in #344
- support ZeroPadding layers by @jmduarte in #480
- New backend development framework by @vloncar in #395
- Register `ApplyAlpha` layer templates by @thesps in #499
- Parsing extended by @nicologhielmetti in #501
- Remove intermediate casting in product by @jmitrevs in #490
- Add QKeras as a package dependency by @vloncar in #511
- Copy flows from config by @thesps in #510
- VivadoAccelerator backend updates by @thesps in #508
- Optimized look-up table by @nemerchiedde in #527
- Upsampling2D test case by @ChiRuiChen in #520
- Support UpSampling1D by @vloncar in #475
- RNN support (part 1) by @vloncar in #521
- Quartus Custom Matrix Multiplication & Quantization by @bo3z in #523
- Vivado-equivalent implementation of Softmax on Quartus by @bo3z in #540
- Ensure 2 bits for scale in po2 quantizers by @vloncar in #531
- Link update by @bkmgit in #519
- Fix removal of nodes ingested by multiple downstream nodes by @jmduarte in #544
- Enable SeparableConv2d by @jmduarte in #547
- Extension API by @vloncar in #528
- change string ReuseFactor to int by @jmitrevs in #416
- Make the size of bn scale and bias what they really are by @jmitrevs in #532
- Raise runtime error when a layer is named `input` by @jmduarte in #482
- fix insertion before a node with multiple inputs + support additional broadcasting by @jmduarte in #551
- Pointwise conv1d/2d resource by @jmduarte in #471
- Quartus Embedding Layer by @bo3z in #548
- Fix for QActivations passed as an argument by @AdrianAlan in #553
- Don't override precision directly in the QKeras optimizer by @vloncar in #567
- Remove the in/out size from top function by @vloncar in #559
- Transpose2d, Concatenate2d, and up to 3 Clones for io_stream by @jmduarte in #402
- Remove io_serial as io_stream and add some more info in docs. by @Duchstf in #334
- Update docs for v0.6.0 by @thesps in #453
- Use correct number of args for multiple outputs by @apfusco in #487
- Fixed a few typos in the documentation by @pitmonticone in #467
- returning integer from _compute_n_samples by @JochiSt in #537
- Providing support for Alveo boards by @selwyn96 in #552
- Make layer names case sensitive in config. by @jmitrevs in #577
- Add issue and PR templates by @jmduarte in #582
- Vivado Backend GRU/LSTM support by @drankincms in #560
- Update CI template syntax by @thesps in #593
- Update flow dependencies by @vloncar in #588
- Fix parsing of ZeroPadding layers by @vloncar in #595
- remove cppname by @jmitrevs in #562
- Remove email helpline from the docs by @vloncar in #601
- Fixes for GRU/LSTM in Vivado backend by @drankincms in #598
- Remove io_serial by @vloncar in #609
- Fix test_graph by @vloncar in #611
- Override parent backend optimizer passes with derived backend passes by @thesps in #597
- Enforce function pipelining when using io_parallel with Resource strategy by @vloncar in #605
- FIFO depth optimization by @nicologhielmetti in #509
- Add tracing support for the quartus backend by @jmitrevs in #583
- Quartus streaming support for Activations, Dense & Batch Normalization by @bo3z in #557
- QConv alpha != 1 bug fix by @bo3z in #612
- Quartus Stream Embedding by @bo3z in #625
- change master to main by @jmitrevs in #602
- Edit order of the optimizers in the flow so that BramFactor is followed by @jmitrevs in #621
- Softmax LUT Optimization by @bo3z in #570
- Quartus Synthesis Flow Improvement by @bo3z in #618
- Quartus Extensions by @bo3z in #628
- Quartus GRU by @bo3z in #596
- Quartus Merge layers by @bo3z in #634
- fix nondefault project name handling by @jmitrevs in #626
- Fix parsing of logic synthesis reports by @vloncar in #639
- Fix conv1d stream implementation hls directives by @Jonathan-Shoemaker in #635
- Implementation and optimizations linked to Simple-RNN and LSTM for qu… by @nemerchiedde in #575
- Softsign optimization by @nemerchiedde in #585
- Parallel CNNs, Pooling & Image Layers for Quartus Backend by @bo3z in #561
- Quartus Streaming Softsign (PR #585 contd.) by @bo3z in #655
- Remove final reshapes even for Quartus by @jmitrevs in #661
- Unrolled CNN implementation by @vloncar in #600
- the strategy was not propagated in the pytest by @jmitrevs in #663
- Fix keras model loading issue with loading model with KerasH5 by @calad0i in #664
- append applied_flows container before filling instead of after by @jmitrevs in #641
- set version using `setuptools_scm` by @jmduarte in #479
- Argmax Softmax by @bo3z in #627
- Fix version extraction in Sphinx config by @vloncar in #669
- Add requested citations to README by @jmduarte in #615
- skip BatchNorm fusion when input/output is used multiple times by @jmduarte in #481
- Use wider accum_t for (average) pooling by @vloncar in #681
- Quartus Streaming Conv, Pooling & Image layers by @bo3z in #656
- Create branch on PR by @jmduarte in #636
- Delete `example-prjs` directory by @jmduarte in #682
- Adiabatically turn on `pre-commit` by @jmduarte in #678
- Add causal padding by @cgutsche in #688
- Update `pre-commit` GitHub Action by @jmduarte in #689
- New config_from_keras_model by @vloncar in #690
- remove obsolete np.int and np.float by @jmitrevs in #703
- Update p-clang-format to work on mac by @jmduarte in #704
- Fix function call in Alveo tcl script by @vloncar in #694
- add readme for contrib by @jmitrevs in #706
- WIP Add custom KL loss layer HLS implementation by @katyagovorkova in #606
- Fix incorrectly linted build() command by @vloncar in #709
- For encoded convolution, add check for when min_width would have been larger than in_width by @jmitrevs in #610
- fifo_depth_optimization flow require ip, not writer, before running by @jmitrevs in #642
- update isort to fix pre-commit by @jmduarte in #719
- Fixed sign parsing for ac_fixed and ac_int by @jmitrevs in #727
- Correctly expand dims of pointwise layer by @vloncar in #715
- Support keepdims in GlobalPooling layers by @vloncar in #716
- Register layer attributes in VivadoAccelerator backend by @vloncar in #724
- Add quantized sigmoid, fix quantized tanh for QKeras by @jmitrevs in #569
- print_vivado_report function for nicer reports by @vloncar in #730
- Quartus bram factor by @jmitrevs in #700
- Fix inplace variables by @jmitrevs in #714
- Fix for cloned stream that is subsequently flattened by @jmduarte in #708
- Vitis HLS backend by @vloncar in #629 (see the example after this list)
- Update documentation for v0.7.0 release by @jmduarte in #710
- Fix release notes + version in docs by @jmduarte in #742
- Fix precommits by @jmitrevs in #741
- mv `dependabot.yml` by @jmduarte in #743
- Bump actions/setup-python from 2 to 4 by @dependabot in #748
- fix Vitis pragmas messed up by pre-commit by @jmitrevs in #751
- Additional cleanup of the codebase by @vloncar in #750
- Fix for BatchNormalization layers with `center=False` or `scale=False` by @jmduarte in #754
- Remove references to GPL since we now use a different license by @jmitrevs in #761
- Fix pooling layers when padding is applied from the left/top by @JanFSchulte in #757
- Further update documentation for 0.7.0 by @jmitrevs in #744
- Update pypi-publish.yml by @jmduarte in #763
- Fix pypi version by @jmduarte in #766
- add a default weight_size by @jmitrevs in #772
- CNNs with binary inputs and weights need fixes by @jmitrevs in #749
- Minor documentation updates by @jmitrevs in #774
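For the Vitis HLS backend added in #629, a hedged sketch of selecting it through the Python API follows; the toy model, the FPGA part number, and the output directory are illustrative assumptions only:

```python
import hls4ml
from tensorflow import keras

# Toy stand-in for a trained model.
model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(8, activation='relu'),
])

config = hls4ml.utils.config_from_keras_model(model, granularity='model')
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='Vitis',              # select the new backend instead of 'Vivado'
    part='xcvu13p-flga2577-2-e',  # illustrative part; use your target device
    output_dir='vitis_prj',       # illustrative output directory
)
hls_model.compile()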
New Contributors
- @siorpaes made their first contribution in #424
- @nemerchiedde made their first contribution in #527
- @ChiRuiChen made their first contribution in #520
- @bo3z made their first contribution in #523
- @bkmgit made their first contribution in #519
- @apfusco made their first contribution in #487
- @pitmonticone made their first contribution in #467
- @JochiSt made their first contribution in #537
- @selwyn96 made their first contribution in #552
- @Jonathan-Shoemaker made their first contribution in #635
- @calad0i made their first contribution in #664
- @cgutsche made their first contribution in #688
- @dependabot made their first contribution in #748
- @JanFSchulte made their first contribution in #757
Full Changelog: v0.6.0...v0.7.0
v0.7.0rc1: delphinium rc1
Released on 2023-04-15 - GitHub - PyPI
What's Changed
- fix conv1d io_parallel resource by @jmitrevs in #403
- Speed up CI tests by @thesps in #407
- Fix GlobalPooling1D Layers by @jmduarte in #399
- Fix batched multiple inputs by @jmduarte in #414
- Fixed 'qkeras_mnist_dense' example build problem #423 by @siorpaes in #424
- Update for pyyaml 6.0 by @thesps in #435
- `axi_stream_driver` update by @nicologhielmetti in #420
- Reshape fixes: don't repack stream for flatten; remove final reshape by @jmduarte in #443
- Fix Conv2D with `io_type = io_parallel` & `Strategy: Resource` by @thesps in #448
- Support applying Softmax over multidimensional tensors by @vloncar in #384
- Disable some unsupported layers by @thesps in #447
- Fixes: quantized_relu & unsigned profiling part II by @thesps in #441
- GarNet and GarNetStack in config.py by @yiiyama in #344
- support ZeroPadding layers by @jmduarte in #480
- New backend development framework by @vloncar in #395
- Register `ApplyAlpha` layer templates by @thesps in #499
- Parsing extended by @nicologhielmetti in #501
- Remove intermediate casting in product by @jmitrevs in #490
- Add QKeras as a package dependency by @vloncar in #511
- Copy flows from config by @thesps in #510
- VivadoAccelerator backend updates by @thesps in #508
- Optimized look-up table by @nemerchiedde in #527
- Upsampling2D test case by @ChiRuiChen in #520
- Support UpSampling1D by @vloncar in #475
- RNN support (part 1) by @vloncar in #521
- Quartus Custom Matrix Multiplication & Quantization by @bo3z in #523
- Vivado-equivalent implementation of Softmax on Quartus by @bo3z in #540
- Ensure 2 bits for scale in po2 quantizers by @vloncar in #531
- Link update by @bkmgit in #519
- Fix removal of nodes ingested by multiple downstream nodes by @jmduarte in #544
- Enable SeparableConv2d by @jmduarte in #547
- Extension API by @vloncar in #528
- change string ReuseFactor to int by @jmitrevs in #416
- Make the size of bn scale and bias what they really are by @jmitrevs in #532
- Raise runtime error when a layer is named `input` by @jmduarte in #482
- fix insertion before a node with multiple inputs + support additional broadcasting by @jmduarte in #551
- Pointwise conv1d/2d resource by @jmduarte in #471
- Quartus Embedding Layer by @bo3z in #548
- Fix for QActivations passed as an argument by @AdrianAlan in #553
- Don't override precision directly in the QKeras optimizer by @vloncar in #567
- Remove the in/out size from top function by @vloncar in #559
- Transpose2d, Concatenate2d, and up to 3 Clones for io_stream by @jmduarte in #402
- Remove io_serial as io_stream and add some more info in docs. by @Duchstf in #334
- Update docs for v0.6.0 by @thesps in #453
- Use correct number of args for multiple outputs by @apfusco in #487
- Fixed a few typos in the documentation by @pitmonticone in #467
- returning integer from _compute_n_samples by @JochiSt in #537
- Providing support for Alveo boards by @selwyn96 in #552
- Make layer names case sensitive in config. by @jmitrevs in #577
- Add issue and PR templates by @jmduarte in #582
- Vivado Backend GRU/LSTM support by @drankincms in #560
- Update CI template syntax by @thesps in #593
- Update flow dependencies by @vloncar in #588
- Fix parsing of ZeroPadding layers by @vloncar in #595
- remove cppname by @jmitrevs in #562
- Remove email helpline from the docs by @vloncar in #601
- Fixes for GRU/LSTM in Vivado backend by @drankincms in #598
- Remove io_serial by @vloncar in #609
- Fix test_graph by @vloncar in #611
- Override parent backend optimizer passes with derived backend passes by @thesps in #597
- Enforce function pipelining when using io_parallel with Resource strategy by @vloncar in #605
- FIFO depth optimization by @nicologhielmetti in #509
- Add tracing support for the quartus backend by @jmitrevs in #583
- Quartus streaming support for Activations, Dense & Batch Normalization by @bo3z in #557
- QConv alpha != 1 bug fix by @bo3z in #612
- Quartus Stream Embedding by @bo3z in #625
- change master to main by @jmitrevs in #602
- Edit order of the optimizers in the flow so that BramFactor is followed by @jmitrevs in #621
- Softmax LUT Optimization by @bo3z in #570
- Quartus Synthesis Flow Improvement by @bo3z in #618
- Quartus Extensions by @bo3z in #628
- Quartus GRU by @bo3z in #596
- Quartus Merge layers by @bo3z in #634
- fix nondefault project name handling by @jmitrevs in #626
- Fix parsing of logic synthesis reports by @vloncar in #639
- Fix conv1d stream implementation hls directives by @Jonathan-Shoemaker in #635
- Implementation and optimizations linked to Simple-RNN and LSTM for qu… by @nemerchiedde in #575
- Softsign optimization by @nemerchiedde in #585
- Parallel CNNs, Pooling & Image Layers for Quartus Backend by @bo3z in #561
- Quartus Streaming Softsign (PR #585 contd.) by @bo3z in #655
- Remove final reshapes even for Quartus by @jmitrevs in #661
- Unrolled CNN implementation by @vloncar in #600
- the strategy was not propagated in the pytest by @jmitrevs in #663
- Fix keras model loading issue with loading model with KerasH5 by @calad0i in #664
- append applied_flows container before filling instead of after by @jmitrevs in #641
- set version using `setuptools_scm` by @jmduarte in #479
- Argmax Softmax by @bo3z in #627
- Fix version extraction in Sphinx config by @vloncar in #669
- Add requested citations to README by @jmduarte in #615
- skip BatchNorm fusion when input/output is used multiple times by @jmduarte in #481
- Use wider accum_t for (average) pooling by @vloncar in #681
- Quartus Streaming Conv, Pooling & Image layers by @bo3z in #656
- Create branch on PR by @jmduarte in #636
- Delete `example-prjs` directory by @jmduarte in #682
- Adiabatically turn on `pre-commit` by @jmduarte in #678
- Add causal padding by @cgutsche in #688
- Update `pre-commit` GitHub Action by @jmduarte in #689
- New config_from_keras_model by @vloncar in #690
- remove obsolete np.int and np.float by @jmitrevs in #703
- Update p-clang-format to work on mac by @jmduarte in #704
- Fix function call in Alveo tcl script by @vloncar in #694
- add readme for contrib by @jmitrevs in #706
- WIP Add custom KL loss layer HLS implementation by @katyagovorkova in #606
- Fix incorrectly linted build() command by @vloncar in #709
- For encoded convolution, add check for when min_width would have been larger than in_width by @jmitrevs in #610
- fifo_depth_optimization flow require ip, not writer, before running by @jmitrevs in #642
- update isort to fix pre-commit by @jmduarte in #719
- Fixed sign parsing for ac_fixed and ac_int by @jmitrevs in #727
- Correctly expand dims of pointwise layer by @vloncar in #715
- Support keepdims in GlobalPooling layers by @vloncar in #716
- Register layer attributes in VivadoAccelerator backend by @vloncar in #724
- Add quantized sigmoid, fix quantized tanh for QKeras by @jmitrevs in #569
- print_vivado_report function for nicer reports by @vloncar in #730
- Quartus bram factor by @jmitrevs in #700
- Fix inplace variables by @jmitrevs in #714
- Fix for cloned stream that is subsequently flattened by @jmduarte in #708
- Vitis HLS backend by @vloncar in #629
- Update documentation for v0.7.0 release by @jmduarte in #710
- Fix release notes + version in docs by @jmduarte in #742
- Fix precommits by @jmitrevs in #741
- mv `dependabot.yml` by @jmduarte in #743
- Bump actions/setup-python from 2 to 4 by @dependabot in #748
- fix Vitis pragmas messed up by pre-commit by @jmitrevs in #751
- Additional cleanup of the codebase by @vloncar in #750
- Fix for BatchNormalization layers with `center=False` or `scale=False` by @jmduarte in #754
- Remove references to GPL since we now use a different license by @jmitrevs in #761
- Fix pooling layers when padding is applied from the left/top by @JanFSchulte in #757
- Further update documentation for 0.7.0 by @jmitrevs in #744
- Update pypi-publish.yml by @jmduarte in #763
- Fix pypi version by @jmduarte in #766
New Contributors
- @siorpaes made their first contribution in #424
- @nemerchiedde made their first contribution in #527
- @ChiRuiChen made their first contribution in #520
- @bo3z made their first contribution in #523
- @bkmgit made their first contribution in #519
- @apfusco made their first contribution in #487
- @pitmonticone made their first contribution in #467
- @JochiSt made their first contribution in #537
- @selwyn96 made their first contribution in #552
- @Jonathan-Shoemaker made their first contribution in #635
- @calad0i made their first contribution in #664
- @cgutsche made their first contribution in #688
- @dependabot made their first contribution in #748
- @JanFSchulte made their first contribution in #757
Full Changelog: v0.6.0...v0.7.0rc1
v0.6.0: coris
Released on 2021-11-12 - GitHub - PyPI
What's Changed
- `VivadoAccelerator` backend: target `pynq-z2` and `zcu102` boards directly from hls4ml by @nicologhielmetti (see the example after this list)
- Updated `PyTorch` and `ONNX` converters by @Duchstf
- `line_buffer` Conv2D implementation for `io_stream`: reduced resource usage and latency by @Keb-L, @violatingcp, @vloncar
- Support `QConv2DBatchnorm` layer from `QKeras` by @nicologhielmetti
- Improved profiling plots - easier to compare original vs `hls4ml` converted models by @maksgraczyk
- Better derivation of data types for `QKeras` models by @jmduarte, @thesps
- Improved CI by @thesps
- More support for models with branches, skip connections, `Merge` and `Concatenate` layers by @jmduarte, @vloncar
- Support for `Dense` layers over multi-dimensional tensors by @vloncar
- Overall improvements by @vloncar, @jmduarte, @thesps, @jmitrevs & others
New Contributors
- @siorpaes made their first contribution in #424
- @jmitrevs made their first contribution in #403
- @anders-wind made their first contribution in #302
- @KOVI89alipes made their first contribution in #318
- @maksgraczyk made their first contribution in #323
- @Keb-L made their first contribution in #332
- @ConsVin made their first contribution in #307
- @nicologhielmetti made their first contribution in #298
Full Changelog: v0.5.0...v0.6.0
v0.5.0: bartsia
Released on 2021-03-05 - GitHub - PyPI
What's new:
- Streaming IO layer implementations, especially of Convolutional layers, accessed through the config with `IOType: io_stream`. Scales CNN support to much larger models than previously possible (see arXiv:2101.05108); see the example after this list
- New documentation and API reference
- Further optimizations for QKeras / quantization aware training. A 'shift' operation is now used for `po2` quantizers
- Allow redefinition of weights directory for standalone project compilation
- `profiling` for PyTorch models
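A minimal sketch of enabling the streaming IO implementations, using the Python API's spelling of `IOType: io_stream`; the toy model and output directory are illustrative assumptions:

```python
import hls4ml
from tensorflow import keras

# Toy CNN stand-in for a trained model.
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    keras.layers.Conv2D(16, (3, 3), activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax'),
])

config = hls4ml.utils.config_from_keras_model(model, granularity='model')
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    io_type='io_stream',   # Python-API equivalent of `IOType: io_stream`
    output_dir='cnn_prj',  # illustrative output directory
)
```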
Deprecated:
- `IOType: io_serial` is deprecated, and superseded by the new `IOType: io_stream`
Bugfixes:
- Fix to Initiation Interval and different min/max latency for `Strategy: Resource`
- Fix warnings in `hls4ml` command line script flow
- Write yml config from Python API - for mixed API / command line flow
v0.5.0-beta
Released on 2021-01-18 - GitHub - PyPI
Pre-release of hls4ml version `v0.5.0`.
What's new:
- Streaming IO layer implementations, especially of Convolutional layers, accessed through the config with `io_type: io_stream`. Scales CNN support to much larger models than previously possible (see paper)
- New documentation and API reference
- Further optimizations for QKeras / quantization aware training. A 'shift' operation is now used for `po2` quantizers
- Allow redefinition of weights directory for standalone project compilation
v0.4.0: aster
Released on 2020-10-30 - GitHub - PyPI
What's new:
- Support for GarNet layer (see paper)
- Input layer precision added to config generator utility
- New 'SkipOptimizers' config option. Now you can run all Optimizers by default (as in v0.3.0) but subtract any specified by 'SkipOptimizers', e.g. `hls_config['SkipOptimizers'] = ['fuse_consecutive_batch_normalization']` (see the example after this list)
- Print out the latency report from Cosimulation
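A sketch of the 'SkipOptimizers' option in context, written against the present-day Python API rather than the v0.4.0 command-line flow; the toy model and output directory are illustrative assumptions:

```python
import hls4ml
from tensorflow import keras

# Toy model with a BatchNormalization layer the skipped optimizer would fuse.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8),
    keras.layers.BatchNormalization(),
    keras.layers.Activation('relu'),
])

hls_config = hls4ml.utils.config_from_keras_model(model)
# All optimizers run by default (as in v0.3.0); list any to subtract:
hls_config['SkipOptimizers'] = ['fuse_consecutive_batch_normalization']

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=hls_config, output_dir='skip_prj'  # illustrative dir
)
```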
Bugfixes:
- Fixes related to tensorflow 2.3: new Functional API, changes to handling of Input layer
- Fix error with config generator utility and activation layers for `granularity='name'`
- Fix issue with reloading of emulation library after configuration change
- Fix to handling of layers with `use_bias=False` and merged Dense and BatchNormalization
v0.3.0
Released on 2020-07-31 - GitHub - PyPI
What's new:
- API expansion (see the example after this list):
- Create configuration dictionary from model object
- Run 'C Simulation' from Python with `hls_model.predict(X)`
- Trace model layer output with `hls_model.trace(X)`
- Write HLS project, run synthesis flow from Python
- QKeras support: convert models trained using layers and quantizers from QKeras
- Example models moved to separate repo, added as a submodule with an API to retrieve them
- New Softmax implementations
- Minor fixes: weights exported at higher precision, concatenate layer shape corrected
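A sketch of the expanded Python API, using the present-day spelling of the calls; the toy model, the config keys, and the output directory are illustrative assumptions:

```python
import numpy as np
import hls4ml
from tensorflow import keras

# Toy stand-in for a trained model.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation='relu'),
])

config = hls4ml.utils.config_from_keras_model(model, granularity='name')
# Tracing per-layer output generally has to be enabled in the config first;
# the key shown here is the present-day spelling.
for layer in config['LayerName']:
    config['LayerName'][layer]['Trace'] = True

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='api_prj'  # illustrative dir
)
hls_model.compile()                # build the C simulation library

X = np.random.rand(100, 4).astype(np.float32)
y_hls = hls_model.predict(X)       # 'C Simulation' from Python
y_hls, trace = hls_model.trace(X)  # per-layer outputs for debugging
report = hls_model.build()         # write the HLS project and run synthesis
```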
v0.2.0
Released on 2020-03-31 - GitHub - PyPI
What's new:
- `tf_to_hls`: convert tensorflow protobuf (`.pb`) models to HLS projects
- Support for Keras model `.h5` files (extending existing support for `.json` architecture + `.h5` weights format)
- Support larger Conv1D / 2D layers
- Support for binary and ternary layers from QKeras
- API enhancements for addition of custom layer and new backends
- Keras and HLS model profiling tool (see the example after this list)
- `hls4ml report` command to gather HLS build reports
- `hls4ml build -l` command to run logic synthesis
- Fused Batch Normalization and Dense layer optimization pass
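A sketch of invoking the profiling tool; the module path and keyword names reflect the current package layout, not necessarily v0.2.0's, and the toy model and output directory are illustrative assumptions:

```python
import numpy as np
import hls4ml
from hls4ml.model.profiling import numerical
from tensorflow import keras

# Toy stand-in for a trained model.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation='relu'),
])

config = hls4ml.utils.config_from_keras_model(model, granularity='name')
hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='prof_prj'  # illustrative dir
)

X = np.random.rand(100, 4).astype(np.float32)
# Plots weight/activation distributions against the chosen fixed-point types.
figures = numerical(model=model, hls_model=hls_model, X=X)
```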
v0.1.6
Released on 2020-02-10 - GitHub - PyPI
- Support for larger Dense layers (enabled with `Strategy: Resource` in the configuration file)
in the configuration file) - Binary/Ternary NN refinements
- Built-in optimization framework
- Optional C/RTL validation
v0.1.5
v0.1.2
Released on 2018-03-20 - GitHub - PyPI
Update license
v0.1.1
Released on 2018-03-16 - GitHub - PyPI
Second beta version: fixed README.