Releases: lanl/bml
v2.4.0
What's Changed
- Pull devcontainer from Docker Hub by @nicolasbock in #709
- Set bash shell for devcontainer user by @nicolasbock in #710
- Fix CMake warning by @nicolasbock in #711
- Add some default extensions to devcontainer by @nicolasbock in #712
- Update Conda badges by @nicolasbock in #713
- Add missing sync calls after magma functions by @jeanlucf22 in #714
- Rocsparse add targeting OLCF Frontier by @mewall in #716
- Add script to install rocsparse headers on OLCF machines by @mewall in #717
- Fix bug in indent.sh script by @mewall in #722
- Fix OMP offload build memory leaks by @mewall in #720
- Fix progress rocsparse tests by @mewall in #723
- Remove support for ellsort format by @jeanlucf22 in #724
- Add doxygen package for API builds by @nicolasbock in #725
- Update API documentation by @nicolasbock in #726
- Updated transpose operations to support non-symmetric matrix structure. by @oseikuffuor1 in #727
- Implement noinit allocation for dense and distributed2d by @jeanlucf22 in #728
- Bug fixes for norm calculation. by @oseikuffuor1 in #729
- Remove unused variables, duplicate code by @jeanlucf22 in #730
- Reorganize "domain" code by @jeanlucf22 in #732
- Separate real offload diagonalize routines. by @jmohdyusof in #733
Full Changelog: v2.3.0...v2.4.0
Release v2.3.0
What's Changed
- Also lint the shell scripts by @nicolasbock in #528
- Fix element_multiply with magma by @jeanlucf22 in #581
- Fix bcast function for dense format with magma by @jeanlucf22 in #586
- Update bml_allocate.c by @tokshgithub in #376
- Clean up CI tests by @nicolasbock in #587
- Add example for assembly generation by @nicolasbock in #588
- Simplify BLAS/LAPACK find in cmake by @jeanlucf22 in #583
- Add scripts for MAGMA builds on spock and crusher by @mewall in #585
- Add `BLA_VENDOR` to build.sh by @nicolasbock in #589
- Fix which line is highlighted by @nicolasbock in #590
- Add key by @nicolasbock in #591
- Fix MAGMA case for write/read distributed2d by @jeanlucf22 in #593
- Remove unnecessary build time checks by @jeanlucf22 in #594
- Fix build to use ScaLapack by @jeanlucf22 in #595
- Implement rocSOLVER dsyevd for MAGMA build by @mewall in #597
- Add build script for summit at OLCF by @jeanlucf22 in #596
- Clean up build script with magma on Summit by @jeanlucf22 in #600
- Build scripts for crusher offload by @mewall in #601
- Clean up c-tests by @jeanlucf22 in #606
- Enable use of IBM XL compilers with openMP Offload support by @oseikuffuor1 in #605
- Add simple test to verify OpenMP offload by @jeanlucf22 in #607
- Added cusparse functionalities for ELLPACK. by @oseikuffuor1 in #609
- Add file to run CI on GPU at OLCF by @jeanlucf22 in #611
- Only check `push` events on `master` branch by @nicolasbock in #614
- Intel offload by @jmohdyusof in #610
- Add pre-commit hook script by @nicolasbock in #613
- Upgrade CI to llvm-12 by @nicolasbock in #615
- Upgrade `stale` action to latest release by @nicolasbock in #616
- Enable CI testing on OS X by @nicolasbock in #172
- Add `BML_VERSION` by @nicolasbock in #622
- Switch to OpenBLAS on crusher to avoid fortran errors by @mewall in #624
- Add sanity check for `build.sh` options by @nicolasbock in #620
- Minor revisions to cuSPARSE code and offload flags by @mewall in #626
- Add `indented` files to `gitignore` by @nicolasbock in #629
- Add build script for GPU offload on Summit by @jeanlucf22 in #630
- Fix offload bug in bml_set_diagonal_ellpack() by @mewall in #631
- Add `less` to CI container by @nicolasbock in #636
- Strengthen test get_set_diagonal by @jeanlucf22 in #625
- Update bml_add_dense_typed.c by @cnegre in #638
- Offload mods for crusher development by @mewall in #633
- Turn complex code off when BML_COMPLEX off by @jeanlucf22 in #639
- Fix no-complex build with MPI by @jeanlucf22 in #641
- Fail at build time if magma asked for but not found by @jeanlucf22 in #643
- Lessen cmake requirements for non-GPU build by @jeanlucf22 in #640
- Add rocSPARSE method for ellpack multiply by @mewall in #645
- Store result of failed lint job by @nicolasbock in #648
- Fix no-complex build for bml_elemental code by @jeanlucf22 in #649
- Support latest magma on crusher by @mewall in #650
- Updated deprecated Dockerfile commands by @nicolasbock in #655
- Replace `_Complex` with `complex` by @nicolasbock in #651
- Test with OpenBLAS as an alternative by @nicolasbock in #656
- Add missing include statement by @nicolasbock in #658
- Optimize openmp offload for rocsparse by @mewall in #657
- Look for Lapack functions in OpenBLAS by @jeanlucf22 in #654
- Specify magma version to use for CI on Ascent@OLCF by @jeanlucf22 in #659
- Remove check for fabs by cmake by @jeanlucf22 in #661
- Change macro DO_MPI to BML_USE_MPI by @jeanlucf22 in #666
- Couple with ELPA by @jeanlucf22 in #665
- Fix some compiler warnings related to calloc by @jeanlucf22 in #670
- Remove OpenCL from CMakeLists.txt by @jeanlucf22 in #673
- Fix functions/values used for multiply x2 test by @jeanlucf22 in #669
- Fix accumulate_offdiag for distributed2d GPU by @jeanlucf22 in #675
- Simplify offload code using BML_OFFLOAD_CHUNKS macro by @mewall in #663
- Added cusparse capability for bml_transpose_ellpack. by @oseikuffuor1 in #674
- Fix tolerance in matrix norm test by @jeanlucf22 in #676
- Fix conflict with ELPA macro by @jeanlucf22 in #677
- Add MPI to GPU testing by @jeanlucf22 in #678
- Clean up OpenMP flags setup in CMakeLists.txt by @jeanlucf22 in #682
- Modify macro to please clang compiler by @jeanlucf22 in #680
- Fix link line in bml.pc by @jeanlucf22 in #679
- Fix cmake ROCm dependencies on OLCF crusher by @mewall in #683
- Move CMake package definitions by @nicolasbock in #685
- Update container workflow by @nicolasbock in #686
- Renamed default image to Bionic by @nicolasbock in #687
- Add Docker Hub badge by @nicolasbock in #689
- Adjust tolerance float matrix multiply tests by @jeanlucf22 in #690
- Add new function to print version number by @jeanlucf22 in #691
- Expose `bml_print_version` to Fortran by @nicolasbock in #694
- Print diff if linter failure by @nicolasbock in #696
- Fix rocsparse multiply using rocm 5.3 by @mewall in #695
- Debug bml_add_ellpack and Fortran build on crusher by @mewall in #700
- Dense offload by @jmohdyusof in #701
- Add build time option of using syevd by @jeanlucf22 in #702
- Split function bml_diagonalize_dense_single/double_real by @jeanlucf22 in #704
- Upgrade base image to Focal by @nicolasbock in #688
- Minor fix to enable debug build w magma by @jeanlucf22 in #705
- Fix memory management bug in rocsparse multiply by @mewall in #706
- Add devcontainer definition by @nicolasbock in #707
- Add local user to devcontainer by @nicolasbock in #708
- Version bump to v.2.3.0 by @nicolasbock in #699
Full Changelog: v2.2.0...v2.3.0
Release v2.2.0
Feature updates and bug fixes
What's Changed
- Fix dir local variables by @nicolasbock in #529
- Modifications to support additional offload builds by @mewall in #527
- Enable override of variables for tests by @nicolasbock in #532
- added sparse matrix example by @finkeljos in #534
- Fix copy and setters by @oseikuffuor1 in #536
- Fix paths in tag update script by @nicolasbock in #537
- Modification to enable Cray fortran build by @mewall in #538
- Fix Summit build script by @jeanlucf22 in #540
- Change the way MPI datatype is built for dense case by @jeanlucf22 in #543
- Add CUDA libraries when linking if CUDA is found by cmake by @mewall in #544
- set_row distributed2d by @jeanlucf22 in #548
- Bug fix in transpose for distributed2d by @jeanlucf22 in #549
- Add CUDA libraries when linking if CUDA is found by cmake by @mewall in #545
- Split container jobs into separate jobs by @nicolasbock in #550
- Add Conda badge by @nicolasbock in #552
- Update webpage with new badge by @nicolasbock in #553
- Add bml_get_deep_type function by @jeanlucf22 in #551
- Bug fix and add test for fixed function by @jeanlucf22 in #554
- Add functionalities for distributed2d gershgorin by @jeanlucf22 in #555
- Switch to non-blocking MPI recv in multiplication by @jeanlucf22 in #556
- Use irecv in distributed transpose by @jeanlucf22 in #558
- Add get_sparsity function to distributed2d by @jeanlucf22 in #559
- Fix issues with ScaLapack call by @jeanlucf22 in #560
- Enable testing Fortran interface with MPI by @jeanlucf22 in #561
- Minor fixes in outputs for MPI runs by @jeanlucf22 in #563
- Fix matrix size for fortran tests by @jeanlucf22 in #564
- Add workflow to check for stale issues and PRs by @nicolasbock in #565
- Fix permission issue with stale workflow by @nicolasbock in #567
- Bug fix in datatype for normalize_distributed2d by @jeanlucf22 in #569
- Element-wise matrix multiplication and corresponding tests are added by @ares201005 in #562
- Automate API doc building by @nicolasbock in #435
- Ignore local worktrees by @nicolasbock in #573
- Fixup the style of the rst documentation by @nicolasbock in #574
- Add RTD badge by @nicolasbock in #575
- Fix for `sphinx<2` (as used on RTD) by @nicolasbock in #576
- Fix readthedocs build by @nicolasbock in #577
- Make list style consistent by @nicolasbock in #578
- Upgrade pinned package versions by @nicolasbock in #579
- Shorten paths in `__FILE__` expansion by @nicolasbock in #370
New Contributors
- @finkeljos made their first contribution in #534
Full Changelog: v2.1.1...v2.2.0
Release v2.1.2
Maintenance release
Release v2.1.1
Maintenance release
- Fix `install` verb of `build.sh` script
Release v2.1.0
Release Notes
- Change usage of queues in MAGMA
- Add MKL GPU support for the dense format
- Implement write/read functions for distributed2d
- Support OpenMP GPU offload on iris nodes at ANL
- Accelerated GPU offload bml_add_ellpack and bml_multiply_ellpack
- Implement diagonalization, norm for distributed2d
- Add trace, transpose functions for distributed2d
- Initial implementation of Cannon's algorithm
- Implement first distributed functionalities
Release v2.0.1
Maintenance release
Release v2.0.0
Release v1.3.1
Bugfix release. The bml is now tested on gcc-9.
Release v1.3.0
Some bug fixes and improvements.