Releases: davidsd/sdpb
3.0.0
Optimizations:
Implemented a new distributed matrix multiplication algorithm for calculating the Q matrix.
It employs MPI shared windows, the Chinese Remainder Theorem, and the FLINT and BLAS libraries.
Our benchmarks for large SDPB problems showed an overall program speedup of roughly 2.5x or more, and much better performance scaling with an increasing number of CPUs and nodes, compared to the 2.7.0 release.
In addition, improved RAM usage in the new algorithm makes it possible to solve even larger problems, where the Q matrix does not fit into the memory of a single node.
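The idea behind combining the Chinese Remainder Theorem with ordinary matrix products can be illustrated with a small Python sketch. It is not SDPB's implementation: it assumes Q has the form P^T P (as the "syrk" in #209 suggests), uses plain Python integers where SDPB delegates to BLAS, and the names MODULI, syrk_mod, crt and syrk_crt are hypothetical. Each entry of Q is computed modulo several pairwise coprime moduli and then reconstructed exactly, which works as long as twice the largest absolute entry is smaller than the product of the moduli.

```python
from math import prod

# Distinct Mersenne primes, hence pairwise coprime (illustrative choice only).
MODULI = [2**31 - 1, 2**61 - 1, 2**89 - 1]


def syrk_mod(P, m):
    """Return P^T P with every entry reduced modulo m."""
    rows, cols = len(P), len(P[0])
    R = [[x % m for x in row] for row in P]
    return [
        [sum(R[k][i] * R[k][j] for k in range(rows)) % m for j in range(cols)]
        for i in range(cols)
    ]


def crt(residues, moduli):
    """Combine residues into the unique value in [0, prod(moduli))."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m (Python 3.8+).
        x += r * Mi * pow(Mi, -1, m)
    return x % M


def syrk_crt(P, moduli=MODULI):
    """Compute P^T P exactly from independent per-modulus products."""
    M = prod(moduli)
    per_modulus = [syrk_mod(P, m) for m in moduli]
    n = len(P[0])
    Q = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            v = crt([Qm[i][j] for Qm in per_modulus], moduli)
            # Map to the symmetric range so that negative entries are recovered.
            Q[i][j] = v - M if v > M // 2 else v
    return Q
```

Because the per-modulus products are independent of each other, they can be computed in parallel with fixed-size arithmetic, and only the final reconstruction needs big integers.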
New features:
- New option --maxSharedMemory, which limits memory usage in the new matrix multiplication algorithm. If the limit is not set, it is calculated automatically. See #209, #229.
- Added new --verbosity trace level, see #230.
Other improvements:
- Print iterations data and condition numbers to iterations.json in the output folder. See #228, #231, #232, #233.
- In debug mode, write profiling data for both timing run and actual run, see #215.
- In debug mode, print maximal memory usage #231.
- Graceful exit on SIGTERM, see #208.
- Build multiplatform Docker image (AMD64+ARM64) on CircleCI, see #225.
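A graceful exit on SIGTERM, as in #208, is commonly built from a signal handler that only sets a flag, which the main loop checks between iterations. The Python sketch below shows that general pattern under this assumption; it is not SDPB's C++ code, and GracefulExit and run_iterations are hypothetical names.

```python
import signal


class GracefulExit:
    """Records that SIGTERM was received instead of terminating immediately."""

    def __init__(self):
        self.requested = False
        signal.signal(signal.SIGTERM, self._handle)

    def _handle(self, signum, frame):
        # Keep the handler minimal: just remember the request.
        self.requested = True


def run_iterations(max_iterations, stop, step=lambda i: None):
    """Run up to max_iterations steps, stopping cleanly once SIGTERM is seen."""
    completed = 0
    for i in range(max_iterations):
        if stop.requested:
            break  # finish between iterations, so state stays consistent
        step(i)
        completed += 1
    return completed
```

Checking the flag only between iterations means the current step always finishes, so results or checkpoints can be written before the process exits.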
New dependencies:
What's Changed
- Fix #201 Graceful exit on SIGTERM + cosmetic fixes by @vasdommes in #208
- Fix #207 bigint-syrk-blas: account for MPI shared memory limits (by splitting shared windows) by @vasdommes in #209
- Calculate Q matrix using Chinese Remainder Theorem, FLINT and BLAS libraries by @vasdommes in #142
- Fix #211 Optimize reduce-scatter for Q matrix by @vasdommes in #212
- Fix #175 Debug mode: write profiling for actual run + misc improvements by @vasdommes in #215
- Misc improvements: shared window warnings, openblas->cblas in waf configure, update dependencies in Dockerfile, add non-HPD to docs by @vasdommes in #217
- Fix printing window size messages: input and output window were mixed up by @vasdommes in #218
- Fix #219 Updating block_timings leads to checkpoint loading errors by @vasdommes in #220
- Minor memory-related fixes and improvements: pretty-print bytes, by @vasdommes in #221
- Build multiplatform Docker image (AMD64+ARM64) on CircleCI by @vasdommes in #225
- Minor fixes and improvements by @vasdommes in #227
- Print SDPB iterations to out/iterations.json, compute condition numbers for each step by @vasdommes in #228
- Fix #206 Determine --maxSharedMemory automatically, if not set by user. by @vasdommes in #229
- Add --verbosity=trace level by @vasdommes in #230
- Minor improvements: full precision for iterations.json, print max MemUsed, improve test output by @vasdommes in #231
- Minor fixes: fix compilation for Boost 1.81, fix precision for iterations.json by @vasdommes in #232
- Compute R-err and print it to iterations.json by @vasdommes in #233
- Minor fixes: UB in compute_block_grid_mapping(), tests for iterations.json, compilation by @vasdommes in #237
- Update docs for FLINT + minor configuration changes by @vasdommes in #239
- Update docs to 3.0.0 by @vasdommes in #240
Full Changelog: 2.7.0...3.0.0
2.7.0
What's new
Breaking changes:
- New executable pmp2sdp instead of separate sdp2input and pvm2sdp. Reads Polynomial Matrix Programs in JSON, Mathematica, or XML formats. We recommend using JSON since it is more compact, efficient and universal than Mathematica or XML.
See #181, #184.
Example:
pmp2sdp --precision 768 --input in/pmp.json --output out/sdp --verbosity regular
By default, pmp2sdp writes the SDP to a plain directory (as in 2.4.0 and earlier). Run it with the --zip flag to write to a zip archive instead (as in 2.5.0 - 2.6.1).
Note that writing to zip can be slow for large problems, see plots in #176.
- sdp2input and pvm2sdp are deprecated and will be removed in future versions.
- sdpb option --procsPerNode is deprecated and will be removed in future versions. Now MPI configuration is determined automatically. See #169.
- Mathematica script SDPB.m now writes PMP in JSON format instead of XML. Use the WritePmpJson (or its alias WriteBootstrapSDP) function. WritePmpXml is left for backward compatibility.
NB: you should also change the PMP file extension from .xml to .json in your Mathematica scripts.
Using JSON instead of XML results in a roughly 2x speedup when running the Bootstrap2dExample.m script. See #199.
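For scripted runs, the pmp2sdp invocation from the breaking changes above can be assembled programmatically. The following is a minimal sketch that uses only the flags mentioned in these notes (--precision, --input, --output, --verbosity, --zip); the helper name pmp2sdp_command and its parameters are hypothetical and not part of SDPB.

```python
import shlex


def pmp2sdp_command(input_path, output_path, precision,
                    verbosity="regular", zip_output=False):
    """Build the argument list for a pmp2sdp run (hypothetical helper)."""
    cmd = [
        "pmp2sdp",
        "--precision", str(precision),
        "--input", str(input_path),
        "--output", str(output_path),
        "--verbosity", verbosity,
    ]
    if zip_output:
        # Plain directory output is the default; --zip switches to a zip archive.
        cmd.append("--zip")
    return cmd


if __name__ == "__main__":
    # Print the command as a shell line for inspection.
    print(shlex.join(pmp2sdp_command("in/pmp.json", "out/sdp", 768)))
```

The returned list can be passed to subprocess.run(cmd, check=True); how the executable is launched on a given cluster is left to the caller.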
New features:
- Added sdpb option --writeSolution=z. Restores the vector z (see SDPB Manual, eq. 3.1) from the vector y (eq. 2.2) and sdp/normalization.json, and writes it to out/z.txt. See #191.
- Added new field "block_path" to spectrum JSON output. For each block, "block_path" contains its path from the PMP input. See #190.
- Added optional fields to pmp.json format: "samplePoints", "sampleScalings" and "bilinearBasis", same as in the old XML format. See #199.
Other improvements:
- Optimized IO and RAM usage when reading big SDP blocks, see #167.
- Optimized reading and writing for pmp2sdp, giving an order-of-magnitude speedup. See #176 and #181.
- Fix undefined behaviour (sometimes leading to SEGFAULT) when reading PMP with a prefactor containing duplicate poles. See #195.
- Fix incorrect block mapping for cyclic MPI job distribution across nodes, see #170.
- Print number of MPI processes and nodes, print SDP dimensions and number of blocks. See #169.
- Print source code location and stacktrace for exceptions, see #180 and #198.
- Rewrote Section 3 of SDPB Manual, see #199.
- Internal improvements: new macros RUNTIME_ERROR, ASSERT, ASSERT_EQUAL and DEBUG_STRING, new JSON parser, better GMP<->MPFR conversion, fix compiler warnings, refactor integration tests. See #180, #189, #196, #197.
List of merged PRs
- Fix #164 Excessive IO and RAM usage when reading big SDP blocks by @vasdommes in #167
- Fix #141 Determine --procsPerNode automatically, reuse shared memory communicator, improve output by @vasdommes in #169
- Fix #166 Incorrect block mapping for cyclic MPI job distribution across nodes by @vasdommes in #170
- EPIC sdp2input/pvm2sdp refactoring + parallel reading optimization by @vasdommes in #176
- Print file and line number in exception, add ASSERT() and RUNTIME_ERROR() macros by @vasdommes in #180
- New pmp2sdp executable instead of separate sdp2input and pvm2sdp by @vasdommes in #181
- Fix #183 pmp2sdp options: --zip flag instead of --zip=(false|true) option by @vasdommes in #184
- Internal improvements: ASSERT_EQUAL/DEBUG_STRING macros, new JSON parser, better GMP<->MPFR conversion by @vasdommes in #189
- Fix #187 spectrum: write input file path for each block in the output JSON file by @vasdommes in #190
- Fix #185 Add --writeSolution=z option to SDPB, write to output z.txt by @vasdommes in #191
- Fix #194 [pmp2sdp with duplicate poles] ERROR: AddressSanitizer: stack-buffer-overflow in operator_plus_set_Derivative_Term.cxx by @vasdommes in #195
- Fix compilation for gcc 9.2.0 + different warnings by @vasdommes in #196
- Refactor integration tests by @vasdommes in #197
- Universal JSON format for Polynomial Matrix Program + write JSON from SDPB.m + update SDPB Manual by @vasdommes in #199
- Fix #115 Print exception stacktrace when SDPB fails by @vasdommes in #198
- Update docs for 2.7.0 + fix compilation on Imperial HPC by @vasdommes in #200
Full Changelog: 2.6.1...2.7.0
2.6.1
What's new
- Fixed major memory leak in binary SDP deserialization. This may significantly reduce RAM usage compared to 2.6.0. See #160.
- Improved debug output: print MemUsed and MemTotal for each node, print max MemUsed at the end, disable proc/self/statm. See #161.
List of merged PRs
- Move common headers from src/ to src/sdpb_util/ or other projects by @vasdommes in #157
- Fix #159 Memory leak in binary SDP deserialization by @vasdommes in #160
- Improve debug mode output: print MemUsed and MemTotal for each node, disable proc/self/statm by @vasdommes in #161
- Update docs to 2.6.1 by @vasdommes in #163
Full Changelog: 2.6.0...2.6.1
2.6.0
Version 2.6.0
What's new
sdpb
- New INCOMPATIBLE format for sdp.zip. Instead of a single block_XXX.json file, we now use block_info_XXX.json and block_data_XXX.json (or block_data_XXX.bin), see SDPB_input_format.md. See #114.
- Compact binary format instead of JSON for block_data_XXX in sdp.zip. When generating sdp.zip with sdp2input or pvm2sdp, you can use an optional command-line argument to choose between binary and JSON formats:
sdp2input --outputFormat FORMAT
pvm2sdp FORMAT PRECISION INPUT... OUTPUT
where FORMAT is either bin (used by default) or json.
We recommend using the default binary format since it is more compact and efficient than json. See #114, #128, #119, #149.
Tests and CI/CD
- Tests are reorganized and rewritten from shell scripts to C++ Catch2 framework. Added unit tests and realistic end-to-end tests. See #91, #102, #109, #119.
- CI/CD pipelines on CircleCI. Tests are run for each git push to the repo and for each pull request. Docker images for the master branch and for each release starting from 2.6.0 are uploaded to Docker Hub automatically. See #133, #136.
Installations
- Updated installations for BU, Caltech, Expanse, Harvard, and Imperial College clusters, added example scripts.
- Added Apple MacBook installation instructions.
- Building Docker images and using them in Docker/Podman/Singularity is now much easier.
- New lightweight Docker image does not contain scalar_blocks and blocks_3d anymore. Separate images for sdpb, scalar_blocks and blocks_3d are now uploaded to https://hub.docker.com/u/bootstrapcollaboration. See #130.
- Switched from C++14 to C++17. See #118, #121.
List of merged PRs
- Notes on current EPFL installation by @jmarucha in #69
- fix header references for boost 1.79.0 by @jmarucha in #70
- Print memory usage by @vasdommes in #83
- Cosmetic improvements by @vasdommes in #84
- Fix #81 Block_Info::read_block_costs(): use more realistic RAM estimate by @vasdommes in #82
- Organize tests by @vasdommes in #91
- Minor fixes by @vasdommes in #95
- Minor fixes: fix relative paths in .nsv + some cosmetics by @vasdommes in #101
- Rewrite integration tests from shell scripts to C++ Catch2 framework by @vasdommes in #102
- Realistic end-to-end tests for sdp2input + sdpb by @vasdommes in #109
- Integration tests improvements by @vasdommes in #112
- Fix #79 Binary input for sdpb by @vasdommes in #114
- Minor fixes: use cxx17, improve unit_tests output, optimize SDP block read by @vasdommes in #118
- Test and benchmark for block_data SDP serialization (JSON/binary) by @vasdommes in #119
- Fix #120 Replace boost::filesystem with std::filesystem (C++17) by @vasdommes in #121
- Fix #124 parse empty bilinear bases by @vasdommes in #126
- Various installation fixes and improvements by @vasdommes in #125
- Binary sdp: enable by default and add option to pvm2sdp by @vasdommes in #128
- New Dockerfile by @vasdommes in #130
- Fix #131 Simple CI: automatically run tests for each pull request by @vasdommes in #133
- Add badges to Readme.md: License, Release (latest), CircleCI build status, open issues by @vasdommes in #135
- CircleCI: add deploy-master and deploy-tag jobs to the workflow. by @vasdommes in #136
- Minor improvements: Docker, CircleCI, tests, .gitignore, .clang-format by @vasdommes in #139
- Added version info while calling --help by @bharathr98 in #144
- Fix #99 Push/pop Timer name prefixes by @vasdommes in #146
- Fix #148 Binary SDP format is larger than json format if blocks contain many zeros by @vasdommes in #149
- Minor fixes: more timers and output messages, lightweight git tags, use total time for --maxRuntime by @vasdommes in #151
- Update docs for 2.6.0 by @vasdommes in #138
- Minor improvements: new spectrum test, updated Macbook instructions by @vasdommes in #154
New Contributors
- @jmarucha made their first contribution in #69
- @vasdommes made their first contribution in #83
- @bharathr98 made their first contribution in #144
Full Changelog: 2.5.1...2.6.0
2.5.1
Full Changelog: 2.5.0...2.5.1
Version 2.5.1
outer_limits
- Treat the behavior near zero with special care. This means that input files now need to specify epsilon_value in addition to infinity_value.
- Added the --meshThreshold option to customize how finely the mesh is divided when searching for negative regions.
- Added the --useSVD option to control whether to regularize the problem with an SVD.
- Added the sdp2functions and pvm2functions programs to convert JSON and XML input files to the format that outer_limits expects.
- Fix an uninitialized matrix bug.
- Increase the maximum number of allowed SVD iterations because we are working at high precision.
spectrum
- Initial release.