Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Big refactoring #30

Merged
merged 9 commits into from
Jun 18, 2020
Merged

Big refactoring #30

merged 9 commits into from
Jun 18, 2020

Conversation

cuviper
Copy link
Collaborator

@cuviper cuviper commented Jun 12, 2020

Sorry for a mega-PR, but I've been tinkering with upgrading dependencies
and fixing the big-endian problems. In trying to recoup performance, all
of these changes got intertwined. At a high level:

  • Bump all crates to 0.3 to account for breaking changes.
  • Upgrade to smallvec 1 -- raises the minimum Rust to 1.36.
  • Various dev-dependencies are upgraded as well.
  • primal_bit::BitVec now stores a Vec<u8>, and no longer exposes
    methods that depend on endianness. Fixes primal fails on big-endian machines #10.
    • A lot of the wheel operations were byte-oriented already, so they
      really depended on little-endian order before. With Vec<u8>, even
      big-endian targets have the expected byte order to find bits.
  • Iterators specialize methods like fold and all for performance.

We should probably update to 2018 edition too, but I didn't bother yet.

Sorry for a mega-PR, but I've been tinkering with upgrading dependencies
and fixing the big-endian problems. In trying to recoup performance, all
of these changes got intertwined. At a high level:

- Bump all crates to 0.3 to account for breaking changes.
- Upgrade to smallvec 1 -- raises the minimum Rust to 1.36.
- Various dev-dependencies are upgraded as well.
- `primal_bit::BitVec` now stores a `Vec<u8>`, and no longer exposes
  methods that depend on endianness. Fixes huonw#10.
  - A lot of the wheel operations were byte-oriented already, so they
    really depended on little-endian order before. With `Vec<u8>`, even
    big-endian targets have the expected byte order to find bits.
- Iterators specialize methods like `fold` and `all` for performance.

We should probably update to 2018 edition too, but I didn't bother yet.
@autohuonw
Copy link
Collaborator

r? @huonw

(rust_highfive has picked a reviewer for you, use r? to override)

@cuviper
Copy link
Collaborator Author

cuviper commented Jun 12, 2020

I used critcmp to compare criterion results with the master branch. It's a little weird in that instead of just calculating from old to new, it holds the fastest one as 1.00 on each benchmark (in bold green, but the color's not showing here), and reports the ratio >1 for the other.

primal benchmarks
group                                 master                                 refactor
-----                                 ------                                 --------
is_prime/Sieve::is_prime              1.29     42.8±1.06µs        ? B/sec    1.00     33.3±0.68µs        ? B/sec
is_prime/Sieve::is_prime with init    1.08    174.3±1.62µs        ? B/sec    1.00    160.9±0.22µs        ? B/sec
is_prime/is_prime                     1.00    640.3±0.82µs        ? B/sec    1.00    643.0±4.22µs        ? B/sec
primal-sieve benchmarks
group                                master                                 refactor
-----                                ------                                 --------
factor/Sieve/1048573                 1.02    792.2±3.80ns        ? B/sec    1.00    777.2±1.62ns        ? B/sec
factor/Sieve/131                     1.09     42.3±1.26ns        ? B/sec    1.00     38.9±0.22ns        ? B/sec
factor/Sieve/65521                   1.02    249.3±2.89ns        ? B/sec    1.00    245.4±0.54ns        ? B/sec
factor/Sieve/7561                    1.05    114.0±1.70ns        ? B/sec    1.00    108.8±0.46ns        ? B/sec
factor/Sieve/9699690                 1.44    133.5±1.54ns        ? B/sec    1.00     92.8±0.64ns        ? B/sec
iterate/Primes/100                   1.01    135.1±0.45µs        ? B/sec    1.00    133.1±0.07µs        ? B/sec
iterate/Primes/10000                 1.02    140.1±3.22µs        ? B/sec    1.00    137.4±0.07µs        ? B/sec
iterate/Primes/100000                1.00    165.5±1.05µs        ? B/sec    1.02    168.5±0.44µs        ? B/sec
iterate/Primes/1000000               1.00    481.4±3.70µs        ? B/sec    1.03    495.1±2.40µs        ? B/sec
iterate/Primes/10000000              1.00      3.2±0.04ms        ? B/sec    1.07      3.5±0.02ms        ? B/sec
iterate/Sieve with init/100          1.29    123.0±1.32ns        ? B/sec    1.00     95.6±0.61ns        ? B/sec
iterate/Sieve with init/10000        1.73      3.8±0.02µs        ? B/sec    1.00      2.2±0.02µs        ? B/sec
iterate/Sieve with init/100000       1.49     35.8±0.28µs        ? B/sec    1.00     24.0±0.09µs        ? B/sec
iterate/Sieve with init/1000000      1.36    385.5±1.93µs        ? B/sec    1.00    283.4±2.05µs        ? B/sec
iterate/Sieve with init/10000000     1.34      3.0±0.04ms        ? B/sec    1.00      2.2±0.00ms        ? B/sec
iterate/Sieve/100                    1.66     89.7±1.66ns        ? B/sec    1.00     54.1±0.14ns        ? B/sec
iterate/Sieve/10000                  1.83      3.7±0.05µs        ? B/sec    1.00      2.0±0.01µs        ? B/sec
iterate/Sieve/100000                 1.76     30.1±0.33µs        ? B/sec    1.00     17.2±0.04µs        ? B/sec
iterate/Sieve/1000000                1.64    248.1±2.26µs        ? B/sec    1.00    151.0±1.02µs        ? B/sec
iterate/Sieve/10000000               1.58      2.1±0.02ms        ? B/sec    1.00   1342.1±6.61µs        ? B/sec
new/Sieve/100                        1.03     38.6±1.49ns        ? B/sec    1.00     37.5±0.24ns        ? B/sec
new/Sieve/10000                      1.01    131.4±0.36ns        ? B/sec    1.00    130.2±2.60ns        ? B/sec
new/Sieve/100000                     1.00      6.3±0.01µs        ? B/sec    1.02      6.5±0.04µs        ? B/sec
new/Sieve/1000000                    1.00    129.1±0.72µs        ? B/sec    1.01    130.3±1.03µs        ? B/sec
new/Sieve/10000000                   1.01    860.6±6.44µs        ? B/sec    1.00    851.2±4.66µs        ? B/sec
nth_prime/Primes/100                 1.01    134.8±0.67µs        ? B/sec    1.00    134.0±0.88µs        ? B/sec
nth_prime/Primes/10000               1.00    166.9±1.28µs        ? B/sec    1.04    172.9±0.60µs        ? B/sec
nth_prime/Primes/100000              1.00    554.8±4.39µs        ? B/sec    1.05    581.2±1.97µs        ? B/sec
nth_prime/Primes/1000000             1.00      4.8±0.05ms        ? B/sec    1.07      5.2±0.02ms        ? B/sec
nth_prime/Sieve with init/100        1.00    110.0±1.16ns        ? B/sec    1.04    114.2±0.75ns        ? B/sec
nth_prime/Sieve with init/10000      1.01      7.8±0.23µs        ? B/sec    1.00      7.7±0.02µs        ? B/sec
nth_prime/Sieve with init/100000     1.02    134.5±4.45µs        ? B/sec    1.00    132.3±0.19µs        ? B/sec
nth_prime/Sieve with init/1000000    1.00  1279.1±44.83µs        ? B/sec    1.00   1283.7±3.81µs        ? B/sec
nth_prime/Sieve/100                  1.03     28.8±0.27ns        ? B/sec    1.00     27.9±0.25ns        ? B/sec
nth_prime/Sieve/10000                1.00    873.9±5.57ns        ? B/sec    1.01    884.9±2.75ns        ? B/sec
nth_prime/Sieve/100000               1.00      2.1±0.02µs        ? B/sec    1.02      2.1±0.01µs        ? B/sec
nth_prime/Sieve/1000000              1.00      2.2±0.01µs        ? B/sec    1.01      2.2±0.02µs        ? B/sec
nth_prime/StreamingSieve/100         1.01    359.4±3.39ns        ? B/sec    1.00    355.1±1.31ns        ? B/sec
nth_prime/StreamingSieve/10000       1.00      7.6±0.04µs        ? B/sec    1.00      7.6±0.11µs        ? B/sec
nth_prime/StreamingSieve/100000      1.03    134.3±0.83µs        ? B/sec    1.00    130.8±0.11µs        ? B/sec
nth_prime/StreamingSieve/1000000     1.01  1261.4±53.15µs        ? B/sec    1.00   1249.9±3.99µs        ? B/sec
prime_pi/Primes/100                  1.01    136.5±1.71µs        ? B/sec    1.00    134.9±1.05µs        ? B/sec
prime_pi/Primes/10000                1.04    142.5±1.89µs        ? B/sec    1.00    137.6±0.52µs        ? B/sec
prime_pi/Primes/100000               1.00    165.1±1.02µs        ? B/sec    1.02    168.7±0.58µs        ? B/sec
prime_pi/Primes/1000000              1.00   473.8±10.18µs        ? B/sec    1.04    492.9±1.84µs        ? B/sec
prime_pi/Primes/10000000             1.00      3.2±0.06ms        ? B/sec    1.10      3.5±0.01ms        ? B/sec
prime_pi/Sieve with init/100         1.09     56.9±0.41ns        ? B/sec    1.00     52.3±1.09ns        ? B/sec
prime_pi/Sieve with init/10000       1.05    234.2±2.27ns        ? B/sec    1.00    222.5±2.08ns        ? B/sec
prime_pi/Sieve with init/100000      1.01      6.9±0.25µs        ? B/sec    1.00      6.9±0.02µs        ? B/sec
prime_pi/Sieve with init/1000000     1.00    130.2±1.79µs        ? B/sec    1.01    131.1±0.29µs        ? B/sec
prime_pi/Sieve with init/10000000    1.00    835.2±6.62µs        ? B/sec    1.00    836.5±5.56µs        ? B/sec
prime_pi/Sieve/100                   1.08     15.3±0.45ns        ? B/sec    1.00     14.1±0.06ns        ? B/sec
prime_pi/Sieve/10000                 1.02    102.2±0.53ns        ? B/sec    1.00    100.6±0.41ns        ? B/sec
prime_pi/Sieve/100000                1.11    351.1±2.29ns        ? B/sec    1.00    316.7±1.16ns        ? B/sec
prime_pi/Sieve/1000000               1.02    117.6±0.58ns        ? B/sec    1.00    115.7±0.42ns        ? B/sec
prime_pi/Sieve/10000000              1.11    442.3±2.14ns        ? B/sec    1.00   398.2±10.04ns        ? B/sec
prime_pi/StreamingSieve/100          1.10    155.5±1.46ns        ? B/sec    1.00    142.0±0.50ns        ? B/sec
prime_pi/StreamingSieve/10000        1.00   1079.8±6.57ns        ? B/sec    1.01   1092.9±9.03ns        ? B/sec
prime_pi/StreamingSieve/100000       1.02      6.4±0.04µs        ? B/sec    1.00      6.3±0.02µs        ? B/sec
prime_pi/StreamingSieve/1000000      1.02    128.7±4.58µs        ? B/sec    1.00    125.6±0.12µs        ? B/sec
prime_pi/StreamingSieve/10000000     1.02   844.3±16.20µs        ? B/sec    1.00    826.4±2.47µs        ? B/sec

@cuviper

This comment has been minimized.

@codecov
Copy link

codecov bot commented Jun 16, 2020

Codecov Report

Merging #30 into master will decrease coverage by 1.76%.
The diff coverage is 92.30%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #30      +/-   ##
==========================================
- Coverage   51.22%   49.45%   -1.77%     
==========================================
  Files          17       17              
  Lines        3934     4531     +597     
==========================================
+ Hits         2015     2241     +226     
- Misses       1919     2290     +371     
Impacted Files Coverage Δ
generators/src/bin/wheel-generator.rs 0.00% <0.00%> (ø)
primal-sieve/src/wheel/wheel210.rs 0.00% <0.00%> (ø)
primal-sieve/src/streaming/primes.rs 90.52% <84.31%> (-8.01%) ⬇️
primal-sieve/src/streaming/mod.rs 90.90% <90.00%> (+1.78%) ⬆️
primal-sieve/src/sieve.rs 89.94% <92.30%> (-1.85%) ⬇️
primal-bit/src/inner.rs 100.00% <100.00%> (ø)
primal-bit/src/iter.rs 100.00% <100.00%> (ø)
primal-bit/src/lib.rs 100.00% <100.00%> (+6.14%) ⬆️
primal-bit/tests/from_rust.rs 100.00% <100.00%> (+0.31%) ⬆️
primal-sieve/src/streaming/presieve.rs 93.54% <100.00%> (-0.21%) ⬇️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1b2d907...ff6010c. Read the comment docs.

Copy link
Owner

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the little that this is worth (as an absent maintainer), this direction looks great to me.

primal-sieve/src/sieve.rs Outdated Show resolved Hide resolved
primal-sieve/src/streaming/primes.rs Show resolved Hide resolved
@cuviper
Copy link
Collaborator Author

cuviper commented Jun 17, 2020

@huonw I do appreciate your review, absent or not. 🙂

If there are no further comments, I think I'll merge and publish it this afternoon.

@cuviper
Copy link
Collaborator Author

cuviper commented Jun 17, 2020

bors r+

bors bot added a commit that referenced this pull request Jun 17, 2020
30: Big refactoring r=cuviper a=cuviper

Sorry for a mega-PR, but I've been tinkering with upgrading dependencies
and fixing the big-endian problems. In trying to recoup performance, all
of these changes got intertwined. At a high level:

- Bump all crates to 0.3 to account for breaking changes.
- Upgrade to smallvec 1 -- raises the minimum Rust to 1.36.
- Various dev-dependencies are upgraded as well.
- `primal_bit::BitVec` now stores a `Vec<u8>`, and no longer exposes
  methods that depend on endianness. Fixes #10.
  - A lot of the wheel operations were byte-oriented already, so they
    really depended on little-endian order before. With `Vec<u8>`, even
    big-endian targets have the expected byte order to find bits.
- Iterators specialize methods like `fold` and `all` for performance.

We should probably update to 2018 edition too, but I didn't bother yet.

Co-authored-by: Josh Stone <[email protected]>
@cuviper
Copy link
Collaborator Author

cuviper commented Jun 17, 2020

err, let's try that with your review reflected, at least.

@cuviper
Copy link
Collaborator Author

cuviper commented Jun 17, 2020

bors r-

@bors
Copy link
Contributor

bors bot commented Jun 17, 2020

Canceled.

@cuviper
Copy link
Collaborator Author

cuviper commented Jun 17, 2020

bors r=huonw

bors bot added a commit that referenced this pull request Jun 17, 2020
30: Big refactoring r=huonw a=cuviper

Sorry for a mega-PR, but I've been tinkering with upgrading dependencies
and fixing the big-endian problems. In trying to recoup performance, all
of these changes got intertwined. At a high level:

- Bump all crates to 0.3 to account for breaking changes.
- Upgrade to smallvec 1 -- raises the minimum Rust to 1.36.
- Various dev-dependencies are upgraded as well.
- `primal_bit::BitVec` now stores a `Vec<u8>`, and no longer exposes
  methods that depend on endianness. Fixes #10.
  - A lot of the wheel operations were byte-oriented already, so they
    really depended on little-endian order before. With `Vec<u8>`, even
    big-endian targets have the expected byte order to find bits.
- Iterators specialize methods like `fold` and `all` for performance.

We should probably update to 2018 edition too, but I didn't bother yet.

Co-authored-by: Josh Stone <[email protected]>
@bors
Copy link
Contributor

bors bot commented Jun 17, 2020

Build failed:

@cuviper
Copy link
Collaborator Author

cuviper commented Jun 17, 2020

bors r=huonw

bors bot added a commit that referenced this pull request Jun 17, 2020
30: Big refactoring r=huonw a=cuviper

Sorry for a mega-PR, but I've been tinkering with upgrading dependencies
and fixing the big-endian problems. In trying to recoup performance, all
of these changes got intertwined. At a high level:

- Bump all crates to 0.3 to account for breaking changes.
- Upgrade to smallvec 1 -- raises the minimum Rust to 1.36.
- Various dev-dependencies are upgraded as well.
- `primal_bit::BitVec` now stores a `Vec<u8>`, and no longer exposes
  methods that depend on endianness. Fixes #10.
  - A lot of the wheel operations were byte-oriented already, so they
    really depended on little-endian order before. With `Vec<u8>`, even
    big-endian targets have the expected byte order to find bits.
- Iterators specialize methods like `fold` and `all` for performance.

We should probably update to 2018 edition too, but I didn't bother yet.

Co-authored-by: Josh Stone <[email protected]>
@bors
Copy link
Contributor

bors bot commented Jun 17, 2020

Build failed:

@huonw
Copy link
Owner

huonw commented Jun 17, 2020

If there are no further comments, I think I'll merge and publish it this afternoon.

All good by me 👍

@cuviper
Copy link
Collaborator Author

cuviper commented Jun 17, 2020

Great, thanks! I'm just working through CI now...

bors r=huonw

@bors
Copy link
Contributor

bors bot commented Jun 18, 2020

Build succeeded:

@bors bors bot merged commit a7b5592 into huonw:master Jun 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

primal fails on big-endian machines
3 participants