From 0a7bf7ea7ad3adc68aaa932e47b479f7ef3e96d4 Mon Sep 17 00:00:00 2001
From: Daniel at CosmicDNA
Date: Mon, 19 Feb 2024 10:46:26 +0000
Subject: [PATCH 1/2] :memo: Add Assembly information and instructions to PrimeCPP solution 3.

---
 PrimeCPP/solution_3/README.md | 42 ++++++++++++++++++++++++++---------
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/PrimeCPP/solution_3/README.md b/PrimeCPP/solution_3/README.md
index f9f794f8f..d34cabde2 100644
--- a/PrimeCPP/solution_3/README.md
+++ b/PrimeCPP/solution_3/README.md
@@ -46,36 +46,58 @@ Compared to other C++ implementations:

 Computing primes to 10000000 on 8 threads for 5 seconds.
 Passes: 2264, Threads: 8, Time: 5.00982, Average: 0.00221282, Limit: 10000000, Counts: 664579/664579, Valid : Pass
-
+
 davepl_par;2264;5.00982;8;algorithm=base,faithful=yes,bits=1

 ### Docker performance

- Single-threaded
+ Single-threaded
 ┌───────┬────────────────┬──────────┬────────┬────────┬──────────┬─────────┬───────────┬──────────┬──────┬───────────────┐
 │ Index │ Implementation │ Solution │ Label │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
 ├───────┼────────────────┼──────────┼────────┼────────┼──────────┼─────────┼───────────┼──────────┼──────┼───────────────┤
 │ 1 │ cpp │ 1 │ davepl │ 3982 │ 5.00001 │ 1 │ base │ yes │ 1 │ 796.39857 │
 └───────┴────────────────┴──────────┴────────┴────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
-
- Multi-threaded
+
+ Multi-threaded
 ┌───────┬────────────────┬──────────┬────────────┬────────┬──────────┬─────────┬───────────┬──────────┬──────┬───────────────┐
 │ Index │ Implementation │ Solution │ Label │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
 ├───────┼────────────────┼──────────┼────────────┼────────┼──────────┼─────────┼───────────┼──────────┼──────┼───────────────┤
 │ 1 │ cpp │ 2 │ davepl_par │ 13192 │ 5.00080 │ 4 │ base │ yes │ 1 │ 659.49448 │
 └───────┴────────────────┴──────────┴────────────┴────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
-
-
- Single-threaded
+
+
+ Single-threaded
 ┌───────┬────────────────┬──────────┬─────────────────────┬───────────┬──────────┬─────────┬───────────┬──────────┬──────┬────────────────┐
 │ Index │ Implementation │ Solution │ Label │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
 ├───────┼────────────────┼──────────┼─────────────────────┼───────────┼──────────┼─────────┼───────────┼──────────┼──────┼────────────────┤
 │ 1 │ PrimeCPP │ 3 │ flo80_pol_constexpr │ 234051587 │ 5.00000 │ 1 │ base │ no │ 1 │ 46810317.40000 │
 └───────┴────────────────┴──────────┴─────────────────────┴───────────┴──────────┴─────────┴───────────┴──────────┴──────┴────────────────┘
-
- Multi-threaded
+
+ Multi-threaded
 ┌───────┬────────────────┬──────────┬─────────────────────┬───────────┬──────────┬─────────┬───────────┬──────────┬──────┬───────────────┐
 │ Index │ Implementation │ Solution │ Label │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
 ├───────┼────────────────┼──────────┼─────────────────────┼───────────┼──────────┼─────────┼───────────┼──────────┼──────┼───────────────┤
 │ 1 │ PrimeCPP │ 3 │ flo80_pol_constexpr │ 587300645 │ 5.00052 │ 24 │ base │ no │ 1 │ 4893659.18618 │
- └───────┴────────────────┴──────────┴─────────────────────┴───────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
\ No newline at end of file
+ └───────┴────────────────┴──────────┴─────────────────────┴───────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
+
+### Derived Assembly
+
+The generated assembly code of this solution can be inspected by running the following command:
+
+```shell
+clang++ $CXX_ARGS -S -masm=intel PrimeCPP_CONSTEXPR.cpp -o PrimeAssembly.s
+```
+
+This code might be further optimised by a seasoned assembly developer. To generate the binary, run:
+
+```shell
+ASM_ARGS="-pthread -O3 -m64 -mtune=native"
+
+clang++ $ASM_ARGS PrimeAssembly.s -o primes
+```
+
+and to run the solution simply execute the binary:
+
+```shell
+./primes
+```
\ No newline at end of file

From 7ed9535c99f5d1c664212b262583a38bfc5e3669 Mon Sep 17 00:00:00 2001
From: Daniel at CosmicDNA
Date: Mon, 19 Feb 2024 14:05:07 +0000
Subject: [PATCH 2/2] :memo: Organise README.md accordingly.

---
 PrimeCPP/solution_3/README.md | 47 ++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/PrimeCPP/solution_3/README.md b/PrimeCPP/solution_3/README.md
index d34cabde2..6c48719db 100644
--- a/PrimeCPP/solution_3/README.md
+++ b/PrimeCPP/solution_3/README.md
@@ -16,9 +16,32 @@ Since the standard library does not provide required functions, sqrt and bitfiel
 *Note*: this solution is limited to numbers up to around 50,000,000 (stack size limit on Mac OS it seems).

 ## Run instructions
+### From CPP Binary

 `./run.sh`, requires CLANG in a fairly recent version (supporting C++ 20)

+### From Derived Assembly
+
+The generated assembly code of this solution can be inspected by running the following command:
+
+```shell
+clang++ $CXX_ARGS -S -masm=intel PrimeCPP_CONSTEXPR.cpp -o PrimeAssembly.s
+```
+
+This code might be further optimised by a seasoned assembly developer. To generate the binary, run:
+
+```shell
+ASM_ARGS="-pthread -O3 -m64 -mtune=native"
+
+clang++ $ASM_ARGS PrimeAssembly.s -o primes
+```
+
+and to run the solution simply execute the binary:
+
+```shell
+./primes
+```
+
 ## Output

 All on Apple M1 (Macbook Air)
@@ -78,26 +101,4 @@ Compared to other C++ implementations:
 │ Index │ Implementation │ Solution │ Label │ Passes │ Duration │ Threads │ Algorithm │ Faithful │ Bits │ Passes/Second │
 ├───────┼────────────────┼──────────┼─────────────────────┼───────────┼──────────┼─────────┼───────────┼──────────┼──────┼───────────────┤
 │ 1 │ PrimeCPP │ 3 │ flo80_pol_constexpr │ 587300645 │ 5.00052 │ 24 │ base │ no │ 1 │ 4893659.18618 │
- └───────┴────────────────┴──────────┴─────────────────────┴───────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
-
-### Derived Assembly
-
-The generated assembly code of this solution can be inspected by running the following command:
-
-```shell
-clang++ $CXX_ARGS -S -masm=intel PrimeCPP_CONSTEXPR.cpp -o PrimeAssembly.s
-```
-
-This code might be further optimised by a seasoned assembly developer. To generate the binary, run:
-
-```shell
-ASM_ARGS="-pthread -O3 -m64 -mtune=native"
-
-clang++ $ASM_ARGS PrimeAssembly.s -o primes
-```
-
-and to run the solution simply execute the binary:
-
-```shell
-./primes
-```
\ No newline at end of file
+ └───────┴────────────────┴──────────┴─────────────────────┴───────────┴──────────┴─────────┴───────────┴──────────┴──────┴───────────────┘
\ No newline at end of file