rcx is the static array pointer and rax is the index. Without rcx you'd be accessing absolute memory addresses 0 to 99, i.e. effectively a null pointer dereference.
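To make that concrete, here is a rough C model of the address computation. It assumes the loop in question forms its address as rcx + rax + 100 with rax running from -100 up towards 0 in steps of 4; the function name `effective_address` is made up for illustration and is not taken from the article.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: base plays the role of rcx (array pointer),
   idx the role of rax (negative byte index), 100 is the array size in bytes. */
uintptr_t effective_address(uintptr_t base, intptr_t idx) {
    assert(idx >= -100 && idx < 0);
    /* idx + 100 is only an offset in the range 0..99; it becomes a real
       address once the base pointer is added. */
    return base + (uintptr_t)(idx + 100);
}
```

With `base` set to 0 this returns plain offsets like 0, 4, 8 and so on, which is what the loop would dereference if rcx were left out.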
I personally use a slightly different but ultimately equivalent method that also works for arrays of arbitrary (nonzero) length, with the array pointer in rdi and the array length in rsi as a number of 32-bit ints:
sumarray:
    lea rdi, [rdi + rsi * 4] ;; convert to an end pointer
    neg rsi ;; we'll loop until rsi is no longer negative
    xor eax, eax ;; clear the summing register
.loop:
    add eax, [rdi + rsi * 4] ;; dword size is implied by the eax destination, and there is no constant offset in the instruction stream
    add rsi, 1 ;; we're counting whole ints; inc leaves CF unchanged, which can create a false dependency on prior flags on some CPUs
    jl .loop ;; add + jl can macro-fuse into a single µop on recent Intel cores
    ret
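For anyone who finds the assembly hard to follow, the same end pointer plus negative index idea can be sketched in C. This is only an illustration, not what a compiler would necessarily emit; the name `sum_array` is made up, and unlike the assembly this version also handles a length of zero because the condition is checked before the first iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum n 32-bit ints using an end pointer and a negative index
   that counts up to zero (hypothetical helper, for illustration). */
int32_t sum_array(const int32_t *a, size_t n) {
    assert(a != NULL);
    const int32_t *end = a + n;      /* lea rdi, [rdi + rsi*4] */
    ptrdiff_t i = -(ptrdiff_t)n;     /* neg rsi */
    uint32_t sum = 0;                /* unsigned, so overflow wraps like the add in eax */
    for (; i < 0; ++i)               /* add rsi, 1 / jl .loop */
        sum += (uint32_t)end[i];     /* add eax, [rdi + rsi*4] */
    return (int32_t)sum;
}
```

The point of the trick is that the loop counter doubles as the index, so the only per-iteration bookkeeping is one add and one conditional jump.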
https://github.com/algorithmica-org/algorithmica/blob/master/content/english/hpc/architecture/loops.md
I don't understand why the +rcx is needed? Isn't it enough to add back the 100?