Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cxlrubytes - fast zero alloc lru for []bytes only. i'll be honored to have your feedback on this lru i created for your comparison, remarks and comments #16

Open
kolinfluence opened this issue Apr 18, 2024 · 13 comments
Labels
question Further information is requested

Comments

@kolinfluence
Copy link

i'll be honored to have your feedback on this lru i created for your comparison, remarks and comments.
hope to have your comments / input
https://github.com/cloudxaas/gocache/tree/main/lru/bytes

you can try on the hit / misses etc which i have no idea how you would do the benchmark comparison. thx in advance.

i created it coz most lru need to predefined a capacity but i would prefer to be limited by memory size set.

i've been following your lru for some time and your analysis is the most comprehensive.

@phuslu
Copy link
Owner

phuslu commented Apr 18, 2024

yes, I also think about it for a long time, I called it "BytesCache" before, please allow me share some thoughts:

  1. where key&value is []byte, It's no doubt that the size should be memory usage rather than items count
  2. the indexed map type, map[string]int or map[hash]int?
    map[string]int
    pros: faster due to cpu-cache friendly
    cons: cost more memory
    map[hash]int
    pros: minimalized memory usage and gc overhead, that currently phuslu/lru using.
    cons: a bit slow than map[string]int, because it need fetch and compare key in another place.
    So which should be chosen? I'd like say it depends you :)
    But after swiss table is merged to go, see runtime: use SwissTable golang/go#54766 (comment) I think we should definably use map[string]int always. because it is obviously fast than rhh hashtable in large scale.
  3. potential minor improvements, such as BCE with unsafe(like phuslu/lru), struct alignment, etc.

@kolinfluence
Copy link
Author

@phuslu hope to hv your contribution to the repo too. in fact hope u to have you as one of the maintainers.

@phuslu
Copy link
Owner

phuslu commented Apr 18, 2024

I still have interests the roadmap of #15 (comment)
although the last step of it (make it shareable) is too hard to complete, I still have motivation to try step 1&2 in my spare time.

Probably the cxlrubytes is simpler and faster choice in your case. But the rest parts(a mmaped and minimalized memory&gc bytes cache) of golang caches is still need someone(me) to make it done.

I regard it as a technical challenge on a larger scale, worth work it out but maybe no one will ever use it. 😄

@kolinfluence
Copy link
Author

@phuslu i will use it.
i found out my lru has caveats. it is slower if there are many puts and deletes. which will create alloc. not sure how to resolve. maybe you have better ideas.

step 3 is difficult to find an elegant solution. if you can help solve the alloc first. step 3... well, yeah. but i will definitely use it if you can get it done

@phuslu
Copy link
Owner

phuslu commented Apr 18, 2024

which will create alloc.

maybe there're is a implicit copy action in map[string]int -- just a guess.

@kolinfluence
Copy link
Author

kolinfluence commented Apr 18, 2024

@phuslu i thought there was issue, it was with my test code. the code is fine. still need some test. just got it working a few hours ago.

it's usable and 100% zero alloc.

p.s. : my other test code was the one generating the alloc.

@kolinfluence
Copy link
Author

@phuslu i updated batch eviction to speed it up else 1 by 1 too slow.

@phuslu
Copy link
Owner

phuslu commented Apr 18, 2024

Congrats, I guess you currently fingure out that a most suitable solution for your case.

BTW: considering https://github.com/cockroachdb/swiss impressive throughputs in large scale map, I think/guess the max performance solution of "lru as map" is forking the cockroachdb/swiss then add "prev/next" field to its entry to make it behavior like lru also. This is similar with what is elastic/go-freelru done before.

In particular, swiss no longer requires the bucket size to grow by a power of 2, which will avoid the “overcommit” issue of freelru.

@phuslu
Copy link
Owner

phuslu commented Apr 18, 2024

The overhead of generic function calls(seems that it cannot be inlined) prevents deeper performance optimization, which is why I plan to make one "BytesCache" or "MmapCache".

see golang/go#54766 (comment) and https://www.reddit.com/r/golang/comments/ts9neg/generics_can_make_your_go_code_slower/

@phuslu phuslu added the question Further information is requested label Apr 18, 2024
@kolinfluence
Copy link
Author

@phuslu thx for sharing insights, will wait for ur mmap version with eager anticipation.
i've created what i need to solve what i need now though. hope of course mmapcache can be memory optimized as current high performance lru sort of waste lots of mem. somehow i'm confident you'll get it done.

@kolinfluence
Copy link
Author

kolinfluence commented Apr 19, 2024

@phuslu updated and fixed cxlrubytes on some performance issues etc, it's now a bit faster.

  1. swiss map could be interesting. but i dont think i will add it in for quite some time because i've tried and hit a road bump
  2. do see if u can use anything from my code for the mmap cache u have in mind.

@phuslu
Copy link
Owner

phuslu commented Apr 19, 2024

  1. I have similar feelings about swiss, but my plan is to first remove its generics and turn it into a map[uint32]uint32 instance(I think this is relatively easier) then integrate to phuslu/lru. Although the performance on the CPU cache is degraded, it at least improves the performance of phuslu/lru on large numbers and retain the lowest memory usage.
  2. thanks, I will give a try soon.

@kolinfluence
Copy link
Author

kolinfluence commented Apr 20, 2024

@phuslu i've tried map[uint32]uint32 here, it's faster but it's limited to 4 bil keys, there can be some use case etc. but for my use case, i prefer the map[string]int as it's more "secure".

anyway, for your reference:
https://github.com/cloudxaas/gocache/tree/main/lrux/bytes

i'm looking forward to the mmap version actually for now, but that's beyond me coz i cant figure out a reasonably fast global lock for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants