[Enhancement]: simplify segment interface in segcore #38118

Open
1 task done
tedxu opened this issue Nov 30, 2024 · 1 comment
Labels
kind/enhancement Issues or changes related to enhancement

Comments

@tedxu
Contributor

tedxu commented Nov 30, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

The segment interface, as well as its internal interface SegmentInternalInterface, should expose only two memory access patterns: retrieve (the data of one row) and sequential scan. Allowing random access (e.g., chunk_data() and chunk_view()) hinders segcore's future evolution. Take the transparent encryption feature as an example: if a random access interface is allowed, the scope of each decryption operation expands to the whole chunk, which is undesirable.
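
As a rough illustration of the two access patterns, here is a minimal sketch. `ToySegment`, `retrieve`, `scan`, and `sum_all` are hypothetical names chosen for this example and are not taken from segcore; the point is only that callers never receive a pointer or view into the underlying chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a segment holding one int64 field.
// None of these names come from segcore; they only illustrate the two
// access patterns proposed above.
class ToySegment {
 public:
    explicit ToySegment(std::vector<int64_t> rows) : rows_(std::move(rows)) {}

    std::size_t row_count() const { return rows_.size(); }

    // Pattern 1: retrieve the data of exactly one row.
    int64_t retrieve(std::size_t row_id) const {
        assert(row_id < rows_.size());
        return rows_[row_id];
    }

    // Pattern 2: sequential scan. The visitor sees rows in storage order
    // and never receives a pointer into the underlying storage.
    void scan(const std::function<void(std::size_t, int64_t)>& visit) const {
        for (std::size_t i = 0; i < rows_.size(); ++i) {
            visit(i, rows_[i]);
        }
    }

 private:
    // Deliberately no chunk_data()/chunk_view()-style accessor.
    std::vector<int64_t> rows_;
};

// Example consumer written only against scan().
inline int64_t sum_all(const ToySegment& seg) {
    int64_t total = 0;
    seg.scan([&total](std::size_t, int64_t v) { total += v; });
    return total;
}
```

With an interface of this shape, the storage layer stays free to decrypt (or decompress) only what a single `retrieve` or the current position of a `scan` needs, since no caller can hold a raw reference to a whole chunk.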

Why is this needed?

No response

Anything else?

No response

tedxu added the kind/enhancement label Nov 30, 2024
@xiaofan-luan
Collaborator

But this seems avoidable? After a search, users still need to specify output fields and retrieve the top-k results.
I think controlling the block size might be enough. A 16K/64K block size seems to be a good tradeoff in most cases.
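
To make the block-size tradeoff concrete, here is a small sketch that estimates how many bytes would have to be decrypted to fetch a handful of rows when data is encrypted in independent fixed-size blocks. `bytes_to_decrypt` is a hypothetical helper, not a segcore function, and it assumes fixed-width rows stored back to back.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper, not part of segcore: bytes that must be decrypted to
// read the given rows when data is encrypted in independent fixed-size blocks.
// Assumes fixed-width rows stored back to back, so a row may straddle blocks.
inline std::size_t bytes_to_decrypt(const std::vector<std::size_t>& row_ids,
                                    std::size_t row_bytes,
                                    std::size_t block_bytes) {
    assert(row_bytes > 0 && block_bytes > 0);
    std::set<std::size_t> touched;  // distinct block indices
    for (std::size_t id : row_ids) {
        const std::size_t first = (id * row_bytes) / block_bytes;
        const std::size_t last = (id * row_bytes + row_bytes - 1) / block_bytes;
        for (std::size_t b = first; b <= last; ++b) {
            touched.insert(b);
        }
    }
    return touched.size() * block_bytes;
}
```

For example, with 512-byte rows, fetching rows 0, 31, 32, and 1000 touches three 16 KiB blocks (48 KiB decrypted) or two 64 KiB blocks (128 KiB decrypted). Smaller blocks reduce wasted decryption for scattered top-k lookups, while larger blocks mean fewer blocks to manage during sequential scans.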
