Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Composite keys for the key-value store. #179

Open
setrofim opened this issue Jul 26, 2023 · 0 comments
Open

Composite keys for the key-value store. #179

setrofim opened this issue Jul 26, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@setrofim
Copy link
Collaborator

Currently, the K-V store uses simple string keys. Since the key acts the retrieval index, this limits us to one efficient access pattern. This ok if we only assume retrieval as part of the provisioning/verification pipelines, however makes it awkward to implement more flexible access patterns that might be needed for management operations. Essentially, if we want to use the K-V store as more than a simple cache, the current interface is insufficient.

We could expose a more database-like interface instead, allowing running arbitrary queries, however that would limit potential implementations (that would need to support the query language).

An alternative would be to allow composite keys. That is a key would be trated as a vector/tuple with each field having meaning specific to a particular application. In addition, there would be a reserved value that can be applied to any field that means "ignore this field" when creating the access query.

Conceptually, our current K-V store looks like this:

key -> [value0, ..., valuen]

A single key maps onto zero or more values (our keys are not unique). When accessing values, we specify the key as the "query".

The proposed change is to

[key_elt0, ...,  key_eltn] -> [value0, ..., valuen]

A key is comprised of multiple elements that together form the overall key under which the value is stored. When accessing values we can use the reserved "any" value in place of any of the elements as part of the "query". As a shortcut, we can also say that if the query is shorter than the total number of elements, all values for elements past the end of the query are assumed to be "any". E.g. accessing [v0, *, vs] would return all values who's keys have key_elt0 set to v0, key_elt2 set to v2, and key_elt1 as well as key_elt3 to key_eltn can be any value.

Example use case: policy management

There are two major access patterns for policies -- based on tenant_id/scheme, and based on their UUID. Since the former is used during verification, it is considered more important and so our current policy store key is in the form tenant_id:scheme. The problem is that in order to access a policy with a specific UUID, we need to retrieved all policies for the associated tenant-id/scheme form the store, and then filter the results in code.

With composite keys, our key would be [tenant-id, scheme, uuid]. During verification, we can retrieve the relevant policies using something like ["myteant", "myscheme", *], i.e. indicating "any" uuid. For by-uuid retrieveal, the query key becomes `["myteant", *, "aaa-bb-cc-dddd"], i.e. specifying only the uuid and tenant, and indicating the scheme as "any".

Implementation considerations

Implementing keys with arbitrary element numbers would be difficult in practice. We would need to set the maximum number of elements to be fixed. There is a trade-off between query flexibility (want as many elements a possible) and efficient resource utilisation (want as few elements as possible). For our current purposes, three for four elements should be sufficient. For maximum flexibility, we might wan to make it an on-deployment configuration.

For specific kinds of backing stores, this can be implemented as follows:

  • For SQL databases, each element becomes a column, e.g. CREATE TABLE kvstore (elt0 text, elt1 text, elt2 text, value text);
  • For document-based database (such as MongoDB), we have a top-level document that looks like {key: {elt0: v0, elt1: v1, elt2: v2}, value: {...}}
  • For in-memory implementation, we join the elments into a single key (similar to what we have now): "v0:v1:v2", and use that as the key in a map. Note that in this case, we end up in the same inefficient situation that we have at the moment. We can improve on this if we assume that values are pointers (and so the same value can be efficiently "duplicated"), in which case we can store the pointer to each value under multiple keys: "v0:v1:*", "v0:*:v2", "v0:*:*", "*:v1:v2", "*:v1:*", and "*:*:v2" (note that the total number of such keys quickly explodes with the number of elements in our composite keys).
@setrofim setrofim added the enhancement New feature or request label Jul 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant