You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently, the K-V store uses simple string keys. Since the key acts the retrieval index, this limits us to one efficient access pattern. This ok if we only assume retrieval as part of the provisioning/verification pipelines, however makes it awkward to implement more flexible access patterns that might be needed for management operations. Essentially, if we want to use the K-V store as more than a simple cache, the current interface is insufficient.
We could expose a more database-like interface instead, allowing running arbitrary queries, however that would limit potential implementations (that would need to support the query language).
An alternative would be to allow composite keys. That is a key would be trated as a vector/tuple with each field having meaning specific to a particular application. In addition, there would be a reserved value that can be applied to any field that means "ignore this field" when creating the access query.
Conceptually, our current K-V store looks like this:
key -> [value0, ..., valuen]
A single key maps onto zero or more values (our keys are not unique). When accessing values, we specify the key as the "query".
A key is comprised of multiple elements that together form the overall key under which the value is stored. When accessing values we can use the reserved "any" value in place of any of the elements as part of the "query". As a shortcut, we can also say that if the query is shorter than the total number of elements, all values for elements past the end of the query are assumed to be "any". E.g. accessing [v0, *, vs] would return all values who's keys have key_elt0 set to v0, key_elt2 set to v2, and key_elt1 as well as key_elt3 to key_eltn can be any value.
Example use case: policy management
There are two major access patterns for policies -- based on tenant_id/scheme, and based on their UUID. Since the former is used during verification, it is considered more important and so our current policy store key is in the form tenant_id:scheme. The problem is that in order to access a policy with a specific UUID, we need to retrieved all policies for the associated tenant-id/scheme form the store, and then filter the results in code.
With composite keys, our key would be [tenant-id, scheme, uuid]. During verification, we can retrieve the relevant policies using something like ["myteant", "myscheme", *], i.e. indicating "any" uuid. For by-uuid retrieveal, the query key becomes `["myteant", *, "aaa-bb-cc-dddd"], i.e. specifying only the uuid and tenant, and indicating the scheme as "any".
Implementation considerations
Implementing keys with arbitrary element numbers would be difficult in practice. We would need to set the maximum number of elements to be fixed. There is a trade-off between query flexibility (want as many elements a possible) and efficient resource utilisation (want as few elements as possible). For our current purposes, three for four elements should be sufficient. For maximum flexibility, we might wan to make it an on-deployment configuration.
For specific kinds of backing stores, this can be implemented as follows:
For SQL databases, each element becomes a column, e.g. CREATE TABLE kvstore (elt0 text, elt1 text, elt2 text, value text);
For document-based database (such as MongoDB), we have a top-level document that looks like {key: {elt0: v0, elt1: v1, elt2: v2}, value: {...}}
For in-memory implementation, we join the elments into a single key (similar to what we have now): "v0:v1:v2", and use that as the key in a map. Note that in this case, we end up in the same inefficient situation that we have at the moment. We can improve on this if we assume that values are pointers (and so the same value can be efficiently "duplicated"), in which case we can store the pointer to each value under multiple keys: "v0:v1:*", "v0:*:v2", "v0:*:*", "*:v1:v2", "*:v1:*", and "*:*:v2" (note that the total number of such keys quickly explodes with the number of elements in our composite keys).
The text was updated successfully, but these errors were encountered:
Currently, the K-V store uses simple string keys. Since the key acts the retrieval index, this limits us to one efficient access pattern. This ok if we only assume retrieval as part of the provisioning/verification pipelines, however makes it awkward to implement more flexible access patterns that might be needed for management operations. Essentially, if we want to use the K-V store as more than a simple cache, the current interface is insufficient.
We could expose a more database-like interface instead, allowing running arbitrary queries, however that would limit potential implementations (that would need to support the query language).
An alternative would be to allow composite keys. That is a key would be trated as a vector/tuple with each field having meaning specific to a particular application. In addition, there would be a reserved value that can be applied to any field that means "ignore this field" when creating the access query.
Conceptually, our current K-V store looks like this:
A single key maps onto zero or more values (our keys are not unique). When accessing values, we specify the
key
as the "query".The proposed change is to
A key is comprised of multiple elements that together form the overall key under which the value is stored. When accessing values we can use the reserved "any" value in place of any of the elements as part of the "query". As a shortcut, we can also say that if the query is shorter than the total number of elements, all values for elements past the end of the query are assumed to be "any". E.g. accessing
[v0, *, vs]
would return all values who's keys havekey_elt0
set tov0
,key_elt2
set tov2
, andkey_elt1
as well askey_elt3
tokey_eltn
can be any value.Example use case: policy management
There are two major access patterns for policies -- based on tenant_id/scheme, and based on their UUID. Since the former is used during verification, it is considered more important and so our current policy store key is in the form
tenant_id:scheme
. The problem is that in order to access a policy with a specific UUID, we need to retrieved all policies for the associated tenant-id/scheme form the store, and then filter the results in code.With composite keys, our key would be
[tenant-id, scheme, uuid]
. During verification, we can retrieve the relevant policies using something like["myteant", "myscheme", *]
, i.e. indicating "any" uuid. For by-uuid retrieveal, the query key becomes `["myteant", *, "aaa-bb-cc-dddd"], i.e. specifying only the uuid and tenant, and indicating the scheme as "any".Implementation considerations
Implementing keys with arbitrary element numbers would be difficult in practice. We would need to set the maximum number of elements to be fixed. There is a trade-off between query flexibility (want as many elements a possible) and efficient resource utilisation (want as few elements as possible). For our current purposes, three for four elements should be sufficient. For maximum flexibility, we might wan to make it an on-deployment configuration.
For specific kinds of backing stores, this can be implemented as follows:
CREATE TABLE kvstore (elt0 text, elt1 text, elt2 text, value text);
{key: {elt0: v0, elt1: v1, elt2: v2}, value: {...}}
"v0:v1:v2"
, and use that as the key in a map. Note that in this case, we end up in the same inefficient situation that we have at the moment. We can improve on this if we assume that values are pointers (and so the same value can be efficiently "duplicated"), in which case we can store the pointer to each value under multiple keys:"v0:v1:*"
,"v0:*:v2"
,"v0:*:*"
,"*:v1:v2"
,"*:v1:*"
, and"*:*:v2"
(note that the total number of such keys quickly explodes with the number of elements in our composite keys).The text was updated successfully, but these errors were encountered: