[Bug] Modify caching strategy to better utilize cache space and distribution of resources. #274

csuwildcat · 2024-08-29T15:11:20Z

Describe the bug
The current caching strategy effectively renders caching unavailable to the vast majority of DIDs, as it doesn't seem to account for frecency (frequency/recency) of access, activity, or other metrics that substantiate why a DID should be cached over another.

To Reproduce:

1. Try to build an app and not get hit with 10s+ resolutions virtually every time
2. Fail at doing Step 1

Expected behavior:
Step 1 should be the common case.

Supporting Material
I built an app and this is the behavior.

Environment (please complete the following information):

Literally everywhere you can send an HTTP request

Additional context
Not really, just want some caching.

decentralgabe · 2024-08-29T15:22:56Z

The primary problem is that a given server does not know, and has no way of knowing, whether a record has been updated without querying the DHT.

Caching ignores this possibility and increases the chance of failure cases when an update has been published to a different DHT node.

To mitigate this, we have some options:

1. Support introspecting NS records to determine if the gateway is authoritative (#199).
   This is the first obvious one. If a DID Document marks a certain node as 'authoritative', we have less of a reason to query the DHT. There's still a possibility the DID updates its authoritative gateway without letting a prior authoritative gateway know, but we can consider that an edge case. We could handle this with an async fetch.
2. We can do an async fetch anyway.
   By this I mean that we favor returning a cached version for the common case. We can make the timeout 2 hours. Asynchronously, we can start a background process to query the DHT and update the records, so that on subsequent requests you'll receive an updated DID Document if one is available. This is nice because it's non-blocking. A tradeoff is the risk I mention above (updates external to the node).
3. Async fetch + a consistent read flag.
   Same as the prior suggestion, but we add a new query parameter that defaults to false. Setting the consistentRead flag to true would always query the DHT and take longer. This leaves the behavior up to the client. There is prior art in distributed DBs like Dynamo.
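To make the async fetch and the consistent read flag concrete, here is a rough sketch of how the two could fit together. It is only an illustration under assumptions: `DidCache`, `query_dht`, and the `consistent_read` argument are hypothetical names, not the gateway's actual code, and the 2-hour TTL is the timeout suggested above.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 2 * 60 * 60  # the 2-hour timeout suggested above


class DidCache:
    """Serve cached DID Documents and refresh them from the DHT in the background."""

    def __init__(self, query_dht: Callable[[str], Optional[str]],
                 ttl: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._query_dht = query_dht  # hypothetical blocking DHT lookup
        self._ttl = ttl
        self._clock = clock
        self._lock = threading.Lock()
        # did -> (document, time it was stored)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def _refresh(self, did: str) -> Optional[str]:
        """Query the DHT and store the result if a record was found."""
        document = self._query_dht(did)
        if document is not None:
            with self._lock:
                self._entries[did] = (document, self._clock())
        return document

    def resolve(self, did: str, consistent_read: bool = False) -> Optional[str]:
        with self._lock:
            entry = self._entries.get(did)
        fresh = entry is not None and self._clock() - entry[1] < self._ttl
        if consistent_read or not fresh:
            # Slow path: block on the DHT. Taken for consistent reads,
            # cache misses, and expired entries.
            return self._refresh(did)
        # Fast path: return the cached copy now and refresh it off the
        # request path, so a later request can see an update.
        threading.Thread(target=self._refresh, args=(did,), daemon=True).start()
        return entry[0]
```

With this shape, a default `resolve(did)` returns whatever is cached and only later requests see an update published elsewhere, while `resolve(did, consistent_read=True)` pays the DHT latency every time. The sketch starts a refresh on every cache hit; a real gateway would probably want to rate-limit or deduplicate those background lookups.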

Let me know what you think.

csuwildcat · 2024-08-29T15:41:54Z

Third option sounds best, with a pref for cache slots given to the DIDs that have the highest frecency of hits.
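For what it's worth, here is a minimal sketch of what a frecency preference for cache slots could look like. Everything in it is an assumption for illustration: the `FrecencyCache` name, the exponentially decayed hit count used as the score, and the one-hour half-life are not taken from the project. It only decides which DIDs get slots; it says nothing about whether a cached document is still current.

```python
import math
from typing import Dict, Optional, Tuple

HALF_LIFE_SECONDS = 3600.0  # assumed decay rate, not taken from the project


class FrecencyCache:
    """Fixed-size cache that keeps the DIDs with the highest frecency score."""

    def __init__(self, capacity: int, half_life: float = HALF_LIFE_SECONDS) -> None:
        self._capacity = capacity
        self._decay = math.log(2) / half_life
        self._docs: Dict[str, str] = {}
        # did -> (score at last hit, time of last hit). Scores are tracked for
        # every DID ever requested, which a real server would need to bound.
        self._scores: Dict[str, Tuple[float, float]] = {}

    def __contains__(self, did: str) -> bool:
        return did in self._docs

    def _score_at(self, did: str, now: float) -> float:
        """Hit count where each hit loses half its weight every half-life."""
        score, last = self._scores.get(did, (0.0, now))
        return score * math.exp(-self._decay * (now - last))

    def get(self, did: str, now: float) -> Optional[str]:
        """Record a hit for the DID and return its cached document, if any."""
        self._scores[did] = (self._score_at(did, now) + 1.0, now)
        return self._docs.get(did)

    def put(self, did: str, document: str, now: float) -> bool:
        """Cache the document; when full, evict only a colder DID."""
        if did not in self._docs and len(self._docs) >= self._capacity:
            coldest = min(self._docs, key=lambda d: self._score_at(d, now))
            if self._score_at(coldest, now) >= self._score_at(did, now):
                return False  # newcomer is not hotter than anything cached
            del self._docs[coldest]
        self._docs[did] = document
        return True
```

The caller passes `now` explicitly (for example from `time.monotonic()`), which keeps the scoring deterministic and easy to test. A DID that is resolved once does not displace one that is resolved constantly, but it can take the slot once it accumulates more recent hits than the coldest cached entry.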

decentralgabe · 2024-08-29T15:58:29Z

Frequency of hits does not change the fact that the DID can be updated on another node.

I think supporting 3, then 2, then 1 will give you the behavior you're looking for.

csuwildcat added the bug label Aug 29, 2024

decentralgabe added the enhancement and discuss labels and removed the bug label Aug 29, 2024

decentralgabe added hacktoberfest small labels Sep 13, 2024

taniashiba mentioned this issue Sep 23, 2024

🎃 Hacktoberfest Project Hub: DID DHT #292

Open

18 tasks
