Allow cache to be distributed / better persisted #150
Comments
@t92549 Does this question really belong in gaffer-docker?
This kind of scenario, with multiple Gaffer instances sharing a cache, is described in gchq/Gaffer#2457, so people are trying it. I think there is a PR associated with that issue to remove de-sync in federated stores with that setup. However, perhaps the solution of a database rather than a cache would be quite nice and would remove a lot of the tricky issues with this setup. That would involve work in Gaffer, but gaffer-docker would also need some changes to ensure this deployment option is available and set up correctly.
In theory (i.e. see gh-2457) the existing Gaffer Cache can be backed by a database, either by configuring and plugging in the JCS cache implementation already used by Gaffer (e.g. via https://commons.apache.org/proper/commons-jcs/JDBCDiskCache.html) or by plugging in your own implementation of uk.gov.gchq.gaffer.cache.ICacheService. I've not done the former, but I have an example of code which does the latter - I'll share it with you personally @t92549 via a different mechanism. There are some specific environmental reasons why I wouldn't simply cut-and-shut my example into the open-source project, but it could absolutely act as a starting point.

If someone wants to experiment with using off-the-shelf JCS components and configuration to persist Gaffer caches (the FederatedStore and NamedOperation caches in particular), then I'd be really interested in the outcome.

Slight aside: IMNSHO, Gaffer is taking a bit of a liberty in calling some of its "Caches"... well, "Caches" :-) Really, they are the primary stores of some of its non-graph metadata.

One last note - I agree with @n3101 that this issue might be better in the regular Gaffer project rather than gaffer-docker. Unless, perhaps, the ticket is solely scoped to using existing configuration and JCS components to implement a persistent cache in this Docker implementation of Gaffer. That's just my idle thoughts, though - I just raise issues on these backlogs, I don't manage them :-)
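For reference, a minimal sketch of what the off-the-shelf JCS route might look like. This assumes Gaffer is pointed at its JCS cache service with a `cache.ccf` file, and it follows the attribute names shown on the JDBCDiskCache documentation page linked above; the auxiliary/attribute package names, JDBC URL, credentials and table name are assumptions to verify against the commons-jcs version actually on the classpath, not something I've run here:

```properties
# cache.ccf - hypothetical JCS configuration backing the default region with the
# JDBC disk auxiliary, so cache entries survive REST API restarts.

# Default region: keep a small in-memory cache and spool everything to JDBC.
jcs.default=JDBC
jcs.default.cacheattributes=org.apache.commons.jcs.engine.CompositeCacheAttributes
jcs.default.cacheattributes.MaxObjects=1000
jcs.default.cacheattributes.MemoryCacheName=org.apache.commons.jcs.engine.memory.lru.LRUMemoryCache

# JDBC disk cache auxiliary (see the JDBCDiskCache page for the full attribute list).
jcs.auxiliary.JDBC=org.apache.commons.jcs.auxiliary.disk.jdbc.JDBCDiskCacheFactory
jcs.auxiliary.JDBC.attributes=org.apache.commons.jcs.auxiliary.disk.jdbc.JDBCDiskCacheAttributes
jcs.auxiliary.JDBC.attributes.url=jdbc:postgresql://cache-db:5432/gaffer_cache
jcs.auxiliary.JDBC.attributes.driverClassName=org.postgresql.Driver
jcs.auxiliary.JDBC.attributes.userName=gaffer
jcs.auxiliary.JDBC.attributes.password=changeme
jcs.auxiliary.JDBC.attributes.tableName=GAFFER_CACHE
jcs.auxiliary.JDBC.attributes.testBeforeInsert=false
jcs.auxiliary.JDBC.attributes.maxActive=15
jcs.auxiliary.JDBC.attributes.allowRemoveAll=true
```

The shared database is what would let multiple REST API instances see the same NamedOperation and FederatedStore entries, and keep them across restarts.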
Okay, excellent. I can take a look at the code you sent and also look into using these JCS features, get them tested, and see whether they would be useful to add as an example somewhere.
Yeah, I think that is the main reason we should start looking into this, especially as more people start looking into load-balanced setups.
I suppose it depends on what the solution is. If we offer some sort of persistent cache like the one you sent me, then the code would live in Gaffer; but if we just want to set some config files as part of a deployment that can set up this cache without any code changes, then it lives here.
gchq/Gaffer#2457 is likely to stay, because it will make the cache behaviour of the FederatedStore and other stores more consistent. If you want to configure the Docker deployment with magic buttons for a pre-configured persistent cache somewhere like JCS, I don't have strong opinions. Am I right in understanding that Gaffer already has everything you need to do this?
The Gaffer cache containing named operations / federated store graphs should be distributed when using multiple Gaffer REST API instances, and also persisted so that you don't lose your graphs when updating the Federated Store.
Therefore we should consider either configuring the JCS cache or using the Hazelcast cache service.
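As a rough illustration of the configuration-only route, a hedged sketch of the relevant store.properties entries. The `gaffer.cache.service.class` / `gaffer.cache.config.file` property names and the cache service class names are taken from the Gaffer cache documentation; the file paths are placeholders and would need to match wherever the gaffer-docker images mount their config:

```properties
# store.properties - hypothetical cache configuration sketch.

# Option 1: Hazelcast cache service, so multiple REST API instances share one
# distributed cache.
gaffer.cache.service.class=uk.gov.gchq.gaffer.cache.impl.HazelcastCacheService
gaffer.cache.config.file=/gaffer/config/hazelcast.xml

# Option 2: JCS cache service, pointed at a cache.ccf that adds a persistent
# auxiliary (e.g. a JDBC disk cache).
#gaffer.cache.service.class=uk.gov.gchq.gaffer.cache.impl.JcsCacheService
#gaffer.cache.config.file=/gaffer/config/cache.ccf
```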
Alternatively (and probably preferably) Gaffer should consider storing this important data somewhere like a database rather than in a cache.
I'll leave this as an open question so that we can discuss the options.