Allow cache to be distributed / better persisted #150
Comments
@t92549 Does this question really belong in gaffer-docker?
This kind of scenario, with multiple Gaffer instances sharing a cache, is described in gchq/Gaffer#2457, so people are trying it. I think there is a PR associated with that issue to remove de-sync in federated stores with that setup. However, perhaps the solution of a database rather than a cache would be quite nice and would remove a lot of the tricky issues with this setup. That would involve work in Gaffer, but gaffer-docker would also need some changes to ensure this deployment option is available and set up correctly.
In theory (i.e. see gh-2457) the existing Gaffer Cache can be backed by a database, either by configuring and plugging in the JCS cache implementation already used by Gaffer (e.g. via https://commons.apache.org/proper/commons-jcs/JDBCDiskCache.html) or by plugging in your own implementation of uk.gov.gchq.gaffer.cache.ICacheService. I've not done the former, but I have an example of code which does the latter - I'll share it with you personally @t92549 via a different mechanism. There are some specific environmental reasons why I wouldn't simply cut-and-shut my example into the open-source project, but it could absolutely act as a starting point.

If someone wants to experiment with using off-the-shelf JCS components and configuration to persist Gaffer caches (the FederatedStore and NamedOperation caches in particular), then I'd be really interested in the outcome.

Slight aside: IMNSHO, Gaffer is taking a bit of a liberty in calling some of its "Caches"... well, "Caches" :-) Really, they are the primary stores of some of its non-graph metadata.

One last note - I agree with @n3101 that this issue might be better in the regular Gaffer project rather than gaffer-docker. Unless, perhaps, the ticket is solely scoped to using existing configuration and JCS components to implement a persistent cache in this Docker implementation of Gaffer. That's just my idle thoughts, though - I just raise issues on these backlogs, I don't manage them :-)
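For reference, a minimal sketch of what the off-the-shelf JCS route might look like. This assumes Gaffer is pointed at its JCS cache service with a `cache.ccf` file, and it follows the attribute names shown on the JDBCDiskCache documentation page linked above; the auxiliary/attribute package names, JDBC URL, credentials and table name are assumptions to verify against the commons-jcs version actually on the classpath, not something I've run here:

```properties
# cache.ccf - hypothetical JCS configuration backing the default region with the
# JDBC disk auxiliary, so cache entries survive REST API restarts.

# Default region: keep a small in-memory cache and spool everything to JDBC.
jcs.default=JDBC
jcs.default.cacheattributes=org.apache.commons.jcs.engine.CompositeCacheAttributes
jcs.default.cacheattributes.MaxObjects=1000
jcs.default.cacheattributes.MemoryCacheName=org.apache.commons.jcs.engine.memory.lru.LRUMemoryCache

# JDBC disk cache auxiliary (see the JDBCDiskCache page for the full attribute list).
jcs.auxiliary.JDBC=org.apache.commons.jcs.auxiliary.disk.jdbc.JDBCDiskCacheFactory
jcs.auxiliary.JDBC.attributes=org.apache.commons.jcs.auxiliary.disk.jdbc.JDBCDiskCacheAttributes
jcs.auxiliary.JDBC.attributes.url=jdbc:postgresql://cache-db:5432/gaffer_cache
jcs.auxiliary.JDBC.attributes.driverClassName=org.postgresql.Driver
jcs.auxiliary.JDBC.attributes.userName=gaffer
jcs.auxiliary.JDBC.attributes.password=changeme
jcs.auxiliary.JDBC.attributes.tableName=GAFFER_CACHE
jcs.auxiliary.JDBC.attributes.testBeforeInsert=false
jcs.auxiliary.JDBC.attributes.maxActive=15
jcs.auxiliary.JDBC.attributes.allowRemoveAll=true
```

The shared database is what would let multiple REST API instances see the same NamedOperation and FederatedStore entries, and keep them across restarts.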
Okay, excellent. I can take a look at the code you sent and also look into using these JCS features, get them tested, and see whether they would be useful to add as an example somewhere.
Yeah, I think that is the main reason we should start looking into this, especially as more people start looking into load-balanced setups.
I suppose it depends on what the solution is. If we offer some sort of persistent cache like the one you sent me, then the code would live in Gaffer; but if we just want to set some config files as part of a deployment that can set up this cache without any code changes, then it lives here.
gchq/Gaffer#2457 is likely to stay, because it will make the cache behaviour of the FederatedStore and other stores more consistent. If you want to configure the Docker deployment with magic buttons for a pre-configured persistent cache somewhere like JCS, I don't have strong opinions. Am I right in understanding that Gaffer already has everything you need to do this?
The Gaffer cache containing named operations / federated store graphs should be distributed when using multiple Gaffer REST API instances, and also persisted so that you don't lose your graphs when updating the Federated Store.
Therefore we should consider either configuring the JCS cache or using the Hazelcast cache service.
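As a rough illustration of the configuration-only route, a hedged sketch of the relevant store.properties entries. The `gaffer.cache.service.class` / `gaffer.cache.config.file` property names and the cache service class names are taken from the Gaffer cache documentation; the file paths are placeholders and would need to match wherever the gaffer-docker images mount their config:

```properties
# store.properties - hypothetical cache configuration sketch.

# Option 1: Hazelcast cache service, so multiple REST API instances share one
# distributed cache.
gaffer.cache.service.class=uk.gov.gchq.gaffer.cache.impl.HazelcastCacheService
gaffer.cache.config.file=/gaffer/config/hazelcast.xml

# Option 2: JCS cache service, pointed at a cache.ccf that adds a persistent
# auxiliary (e.g. a JDBC disk cache).
#gaffer.cache.service.class=uk.gov.gchq.gaffer.cache.impl.JcsCacheService
#gaffer.cache.config.file=/gaffer/config/cache.ccf
```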
Alternatively (and probably preferably) Gaffer should consider storing this important data somewhere like a database rather than in a cache.
I'll leave this as an open question so that we can discuss the options.