Skip to content
This repository has been archived by the owner on Dec 12, 2018. It is now read-only.

Collection resources not cached? #143

Open
jamesskinner opened this issue Feb 26, 2015 · 18 comments
Open

Collection resources not cached? #143

jamesskinner opened this issue Feb 26, 2015 · 18 comments
Labels

Comments

@jamesskinner
Copy link

The call to account.getGroups() does not seem to be cached. From what I could gather from the code it seems like collections are not cached, just the individual members.

I am using this functionality for authorisation to a REST API. I want to grant access to a route based on group membership; for example, only serve requests to accounts in both group X and Y. At the moment I am getting the list of groups the account is a member of and checking if the ones I am looking for are in there.

I want to avoid having to call out the Stormpath for each request. Is there a way to achieve this so that subsequent requests from the same account are served from the cache?

Thanks

@robertjd
Copy link
Member

Hi James,

You're correct - collection results are not yet cached. It's pretty annoying! We have worked out a few thoughts on how we might fix this but have not worked on implementation yet. I am hoping we get to this very soon. In the meantime, please feel free to share any mods you make if you try to fix this your self.

Thanks so much for the ping on this, the validation of the pain point is always useful.

Best,
Robert

@jamesskinner
Copy link
Author

Hi Robert, thanks for the clarification.

Is there no other way you know I could achieve this then?

I'll certainly post back here if I make any adjustments that I think might be useful. Most likely though I will just cache the result in my application.

Cheers

James

@robertjd
Copy link
Member

At the moment I don't have any workarounds, caching in your application is probably the quickest workaround for now. I've been focused hardcore on our new Angular libraries, but I should be freeing up for more Node work in the next week :)

@jamesskinner
Copy link
Author

Great, thanks. I have just implemented the said workaround for now - I'll keep an eye out for this feature turning up though

@lhazlewood
Copy link
Member

I just want to chime in here: this is a non-trivial problem to solve without explicit direction/intervention from the app developer: any modification of any resource within a collection causes the entire collection to be invalidated, and a new collection reflecting the latest quantity and order needs to be acquired.

For example, if I have a collection that is all application accounts sorted by username, and someone changes their username, the cached version of that collection is now invalid: a new one needs to be obtained that reflects the new username in the correct position in the collection.

While auto-invalidating collections can be done via the SDK (i.e. if you change an account and call account.save(), then you invalidate the application.getAccounts()), you also have to invalidate any group or directory collection that contains the account. This means you need to know, up front, what all of those collections are (this is doable, but complicated and not very efficient). Any time an application is added to a group or to a directory or to an application, or any time an account store is assigned to or removed from an application, all of these collections all need to be tracked.

To make matters worse, if these collections are updated at even remotely irregular intervals (i.e. add an account to an application from a signup and this happens once every few minutes), it effectively makes collection caching for that collection useless - if any collection changes with random frequency, the new collection must be obtained, which requires bypassing the cache.

In summary, collection caching is useless if the collection state changes frequently, either by the application using the SDK or out-of-band (e.g. an admin using the user interface to make changes). This is a common problem encountered by almost all ORMs (Hibernate, Rails, etc) and they all pretty much require the app developer to jump in and configure exactly which collections are cached and for how long.

@jamesskinner
Copy link
Author

Hi. Thanks for that explanation - very helpful, I can see the issue. Not sure that it is particularly specific to the concept of a collection though - I could see a similar issue if someone was say, using the account resource to store some state which changed frequently. Seems like (as you say) the solution is fined grained control over the caching TTL, maybe something like account.getGroups({ttl:500},function(err,groups){...})

For my use case, I'm happy just maintaining my own caching layer.

@lhazlewood
Copy link
Member

Thanks for the feedback @jamesskinner! I do want to point out that we will put in support for collection caching - I was just explaining that we can't really do it easily without direction from the application developer - i.e. the SDK can't really do it well 'automatically' like we can for other resources. Setting the cache ttl either in client config or at the method level (like you show - great idea by the way!) is probably the best we can do.

@cwhatley
Copy link

Don't want to sound grumpy because overall I'm extremely happy with StormPath's offerings.

However, this is clearly a known and well-understood issue and that calls for some kind of warning in the docs for both stormpath and express-stormpath.

In our case, we wanted to use the simple groupsRequired middleware from express-stormpath and found that, where we use this, we are making multiple calls to stormpath for every request to our app. Our naive assumption was that if we expand groups in the user, the expanded list would get used.

It took valuable time to dig through express-stormpath first and then stormpath before determining that "it just doesn't work that way" and finding this ticket.

@peebles
Copy link

peebles commented Sep 7, 2015

There are two problems I see, both in lib/cache/CacheHandler.js.

The first is getCacheByHref(). It gets a "region" like this:

region = href.split('/').slice(-2)[0];

but that fails then a href has something after the resource id, as in "https://api.stormpath.com/v1/groups/14uvgQaAUf0kCVdL8lXGvf/customData". It should return "groups" but returns the resource id, and so the DisableCache region is used. I think this line of code should be:

region = href.split('/' )[4];

The next problem is, although put() is recursive, storing all of the sub-resources when storing a resource, get() is not recursive. get() should retrieve the resource, and then try to retrieve its sub-resources. Here is an implementation using async that seems to work:

CacheHandler.prototype.get = function getCachedResource(href, callback) {
    var _this = this;
    return getCacheByHref(_this.cacheManager, href).get(href, function( err, data ) {
    if ( err ) return callback( err );
    if ( ! data ) return callback( err, data );
    async.eachSeries( _.keys( data ), function( key, cb ) {
        var val = data[ key ];
        if ( ! ( val && val.href ) ) return cb();
        _this.get( val.href, function( err, val ) {
        if ( err ) return cb( err );
        if ( val ) data[ key ] = val;
        cb();
        });
    }, function( err ) {
        return callback( err, data );
    });
    });
};

These solutions seem to fix my problem; that passport-deserialize was always calling stormpath apis, even though I prefetched account customData (user profile) and account groups (roles) and their customData. Obviously I only want to do that once. With the changes above, here is my deserialize function that fetches account profile and groups, and the groups customData once, and then gets all from the cache on subsequent runs:

passport.serializeUser( strategy.serializeUser );
passport.deserializeUser( function( userHref, done ) {
    strategy.deserializeUser( userHref, function( err, account ) {
    if ( err ) return done( err );
    if ( account.customData && account.groups &&
         account.groups.items &&
         account.groups.items.length &&
         _.map( account.groups.items, 'customData' ).length == account.groups.items.length ) return done( null, account );  // got it from cache
    account.getCustomData( function( err, cd ) {
        if ( err ) return done( err );
        account.customData = cd;
        account.getGroups().expand({ customData: true }).exec( function( err, groups ) {
        account.groups.items = groups.items;
        done( null, account );
        });
    });
    });
});

I'll do a fork and a pull-request with these changes in a few minutes...

@typerandom typerandom added the bug label Dec 27, 2015
@vvrinne
Copy link

vvrinne commented Feb 9, 2016

So is this issue still open? If you want to authorize APIs by group membership you either have to hit Stormpath on every API call or cache the groups by hand?

I agree that overall caching lists is non-trivial, but group membership seems like a pretty specific case where groups are unlikely to change very often and it has a huge impact on overall performance.

So is the correct workaround for this to fetch the group information manually per token renewal (or whatever) and cache it separately from Stormpath?

@peebles
Copy link

peebles commented Feb 9, 2016

I am still using my private fork. I have had no indication that this has been fixed in the production release, and I am a little worried that my forked code will become incompatible with their servers at some point.

@robertjd
Copy link
Member

robertjd commented Feb 9, 2016

Hello everyone, thank you for bumping this issue. As list caching is difficult, we have not included it in our SDK's as a matter of practice. @peebles I see that you have a PR open, does that patch handle invalidation of cached collections if a member of the collection changes?

As such, we do leave this in your hands. If you are using our client credentials workflow for API authentication, we suggest that you add the names of the groups as scope to the token. @vvrinne does that help with your use case?

@peebles
Copy link

peebles commented Feb 9, 2016

Honestly I do not remember 100% if my patch invalidates a collection if there was a client side change. In my case however, the collection I am fetching (and caching) are the groups I use as "roles", which I am not changing on the client side, ever. So for my use case, if collection validation were broken, I'm not sure it would matter to me and I am not sure it would be detectable for me. The 5 minute TTL I use on the cache would invalidate things anyway at some point.

I admit, I need to pull master and re-evaluate my changes. I've just not had time to do it.

@vvrinne
Copy link

vvrinne commented Feb 9, 2016

I think it won't be too hard to find a workaround for this. It just seems like something I would expect the Stormpath client to implement out of box. API validation against groups that the user has seems like a pretty common scenario and one where caching the group list makes sense imo. But yeah we can simply work around this for now.

@peebles
Copy link

peebles commented Feb 9, 2016

Role based authorization is indeed important. I guess another workaround would be to put "roles" into an account's customData.

{
  "roles": [ "admin" ]
}

@vvrinne
Copy link

vvrinne commented Feb 9, 2016

Yeah that would have been an easy solution. I don't like it for our project since we are going to be using the Stormpath UI as a backoffice tool for maintaining roles and this method will add one way for a human to make an error. I much prefer the current way of adding group memberships through the UI.

We are currently simply reading the group information when ever the user logs in or renews their access token and caching them separately.

@rosskukulinski
Copy link

Chiming in that this is extremely frustrating - we're trying to do the same thing as @vvrinne

@4storia
Copy link

4storia commented Jun 13, 2016

Just going to also +1 this issue from #476

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

9 participants