Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEW: Add Hierarchy::prepopulate_numchildren_cache() #8380

Merged
merged 12 commits into from
Sep 25, 2018
Merged

NEW: Add Hierarchy::prepopulate_numchildren_cache() #8380

merged 12 commits into from
Sep 25, 2018

Conversation

sminnee
Copy link
Member

@sminnee sminnee commented Sep 18, 2018

Adds the ability to prepopulate the cache for Hierarchy::numChildren()
in a batch.

This is still a WIP as it will give bad results if augmentStageChildren
is used in a way that alters the number of records.

This is an useful part of optimising the query count on tree generation.
See #8379

Parent issue

sminnee pushed a commit to sminnee/silverstripe-cms that referenced this pull request Sep 18, 2018
Only relevant if silverstripe/silverstripe-framework#8380 is avialable,
however coded defensively so it can be merged before that PR if needs 
be.

See silverstripe/silverstripe-framework#8379
@sminnee
Copy link
Member Author

sminnee commented Sep 18, 2018

Note: although this didn't have an order-of-magnitude impact on performance locally (reducing treeview execution from 3.5s to 2.5s), it did reduce the query count from ~140 to ~40. This should have a greater impact on deployments where the database is running on a separate server.

@maxime-rainville maxime-rainville self-assigned this Sep 18, 2018
Copy link
Contributor

@maxime-rainville maxime-rainville left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change seems to be altering the results in some way. When I enable prepopulate_numchildren_cache(), I get a bunch of Too many pages ( show as list ) warnings in my treeview that shouldn't be there.

For what it's worth, it's noticeably faster.
image

@sminnee
Copy link
Member Author

sminnee commented Sep 20, 2018

OK there's an issue with this: it conflicts with the lumberjack module because the lumberjack module overrides stageChildren. This causes the issue that Max found.

I'm going to see if I can amend the prepopulation logic to use a call to stageChildren as the basis of its query.

@sminnee
Copy link
Member Author

sminnee commented Sep 20, 2018

Hrm, so I looked into this, and I wonder if we declare the problem to be with lumberjack: silverstripe/silverstripe-lumberjack#86

@sminnee
Copy link
Member Author

sminnee commented Sep 20, 2018

You could for example do this instead:

@maxime-rainville
Copy link
Contributor

Just tweaks Sam's logic to avoid counting the children of childless object when the cache is complete.

We're now getting a pretty solid 28% performance boost when rendering treeview. 😎

image

@robbieaverill
Copy link
Contributor

You could for example do this instead:

The suspense is killing me

@maxime-rainville
Copy link
Contributor

maxime-rainville commented Sep 20, 2018

@robbieaverill Our key project will need to add this to their config:

SilverStripe\CMS\Model\SiteTree:
  hide_from_cms_tree:
  - App\Pages\RandomPageTypeOne
  - App\Pages\RandomPageTypeTwo
  - App\Pages\RandomPageTypeThree

I think that's what is meant to be in Sam's comment.

@sminnee
Copy link
Member Author

sminnee commented Sep 20, 2018

Your amendment looks good, Maxime. I think that there's one thing we want to note in the changelog:

Take care with stageChildren() overrides. Hierarchy::numChildren() results will only make use of stageChildren() results that are applied to the base class and don't include record-specific behaviour.

@sminnee sminnee dismissed maxime-rainville’s stale review September 21, 2018 03:02

This issue has been addressed.

@sminnee
Copy link
Member Author

sminnee commented Sep 21, 2018

@silverstripe/core-team can we get a sanity check on whether the change described in my most recent comment is okay for semver?

@chillu
Copy link
Member

chillu commented Sep 21, 2018

@silverstripe/core-team can we get a sanity check on whether the change described in my most recent comment is okay for semver?

I've done a sanity check via Github for stageChildren, and it brings up a few use cases: https://github.com/search?l=PHP&q=%22function+stageChildren%22+dataobject&type=Code. Seems to be pretty much exclusively symbiote's various "connector" modules, and then a whole bunch of core code copied into other git repos.

The base module of this isn't even 4.x compatible (https://github.com/nyeholt/silverstripe-external-content), so I don't think it'll cause wider issues in practice - but looping in the maintainer @nyeholt in here as a courtesy either way.

Sam Minnee and others added 3 commits September 21, 2018 17:57
This helper is necessary when you want to replace a parameter with a
non-literal such as a column name. This arises when you are
transforming queries such as we do in the prepopulation query for
Hierarchy::numChildren()
API: Hierarchy::stageChildren() customisations must be applied to the base class and not include record-specific behaviour.

Adds the ability to prepopulate the cache for Hierarchy::numChildren()
in a batch.

Note that this optimisation means that stageChildren() is not called on
each record in order to calculate numChildren(). This means that the
structure of the stageChildren() query must be the same for all records
and the behaviour cannot be customised only for a subclass of the base
data class. For example, apply your customisations to SiteTree and not
a subclass.

This is an useful part of optimising the query count on tree generation.
See #8379
This will avoid trying to count the children of childless object.
@sminnee
Copy link
Member Author

sminnee commented Sep 21, 2018

OK I've rebased this and added a note to the 4.3 changelog.

sminnee pushed a commit to sminnee/silverstripe-cms that referenced this pull request Sep 21, 2018
Only relevant if silverstripe/silverstripe-framework#8380 is avialable,
however coded defensively so it can be merged before that PR if needs 
be.

See silverstripe/silverstripe-framework#8379
@maxime-rainville
Copy link
Contributor

I've got the unit tests written, but there's an edge case that's being tripped that I didn't quite figure out how to resolve yet. I'll have another look at in the morning.

prepopulate_numchildren_cache doesn't honor the hide_from_hierarchy config.

There's also a bit of an issue with unit testing the hide_from_cms_tree config flag because it only gets triggered when Hierachy::showingCMSTree returns true. This method is based on the current controller being LeftAndMain and the current action being one of "treeview", "listview", "getsubtree". FYI I'm very judgemental of this method!

tests/php/ORM/HierarchyCachingTest.php Outdated Show resolved Hide resolved
@sminnee
Copy link
Member Author

sminnee commented Sep 24, 2018

@ScopeyNZ I agree with the concerns about the DB::replace_parameter implementation, but a couple of notes:

  • The implementation was copied from DB::inline_parameters(), rather than written from scratch.
  • The "real fix" to its necessity is probably an SS5 topic such as SS5 RFC: More batch-oriented ORM #8397

@sminnee
Copy link
Member Author

sminnee commented Sep 24, 2018

prepopulate_numchildren_cache doesn't honor the hide_from_hierarchy config.

This surprises me. Is the config applied to SiteTree or a subclass?

public static function setUpBeforeClass()
{
parent::setUpBeforeClass();
HideTestObject::config()->update(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you apply this to SiteTree instead?

Copy link
Contributor

@maxime-rainville maxime-rainville Sep 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not as it ought to work with this and that's how the other Hierarchy test are using.

@sminnee
Copy link
Member Author

sminnee commented Sep 24, 2018

@maxime-rainville do you think I should merge the #8395 changes into this one?

The relevant changes are already in silverstripe/silverstripe-cms#2266

Simplify prepopulate_numchildren_cache by adding a $IDLess param to
stageChildren.
@maxime-rainville
Copy link
Contributor

Sorry, I bulked down and didn't see all the messages flying.

The problem was that some of the where statements in the loop where getting sent back as arrays and other were being sent back as DataQuery_SubGroup. The DataQuery_SubGroup were just being skipped altogether.

I just took some liberties with prepopulate_numchildren_cache logic. I added a flag to stageChildren that will suppress the ID/ParentID where clauses when set to true. I'm using that instead of looping through all the where clause.

I think it makes the logic a bit easier to follow and a bit cleaner and the generate SQL statement looks legit.

SELECT DISTINCT "ParentID", COUNT(DISTINCT "SiteTree"."ID") AS "NumChildren" FROM "SiteTree" LEFT JOIN "SiteTree_Localised" AS "SiteTree_Localised_en_NZ" ON "SiteTree"."ID" = "SiteTree_Localised_en_NZ"."RecordID" AND "SiteTree_Localised_en_NZ"."Locale" = ? WHERE (("SiteTree"."ClassName" NOT IN (?, ?, ?, ?) OR "SiteTree"."ClassName" IS NULL)) GROUP BY ParentID

However, it probably wouldn't hurt to have someone who actually knows what they're doing have another look at it.

On the topic #8395, it probably would be better to fold it into this one, as rebasing it at this stage would probably be more trouble than its worth.

Copy link
Contributor

@maxime-rainville maxime-rainville left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's in a good state now, but I ended up doing a big chunk of it, so it probably would be good for some one else to glance at it.

}
}
}
$query = $dummyObject->stageChildren(true, true)->dataQuery()->query();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably stop the implementation of DB::replace_parameter if we go with this, since it's a bit gross and was only added for this use-case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add "Where ID != ParentID" as a filter here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the Where ID != ParentID statement to stageChildren.

$query->setGroupBy([
"ParentID
"]);
$query->setGroupBy(['"ParentID"']);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably best to use Convert::symbol2sql("ParentID") here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@sminnee
Copy link
Member Author

sminnee commented Sep 24, 2018

@maxime-rainville I've left a few comments — do you want to address those, then I'll rebase/squash and maybe we can get Ingo to do the final merge?

This provides a more extensible way of preopulating caches for optimised
tree generation.

Fixes #8391
@maxime-rainville
Copy link
Contributor

All done.

@maxime-rainville maxime-rainville merged commit 5b7a841 into silverstripe:4 Sep 25, 2018
maxime-rainville pushed a commit to silverstripe/silverstripe-cms that referenced this pull request Sep 25, 2018
* FIX: Use Hierarchy::prepopulate_numchildren_cache in tree-generation

Only relevant if silverstripe/silverstripe-framework#8380 is avialable,
however coded defensively so it can be merged before that PR if needs 
be.

See silverstripe/silverstripe-framework#8379

* FIX: Use Hierarchy::prepopulateTreeDataCache() in CMS.

Requires silverstripe/silverstripe-framework#8395

* Cache tree_class instead of assuming it will always be SiteTree.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants