improved Collada loader #569

frede791 · 2024-01-11T16:55:47Z

Bug fix

Summary

This PR speeds up the loading of Collada files, which drastically cuts loading times for large meshes.
This was done using several techniques, ranked below in order of impact:

1. Pass by reference rather than pass by value. The largest time saving came from replacing copy assignments of large objects with shared pointer copies. Only the pointer is copied instead of the whole object, which leads to a significant performance boost.
2. Data structure changes. In several places, std::map was changed to std::unordered_map to benefit from average constant-time element access.
3. Smaller algorithmic improvements. These include improving the split() function and allocating memory ahead of time for objects of known size.
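
As a rough illustration of the first and third points, the sketch below contrasts a cache that copies a large vector on every store and lookup with one that only copies a shared pointer, and reserves capacity up front when the element count is known. The names CopyingCache, SharingCache and MakeValues are hypothetical and are not taken from ColladaLoader.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-source value arrays kept by a loader.
using Values = std::vector<double>;

// Copies the whole vector on every Store() and Get().
struct CopyingCache
{
  std::unordered_map<std::string, Values> entries;

  void Store(const std::string &_id, const Values &_v) { entries[_id] = _v; }
  Values Get(const std::string &_id) const { return entries.at(_id); }
};

// Copies only a shared pointer; the vector itself is never duplicated.
struct SharingCache
{
  std::unordered_map<std::string, std::shared_ptr<Values>> entries;

  void Store(const std::string &_id, std::shared_ptr<Values> _v)
  {
    assert(_v != nullptr);  // storing a null pointer would defeat the cache
    entries[_id] = std::move(_v);
  }
  std::shared_ptr<Values> Get(const std::string &_id) const
  {
    return entries.at(_id);
  }
};

// Builds a vector of known size, reserving capacity once instead of
// letting the vector reallocate repeatedly while it grows.
std::shared_ptr<Values> MakeValues(std::size_t _count)
{
  auto values = std::make_shared<Values>();
  values->reserve(_count);
  for (std::size_t i = 0; i < _count; ++i)
    values->emplace_back(static_cast<double>(i));
  return values;
}
```

With SharingCache, every lookup returns a pointer to the same underlying vector, so the cost of a lookup does not grow with the mesh size.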

To see the improvements for yourself, I have linked a folder called torino below, containing a medium-to-high resolution map of the center of Turin, Italy. You can run this map using:
GZ_SIM_RESOURCE_PATH=/path/to/parent/dir gz sim -r /path/to/torino/torino.sdf

With the default ColladaLoader.cc, this map took about 85 seconds to load on my machine. With the new implementation it took about 10 seconds. For larger maps, I have seen even more drastic improvements: a map that previously had not finished loading after more than 2 hours now takes only 3 minutes.

One technique that I did not manage to use was parallelization. Most of the startup time is still spent in the LoadTriangles() function, which itself calls several other functions. As far as I understand, these functions cannot be parallelized. The other major time sink is now the XML parsing itself. The current implementation uses tinyxml2. While it is good, after some research I found several other parsers that are reported to be even faster (pugixml, rapidxml). Rewriting the XML sections to use one of these would be a worthwhile endeavor for the future, as it could lead to another large performance gain for big files.

Link to "torino" map:
https://drive.google.com/drive/folders/13TVY2aOQiS-SN2l3HkUXauy1qlJFKaGa?usp=sharing

Checklist

Signed all commits for DCO
codecheck passed (See contributing)
All tests passed (See test coverage) (no new failures)

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.

Signed-off-by: frederik <[email protected]>

frede791 · 2024-01-11T18:34:04Z

@marcoag @mjcarroll It appears that 3 tests are failing. I am not sure whether this is to do with the modifications done by me as I ran the tests on main branch and they seem to be failing there as well. Do you have an idea?

bperseghetti · 2024-01-11T18:42:08Z

@marcoag @mjcarroll It appears that 3 tests are failing. I am not sure whether this is to do with the modifications done by me as I ran the tests on main branch and they seem to be failing there as well. Do you have an idea?

@frede791 Mind splitting out the linting changes from the functional code changes, or removing the linting changes completely? That will make it easier to look at the actual changes and also easier to debug.

caguero

Thanks, this patch looks very promising! I was able to reproduce your numbers as well. I had a few very minor changes around the code. Here's my most important question:

Your description mentions that the optimizations are ranked in order of impact. Could you share some extra information about how much each optimization contributes to the speedup? I'm surprised that the optimization (1) is so important as we were mostly passing references.

caguero · 2024-01-11T18:46:25Z

graphics/src/ColladaLoader.cc

@@ -330,7 +339,7 @@ namespace gz
 void hash_combine(std::size_t &_seed, const double &_v)
 {
  std::hash<double> hasher;
-  _seed ^= hasher(_v) + 0x9e3779b9 + (_seed << 6) + (_seed >> 2);


Could you explain why removing the constant please?

Adding a constant doesn't change the distribution of the hash function but costs about 0.2 seconds.

That is unexpected!

I think I took the code from https://stackoverflow.com/a/2595226. That's a while back. Maybe there's a better way to do it now.

The current implementation of the hash function accounts for about 0.1 seconds on the Torino map.
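
For context, a hash_combine like the one in the diff is typically folded over the components of a vector so that the vector can serve as a key in a std::unordered_map. Below is a minimal sketch of that pattern. The Vec3 struct, the Vec3Hash functor and the IndexMap alias are hypothetical and are not from ColladaLoader.cc; the combine line is the pre-change version shown in the diff above.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>

// Hypothetical 3D vector type used only for this illustration.
struct Vec3
{
  double x;
  double y;
  double z;

  bool operator==(const Vec3 &_o) const
  {
    return x == _o.x && y == _o.y && z == _o.z;
  }
};

// Mixes the hash of one double into an existing seed
// (same formula as the removed line in the diff).
inline void HashCombine(std::size_t &_seed, double _v)
{
  std::hash<double> hasher;
  _seed ^= hasher(_v) + 0x9e3779b9 + (_seed << 6) + (_seed >> 2);
}

// Hash functor that folds all three components into one value.
struct Vec3Hash
{
  std::size_t operator()(const Vec3 &_v) const
  {
    std::size_t seed = 0;
    HashCombine(seed, _v.x);
    HashCombine(seed, _v.y);
    HashCombine(seed, _v.z);
    return seed;
  }
};

// Maps a vector to the index at which it was first seen.
using IndexMap = std::unordered_map<Vec3, unsigned int, Vec3Hash>;
```

Whether dropping the constant changes the quality of the distribution for real mesh data is not something this sketch measures; it only shows where the function sits in the lookup path.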

frede791 · 2024-01-11T21:12:42Z

@caguero I ran the code through Intel VTune and analyzed where most of the startup time was spent. By far the largest time saving comes from changing the _duplicates and _values objects in the LoadTexCoords() function to shared pointers. I applied the same idea to the other load functions, which had similar implementations. This alone lowered the load time for the torino map to about 20 seconds (with LoadTexCoords() alone accounting for a 35 second drop). The other improvements are relatively minor in comparison, but they still seemed significant enough to make. I'm sure more improvements are possible (especially in the GeometryVertices class, which is now one of the most significant time factors at about 1 second); however, the remaining savings are small, and for me personally it doesn't matter much whether a map loads in 7 seconds or 10 seconds, since for larger maps the biggest bottleneck is now tinyxml2 anyway.
Approximate load-time savings on the torino map, in the same order as the list in the summary:

1. ca. 65 seconds
2. ca. 8 seconds
3. <3 seconds

@bperseghetti @caguero I will undo/improve on the linting changes and other requests tomorrow.

mjcarroll · 2024-01-12T00:29:33Z

@marcoag @mjcarroll It appears that 3 tests are failing.

I can take a closer look in the morning.

frede791 · 2024-01-12T09:31:44Z

@caguero @bperseghetti I have addressed all issues raised above. Codecheck passes as well.

mjcarroll · 2024-01-12T15:48:48Z

So I think the failures are "real" in that those tests are segfaulting.

Edit: and of course they don't reproduce locally.

mjcarroll · 2024-01-12T17:27:15Z

graphics/src/ColladaLoader.cc

@@ -2028,10 +2241,10 @@ void ColladaLoader::Implementation::LoadPolylist(
    std::string offset = polylistInputXml->Attribute("offset");
    if (semantic == "VERTEX")
    {
-      unsigned int count = norms.size();
+      unsigned int count = (*norms).size();


This is where the segfault is happening in CI. Since norms is an uninitialized shared_ptr, norms->size() isn't a valid call at this point. I suppose it depends on your compiler/system if something kinda bad (data corruption) or really bad (segfault) happens at this point, because it seems to only consistently happen on my jammy system.

I fixed this with the latest commit.
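
A minimal sketch of guarding this kind of access. The helper names SafeCount and EnsureAllocated are hypothetical and not taken from the PR, and the actual fix in the commit may differ:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical alias standing in for the normals container.
using Normals = std::vector<double>;

// Returns the element count, treating a null pointer as an empty container
// instead of dereferencing it.
std::size_t SafeCount(const std::shared_ptr<Normals> &_norms)
{
  return _norms ? _norms->size() : 0u;
}

// Allocates the container on first use so that later dereferences are valid.
Normals &EnsureAllocated(std::shared_ptr<Normals> &_norms)
{
  if (!_norms)
    _norms = std::make_shared<Normals>();
  return *_norms;
}
```

A default-constructed std::shared_ptr holds a null pointer, so calling size() through it is undefined behavior; that is consistent with the symptom varying between systems.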

The remaining test failure for the Collada loader seems to be the result of a mismatch between the actual mesh vertex and normal counts and the expected counts.

mjcarroll · 2024-01-12T17:28:07Z

Overall, I am on board with the approach of minimizing copies where possible here, but I'm not sure if switching to shared_ptrs is the exact approach that we want to do yet. I'm going to spend a little more time with this today and see if I can come up with concrete feedback.

mjcarroll · 2024-01-12T17:29:48Z

There are also existing memory issues in some of these tests that ASAN catches. I'm going to try to get those resolved so we have a clean baseline to discuss from.

Make it work right and then make it work fast, right?

mjcarroll · 2024-01-18T21:10:41Z

Okay, I have addressed all current ASAN issues in the graphics component in #571.

frede791 · 2024-01-18T22:10:29Z

@mjcarroll I am not sure why the ColladaLoader test is failing. CI reports a mismatch between the expected and actual vertex/normal counts, but I don't see why this happens, especially since only 1 of the 14 tests in the test suite fails. I could check whether the changes from #571 have an impact. What do you think?

azeey · 2024-06-10T18:55:20Z

@mjcarroll Would you be able to take another look at this?

iche033

I made a comment with a patch that gets the test to pass. It reverts a check but I'm not sure if that change is intended or not.

iche033 · 2024-08-08T20:24:00Z

graphics/src/ColladaLoader.cc


-      // create a map of duplicate indices
+    // create a map of duplicate indices
+    if((*prev_vec) != vec){


Looked into the test failure and found that this check results in more unique normals. I'm not sure I understand the logic here: is a vector marked as unique if it's not equal to the previous one? Say we have the following vectors: A = [0, 0, 1], B = [0, 1, 0], C = [0, 0, 1]. I think with this logic, C will be marked as unique because it is not equal to B?

If I remove the extra check with the following patch, the test passes:

collada.patch

diff --git a/graphics/src/ColladaLoader.cc b/graphics/src/ColladaLoader.cc
index b7979a3..c6c231c 100644
--- a/graphics/src/ColladaLoader.cc
+++ b/graphics/src/ColladaLoader.cc
@@ -1562,8 +1562,6 @@ void ColladaLoader::Implementation::LoadPositions(const std::string &_id,
   auto values = toDoubleVec(valueStr, totCount);
   gz::math::Vector3d vec;
-  std::shared_ptr<gz::math::Vector3d> prev_vec =
-    std::make_shared<gz::math::Vector3d>(gz::math::Vector3d::Zero);
   if (!_values)
     _values = std::make_shared<std::vector<gz::math::Vector3d>>();
   if (!_duplicates)
@@ -1579,16 +1577,10 @@ void ColladaLoader::Implementation::LoadPositions(const std::string &_id,
     (*_values).emplace_back(vec);
     // create a map of duplicate indices
-    if((*prev_vec) != vec){
-      if (unique.find(vec) != unique.end())
-        (*_duplicates)[(*_values).size()-1] = unique[vec];
-      else
-        unique[vec] = (*_values).size()-1;
-    }
+    if (unique.find(vec) != unique.end())
+      (*_duplicates)[(*_values).size()-1] = unique[vec];
     else
       unique[vec] = (*_values).size()-1;
-
-    (*prev_vec) = vec;
   }
   this->positionDuplicateMap[_id] = _duplicates;
@@ -1728,8 +1720,6 @@ void ColladaLoader::Implementation::LoadNormals(const std::string &_id,
   auto values = toDoubleVec(valueStr, totCount);
   gz::math::Vector3d vec;
-  std::shared_ptr<gz::math::Vector3d> prev_vec =
-    std::make_shared<gz::math::Vector3d>(gz::math::Vector3d::Zero);
   if (!_values)
     _values = std::make_shared<std::vector<gz::math::Vector3d>>();
   if (!_duplicates)
@@ -1747,16 +1737,10 @@ void ColladaLoader::Implementation::LoadNormals(const std::string &_id,
     (*_values).emplace_back(vec);
     // create a map of duplicate indices
-    if((*prev_vec) != vec){
-      if (unique.find(vec) != unique.end())
-        (*_duplicates)[(*_values).size()-1] = unique[vec];
-      else
-        unique[vec] = (*_values).size()-1;
-    }
+    if (unique.find(vec) != unique.end())
+      (*_duplicates)[(*_values).size()-1] = unique[vec];
     else
       unique[vec] = (*_values).size()-1;
-
-    (*prev_vec) = vec;
   }
   this->normalDuplicateMap[_id] = _duplicates;
@@ -1917,8 +1901,6 @@ void ColladaLoader::Implementation::LoadTexCoords(const std::string &_id,
   auto values = toDoubleVec(valueStr, totCount);
   gz::math::Vector2d vec;
-  std::shared_ptr<gz::math::Vector2d> prev_vec =
-    std::make_shared<gz::math::Vector2d>(gz::math::Vector2d::Zero);
   if (!_values)
     _values = std::make_shared<std::vector<gz::math::Vector2d>>();
   if (!_duplicates)
@@ -1933,16 +1915,10 @@ void ColladaLoader::Implementation::LoadTexCoords(const std::string &_id,
     (*_values).emplace_back(vec);
     // create a map of duplicate indices
-    if((*prev_vec) != vec){
-      if (unique.find(vec) != unique.end())
-        (*_duplicates)[(*_values).size()-1] = unique[vec];
-      else
-        unique[vec] = (*_values).size()-1;
-    }
+    if (unique.find(vec) != unique.end())
+      (*_duplicates)[(*_values).size()-1] = unique[vec];
     else
       unique[vec] = (*_values).size()-1;
-
-    (*prev_vec) = vec;
   }
   this->texcoordDuplicateMap[_id] = _duplicates;
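
To make the difference between the two variants easier to reason about, here is a small standalone sketch of the duplicate bookkeeping, reconstructed from the lines the patch removes and adds. It is an approximation under stated assumptions: it uses std::map keyed by std::array instead of the loader's own containers and vector types, and the function names WithPrevCheck and WithoutPrevCheck are hypothetical. Under this reading, both variants record C as a duplicate of A in the A, B, C sequence above. They appear to differ for adjacent repeats such as A, A, B, where the checked variant overwrites the stored index instead of recording a duplicate.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <vector>

using Vec3 = std::array<double, 3>;
// Maps the index of a repeated vector to the index of its earlier occurrence.
using DuplicateMap = std::map<std::size_t, std::size_t>;

// Variant with the previous-element check (the lines removed by the patch).
DuplicateMap WithPrevCheck(const std::vector<Vec3> &_in)
{
  std::map<Vec3, std::size_t> unique;
  DuplicateMap duplicates;
  Vec3 prev{0.0, 0.0, 0.0};
  for (std::size_t i = 0; i < _in.size(); ++i)
  {
    const Vec3 &vec = _in[i];
    if (prev != vec)
    {
      auto it = unique.find(vec);
      if (it != unique.end())
        duplicates[i] = it->second;
      else
        unique[vec] = i;
    }
    else
    {
      // Same as the previous element: no duplicate is recorded and the
      // stored index is overwritten with the current one.
      unique[vec] = i;
    }
    prev = vec;
  }
  return duplicates;
}

// Variant without the check (the lines added by the patch).
DuplicateMap WithoutPrevCheck(const std::vector<Vec3> &_in)
{
  std::map<Vec3, std::size_t> unique;
  DuplicateMap duplicates;
  for (std::size_t i = 0; i < _in.size(); ++i)
  {
    const Vec3 &vec = _in[i];
    auto it = unique.find(vec);
    if (it != unique.end())
      duplicates[i] = it->second;
    else
      unique[vec] = i;
  }
  return duplicates;
}
```

If this reconstruction matches the PR, fewer recorded duplicates in the checked variant would be consistent with the higher unique vertex and normal counts seen in the failing test.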

azeey · 2024-08-22T16:08:53Z

I'll go ahead and remove beta from this PR. I don't think we have enough time to continue iterating on it. Since this will hopefully not break behavior, it can be merged after Ionic is released.

improved Collada loader

d1bfb4d

Signed-off-by: frederik <[email protected]>

frede791 requested a review from marcoag as a code owner January 11, 2024 16:55

github-actions bot added 🌱 garden Ignition Garden 🎵 harmonic Gazebo Harmonic labels Jan 11, 2024

caguero reviewed Jan 11, 2024

remove pure linter changes and fix camel cases

c22b4d9

Signed-off-by: frederik <[email protected]>

frede791 force-pushed the collada_acceleration branch from 9516485 to c22b4d9 Compare January 12, 2024 09:21

more linting changes

f0ff34f

Signed-off-by: frederik <[email protected]>

frede791 force-pushed the collada_acceleration branch from 1098a84 to f0ff34f Compare January 12, 2024 09:28

more optimizations. Remove iss, move ptr check conditions outside of …

baa748a

…loop, use emplace_back Signed-off-by: frederik <[email protected]>

frede791 force-pushed the collada_acceleration branch from 4cff57c to baa748a Compare January 12, 2024 11:22

frede791 added 2 commits January 12, 2024 13:40

more improvements, reduce find() time

d5322d2

Signed-off-by: frederik <[email protected]>

more minor improvements

6911db6

Signed-off-by: frederik <[email protected]>

frede791 force-pushed the collada_acceleration branch from 1571a65 to 6911db6 Compare January 12, 2024 13:04

frede791 requested a review from caguero January 12, 2024 14:53

mjcarroll reviewed Jan 12, 2024

fix two of three seg faults

f4962c0

Signed-off-by: frederik <[email protected]>

mjcarroll mentioned this pull request Jan 16, 2024

Multiple memory cleanup fixes #571

Merged

Merge branch 'gz-common5' into collada_acceleration

1aaba8f

Merge branch 'gz-common5' into collada_acceleration

c0a5bc1

azeey added the beta Targeting beta release of upcoming collection label Jul 29, 2024

iche033 reviewed Aug 8, 2024

Merge branch 'gz-common5' into collada_acceleration

9fd5ab7

azeey removed the beta Targeting beta release of upcoming collection label Aug 22, 2024
