
Use marshal for save/load operators to improve performance #124

Merged
merged 20 commits into ProjectQ-Framework:develop on Jul 23, 2017

Conversation

Spaceenter
Contributor

No description provided.

@Spaceenter
Contributor Author

@babbush @jarrodmcc @idk3

@damiansteiger
Contributor

damiansteiger left a comment

I would suggest not using marshal over HDF5 because:

  1. It is Python specific.
  2. The format is not guaranteed to be identical across Python versions.
  3. It is not secure, as I could put malicious data into it.

https://docs.python.org/2/library/marshal.html

@idk3
Collaborator

idk3 commented Jul 22, 2017

@damiansteiger the first two points are true, but the third isn't - see e.g. https://stackoverflow.com/questions/26931919/marshal-unserialization-not-secure - marshal doesn't allow you to insert arbitrary code (as e.g. pickle / eval would) and there are no known exploits for it.

The main reasons we want to change from h5py to marshal are speed (loading ~5x faster, saving ~10x+ faster) and, more significantly, memory usage. In the tests I've done with saving/loading larger operators (with >10 M terms), the conversion to HDF5 blows up in memory, getting as high as ~30 GB for a 2 GB FermionOperator. marshal for the same file only uses ~8 GB by comparison (though I'm still interested in lighter solutions if you know of any!). This means that loading the kind of FermionOperators we'd be interested in saving in the first place is almost impossible with HDF5.
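
For context, a minimal sketch of the marshal-based save/load being discussed (the function names are illustrative rather than the exact PR code, and it assumes the operator exposes a terms dict with tuple keys and numeric coefficients):

    import marshal

    def save_terms(operator, operator_type, file_path):
        # marshal only handles built-in types, so cast every coefficient to plain complex.
        terms = {term: complex(coeff) for term, coeff in operator.terms.items()}
        with open(file_path, 'wb') as f:
            marshal.dump((operator_type, terms), f)

    def load_terms(file_path):
        # Returns the operator type tag and the raw terms dict; rebuilding the
        # operator object from the dict is left to the caller.
        with open(file_path, 'rb') as f:
            operator_type, terms = marshal.load(f)
        return operator_type, terms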

@idk3 idk3 self-requested a review July 22, 2017 23:06
@@ -223,3 +217,19 @@ def get_file_path(file_name, data_directory):
file_path = data_directory + '/' + file_name

return file_path


def numpy_scalar(number):
Collaborator

What if we were to cast all float types to complex instead? This makes me wonder why we don't just make the coefficients complex when they're defined - is there a reason we need things like numpy.complex128? @damiansteiger @babbush @jarrodmcc

@babbush
Contributor

babbush commented Jul 22, 2017

@damiansteiger: while I appreciate your concerns, right now performance is significantly more important for us. I see that it might be problematic for people to try to share data between different versions of Python, or with another language, but that is much less of a concern than the fact that we cannot run the calculations we need to run without this pull request, or something similar that dramatically brings down memory usage. Right now the security concerns appear to be hypothetical, as Ian points out. So before we make the new release Sunday night, we'd like to have this PR accepted.

babbush
babbush previously approved these changes Jul 22, 2017
for k, v in operator.terms.items()],
ensure_ascii=False).encode('utf8'))
tm = operator.terms
f = open(file_path, 'wb')
Collaborator

Should use a context manager here (with open(file_path, 'wb') as f:).

Contributor Author

Done.

ensure_ascii=False).encode('utf8'))
tm = operator.terms
f = open(file_path, 'wb')
marshal.dump((operator_type, {k: numpy_scalar(tm[k]) for k in tm}), f)
Collaborator

After looking at this a little more, I'm not sure numpy_scalar will do anything fundamentally different from casting to complex directly - do you have any thoughts on this @Spaceenter ? Using complex instead is a little cleaner (no extra function needed) and also a bit faster.

By doing that as well as using itertools' izip and imap, this line can be made ~25% faster (dict(izip(iterkeys(tm), imap(complex, itervalues(tm))))), though it's by no means a bottleneck :p

Contributor Author

In my opinion, the coefficients of both FermionOperator and QubitOperator should always be of native Python complex type when the operators are originally constructed, so that we don't need to apply the transformations/castings at all here. Is it possible to make that change in FermionOperator and QubitOperator?
If not, I agree that casting to complex directly here is cleaner.
@idk3 @babbush @jarrodmcc

Contributor

I am not sure we can trust that it will always be initialized this way. Even if it is, inevitably people will write things like:
my_fermion_operator *= my_numpy_array[i]
which may cause the coefficient to become a numpy type. So I think we are going to need to cast directly.
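
As a quick illustration (assuming a coefficient that has silently become a numpy scalar), marshal only accepts built-in types, which is why the explicit cast matters:

    import marshal
    import numpy

    coeff = 1.0 * numpy.float64(0.5)   # coefficients end up as numpy scalars this way
    marshal.dumps(complex(coeff))      # fine: built-in complex is marshalable
    # marshal.dumps(coeff)             # raises ValueError: unmarshallable object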

Contributor Author

Done.

babbush
babbush previously approved these changes Jul 23, 2017
@idk3
Collaborator

idk3 commented Jul 23, 2017

@Spaceenter The current failure is due to my mistake - it seems izip isn't available in Python 3 at all. I think map and zip from future.builtins or future_builtins should do it.

edit: from future_builtins import map, zip does it.

@@ -153,8 +153,8 @@ def save_operator(operator, file_name=None, data_directory=None):

tm = operator.terms
with open(file_path, 'wb') as f:
marshal.dump((operator_type, dict(izip(tm.iterkeys(),
imap(complex, tm.itervalues())))), f)
marshal.dump((operator_type, dict(zip(tm.keys(),
Collaborator

For the dictionary parts you should still be able to use iterkeys and itervalues from future.utils! i.e. iterkeys(tm).

Collaborator

I'm not sure, actually - I thought that Python 3's d.keys() was 2's d.iterkeys(). In fact I didn't know about viewkeys at all. It looks like both should work, but I know that future.utils.iterkeys passes tests for both 2 and 3 - it would be interesting to see if there's anything different with viewkeys, though I don't think we need any of its special properties like the dynamic view.

Contributor Author

Seems like tests for both 2 and 3 passed - should we go with this? Or do you want a change?

Collaborator

Should be fine as is! If it's significantly slower I'll make another PR after this one gets through, but I doubt there'll be any issues.

@damiansteiger
Contributor

@Spaceenter
Thanks for providing this solution to @babbush's and @idk3's performance problem. I am sure they will use your solution, as it solves their problem, especially since they don't have time to work out something different. 🥇

@idk3
Thanks for pointing out the performance issues with the previous approach. This is extremely helpful for putting this pull request into perspective (please always add comments explaining why a pull request is made) and provides one argument for switching to marshal.

Let me explain in more detail why I have raised objections about using marshal:

  1. A Python-specific data format is / will be a real problem in the ProjectQ Framework. Here is why:
    We have decided that the ProjectQ Framework uses Python as its main language and everything has to work in Python. However, as Python is slower than, e.g., C++, we also implement code which is a performance bottleneck in C++ (e.g. the simulator in ProjectQ). This gives the users a choice between the pure Python version or the faster C++ version. E.g. C++ Implementation of FermionOperator and QubitOperator #13 wants to have a C++ implementation of FermionOperators and QubitOperators. If we choose a language-independent format, then the faster C++ code can read and write independently without going through Python, which would kill performance.

  2. According to its own documentation, marshal is not guaranteed to be Python version or Python implementation independent. Since the first release of ProjectQ and FermiLib we have spent a lot of time making sure that we are Python version independent. Just because Travis CI passes for both Python 2 and Python 3 does not guarantee that other Python implementations won't fail reading this format.

  3. If the maintainers of the marshal module think they need to warn against unmarshalling data from untrusted sources, I think it is fair to assume an exploit is possible.

I am not yet convinced the performance argument outweighs any of these three points (maybe point 3 a little, but certainly not 1 and 2). Have you tried the following:

  1. Make a toy example problem which demonstrates your performance bottleneck (maybe somebody else browsing through the repo has a solution...)
  2. Play around with the different options of h5py to increase performance (a rough sketch of one direction is below), or ask an HDF5 expert...
  3. Figure out whether the performance bottleneck is actually the JSON step; it might be faster to create a custom datatype without going through JSON at all...
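
On point 2, a rough, untested sketch of the kind of h5py layout I mean - storing the coefficients as one chunked, compressed numeric dataset and the term keys as variable-length strings, instead of a single JSON blob (names here are illustrative only):

    import h5py
    import numpy

    def save_terms_hdf5(operator, file_path):
        terms = operator.terms
        keys = list(terms.keys())
        coeffs = numpy.array([complex(terms[k]) for k in keys], dtype=numpy.complex128)
        with h5py.File(file_path, 'w') as f:
            # Coefficients as a chunked, compressed numeric dataset instead of JSON text.
            f.create_dataset('coefficients', data=coeffs, chunks=True, compression='gzip')
            # Term keys as variable-length strings; repr/eval (or a custom encoding)
            # round-trips the tuple keys.
            str_dt = h5py.special_dtype(vlen=str)
            f.create_dataset('terms', data=[repr(k) for k in keys], dtype=str_dt)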

@babbush and @idk3
I can understand that you are under time pressure, and we are all trying our best to make sure FermiLib works for everyone. This means that we don't accept pull requests which solve a performance problem for a particular Google project while sacrificing fundamental design principles of the ProjectQ Framework that affect all other users. This means (hopefully very rarely) that there will be larger discussions to give everyone time to present their arguments and potentially find a better solution. Nevertheless, one great advantage of our framework being open source is that time pressure is almost never a valid argument for shortcutting proper code review and finding a good solution, because there is always the following way out:

For the time being, you could just copy this code (which uses marshal and solves your immediate performance problem) and solve your problem that way. Is there a specific reason why this util function has to be inside FermiLib?
Just submit the code internally at Google and start using it, or put this one function temporarily on your own GitHub account, until we have found a solution which works for everyone.

@babbush
Contributor

babbush commented Jul 23, 2017

I understand and respect your objections. However, for us to keep developing FermiLib we need to be able to sometimes strike a compromise between functionality today and purity of engineering practices. Currently, FermiLib is implemented entirely in Python and marshal is passing for all tested versions of Python. While theoretically possible, there are no known security exploits in marshal. Simply put, that is good enough for us.

The solution is in some ways suboptimal, so it makes sense to open an issue and eventually work towards a better solution. We can document what is going on with examples and wait for this hypothetical person who is browsing our repository to suggest something better. But currently the memory spikes associated with the existing approach are completely unacceptable. We need a new release of FermiLib to happen Sunday night (as we discussed) and to include the marshal solution. Otherwise, we cannot check the new version into Google3, and that will cause us unacceptable delays. Considering that almost all FermiLib pull requests currently come from people involved in this project, I think this is a completely fair reason. Our projects are largely responsible for moving FermiLib forward, so we need to be practical. This solution does work for everyone who actually uses the code today. Please accept this pull request in time for the new release.

If you want to discuss further, please call me.

@damiansteiger
Contributor

damiansteiger commented Jul 23, 2017

This is an open source project, so technical discussions should happen openly and not behind closed doors (e.g. phone calls), so that everyone using FermiLib can see why certain design decisions have been made.

Here is a compromise:
I will remove my code review block so that @babbush can merge and unblock his time-critical project. I am only doing this because FermiLib is still in an alpha release and users should be aware that the code can change at any time. But my experience has taught me that it is much harder to get someone to write a nice solution to this problem (one that does not have objections 1-3) if a "(not) good enough" solution that works for some people already exists. Hence, @babbush, are you willing to fix this problem before we make the first release?

@Spaceenter
Contributor Author

Thanks for the discussions.

I actually have another idea to make this work without making it Python specific. Inside Google, almost all large engineering projects use protobuf as a no-brainer choice for representing data models (complex or not), because it is designed to work across different languages and has much better serialization than formats like JSON.

So instead of using a native Python dictionary to represent FermionOperator and QubitOperator (and a few other objects like InteractionRDM, MoleculeOperator, etc.), we would define a protobuf message for each of them and then write the serialized protobuf binary to a file. https://developers.google.com/protocol-buffers/docs/pythontutorial#parsing-and-serialization
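
As a rough illustration of what that could look like (the message, field, and module names below are hypothetical, not an existing FermiLib proto):

    # Hypothetical .proto definition (not part of FermiLib today):
    #
    #   message FermionOperatorProto {
    #     message Term {
    #       repeated int32 factors = 1;    // flattened (mode index, raise/lower) pairs
    #       double coefficient_real = 2;
    #       double coefficient_imag = 3;
    #     }
    #     repeated Term terms = 1;
    #   }
    #
    # Assuming the generated module were named operators_pb2, saving could look like:
    import operators_pb2  # hypothetical generated code

    def save_fermion_operator(operator, file_path):
        proto = operators_pb2.FermionOperatorProto()
        for term, coeff in operator.terms.items():
            entry = proto.terms.add()
            entry.factors.extend(index for factor in term for index in factor)
            c = complex(coeff)
            entry.coefficient_real = c.real
            entry.coefficient_imag = c.imag
        with open(file_path, 'wb') as f:
            f.write(proto.SerializeToString())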

This is obviously more complex than marshal, and we'd need some time to develop and test.
@babbush @damiansteiger

@babbush
Contributor

babbush commented Jul 23, 2017

@damiansteiger I agree we should try to keep discussions open. I also appreciate your point about it being more difficult to get people to fix issues when they are "mostly working" already and not badly in need of repair (as is the case now).

I will not personally promise to fix this myself. The first reason is that I am really not the right person to do it - I am not aware of any acceptable solutions, and there are other people on this project who would be more suitable (such as yourself or @Spaceenter). The second reason is that this is not a critical issue for me, whereas there are many other features I'd like to implement. But I will certainly encourage others to try to come up with a better solution before we move out of alpha. @Spaceenter, you just suggested moving to protocol buffers, which sounds like a reasonable fix to me. Is this something you could commit to trying to do in the next few weeks?

Hopefully this is enough for you to remove the block. I assume that if @Spaceenter agrees to help fix this then that will be good enough for you?

@Spaceenter
Contributor Author

Yes, I'll have a bit more free time later next week; I'll then try to send a PR with protobuf and work with @idk3 to test the performance. In my opinion, other people who use projectq/fermilib in the future will also benefit from protobuf, as they might use Java, Go, JS, etc. to access the operators, and having protobuf should make that much easier. Anyway, we can discuss more once I actually have the PR ready.
@babbush @damiansteiger @idk3

@babbush
Contributor

babbush commented Jul 23, 2017

Great, thank you @Spaceenter! I agree that moving to protocol buffers makes a lot of sense for the long run. I suppose this should be enough for @damiansteiger to merge this PR.

@babbush babbush merged commit 2482ab2 into ProjectQ-Framework:develop Jul 23, 2017
@Spaceenter Spaceenter deleted the mar branch September 5, 2017 19:42