Constants computed on-the-fly when model is serialized to IR #27705

Draft. Wants to merge 2 commits into base: master


slyalin
Contributor

@slyalin slyalin commented Nov 22, 2024

A way to avoid materializing big constants by computing them on the fly during IR serialization. It is going to be used in NNCF when compressing big models.

…ization to IR. Helps to not materialize big computed constants in memory for the entire model. Instead make such Constants one by one during serialization to IR.

It consists of two parts:
 - A modification of the Serialize pass that checks each node for the `postponed_constant` runtime attribute and, on the fly, replaces such a node with a Constant node via the `constant_fold` method.
 - A Python helper, `make_postponed_constant`, that wraps an arbitrary tensor-creating callback in a custom Python operation carrying the `postponed_constant` runtime attribute. During serialization this operation is constant-folded into a Constant instance whose data is obtained by calling that callback.
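The two parts above can be pictured with a toy model that does not use OpenVINO at all. In this sketch `Node`, `serialize`, and the `write` callback are hypothetical stand-ins; only the `postponed_constant` attribute name and the `make_postponed_constant` helper name come from this PR, and the real helper's signature may differ.

```python
from typing import Callable, List, Optional


class Node:
    """Toy stand-in for a graph node (an assumption, not an OpenVINO class)."""

    def __init__(self, name: str, data: Optional[list] = None,
                 maker: Optional[Callable[[], list]] = None):
        self.name = name
        self.data = data      # eagerly materialized payload, if any
        self.maker = maker    # callback that builds the payload on demand
        self.rt_info = {}     # runtime attributes, keyed by string


def make_postponed_constant(name: str, maker: Callable[[], list]) -> Node:
    """Wrap a callback in a node marked with the 'postponed_constant' attribute."""
    node = Node(name, maker=maker)
    node.rt_info["postponed_constant"] = True
    return node


def serialize(nodes: List[Node], write: Callable[[str, list], None]) -> None:
    """Write nodes one by one; postponed constants are built only when reached."""
    for node in nodes:
        if node.rt_info.get("postponed_constant"):
            data = node.maker()   # materialize just this one constant
            write(node.name, data)
            del data              # drop the reference before the next node
        else:
            write(node.name, node.data)


# Usage: the callback does not run until serialization reaches its node.
calls = []

def big_weights() -> list:
    calls.append("big_weights")
    return [0.0] * 4

graph = [Node("bias", data=[1.0]), make_postponed_constant("weights", big_weights)]
written = []
serialize(graph, lambda name, data: written.append((name, len(data))))
```

The point of the design is that at most one postponed payload is alive at a time, so peak memory is bounded by the largest single constant rather than by the sum of all of them.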
@github-actions bot added the labels "category: Core" (OpenVINO Core, aka ngraph) and "category: Python API" (OpenVINO Python bindings) on Nov 22, 2024
Comment on lines +139 to +143
// TODO: As mentioned above, collisions can happen easily; should we limit the number of iterations in the following loop?
// FIXME: Loop over items that have matching hash keys only, not the entire range [found, m_hash_to_file_positions.end()).
// FIXME: According to https://en.cppreference.com/w/cpp/container/multimap/find, std::multimap::find returns any item with the matching key, not necessarily
// the first one. But the loop below checks items starting from the `found` item only, even though there may be other items with a matching hash before that position.
// So we are not checking all possible matches.
Contributor Author

@pavel-esir, please confirm that we have an issue here.
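To make the concern concrete, here is a small Python analogue of the lookup. It is only a sketch: a sorted list of hash keys stands in for the multimap, and `bisect_left`/`bisect_right` play the role that `std::multimap::equal_range` would play in C++. The names `find_matching_position`, `hashes`, `positions`, and `stored` are hypothetical and do not appear in the PR.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Optional


def find_matching_position(hashes: List[int], positions: List[int],
                           stored: Dict[int, bytes],
                           key_hash: int, blob: bytes) -> Optional[int]:
    """Return the file position of an already stored blob equal to `blob`.

    `hashes` is sorted and may contain duplicates; `positions[i]` is the file
    position recorded for `hashes[i]`. Only entries whose hash equals
    `key_hash` are compared, and all of them are visited.
    """
    lo = bisect_left(hashes, key_hash)    # first entry with this hash
    hi = bisect_right(hashes, key_hash)   # one past the last such entry
    for i in range(lo, hi):
        pos = positions[i]
        if stored[pos] == blob:           # full comparison resolves collisions
            return pos
    return None


# Three different blobs collide on hash 7; a fourth entry has hash 9.
hashes = [1, 7, 7, 7, 9]
positions = [0, 10, 20, 30, 40]
stored = {0: b"a", 10: b"x", 20: b"y", 30: b"z", 40: b"x"}
```

Bounding the scan to the equal-key range addresses both FIXMEs at once: entries with other hashes are never compared, and no entry with a matching hash is skipped regardless of which one a plain `find` would have returned.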

        return PostponedConstant.class_type_info

    def evaluate(self, outputs, _):
        maker().copy_to(outputs[0])
Contributor Author

@slyalin slyalin Nov 22, 2024

If we allowed assigning new `ov.Tensor` instances to outputs, this copy would not be required. Another (probably better) option is to pass outputs[0] to the maker function so that it builds the tensor in place.
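The difference between the current copy-based approach and the in-place option can be sketched without OpenVINO, using `array.array` from the standard library as a stand-in for a preallocated output tensor. Both function names and the `fill` callback convention are assumptions made for illustration, not part of the PR.

```python
from array import array
from typing import Callable, List


def evaluate_with_copy(outputs: List[array], maker: Callable[[], array]) -> None:
    """Current shape of the code: build a temporary, then copy it into the output."""
    tmp = maker()          # a second full-size buffer exists at this point
    outputs[0][:] = tmp    # copy into the preallocated output (same length)


def evaluate_in_place(outputs: List[array], fill: Callable[[array], None]) -> None:
    """Suggested alternative: the callback writes directly into the output buffer."""
    fill(outputs[0])


out = [array("d", [0.0] * 3)]
evaluate_with_copy(out, lambda: array("d", [1.0, 2.0, 3.0]))
after_copy = list(out[0])


def fill_even(buf: array) -> None:
    for i in range(len(buf)):
        buf[i] = 2.0 * i


evaluate_in_place(out, fill_even)
after_fill = list(out[0])
```

With the in-place variant no temporary of the full constant size is allocated, which matters when the constant is large enough that two copies would not fit comfortably in memory.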
