Constants computed on-the-fly when model is serialized to IR #27705
…ization to IR. Helps to avoid materializing big computed constants in memory for the entire model. Instead, such Constants are created one by one during serialization to IR. It consists of two parts:
- Serialize pass modification that listens for the 'postponed_constant' runtime attribute in a node and replaces such a node on the fly with a Constant node using the constant_fold method.
- Python helper `make_postponed_constant` that wraps an arbitrary callback that creates a tensor into a custom Python operation with the runtime attribute `postponed_constant`. This operation will be constant-folded into a Constant instance with data obtained by calling that callback.
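The two parts described above can be illustrated with a minimal pure-Python sketch. It does not use OpenVINO: the `Constant` and `PostponedConstant` classes, the `serialize` function, and the `make_maker` helper are hypothetical stand-ins, and only the `postponed_constant` attribute name and the `constant_fold` method name are taken from the description. The point is that each callback runs only when its node is written, so at most one computed constant is alive at a time.

```python
from typing import Callable, Dict, List


class Constant:
    """Toy stand-in for a node that already holds its data."""

    def __init__(self, data: List[float]):
        self.data = data


class PostponedConstant:
    """Toy stand-in for a node that holds a callback instead of data."""

    def __init__(self, maker: Callable[[], List[float]]):
        self.maker = maker
        # Marker the serializer looks for (attribute name from the PR text).
        self.rt_info: Dict[str, bool] = {"postponed_constant": True}

    def constant_fold(self) -> Constant:
        # The data is produced only when folding is requested.
        return Constant(self.maker())


def serialize(nodes, write) -> None:
    """Write nodes one by one, folding postponed constants on the fly."""
    for node in nodes:
        if getattr(node, "rt_info", {}).get("postponed_constant"):
            # Temporary Constant; the model itself keeps the postponed node.
            node = node.constant_fold()
        write(node.data)


calls: List[float] = []


def make_maker(value: float, size: int) -> Callable[[], List[float]]:
    """Hypothetical helper: builds a callback and records when it runs."""
    def maker() -> List[float]:
        calls.append(value)
        return [value] * size
    return maker


model = [
    PostponedConstant(make_maker(1.0, 3)),
    Constant([5.0]),
    PostponedConstant(make_maker(2.0, 2)),
]
written: List[List[float]] = []
serialize(model, written.append)
```

In this sketch the model list still contains the original `PostponedConstant` objects after serialization, which mirrors the stated goal of not keeping the computed data in memory for the whole model.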
// TODO: As it was mentioned above that there was a chance to have a collision easily, should we limit the number of iterations in the following loop?
// FIXME: Loop over items that have matching hash keys only, not entire range [found, m_hash_to_file_positions.end()).
// FIXME: According to https://en.cppreference.com/w/cpp/container/multimap/find, std::multimap::find returns any item with the matching key, not necessary
// the first one. But the loop below checking items starting with `found` item only even if there may be other items with matching hash before that position.
// So we are not checking all possible matches.
@pavel-esir, please confirm that we have an issue here.
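The concern in the FIXME comments can be sketched in Python, assuming the index is kept as a sorted list of hashes with a parallel list of file positions. The `equal_range` and `find_offset` names and the data layout are hypothetical, not the PR's code. `bisect_left` and `bisect_right` together play the role of `std::multimap::equal_range`: they bound exactly the entries whose hash matches, so the loop neither starts in the middle of the matching run nor walks past it.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional, Tuple


def equal_range(hashes: List[int], key: int) -> Tuple[int, int]:
    """Return [lo, hi) covering every entry whose hash equals key."""
    return bisect_left(hashes, key), bisect_right(hashes, key)


def find_offset(
    hashes: List[int],
    positions: List[Tuple[int, int]],  # (offset, size) per entry
    blob: bytes,
    key: int,
    data: bytes,
) -> Optional[int]:
    """Look for data among the entries that share its hash, and only those."""
    lo, hi = equal_range(hashes, key)
    for i in range(lo, hi):
        offset, size = positions[i]
        # A matching hash is not enough: compare the actual bytes.
        if size == len(data) and blob[offset:offset + size] == data:
            return offset
    return None


# Two entries collide on hash 7; a third one has hash 9.
blob = b"aaaabbbbcccc"
hashes = [7, 7, 9]
positions = [(0, 4), (4, 4), (8, 4)]

hit_second = find_offset(hashes, positions, blob, 7, b"bbbb")
miss_other_bucket = find_offset(hashes, positions, blob, 7, b"cccc")
hit_third = find_offset(hashes, positions, blob, 9, b"cccc")
miss_unknown_hash = find_offset(hashes, positions, blob, 8, b"aaaa")
```

Because the scan is limited to the matching run, the number of iterations is bounded by the number of colliding entries, which also addresses the TODO about limiting the loop.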
        return PostponedConstant.class_type_info

    def evaluate(self, outputs, _):
        maker().copy_to(outputs[0])
If we allow the assignment of new `ov.Tensor` instances in `outputs`, then this copying won't be required. Another (probably better) option is to pass `outputs[0]` to the `maker` function to build the tensor in place.
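The difference between the two approaches in this comment can be sketched with the standard-library `array` type standing in for `ov.Tensor`. All names here are hypothetical, and a `maker` that accepts the output buffer is the reviewer's proposal, not an existing signature. The first variant builds a temporary buffer and copies it, while the second writes directly into the preallocated output.

```python
from array import array
from typing import List


def maker_returning() -> array:
    """Current style: the callback allocates and returns a new buffer."""
    return array("f", [0.5] * 4)


def maker_in_place(out: array) -> None:
    """Proposed style (assumption): the callback fills a given buffer."""
    for i in range(len(out)):
        out[i] = 0.5


def evaluate_with_copy(outputs: List[array]) -> None:
    # Two buffers exist at once: the temporary and the output.
    tmp = maker_returning()
    outputs[0][:] = tmp


def evaluate_in_place(outputs: List[array]) -> None:
    # Only the output buffer exists; nothing is copied afterwards.
    maker_in_place(outputs[0])


out_a = array("f", [0.0] * 4)
out_b = array("f", [0.0] * 4)
evaluate_with_copy([out_a])
evaluate_in_place([out_b])
```

Both variants leave the same values in the output. The in-place form avoids holding a second full-size buffer, which matters most for the large constants this PR targets.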
Provides a way to avoid the materialization of big constants by computing them on the fly during IR serialization. It is going to be used in NNCF when compressing big models.