[CPU] Refactor memory control and allocation #27259

Conversation

Contributor

@EgorDuplensky EgorDuplensky commented Oct 25, 2024

Details:

  • All the nested graphs can now be part of the global memory reuse logic
  • Code changes are required to enable global memory reuse for a node with a nested graph
  • LoRa and Composite nodes have been updated to support global memory reuse
  • A temporary property has been added to the GraphContext to propagate global memory reuse
    through the nesting levels. If a node does not support global memory reuse (e.g. the If operation), then global memory reuse is disabled for all the nested graphs of that node.
    This allows both kinds of nodes with a subgraph - updated and not yet updated ones - to coexist in a single graph (see the sketch below).
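
A minimal, self-contained sketch of the propagation idea described above. All names here (GraphContext, supportsGlobalMemoryReuse, activate) are hypothetical stand-ins, not the actual plugin classes; the point is only how the flag travels down the nesting levels and gets switched off below a node that does not support global reuse.

    #include <iostream>
    #include <memory>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for the plugin's GraphContext: the temporary
    // property only tells nested graphs whether global memory reuse is allowed.
    struct GraphContext {
        bool globalMemoryReuse = true;
    };

    struct Node {
        std::string name;
        bool supportsGlobalMemoryReuse = true;  // e.g. false for the If node
        std::vector<Node> nestedGraph;          // nodes of the inner graph, if any
    };

    // Propagate the flag down the nesting levels: once a node that does not
    // support global reuse is met, all of its nested graphs fall back to a
    // local memory control unit.
    void activate(const Node& node, const std::shared_ptr<GraphContext>& ctx) {
        const bool reuse = ctx->globalMemoryReuse && node.supportsGlobalMemoryReuse;
        std::cout << node.name << ": " << (reuse ? "global" : "local") << " memory reuse\n";
        auto nestedCtx = std::make_shared<GraphContext>();
        nestedCtx->globalMemoryReuse = reuse;
        for (const auto& inner : node.nestedGraph)
            activate(inner, nestedCtx);
    }

    int main() {
        Node lora{"LoRa", true, {Node{"MatMul"}}};
        Node ifNode{"If", false, {Node{"then_body"}, Node{"else_body"}}};
        auto ctx = std::make_shared<GraphContext>();
        activate(lora, ctx);    // LoRa and its inner graph share the global reuse logic
        activate(ifNode, ctx);  // If and its bodies keep local allocation
    }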

Tickets:

  • ticket-id

@EgorDuplensky EgorDuplensky requested review from a team as code owners October 25, 2024 21:35
@github-actions github-actions bot added the category: CPU OpenVINO CPU plugin label Oct 25, 2024
@EgorDuplensky EgorDuplensky force-pushed the enable_memory_reuse_across_inner_graphs branch 4 times, most recently from 28be812 to 9311b5e on October 31, 2024 15:32
@EgorDuplensky EgorDuplensky requested a review from a team as a code owner October 31, 2024 15:32
@github-actions github-actions bot added category: inference OpenVINO Runtime library - Inference category: build OpenVINO cmake script / infra labels Oct 31, 2024
@EgorDuplensky EgorDuplensky force-pushed the enable_memory_reuse_across_inner_graphs branch 13 times, most recently from 5d0ea7f to 8d45629 on November 6, 2024 09:01
@maxnick maxnick self-assigned this Nov 6, 2024
Comment on lines +167 to +168
memoryControl,
m_networkMemoryControl,
Contributor

Since the memoryControl is derived from m_networkMemoryControl, maybe it's enough to pass only m_networkMemoryControl to the context?

Contributor Author

The current idea is that passing m_networkMemoryControl is now obsolete and should be dropped once all the nodes with inner graphs are updated according to the memory reuse changes. All the nodes are then supposed to use the particular memoryControl instance created by the CompiledModel, rather than an arbitrary one taken from networkMemoryControl.
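
A rough, self-contained illustration of that direction; the class and method names below are hypothetical stand-ins rather than the plugin's actual interfaces. The CompiledModel creates one concrete memory-control unit and hands exactly that unit to the context, so the registry itself no longer needs to be passed down.

    #include <memory>
    #include <vector>

    // Hypothetical stand-ins, not the actual plugin classes.
    class MemoryControl {
    public:
        void allocateMemory() { /* solve regions and allocate */ }
    };

    // Registry that owns every memory-control unit of the network.
    class NetworkMemoryControl {
    public:
        MemoryControl* createMemoryControlUnit() {
            m_units.push_back(std::make_unique<MemoryControl>());
            return m_units.back().get();
        }
    private:
        std::vector<std::unique_ptr<MemoryControl>> m_units;
    };

    struct GraphContext {
        // Only the concrete unit is passed down; the registry stays inside
        // the CompiledModel and can eventually be dropped from the context.
        MemoryControl* memoryControl = nullptr;
    };

    class CompiledModel {
    public:
        GraphContext makeContext() {
            return GraphContext{m_networkMemoryControl.createMemoryControlUnit()};
        }
    private:
        NetworkMemoryControl m_networkMemoryControl;
    };

    int main() {
        CompiledModel model;
        GraphContext ctx = model.makeContext();  // nested graphs reuse this exact unit
        ctx.memoryControl->allocateMemory();
    }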

Comment on lines +57 to +59
std::shared_ptr<NetworkMemoryControl> get_network_memory_control() const {
return m_networkMemoryControl;
}
Contributor

Looks like this method is not used.

@@ -82,6 +82,7 @@ class Edge {
}

std::string name() const;
const MemoryDesc& getDesc() const;
Contributor

As far as I remember, this method was hidden from the public interface on purpose.
Making this method public introduces two separate ways to access the memory descriptor:

  1. Via the memory object.
  2. Via this method.

It becomes confusing for the node developer which path should be used in which context. I would propose revisiting the reason for moving this method to the public section and trying to avoid this change.
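
To make the concern concrete, here is a minimal analogy using simplified, hypothetical types rather than the real Edge/Memory classes: once the descriptor is reachable both through the memory object and directly from the edge, a node developer sees two similar-looking paths with no obvious rule for which to use.

    #include <cstddef>

    // Simplified, hypothetical stand-ins for MemoryDesc / Memory / Edge,
    // used only to illustrate the duplication, not the real plugin classes.
    struct MemoryDesc {
        std::size_t sizeInBytes = 0;
    };

    struct Memory {
        const MemoryDesc& getDesc() const { return m_desc; }
        MemoryDesc m_desc;
    };

    struct Edge {
        const Memory& getMemory() const { return m_memory; }
        // The newly public accessor adds a second entry point to a descriptor.
        const MemoryDesc& getDesc() const { return m_desc; }
        Memory m_memory;
        MemoryDesc m_desc;  // may or may not match m_memory.getDesc()
    };

    int main() {
        Edge edge;
        const MemoryDesc& viaMemory = edge.getMemory().getDesc();  // path 1
        const MemoryDesc& viaEdge = edge.getDesc();                // path 2
        // Which of the two should a node developer rely on?
        (void)viaMemory;
        (void)viaEdge;
    }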

Contributor Author

The PR is not really ready for review.
This change is temporary, just to enable functionality.

Comment on lines 34 to 42
struct MemoryRegion {
int start; // Execution order index of first use.
int finish; // Execution order index of last use. -1 means inf
int64_t size; // size in bytes
int64_t id; // ID unique for each region

enum class RegionType : uint8_t { VARIABLE, CONSTANT, INPUT, OUTPUT, IO } type;
enum class AllocType : uint8_t { POD, STRING, UNKNOWN } alloc_type;
};
Contributor

This structure should be part of the memory management subsystem, as it is the problem description for memory management. What is the reason behind moving this structure to the graph header?

Contributor Author

@EgorDuplensky EgorDuplensky Nov 8, 2024

Just a temporary change, it will be reverted.

Comment on lines 360 to 362
void Graph::Activate(const std::vector<MemoryPtr>& externalInputMemory,
const std::vector<MemoryPtr>& externalOutputMemory) {
OPENVINO_ASSERT(status == Status::Initialized, "Invalid graph status");
const std::vector<MemoryPtr>& externalOutputMemory,
bool globalAllocation) {
Contributor

Again, a specific flag to indicate the global status. Maybe this is another indicator that a Subgraph class should be introduced as a derivative of the CPU Graph?

Contributor Author

Please check the updated documentation.
The flag is necessary to allow the nodes with inner graphs that are not updated yet to use a local memory control unit, as is currently done on master.
It is supposed to be dropped after all the nodes are updated.

Comment on lines 988 to 1198
if (memoryControl->allocated()) {
// std::cout << "Memory is already allocated for a subgraph: " << _name << "\n";
return;
}

Contributor

Could you please shed some light on the purpose of this check? Isn't it an unexpected situation for a memory control to already be in the allocated state, given that Allocate is called only once?

Contributor Author

This is the way to keep the memory allocation procedure generic across all the graphs, so that no graph is special (i.e. outer graph or subgraphs).
The idea is that the first graph which is being "Activated" is responsible for allocating the memory for the whole "context" it has, which includes all the subgraphs.
In practice, it will always be the outer graph which actually does this.
I am not 100% satisfied with this approach, to be honest, but on the other hand it does make sense.
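
A compact, self-contained sketch of that "first graph to be activated allocates for the whole context" idea; the classes below are simplified stand-ins, not the actual plugin implementation.

    #include <iostream>
    #include <string>

    // Hypothetical memory-control unit shared through the graph context.
    class MemoryControl {
    public:
        bool allocated() const { return m_allocated; }
        void allocateMemory() { m_allocated = true; }  // solve regions, allocate memory
    private:
        bool m_allocated = false;
    };

    struct Graph {
        std::string name;
        MemoryControl* memoryControl;  // shared by the outer graph and its subgraphs

        void Allocate() {
            if (memoryControl->allocated())
                return;  // some graph in this context has already allocated everything
            memoryControl->allocateMemory();
            std::cout << name << " performed the allocation for the whole context\n";
        }
    };

    int main() {
        MemoryControl control;           // one unit per context
        Graph outer{"outer", &control};
        Graph inner{"subgraph", &control};
        outer.Allocate();  // the first graph to be activated does the work
        inner.Allocate();  // subsequent calls are no-ops
    }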

Comment on lines +139 to +146
MemoryControl* network_memory_control = m_graph->getGraphContext()->getMemoryControl();
if (!network_memory_control) {
OPENVINO_THROW("Memory control unit is not initialized for graph: ", m_graph->GetName());
}

if (!network_memory_control->allocated()) {
network_memory_control->allocateMemory();
}
Contributor

From the graph usage perspective, this action is not at all obvious. So this is just another example of implicit coupling between the infer request, graph, and memory control subsystems. Suppose we would like to develop yet another implementation of an infer request, or run the graph in another context (other than the infer request). How do we know that we have to retrieve the memory subsystem, check the allocation status, and call allocate?

Contributor Author

I don't really remember why I moved it out of the Graph, but I think it can be reverted.
On the other hand, we know that we don't have to perform this check for the subgraphs, so it kind of makes sense to move the check out of the Graph logic.

Comment on lines 26 to 30
static EdgeClusters formEdgeClusters(const std::vector<EdgePtr>& graphEdges);
static MemoryRegions formMemoryRegions(const EdgeClusters& clusters, size_t remaining, const GlobalExecutionIndex& globalExecIndex);
static OutputMemoryBlocks filterOutDynamicOutputEdges(MemoryRegions& memoryRegions,
const EdgeClusters& clusters,
const std::map<std::size_t, NodePtr>& outputNodes);
Contributor

Isn't it better to put these utility methods somewhere other than the MemoryControl class, to keep the latter plugin-independent?

Contributor Author

Agree.
I am going to move them back to the graph.

Comment on lines +58 to +65
// @todo return std::reference_wrapper instead?
MemoryControl* createMemoryControlUnit();
Contributor

Why can't we simply return a reference?

Contributor Author

We can.
I just do not like the idea of storing a plain reference inside the GraphContext.
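
For context on the reference vs. pointer question, a tiny standalone example (hypothetical names) of why a plain reference member is awkward in a context-like class: it must be bound at construction and makes the class non-reassignable, whereas a pointer or std::reference_wrapper keeps the member rebindable.

    #include <functional>

    class MemoryControl {};

    // Plain reference member: must be bound at construction, and the
    // implicitly declared copy assignment operator is deleted.
    struct ContextWithReference {
        MemoryControl& memoryControl;
    };

    // Pointer keeps the context assignable and allows the unit to be
    // set or replaced after construction.
    struct ContextWithPointer {
        MemoryControl* memoryControl = nullptr;
    };

    // std::reference_wrapper is a non-null, rebindable middle ground.
    struct ContextWithWrapper {
        std::reference_wrapper<MemoryControl> memoryControl;
    };

    int main() {
        MemoryControl a, b;

        ContextWithReference r1{a}, r2{b};
        // r1 = r2;                    // ill-formed: copy assignment is deleted

        ContextWithPointer p1{&a}, p2{&b};
        p1 = p2;                       // fine: rebinds to the other unit

        ContextWithWrapper w1{a}, w2{b};
        w1 = w2;                       // fine: reference_wrapper is rebindable
    }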

Comment on lines +298 to +300
virtual bool canBeSkipped() const {
return getSelectedPrimitiveDescriptor()->hasZeroInputDims();
}
Contributor

Maybe change the name to isExecutableStatic and rename the existing isExecutable -> isExecutableDynamic, by analogy with executeStatic/executeDynamic? What do you think?

Contributor Author

@EgorDuplensky EgorDuplensky Nov 8, 2024

Naming is hard in this case.
My thoughts are:

  1. We are trying to perform an optimization where we completely drop a node from the execution graph, because we know it will never be executed. A good name for such a check would actually be "isExecutable", meaning "is executable at all?".
  2. We are trying to check whether the node execution can be skipped during inference, because it gets zero-dim input shapes, or maybe because it became in-place once we have dynamic in-place. This check could be named "shouldBeSkipped", "mustBeSkipped", or "isDynamicallyExecutable()" (see the sketch below).
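
A small hypothetical sketch of the distinction being discussed; the names are only the candidates from this thread, not the actual Node interface. One check decides whether the node can be dropped from the execution graph entirely, the other whether a particular inference can skip it.

    #include <cstdio>
    #include <vector>

    // Hypothetical, trimmed-down Node used only to illustrate the two checks.
    struct Node {
        bool neverExecutes = false;    // known up front: this node will never run
        bool hasZeroDimInput = false;  // discovered at runtime for this inference

        // "Is executable at all?" - if not, drop the node from the execution graph.
        bool isExecutable() const { return !neverExecutes; }

        // "Should this particular execution be skipped?" - e.g. zero-dim inputs,
        // or the node became in-place for this inference.
        bool shouldBeSkipped() const { return hasZeroDimInput; }
    };

    int main() {
        std::vector<Node> graph = {{false, false}, {true, false}, {false, true}};

        // Build the execution graph once, dropping nodes that can never run.
        std::vector<Node> execGraph;
        for (const Node& n : graph)
            if (n.isExecutable())
                execGraph.push_back(n);

        // During inference, the remaining nodes may still be skipped dynamically.
        for (const Node& n : execGraph)
            std::printf(n.shouldBeSkipped() ? "skip\n" : "execute\n");
    }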

@EgorDuplensky
Contributor Author

Some refactoring and code clean-ups are still to be expected.

@EgorDuplensky EgorDuplensky force-pushed the enable_memory_reuse_across_inner_graphs branch from 8d45629 to e4812a2 on November 9, 2024 13:01
@EgorDuplensky EgorDuplensky force-pushed the enable_memory_reuse_across_inner_graphs branch from e4812a2 to ba4ccff on November 11, 2024 16:17
@EgorDuplensky
Contributor Author

The same changes plus adaptations for the If and TensorIterator nodes will be done in the scope of:
