[CPU] Refactor memory control and allocation #27259
memoryControl,
m_networkMemoryControl,
Since memoryControl is derived from m_networkMemoryControl, maybe it is enough to pass only m_networkMemoryControl to the context?
The current idea is that passing m_networkMemoryControl is now obsolete and should be dropped once all the nodes with inner graphs are updated according to the memory reuse changes. All the nodes are supposed to use the particular memoryControl instance created by CompiledModel, not an arbitrary one from networkMemoryControl.
std::shared_ptr<NetworkMemoryControl> get_network_memory_control() const {
    return m_networkMemoryControl;
}
Looks like this method is not used.
@@ -82,6 +82,7 @@ class Edge {
    }

    std::string name() const;
    const MemoryDesc& getDesc() const;
As far as I remember, this method was hidden from the public interface on purpose.
Making this method public introduces two separate ways to access the memory descriptor:
- Via the memory object
- Via this method
This becomes confusing for the node developer: which path should be used in which context? I would propose revisiting the reason for moving this method to the public section and trying to avoid this change.
The PR is not really ready for review.
This change is temporary, just to enable functionality.
src/plugins/intel_cpu/src/graph.h
Outdated
struct MemoryRegion {
    int start;     // Execution order index of first use.
    int finish;    // Execution order index of last use. -1 means inf
    int64_t size;  // size in bytes
    int64_t id;    // ID unique for each region

    enum class RegionType : uint8_t { VARIABLE, CONSTANT, INPUT, OUTPUT, IO } type;
    enum class AllocType : uint8_t { POD, STRING, UNKNOWN } alloc_type;
};
This structure should be part of the memory management subsystem, since it is the problem description for memory management. What is the reason for moving this structure to the graph header?
just a tmp change, will be reverted
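To illustrate why MemoryRegion reads as a "problem description" for memory management, here is a minimal sketch of how such regions could drive memory reuse. It assumes a simplified copy of the struct (without type and alloc_type), and the helpers effectiveFinish, lifetimesOverlap and assignOffsets are hypothetical names introduced for this example. The naive first-fit placement is a stand-in and is not claimed to be the solver used by the plugin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Simplified copy of the fields quoted above (type and alloc_type omitted).
struct MemoryRegion {
    int start;          // execution order index of first use
    int finish;         // execution order index of last use, -1 means "never released"
    std::int64_t size;  // size in bytes
    std::int64_t id;    // unique region ID
};

// Hypothetical helper: treat finish == -1 as an unbounded lifetime.
inline int effectiveFinish(const MemoryRegion& r) {
    return r.finish == -1 ? std::numeric_limits<int>::max() : r.finish;
}

// Two regions can share bytes only if their lifetimes do not intersect.
inline bool lifetimesOverlap(const MemoryRegion& a, const MemoryRegion& b) {
    assert(a.start >= 0 && b.start >= 0);
    return a.start <= effectiveFinish(b) && b.start <= effectiveFinish(a);
}

// Naive first-fit placement: each region gets the lowest offset that does not
// collide with an already placed region that is alive at the same time.
std::vector<std::int64_t> assignOffsets(const std::vector<MemoryRegion>& regions) {
    std::vector<std::int64_t> offsets(regions.size(), 0);
    for (std::size_t i = 0; i < regions.size(); ++i) {
        std::int64_t candidate = 0;
        bool moved = true;
        while (moved) {  // repeat until no placed region conflicts with the candidate
            moved = false;
            for (std::size_t j = 0; j < i; ++j) {
                const bool bytesOverlap = candidate < offsets[j] + regions[j].size &&
                                          offsets[j] < candidate + regions[i].size;
                if (bytesOverlap && lifetimesOverlap(regions[i], regions[j])) {
                    candidate = offsets[j] + regions[j].size;  // jump past the conflict
                    moved = true;
                }
            }
        }
        offsets[i] = candidate;
    }
    return offsets;
}
```

In this sketch, two regions whose execution ranges do not intersect may receive the same offset, which is the reuse that the start and finish fields make possible.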
src/plugins/intel_cpu/src/graph.cpp
Outdated
void Graph::Activate(const std::vector<MemoryPtr>& externalInputMemory,
                     const std::vector<MemoryPtr>& externalOutputMemory) {
    OPENVINO_ASSERT(status == Status::Initialized, "Invalid graph status");
                     const std::vector<MemoryPtr>& externalOutputMemory,
                     bool globalAllocation) {
Again, a specific flag to indicate the global status. Maybe this is another indication that a Subgraph class should be introduced as a derivative of the CPU Graph?
Please check the updated documentation.
The flag is necessary to allow the nodes with inner graphs that are not updated yet to use a local memory control unit, as is currently done on master.
It is supposed to be dropped after all the nodes are updated.
src/plugins/intel_cpu/src/graph.cpp
Outdated
if (memoryControl->allocated()) {
    // std::cout << "Memory is already allocated for a subgraph: " << _name << "\n";
    return;
}
Could you please shed some light on the purpose of this check? Isn't it an unexpected situation that a memory control is in the allocated state even though Allocate is called only once?
This is the way to keep the memory allocation procedure generic across all the graphs.
So no graph is special (i.e. neither the outer graph nor the subgraphs).
The idea is that the first graph to be "Activated" is responsible for allocating the memory for the whole "context" it has, which includes all the subgraphs.
Basically, it will always be the outer graph that actually does this.
I am not 100% satisfied with this approach, to be honest, but on the other hand it does make sense.
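The "first activated graph allocates for the whole context" idea can be sketched as follows. This is a minimal illustration under stated assumptions: the MemoryControl and Graph classes below are hypothetical stand-ins that only borrow the names allocated(), allocateMemory() and Activate() from the snippets in this thread, and allocationCount() and addSubgraph() exist only for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the memory control unit shared by one context.
class MemoryControl {
public:
    bool allocated() const { return m_allocated; }

    void allocateMemory() {
        assert(!m_allocated && "allocateMemory() is expected to run only once");
        m_allocated = true;
        ++m_allocationCount;
    }

    int allocationCount() const { return m_allocationCount; }

private:
    bool m_allocated = false;
    int m_allocationCount = 0;
};

// Hypothetical stand-in for a graph; every graph runs the same Activate() logic.
class Graph {
public:
    explicit Graph(std::shared_ptr<MemoryControl> memoryControl)
        : m_memoryControl(std::move(memoryControl)) {
        assert(m_memoryControl != nullptr);
    }

    // Creates a nested graph that shares this graph's memory control unit.
    Graph& addSubgraph() {
        m_subgraphs.push_back(std::make_unique<Graph>(m_memoryControl));
        return *m_subgraphs.back();
    }

    void Activate() {
        // Whichever graph gets here first allocates for the whole context;
        // graphs activated later see allocated() == true and skip the call.
        if (!m_memoryControl->allocated()) {
            m_memoryControl->allocateMemory();
        }
        for (auto& subgraph : m_subgraphs) {
            subgraph->Activate();
        }
    }

private:
    std::shared_ptr<MemoryControl> m_memoryControl;
    std::vector<std::unique_ptr<Graph>> m_subgraphs;
};
```

With this shape, neither the outer graph nor the subgraphs need special-case code, at the cost of the allocation being triggered implicitly by whichever graph is activated first.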
MemoryControl* network_memory_control = m_graph->getGraphContext()->getMemoryControl();
if (!network_memory_control) {
    OPENVINO_THROW("Memory control unit is not initilized for graph: ", m_graph->GetName());
}

if (!network_memory_control->allocated()) {
    network_memory_control->allocateMemory();
}
From the graph usage perspective, this action is not obvious at all. So this is just another example of implicit coupling between the infer request, graph, and memory control subsystems. Suppose we would like to develop yet another implementation of an infer request, or run the graph in another context (other than the infer request). How would we know that we have to retrieve the memory subsystem, check the allocation status, and call allocate?
I don't really remember why I moved it out of the Graph, but I think it can be reverted.
On the other hand, we know that we don't have to perform this check for the subgraphs, so it does make some sense to move the check out of the Graph logic.
static EdgeClusters formEdgeClusters(const std::vector<EdgePtr>& graphEdges);
static MemoryRegions formMemoryRegions(const EdgeClusters& clusters, size_t remaining, const GlobalExecutionIndex& globalExecIndex);
static OutputMemoryBlocks filterOutDynamicOutputEdges(MemoryRegions& memoryRegions,
                                                      const EdgeClusters& clusters,
                                                      const std::map<std::size_t, NodePtr>& outputNodes);
Wouldn't it be better to put these utility methods somewhere other than the MemoryControl class, to keep the latter plugin independent?
Agree.
I am going to move it back to the graph
// @todo return std::reference_wrapper instead?
MemoryControl* createMemoryControlUnit();
Why can't we simply return a reference?
We can.
I just do not like the idea of storing a plain reference inside the GraphContext.
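One possible reading of the concern about a plain reference member is that it makes the owning type non-assignable, whereas std::reference_wrapper (as the @todo suggests) or a raw pointer does not. The sketch below shows only this language-level difference. MemoryControl, ContextWithRef, ContextWithWrapper and rebind are hypothetical names for illustration and do not reflect the actual GraphContext layout.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Hypothetical stand-in for the memory control unit.
struct MemoryControl {
    int id = 0;
};

// Storing a plain reference: the implicit copy assignment operator is deleted,
// so the context can no longer be reassigned.
struct ContextWithRef {
    MemoryControl& memoryControl;
};

// Storing std::reference_wrapper: never null, yet copyable and rebindable.
struct ContextWithWrapper {
    std::reference_wrapper<MemoryControl> memoryControl;
};

static_assert(!std::is_copy_assignable<ContextWithRef>::value,
              "a reference member disables copy assignment");
static_assert(std::is_copy_assignable<ContextWithWrapper>::value,
              "reference_wrapper keeps the type assignable");

// Points an existing context at another memory control unit.
inline void rebind(ContextWithWrapper& ctx, MemoryControl& other) {
    ctx.memoryControl = std::ref(other);
    assert(&ctx.memoryControl.get() == &other);
}
```

Returning MemoryControl& from the factory and storing it in a std::reference_wrapper would keep the "never null" guarantee while leaving the context type assignable. Whether this matches the author's actual motivation is an assumption.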
virtual bool canBeSkipped() const {
    return getSelectedPrimitiveDescriptor()->hasZeroInputDims();
}
Maybe change the name to isExecutableStatic and rename the existing isExecutable -> isExecutableDynamic, by analogy with executeStatic/Dynamic? What do you think?
The naming is hard in this case.
My thoughts are:
- We are trying to perform an optimization in which we completely drop a node from the execution graph because we know it will never be executed. A good name for such a check would actually be "isExecutable", meaning "is it executable at all?"
- We are trying to check whether the node execution can be skipped during inference, because it gets zero-dim input shapes, or maybe because it became in-place once we have dynamic in-place. This check could be named "shouldBeSkipped", "mustBeSkipped" or "isDynamicallyExecutable()".
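The two checks described above can be contrasted in a short sketch. The Node struct here is a hypothetical, heavily simplified stand-in (not the plugin's Node class), hasWork and inputDims are invented fields, and the names isExecutable and shouldBeSkipped are just two of the candidates from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified node.
struct Node {
    bool hasWork;                        // false if the node can never execute
    std::vector<std::size_t> inputDims;  // current input shape, may change per inference

    // Compile-time question: "is it executable at all?"
    bool isExecutable() const { return hasWork; }

    // Run-time question: "can this particular execution be skipped?"
    bool shouldBeSkipped() const {
        for (std::size_t dim : inputDims) {
            if (dim == 0) {
                return true;  // zero-dim input: nothing to compute
            }
        }
        return false;
    }
};

// Done once: nodes that are never executable are dropped from the list.
std::vector<const Node*> buildExecutionList(const std::vector<Node>& nodes) {
    std::vector<const Node*> executionList;
    for (const Node& node : nodes) {
        if (node.isExecutable()) {
            executionList.push_back(&node);
        }
    }
    return executionList;
}

// Done on every inference: returns how many nodes were actually executed.
std::size_t infer(const std::vector<const Node*>& executionList) {
    std::size_t executed = 0;
    for (const Node* node : executionList) {
        assert(node != nullptr);
        if (!node->shouldBeSkipped()) {
            ++executed;
        }
    }
    return executed;
}
```

The first check shrinks the execution list once, while the second is evaluated against the current shapes on each run, which is why one shared name for both tends to be misleading.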
Some refactoring and code clean-ups can still be expected.
The same changes + adaptations for |
Details:
through the nesting levels. If a node does not support global memory reuse (e.g. the If operation), then global memory reuse is disabled for all the nested graphs of that node.
This allows both types of nodes with a subgraph, the updated and the not yet updated ones, to coexist in a single graph.
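The propagation rule described above can be sketched as a small recursive walk. SubgraphNode and resolveGlobalReuse are hypothetical names for this illustration, and the rule is modelled simply as "enabled only if the parent has it enabled and the node supports it", which is an assumption about how the PR propagates the setting.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of a node that owns nested graphs.
struct SubgraphNode {
    std::string name;
    bool supportsGlobalReuse;          // false for nodes that are not updated yet
    std::vector<SubgraphNode> nested;  // nodes with subgraphs inside this node's graphs
};

// Records, for every node, whether its nested graphs use global memory reuse.
// Once a node disables it, everything nested below that node is disabled too.
void resolveGlobalReuse(const SubgraphNode& node,
                        bool enabledByParent,
                        std::map<std::string, bool>& result) {
    assert(!node.name.empty());
    const bool enabled = enabledByParent && node.supportsGlobalReuse;
    result[node.name] = enabled;
    for (const SubgraphNode& child : node.nested) {
        resolveGlobalReuse(child, enabled, result);
    }
}
```

In this model an updated node nested inside a not yet updated one still falls back to local allocation, while updated nodes elsewhere in the same outer graph keep using global reuse.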
Tickets: