[GPU] Global Buffer manager and optimization #2816
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
// SPDX-License-Identifier: Apache-2.0 | ||
/** | ||
* Copyright (C) 2024 Debadri Samaddar <[email protected]> | ||
* | ||
* @file cl_buffer_manager.cpp | ||
* @date 01 Dec 2024 | ||
* @see https://github.com/nnstreamer/nntrainer | ||
* @author Debadri Samaddar <[email protected]> | ||
* @bug No known bugs except for NYI items | ||
* @brief This file contains global Buffer objects and manages them | ||
*/ | ||
|
||
#include <cl_buffer_manager.h> | ||
|
||
namespace nntrainer { | ||
|
||
ClBufferManager &ClBufferManager::getInstance() { | ||
static ClBufferManager instance; | ||
return instance; | ||
} | ||
|
||
// to-do: Implementation to be updated with array of Buffer objects if required | ||
// fp16 Buffer objects to be added in future | ||
void ClBufferManager::initBuffers() { | ||
readBufferA = new opencl::Buffer(context_inst_, buffer_size_bytes, true); | ||
readBufferB = new opencl::Buffer(context_inst_, buffer_size_bytes, true); | ||
readBufferC = new opencl::Buffer(context_inst_, buffer_size_bytes, true); | ||
writeBufferA = new opencl::Buffer(context_inst_, buffer_size_bytes, false); | ||
writeBufferB = new opencl::Buffer(context_inst_, buffer_size_bytes, false); | ||
ml_logi("ClBufferManager: Buffers initialized"); | ||
} | ||
|
||
ClBufferManager::~ClBufferManager() { | ||
delete readBufferA; | ||
delete readBufferB; | ||
delete readBufferC; | ||
delete writeBufferA; | ||
delete writeBufferB; | ||
ml_logi("ClBufferManager: Buffers destroyed"); | ||
} | ||
|
||
} // namespace nntrainer |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,77 @@ | ||||||
// SPDX-License-Identifier: Apache-2.0 | ||||||
/** | ||||||
* Copyright (C) 2024 Debadri Samaddar <[email protected]> | ||||||
* | ||||||
* @file cl_buffer_manager.h | ||||||
* @date 01 Dec 2024 | ||||||
* @see https://github.com/nnstreamer/nntrainer | ||||||
* @author Debadri Samaddar <[email protected]> | ||||||
* @bug No known bugs except for NYI items | ||||||
* @brief This file contains global Buffer objects and manages them | ||||||
*/ | ||||||
|
||||||
#ifndef __CL_BUFFER_MANAGER_H__ | ||||||
#define __CL_BUFFER_MANAGER_H__ | ||||||
|
||||||
#include <string> | ||||||
|
||||||
#include <opencl_buffer.h> | ||||||
#include <opencl_context_manager.h> | ||||||
|
||||||
#include <nntrainer_log.h> | ||||||
|
||||||
namespace nntrainer { | ||||||
|
||||||
/** | ||||||
* @class ClBufferManager contains Buffer object management | ||||||
* @brief Support for Buffer management | ||||||
*/ | ||||||
|
||||||
class ClBufferManager { | ||||||
|
||||||
private: | ||||||
/** | ||||||
* @brief Private constructor to prevent object creation | ||||||
* | ||||||
*/ | ||||||
ClBufferManager(){}; | ||||||
|
||||||
/** | ||||||
* @brief OpenCl context global instance | ||||||
* | ||||||
*/ | ||||||
opencl::ContextManager &context_inst_ = opencl::ContextManager::GetInstance(); | ||||||
|
||||||
/** | ||||||
* @brief Buffer size in bytes preset (256 mebibytes) | ||||||
*/ | ||||||
size_t buffer_size_bytes = 8192 * 8192 * sizeof(float); | ||||||
If it's a fixed constant, what about adding
Suggested change
|
||||||
|
||||||
public: | ||||||
/** | ||||||
* @brief Get Global ClBufferManager. | ||||||
* | ||||||
* @return ClBufferManager& | ||||||
*/ | ||||||
static ClBufferManager &getInstance(); | ||||||
|
||||||
opencl::Buffer *readBufferA; | ||||||
opencl::Buffer *readBufferB; | ||||||
opencl::Buffer *readBufferC; | ||||||
opencl::Buffer *writeBufferA; | ||||||
opencl::Buffer *writeBufferB; | ||||||
Is it reasonable to leave them public? Given the name of this class, cl_buffer_manager, it would be better not to expose buffer management to outside code. Wouldn't it be better to make them private and add methods to access them?
One more comment.
Added abstraction in
||||||
|
||||||
/** | ||||||
* @brief Initialize Buffer objects. | ||||||
*/ | ||||||
void initBuffers(); | ||||||
|
||||||
/** | ||||||
* @brief Destroy Buffer pointers. | ||||||
* | ||||||
*/ | ||||||
~ClBufferManager(); | ||||||
}; | ||||||
} // namespace nntrainer | ||||||
|
||||||
#endif /* __CL_BUFFER_MANAGER_H__ */ |
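The review comment about the public buffer pointers could be addressed roughly as follows. This is only a minimal sketch, not the code that was eventually merged: `StubBuffer` is a hypothetical stand-in for `opencl::Buffer`, and `BufferManagerSketch`, `kBufferSizeBytes`, `getInBufferA()` and `getOutBufferA()` are illustrative names that do not appear in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for opencl::Buffer so the sketch builds without OpenCL.
struct StubBuffer {
  StubBuffer(std::size_t bytes, bool is_read_only)
    : size_bytes(bytes), read_only(is_read_only) {}
  std::size_t size_bytes;
  bool read_only;
};

// Assumed constant; mirrors the 8192 * 8192 * sizeof(float) preset in the diff.
constexpr std::size_t kBufferSizeBytes = 8192u * 8192u * sizeof(float);

class BufferManagerSketch {
public:
  static BufferManagerSketch &getInstance() {
    static BufferManagerSketch instance;
    return instance;
  }

  BufferManagerSketch(const BufferManagerSketch &) = delete;
  BufferManagerSketch &operator=(const BufferManagerSketch &) = delete;

  // Creates the buffers once; later calls are no-ops.
  void initBuffers() {
    if (in_buffer_a_)
      return;
    in_buffer_a_ = std::make_unique<StubBuffer>(kBufferSizeBytes, true);
    out_buffer_a_ = std::make_unique<StubBuffer>(kBufferSizeBytes, false);
  }

  // Hypothetical accessors: callers can use a buffer, but they cannot
  // reseat the pointers that the manager owns.
  StubBuffer *getInBufferA() const { return in_buffer_a_.get(); }
  StubBuffer *getOutBufferA() const { return out_buffer_a_.get(); }

private:
  BufferManagerSketch() = default;

  std::unique_ptr<StubBuffer> in_buffer_a_;
  std::unique_ptr<StubBuffer> out_buffer_a_;
};
```

With the members private, ownership stays inside the manager and `std::unique_ptr` releases the buffers without a hand-written destructor. Whether smart pointers suit the real `opencl::Buffer` type is an assumption here.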
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,6 +29,7 @@ | |
#include <layer.h> | ||
#include <layer_devel.h> | ||
|
||
#include <cl_buffer_manager.h> | ||
#include <opencl_command_queue_manager.h> | ||
#include <opencl_context_manager.h> | ||
#include <opencl_kernel.h> | ||
|
@@ -79,12 +80,14 @@ class ClContext { | |
|
||
template <typename... Ts> using FactoryMap = std::tuple<IndexType<Ts>...>; | ||
|
||
// getting static instance of commandqueue and opencl context | ||
// getting static instance of commandqueue, opencl context and buffermanager | ||
opencl::CommandQueueManager &command_queue_inst_ = | ||
opencl::CommandQueueManager::GetInstance(); | ||
|
||
opencl::ContextManager &context_inst_ = opencl::ContextManager::GetInstance(); | ||
|
||
ClBufferManager &clbuffInstance = ClBufferManager::getInstance(); | ||
|
||
/** | ||
* @brief Default constructor | ||
*/ | ||
|
@@ -272,6 +275,9 @@ class ClContext { | |
|
||
// getContext() called inside createCommandQueue which creates clContext | ||
bool result = command_queue_inst_.CreateCommandQueue(); | ||
// initialize device buffers | ||
clbuffInstance.initBuffers(); | ||
Can you guarantee that initBuffers() is called before any possible call to ClBufferManager::some-access-function?
Yes
To guarantee the calling sequence (other member functions are called after At least, set the buffer pointers to NULL in the constructor so that the caller is guaranteed to know that something's wrong if init wasn't called. And state (in the doxygen for potentially harmful member functions) that it will return NULL if init is not called.
Thanks for your insights. I have added initializers in the constructor and added relevant documentation for the member functions.
||
|
||
cl_initialized = result; | ||
return cl_initialized; | ||
}; | ||
|
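The reviewer's suggestion (null the pointers in the constructor and document that accessors return NULL before init) might look like the sketch below. All names here (`StubBuffer`, `GuardedManagerSketch`, `getReadBuffer()`, `runKernelSketch()`) are hypothetical and chosen for illustration; the actual change in the PR is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for opencl::Buffer.
struct StubBuffer {
  std::size_t size_bytes;
};

class GuardedManagerSketch {
public:
  // Pointers start as nullptr so a missing initBuffers() call is detectable.
  GuardedManagerSketch() : read_buffer_(nullptr) {}
  ~GuardedManagerSketch() { delete read_buffer_; }

  GuardedManagerSketch(const GuardedManagerSketch &) = delete;
  GuardedManagerSketch &operator=(const GuardedManagerSketch &) = delete;

  // Allocates the buffer on the first call only.
  void initBuffers(std::size_t size_bytes) {
    if (read_buffer_ == nullptr)
      read_buffer_ = new StubBuffer{size_bytes};
  }

  /**
   * @brief Get the read buffer.
   * @return the buffer, or nullptr if initBuffers() has not been called
   */
  StubBuffer *getReadBuffer() const { return read_buffer_; }

private:
  StubBuffer *read_buffer_;
};

// Caller-side check: refuse to run when the manager was never initialized.
bool runKernelSketch(const GuardedManagerSketch &manager) {
  StubBuffer *buf = manager.getReadBuffer();
  if (buf == nullptr)
    return false; // initBuffers() was not called
  return true;    // a real kernel would use buf here
}
```

The null check turns an ordering mistake into an explicit failure at the call site instead of a dereference of an uninitialized pointer.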
Simple question! Based on this PR, do we have to create all buffers for every kernel in advance?
Buffer creation consumes around 70-75% of the whole latency. This PR creates the buffers only once, at the beginning. All kernels can then re-use the same or different buffers multiple times. In other words, buffer data can be updated many times, but creation happens only once.
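The create-once, update-many pattern described in this reply can be illustrated on the host side. The sketch below is an assumption-laden analogy: `HostBufferSketch` and `runKernelsSketch()` are hypothetical, and a `std::vector` stands in for a device buffer, so it shows only the reuse pattern, not OpenCL behaviour or its latency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for a device buffer that is allocated once.
struct HostBufferSketch {
  explicit HostBufferSketch(std::size_t capacity_bytes)
    : storage(capacity_bytes) {}

  // Overwrites the buffer contents; no allocation happens here.
  bool update(const float *src, std::size_t count) {
    const std::size_t bytes = count * sizeof(float);
    if (bytes > storage.size())
      return false; // data does not fit in the preallocated buffer
    std::memcpy(storage.data(), src, bytes);
    ++update_count;
    return true;
  }

  std::vector<unsigned char> storage;
  int update_count = 0;
};

// Simulates several kernels that all write their input into the same buffer.
int runKernelsSketch(HostBufferSketch &buf, int num_kernels) {
  const float input[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  int succeeded = 0;
  for (int i = 0; i < num_kernels; ++i) {
    if (buf.update(input, 4))
      ++succeeded;
  }
  return succeeded;
}
```

A fixed preallocated size also means any input larger than the buffer has to be rejected or split, which is why `update()` checks the byte count first.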
I understood the concept. The point I want to ask about is how many buffers should be created in advance. Also, as you mentioned, the buffers can be reused. In that case, the manager should schedule the proper buffer while hiding its internal buffer assets.
As per the PR, 5 buffers are created in advance, each of 256 MiB. Also, as you suggested before, I have added proper abstraction for cl_buffer_manager.
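The sizes quoted here can be checked at compile time. The constant names below are illustrative rather than taken from the PR, and the check assumes a platform where `sizeof(float)` is 4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative names; only the 8192 * 8192 * sizeof(float) expression and the
// count of five buffers come from the PR.
constexpr std::uint64_t kMiB = 1024ull * 1024ull;
constexpr std::uint64_t kBufferSizeBytes = 8192ull * 8192ull * sizeof(float);
constexpr std::uint64_t kNumBuffers = 5;
constexpr std::uint64_t kTotalBytes = kNumBuffers * kBufferSizeBytes;

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");
static_assert(kBufferSizeBytes == 256 * kMiB, "each buffer is 256 MiB");
static_assert(kTotalBytes == 1280 * kMiB, "five buffers total 1280 MiB");
```

So initializing the manager reserves 1280 MiB (1.25 GiB) of buffer space up front, which is the cost behind the question of how many buffers to create in advance.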