Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[util] Add numpy file reader #2492

Merged
merged 1 commit into from
Mar 4, 2024
Merged

Conversation

DonghakPark
Copy link
Member

In deep learning data feeding, there are often tasks that involve reading and processing numpy (.npy) files.

Therefore, it would be much more convenient to develop by entering the file name and returning the data in a vector instead of manually writing code to read it every time.

Function Signature

void read_npy_file(const char *file_path, std::vector<int> &dims,
                   std::vector<float> &values);

Self evaluation:

  1. Build test: [X]Passed [ ]Failed [ ]Skipped
  2. Run test: [X]Passed [ ]Failed [ ]Skipped

@taos-ci
Copy link

taos-ci commented Feb 28, 2024

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2492. Please a submit 1commit/1PR (one commit per one PR) policy to get comments quickly from reviewers. Your PR must pass all verificiation processes of cibot before starting a review process from reviewers. If you are new member to join this project, please read manuals in documentation folder and wiki page. In order to monitor a progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.

@taos-ci
Copy link

taos-ci commented Feb 28, 2024

:octocat: cibot: @DonghakPark, Applications/utils/npy_reader/npy_reader.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.

Copy link
Contributor

@djeong20 djeong20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

is a reader the only thing we need? (no saver?)

@DonghakPark
Copy link
Member Author

LGTM!

is a reader the only thing we need? (no saver?)

i will add numpy DataLoader for nntrainer, and Save later!

@taos-ci
Copy link

taos-ci commented Mar 4, 2024

:octocat: cibot: @DonghakPark, Applications/utils/npy_reader/npy_reader.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md

*
* @brief read numpy file from file_path
*/
void read_npy_file();
Copy link
Member

@myungjoo myungjoo Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think read_npy_file is required. The constructur should've done it.

E.g.,

private:
   void read_npy_file (const char *file_path);
public:
   void reload(); // Later, if you think this is really required;
   void save(new_file_path = nullptr); // Later. if it's null, use old file_path. If both are nullptr, throw invalid-param.
...
NpyReader::NpyReader(file_path_):
   file_path(file_path_) {
  read_npy_file (file_path); // This will initilize dims and values;
}
NpyReader::NpyReader(_dims, _values):
   file_path(nullptr), dims(_dims), values(_values) {
}
void NpyReader::reload(new_file_path)
{
  if (new_file_path!=nullptr)
    strcpy(file_path, new_file_path);
  read_npy_file (file_path);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FIX! Thank you for your feedback.
your proposed method much better scalability!! later i will add reload, save, nntrainer_dataloader method to this util function!!

@taos-ci
Copy link

taos-ci commented Mar 4, 2024

:octocat: cibot: @DonghakPark, Applications/utils/npy_reader/npy_reader.h does not include Doxygen tags such as @file @brief @author @bug. You must include the Doxygen tags in the source code. Please refer to a Doxygen manual at http://github.com/nnstreamer/TAOS-CI/blob/main/ci/doc/doxygen-documentation.md

@taos-ci
Copy link

taos-ci commented Mar 4, 2024

:octocat: cibot: @DonghakPark, Applications/utils/npy_reader/npy_reader.cpp includes bug(s). Please fix incorrect coding constructs in your commit before entering a review process.

@DonghakPark
Copy link
Member Author

The failure of the clang-format check in TAOS is due to a version issue, and we plan to disable the TAOS CI.

Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.

* @param file_path file_path that want to read
* @param dims data shape
* @param values return value that contain npy's contents
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need docs for .cpp files as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FIX! good point


for (int i = 0; i < total_entries; ++i) {
float value;
fread(&value, 4, 1, file);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question : fread for 4 because it is float?

Copy link
Member Author

@DonghakPark DonghakPark Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

numpy stores data in 4-byte increments for most types. It is necessary to check how many bytes are used for other data types.

In deep learning data feeding, there are often tasks that involve reading and processing numpy (.npy) files.

Therefore, it would be much more convenient to develop by entering the file name and returning the data in a vector instead of manually writing code to read it every time.

Function Signature
```
void read_npy_file(const char *file_path, std::vector<int> &dims,
                   std::vector<float> &values);
```

**Self evaluation:**
1. Build test:	 [X]Passed [ ]Failed [ ]Skipped
2. Run test:	 [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>
Copy link

@taos-ci taos-ci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.

@myungjoo myungjoo merged commit c522c7f into nnstreamer:main Mar 4, 2024
30 of 32 checks passed
@DonghakPark DonghakPark deleted the npy_reader branch May 7, 2024 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants