Cap tempfile number up to 200 #137

Merged
merged 1 commit into from
Sep 16, 2024

Conversation

chezou
Member

@chezou chezou commented Sep 13, 2024

The current BulkImport writer uses tempfile and opens all of its temporary files at once, before the parallel bulk import starts. Depending on the chunk size, this can leave too many files open at the same time and raise an OSError.
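
For illustration only, here is a minimal sketch of the pattern described above: one temporary file is opened per chunk up front, so every handle stays open until the import finishes. The function name `open_all_chunk_files` is hypothetical and is not taken from the writer's code; the chunk count is kept small so the sketch is safe to run.

```python
import tempfile


def open_all_chunk_files(n_chunks):
    """Open one temporary file per chunk before any work starts.

    Hypothetical helper: it mimics the "open everything first" pattern,
    not the actual BulkImport writer implementation.
    """
    return [tempfile.NamedTemporaryFile() for _ in range(n_chunks)]


files = open_all_chunk_files(8)
try:
    # Every file is still open here, so the process holds one file
    # descriptor per chunk at the same time.
    open_count = sum(not f.closed for f in files)
finally:
    for f in files:
        f.close()
```

With a large enough chunk count, the same pattern can exceed the per-process limit on open files, at which point opening the next file fails with an OSError.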

To avoid the OSError when writing large files, this PR caps the number of chunks at 200.

To avoid OSError when writing large files, the chunk number is capped at
200. This number comes from the maximum number of open files allowed by
macOS.
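
One way such a cap could work is sketched below, assuming the writer splits rows into fixed-size chunks: when the requested chunk size would produce more than 200 chunks, the chunk size is enlarged until the count fits. The names `MAX_CHUNKS`, `capped_chunk_size`, and `chunk_ranges` are hypothetical, and the merged commit may implement the cap differently.

```python
MAX_CHUNKS = 200  # cap taken from the PR description


def capped_chunk_size(total_rows, requested_chunk_size, max_chunks=MAX_CHUNKS):
    """Return a chunk size that yields at most `max_chunks` chunks.

    Hypothetical helper, not the function used in the merged commit.
    """
    if requested_chunk_size <= 0:
        raise ValueError("requested_chunk_size must be positive")
    if total_rows <= 0:
        return requested_chunk_size
    # Integer ceiling division avoids floating-point rounding.
    n_chunks = -(-total_rows // requested_chunk_size)
    if n_chunks <= max_chunks:
        return requested_chunk_size
    # Too many chunks: grow the chunk size so the count fits the cap.
    return -(-total_rows // max_chunks)


def chunk_ranges(total_rows, chunk_size):
    """Return (start, end) row ranges, one per chunk."""
    return [
        (start, min(start + chunk_size, total_rows))
        for start in range(0, total_rows, chunk_size)
    ]


size = capped_chunk_size(1_000_000, 1_000)
ranges = chunk_ranges(1_000_000, size)
```

In this sketch, a request that already produces 200 chunks or fewer is left unchanged, so only large inputs get bigger chunks.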
@chezou chezou changed the title Set the chunk number cap to 1000 Set the chunk number cap to 200 Sep 15, 2024
@chezou chezou marked this pull request as draft September 16, 2024 00:30
@chezou chezou changed the title Set the chunk number cap to 200 Create tempfile within ThreadPoolExecutor Sep 16, 2024
@chezou chezou force-pushed the chunk-size-cap branch 3 times, most recently from 08d6a36 to f597395 Compare September 16, 2024 00:58
@chezou chezou changed the title Create tempfile within ThreadPoolExecutor Cap tempfile number up to 200 Sep 16, 2024
@chezou chezou marked this pull request as ready for review September 16, 2024 01:00
Contributor

@tung-vu-td tung-vu-td left a comment

LGTM

@chezou chezou merged commit 0ed76b4 into master Sep 16, 2024
42 checks passed
@chezou chezou deleted the chunk-size-cap branch September 16, 2024 16:09
2 participants