One-line large CSV file is not chunked #1073

Open
bogdan-bondar opened this issue Oct 31, 2024 · 0 comments

The input file consists of a single very long line (one row) of the form abc1, abc2, abc3, ..., abc10000000. The file size is about 200 MB.
Here is the parsing code:

Papa.LocalChunkSize = 1024 * 1024; // 1 MB

const onUploadCsv = (event: React.ChangeEvent<HTMLInputElement>): void => {
    const file = event.target.files?.[0];
    if (!file) {
        return;
    }

    Papa.parse(file, {
        worker: false,
        header: false,
        beforeFirstChunk: chunk => {
            console.log('Before first chunk callback chunk: ' + chunk);
            return chunk;
        },
        chunk: (results, parser) => {
            console.log(
                'Chunk callback results data length: ' +
                    results.data.length,
            );
            console.log('Chunk callback results data: ' + results.data);
        },
        complete: () => {
            console.log('Complete!');
        },
    });

    // Reset the value so the same file can be uploaded again
    event.target.value = '';
};

The result is that beforeFirstChunk is executed as intended, but the chunk callback receives no data on every iteration except the last, where it receives the whole line at once. This defeats the purpose of chunking/streaming. For multi-line (multi-row) files everything works as intended.

Could someone please explain this behaviour: is this format not supported, or is there a bug in the library?
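For reference, the behaviour I was hoping for can be sketched outside the library. The snippet below is only a rough illustration, not Papa Parse code: createFieldSplitter is a hypothetical helper that splits incoming text on commas and carries the incomplete trailing field over to the next chunk, so fields are reported as soon as they are complete instead of only when the row ends. It assumes plain ASCII fields with no quoted commas, which holds for the abc1, abc2, ... data described above.

```typescript
// Hypothetical helper (not part of Papa Parse): incremental comma splitter.
// Assumption: fields contain no quotes, escaped commas, or multi-byte
// characters that could be cut in half at a chunk boundary.
function createFieldSplitter(onFields: (fields: string[]) => void) {
    // Holds the trailing piece of the previous chunk, which may be a partial field.
    let carry = '';

    return {
        // Feed the next piece of text; complete fields are reported immediately.
        push(chunk: string): void {
            const parts = (carry + chunk).split(',');
            // The last piece may be cut mid-field, so keep it for the next chunk.
            carry = parts.pop() ?? '';
            if (parts.length > 0) {
                onFields(parts.map(part => part.trim()));
            }
        },
        // Call once after the final chunk to flush the last field.
        end(): void {
            const last = carry.trim();
            carry = '';
            if (last !== '') {
                onFields([last]);
            }
        },
    };
}
```

In the browser, the chunks could come from reading the file in slices, for example with file.slice(start, end).text(), calling push for each slice and end after the last one.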
