append mode to sbdf export #45
Comments
Hi @lwlwlwlw! Can you be more specific (perhaps with an example of the API you are expecting) about what kind of appending you are looking for? Are you looking for appending rows, appending columns, or some other concept of appending? I'll also note that appending is complicated by the inherent structure of SBDF files. They are laid out as a sequence of table slices (each covering a number of rows), and each table slice contains a sequence of column slices (each holding all the values of one column for the rows covered by the containing table slice). Appending a row involves rewriting (and growing) all the column slices in the last table slice. Appending a column requires rewriting all table slices, and would probably require rebalancing rows between slices, since there is a target number of values (rows x columns) in each table slice for performance reasons. In general, it's probably easier to import the data from the file, make any desired modifications to the data, and then export it again.
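The layout described above can be illustrated with a small toy model. This is only a sketch in plain Python: the names `TableSlice`, `build_slices`, `append_row`, and `append_column` are hypothetical, the in-memory lists stand in for the binary file, and none of it reflects the actual SBDF encoding or the `spotfire` package API. It only shows why a row append touches every column slice of the last table slice, while a column append re-slices the whole table.

```python
from dataclasses import dataclass


@dataclass
class TableSlice:
    """A run of rows stored column by column (one list per column slice)."""
    column_slices: list

    @property
    def row_count(self):
        return len(self.column_slices[0]) if self.column_slices else 0


def rows_per_slice(n_columns, target_values):
    # The target is counted in values (rows x columns), so a wider
    # table gets fewer rows per table slice.
    return max(1, target_values // n_columns)


def build_slices(rows, n_columns, target_values):
    """Cut a list of rows into table slices of column slices."""
    step = rows_per_slice(n_columns, target_values)
    slices = []
    for start in range(0, len(rows), step):
        block = rows[start:start + step]
        columns = [[row[c] for row in block] for c in range(n_columns)]
        slices.append(TableSlice(columns))
    return slices


def to_rows(slices):
    """Undo the slicing: recover the plain list of rows."""
    rows = []
    for table_slice in slices:
        rows.extend(list(values) for values in zip(*table_slice.column_slices))
    return rows


def append_row(slices, row, target_values):
    """Append one row in place; return how many column slices were written."""
    if slices and slices[-1].row_count < rows_per_slice(len(row), target_values):
        last = slices[-1]
        # Every column slice in the last table slice grows by one value.
        last.column_slices = [col + [v] for col, v in zip(last.column_slices, row)]
    else:
        # The last table slice is full (or there is none): start a new one.
        slices.append(TableSlice([[v] for v in row]))
    return len(row)


def append_column(slices, values, target_values):
    """Append one column; every table slice is rebuilt and rows are rebalanced."""
    rows = [row + [v] for row, v in zip(to_rows(slices), values)]
    if not rows:
        return []
    return build_slices(rows, len(rows[0]), target_values)


table = build_slices([[i, i * 10] for i in range(5)], n_columns=2, target_values=4)
print([s.row_count for s in table])          # [2, 2, 1]
print(append_row(table, [5, 50], 4))         # 2 column slices written
wider = append_column(table, list("abcdef"), 4)
print([s.row_count for s in wider])          # [1, 1, 1, 1, 1, 1]
```

In this model, adding a third column drops the rows per table slice from two to one, which is the rebalancing effect mentioned above.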
@bbassett-tibco One of our customers wants to append data (rows) to an existing SBDF file because the data is incremental. Preferably something like this (an "append=True" option to indicate appending): import spotfire.sbdf as sb
Hello! I'd like to second lwlwlwlw's request for an append mode to SBDF files and provide additional context for why this feature would be extremely valuable. Many of my clients require processing and exporting of large amounts of data (often exceeding available RAM) from various file formats and SQL databases into SBDF files. Our typical workflow involves Python processes where we perform data cleaning and formatting before converting to SBDF. This approach ensures that the Spotfire project loads pre-processed, clean data, significantly improving load times and project performance. However, I am facing challenges with the existing Currently, my workaround is to split larger datasets into multiple SBDF files and Spotfire "concatenates" them as the project loads, but this increases loading time. This is particularly inefficient given the explanation provided earlier about the complexity of appending: "To append a row involves rewriting (and growing) all the column slices in the last table slice; appending a column will have to rewrite all table slices (and would probably require rebalancing rows between slices since there is target number of values (rows x columns) in each table slice for performance reasons)." Given these challenges, I am wondering if you could provide guidance or consider implementing features that allow for more memory-efficient handling of large datasets during the export process. Specifically:
I understand that the SBDF file structure makes this complex, but any insights or potential solutions would be greatly appreciated. If full append functionality isn't feasible, are there alternative approaches or best practices you'd recommend for handling these large, pre-processed datasets more efficiently? Thank you for your consideration of this feature request and any guidance you can provide.
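For reference, the multi-file workaround mentioned in this comment (splitting a large dataset into several SBDF files that Spotfire concatenates on load) might be sketched as follows. This is an illustration under assumptions, not a recommended or official approach: `export_in_parts` and its parameters are hypothetical names, and the default writer is assumed to be `spotfire.sbdf.export_data(obj, sbdf_file)` from the `spotfire` package. Only one chunk is held in memory at a time.

```python
from pathlib import Path


def export_in_parts(chunks, out_dir, stem, write=None):
    """Write each chunk to its own numbered .sbdf file and return the paths.

    `chunks` is any iterable yielding one exportable object at a time
    (for example pandas DataFrames), so the full dataset never has to
    fit in memory. `write` is a callable taking (obj, path); it is
    injectable so the sketch can be exercised without the package.
    """
    if write is None:
        # Assumption: the spotfire package is installed and provides
        # export_data(obj, sbdf_file).
        import spotfire.sbdf as sbdf
        write = sbdf.export_data

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for index, chunk in enumerate(chunks):
        # Zero-padded numbering keeps the parts in order when sorted by name.
        part = out_dir / f"{stem}_{index:04d}.sbdf"
        write(chunk, str(part))
        paths.append(part)
    return paths
```

The chunks could come from a chunked reader such as `pandas.read_csv(..., chunksize=...)` or `pandas.read_sql(..., chunksize=...)`, with cleaning applied to each chunk before it is passed on. This does not remove the load-time cost of concatenating the parts in Spotfire; it only bounds memory use during export.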
OK, given @bschwartzjetrock's well-written problem description, I'm beginning to think that a potential solution to this request would look like:
We can definitely investigate the concept further in a future release.
Feature requests:
Would it be possible to add append mode to sbdf export (append data to an existing sbdf file)?
Thank you.