use default delimiter to flatten columns #330

shcheklein · 2024-08-20T23:29:47Z

Fixes #328

~~to_pandas(flatten=True) now returns columns in the same way as to_records and in the same way we store them in the DB (to fix the bug and for consistency)~~

Changing this to use `.` as the delimiter. It is simpler for end users, and we don't want them to see DB details (at least that's not how our API operates atm).
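For illustration, here is a minimal sketch of what flattening nested column headers with `.` could look like. It reuses the `".".join(filter(None, header))` expression from the diff quoted in the review below. The `headers` tuples and the `flatten_headers` helper are hypothetical examples, not DataChain code.

```python
# Hypothetical nested headers: one tuple per column, padded with "" where
# a column has fewer nesting levels than the deepest one.
headers = [("file", "name"), ("file", "size"), ("score", "")]


def flatten_headers(headers, delimiter="."):
    """Join the non-empty parts of each header tuple with the delimiter."""
    # filter(None, ...) drops empty strings, so ("score", "") becomes "score"
    # rather than "score.".
    return [delimiter.join(filter(None, header)) for header in headers]


flat = flatten_headers(headers)
print(flat)
```

Passing `delimiter="__"` to the same helper would produce the DB-style names mentioned in this thread, which is why having a single default delimiter matters for consistency.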

cloudflare-workers-and-pages · 2024-08-20T23:32:14Z

Deploying datachain-documentation with Cloudflare Pages

Latest commit:	`7164839`
Status:	✅ Deploy successful!
Preview URL:	https://71fe28a1.datachain-documentation.pages.dev
Branch Preview URL:	https://fix-328.datachain-documentation.pages.dev


shcheklein · 2024-08-21T00:55:40Z

Probably not the right way to do this, tbh. I'll change it back (and replace `__` with `.`). @iterative/datachain what was the reason to use `__` instead of `.` at the DB level? We now have two ways of interacting with names (even within the public API, e.g. `to_records`).

codecov · 2024-08-21T01:21:00Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.88%. Comparing base (06fdd8c) to head (7164839).
Report is 5 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #330      +/-   ##
==========================================
- Coverage   86.94%   86.88%   -0.07%     
==========================================
  Files          90       90              
  Lines        9898     9896       -2     
  Branches     1995     1994       -1     
==========================================
- Hits         8606     8598       -8     
- Misses        944      949       +5     
- Partials      348      349       +1

Flag	Coverage Δ
datachain	`86.81% <100.00%> (-0.07%)`	⬇️


dberenbaum · 2024-08-21T14:27:04Z

src/datachain/lib/dc.py

-            columns = []
-            if headers:
-                columns = [".".join(filter(None, header)) for header in headers]
-            return pd.DataFrame.from_records(self.to_records(), columns=columns)


So the problem was this line (passing data as a dict and then assigning different column names)? Is the rest refactoring? Just want to make sure I understand the bug and the fix.
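The mismatch described in this question can be illustrated without pandas. The sketch below assumes that the records are dicts keyed with `__`-delimited names (as the thread says the DB stores them) while the column list is built separately with `.`. The sample data and the `rename_keys` helper are hypothetical, and this is not necessarily how the PR fixed it.

```python
# Hypothetical records keyed with the "__" delimiter used at the DB level.
records = [{"file__name": "a.jpg", "file__size": 10}]

# Column names built separately with "." (as in the removed lines above).
columns = ["file.name", "file.size"]

# Looking up dotted names in "__"-keyed dicts finds nothing.
mismatched = [[row.get(col) for col in columns] for row in records]


def rename_keys(row, old="__", new="."):
    """Return a copy of the row with the delimiter in every key replaced."""
    return {key.replace(old, new): value for key, value in row.items()}


# Renaming the keys first keeps the names and the data aligned.
aligned = [[rename_keys(row).get(col) for col in columns] for row in records]

print(mismatched)
print(aligned)
```

Whichever delimiter is chosen, deriving the column names and the record keys from the same source avoids this kind of silent mismatch.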

shcheklein requested review from mnrozhkov, dmpetrov and dberenbaum August 20, 2024 23:29

use default delimiter to flatten columns

8ac61cc

shcheklein force-pushed the fix-328 branch from 44a897a to 8ac61cc Compare August 20, 2024 23:30

shcheklein self-assigned this Aug 20, 2024

shcheklein added the bug Something isn't working label Aug 20, 2024

revert and use . as a column delimiter

3e95c1b

Merge branch 'main' into fix-328

7164839

dberenbaum reviewed Aug 21, 2024


dberenbaum approved these changes Aug 21, 2024


mnrozhkov approved these changes Aug 21, 2024


shcheklein merged commit 35bef8a into main Aug 21, 2024
38 checks passed

shcheklein deleted the fix-328 branch August 21, 2024 15:27
