TST: dt64 units #56261

Merged
merged 6 commits into from
Dec 4, 2023

Conversation

jbrockmendel
Member

Aimed at trimming the diff in #55901

Member

WillAyd left a comment

looks pretty reasonable

@@ -130,8 +132,15 @@ def df_ref(datapath):
     return df_ref
 
 
-def adjust_expected(expected: DataFrame, read_ext: str) -> None:
+def get_exp_unit(read_ext: str, engine: str | None) -> str:
Member

So your ultimate vision here is that different engines will return different units? Or is this just considered a temporary solution?

Member Author

In #55901 this becomes

def get_exp_unit(read_ext: str, engine: str | None) -> str:
    unit = "us"
    if (read_ext == ".ods") ^ (engine == "calamine"):
        unit = "s"
    return unit
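
To make the XOR in the snippet above concrete, here is a small sketch that tabulates the result for a few extension/engine pairs. The function body follows the comment (with `Optional[str]` standing in for `str | None` so it also runs on older Python versions); the extensions and engine names in the loop are illustrative assumptions, not the test suite's actual parametrization.

```python
from itertools import product
from typing import Optional


def get_exp_unit(read_ext: str, engine: Optional[str]) -> str:
    # Body as quoted from #55901 above: "^" is XOR on the two booleans.
    unit = "us"
    if (read_ext == ".ods") ^ (engine == "calamine"):
        unit = "s"
    return unit


# Hypothetical sample of extensions and engines, chosen only for illustration.
extensions = [".xlsx", ".ods"]
engines = [None, "calamine"]

units = {
    (ext, eng): get_exp_unit(ext, eng)
    for ext, eng in product(extensions, engines)
}
for (ext, eng), unit in units.items():
    print(f"{ext!r:8} {eng!r:12} -> {unit}")
```

Because of the XOR, an `.ods` file read with the calamine engine ends up back at `"us"`, while either condition on its own gives `"s"`.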

Member

What makes ods files or the calamine reader different here? My first thought is that it is surprising behavior to have those be the only ones that return seconds

Member Author

no idea. I'd be OK with coercing them all to microseconds

Member

What is the largest representable date for microsecond precision?

From what I see, Excel can only display milliseconds, but doesn't offer first-class formula support for them. The ODS specification, point 18.3.14, links its datetime format to XML Schema Part 2

https://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dateTime

Which only mentions "fractional seconds" but without too much detail.

So it seems like a wild west of implementation possibilities. Excel has an upper limit on dates of December 31, 9999, so maybe we just try to cover that?

https://support.microsoft.com/en-us/office/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3
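
For a rough sense of scale, the Excel cap can be expressed as a day count. This is only a sketch: it assumes the commonly used 1899-12-30 epoch for Excel serial dates (an assumption of this example, not something stated in the thread) and uses only the standard library.

```python
from datetime import date, timedelta

# Assumed epoch for Excel serial dates (1900 date system); not from the thread.
EXCEL_EPOCH = date(1899, 12, 30)
EXCEL_MAX_DATE = date(9999, 12, 31)  # upper limit cited above

# Number of whole days between the assumed epoch and the last supported date.
max_serial_days = (EXCEL_MAX_DATE - EXCEL_EPOCH).days

# Round-trip: adding that many days to the epoch lands on the cap again.
round_trip = EXCEL_EPOCH + timedelta(days=max_serial_days)

print(max_serial_days)
print(round_trip)
```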

Member Author

December 31, 9999 23:59:59.999999 is also the highest value supported by the stdlib datetime. Microseconds can go further than that, but most of the time when we get microseconds it is because we got stdlib datetime objects
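
A quick way to see both limits side by side, as a sketch using only the standard library: the estimate assumes a signed 64-bit count of microseconds since 1970-01-01 (the usual datetime64 convention) and an average Gregorian year, so the resulting year is approximate.

```python
from datetime import datetime

# The stdlib ceiling mentioned above.
stdlib_max = datetime.max

# Assumption: microseconds are stored as a signed 64-bit offset from 1970-01-01.
INT64_MAX = 2**63 - 1
MICROS_PER_YEAR = 365.2425 * 86_400 * 1_000_000  # average Gregorian year
approx_last_year_us = 1970 + INT64_MAX / MICROS_PER_YEAR

print(stdlib_max)
print(round(approx_last_year_us))
```

Under those assumptions the microsecond range extends hundreds of thousands of years past year 9999, so the stdlib limit is the tighter of the two.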

Member Author

(btw i definitely want to have this discussion but would prefer it to happen in #55901, this PR is just aimed at diff-trimming to make that one easier)

mroeschke added the Testing (pandas testing functions or related to the test suite) and Non-Nano (datetime64/timedelta64 with non-nanosecond resolution) labels Dec 1, 2023
mroeschke added this to the 2.2 milestone Dec 4, 2023
mroeschke merged commit d44f6c1 into pandas-dev:main Dec 4, 2023
44 checks passed
@mroeschke
Member

Thanks @jbrockmendel

jbrockmendel deleted the tst-adjust_expected branch December 4, 2023 20:09