SchemaNotEqualError not showing the difference in metadata #76

Open
henrytomsf opened this issue Oct 11, 2023 · 2 comments

Comments

@henrytomsf

Related to #70: when metadata differs between two schemas, the printout in the error logs doesn't explicitly show the difference, because it only uses the `__repr__` of PySpark's `StructField`.

@peterreckamp-8451

I recently started using this library and this is the first issue I ran into. I would love it if this PR were reviewed and merged.

@MilkSilk
Copy link

MilkSilk commented Nov 25, 2024

+1. I ran into this problem today and it took me half a day to figure out what was wrong. The issue matters for three reasons: chispa does not show the difference in metadata even though the DataFrame equality assertion fails; Spark does not include metadata in `str(df.schema)` or in `df.printSchema()`; and the `ignore_metadata` parameter is not documented in README.md. An easy improvement would be to add `ignore_metadata` to the README, which would probably help many developers who hit this issue in the future. More generally, chispa should display the difference in metadata when the assertion fails.
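
For anyone debugging this in the meantime, here is a minimal sketch of the kind of per-field metadata comparison described above. It is not part of chispa: `metadata_diff`, `Field`, and `Schema` are hypothetical names used for illustration. The stand-in classes exist only so the snippet runs without Spark; with real DataFrames one would presumably pass `df1.schema` and `df2.schema`, since the helper relies only on the `fields`, `name`, and `metadata` attributes.

```python
from collections import namedtuple

# Hypothetical stand-ins for pyspark.sql.types.StructField / StructType,
# used only so this sketch runs without Spark installed.
Field = namedtuple("Field", ["name", "metadata"])
Schema = namedtuple("Schema", ["fields"])


def metadata_diff(schema1, schema2):
    """Return {field_name: (metadata1, metadata2)} for fields whose metadata differ.

    Fields are matched by name; a field missing on one side is reported with None.
    """
    meta1 = {f.name: f.metadata for f in schema1.fields}
    meta2 = {f.name: f.metadata for f in schema2.fields}
    diff = {}
    for name in sorted(meta1.keys() | meta2.keys()):
        left, right = meta1.get(name), meta2.get(name)
        if left != right:
            diff[name] = (left, right)
    return diff


# Two schemas that differ only in the metadata of the "name" field.
s1 = Schema([Field("name", {"comment": "full name"}), Field("age", {})])
s2 = Schema([Field("name", {}), Field("age", {})])

for name, (left, right) in metadata_diff(s1, s2).items():
    print(f"{name}: {left!r} != {right!r}")
```

Matching fields by name rather than by position keeps the output readable when column order also differs, at the cost of not reporting ordering problems, which the schema comparison already covers.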
