a user-friendly way to suppress the undefined field warnings during connector parsing #18153

Open
lmatz opened this issue Aug 21, 2024 · 2 comments
@lmatz
Contributor

lmatz commented Aug 21, 2024

https://risingwave-community.slack.com/archives/C03BW71523T/p1724178084269959

Another question: what about fields which are sometimes empty? For "Undefined field" I receive the following error:
risingwave_connector::parser: failed to parse non-pk column, padding with NULL error=Undefined field nename at ``

This message is just a warning and can be safely ignored in your case. It says that additionalCodeUom is defined in the schema but not present in the actual JSON message, so it is treated as NULL. This is very likely the expected behavior, but unfortunately we do not have an option to suppress this warning completely.

Still ugly to show these errors?

What about allowing users to suppress these warnings by explicitly defining a column as "nullable" in the table inside SQL?

Right now, the column by default is effectively nullable. But pretty often, the user may not even know if a column is supposed to be nullable or not.

Disabling all the warning messages is not ideal as unexpected situations may still occur and users want to be informed.

After the change, in terms of parsing behavior:

  1. by default (no constraint declared): nullable
  2. null: nullable
  3. not null: reject null

In terms of warnings:

  1. by default (no constraint declared): show warnings
  2. null: no warnings for this particular column
  3. not null: show warnings
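
To make the proposal concrete, here is a minimal sketch of the two lists above as a single decision function. It is not RisingWave code: `Nullability`, `RejectedRow`, and `parse_column` are hypothetical names used only for illustration, and the sketch assumes that a field missing from the message is what triggers the "Undefined field" warning, while a `NOT NULL` column rejects the row.

```python
from enum import Enum
from typing import Any, Dict, List


class Nullability(Enum):
    """How the column was declared in SQL (hypothetical model)."""
    DEFAULT = "default"    # no constraint written: nullable, warnings shown
    NULL = "null"          # explicitly nullable: warnings suppressed
    NOT_NULL = "not null"  # nulls are rejected, warnings shown


class RejectedRow(Exception):
    """Raised when a NOT NULL column has no value (hypothetical)."""


def parse_column(record: Dict[str, Any], name: str,
                 nullability: Nullability, warnings: List[str]) -> Any:
    """Return the column value, or None when it is padded with NULL."""
    missing = name not in record
    value = record.get(name)
    if value is not None:
        return value
    if nullability is Nullability.NOT_NULL:
        # Case 3: reject null, and still tell the user about it.
        warnings.append(f"rejecting row: column {name} is NOT NULL")
        raise RejectedRow(name)
    if missing and nullability is Nullability.DEFAULT:
        # Case 1: today's behavior, pad with NULL and warn.
        warnings.append(f"undefined field {name}, padding with NULL")
    # Case 2 (explicit NULL) falls through silently.
    return None
```

With this shape, a user who knows a field is optional declares it `NULL` once and stops seeing the warning for that column only. Columns left at the default keep reporting unexpected gaps.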
@github-actions github-actions bot added this to the release-2.1 milestone Aug 21, 2024
@lmatz
Contributor Author

lmatz commented Aug 21, 2024

It could also be an indicator of whether we should put the data into the "error table" or a quarantine (as in https://www.tinybird.co/docs/guides/ingesting-data/recover-from-quarantine), or just reject it.

If the user wants to ingest every row from upstream without knowing what fields can possibly be inside the data, #12207 is the way to go.

@BugenZhao
Member

#15525

@lmatz lmatz modified the milestones: release-2.1, release-2.2 Oct 17, 2024