[Question]NLP text classification sample? #392

Open
LangDaoAI opened this issue Oct 18, 2022 · 11 comments

Comments

@LangDaoAI

Hi Team,

Is there an NLP text classification example for Evidently?

Thanks!

@LangDaoAI
Author

For example, I am using pretrained transformer models (BERT, RoBERTa, ...) for sentiment classification on text data. How can Evidently monitor these models and detect drift in the text data?

@LangDaoAI
Author

In other words: how do we measure data drift for unstructured data (NLP, CV) when there are no structured features?

@SangamSwadiK
Contributor

> That is to say: how do we measure data drift (NLP, CV) without structured features?

Hey @LangDaoAI! I guess you could apply dimensionality reduction and then run a KS test (or any other statistical test) on the reduced features to measure the drift.
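
A minimal sketch of this suggestion, not an Evidently feature. It assumes the text or image embeddings have already been extracted into NumPy arrays (for example, a transformer's pooled output); the random data below only stands in for them. The helper names `pca_fit` and `drift_report` are hypothetical, and the Bonferroni correction across components is an added assumption, one common way to combine several per-component tests.

```python
import numpy as np
from scipy.stats import ks_2samp


def pca_fit(reference, n_components):
    """Fit PCA on the reference embeddings; return (mean, components)."""
    mean = reference.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]


def drift_report(reference, current, n_components=5, alpha=0.05):
    """Project both batches with reference-fitted PCA, then KS-test each component."""
    mean, components = pca_fit(reference, n_components)
    ref_low = (reference - mean) @ components.T
    cur_low = (current - mean) @ components.T
    p_values = [
        float(ks_2samp(ref_low[:, i], cur_low[:, i]).pvalue)
        for i in range(n_components)
    ]
    # Bonferroni correction (an assumption of this sketch): flag drift if any
    # component's p-value falls below alpha divided by the number of tests.
    threshold = alpha / n_components
    return {"p_values": p_values, "drift_detected": bool(min(p_values) < threshold)}


# Synthetic stand-ins for embeddings: 500 samples, 64 dimensions each.
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 64))
current = rng.normal(loc=1.0, size=(500, 64))  # every dimension shifted by 1.0

print(drift_report(reference, current))
```

Fitting the projection on the reference batch only keeps the comparison fixed over time, so each new batch is tested against the same low-dimensional view of the data.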

@LangDaoAI
Author

> > That is to say: how do we measure data drift (NLP, CV) without structured features?
>
> Hey! @LangDaoAI I guess you could use dimensionality reduction followed by KS test or any other test to measure the drift.

Hi @SangamSwadiK, in my use cases most of the models are transformer-based, for text classification and image classification. How do we measure data drift (NLP, CV) without structured features using Evidently?

@LangDaoAI
Author

I find it a little strange: has no one asked this question before?

@elenasamuylova
Collaborator

Hi @LangDaoAI, right now, Evidently only natively supports tabular data as inputs.

Supporting text data is on our mid-term roadmap. This is a feature request that comes up regularly, so we will definitely address it. However, we want to build up some of the core features for tabular data first, and we would need to do some additional research before we implement drift detection and similar features for unstructured data. Will get there!

@LangDaoAI
Author

LangDaoAI commented Oct 18, 2022 via email

@elenasamuylova
Collaborator

Hi @LangDaoAI, as a quick note: we've recently released raw text data support in Evidently. You can read more here: https://www.evidentlyai.com/blog/evidently-data-quality-monitoring-and-drift-detection-for-text-data

@LangDaoAI
Author

LangDaoAI commented Jan 30, 2023 via email

@LangDaoAI
Author

> Hi @LangDaoAI, as a quick note: we've recently released raw text data support in Evidently. You can read more here: https://www.evidentlyai.com/blog/evidently-data-quality-monitoring-and-drift-detection-for-text-data

Hi @elenasamuylova, I noticed the following in the blog post:
"What’s more, you can pass multi-modal data that combines features of different types in a single dataset."
Does that mean CV (image) data is also supported?

Thanks!

@LangDaoAI
Author

This new feature and its design and implementation are really awesome! However, after reading the blog post, I am still a little confused about what "drift" actually means. I could read https://arxiv.org/pdf/1810.11953.pdf carefully and spend the time and effort to understand it, but I would appreciate a clearer explanation of "drift", ideally with examples of how it happens in the real world.

Thanks!
