Named entity extractor private models #8658
Conversation
Pull Request Test Coverage Report for Build 12429166430

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

💛 - Coveralls
The code changes look quite good to me already. We only need to update the from_dict/to_dict methods, and I have a question about one test case and the comment in there, which I find confusing. The test won't and shouldn't fail if HF_API_TOKEN is not set, so let's simply remove the comment. Otherwise that would mean we are trying to download the model, and in that case the test should be marked with
```python
@pytest.mark.integration
@pytest.mark.skipif(
    not os.environ.get("HF_API_TOKEN", None),
    reason="Export an env var called HF_API_TOKEN containing the Hugging Face token to run this test.",
)
```
For how to update from_dict/to_dict, you can have a look at other components, for example HuggingFaceLocalGenerator. We should also update the tests to have a test_to_dict_default and a test_to_dict_with_parameters. The tests with ...no_default_parameters_hf should use monkeypatch.delenv("HF_API_TOKEN", raising=False). A rough sketch of the serialization methods and one such test is below.
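For reference, here is a minimal, hedged sketch of what this could look like, modeled on HuggingFaceLocalGenerator's serialization pattern. The simplified NamedEntityExtractor skeleton, its parameters, and the model name used in the test are illustrative assumptions, not the PR's actual code.

```python
# Hedged sketch, modeled on HuggingFaceLocalGenerator's serialization pattern.
# The class below is a simplified stand-in for NamedEntityExtractor: the real
# component has more parameters (e.g. the backend), so treat all names here as assumptions.
from typing import Any, Dict

from haystack import default_from_dict, default_to_dict
from haystack.utils import Secret, deserialize_secrets_inplace


class NamedEntityExtractor:
    def __init__(
        self,
        *,
        model: str,
        token: Secret = Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False),
    ):
        self.model = model
        self.token = token

    def to_dict(self) -> Dict[str, Any]:
        # Secrets are serialized as env-var references, never as plain strings.
        return default_to_dict(
            self,
            model=self.model,
            token=self.token.to_dict() if self.token else None,
        )

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "NamedEntityExtractor":
        # Turn the serialized env-var reference back into a Secret before init.
        deserialize_secrets_inplace(data["init_parameters"], keys=["token"])
        return default_from_dict(cls, data)


def test_to_dict_default(monkeypatch):
    # Unset the token env var so the default Secret serializes as a non-strict env-var reference.
    monkeypatch.delenv("HF_API_TOKEN", raising=False)
    component = NamedEntityExtractor(model="dslim/bert-base-NER")
    data = component.to_dict()
    assert data["init_parameters"]["token"] == {
        "type": "env_var",
        "env_vars": ["HF_API_TOKEN", "HF_TOKEN"],
        "strict": False,
    }
```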
One more question and then we're ready to merge.
"device": "cuda:0", | ||
"token": "another-test-token", | ||
} | ||
huggingface_pipeline_kwargs = {"model": "gpt2", "device": "cuda:0", "token": "another-test-token"} |
Could you please briefly explain this change? Before, we explicitly set task to "text-generation" in pipeline_kwargs and initialized the component with task="text2text-generation" to then check whether the pipeline_kwargs override the init parameter, right?
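For context, a minimal sketch of the override check being described, under the assumption that the test targets HuggingFaceLocalGenerator's public huggingface_pipeline_kwargs parameter; the test name and values are illustrative, not the PR's actual test code.

```python
# Hedged sketch of the override check described above; not the PR's actual test.
from haystack.components.generators import HuggingFaceLocalGenerator


def test_task_in_pipeline_kwargs_overrides_init_task():
    generator = HuggingFaceLocalGenerator(
        task="text2text-generation",
        huggingface_pipeline_kwargs={"model": "gpt2", "task": "text-generation"},
    )
    # The task set explicitly in huggingface_pipeline_kwargs is expected to win
    # over the task passed at init.
    assert generator.huggingface_pipeline_kwargs["task"] == "text-generation"
```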
reverted as discussed
LGTM! 👍
Related Issues
Proposed Changes:
Add token support to NamedEntityExtractor and update the HF backend to pass it down to the tokenizer and model from_pretrained methods (see the sketch below).

How did you test it?
Local unit testing / e2e
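As a rough illustration of the token plumbing described above, here is a minimal sketch of how the Hugging Face token could be threaded into the from_pretrained calls. The helper name and the use of AutoModelForTokenClassification are assumptions for illustration, not the PR's actual backend code.

```python
# Hedged sketch: threading a Haystack Secret token into transformers' from_pretrained calls.
# load_model_and_tokenizer is a hypothetical helper, not the PR's real backend function.
from typing import Optional

from transformers import AutoModelForTokenClassification, AutoTokenizer

from haystack.utils import Secret


def load_model_and_tokenizer(model_name_or_path: str, token: Optional[Secret] = None):
    # Resolve the secret only at load time so the raw token never ends up in serialized configs.
    resolved = token.resolve_value() if token else None
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, token=resolved)
    model = AutoModelForTokenClassification.from_pretrained(model_name_or_path, token=resolved)
    return model, tokenizer
```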
Checklist
The PR title starts with one of the conventional commit prefixes (fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:) and includes ! in case the PR includes breaking changes.