
Too low accuracy result compared with the expected result #52
Open · opened Nov 10, 2023 · 6 comments

xtchon commented Nov 10, 2023

Hi, thanks for your work.
I'm trying to reproduce your reported accuracy but ran into some difficulties getting similar results.

Below is the environment that I created:

channels:
  - default
dependencies:
  - python=3.9.7
  - pip
  - pip:
    - transformers==4.17.0
    - scipy==1.7.3
    - datasets==2.00.0
    - scikit-learn==1.0.2
    - torch==1.10.2
    - black
    - wandb
    - matplotlib
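
Assuming this is saved as a conda environment file (the filename environment.yml and the env name cofi below are just placeholders), it would be created with something like:

    conda env create -n cofi -f environment.yml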

I used datasets==2.00.0 because when I install datasets==1.14.0, it results in the following conflict:
The conflict is caused by:
transformers 4.17.0 depends on huggingface-hub<1.0 and >=0.1.0
datasets 1.14.0 depends on huggingface-hub<0.1.0 and >=0.0.19

If I use datasets 2.00.0, I am able to run evaluation.py MNLI ../CoFi-MNLI-s95, but the results seem wrong. What can I do to solve this problem? Thanks a lot!
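
For reference, the evaluation run in question is invoked along the lines of:

    python evaluation.py MNLI ../CoFi-MNLI-s95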

../CoFi-MNLI-s95 is the checkpoint downloaded from https://huggingface.co/princeton-nlp/CoFi-MNLI-s95
Results I obtained:
Task: mnli
Model path: ../CoFi-MNLI-s95
Model size: 4330279
Sparsity: 0.949
accuracy: 0.091
seconds/example: 0.000531

Too low accuracy compared to the expected result:
Task: MNLI
Model path: princeton-nlp/CoFi-MNLI-s95
Model size: 4920106
Sparsity: 0.943
mnli/acc: 0.8055
seconds/example: 0.010151

xiamengzhou (Collaborator) commented

I think using datasets==1.14.0 is necessary in this case to get the model performance right. Maybe you can skip the version conflict for now?
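
One way to skip that conflict (just a workaround sketch, assuming a plain pip setup) is to install datasets without letting pip re-resolve its dependencies:

    pip install transformers==4.17.0
    pip install --no-deps datasets==1.14.0

The --no-deps flag leaves huggingface-hub at whatever version transformers already pulled in, so any remaining dependencies of datasets would need to be installed manually.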

SHUSHENGQIGUI commented

(quoting the original report above)

Hello, did you solve this problem? I'm running into the same issue.

xtchon (Author) commented Jun 29, 2024

(quoting the previous comment)

Actually, datasets==1.14.0 is not necessary; using datasets==2.14.6 solves this problem. After that there are a few more issues, and some code needs to be adjusted manually.
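
In pip terms, that change amounts to something like:

    pip install transformers==4.17.0 datasets==2.14.6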

SHUSHENGQIGUI commented

(quoting xtchon's reply above)

Thank you. Where is the issue occurring, and which code needs to be modified?

SHUSHENGQIGUI commented

(quoting xiamengzhou's suggestion above)

Hi, when I set transformers==4.17.0 and datasets==1.14.0, what version of huggingface-hub should I use? I get a version conflict from huggingface-hub.

SHUSHENGQIGUI commented Jul 2, 2024

All right, I finally found the key to this problem: I tested princeton-nlp/CoFi-MRPC-s95, and the result matches the table in the README.
[screenshot: MRPC evaluation output matching the README numbers]
By the way, here is my setting: transformers==4.17.0, datasets==2.1.0, huggingface-hub==0.19.0.
So I guess there is a bug in evaluation.py when evaluating MNLI accuracy.
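
For reference, that working combination would be pinned with something like:

    pip install transformers==4.17.0 datasets==2.1.0 huggingface-hub==0.19.0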
