Hello, I read the Q8BERT paper and have tried to reproduce the experimental results.
However, on some GLUE tasks (e.g. CoLA, MRPC), the gaps between the FP32 results and the quantized ones are much larger than those reported in the paper.
I tried sweeping the initial learning rate, but the results were still far from the reported numbers.
So I would like to ask whether the Q8BERT experiments were run with the default parameters set inside the nlp-architect code as below.
If not, could you share the experiment settings?
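For reference, the learning-rate sweep I ran amounts to something like the sketch below. Note that `train_and_eval` is a hypothetical stand-in for the actual nlp-architect Q8BERT fine-tuning entry point, and the grid values are the typical BERT fine-tuning range I tried, not confirmed project defaults:

```python
# Minimal sketch of the learning-rate sweep described above.
# NOTE: `train_and_eval` is a hypothetical stand-in for the actual
# nlp-architect Q8BERT fine-tuning call, and the grid values are
# illustrative, not confirmed nlp-architect defaults.

def train_and_eval(task: str, learning_rate: float) -> float:
    """Hypothetical: fine-tune quantized BERT on `task`, return its GLUE metric."""
    raise NotImplementedError("replace with the real nlp-architect training call")

for task in ("cola", "mrpc"):
    # Evaluate each candidate initial learning rate on the task.
    scores = {lr: train_and_eval(task, lr) for lr in (1e-5, 2e-5, 3e-5, 5e-5)}
    best_lr = max(scores, key=scores.get)
    print(f"{task}: best lr {best_lr:.0e} -> score {scores[best_lr]:.4f}")
```

Even with the best learning rate from this sweep, the gap to the FP32 baseline stayed much larger than reported in the paper.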