
Model migration consultation #14

Open
yihp opened this issue Aug 27, 2024 · 13 comments

Comments

@yihp

yihp commented Aug 27, 2024

Hi! Thanks for your contribution. It is an excellent piece of work!

My task language is Chinese. I trained a Chinese tokenizer and trained the model from scratch, but I have the following questions:
Can I still use the CheXbert metrics? I am still using monitor: val_report_chexbert_f1_macro for training. Should I change to another monitor?

Thank you very much for your time and consideration. I eagerly look forward to your response.

@anicolson
Member

Hi @yihp,

Oof, unfortunately, I think you can only use CheXbert in English. Unless you can translate to English before evaluation? But you can certainly change monitor to something else.

@yihp
Author

yihp commented Aug 27, 2024

OK, which monitor do you recommend for my Chinese task?

@yihp
Author

yihp commented Aug 28, 2024

Hi @anicolson ,

I learned from your paper that CheXbert, RadGraph ER, and CXR-BERT were intended to capture the clinical semantic similarity between the generated and radiologist reports, but these models are for English tasks, so I can't reuse them. BERTScore seems able to evaluate Chinese tasks. I have the following questions:

  1. I could use BERTScore as the semantic-similarity reward, but its results in your paper are not as strong as CXR-BERT's.
  2. Because CheXbert is only applicable to English tasks, I have to change monitor: 'val_report_chexbert_f1_macro'. Do you have any suggestions for the choice of monitor? BERTScore, CIDEr, ROUGE-L, or BLEU-4?

@anicolson
Member

Hi @yihp,

I am not quite sure to be honest. Maybe you could use a Chinese BERT for BERTScore? You could modify here as such:

bert_scorer = BERTScorer(

Here are those options you mentioned for monitor:

val_report_bertscore_f1
val_report_nlg_bleu_4
val_report_nlg_cider
val_report_nlg_rouge

I pushed bertscore to the repo as well.
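To illustrate the suggestion above, here is a minimal sketch of pointing BERTScore at a Chinese encoder, assuming the bert-score package; the example reports are hypothetical, and constructing the scorer downloads the underlying model:

```python
# Sketch: configure BERTScore with a Chinese encoder instead of the
# default English model. lang="zh" selects bert-score's default Chinese BERT.
from bert_score import BERTScorer

bert_scorer = BERTScorer(
    lang="zh",                    # Chinese; alternatively pass model_type=...
    rescale_with_baseline=False,  # baselines may not exist for every model
)

# Toy candidate/reference reports (hypothetical Chinese text)
cands = ["双肺未见明显实变影。"]
refs = ["两肺未见实变。"]
P, R, F1 = bert_scorer.score(cands, refs)  # tensors of per-report scores
```

Whether bert-score's default Chinese model is appropriate for radiology text is untested; a domain-specific Chinese encoder may work better if one exists.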

@yihp
Author

yihp commented Aug 29, 2024

Hi @anicolson ,

Thank you very much for your reply.
You use val_report_chexbert_f1_macro as the monitor. I would like to ask about the specific process: do you use the trained cxrmate model to generate radiology reports, then have the CheXbert model predict the labels (14 categories), and then calculate the chexbert_f1 value against the actual labels?

Is this the process?

@anicolson
Member

Hi @yihp,

So during validation/testing, the model will generate a report. Then, the generated report and the radiologist report are passed through chexbert (giving the chexbert labels for each). Classification scores are then calculated between the chexbert labels of the generated and radiologist reports.
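The scoring step described above could be sketched as follows. This is a pure-Python illustration with toy binary labels, not the repo's code; the real pipeline obtains the 14 CheXbert observation labels by running both reports through the CheXbert model:

```python
# Illustrative sketch: given CheXbert-style binary labels for generated and
# radiologist reports, compute per-class and macro F1.

def f1(pred, true):
    """Binary F1 for one class across all reports."""
    tp = sum(p and t for p, t in zip(pred, true))
    fp = sum(p and not t for p, t in zip(pred, true))
    fn = sum(not p and t for p, t in zip(pred, true))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Rows: reports; columns: observation classes (toy example with 3 classes,
# not real CheXbert output, which has 14).
generated_labels = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
radiologist_labels = [[1, 0, 0], [0, 1, 1], [1, 1, 0]]

per_class = [
    f1([g[c] for g in generated_labels], [r[c] for r in radiologist_labels])
    for c in range(3)
]
macro_f1 = sum(per_class) / len(per_class)
```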

@yihp
Author

yihp commented Aug 29, 2024

Hi @anicolson ,

OK, I got it. I changed the tokenizer and retrained the model, and the results are as follows:

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric                                               ┃ DataLoader 0          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━┩
│ test_report_chexbert_accuracy_atelectasis                 │ 0.6504310369491577    │
│ test_report_chexbert_accuracy_cardiomegaly                │ 1.0                   │
│ test_report_chexbert_accuracy_consolidation               │ 1.0                   │
│ test_report_chexbert_accuracy_edema                       │ 1.0                   │
│ test_report_chexbert_accuracy_enlarged_cardiomediastinum  │ 0.9993842244148254    │
│ test_report_chexbert_accuracy_example                     │ 0.9673740863800049    │
│ test_report_chexbert_accuracy_fracture                    │ 1.0                   │
│ test_report_chexbert_accuracy_lung_lesion                 │ 1.0                   │
│ test_report_chexbert_accuracy_lung_opacity                │ 0.9978448152542114    │
│ test_report_chexbert_accuracy_macro                       │ 0.9673740863800049    │
│ test_report_chexbert_accuracy_micro                       │ 0.9673740863800049    │
│ test_report_chexbert_accuracy_no_finding                  │ 1.0                   │
│ test_report_chexbert_accuracy_pleural_effusion            │ 0.9910714030265808    │
│ test_report_chexbert_accuracy_pleural_other               │ 1.0                   │
│ test_report_chexbert_accuracy_pneumonia                   │ 1.0                   │
│ test_report_chexbert_accuracy_pneumothorax                │ 1.0                   │
│ test_report_chexbert_accuracy_support_devices             │ 0.9045053124427795    │
│ test_report_chexbert_f1_atelectasis                       │ 0.7660866379737854    │
│ test_report_chexbert_f1_cardiomegaly                      │ 0.0                   │
│ test_report_chexbert_f1_consolidation                     │ 0.0                   │
│ test_report_chexbert_f1_edema                             │ 0.0                   │
│ test_report_chexbert_f1_enlarged_cardiomediastinum        │ 0.0                   │
│ test_report_chexbert_f1_example                           │ 0.5966299176216125    │
│ test_report_chexbert_f1_fracture                          │ 0.0                   │
│ test_report_chexbert_f1_lung_lesion                       │ 0.0                   │
│ test_report_chexbert_f1_lung_opacity                      │ 0.0                   │
│ test_report_chexbert_f1_macro                             │ 0.08107323199510574   │
│ test_report_chexbert_f1_micro                             │ 0.7244199514389038    │
│ test_report_chexbert_f1_no_finding                        │ 0.0                   │
│ test_report_chexbert_f1_pleural_effusion                  │ 0.0                   │
│ test_report_chexbert_f1_pleural_other                     │ 0.0                   │
│ test_report_chexbert_f1_pneumonia                         │ 0.0                   │
│ test_report_chexbert_f1_pneumothorax                      │ 0.0                   │
│ test_report_chexbert_f1_support_devices                   │ 0.3689386248588562    │
│ test_report_chexbert_num_dicom_ids                        │ 2872.0                │
│ test_report_chexbert_num_study_ids                        │ 1624.0                │
│ test_report_chexbert_precision_atelectasis                │ 0.8176434636116028    │
│ test_report_chexbert_precision_cardiomegaly               │ 0.0                   │
│ test_report_chexbert_precision_consolidation              │ 0.0                   │
│ test_report_chexbert_precision_edema                      │ 0.0                   │
│ test_report_chexbert_precision_enlarged_cardiomediastinum │ 0.0                   │
│ test_report_chexbert_precision_example                    │ 0.6533148884773254    │
│ test_report_chexbert_precision_fracture                   │ 0.0                   │
│ test_report_chexbert_precision_lung_lesion                │ 0.0                   │
│ test_report_chexbert_precision_lung_opacity               │ 0.0                   │
│ test_report_chexbert_precision_macro                      │ 0.0843597799539566    │
│ test_report_chexbert_precision_micro                      │ 0.7660516500473022    │
│ test_report_chexbert_precision_no_finding                 │ 0.0                   │
│ test_report_chexbert_precision_pleural_effusion           │ 0.0                   │
│ test_report_chexbert_precision_pleural_other              │ 0.0                   │
│ test_report_chexbert_precision_pneumonia                  │ 0.0                   │
│ test_report_chexbert_precision_pneumothorax               │ 0.0                   │
│ test_report_chexbert_precision_support_devices            │ 0.3633934557437897    │
│ test_report_chexbert_recall_atelectasis                   │ 0.7206460237503052    │
│ test_report_chexbert_recall_cardiomegaly                  │ 0.0                   │
│ test_report_chexbert_recall_consolidation                 │ 0.0                   │
│ test_report_chexbert_recall_edema                         │ 0.0                   │
│ test_report_chexbert_recall_enlarged_cardiomediastinum    │ 0.0                   │
│ test_report_chexbert_recall_example                       │ 0.5752052664756775    │
│ test_report_chexbert_recall_fracture                      │ 0.0                   │
│ test_report_chexbert_recall_lung_lesion                   │ 0.0                   │
│ test_report_chexbert_recall_lung_opacity                  │ 0.0                   │
│ test_report_chexbert_recall_macro                         │ 0.07823583483695984   │
│ test_report_chexbert_recall_micro                         │ 0.6870800852775574    │
│ test_report_chexbert_recall_no_finding                    │ 0.0                   │
│ test_report_chexbert_recall_pleural_effusion              │ 0.0                   │
│ test_report_chexbert_recall_pleural_other                 │ 0.0                   │
│ test_report_chexbert_recall_pneumonia                     │ 0.0                   │
│ test_report_chexbert_recall_pneumothorax                  │ 0.0                   │
│ test_report_chexbert_recall_support_devices               │ 0.3746556341648102    │
│ test_report_cxr-bert                                      │ 0.7429220676422119    │
│ test_report_nlg_bleu_1                                    │ 0.3031856417655945    │
│ test_report_nlg_bleu_2                                    │ 0.03638414293527603   │
│ test_report_nlg_bleu_3                                    │ 0.016369516029953957  │
│ test_report_nlg_bleu_4                                    │ 0.0022414636332541704 │
│ test_report_nlg_cider                                     │ 0.04183460399508476   │
│ test_report_nlg_meteor                                    │ 0.1805824488401413    │
│ test_report_nlg_num_dicom_ids                             │ 2872.0                │
│ test_report_nlg_num_study_ids                             │ 1624.0                │
│ test_report_nlg_rouge                                     │ 0.34699246287345886   │
└───────────────────────────────────────────────────────────┴───────────────────────┘

The question is why test_report_cxr-bert is so high. Is it because CXR-BERT generalizes well to Chinese? I plan to test this.
Also, because I use val_report_chexbert_f1_macro as the monitor and my task is Chinese, the chexbert_f1 results are not meaningful as a reference. I will replace the monitor or fine-tune a Chinese CheXbert as you mentioned.

@anicolson
Member

anicolson commented Aug 29, 2024

How do the reports look? E.g., in experiments/.../trial_0/metric_outputs/reports/...

And I was suggesting a Chinese pre-trained Transformer encoder for BERTScore, not CheXbert or CXR-BERT (because I am not sure that they exist for the latter two).

@yihp
Author

yihp commented Aug 29, 2024

Another question: I don't see any code for calculating BERTScore. There is no BERTScore in the test results, only test_report_cxr-bert.

@anicolson
Member

Please pull the repo, it has been updated

@yihp
Author

yihp commented Sep 1, 2024

> How do the reports look? E.g., in experiments/.../trial_0/metric_outputs/reports/...
>
> And I was suggesting a Chinese pre-trained Transformer encoder for BERTScore, not CheXbert or CXR-BERT (because I am not sure that they exist for the latter two).

Hi @anicolson ,

Thank you very much for your reply.

The generated reports seem to be fine, but many reports generated for different dicom_ids are identical, which indicates that the model's report-generation ability is relatively poor.

I just tested the performance of CXR-BERT on Chinese, and it was very poor, which confirms that CXR-BERT is only suitable for English chest X-ray tasks. I am not sure whether a similar Chinese BERT model exists that can calculate similarity; I will run some tests.

In addition, because CheXbert is only applicable to English tasks, it is not realistic for me to retrain a CheXbert for Chinese. So do you have any suggestions for the choice of monitor for my Chinese task? Is a Chinese pre-trained Transformer encoder for BERTScore a good choice? Or other metrics?

Looking forward to your reply!

@anicolson
Member

Hi @yihp,

I guess your best starting point would be a non-model-based metric, such as a word-overlap metric that is language agnostic (I assume these fit into this category, but you will have to double-check: val_report_nlg_bleu_4, val_report_nlg_cider, val_report_nlg_rouge).

You could use this until you find a Chinese-based model that could be used as a metric perhaps.
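For reference, word-overlap metrics like ROUGE-L are language agnostic as long as the text is tokenized; for Chinese, tokens could be characters or the output of a segmenter. A minimal LCS-based sketch in the spirit of ROUGE-L (an illustration, not the repo's implementation, and the example reports are toy text):

```python
# Minimal LCS-based ROUGE-L F1 over token lists.

def lcs_len(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate, reference):
    """ROUGE-L F-score: harmonic mean of LCS precision and recall."""
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    p = lcs / len(candidate)
    r = lcs / len(reference)
    return 2 * p * r / (p + r)

# Toy example with character-level Chinese tokens
cand = list("双肺未见实变")
ref = list("两肺未见明显实变")
score = rouge_l_f1(cand, ref)  # LCS is "肺未见实变" (length 5)
```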

@yihp
Author

yihp commented Sep 2, 2024

Hi @anicolson ,

OK, I am doing experimental verification.

I have a question about eval_loss_step. In the TensorBoard training monitoring page, I only see train_loss_step, but no eval_loss_step. How should I add it?
