
MRMC analysis of binary data

Brandon Gallas edited this page Apr 25, 2023 · 2 revisions

Can iMRMC analyze binary data?

Yes! The iMRMC Java GUI can analyze binary data and estimate reader-averaged percent correct. The key is how you format the input file. Note that this approach tricks a program designed for ROC data into analyzing binary data. As a result, the results are still labeled as AUC, even though the formatted data turn the analysis into an MRMC analysis of percent correct. See below.

Instead of tricking the iMRMC Java GUI, you can use our R package ("iMRMC"), which includes a function written specifically for binary data: uStat11.conditionalD. If you are comfortable with R, check out the iMRMC package, which can be downloaded from CRAN or directly from the release page of this repository.

Data

Your binary data is a set of "success" observations. One observation corresponds to one reader evaluating one case in one modality. The reader either gets the case correct (1 = one) or not (0 = zero). We need to map this data into the ROC-style input file that iMRMC expects. Please read the documentation here

The first section of an iMRMC input file contains the study description. After the study description comes the data section, which begins with "BEGIN DATA:" and specifies the truth status of each case followed by the reader scores. Each row in the data section has four fields separated by commas: readerID, caseID, modalityID, and score.

1. Create the rows specifying the truth state of each case

* Let the cases corresponding to your actual binary data be the ROC "disease cases". If you have N1 cases, there will be N1 rows specifying your cases as disease cases. Each row starts with "truth" as the readerID, followed by a unique caseID for the case, then "truth" as the modalityID, and finally a score of 1 for "disease".
* Next, create 5 fake ROC "non-disease" cases. Each row starts with "truth" as the readerID, followed by a fake caseID ("fake1", "fake2", ... "fake5"), then "truth" as the modalityID, and finally a score of 0 for "non-disease".

2. Create the rows specifying the "success" observations

* Map each of your binary "success" observations to a row of the input file. Each row gives a readerID, caseID, modalityID, and the success result as the score (Correct = 1, Incorrect = 0).
* Create fake observations of the ROC "non-disease" cases. For every reader and modality that has observations of actual cases, add one row per fake case with score = 0.5. If a reader does not read in a modality, don't bother creating fake data for that modality.

{| class="wikitable" style="float:left; margin-right: 10px;"
|-
! Case Category
! Reader decision
! Score in input file
|-
| rowspan="2" | Actual case
| Correct
| 1
|-
| Incorrect
| 0
|-
| Fake case
| NA
| 0.5
|}
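A short derivation shows why the reported AUC equals percent correct. It is a sketch that assumes the AUC is the empirical (Mann-Whitney) estimate with ties counted as one half. The symbols x_i (score of actual case i, either 0 or 1), y_j (score of fake case j, always 0.5), N_0 (number of fake cases), and the kernel s are notation introduced here for illustration. N_1 is the number of actual cases, as above.

```latex
\widehat{\mathrm{AUC}}
  = \frac{1}{N_0 N_1} \sum_{i=1}^{N_1} \sum_{j=1}^{N_0} s(x_i, y_j),
\qquad
s(x, y) =
\begin{cases}
1 & x > y \\
\tfrac{1}{2} & x = y \\
0 & x < y
\end{cases}

% With x_i \in \{0, 1\} and every y_j = 0.5, ties never occur and s(x_i, y_j) = x_i:
\widehat{\mathrm{AUC}}
  = \frac{1}{N_0 N_1} \sum_{i=1}^{N_1} N_0 \, x_i
  = \frac{1}{N_1} \sum_{i=1}^{N_1} x_i
```

The last expression is the fraction of actual cases the reader got correct in that modality. Under this assumption, the point estimate does not depend on how many fake cases are added.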

3. Example

Reading result:

{| class="wikitable" style="float:left; margin-right: 10px;"
|-
! Case ID
! Reader1 decision
! Reader2 decision
|-
| Actual1
| Correct
| Incorrect
|-
| Actual2
| Incorrect
| Correct
|}

Input file (add two fake cases and assume only one modality, "modalityA"):

{|
|-
! readerID
! caseID
! modalityID
! score
|-
| truth
| Actual1
| truth
| 1
|-
| truth
| Actual2
| truth
| 1
|-
| truth
| Fake1
| truth
| 0
|-
| truth
| Fake2
| truth
| 0
|-
| Reader1
| Actual1
| modalityA
| 1
|-
| Reader1
| Actual2
| modalityA
| 0
|-
| Reader1
| Fake1
| modalityA
| 0.5
|-
| Reader1
| Fake2
| modalityA
| 0.5
|-
| Reader2
| Actual1
| modalityA
| 0
|-
| Reader2
| Actual2
| modalityA
| 1
|-
| Reader2
| Fake1
| modalityA
| 0.5
|-
| Reader2
| Fake2
| modalityA
| 0.5
|}
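The two steps can also be scripted. The following Python sketch is illustrative only and is not part of iMRMC. The function name `build_data_rows`, its arguments, and the "Fake1", "Fake2", ... naming are assumptions made for this example. It produces only the "BEGIN DATA:" section, so the study description that precedes it in a real input file still has to be written separately.

```python
def build_data_rows(observations, n_fake=5):
    """Build the 'BEGIN DATA:' section for binary success observations.

    observations: iterable of (readerID, caseID, modalityID, success) tuples,
    where success is 1 (correct) or 0 (incorrect).
    n_fake: number of fake "non-disease" cases to add (hypothetical parameter).
    """
    observations = list(observations)
    # Keep real cases and (reader, modality) pairs in first-seen order.
    case_ids = list(dict.fromkeys(case for _, case, _, _ in observations))
    reader_modalities = list(dict.fromkeys((r, m) for r, _, m, _ in observations))
    fake_ids = [f"Fake{i}" for i in range(1, n_fake + 1)]

    rows = ["BEGIN DATA:"]
    # Step 1: truth rows. Real cases are "disease" (1), fake cases "non-disease" (0).
    rows += [f"truth,{case},truth,1" for case in case_ids]
    rows += [f"truth,{fake},truth,0" for fake in fake_ids]
    # Step 2: for each reader/modality pair, the real observations,
    # then one fake observation per fake case with score 0.5.
    for reader, modality in reader_modalities:
        for r, case, m, success in observations:
            if (r, m) == (reader, modality):
                rows.append(f"{reader},{case},{modality},{int(success)}")
        rows += [f"{reader},{fake},{modality},0.5" for fake in fake_ids]
    return rows


# The reading result from the example: two readers, two actual cases, one modality.
observations = [
    ("Reader1", "Actual1", "modalityA", 1),
    ("Reader1", "Actual2", "modalityA", 0),
    ("Reader2", "Actual1", "modalityA", 0),
    ("Reader2", "Actual2", "modalityA", 1),
]
rows = build_data_rows(observations, n_fake=2)
print("\n".join(rows))
```

Because fake rows are generated only for reader/modality pairs that appear in the observations, a reader who did not read in a modality gets no fake data for it, matching step 2.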