fit z-curve (mixture model) with all z-values rather than only statistically significant ones #16

Yefeng0920 · 2023-10-23T06:26:33Z

@FBartos @gaborcsardi I would be grateful if you could tell me how to fit a collection of z-values without truncation at 1.96. As I understand it, z-curve uses only the statistically significant z-values to fit the mixture model. How can I use all z-values, regardless of statistical significance? I ask because I want to test the following: for a dataset without publication bias (which Registered Reports can guarantee), the EDR derived from a mixture model fitted to only the statistically significant z-values should be similar to the EDR from a model fitted to all z-values.

Best,
Yefeng

FBartos · 2023-10-25T14:07:51Z

Hi Yefeng,

You can use the control argument of the zcurve() function to set the lower bound of the fitting range, a. See the following example:

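library(zcurve)
# Simulate 100 z-values (illustrative data only)
z <- rnorm(100)
# a = 0 sets the lower bound of the fitting range to 0
fit <- zcurve(z = z, control = list(a = 0))
summary(fit)
plot(fit)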

See ?control_EM for more details.

Hope this helps!
Frantisek

Yefeng0920 · 2023-10-25T22:47:32Z

Hi Frantisek @FBartos,
This is quite useful. Let me check my understanding of the so-called folded truncated distribution. The raw values are first converted into absolute values (magnitudes) and then restricted to a certain range. By default, the range is qnorm(0.05/2, lower.tail = FALSE) to 5. Finally, a mixture model is fitted to the truncated values with EM estimation. Fitting only the nominally statistically significant z-values is what accounts for publication bias, although I do not quite understand why this is the case. Do I understand the whole process correctly?
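The fold-and-truncate steps described above can be sketched in base R. This is only an illustration under assumptions: the simulated data, the variable names, and the use of inclusive bounds are hypothetical, and it is not the zcurve package's internal code.

```r
# Illustrative sketch only; not the zcurve package's internal code.
set.seed(1)                        # reproducible simulated data
z_raw <- rnorm(1000, mean = 1.5)   # hypothetical z-values, some of them negative

a <- qnorm(0.05 / 2, lower.tail = FALSE)  # default lower bound, about 1.96
b <- 5                                    # default upper bound named in the thread

# Fold: keep only the magnitude of each z-value
z_folded <- abs(z_raw)

# Truncate: keep the folded values inside [a, b]
# (inclusive bounds are an assumption of this sketch)
z_used <- z_folded[z_folded >= a & z_folded <= b]

# With a lower bound of 0, every folded value up to b is kept
z_all <- z_folded[z_folded >= 0 & z_folded <= b]

length(z_used)
length(z_all)
```

Comparing the two lengths shows how many z-values the default range leaves out of the fit.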

FBartos · 2023-10-27T07:39:57Z

Yes, that's correct.
In short, under selection for statistical significance, estimating the model from only the statistically significant results with a truncated likelihood yields estimates that are unaffected by publication bias. We then use the locations of the truncated distributions to extrapolate to the statistically non-significant results (which we do not use for estimation because selection may have made them non-representative).
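To make the truncated-likelihood idea concrete, here is a minimal base R sketch under strong simplifying assumptions: a single folded normal component with unit standard deviation instead of a mixture, direct one-dimensional optimization instead of EM, and hypothetical names (true_mu, neg_loglik, mu_hat, power_hat). It is not how zcurve is implemented; it only shows why dividing the density by the probability of landing in the fitting range removes the selection effect.

```r
# Illustrative single-component sketch; zcurve itself fits a mixture with EM.
set.seed(123)
a <- qnorm(0.05 / 2, lower.tail = FALSE)  # significance threshold, about 1.96
b <- 5                                    # upper bound of the fitting range
true_mu <- 2.5                            # hypothetical true mean of the z-values

z     <- abs(rnorm(5000, mean = true_mu)) # folded z-values from all studies
z_sig <- z[z >= a & z <= b]               # selection: only significant ones remain

# Negative log-likelihood of a folded normal truncated to [a, b]
neg_loglik <- function(mu, x, a, b) {
  dens <- dnorm(x, mean = mu) + dnorm(x, mean = -mu)          # folded density
  mass <- (pnorm(b, mean = mu)  - pnorm(a, mean = mu)) +
          (pnorm(b, mean = -mu) - pnorm(a, mean = -mu))       # P(a <= |Z| <= b)
  -sum(log(dens) - log(mass))
}

# Estimate the location from the significant values only
mu_hat <- optimize(function(mu) neg_loglik(mu, z_sig, a, b),
                   interval = c(0, 6))$minimum

# Extrapolate: implied probability of a significant result, P(|Z| > a)
power_hat <- pnorm(a, mean = mu_hat, lower.tail = FALSE) + pnorm(-a, mean = mu_hat)

c(naive_mean = mean(z_sig), mu_hat = mu_hat, power_hat = power_hat)
```

In this simulation the plain mean of the significant z-values overstates the location, while the truncated-likelihood estimate should land close to true_mu. The last step mirrors the extrapolation to non-significant results: the fitted location implies how often a study would reach significance.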

Yefeng0920 · 2023-10-27T09:09:37Z

@FBartos It is really a great idea. However, I still think that summarizing the discovery rate or replication rate by the average alone is not always adequate. In such cases, it would be good to present the whole distribution.

FBartos added the documentation label (Improvements or additions to documentation) on Oct 25, 2023
