Zhiwen Owen Jiang edited this page Sep 18, 2024 · 3 revisions

General

Q: Which imaging modalities can I use?

A: HEIG supports images in NIFTI, CIFTI2, and FreeSurfer morphometry formats. Any imaging modality that can be represented in these formats can be handled by HEIG. We have tested surface-based data in CIFTI2, and structural and diffusion images in NIFTI. However, more effort is needed to support functional MRI time series.

Q: How many LDRs do I need?

A: First, in --fpca, select a number that preserves no less than 80% of the image variance. Next, in --make-ldr, make sure the correlation between raw and reconstructed images is greater than 0.85. We encourage using more LDRs if computationally affordable.
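The two checks above can be sketched outside HEIG as follows. This is a minimal illustration, not HEIG's own code: the function names are hypothetical, the eigenvalues are assumed to be available as a plain array, and the correlation is assumed to be a per-image Pearson correlation between raw and reconstructed images (the FAQ does not specify how HEIG computes it).

```python
import numpy as np


def min_components_for_variance(eigenvalues, threshold=0.8):
    """Smallest number of top components whose eigenvalues sum to at
    least `threshold` of the total variance (hypothetical helper)."""
    ev = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cum_ratio = np.cumsum(ev) / ev.sum()
    # First index where the cumulative ratio reaches the threshold.
    idx = int(np.searchsorted(cum_ratio, threshold, side="left"))
    # Clamp in case floating-point rounding leaves the last ratio
    # slightly below a threshold of 1.0.
    return min(idx + 1, ev.size)


def reconstruction_correlations(raw, recon):
    """Pearson correlation between each raw image and its reconstruction.
    Rows are images (subjects), columns are voxels (assumed layout)."""
    a = np.asarray(raw, dtype=float)
    b = np.asarray(recon, dtype=float)
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    num = (a * b).sum(axis=1)
    den = np.sqrt((a * a).sum(axis=1) * (b * b).sum(axis=1))
    return num / den
```

For example, one could call min_components_for_variance on the eigenvalues with threshold=0.8 to get a starting number of LDRs, then confirm that the mean of reconstruction_correlations exceeds 0.85 and increase the number if it does not.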

Q: Which software can I use for LDR GWAS?

A: You can use any off-the-shelf GWAS software, such as PLINK2 or REGENIE.

Q: I want to share my summary statistics, what is required?

A: The minimum requirement is to share the triplet: LDR summary statistics (.sumstats + .snpinfo), bases, and the variance-covariance matrix of LDRs. We recommend providing the effective number in a README file, or sharing the eigenvalue file, so that users can easily correct for multiple hypothesis testing across voxels. Attaching image templates for visualization is also encouraged.

Q: What is the maximum image resolution?

A: We have tested that HEIG can efficiently handle images with 59,412 vertices, a resolution that covers the entire brain. Higher resolutions may cause out-of-memory issues. See "How much memory do I need?" below.

Q: My sample size is small (<10,000), can I use HEIG?

A: Yes, you can. However, a small sample size has some drawbacks. First, if the sample size is smaller than the image resolution, the effective number is always biased downward, which may cause false positives. Second, heritability and genetic correlation estimates will be unstable with large standard errors, although voxel-level GWAS is unaffected.

Q: How much memory do I need?

A: The main memory bottlenecks are --fpca and --voxel-gwas. For an imaging dataset with 59,412 vertices and 15,752 subjects, --fpca took around 25 GB of memory to estimate all 15,752 PCs. In another benchmark, we used a dataset with 117,019 voxels and 19,040 subjects to estimate the top 5,000 PCs. It took 56 GB of memory, which was a little unexpected. When we increased this to the top 6,346 PCs, it took 72 GB of memory. Currently, HEIG is not memory efficient for images with a resolution greater than 100,000.
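As a rough back-of-envelope aid when planning a job, the size of a single dense subjects-by-voxels array gives a floor on memory use. The sketch below is only that arithmetic; it assumes dense float32 or float64 storage and says nothing about how HEIG actually stores images or intermediate results, which is why the measured figures above are several times larger. The function name is hypothetical.

```python
def dense_array_gb(n_subjects, n_voxels, bytes_per_value=4):
    """Size in GB (10^9 bytes) of one dense n_subjects x n_voxels array.
    bytes_per_value: 4 for float32, 8 for float64 (assumed storage)."""
    return n_subjects * n_voxels * bytes_per_value / 1e9


# One float64 copy of the 15,752-subject, 59,412-vertex dataset above.
one_copy_gb = dense_array_gb(15_752, 59_412, bytes_per_value=8)
```

Comparing this floor with the available memory is a quick sanity check before running --fpca; the benchmarks above remain the better guide to actual usage.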

We have also tested --voxel-gwas for recovering voxel-level summary statistics from LDR summary statistics. With 6.6 million SNPs and 25 LDRs, it took 3, 10, 20, and 22 GB of memory to recover 2, 100, 1,000, and 15,000 voxels, respectively (4 CPUs in parallel). With a larger set of LDR summary statistics including 6.6 million SNPs and 1,750 LDRs, it took 50 GB of memory to recover 10,000 voxels (4 CPUs in parallel).