MAGMA_Mac is an extensively used tool for gene-based analysis, gene-set enrichment analysis, and gene property analysis.
This guide provides an overview of the visualization techniques used for the gene identity analysis and Gene Ontology (GO) enrichment results based on the MAGMA analysis.
- Gene Identity Analysis visualization: the gene expression levels across different tissue types. The visualization highlights the average expression values and uses a brown line to indicate the Bonferroni threshold. Genes with expression levels above this threshold are of particular interest, as they may be more strongly associated with the phenotype under study.
🔗 View the Gene Identity Analysis Script
- Gene Ontology (GO) Enrichment visualization: which helps in understanding the functional aspects of the genes of interest. It categorizes genes based on their involvement in various biological processes, cellular components, and molecular functions. Genes are grouped by their GO terms, and the significance of each term is denoted by its p-value. A brown line on the visualization represents the Bonferroni threshold, aiding in the identification of significantly enriched terms.
🔗 View the GO Enrichment Script
For more details on MAGMA and its applications, refer to the official MAGMA documentation.
- Download the Mac version of MAGMA. Download the ZIP file
- Unzip the downloaded file.
- Move the MAGMA binary to an appropriate directory and ensure directory in system path.
Reference population selection
On a Mac, perform gene annotation using the following command:
magma --annotate --snp-loc g1000_eas.bim --gene-loc gene.loc --out [OUTPUT_PREFIX]
- Consistency between reference sample and annotation: We used a reference sample named g1000_eas, which is data for the East Asian population. For accurate results, ensure that the annotation file is consistent with the population source of the reference sample.
- Parameters for the P-value file: We specified the P-value with --pval snp N=8436. The snp here should be the path to the GWAS result file, that is, [GWAS_PVAL_FILE]. Make sure the file name is correct and that the file actually exists in the current working directory.
magma --bfile g1000_eas --pval SNP N=8436 --gene-annot g1000_eas.genes.annot --out genebased
- GENE: the gene ID as specified in the annotation file
- CHR: the chromosome the gene is on
- START/STOP: the annotation boundaries of the gene on that chromosome (this includes any window around the gene applied during annotation)
- NSNPS: the number of SNPs annotated to that gene that were found in the data and were not excluded based on internal SNP QC
- NPARAM: the number of relevant parameters used in the model. For the SNP-wise models this is an approximate value; for the principal components regression (raw data default) this is set to the number of principal components retained after pruning; for the multimodels this is the mean NPARAM value of the component base models
- N: the sample size used when analysing that gene; can differ for allosomal chromosomes or when analysing SNP p-value input with variable sample size by SNP (due to missingness or differences in coverage in meta-analysis)
- ZSTAT: the Z-value for the gene, based on its (permutation) p-value; this is what is used as the measure of gene association in the gene-level analyses
- P: the gene p-value
On Mac, performing MAGMA-based gene-set enrichment analysis (GSEA) involves the following steps:
- Gene score calculation: First, a score is calculated for each gene based on the results of gene association analysis.
- Gene set enrichment analysis: Then, using the gene scores calculated in the previous step, gene set enrichment analysis was performed.
- [BINARY_PED_FILE_PREFIX] is the prefix of binary ped files (such as .bed/.bim/.fam files).
- [GENE_ANALYSIS_OUTPUT_FILE] is the path to the gene association analysis output file.
- [GENE_SCORES_OUTPUT_PREFIX] is the prefix of the gene score output file.
- [GENE_SET_FILE] is the file containing the gene set definition.
- [GENE_SCORES_OUTPUT_FILE] is the path to the gene score file.
- [GSEA_OUTPUT_PREFIX] is the prefix of the GSEA output file.
# Compute gene-level statistics
magma --bfile [BINARY_PED_FILE_PREFIX] --gene-results [GENE_ANALYSIS_OUTPUT_FILE] --out [GENE_SCORES_OUTPUT_PREFIX]
# Perform gene-set enrichment analysis
magma --gene-set [GENE_SET_FILE] --gene-scores [GENE_SCORES_OUTPUT_FILE] --out [GSEA_OUTPUT_PREFIX]
- GENESET: The name or ID of the gene set.
- NGENES: The number of genes in the gene set.
- NINDATA: The number of genes in the gene set that were also found in the gene analysis results.
- ZSTAT: The Z-value for the gene set, based on its permutation p-value.
- P: The p-value for the gene set's enrichment.
- NROT: The number of rotations (or permutations) that were run.
- P_BONF: The Bonferroni-corrected p-value.
- [GENE_PROPERTY_FILE] is a file containing gene property information.
- [GENE_SCORES_OUTPUT_FILE] is the path to the gene score file.
- [GENE_PROPERTY_OUTPUT_PREFIX] is the prefix of the gene property analysis output file.
magma --gene-property [GENE_PROPERTY_FILE] --gene-scores [GENE_SCORES_OUTPUT_FILE] --out [GENE_PROPERTY_OUTPUT_PREFIX]
- PROPERTY: The gene property being analyzed.
- NGENES: The number of genes that have this property.
- NINDATA: The number of genes with this property that were also found in the gene analysis results.
- ZSTAT: The Z-value for the gene property, based on its permutation p-value.
- P: The p-value for the gene property's association with the trait.
- NROT: The number of rotations (or permutations) that were run.
- P_BONF: The Bonferroni-corrected p-value.