MRHevo_paper.Rmd

---
title: "Inference of causal and pleiotropic effects with multiple weak genetic instruments: application to effect of adiponectin on type 2 diabetes"
#titlerunnning: "short"
date: 
author:
 - name: Paul M McKeigue\textsuperscript{\getAff{Usher}}
   affiliation: IGC
   email: paul.mckeigue@ed.ac.uk
 - name: Athina Spiliopoulou
   affiliation: Usher
 - name: Andrii Iakovliev
   affiliation: IGC
  - name: Buddhiprabha Erabadda
   affiliation: Usher
 - name: Helen M Colhoun
   affiliation: IGC
address:
 - code: Usher
   address: Usher Institute, College of Medicine and Veterinary Medicine, University of Edinburgh, Teviot Place, Edinburgh EH8 9AG, Scotland.
 - code: IGC
   address: Institute of Genetics and Cancer, College of Medicine and Veterinary Medicine, University of Edinburgh, Western General Hospital Campus, Crewe Road, Edinburgh EH4 2XUC, Scotland.
header-includes:
  \usepackage{authblk}
  \usepackage{natbib}
  \usepackage[T1]{fontenc}
  \let\origquote\quote
  \def\quote{\origquote\itshape}
  \usepackage{graphicx}
  \usepackage{longtable}
  \usepackage{booktabs}
  \usepackage{float}
  \usepackage{array}
  \usepackage{threeparttable}
  \usepackage{soul}
  \floatplacement{figure}{H}
output:
  rticles::plos_article:
    template: ../ukbb/template.tex # edited version of rticles plos template without wide left margin
    latex_engine: xelatex
    includes:
      in_header: ~/preamble.tex
  citation_package: natbib
  keep_tex: true
  fig_caption: yes
  number_sections: false
  link-citations: true
  md_extensions: -autolink_bare_uris
bibliography: "./mr.bib"
csl: "../ukbb/ajhg.csl"
always_allow_html: true
urlcolor: blue
linenumbers: false
linkcolor: cyan
abstract: |

---
<!--
MRHorse has
no regularization
C+ prior on tau as compound normal-invgamma
C+ prior on lambda_j (called phi_j) as compound normal-invgamma

gamma (called mu), deterministic given theta, alpha, beta
gamma_hat (called by) drawn as ~ N(mu, se.gamma_hat)

alpha_hat (called bx), drawn as ~ N(alpha, se.alpha_hat)

alpha_j (called bx0) drawn as from prior with global mean and variance mx0, vx0
where mx0 and vx0 are drawn from normal and half-normal priors

beta_j (called alpha) is drawn as ~ N(0, lambda^2, tau^2)
## doesn't draw from standard normal and scale afterwards

-->
```{r setup, echo=FALSE, warning=FALSE, message=FALSE}
unloadNamespace("kableExtra")
library(data.table)
library(ggplot2)
library(kableExtra)
options(knitr.kable.NA = '.')
options(knitr.table.format = "latex") # adds tab: prefix to labels
knitr::opts_chunk$set(echo = FALSE, message=FALSE, warning=FALSE, fig.pos='H')

save.dir <- "/opt/shared/project/ukbiobank/data/genetics/saves"
#load("mrhevopaper_f0.5_slabscale0.025.RData")

phenoname <- pheno.name <- "t2d.diag"
load("mrhevopaper_f0.2_slabscale0.1.RData")
load(file.path(save.dir, "somalogic.gwasids.excluded.RData"))
load("../ukbb/mr/t2d.diag_1e-06.mrhevo.genes.RData")

result.DeC <- mrhevo.list[["DeC"]]$estimators.dt[grepl("Marginaliz", Estimator)]
result.UKB <- mrhevo.list[["UKB"]]$estimators.dt[grepl("Marginaliz", Estimator)]

options(warn=1)

source("../ukbb/phenosdt.R")
long.phenoname <- phenos.dt[phenoname==pheno.name, long.phenoname]
phenofile <- phenos.dt[phenoname==pheno.name, phenofile]
gwas.hits <- readRDS(phenos.dt[phenoname==pheno.name, gwashits.file])[[1]]

```
# Abstract 
<!--200 words-->Current methods for Mendelian randomization (MR) analyses are restricted to single SNP instruments, and cannot reliably infer causality with instruments that are mostly weak and pleiotropic. We describe methods to overcome these limitations: key innovations are construction of scalar instruments from multiple SNPs, use of a regularized horseshoe prior, and hypothesis tests based on the marginal likelihood of the causal effect parameter.  To demonstrate the approach, we constructed genotypic instruments from unlinked _trans_-pQTLs detected in two large GWAS studies of plasma proteins, and tested the top 20 genes for which the aggregated effects of the instruments was associated with type 2 diabetes in the UK Biobank cohort.  The only protein with clear evidence of a causal effect on type 2 diabetes was adiponectin, encoded by _ADIPOQ_: standardized log odds ratio `r round(result.UKB$Estimate, 2)` (95% CI `r round(result.UKB$cl.lower, 2)` to `r round(result.UKB$cl.upper, 2)`) using UK Biobank instruments. These results have implications for the design and analysis of Mendelian randomization studies.  Where the exposure under study is expression of a gene, restricting the instruments to _cis_-acting variants is likely to miss causal effects.  Tests based on the marginal likelihood should supersede other methods of testing for causality in the presence of pleiotropy.  
<!-- TODO
rewrite discussion
flatten graph of model
grid plots of pairs and loglik
-->

\newpage
<!--\linenumbers-->

# Introduction
Instrumental variable analysis with genetic instruments ("Mendelian randomization") has been widely used to infer causal effects of exposures (broadly defined to include behavioural traits, biomarkers and gene expression levels) on diseases.  The biggest methodological challenge is how to infer causality when some of the genetic instruments have direct (pleiotropic) effects on the outcome that are not mediated through the exposure under study.  These pleiotropic effects are usually not directly observed, and their distribution over the instruments is unknown.  Inference of causality in the presence of pleiotropic effects has relied on makeshift procedures for the construction of estimators such as weighted medians that downweight or exclude instruments that are outliers [@bowden_consistent_2016].  The results depend on which estimator is used: recent guidelines suggest that "investigators should pick a sensible range of methods to assess the sensitivity of their findings" [@burgess_guidelines_2023]. 

The motivation for the work reported here was to infer causality from genetic instruments constructed from _trans_-QTLs that perturb expression (as levels of the transcript or encoded protein) of a gene.  Mendelian randomization studies of gene expression are usually restricted to _cis_-QTLs, on the basis that _trans_- effects are too weak and pleiotropic to be used as genetic instruments [@burgess_guidelines_2023].  Weak instruments are excluded because weak estimators constructed from ratios of estimated coefficients have unstable sampling properties where the denominators of these ratios -- the instrument-exposure coefficients -- can take values close to zero.  With recently-described Bayesian methods that use an unregularized horseshoe prior to marginalize over the distribution of pleiotropic effects to infer the causal effect [@berzuini_bayesian_2020;@grant_bayesian_2024], there is no need to exclude weak instruments.  However few real-world application of these methods have been reported. 

For studies that use _trans_-QTLs as instruments, a key limitation of all existing methods is that they cannot aggregate the effects of multiple SNPs at each locus to construct unlinked instruments that can be modelled as independent. The usual procedure is to select a single SNP that is reliably associated with the exposure from each genomic region in which genetic associations with the exposure have beeen detected.  This does not use all the available information about the effects of variants in the region on the exposure, and the results may depend upon which variants are selected. 

This paper describes methods that overcome these limitations, and demonstrates their application to infer causal effects of circulating proteins on type 2 diabetes in the UK Biobank cohort. 

## Methods
## Statistical model
For a Mendelian randomization study with $J$ unlinked genetic instruments, we specify a model with three parameters:

\begin{itemize}

\item $\boldsymbol{\alpha}$ vector of coefficients of effects of the instruments on exposure $X$. 

\item $\boldsymbol{\beta}$ vector of coefficients of direct (pleiotropic) effects of the instruments on outcome $Y$

\item $\theta$ causal effect of $X$ on $Y$

\end{itemize}

From summary statistics we have estimates $\boldsymbol{\hat{\alpha}}$, $\boldsymbol{\hat{\gamma}}$ of the effects $\boldsymbol{\alpha}$, $\boldsymbol{\gamma}$ of the $J$ instruments on the exposure and the outcome, with corresponding standard errors $\boldsymbol{s_{\alpha}}$, $\boldsymbol{s_{\gamma}}$.  We assume that these estimates have been adjusted for relevant covariates such as genetic background. As the instruments are unlinked, we can model the estimates as independent Gaussian variables conditional on the true values. 

\[
 \hat{\alpha}_j \sim \mathcal{N} \left( \alpha_j, s_{\alpha \left( j \right) }^2 \right)
\]

\[
 \hat{\gamma}_j \sim \mathcal{N} \left( \gamma_j, s_{\gamma\left( j \right) }^2 \right)
\]

The crude effect of each instrument on the outcome is the sum of the direct effect and the causal effect. 

\[
 \gamma_j = \beta_j + \theta \alpha_j
\]

With priors on $\boldsymbol{\beta}, \theta$ we can compute the joint posterior distribution of these parameters given $\boldsymbol{\hat{\gamma}}, \boldsymbol{s_{\gamma}}, \boldsymbol{\hat{\alpha}}, \boldsymbol{s_{\alpha}}$.  As the form of the distribution of pleiotropic effects over loci is unknown, any realistic statistical model has to specify a prior on these effects that encompassses a broad family of symmetric distributions ranging from a spike-and-slab mixture to a Gaussian.  One such prior is the horseshoe [@carvalho_handling_2009], which has been applied to inference of causal effects from Mendelian randomization using individual-level data [@berzuini_bayesian_2020] or summary-level data [@grant_bayesian_2024]. A limitation of the original horseshoe is that the posterior distribution is difficult to sample from because the curvature varies.  This problem is not obvious when Gibbs sampling is used [@grant_bayesian_2024], but is revealed by the divergence diagnostics that are implemented in the NUTS (No U-Turn Sampling) algorithm used in `Stan` and other probabilistic programming languages [@piironen_sparsity_2017]. 
Another limitation of the original horseshoe is that there is no way of specifying a realistic prior on the size of the nonzero effects in the "slab" component.  When modelling the genetics of complex traits, we already know that the effects of common genetic variants on a complex trait are usually small.  The regularized horseshoe prior, known as the "Finnish horseshoe", overcomes these limitations as described below. We name our method "MR-Hevo", using the Finnish word for a horse to distinguish it from the recently-described "MR-Horse" method which uses an unregularized horseshoe prior [@grant_bayesian_2024].  

Fig \ref{fig:mrhevograph} shows the model as a directed acyclic graph in plate notation. The regularized horseshoe prior for the regression coefficients $\beta_1, \dots, \beta_J$ is 
\[
\beta_j \sim \mathcal{N} \left( 0, \tau^2 \tilde{\lambda}_j^2 \right), \tilde{\lambda}_j^2 = \frac{\eta^2 \lambda_j^2}{\eta^2 + \tau^2 \lambda_j^2}
\]

Half-Cauchy priors are specified on the unregularized local scale parameters $\lambda_j$ and the global scale parameter $\tau$.  

\[
\lambda_j \sim C^+ \left( 0, 1 \right)
\]

\[
\tau \sim C^+ \left( 0, s_{\textrm{global}} \right)
\]

The heavy tail of the half-Cauchy priors on $\lambda_j$ allows some of the regression coefficients to escape the shrinkage imposed by small values of the global parameter $\tau$.  These nonzero coefficients are the slab component of the spike-and-slab distribution. 

A weakly informative prior is specified for the regularization parameter $\eta$: 

\[
\eta^2 \sim \textrm{Inverse-Gamma} \left( \frac{ \nu}{2}, \frac{\nu s^2_{\textrm{slab}}}{2} \right)
\]

Even the largest direct effects will be regularized as a Gaussian with standard deviation $\eta$.  The scaling factor $s_{\textrm{slab}}$ is specified based on prior information about the expected size of the largest direct effects.  To constrain $\eta$ so that the sampling algorithm does not diverge, we set $\nu=2$; this translates to a half-$t_{2\textrm{df}}$ prior on the scale of the largest direct effects. 

## Shrinkage coefficients and effective number of nonzero effects
Calculation of the shrinkage coefficients and the effective number of nonzero parameters does not affect the results of statistical modelling, but is helpful for interpretation of what the horseshoe prior is learning.  

The shrinkage coefficient $\kappa_j$ for the $j$th regression coefficient $\beta_j$, with prior $\sim \mathcal{N} \left( 0, \tau^2 \lambda_j^2 \right)$ and Gaussian likelihood with Fisher information $\mathcal{I}_j$ can be defined as the fraction by which the information in the prior shrinks the posterior mean of $\beta_j$ from the maximum likelihood estimate: 

\[
\kappa_j = 1 - \frac{ \tau^2 \lambda_j^2}{\tau^2 \lambda_j^2 + \mathcal{I}} = \frac{1}{1 + \tau^2 \lambda_j^2 / \mathcal{I}_j }
\]

This definition of the shrinkage coefficient ignores the regularization parameter $\eta$, which imposes a lower bound on the shrinkage of $\beta_j$.  The prior on each shrinkage coefficient has a horseshoe shape.  The shrinkage coefficients $\kappa_j$ can take values from 0 (no shrinkage) to 1 (complete shrinkage).  

$\mathcal{I}_j$ is asymptotically equivalent to the inverse variance of the maximum likelihood estimate of $\beta_j$ at $\theta = 0$.  When using summary statistics, the approximate variance of the maximum likelihood estimate of the ratio $\beta_j = \gamma_j / \alpha_j$ can be calculated by the delta method.  

The effective fraction $f$ of nonzero coefficients is 

\[
f = \frac{1}{J}\sum_{j=1}^J{\left( 1 - \kappa_j \right)}
\]

The prior expectation of $f$ is 

\[
\mathbb{E}\langle f \mid \tau \rangle = 1 - \frac{1}{J} \sum_{j=1}^J {\frac{1}{1 + \tau^2 \mathcal{I}_j}}
\]

The information $\mathcal{I}_j$ on the coefficient $\beta_j$ depends on the sample sizes in the studies from which the summary statistics $\hat{\alpha}_j$ and $\hat{\gamma}_j$ were obtained.  With more information, a smaller value of $\tau$ is required to impose the same prior expectation of $f$.  Piironen and Vehtari recommend setting the scale of the prior on $\tau$ to be consistent with this prior expectation [@piironen_sparsity_2017].  For this analysis we have set the scale of the Cauchy prior on $\tau$ to so that the prior median $\tau_0$ is the value at which the prior expectation of $f$ given $\mathcal{I}_1, \dots, \mathcal{I}_J$ is 0.2, and we have set the scale of the Inverse-Gamma  prior on $\eta^2$ as 0.1, encoding our prior knowledge that effects of any single genomic region on type 2 diabetes are of modest size.  We examined the sensitivity of the results to these prior settings as described below.  

## Computational methods
The NUTS algorithsm is implemented in several recently developed probabilistic programming languages, including `Stan`, `PyMC`, and `NumPyro`.  MR-Hevo is implemented in `NumPyro`, which has a more robust implementation of the NUTS algorithm than `Stan` giving fewer divergent transitions.  To improve posterior geometry, the Cauchy distributions are parameterized as mixture distributions (Gaussian with inverse gamma distribution of scale parameter).  A divergent transitions rate of less than 1 in 1000 is considered acceptable. This criterion was met with the regularized horseshoe, but not with the original horseshoe for which the divergent transition rate was about 5%. To obtain the marginal likelihood as a function of the parameter of interest, we simply divide the posterior by the prior on that parameter.  We calculated the likelihood of the causal effect parameter $\theta$ by fitting a kernel density to the posterior samples of $\theta$, weighting each observation by the inverse of the prior.  A quadratic function was fitted by least-squares to the logarithm of this likelihood function.  Where the log-likelihood is asymptotically quadratic, a confidence interval for $\theta$ and a test of the null hypothesis $\theta=0$ can be obtained by standard methods. Where the log-likelihood is not quadratic, inference should be based on the posterior distribution rather than on sampling theory.  

## Constructing scalar instruments from multiple SNPs
### Instrument-exposure coefficients
From summary statistics we have for each clump of exposure-associated SNPs a vector $\boldsymbol{\hat{\alpha}_u}$ of univariate coefficient estimates for the effect of the SNPs on the exposure.  Multivariable coefficient estimates $\boldsymbol{\hat{\alpha}_m}$ are calculated by premultiplying the univariate coefficients by the inverse of the correlation matrix $\boldsymbol{\Sigma_G}$ between the SNP genotypes.  

\[
\boldsymbol{\hat{\alpha}_m} = \boldsymbol{\Sigma_G^{-1}} \boldsymbol{\hat{\alpha}_u}
\]

The correlation matrix $\boldsymbol{\Sigma_G}$ is obtained from a reference panel such as 1000 Genomes.  A shrinkage penalty (equivalent to ridge regression) can be imposed by adding a penalty factor to the diagonal elements of $\boldsymbol{\Sigma_G}$.  Where the correlation matrix is singular or ill-conditioned, a pseudo-inverse can be used to calculate the multivariable coefficients.  For an individual with genotypes $\boldsymbol{G}$ at the exposure-associated SNPs, a locus-specific score $S$ predicting the exposure is calculated as $\boldsymbol{G} \cdot \boldsymbol{\hat{\alpha}_m}$.

Because the score $S$ is calculated from the genotypes and the genotype-exposure coefficients, we cannot use it as a genotypic instrument in a model of the relationship of the genotype-outcome coefficients to the genotype-exposure coefficients.  We can factor the dot product $\boldsymbol{G} \cdot \boldsymbol{\hat{\alpha}_m}$ as the product of two scalars: the magnitude of the multivariable coefficient vector $\lVert \boldsymbol{\hat{\alpha}_m} \rVert$ and a pseudo-genotype $\lVert \boldsymbol{G} \rVert \rho_{G, \boldsymbol{\hat{\alpha}_m}}$, where $\rho_{G, \boldsymbol{\hat{\alpha}_m}}$ is the correlation between $\boldsymbol{G}$ and $\boldsymbol{\hat{\alpha}_m}$, geometrically equivalent to the cosine of the angles between these vectors. We can then substitute $\lVert \boldsymbol{\hat{\alpha}_m} \rVert$ for the scalar coefficient estimate $\hat{\alpha}$ and $S / \lVert \boldsymbol{\hat{\alpha}_m} \rVert$ for the scalar instrument $Z$ as a pseudo-genotype in the statistical model.  The derived coefficient for the estimated effect of the instrument on the exposure is always positive; flipping the coding of the alleles would flip the sign of $\rho_{G,\alpha}$. We can calculate the standard error of $\hat{\alpha}$ as $1 / \sqrt{\mathcal{I}_\alpha}$ where $\mathcal{I}_{\alpha}$ is the Fisher information on the slope of the linear regression of $X$ on $Z$, given by 

\[
\mathcal{I}_{\alpha} = N_{\alpha} \frac{\textrm{Var} \left( Z \right) }{\sigma_X^2 - \hat{\alpha}^2 \textrm{Var} \left( Z \right) } \enspace,
\]

where $N_{\alpha}$ is the sample size of the study from which the estimated univariate coefficients $\boldsymbol{\hat{\alpha}_u}$ were obtained, $\textrm{Var} \left( Z \right)$ is the variance of the scalar instrument $Z$ estimated in a reference panel, and $\sigma_X^2$ is the variance of the exposure $X$.  If an estimate of $\sigma_X^2$ is not given, it can be calculated from the allele frequency $a_j$ of the $j$th SNP, the standard error $s_{G(j)}$ of the $j$th univariate coefficient estimate $\hat{\alpha}_{u(j)}$, and the sample size $N_{\alpha}$ as 

\[
\sigma_X^2 = 2 a_j \left( 1 - a_j \right) \left( N_{\alpha} s_{G(j)}^2 + \alpha_{u(j)}^2 \right)\enspace.
\] 

## Instrument-outcome coefficients
For this study, where we have individual-level data in UK Biobank for the scalar instrument $Z$ and the outcome $Y$, we estimated the coefficients for the effect of the scalar instruments $Z$ on the outcome $Y$ and its standard error by fitting logistic regression models.  A method for calculating the instrument-outcome coefficient $\hat{\gamma}$ and its standard error from summary-level data for SNP-outcome coefficients is described below.  

For each instrument we standardize the pseudo-genotypes $Z$ to unit variance and scale the coefficient estimates accordingly, so that the variance explained by the $j$th linear predictor $Z_j \hat{\alpha}_j$, which can be interpreted as the "strength of the instrument", is proportional to the square of the scalar coefficient $\hat{\alpha}_j$.  The prior on the scale of the pleiotropic effects $\boldsymbol{\beta}$ of the instruments $Z$ is thus independent of the strength of the instrument.  

Where only summary-level data are available for the genotype-outcome associations, an estimate $\hat{\gamma}$ of the coefficient for the effect of the scalar instrument $Z$ on the outcome and its standard error can be obtained using individual-level genotype data from a reference panel.  First, the vector $\boldsymbol{\hat{\gamma}_m}$ of estimated multivariable coefficients is calculated by premultiplying the estimates $\boldsymbol{\hat{\gamma}_u}$ of the univariate coefficients by $\boldsymbol{\Sigma_G^{-1}}$. The coefficient $\hat{\gamma}$ for the scalar instrument $Z$ is then estimated by minimizing the sum of the squared differences between the predictor calculated from $Z$ and the linear predictor calculated from the vector of genotypes $\boldsymbol{G}$, where the sum is taken over the genotypes of all $n$ individuals in the reference panel.  

\[
\sum_{i=1}^n \left( \gamma Z_i - \boldsymbol{G}_i \cdot \boldsymbol{\hat{\gamma}_m} \right)^2
\]

Equating to zero the derivative of this expression with respect to $\gamma$, substituting $\lVert \boldsymbol{G_i} \rVert \rho_{G, \boldsymbol{\hat{\alpha}_m}}$ for $Z_i$ and factoring the dot product $\boldsymbol{G}_i \cdot \boldsymbol{\hat{\gamma}_m}$ as $\lVert \boldsymbol{G}_i \rVert \rho_{\boldsymbol{G}_i, \boldsymbol{\hat{\gamma}_m}} \lVert \boldsymbol{\hat{\gamma}_m} \rVert$ we obtain
\[
\hat{\gamma} = \lVert \boldsymbol{\hat{\gamma}_m} \rVert \frac{\sum_{i=1}^n \rho_{\boldsymbol{G}_i, \boldsymbol{\hat{\gamma}_m}} \lVert \boldsymbol{G}_i \rVert^2 \rho_{G_i, \boldsymbol{\hat{\alpha}_m}}}{\sum_{i=1}^n \lVert \boldsymbol{G_i} \rVert^2 \rho_{G_i, \boldsymbol{\hat{\alpha}_m}}^2}
\]

The ratio in this expression can be recognized as a weighted average of the ratio $\rho_{G_i, \boldsymbol{\hat{\gamma}_m}} / \rho_{G_i, \boldsymbol{\hat{\alpha}_m}}$, with weights $Z_i^2$.  The standard error of $\hat{\gamma}$ is $1 / \sqrt{\mathcal{I}_\gamma}$ where $\mathcal{I}_{\gamma}$ is the Fisher information on the slope of the regression of $Y$ on $Z$, given (for a logistic regression model) by 

\[
\mathcal{I}_{\gamma} = N_{\gamma} \textrm{Var} \left( Z \right) p \left( 1 - p \right) 
\]

where $N_{\gamma}$ and $p$ are respectively the total sample size and the the proportion of cases in the dataset from which the coefficient estimates $\boldsymbol{\hat{\gamma}_u}$ were obtained. 

## Example: effect of adiponectin on type 2 diabetes
As the methods described above overcome some of the limitations of standard methods for design and analysis of Mendelian randomization studies, they may help to resolve contradictory results obtained in earlier studies.  As an example, we examine the effect of adiponectin, encoded by the _ADIPOQ_ gene, on the risk of type 2 diabetes.  Low adiponectin levels are strongly associated with type 2 diabetes.  Of four previous studies using Mendelian randomization to investigate whether low adiponectin levels cause type 2 diabetes, two reported no evidence of a causal effect [@yaghootkar_mendelian_2013;;@chen_effects_2020] and two reported evidence for a causal effect [@nielsen_low_2021;@zhang_causal_2022].  We obtained coefficient estimates for the effects of SNPs on circulating levels of adiponectin from two studies: 

* The DeCODE study of 35559 Icelanders in which 4719 proteins on the SomaLogic v4 panel were measured in plasma [@ferkingstad_largescale_2021].  `r length(bad.gwasids)` aptamers on this platform that appeared to cross-react with complement factor H were excluded.  The criteria for identifying these aptamers were: a _trans_-pQTL at the _CFH_ locus and no _cis_-pQTL for the protein that the aptamer was designed to detect and association of the _trans_- score with age-related macular degeneration. 

* The UK Biobank proteomics study of 54306 individuals in which 2923 proteins on the Olink Explore panel were measured in plasma [@sun_plasma_2023]

The calculated standard deviation of adiponectin levels was `r mrhevo.list[["DeC"]]$scoresinfo.gene[1, round(sqrt(median.totalvar.study)[1], 2)]` in the DeCODE dataset and `r mrhevo.list[["UKB"]]$scoresinfo.gene[1, round(sqrt(median.totalvar.study)[1], 2)]` in the UK Biobank dataset.  In each genomic region where there was at least one SNP-exposure association with $p < 10^{-6}$, all SNPs with $p < 10^{-6}$ separated by no more than 1 Mb were included in the calculation of the multivariable coefficient vector. This yielded `r mrhevo.list[["DeC"]]$scoresinfo.gene[, .N]` unlinked genotypic instruments from the DeCODE study, calculated from `r mrhevo.list[["DeC"]]$scoresinfo.gene[, sum(n_snps)]` SNPs, and `r mrhevo.list[["UKB"]]$scoresinfo.gene[, .N]` instruments from the UK Biobank proteomics study, calculated from `r  mrhevo.list[["UKB"]]$scoresinfo.gene[, sum(n_snps)]` SNPs.  Each instrument was annotated with a list of nearby genes, and the genes on each list were matched to a list of 256 genes for which SNP associations with type 2 diabetes have been reported at $p < 5 \times 10^{-8}$, extracted from GWAS Catalog.  

The target dataset with genotypes and type 2 diabetes status was the UK Biobank cohort.  Ethical approval for the UK Biobank study was granted in 2011 by the North West Multi-centre Research Ethics Committee (11/NW/0382), and renewed every five years since then. Informed consent was obtained for all participants in UK Biobank.  The work described herein was approved by the UK Biobank under application number 23652. 

Separate analyses were undertaken using the instruments constructed from these two proteomics studies.  When using instruments obtained from the UK Biobank proteomics study, the 54306 participants who were in the proteomics study were excluded from the target dataset.

Because the individuals in the proteomics studies were predominantly of European ancestry, the target dataset was restricted to individuals of European ancestry.  Of `r mrhevo.list[[1]]$N.indiv` unrelated individuals with nonmissing phenotype and genotype data, `r mrhevo.list[[1]]$N.cases` were classified as cases of `r long.phenoname` and the all others were classified as noncases. 
	
For each instrument, the pseudo-genotypes were standardized to unit variance and a logistic regression model was fitted with type 2 diabetes as outcome variable.  The other covariates were sex and the first five principal components of the SNP relationship matrix.  The coefficients for the effect of each scalar instrument on type 2 diabetes were used to fit the model.  _Cis_-pQTLs were excluded.  

For comparison with inference based on the marginal likelihood of the causal effect parameter, three widely-used estimators for the causal effect were calculated from the coefficient ratios: the inverse-variance weighted mean of the coefficient ratios which assumes no pleiotropy, the weighted median, and the penalized weighted median which downweights outliers [@bowden_consistent_2016].  The coefficient ratio estimates and their standard errors were calculated by the delta method, based on second-order Taylor expansions for the moments of the distribution of the ratio of two independent normally-distributed variables.  Standard errors for the weighted median and penalized weighted median estimators were calculated by a parametric bootstrap method.  <!--The method described in the original publication on weighted median estimators [@bowden_consistent_2016], and implemented on the MR-Base platform, is not correct as it simulates from the observed coefficient estimates rather than from the distribution that generates these observations. --> To generate the sampling distribution of the weighted median, we used 4000 draws from the posterior distribution of the regularized shrinkage parameters $\tilde{\lambda}_j$ and the instrument effects $\alpha_j$.  At each draw we sampled the direct effects $\beta_j$ of $J$ new genetic instruments, and sampled the coefficient ratio estimates $\hat{\gamma}_j$ (calculated by the delta method) conditional on $\beta_j$, $\alpha_j$ and the standard errors s $\hat{\alpha}, \hat{\beta}$ of these estimates.  The standard deviation of the weighted median of the coefficient ratio estimates obtained from this posterior predictive distribution was calculated and used to obtain _p_-values and confidence intervals. 

## Results 
Table \ref{tab:coeffs} shows summary statistics for the genetic instruments for adiponectin derived from each pQTL study.  There is suprisingly little overlap between the `r mrhevo.list[["DeC"]]$scoresinfo.gene[qtl_type=="trans", .N]` _trans_-pQTLs detected in the DeCODE study and the `r mrhevo.list[["UKB"]]$scoresinfo.gene[qtl_type=="trans", .N]` _trans_-pQTLs detected in UK Biobank.  Only eleven _trans_-pQTLs -- gene-poor regions at 219.4 Mb on chromosome 1, 34.2 Mb on chr 6, 139.5 Mb on chromosome 6, and regions containing _ALAS1_, _ADRB1_, _SPON1_, _PDE3A_, _ABCB9_, _CDH13_, _AKR1B1P7_, and _MIR6813_ -- were detected in both studies at a threshold of $p < 10^{-6}$.  This may reflect the low power to detect weak _trans_-pQTLs even in these large datasets.  In both studies there is a _cis_-QTL at the _ADIPOQ_ transcription site.  In the DeCODE study an extended _cis_-pQTL is detected 2 Mb upstream of the transcription site.  Only two pQTLs -- _SPON1_ and _APOC1_ -- contained genes that were listed in GWAS Catalog as associated with type 2 diabetes.  

Figure \ref{fig:plot.instruments} shows that `r  mrhevo.list[["DeC"]]$coeffs.dt[qtl_type=="trans" & sign(theta_IV)==-1, .N]` of `r  mrhevo.list[["DeC"]]$coeffs.dt[qtl_type=="trans", .N]` _trans_-pQTL instruments from DeCODE and `r  mrhevo.list[["UKB"]]$coeffs.dt[qtl_type=="trans" & sign(theta_IV)==-1, .N]` of `r  mrhevo.list[["UKB"]]$coeffs.dt[qtl_type=="trans", .N]` _trans_-pQTL instruments from UKBB are inversely associated with type 2 diabetes. The _cis_-pQTL instruments are not associated with type 2 diabetes.  

Table \ref{tab:estimates} shows the estimates of the causal effect parameter obtained by each method, as log odds ratios for unit change in adiponectin levels.  The maximum likelihood estimates obtained by marginalizing over direct effects were `r round(result.DeC$Estimate, 2)` (95% CI `r round(result.DeC$cl.lower, 2)` to `r round(result.DeC$cl.upper, 2)`) using DeCODE instruments and `r round(result.UKB$Estimate, 2)` (95% CI `r round(result.UKB$cl.lower, 2)` to `r round(result.UKB$cl.upper, 2)`) using UK Biobank instruments.  In comparison with the weighted median or penalized weighted median estimators, the confidence intervals obtained from the likelihood function were wider and the _p_-values were more conservative.  

Supplementary Figure \ref{fig:densityukbb} shows the posterior distribution and log-likelihood of the causal effect parameter using UK Biobank instruments.  As expected with this large sample size, the posterior distribution is approximately Gaussian and the log-likelihood is well approximated by a quadratic function.  Supplementary Table \ref{tab:posteriorsummaries} shows posterior summaries of the model parameters.  For each of the two sets of instruments, the prior median of the global shrinkage parameter $\tau$ was set to a value that translated to an expectation of 0.2 for the effective fraction $f$ of instruments with nonzero pleiotropic effects.  The posterior medians of $\tau$ and $f$ were much lower than the prior medians.  Consistent with this low estimate of the fraction of instruments that have nonzero pleiotropic effects, Figure \ref{fig:plot.betakappa} shows that only a few instruments have clearly escaped shrinkage: _of the DeCODE instruments these were APOC1_ _CDH13_ and _ABCB9_, and of the UK Biobank instruments these were _CDH13_, _ABCB9_ and _CCND2_. 

Specifying different values for the scale of the prior on the global shrinkage parameter $\tau$ which regulates the number of nonzero pleiotropic effects, or for the scale of the prior on the regularization parameter $\eta$ which regulates the size of these effects, did not appreciably alter the results.  

Fig \ref{fig:weightedmedian} compares, for the top 20 genome-wide aggregated _trans_- effect scores associated with type 2 diabetes in the UK Biobank cohort, the _p_-values calculated from the weighted median and penalized weighted median estimators (using the procedure described above to calculate the standard deviation of these estimators), with _p_-values based on the marginal likelihood.  _ADIPOQ_ was the only gene for which _p_-values based on the marginal likelihood provided strong evidence of a causal effect. The p-values obtained from tests based on the sampling distribution of the weighted median were broadly concordant with those obtained from tests based on the marginal likelihood but were less extreme.  This is to be expected, as the median is less efficient than the maximum likelihood estimate.  

# Discussion
Mendelian randomization analysis, combined with related approaches including transcriptom-wide association studies and co-localization analysis, has been widely used to infer causal effects of gene expression on outcomes.  The most widely-used methods use summary statistics for the coefficients for the effect of each SNP on the outcome and on the exposure from non-overlapping samples [@hemani_mrbase_2018].  "Estimators" for the causal effect are constructed from the ratios of SNP-outcome to SNP-exposure coefficients. Bayesian methods for marginalizing over the unobserved pleiotropic effects to infer the causal effect parameter in a Mendelian randomization study have been described previously [@berzuini_bayesian_2020;@grant_bayesian_2024].  The methods described in this paper extend this approach: (1) by allowing unlinked scalar genetic instruments to be constructed from all SNPs associated with the exposure without having to restrict to one SNP from each genomic region; (2) by using a regularized horseshoe prior which allows encoding of prior knowledge that genetic effects of a single genomic region on a complex trait are usually small; and (3) by constructing classical hypothesis tests based on the likelihood of the causal effect parameter.  
<!--The method used in the original paper to calculate the sampling distributions of these test statistics was described as a "parametric bootstrap".  Parametric bootstrapping assumes that the data come from a known distribution, estimates the parameters of that distribution from the data, and uses this estimated distribution to draw samples from that distribution under the null.  Applying this to Mendelian randomization, this would entail simulating a direct effects of a new set of $J$ instruments from a mixture of spike-and-slab mixture distributions, and generating the sampling distribution of the coefficient ratios. The code released with the original publication [@bowden_consistent_2016] and used on the MRBase platform however simulates the posterior distribution of the pleiotropic effects of the $J$ instruments under the null, given the observed summary statistics (maximum likelihood estimate and standard error) for each instrument.  This does not correspond to a parametric bootstrap, or to any other valid procedure for statistical inference, as it simulates from the observations rather than from the distribution from which the observations come.  Not surprisingly, this yields standard errors that are too small and an inflated Type 1 error rate.  -->

The difficulties associated with using ratios of coefficients to construct estimators are eliminated in a Bayesian framework, where nothing is "estimated".  Where the log-likelihood is asymptotically quadratic, statistical theory guarantees that the maximum-likelihood estimate has the sampling properties that are desirable for an estimator: consistency, minimum variance and unbiasedness. Where the log-likelihood is not approximately quadratic, it is preferable to base inference directly on the likelihood function rather than on the sampling properties of an estimator. Because Bayesian inference does not rely on the sampling properties of ratios, there is no need to exclude "weak instruments".  This makes it possible to integrate Mendelian randomization into genome-wide aggregated _trans_- effects analysis, which attempts to identify core genes on which the polygenic effects of many variants coalesce via _trans_- effects to influence complex traits [@iakovliev_genomewide_2023]. In this approach, the genetic instruments are clumps of SNPs with _trans_- effects on the expression of a gene as transcript or circulating protein.  

Using instruments constructed from _trans_-pQTLs in two different studies that used different assays for the protein, there is clear support for a causal effect of low adiponectin levels on type 2 diabetes.  This is apparent even without a formal statistical analysis, as the signs of the effects of the instruments are almost all negative.  Earlier studies that failed to detect a causal effect were restricted to _cis_-acting SNPs [@yaghootkar_mendelian_2013;;@chen_effects_2020]. Some of these _cis_- acting SNPs affect not only the measured levels but also the splicing of the gene product [@lee_functional_2016], so their effects on measured adiponectin levels do not necessarily correspond to effects on the functional protein.  The most direct human evidence for a causal effect of low adiponectin on type 2 diabetes comes from reports of families in which type 2 diabetes segregated with rare loss-of-function variants in _ADIPOQ_ [@cook_hypoadiponectinemia_2010].  In the obese mouse model, overexpression of adiponectin in the liver protected against diabetes [@yamauchi_globular_2003].  Though adiponectin analogues have been studied experimentally, none has reached the clinical stage of drug development.  Misleading results from _cis_- Mendelian randomization studies may have discouraged development of a possible drug target.

Where, as in this example, the exposure under study is a gene transcript or a circulating protein, study designs that use only _cis_- acting variants as genetic instruments have been advocated [@schmidt_genetic_2020].  The rationale for thse "_cis_-MR" designs is that _cis_- acting variants are less likely to have pleiotropic effects on the outcome that are not mediated through the level of the circulating protein.  This supposition is questionable: especially where _cis_-acting variants affect the splicing of the transcript, the relationship between measured levels and functional activity of the protein may be broken.  More fundamentally, restriction to _cis_-QTLs focuses on the genes least relevant to disease risk.  Most disease/trait-associated SNPs are not co-localized with _cis_-eQTLs detected in relevant cell types [@chun_limited_2017].  Recent studies have shown that disease-relevant genes are enriched with redundant enhancer domains and depleted of _cis_-eQTLs of large effect, as would be expected if perturbation of these genes has large effects on fitness [@wang_enhancer_2020; @mostafavi_systematic_2023]. Most of the SNP heritability of gene expression is attributable to _trans_- effects [@ouwens_characterization_2020].     

Bayesian hypothesis testing is based on the Likelihood Principle: all information conveyed by observations that supports one model or one parameter value over another is contained in the ratio of the likelihoods of these models or parameter values.  The likelihood-based approach to causal inference described in this paper is aligned with the not yet universally-accepted principle that causal inference is just a special case of statistical inference, requiring attention to assumptions about exchangeability between observations (of exposure and outcome) and predictions of the effect of perturbing the exposure [@rohde_causal_2022].  In this decision-theoretic framework there is no need to invoke counterfactuals [@dawid_causal_2001] or to require causal effects to be expressed as marginal rather than as conditional effects.  In this example, the likelihood of a model with a causal effect parameter and only a few nonzero pleiotropic effects is higher than the likelihood of a model with no causal effect that requires more nonzero pleiotropic effect parameters to adjust to the data to obtain the same fit.  Thus even though the causal effect parameter is not identifiable because for any setting of the causal effect parameter we can find a setting of pleiotropic effect parameters that has the same fit to the data, a formal hypothesis test can be obtained.  With horseshoe priors, which allow the distribution of the pleiotropic effects to be learned from the data, this model comparison is performed "under the hood" [@carvalho_horseshoe_2010], without the computational inconvenience of having to average over all possible partitions of the variables into two disjoint sets (spike-and-slab) as in the contamination mixture model [@burgess_robust_2020] or over different settings of the number of nonzero effect parameters [@xue_constrained_2021].  A corollary of inference based on model comparison is that reliable inference of causality in the presence of pleiotropy requires a relatively large number of unlinked genetic instruments to allow learning the distribution of pleiotropic effects.  

The statistical model assumes that the effects of the instruments on the exposure and their direct effects on the outcome are independent in magnitude and direction: this has been denoted the InSIDE (Instrument Strength Independent of Direct Effect) assumption [@bowden_mendelian_2015a] though this term has also been used for the less stringent assumption that "the magnitude of the pleiotropic effects are independent of the SNP-exposure associations" [@bowden_difficulties_2017]. With the procedure described here for constructing scalar instruments, the estimates for the coefficients of the instrument-exposure effects can take only positive values (a negative effect would simply correspond to flipping of the allele coding), and the direct effects are assumed to arise from a distribution with zero mean. This assumption that the instrument-exposure effects and the direct effects are uncoupled in direction is termed "balanced pleiotropy".  It is possible in principle to infer a causal effect if the instrument-exposure effects and the direct effects are assumed to be uncoupled in magnitude even if they are coupled in direction [@bowden_mendelian_2015a] but it is not obvious why this would be a biologically realistic assumption.  

If the statistical model is extended to allow the instrument-exposure effects and the direct effects to be coupled in magnitude and direction, it is not possible to infer causality without other information about pleiotropy because models with and without a causal effect can fit the data equally well with the same number of adjustable parameters. It may be more useful to formulate and test possible hypotheses about mechanisms that could give rise to coupling of instrument-exposure effects with direct effects on the outcome, as this may identify a shared aetiological pathway.  For adiponectin and type 2 diabetes, for instance, coupling could arise if the pQTLs for adiponectin are QTLs for a trait such as obesity that causes low adiponectin levels and increased risk of type 2 diabetes. In this situation other circulating proteins such as leptin that are biomarkers for obesity would show a similar pattern of apparently causal pQTL effects on type 2 diabetes.  More generally, pQTLs that have pleiotropic effects should be detectable on a heat map of correlations between genotypic scores for multiple proteins or transcripts.   

Guidelines for the design of Mendelian randomization studies have recently been updated [@burgess_guidelines_2023].  Based on the work described here, some suggestions for revisions to these guidelines are listed below.  

* For inference of causal effects where pleiotropic effects may be present, the optimal method is to compute the likelihood of the causal effect parameter by marginalizing over the distribution of pleiotropic effects.  This is robust to the inclusion of weak instruments, and is computationally efficient with modern tools for Bayesian computation. Investigators should not "pick a range of methods", but assess the sensitivity of their results to prior assumptions about the distribution of unobserved pleiotropic effects.  There is no advantage to using tests based on the sampling distribution of estimators such as the weighted median, when the likelihood can be calculated as a function of the parameter of interest. 

* There is no need to restrict the selection of SNPs to one SNP from each genomic region: the methods described here allow scalar instruments to be constructed from multiple SNPs. This maximize the strength of the genetic instruments that can be obtained for each genomic region, and ensures that the results do not depend upon an arbitrary choice of which SNP to select from each region.  

* Study designs that rely on using variants from a single genomic region, for instance using only _cis_-acting variants to study the effects of a gene transcript or gene product on a disease or trait, are likely to give misleading results and should be deprecated. Because _cis_- acting variants are likely to affect the function of the gene through mechanisms (such 
as altered splicing) other than the measured levels of the transcript or gene product, it is preferable to report the effects of a _cis_-QTL separately.  

# Declarations 
# Data and code availability
A Python script using `NumPyro` to fit the MR-Hevo model, together with the summary-level data used in this paper, is available at https://github.com/molepi-precmed/mrhevo.

# Acknowledgements 
No specific funding was received for this work. This research has been conducted using the UK Biobank Resource under application number 23652. The development of the GENOSCORES platform was supported by a Springboard Award (SBF006/1109) to AS from the Academy of Medical Sciences, supported in turn by the Wellcome Trust, the UK Government Department of Business, Energy and Industrial Strategy, the British Heart Foundation, and Diabetes UK. AI was supported by the Medical Research Council Cross Disciplinary Fellowship (XDF) Programme (MC_FE_00035). HC is supported by an endowed chair from the AXA Research Foundation.  

# Declaration of interests
The authors declare no competing interests.

# Web resources
GWAS Catalog: https://www.ebi.ac.uk/gwas/

# References{-}
<div id="refs"></div>

\newpage

# Figures


```{r mrhevograph, echo=FALSE, out.width="1.0\\linewidth", fig.cap="\\label{fig:mrhevograph}MR-Hevo model as a directed acyclic graph in plate notation.  Stochastic nodes are shown as ellipses with continuous borders, deterministic nodes as ellipses with dashed borders. Observed nodes are shaded."}

knitr::include_graphics("mrhevograph.pdf")

```

```{r plot.instruments, echo=FALSE, fig.width=6, fig.height=9, fig.cap=paste0("\\label{fig:plot.instruments}Plot of coefficients of regression of type 2 diabetes on each instrument against coefficients of regression of adiponectin levels on each instrument. Size of each data point is inversely proportional to the standard error of the ratio estimate. The value of each estimator is shown as the slope of a line passing through the origin. \\textit{Cis}-pQTLs are excluded from these estimates.")}

studies <- c("DeCODE", "UK Biobank")
p.coeffs.bystudy <- cowplot::plot_grid(mrhevo.list[["DeC"]]$p.coeffs + coord_fixed(ratio=1.5),
                                       mrhevo.list[["UKB"]]$p.coeffs + coord_fixed(ratio=1.5) +
                                       theme(legend.position="none"),
                                       ncol=1, labels=studies)

options(warn=-1)
p.coeffs.bystudy

```

\newpage

```{r plot.betakappa, echo=FALSE, fig.width=6.5, fig.height=9, fig.cap="\\label{fig:plot.betakappa}: Effects of trans-QTLs for adiponectin on type 2 diabetes: posterior medians and 80% credible intervals for direct effects and shrinkage coefficients"}

coeffs.beta <- mrhevo.list[["DeC"]]$beta.coeffs
coeffs.beta[, qtlname := factor(qtlname, levels=qtlname)]
p.beta.DeC <- ggplot(data=coeffs.beta,
                      aes(y=qtlname, x=`50%`, xmin=`10%`, xmax=`90%`)) +
    geom_pointrange() +
    xlab("Pleiotropic (direct) effect") + ylab("Trans-pQTL")

coeffs.kappa <- mrhevo.list[["DeC"]]$kappa.coeffs
coeffs.kappa[, qtlname := factor(qtlname, levels=qtlname)]
p.kappa.DeC <- ggplot(data=coeffs.kappa,
                      aes(y=qtlname, x=`50%`, xmin=`10%`, xmax=`90%`)) +
    geom_pointrange() +
    theme(axis.text.y = element_blank(), axis.title.y=element_blank()) + 
    xlab("Shrinkage coefficient")

coeffs.beta <- mrhevo.list[["UKB"]]$beta.coeffs
coeffs.beta[, qtlname := factor(qtlname, levels=qtlname)]
p.beta.UKB <- ggplot(data=coeffs.beta,
                      aes(y=qtlname, x=`50%`, xmin=`10%`, xmax=`90%`)) +
    geom_pointrange() +
    xlab("Pleiotropic (direct) effect") + ylab("Trans-pQTL")

coeffs.kappa <- mrhevo.list[["UKB"]]$kappa.coeffs
coeffs.kappa[, qtlname := factor(qtlname, levels=qtlname)]
p.kappa.UKB <- ggplot(data=coeffs.kappa,
                  aes(y=qtlname, x=`50%`, xmin=`10%`, xmax=`90%`)) +
    geom_pointrange() +
    theme(axis.text.y = element_blank(), axis.title.y=element_blank()) + 
    xlab("Shrinkage coefficient")

cowplot::plot_grid(p.beta.DeC, p.kappa.DeC,
                   p.beta.UKB, p.kappa.UKB, nrow=2, rel_heights=c(1, 1.5),
                   labels=list("", "DeCODE", "", "UK Biobank"))

```

\newpage

# Tables

```{r fit.coeffs, echo=FALSE}

fit.summarystats.study1 <- mrhevo.list[[1]]$global.summarystats[match(c("tau", "eta", "f", "theta"), param)]
fit.summarystats.study1[, variable := c("Global shrinkage $\\tau$",
                                        "Regularization $\\eta$",
                                        "Fraction nonzero pleiotropic effects $f$",
                                        "Causal effect $\\theta$")]
fit.summarystats.study1[, priormedian := c(round(mrhevo.list[[1]]$tau0, 2),
                                           mrhevo.list[[1]]$slab_scale, 0.2, 0)]
fit.summarystats.study2 <- mrhevo.list[[2]]$global.summarystats[match(c("tau", "eta", "f", "theta"), param)]
fit.summarystats.study2[, variable := c("Global shrinkage $\\tau$",
                                        "Regularization $\\eta$",
                                        "Fraction nonzero pleiotropic effects $f$",
                                        "Causal effect $\\theta$")]
fit.summarystats.study2[, priormedian := c(round(mrhevo.list[[2]]$tau0, 2),
                                           mrhevo.list[[2]]$slab_scale, 0.2, 0)]

fit.summarystats.bystudy <- rbind(fit.summarystats.study1, fit.summarystats.study2)
fit.summarystats.bystudy <- fit.summarystats.bystudy[, .(variable, priormedian,
                                                         q50=round(`50%`, 3), 
                                                         q10=round(`10%`, 3), q90=round(`90%`, 3))]

#fit.coeffs.bystudy <- fit.coeffs.bystudy[variable != "lp__"]
kable(fit.summarystats.bystudy, 
      escape=FALSE,
      #linesep = c(rep("", nrow(fit.coeffs.studies[[1]][[1]]) - 1), "\\addlinespace"),
      col.names = c("Parameter", "Prior median", "Median", "10th centile", "90th centile"),
	  label="posteriorsummaries",
      booktabs=TRUE,
      longtable=FALSE,
      caption="Posterior summaries of global parameters of MR-Hevo models of the effect of instruments for adiponectin levels on type 2 diabetes") %>%
    add_header_above(c(" "=2, "Posterior distribution"=3)) %>%
    group_rows("DeCODE", 1, 4) %>% 
    group_rows("UK Biobank", 5, 8) %>% 
  kable_styling(latex_options=c("HOLD_position")) # , "scale_down")) 

```
\newpage

```{r table.estimates, echo=FALSE, warning=FALSE, message=FALSE}

methods.dt <- rbindlist(lapply(mrhevo.list, function(x) x$estimators.dt))
methods.dt[, Estimator := gsub("ize over direct effects", " likelihood", Estimator)]
kable(methods.dt[, .(Estimator,
                     round(Estimate, 2),
                     CI=paste0(round(cl.lower, 2), ", ", round(cl.upper, 2)), 
                     pvalue.formatted)],
      linesep = c("", "", "", "\\addlinespace"),
      escape=FALSE,
      booktabs=TRUE,
      longtable=FALSE,
      col.names=c("Estimator", "Estimate", "95\\% CI", "\\ensuremath{p}-value"),
      label="estimates",
      caption=paste("Effect of adiponectin on type 2 diabetes: comparison of causal effect parameter estimates obtained with different methods")) %>%
    group_rows("DeCODE instruments", 1, 4) %>% 
    group_rows("UK Biobank instruments", 5, 8) %>% 
    kable_styling(latex_options=c("HOLD_position")) 

```

\beginsupplement
\newpage

# Supplementary information{-}

## Supplementary Figures

```{r pairs.instruments, echo=FALSE, out.width="1.0\\linewidth", fig.cap="\\label{fig:pairs}Pairs plot of posterior samples using UK Biobank instruments for adiponectin levels"}

gwasids <- c(1701717, 1705224)

#knitr::include_graphics(paste0("../ukbb/", pheno.name, "_", "ADIPOQ", "_",
#                               gwasids[1], "_mrhevo_pairsplot.pdf"))

knitr::include_graphics(paste0("../ukbb/", pheno.name, "_", "ADIPOQ", "_",
                               gwasids[2], "_mrhevo_pairsplot.pdf"))

```

\newpage

```{r densitytheta, echo=FALSE, fig.width=4, fig.height=7, fig.cap="\\label{fig:densityukbb}Posterior density and log-likelihood of causal effect parameter $\\theta$ for effect of adiponectin on type 2 diabetes, using UK Biobank instruments"}

#mrhevo.list[["DeC"]]$p.loglik

mrhevo.list[["UKB"]]$p.bayesian.loglik

```

\newpage

```{r weightedmedian, echo=FALSE, fig.width=7, fig.cap="\\label{fig:weightedmedian}Comparison of p-values from weighted median estimators with p-values from marginal likelihood, for top 20 associations of genome-wide aggregated \\textit{trans}- scores with type 2 diabetes in UK Biobank cohort"}

mrhevo.genes[, minuslog10p := -log10(pvalue)]
mrhevo.genes.pvalue <- dcast(mrhevo.genes, gene + study + gwasid ~ Estimator,
                             value.var="minuslog10p")

options(warn=1)
p.wm <- ggplot(mrhevo.genes.pvalue, aes(x=`Marginalize over direct effects`,
                                y=`Weighted median`, label=gene)) +
    geom_point() + coord_fixed() +
    ggrepel::geom_text_repel(color="blue", fontface="italic", size=2) + 
    geom_abline(intercept=0, slope=1, linetype="dotted") +
    xlab("-log10 p: marginal likelihood") +
    ylab("-log10 p: weighted median") 

p.pwm <- ggplot(mrhevo.genes.pvalue, aes(x=`Marginalize over direct effects`,
                                y=`Penalized weighted median`, label=gene)) + 
    geom_point() + coord_fixed() +
    ggrepel::geom_text_repel(color="blue", fontface="italic", size=2) + 
    geom_abline(intercept=0, slope=1, linetype="dotted") +
    xlab("-log10 p: marginal likelihood") +
    ylab("-log10 p: weighted median") 

cowplot::plot_grid(p.wm, p.pwm, ncol=2)

```

\newpage
## Supplementary Tables

\small

```{r table.coeffs, echo=FALSE, warning=FALSE, message=FALSE}

shorten.string.commas <- function(x) {
    if(!is.na(x)) {
        x <- trimws(unlist(strsplit(x, split=",")))
        L <- length(x)
        x.dt <- data.table(x)
        x.dt[, rownum := .I]
        x.dt[, gwas.hit := x %in% gwas.hits]
        x.dt[gwas.hit==TRUE, x := paste0("\\textbf{", x, "}")]
        x.dt <- x.dt[rownum < 3 | .N - rownum < 3 | gwas.hit]
 #       if(x.dt[, .N] > 5) {
 #           x <- c(x.dt[1:2, x], "...", x.dt[(L - 1):L, x])
 #       }
        x <- paste(x.dt$x, collapse=",")
    }
    return(x)
}
    
shorten.list.commas <- function(v) {
    z <- sapply(X=v, FUN=shorten.string.commas)
    return(z)
}

coeffs.table <- rbindlist(lapply(mrhevo.list, function(x) x$coeffs.dt))[, .(study, chrom_int,
                                         startpos=round(startpos * 1E-6, 1),
                                         endpos=round(endpos * 1E-6, 1),
                                         gsub(",", ", ",
                                              gsub("_", "\\\\_",
                                                   shorten.list.commas(nearby_genes))),
                                         round(gamma_hat, 3),
                                         round(se.gamma_hat, 3), 
                                         round(alpha_hat, 4),
                                         round(se.alpha_hat, 4),
                                         theta=round(theta_IV, 3))]
setorder(coeffs.table, study, chrom_int, startpos)
kable(coeffs.table,
      escape=FALSE,
      linesep = c(rep("", nrow(mrhevo.list[[1]]$coeffs.dt) - 1), "\\addlinespace"),
      booktabs=TRUE,
      longtable=TRUE,
      col.names=c("Study", "Chr", "Start (Mb)", "End (Mb)", "Nearby genes",
                  "$\\hat{\\gamma}$", "SE($\\hat{\\gamma}$)",
                  "$\\hat{\\alpha}$", "SE($\\hat{\\alpha}$)",
                  "$\\hat{\\theta}$"),
      label="coeffs",
      caption=paste("Summary statistics for regression of type 2 diabetes and adiponectin levels on genetic instruments")) %>%
  footnote(general_title="",
           threeparttable=TRUE,
           footnote_as_chunk = TRUE,
           general=paste("Genes in bold are those for which GWAS Catalog lists associations  with type 2 diabetes detected at $p < 5 \\\\times 10^{-8}$."),
          escape=FALSE) %>% 
  column_spec(1, width="0.8cm") %>% 
  column_spec(2, width="1cm") %>% 
  column_spec(3, width="1cm") %>% 
  column_spec(4, width="1cm") %>% 
  column_spec(5, width="5cm", italic=TRUE) %>% 
    kable_styling(latex_options=c("HOLD_position", "scale_down")) 

```
\normalsize