Replies: 5 comments 7 replies
-
(Partial) derivative chromosome (i.e. haplotype) reconstruction does require variant ordering to be unambiguous. We could support both ordered and unordered haplotypes, but then we're changing our definition of a haplotype. From my perspective, unordered is just an implicit linear ordering of the Alleles with respect to the reference. Is this not the case? When SVs are present, you can't assume the standard textbook definition of one maternal + one paternal copy = 2 haplotypes. So what does haplotype actually mean then? For example, are de novo mutations on 2 different maternal copies of a trisomic chromosome considered the same haplotype? (No.) What if there's one maternal chromosome but the region has been duplicated? (Yes?) Do the variants in the same duplicated region belong on the same haplotype or on different haplotypes? (Same.) None of this is made explicit in VRS and, given we have SV support, we need to define haplotype in more detail.
Note that our existing definition of a haplotype as based on the same physical molecule means that all VRS haplotypes implicitly require the caller to have verified the absence of any structural rearrangement that could separate the variants onto different physical molecules. For example, if you have fully phased (T2T) SNVs and a balanced chromosomal translocation is present, you need to break up your haplotype blocks because the variants are on different physical molecules. Is this what we intend? There are other definitions out there. If I say I've got fully phased maternal/paternal haplotypes, then I'm making a claim about the inheritance pattern of those variants, not about the relationship between the physical molecules they occur on. Without SVs, these two definitions mean the same thing. With SVs, they do not.
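To make the translocation case concrete, here is a minimal sketch of what "breaking up a haplotype block" could look like under the physical-molecule definition. The `PhasedVariant` class and `split_block_at_breakpoint` function are hypothetical and not part of VRS; the sketch assumes a single breakpoint and a block whose variants all sit on one reference chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhasedVariant:
    """Hypothetical stand-in for a phased SNV on one reference chromosome."""
    pos: int   # reference coordinate
    alt: str   # alternate base


def split_block_at_breakpoint(block, breakpoint):
    """Split a phased block into the variants on either side of a breakpoint.

    Assumption: after a balanced translocation, variants on opposite sides
    of the breakpoint end up on different derivative chromosomes, so under
    a physical-molecule definition they no longer share a haplotype.
    """
    before = [v for v in block if v.pos <= breakpoint]
    after = [v for v in block if v.pos > breakpoint]
    return before, after


block = [PhasedVariant(100, "A"), PhasedVariant(500, "T"), PhasedVariant(900, "G")]
before, after = split_block_at_breakpoint(block, breakpoint=600)
```

Under the inheritance-pattern definition, by contrast, all three variants would stay in one maternal (or paternal) haplotype and no split would be needed.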
-
I'm in favor of a single, ordered haplotype model, but I'm interested in hearing more about the perceived benefits of creating a separate, unordered haplotype model. My primary concerns are that it would create unnecessary complexity and more situations where there are multiple ways to represent the same haplotype. Even if we defined an unordered haplotype model, we can't escape the need for an implicit ordering if we want unordered haplotypes to have consistent identifiers, regardless of the order in which their alleles are specified. I think in VRS 1.3 this was done by ordering arrays of digests by Unicode character set values.
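As a rough illustration of the implicit-ordering point, here is a minimal sketch that assumes member digests are plain strings and that the identifier is a truncated SHA-512 of the joined digests. The function names and the comma-joined serialization are assumptions for illustration only, not the actual VRS serialization.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """First 24 bytes of the SHA-512 digest, base64url-encoded."""
    return base64.urlsafe_b64encode(hashlib.sha512(blob).digest()[:24]).decode("ascii")


def unordered_id(member_digests: list[str]) -> str:
    """Identifier that ignores input order (hypothetical helper).

    Sorting imposes the implicit ordering: Python compares str values
    code point by code point, so any permutation of the same digests
    serializes identically.
    """
    canonical = ",".join(sorted(member_digests))
    return sha512t24u(canonical.encode("utf-8"))


def ordered_id(member_digests: list[str]) -> str:
    """Identifier that treats member order as meaningful (hypothetical helper)."""
    canonical = ",".join(member_digests)
    return sha512t24u(canonical.encode("utf-8"))


# Placeholder digests, not real VRS identifiers.
a, b = "alleleDigestA", "alleleDigestB"
same_unordered = unordered_id([a, b]) == unordered_id([b, a])
same_ordered = ordered_id([a, b]) == ordered_id([b, a])
```

Here `same_unordered` is `True` and `same_ordered` is `False`: the unordered model still depends on a sort to get stable identifiers, which is the implicit ordering described above.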
-
A draft model of 4 proposed changes between VRS
These changes address the concerns about ordered / unordered variant members by creating separate data classes for cis-phased variants and for derivative sequences.
-
If you are looking at the 2.x vrs.ga4gh.org readthedocs specification the
-
@ahwagner As I'm converting the 1.3 sv_haplotype test I noticed that we do not have
-
The current Haplotype proposal for VRS 2.0 allows Adjacencies to accompany Alleles in the Haplotype members. Since Adjacency order is meaningful, the haplotype members array is now ordered. However, this requires some additional rules around the expected order of Alleles.
@rrfreimuth recommended that we consider a model that does not impose ordering constraints on Alleles, perhaps as separate models. We are opening discussion on this point to collect opinions and ideas.