Unable to flatten sample/gene counts for table export #3665
Replies: 3 comments
-
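Note
The following post was exported from discuss.hail.is, a forum for asking questions about Hail which has since been deprecated.
(May 12, 2022 at 17:22) tpoterba said:
The main tools to use here are the MatrixTable methods group_rows_by and entries:

    import hail as hl

    # mt is the MatrixTable from the question.
    # Group variants (rows) by gene symbol, then count per gene and sample.
    annotations = mt.group_rows_by(gene=mt.info.SYMBOL).aggregate(
        geneCounts=hl.struct(
            # Non-reference calls at synonymous variants
            nTotalSynonymousGene=hl.agg.count_where(
                mt.info.Consequence.contains('synonymous_variant')
                & mt.GT.is_non_ref()
                & hl.is_defined(mt.GT)
            ),
            # Non-reference calls at missense variants
            nTotalMissenseGene=hl.agg.count_where(
                mt.info.Consequence.contains('missense_variant')
                & mt.GT.is_non_ref()
                & hl.is_defined(mt.GT)
            )
        )
    )
    # Keep only the column key, then turn the matrix into one row per (gene, sample) pair.
    annotations.select_cols().entries().export(...)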
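To make the target layout concrete, here is a minimal plain-Python sketch of what "one row per gene and sample, with the nested counts expanded into columns" looks like. It does not use Hail at all: the records, the gene and sample names, the column name `s`, and the helpers `flatten_record` and `to_tsv` are made up for illustration. In Hail itself, `Table.flatten()` is the method that expands struct fields into top-level fields; whether it is needed before `export` depends on the table's schema.

```python
import csv
import io

# Made-up toy data standing in for the aggregated result:
# one record per (gene, sample) pair, with a nested struct of counts.
records = [
    {"gene": "BRCA1", "s": "sampleA",
     "geneCounts": {"nTotalSynonymousGene": 2, "nTotalMissenseGene": 1}},
    {"gene": "BRCA1", "s": "sampleB",
     "geneCounts": {"nTotalSynonymousGene": 0, "nTotalMissenseGene": 3}},
    {"gene": "TP53", "s": "sampleA",
     "geneCounts": {"nTotalSynonymousGene": 1, "nTotalMissenseGene": 0}},
]


def flatten_record(record, parent_key="", sep="."):
    """Expand nested dicts into top-level keys joined by `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def to_tsv(rows):
    """Render flattened rows as tab-separated text with a header line."""
    flat_rows = [flatten_record(row) for row in rows]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(flat_rows[0]),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(flat_rows)
    return buffer.getvalue()


tsv_text = to_tsv(records)
print(tsv_text)
```

A file in this shape has no nested values left in any cell, so it can be read in R as an ordinary tab-separated table rather than parsed as JSON.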
-
(May 12, 2022 at 17:22) tpoterba said:
My snippet above might have extra or missing parentheses or commas.
-
(May 13, 2022 at 09:03) DavideB said:
Thank you so much! It worked!
-
(May 12, 2022 at 10:01) DavideB said:
Hi there!
I’m new to Hail, and I was hoping you could help me with the export of a MatrixTable annotation.
I believe I’m doing something relatively common, i.e. counting the number of variants of a certain category in each gene of each sample.
I’d like to export these counts in a flattened way that’s readable in R, but I can’t find a way to flatten the structure to something like:
sampleName gene nMissense nSynon
Alternatively, I’d be happy to export a valid JSON file which I could read with rjson or jsonlite in R.
I’ve tried the following:
In the first case, each annotation would have the same keys (i.e. the same genes), which I don’t know how to combine.
In the second case, I have everything I need.
I’m giving an example of the second approach only, to keep my post as simple as possible.
Annotation step:
I have then saved the MatrixTable cols() into a separate object, in order to handle a Table I can export in a TSV file (or JSON):
The Table has this structure:
But when I try to flatten, nothing happens and a JSON-like file is exported.
I haven’t succeeded in exporting a TSV file.
If I read the produced JSON-like file into R, jsonlite fails with a formatting error about trailing garbage, while rjson reports unquoted strings.