Merge pull request #1844 from milaboratory/exports-for-trees
trees exports
gnefedev authored Oct 31, 2024
2 parents cc74062 + ea49a4f commit 1e0edd1
Showing 7 changed files with 102 additions and 52 deletions.
2 changes: 1 addition & 1 deletion build.gradle.kts
@@ -134,7 +134,7 @@ val toObfuscate: Configuration by configurations.creating {
val obfuscationLibs: Configuration by configurations.creating


val mixcrAlgoVersion = "4.7.0-31-fixes"
val mixcrAlgoVersion = "4.7.0-35-exports-for-trees"
// may be blank (will be inherited from mixcr-algo)
val milibVersion = ""
// may be blank (will be inherited from mixcr-algo or milib)
10 changes: 10 additions & 0 deletions changelogs/v4.7.1.md
@@ -1,9 +1,19 @@
## ❗ Breaking changes

- Shortcut options `-p` and `-pf` are removed from export commands because they conflict with the newly added export
  fields `-pN`, `-pS` and `-pNpS`. Use the long forms `--preset` and `--preset-file` instead.

## 🛠️ Other improvements & fixes

- Tracking the progress of fasta[.gz] inputs on `align`.
- Improved alignment rate of short parts of genes in reads with a high rate of mutations
- Fixed combination of `-Massemble.sortBySequence=true` and `--assemble-clonotypes-by` with composite feature
- Added `--alleles-search-could-be-skipped` option to the `findShmTrees` command for special cases
- Added `[withAlignmentGaps]` option to feature exports in `exportShmTreesWithNodes` command. It adds gaps so that the
  results can be interpreted as alignments of all nodes to the root sequence
- Added `-pN`, `-pS`, `-pNpS` options for `exportShmTreesWithNodes` command
- Fixed `findShmTrees` command: it now outputs the root node with the reconstructed NDN region (previously it was
  filled with `N` letters)

## 📚 New Presets

4 changes: 2 additions & 2 deletions regression/cli-help/exportReportsTable.txt
@@ -32,8 +32,8 @@ Export reports from files in tabular format.
[report.tsv] Path where to write reports. Print in stdout if omitted.
--without-upstreams Don't export upstream reports for sources of steps with several
inputs, like `findShmTrees`.
-p, --preset <preset> Specify preset of export fields. Possible values: min, full.
-pf, --preset-file <presetFile>
--preset <preset> Specify preset of export fields. Possible values: min, full.
--preset-file <presetFile>
Specify preset file of export fields
--no-header Don't print first header line, print only data
-f, --force-overwrite Force overwrite of output file(s).
64 changes: 42 additions & 22 deletions regression/cli-help/exportShmSingleCellTrees.txt
@@ -10,21 +10,24 @@ Usage: mixcr exportShmSingleCellTrees [--include-one-chain-trees] [--only-observ
[-rootsCountPerChain] [-treeHeight] [-nodeId] [-isObserved]
[-parentId] [-distance (germline|mrca|parent)]... [-vHit]
[-jHit] [-vGene] [-jGene] [-vFamily] [-jFamily] [-nFeature
<gene_feature> [(germline|mrca|parent)]]... [-allNFeatures
[(germline|mrca|parent)]]... [-aaFeature <gene_feature>
[(germline|mrca|parent)]]... [-allAAFeatures
[(germline|mrca|parent)]]... [-nFeatureImputed <gene_feature>
[(germline|mrca|parent)]]... [-allNFeaturesImputed
[<from_reference_point> <to_reference_point>]
[(germline|mrca|parent)]]... [-aaFeatureImputed
<gene_feature> [(germline|mrca|parent)]]...
[-allAAFeaturesImputed [<from_reference_point>
<to_reference_point>] [(germline|mrca|parent)]]... [-nLength
<gene_feature[,gene_feature]...> [(germline|mrca|parent)]]...
[-allNLength [(germline|mrca|parent)]]... [-aaLength
<gene_feature[,gene_feature]...> [(germline|mrca|parent)]]...
[-allAALength [(germline|mrca|parent)]]... [-nMutations
<gene_feature> [(germline|mrca|parent)]
[withAlignmentGaps]]... [-allNFeatures
[(germline|mrca|parent)] [withAlignmentGaps]]... [-aaFeature
<gene_feature> [(germline|mrca|parent)]
[withAlignmentGaps]]... [-allAAFeatures
[(germline|mrca|parent)] [withAlignmentGaps]]...
[-nFeatureImputed <gene_feature> [(germline|mrca|parent)]]...
[-allNFeaturesImputed [<from_reference_point>
<to_reference_point>] [(germline|mrca|parent)]]...
[-aaFeatureImputed <gene_feature>
[(germline|mrca|parent)]]... [-allAAFeaturesImputed
[<from_reference_point> <to_reference_point>]
[(germline|mrca|parent)]]... [-nLength <gene_feature[,
gene_feature]...> [(germline|mrca|parent)]]... [-allNLength
[(germline|mrca|parent)]]... [-aaLength <gene_feature[,
gene_feature]...> [(germline|mrca|parent)]]... [-allAALength
[(germline|mrca|parent)]]... [-nMutations <gene_feature>
[(germline|mrca|parent)]
[(substitutions|indels|inserts|deletions)]]...
[-allNMutations [(germline|mrca|parent)]
[(substitutions|indels|inserts|deletions)]]...
@@ -57,8 +60,10 @@ Usage: mixcr exportShmSingleCellTrees [--include-one-chain-trees] [--only-observ
gene_feature]...>]
[(substitutions|indels|inserts|deletions)]]...
[-aaMutationsRate [<gene_feature[,gene_feature]...>]
[(substitutions|indels|inserts|deletions)]]... [-isotype
[(primary|subclass|auto)]]... [-cellId
[(substitutions|indels|inserts|deletions)]]... [-pN
[<gene_feature[,gene_feature]...>]]... [-pS [<gene_feature[,
gene_feature]...>]]... [-pNpS [<gene_feature[,gene_feature]...
>]]... [-isotype [(primary|subclass|auto)]]... [-cellId
[(none|space|dash)]]... [-uniqueTagCount
(Molecule|Cell|Sample)]... [-cellGroup] [-hasStops
<gene_feature>]... [-isOOF <gene_feature>]... [-isProductive
@@ -84,8 +89,8 @@ separate columns. Initial data for building tree should contain cell data.
--ids <id>[,<id>...] Filter specific trees by id
--chains <chains> Export only trees that contains clones with specific chain (e.g. IGK,
IGL or IGH).
-p, --preset <preset> Specify preset of export fields. Possible values: min, default.
-pf, --preset-file <presetFile>
--preset <preset> Specify preset of export fields. Possible values: min, default.
--preset-file <presetFile>
Specify preset file of export fields
--cell-type <cell_type>...
Export SHM trees for given cell type. By default all will be exported.
@@ -140,19 +145,19 @@ Possible fields to export
-jGene Export best J hit gene name (e.g. TRBV12-3 for TRBV12-3*00)
-vFamily Export best V hit family name (e.g. TRBV12 for TRBV12-3*00)
-jFamily Export best J hit family name (e.g. TRBV12 for TRBV12-3*00)
-nFeature <gene_feature> [(germline|mrca|parent)]
-nFeature <gene_feature> [(germline|mrca|parent)] [withAlignmentGaps]
Export nucleotide sequence of specified gene feature.
If second arg is omitted, then feature will be printed for current
node. Otherwise - for corresponding `parent`, `germline` or `mrca`.
-allNFeatures [(germline|mrca|parent)]
-allNFeatures [(germline|mrca|parent)] [withAlignmentGaps]
Export nucleotide sequences for all covered gene features.
If first arg is omitted, then feature will be printed for current
node. Otherwise - for corresponding `parent`, `germline` or `mrca`.
-aaFeature <gene_feature> [(germline|mrca|parent)]
-aaFeature <gene_feature> [(germline|mrca|parent)] [withAlignmentGaps]
Export amino acid sequence of specified gene feature.
If second arg is omitted, then feature will be printed for current
node. Otherwise - for corresponding `parent`, `germline` or `mrca`.
-allAAFeatures [(germline|mrca|parent)]
-allAAFeatures [(germline|mrca|parent)] [withAlignmentGaps]
Export amino acid sequences for all covered gene features.
If first arg is omitted, then feature will be printed for current
node. Otherwise - for corresponding `parent`, `germline` or `mrca`.
@@ -308,6 +313,21 @@ Possible fields to export
Resolutions of wildcards in CDR3 are excluded from calculation.
Second parameter will filter mutations by type, by default no
filtering.
-pN [<gene_feature[,gene_feature]...>]
Count of non-synonymous substitutions normalized by total count of
possible non-synonymous substitutions. By default, will be used all
covered features. For wildcards in VJJunction all possible
combinations will be calculated and averaged.
-pS [<gene_feature[,gene_feature]...>]
Count of synonymous substitutions normalized by total count of
possible synonymous substitutions. By default, will be used all
covered features. For wildcards in VJJunction all possible
combinations will be calculated and averaged.
-pNpS [<gene_feature[,gene_feature]...>]
Ratio of pN to pS. In case of no mutations value will be absent. By
default, will be used all covered features. For wildcards in
VJJunction all possible combinations will be calculated and
averaged. If pN or pS is zero, no value will be exported.
-isotype [(primary|subclass|auto)]
Export isotype for IGH chains if it's can be distinguishable.
`primary` will resolve 'IgA', 'IgD', 'IgG', 'IgE', 'IgM'. `subtype`
4 changes: 2 additions & 2 deletions regression/cli-help/exportShmTrees.txt
@@ -26,9 +26,9 @@ data)
--ids <id>[,<id>...] Filter specific trees by id
--chains <chains> Export only trees that contains clones with specific chain (e.g. IGK,
IGL or IGH).
-p, --preset <preset> Specify preset of export fields. Possible values: default,
--preset <preset> Specify preset of export fields. Possible values: default,
defaultSingleCell, min.
-pf, --preset-file <presetFile>
--preset-file <presetFile>
Specify preset file of export fields
--no-header Don't print first header line, print only data
--not-covered-as-empty Export not covered regions as empty text.
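
The help text for `-pN`, `-pS` and `-pNpS` in this commit describes each value as a count of observed substitutions normalized by the count of possible substitutions of the same kind. The sketch below illustrates that arithmetic in Python. It is not MiXCR's implementation: the function name `pn_ps`, the comparison of one parent and one child sequence, the classification of each substitution against the parent codon, and the restriction to in-frame, equal-length, uppercase `ACGT` input are all assumptions made for illustration. Wildcard averaging in VJJunction is not modelled.

```python
from itertools import product
from typing import Optional, Tuple

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def is_synonymous(codon: str, pos: int, new_base: str) -> bool:
    """True if replacing codon[pos] with new_base keeps the amino acid."""
    mutated = codon[:pos] + new_base + codon[pos + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]


def pn_ps(parent: str, child: str) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (pN, pS, pN/pS) for two aligned, in-frame coding sequences.

    Hypothetical helper: each substitution is classified as if it were the
    only change in the parent codon (a simplifying assumption).
    """
    if len(parent) != len(child) or len(parent) % 3 != 0:
        raise ValueError("sequences must have equal length divisible by 3")

    observed = {True: 0, False: 0}  # key: is the substitution synonymous?
    possible = {True: 0, False: 0}
    for start in range(0, len(parent), 3):
        codon = parent[start:start + 3]
        for pos in range(3):
            # Every single-base change at this position is a possible substitution.
            for base in BASES:
                if base != codon[pos]:
                    possible[is_synonymous(codon, pos, base)] += 1
            # Count the substitution actually seen in the child, if any.
            child_base = child[start + pos]
            if child_base != codon[pos]:
                observed[is_synonymous(codon, pos, child_base)] += 1

    p_n = observed[False] / possible[False] if possible[False] else None
    p_s = observed[True] / possible[True] if possible[True] else None
    # Mirrors the help text: no ratio when pN or pS is zero or undefined.
    ratio = p_n / p_s if p_n and p_s else None
    return p_n, p_s, ratio


if __name__ == "__main__":
    print(pn_ps("ATGGCT", "ATAGCC"))
```

The ratio is returned as `None` when either value is zero, matching the documented behaviour that no value is exported in that case.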