28 Nov 09:34

github-actions

ec9c6fd

MiXCR v4.1.2

Major changes

Command findShmTrees can now build trees from inputs with different tags
Added --impute-germline-on-export and --dont-impute-germline-on-export to the exportAlignments and exportClones commands

Minor improvements

Instead of separately specifying multiple tags of the same type (e.g. CELL1+CELL2+CELL3) in filters, one can now use convenient aliases (like allTags:Cell, allTags:Molecule). This also facilitates the creation of more generic base presets implementing common single-cell and UMI filtering strategies.
Several command line interface improvements
Migration from <tag_name> to <tag_type> semantics in export columns and --split-by-tag options
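
The alias idea from the first item above can be illustrated with a minimal Python sketch. This is not MiXCR's implementation: the function name expand_tag_alias and the type-to-prefix mapping (Cell tags named CELL..., Molecule tags named UMI...) are assumptions made only for this example.

```python
# Hypothetical mapping from tag type to tag-name prefix (assumed for illustration).
TYPE_PREFIX = {"Cell": "CELL", "Molecule": "UMI"}

def expand_tag_alias(spec, available_tags):
    """Expand 'allTags:<Type>' into the matching tag names; pass other specs through."""
    if not spec.startswith("allTags:"):
        return [spec]
    tag_type = spec.split(":", 1)[1]
    prefix = TYPE_PREFIX[tag_type]
    return [tag for tag in available_tags if tag.startswith(prefix)]

tags = ["CELL1", "CELL2", "CELL3", "UMI"]
print("+".join(expand_tag_alias("allTags:Cell", tags)))  # CELL1+CELL2+CELL3
```

With such an alias, a filter written once works for protocols with one cell barcode and for protocols with several.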

Fixes

fixed a bug with saveOriginalReads=true in align that led to errors down the pipeline
analyze now correctly terminates on first error
correct progress reporting in align when multiple input files are provided by the file name expansion mechanism
fix --only-observed behaviour in exportShmTreesWithNodes
fix missing tile in heatmap
fix some cases of usage of -O...

Presets

Fixed issue with mouse presets from MiLaboratories
Fixed presets with whitelists
Fixed missing material type and species in several presets
Added template switch region trimming for RACE protocols
Added presets for
- Thermo Fisher Oncomine kits
- ParseBio single-cell protocols
- iRepertoire kits
- Preset for protocol described in Vergani et al. (2017)
- Cellecta AIR kit

11 Nov 14:55

github-actions

v4.1.1

3418ff5

MiXCR v4.1.1

Overview

With this release we continue extending the set of supported single-cell protocols by adding new ready-to-use presets to our collection. In addition to the newly supported protocols and the features required for their reliable processing, this release comes with many usability optimizations and stability improvements. See details below:

Major changes

presets for analysis of all types of BD Rhapsody data (see docs for the list of supported kits)
analysis of data produced by single-cell protocol described in Han et al. (2014) (see docs)
special presets for exome data analysis: exom-cdr3 and exom-full-length
initial support for overlap-extension-based chain pairing protocols
ability to export groups of similar columns with a single option (like -allAAFeatures <from_reference_point> <to_reference_point>)
-allUniqueTagsCount, a user-friendly alternative to -uniqueTagsCount; exports counts of unique tag combinations (useful for protocols with multiple CELL and UMI barcodes)
new "by sequence" filters for all somatic hypermutation trees (SHMT) exports
new weighted auto-threshold selection and complementary metric histogram aggregation modes (e.g. the y-axis on reads-per-UMI plots can now show the number of reads instead of the number of UMIs)
detected allelic variants can now also be exported in FASTA format directly from the findAlleles command
better algorithm for seed sequence selection in the consensus assembly routine in assemble; increases productive consensus count for cases with multi-variant tag groups (e.g. birthday paradoxes in UMI data or single-cell data analysis without UMIs)
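
To give a feel for the "birthday paradox" mentioned in the last item, here is a small Python sketch estimating how often two different molecules end up with the same UMI. It assumes UMIs are drawn uniformly from all 4^k sequences of length k, which real libraries only approximate; the function name is made up for this example and nothing here reflects MiXCR internals.

```python
import math

def umi_collision_probability(n_molecules, umi_length):
    """P(at least two of n molecules share a UMI), with UMIs uniform over 4**umi_length."""
    space = 4 ** umi_length
    if n_molecules > space:
        return 1.0
    # log of prod_{i=0}^{n-1} (1 - i/space), summed in log space for numerical stability
    log_no_collision = sum(math.log1p(-i / space) for i in range(n_molecules))
    return 1.0 - math.exp(log_no_collision)

# 1000 molecules tagged with 12-nt UMIs: the chance of at least one shared UMI is about 3%.
print(umi_collision_probability(1000, 12))
```

Even with a UMI space of almost 17 million sequences, collisions are not negligible, which is why tag groups may contain more than one true variant.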

Minor changes:

minor adjustments for existing presets
many CLI and parameter validation fixes, more human-readable error messages, better protection from common input errors
support for preset-embedded tag whitelists for protocols with small number of barcode variants
options --use-local-temp, --threads, --not-aligned-R1(R2) and --not-parsed-R1(R2) are now available in analyze, in addition to the individual step commands
bugfix for imputation in export for compound gene features
other minor fixes and enhancements

31 Oct 07:15

github-actions

v4.1.0

852d9d8

MiXCR v4.1

Overview

MiXCR 4.1 features two major functional upgrades:

essential fixes and improvements for the single-cell and molecular-barcoded data processing algorithms
new powerful set of tools for allelic variant discovery and analysis of antibody hypermutation trees

Along with these features, the release brings a radically simplified user interface, which reduces all the complexities of the repertoire analysis pipeline to a single command where only one option, the “preset”, has to be specified. MiXCR 4.1 is shipped with many specifically optimized presets covering most repertoire analysis cases. The upgrades introduced in this release also significantly increase the transparency of the analysis pipeline by providing a diverse set of new graphical QC reports and adding dozens of new metrics to the textual and JSON reports. Additionally, this release incorporates tens of important fixes, performance optimizations and stability improvements.

Documentation portal

Along with the software release, we present a new documentation portal. It features a clean content organization, informative illustrations, deep guides on many real-world repertoire analysis scenarios and detailed descriptions for each of the MiXCR commands and analysis presets.

Welcome to https://docs.milaboratories.com/

Improvements for single-cell and molecular-barcoded data analysis

Based on our deep research of a large number of single-cell and molecular barcoded datasets, generated with dozens of protocols and instruments in a wide set of laboratory setups, we developed several important upgrades to the algorithms engaged in analysis of tagged data. With all the improvements and fixes, MiXCR 4.1 produces clean and reliable results for the majority of popular wet-lab protocols, being robust to a wide range of protocol noises, cross contamination mechanisms and artifacts. The set of tools offered by MiXCR 4.1 allows it to be applied for virtually any data of such type.

Featured fixes and upgrades:

new high-performance aligner settings optimized for single-cell T- and B-cell receptor datasets
important fixes for assemblePartial algorithm for tagged data
redesign of tag correction algorithm to increase performance and decrease memory consumption
whitelist-based barcode correction in refineTagsAndSort step (f/k/a correctAndSortTags)
comprehensive options for data filtering, applied right after barcode sequence correction (in refineTagsAndSort)
algorithms for automated threshold selection in refineTagsAndSort filters
multiple improvements for the consensus assembly algorithm (which pre-assembles consensuses from tagged groups in assemble); increased performance and stability with respect to data artifacts
automated inference of minimal number of reads in consensus
de-contamination filters in assemble to fight cross-cell contaminations
rework of the assembleContigs algorithm to increase robustness with respect to data artifacts
many new QC metrics from tag pattern parsing, sequence correction, consensus to contig assembly algorithms
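
As an illustration of what whitelist-based barcode correction means in general, here is a minimal Python sketch. It uses a generic technique (accept a unique whitelist entry within one mismatch) and is not the algorithm used in refineTagsAndSort; function names and the mismatch limit are assumptions for this example.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def correct_barcode(barcode, whitelist, max_mismatches=1):
    """Return the whitelist barcode for `barcode`, or None if absent or ambiguous."""
    if barcode in whitelist:
        return barcode
    candidates = [w for w in whitelist
                  if len(w) == len(barcode) and hamming(w, barcode) <= max_mismatches]
    # Correct only when exactly one whitelist entry is close enough.
    return candidates[0] if len(candidates) == 1 else None

whitelist = {"AAAA", "CCCC", "AATT"}
print(correct_barcode("AAAC", whitelist))  # AAAA
print(correct_barcode("AAAT", whitelist))  # None (AAAA and AATT are equally close)
```

Rejecting ambiguous barcodes instead of guessing trades a little yield for fewer reads assigned to the wrong cell.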

SHM trees & Allele discovery

MiXCR 4.1 introduces two new comprehensive tools for analysis of hypermutation trees of antibodies. The first is the de-novo discovery of V and J gene alleles provided by the findAlleles command. The second is the SHM tree reconstruction tool provided by the findShmTrees command. These two features go hand in hand and help each other to accurately separate allelic variants from somatic mutations and reconstruct mutation tree topology, given a set of samples from the same individual. We implemented new original algorithms for these tasks; both are based on sophisticated analysis of alignments with germline segments, rather than naive reconstruction of mutation histories regardless of the sequence structure, as implemented in other tools. This functionality is accompanied by a set of commands to export SHM trees in several formats: exportShmTrees, exportShmTreesWithNodes, exportShmTreesNewick and exportPlots shmTrees.

For the correct lineage tree reconstruction, it is critical to first have accurate V- and J-gene allele information for a particular donor or mouse strain. Hence, it is highly recommended to first run findAlleles and re-align all clonotype sequences (option -o) to a newly generated individual reference V- and J-gene library.
findAlleles utilizes an allele inference algorithm which can use even somatically hypermutated clonal sequences as input data.
Both findAlleles and findShmTrees commands accept multiple .clns files as input, so the alleles can be inferred and lineage trees can be reconstructed using all available datasets. Note that it only makes sense to use datasets derived from an individual donor (or homogenic mouse strain) per command launch.
All commands produce extensive reports and auxiliary tables providing additional transparency into the algorithm performance.

Presets and refreshed CLI

From now on, most users can run the whole pipeline, specifying just a single option, the preset name, in addition to the input and output file names.

MiXCR provides tens of fine tuned sets of parameters (presets) to extract repertoires from the data generated with most of the commercially available kits and instruments as well as with the well established open protocols, including single-cell, bulk repertoire sequencing with or without molecular-barcodes and non-enriched data like RNA-Seq.

For example, you can run the whole analysis (from fastq to clonesets) for a dataset generated with the MiLaboratories Human TCR RNA Multiplex kit using the following command:

mixcr analyze milab-human-tcr-rna-multiplex-cdr3 input_file_R1.fastq.gz input_file_R2.fastq.gz results_prefix

This will produce a full set of intermediate files, with tsv clonesets and extensive report files both in txt and json formats.
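
For scripted use, the same command line can be assembled programmatically. The Python sketch below only rebuilds the command shown above (preset, inputs, output prefix) and prints it; the helper name build_analyze_command is made up for this example, and no options beyond those in the command above are assumed.

```python
import shlex

def build_analyze_command(preset, inputs, output_prefix):
    """Assemble the argument list: mixcr analyze <preset> <inputs...> <output_prefix>."""
    if not inputs:
        raise ValueError("at least one input file is required")
    return ["mixcr", "analyze", preset, *inputs, output_prefix]

cmd = build_analyze_command(
    "milab-human-tcr-rna-multiplex-cdr3",
    ["input_file_R1.fastq.gz", "input_file_R2.fastq.gz"],
    "results_prefix",
)
print(shlex.join(cmd))
# To actually run it (requires MiXCR on PATH and a license file):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.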

The preset functionality is accompanied by a set of special high-level command line options, which we call mixins, that help to adapt the selected preset when the experimental setup requires non-standard analysis (though this is not required in most cases).

The following improvements were made to MiXCR’s CLI:

analyze command was completely redesigned (see example above)
mixin options were introduced; can be specified on analyze, align or, for some mixins, on other pipeline stages
new refreshed and polished CLI help
new safer and more reliable file name expansion mechanism, {{a}} and {{R}} pattern elements added; now one can specify ... input_file_{{R}}.fastq.gz output.vdjca instead of ... input_file_R1.fastq.gz input_file_R2.fastq.gz output.vdjca
all reports and analysis parameters are now embedded into the output files and can be easily retrieved afterwards
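
The effect of the {{R}} pattern element in the file name expansion item above can be pictured with a short Python sketch. It is a simplification that just substitutes R1 and R2, matching the example given in that item; MiXCR's actual matching rules (and the {{a}} element) are not modelled, and the function name is hypothetical.

```python
def expand_read_pattern(pattern):
    """Turn a name containing {{R}} into the R1/R2 pair; leave other names unchanged."""
    if "{{R}}" not in pattern:
        return [pattern]
    return [pattern.replace("{{R}}", f"R{i}") for i in (1, 2)]

print(expand_read_pattern("input_file_{{R}}.fastq.gz"))
# ['input_file_R1.fastq.gz', 'input_file_R2.fastq.gz']
```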

Graphical QC plots

MiXCR 4.1 introduces a new exportQc command to visualize different quality control metrics including alignment performance, chain usage, reads coverage, barcode abundance distribution, automatically selected correction threshold etc.

Many other fixes & improvements

fix a bunch of visualization issues #743, #747, #748, #749, #750, #751
added bar plot gene usage plots
added gene family usage plots
better naming for diversity and overlap measures
rename biophysics to cdr3metrics in postanalysis
support of svg / png and other graphical formats in exportPlots
allow samples with different data types (umi/no-umi) to be used in overlapScatter when
implement cutting contig results by assemble region
introduce --pairwise-comparisons instead of --hide-pairwise-comparison in exportPlots diversity / biophysics
fixed wrong sign for hydrophobicity metric in downstream analysis
fixed incorrect behaviour of clonotype splitting by V, J and C genes
multiple bug fixes for post analysis downsampling
added --show-significance option in exportPlots diversity / biophysics
fix NPE in overlap browser when some clones do not contain the gene feature specified in overlap criteria
splitting of clones on export; there is no need to run exportClones command multiple times (only “by chain” option is currently implemented)
new export fields for single-cell and molecular barcodes (i.e. -tagFraction)
fixes for --not-aligned-R1/2 option for tagged analysis
incomplete V gene feature correction for AIRR export, if vFeatureToAlign was adjusted to exclude primer sequence from alignment
options to export reads that were not parsed according to the tag pattern (--not-parsed-R1/2)
start from BAM file
CLI and several other parts are (re)implemented in Kotlin
temporary files are now placed in the system temp folder by default; option to move them to the folder of output files --use-system-temp
fixed a bug in the assemble report caused by the pre-clone assembler, which did not report "failed to extract target" cases
fixed NPE in assembleContigs with disjoint features (#727)
better ChainsUsage report (#732)
factor-by option for overlap downstream analysis
allow lowercase...

10 Jun 01:22

dbolotin

v4.0.0

c604f8d

MiXCR v4.0

Comprehensive support for Single-Cell and Molecular barcodes

flexible and fast pattern matching engine to parse barcodes from the data; allows fitting the pipeline to any commercially available or in-house wet lab protocol with molecular and/or cell barcodes
error correction in barcode sequences
two cooperating UMI and/or Cell-barcode-based steps for clonal sequence reconstruction:
- consensus assembly (i.e. for well-framed amplicon sequencing)
- contig assembly (i.e. for 10x-like enzymatically fragmented data)
tag information is preserved on all analysis steps and extensive QC reports are generated throughout the pipeline, providing maximal visibility into analysis performance and giving a powerful tool for investigating wet lab issues

See the following usage examples:

https://mixcr.com/mixcr/reference/overview-built-in-presets/#10xgenomics

Downstream analysis

Set of powerful downstream analysis features with the ability to export postanalysis results in tabular format and vector plots with various statistical comparisons.

Ability to group samples by metadata values and compare repertoire features between groups
Comprehensive repertoire normalization and filtering
Statistical significance tests with proper p-value adjustment
Repertoire overlap analysis
Vector plots output (.svg / .pdf)
Tabular outputs

See the following usage guide:

https://mixcr.com/mixcr/reference/mixcr-postanalysis/

Overlap browser

Added the exportClonesOverlap command, which efficiently builds and exports the overlap of an arbitrary number of clonesets.

Major rework of contig assembly algorithm

significantly increased accuracy and stability
works with or without molecular or cell barcodes
can be applied to (sc)RNASeq data with reasonable IG/TCR coverage to reconstruct long sequence outside the CDR3

Export in AIRR format

multiple options to export alignment or clonal data in AIRR format
provides better compatibility with 3rd-party tools from AIRR community (see also RepSeq.IO feature for generation of fasta libraries with IMGT-like gaps from repseqio formatted references)

See here for usage example.

Other improvements and changes

new built-in reference library with new species and newest genome based library for human (see changelog here)
complete rewrite of IO for intermediate files (much faster IO with parallel serialization and deserialization, more compact files - each block is compressed with LZ4, versatile random access features provide additional speedup)
faster hash-based external (file-based) sorting algorithm for alignment and other regrouping tasks in UMI/Single-cell related tasks and operations requiring alignment to clone mapping
input sequence quality-score based trimming enabled by default
support for human-readable alignments export from *.clna files by clone index
all steps are cleaned-up to be completely pure, i.e. for the same input, output will always be byte-to-byte equal (no analysis date or other variable pieces of information leak to the output files)
more stable amino acid and combined amino acid plus nucleotide mutations export
slight default analysis parameter optimization
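
The "byte-to-byte equal" property from the list above is easy to verify on your own data by comparing checksums of the outputs of two runs. Below is a minimal Python sketch using only the standard library; the file names are stand-ins written by the demo itself, not real MiXCR outputs.

```python
import hashlib
import os
import tempfile

def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def outputs_identical(path_a, path_b):
    """True if both files have the same size and the same SHA-256 digest."""
    return (os.path.getsize(path_a) == os.path.getsize(path_b)
            and file_sha256(path_a) == file_sha256(path_b))

# Self-contained demo with stand-in files; in practice these would be the
# outputs of two runs of the same step on the same input.
with tempfile.TemporaryDirectory() as tmp:
    run1 = os.path.join(tmp, "run1.bin")
    run2 = os.path.join(tmp, "run2.bin")
    for path in (run1, run2):
        with open(path, "wb") as handle:
            handle.write(b"same bytes")
    print(outputs_identical(run1, run2))  # True
```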

Obtaining a license file

MiXCR requires a license file to run. Academic users with no commercial funding can quickly obtain a MiXCR license for free at https://licensing.milaboratories.com/. We are committed to supporting the academic community and provide our software free of charge for scientists doing non-profit research. A commercial trial license can be requested at https://licensing.milaboratories.com or by email to licensing@milaboratories.com.

For details see: https://mixcr.com/mixcr/getting-started/milm/

15 Apr 15:35

dbolotin

v3.0.13

045e9af

MiXCR v3.0.13

Fixed bug with wrong V gene selection in assembleContigs.
analyze doesn't use .clna when contig assembly is not specified
Fixed AlignConfiguration to account for trimming
Added --threads option to analyze
Added --library option to analyze
Bug fix in partial assembler

MiXCR is free for non-profit use only (see LICENSE for details)!
For commercial use please contact licensing@milaboratory.com.

20 Nov 13:40

dbolotin

v3.0.12

ace1ef9

MiXCR v3.0.12

Built-in reference library upgraded to v1.6 (see changes)
Additional mixcr script optimizations for docker

MiXCR is free for non-profit use only (see LICENSE for details)!
For commercial use please contact licensing@milaboratory.com.

24 Oct 16:35

dbolotin

v3.0.11

9d22b8b

MiXCR v3.0.11

Fixes exception in assemble for multi-assembling-feature cases with zero length sequences
Fixes empty FR3 imputed sequence in cases with zero assembled nucleotides on the 5' side of CDR3
MiXCR execution script optimized for Docker (Java 11 is recommended for running MiXCR in container environment)
Other fixes for MiXCR script

Starting from this release we will maintain Official MiXCR Docker Image.

MiXCR is free for non-profit use only (see LICENSE for details)!
For commercial use please contact licensing@milaboratory.com.

11 Sep 21:42

dbolotin

v3.0.10

8e2d32b

MiXCR v3.0.10

Fixes NPE in very rare cases with incompatible V gene selection for assembleContigs in case of partially annotated gene libraries
Several fixes for sequence imputation algorithm in exportAlignments / exportClones

MiXCR is free for non-profit use only (see LICENSE for details)!
For commercial use please contact licensing@milaboratory.com.

06 Aug 18:26

dbolotin

v3.0.9

1bc4875

MiXCR v3.0.9

Fixed wrong behaviour with score-based pre-filtering in split-by-V/J=true cases
Chain usage statistics added to align and assemble JSON reports
Fixed rare IndexOutOfBounds exception in -nFeatureImputed ...
Added shortcut for --json-report = -j
Sanity check for common mistake in analyze parameters

MiXCR is free for non-profit use only (see LICENSE for details)!
For commercial use please contact licensing@milaboratory.com.

22 Jul 18:38

dbolotin

v3.0.8

98f1240

MiXCR v3.0.8

Major changes

Alignments are forced to the corresponding edge if the same V/J gene is detected in the opposite PE read mate
Average quality threshold (-OaverageQualityThreshold=...) in assembleContigs increased to 20
Added exportAlignmentsForClones action
All indels in homopolymeric stretches are now shifted left in all alignment algorithms
Smarter base V/J/C hit selection in assembleContigs
Read quality trimming (see help for --trimming-window-size and --trimming-quality-threshold options in align) (disabled by default until 3.1)
Fix exception in -mutationsDetailed and similar export options
Fixes in step skipping logic in analyze (additional automatic parameter adjustment for VDJC libraries covering only VRegion)
Several more fixes for assembleContigs
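
The left-shifting of indels in homopolymeric stretches mentioned above refers to a common normalization: a deletion inside a run of identical bases can be placed at any position of the run without changing the resulting sequence, so a canonical (leftmost) position is chosen. The Python sketch below shows the general idea for deletions only; it is not MiXCR's code, and the function name is made up for this example.

```python
def left_align_deletion(reference, position, length):
    """Shift a deletion of reference[position:position+length] as far left as possible.

    The shift is valid while the base before the deletion equals the last deleted
    base, because the sequence left after deletion stays the same.
    """
    while position > 0 and reference[position - 1] == reference[position + length - 1]:
        position -= 1
    return position

ref = "ACGTTTTGA"
# Deleting the last T of the TTTT run (0-based position 6) is reported at position 3.
print(left_align_deletion(ref, 6, 1))  # 3
```

Normalizing positions this way makes identical mutations reported by different alignments directly comparable.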

Minor changes

Fixes incorrect numbers in assemble report
Fix for filterAlignmentsAction

MiXCR is free for non-profit use only (see LICENSE for details)!
For commercial use please contact licensing@milaboratory.com.

Releases: milaboratory/mixcr
