mergem is a python package and command-line tool for merging, comparing, and translating genome-scale metabolic models.
Use pip to install the latest release:
pip install mergem
For detailed usage instructions, please refer to the documentation.
Command-line options can be viewed using "--help" flag, as shown below:
> mergem --help
Usage: mergem [OPTIONS] [INPUT_FILENAMES]...
mergem takes genome-scale metabolic models as input, merges them into a single model and saves the merged model as .xml. Users can optionally select the objective, provide an output filename for the merged model, and translate the models to a different namespace.
Lobo Lab (https://lobolab.umbc.edu)
Options:
-obj TEXT Set objective: 'merge' all objectives (default) or 1, 2, 3... (objective from one of the input models)
-o TEXT Save model as (filename with format .xml, .sbml, etc.)
-v Print merging statistics
-up Update ID mapping table
-s Save ID mapping table as CSV
-e Uses exact stoichiometry when merging reactions
-p Consider protonation when merging reactions
-a Extend annotations with mergem database of metabolites and reactions
-t TEXT Translate all metabolite and reaction IDs to a target namespace (chebi, metacyc, kegg, reactome, metanetx, hmdb, biocyc, bigg, seed, sabiork, or rhea)
--version Show the version and exit.
--help Show this message and exit.
For merging two models and setting objective of merged model from first model, use:
mergem -i model1.xml -i model2.xml -obj 1
To print merging statistics, append the "-v" flag:
mergem -i model1.xml -i model2.xml -obj 1 -v
To use mergem within a python script, simply import the package with:
import mergem
For merging or processing one, two, or more models, provide a list of models to the merge function:
results = mergem.merge(input_models, set_objective='merge', exact_sto=False use_prot=False, extend_annot=False, trans_to_db=None)
merged_model = results['merged_model']
jacc_matrix = results['jacc_matrix']
num_met_merged = results['num_met_merged']
num_reac_merged = results['num_reac_merged']
met_sources = results['met_sources']
reac_sources = results['reac_sources']
-
input_models
is a list of one or more COBRApy model objects or strings specifying file names. -
set_objective
specifies if the objective functions are merged ('merge') or copied from a single model (specifying the index of the model: '1', 2', '3', etc.). -
exact_sto
use exact stoichiometry when merging reactions. -
use_prot
consider hydrogen and proton metabolites when merging reactions. -
add_annot
add additional metabolite and reaction annotations from mergem dictionaries. -
trans_to_db
translate metabolite and reaction IDs to a target database (chebi, metacyc, kegg, reactome, metanetx, hmdb, biocyc, bigg, seed, sabiork, or rhea) -
results
a dictionary with all the results, including: -
merged_model
the merged model. -
jacc_matrix
metabolite and reaction jaccard distances. -
num_met_merged
number of metabolites merged. -
num_reac_merged
number of reactions merged. -
met_sources
dictionary mapping each metabolite ID in the merged model to the corresponding metabolite IDs from each of the input models. -
reac_sources
dictionary mapping each reaction ID in the merged model to the corresponding reaction IDs from each of the input models.
The merge function returns a dictionary of results including the merged model, the metabolite and reaction Jaccard distance matrix between models, and the metabolite and reaction model sources.
The following functions can also be imported from mergem:
from mergem import translate, load_model, save_model, map_localization, map_metabolite_univ_id, map_reaction_univ_id, get_metabolite_properties, get_reaction_properties, update_id_mapper
translate(input_model, trans_to_db)
translates a model to another target database specified intrans_to_db
.load_model(filename)
loads a model from the given filename/path.save_model(cobra_model, file_name)
takes a cobra model as input and exports it as filefile_name
.map_localization(id_or_model_localization)
converts localization suffixes into common notation.map_metabolite_univ_id(met_id)
maps metabolite id to metabolite universal id.map_reaction_univ_id(reac_id)
maps reaction id to metabolite universal id.get_metabolite_properties(met_univ_id)
retrieves the properties of a metabolite using its universal idget_reaction_properties(reac_univ_id)
retrieves the properties of a reaction using its universal idupdate_id_mapper(delete_database_files)
updates and build mergem database. It will download the latest source database files, merge the identifiers based on common properties, and save the mapping mapping tables and information internally. This process can take several hours. The parameter specifies if the downloaded intermediate database files are deleted after the update (saves disk space but the next update will take longer; dafault is True).
Please cite mergem using:
mergem: merging, comparing, and translating genome-scale metabolic models using universal identifiers
A. Hari, A. Zarrabi, D. Lobo
NAR Genomics and Bioinformatics, 6(1), lqae010, 2024
This package was developed at The Lobo Lab, University of Maryland, Baltimore County.
This package is under GNU GENERAL PUBLIC LICENSE. The package is free for use without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, subject to the following restrictions:
- The origin of this software and database must not be misrepresented; you must not claim that you wrote the original software.
- If you use this software and/or database in a work (any production in the scientific, literary, and artistic domain), an acknowledgment and citation (see publication above) in the work is required.
- This notice may not be removed or altered from any distribution.