MICOM is a Python package for metabolic modeling of microbial communities

Overview

https://github.com/micom-dev/micom/raw/master/docs/source/micom.png

actions status coverage pypi status

Welcome

MICOM is a Python package for metabolic modeling of microbial communities currently developed in the Gibbons Lab at the Institute for Systems Biology and the Human Systems Biology Group of Prof. Osbaldo Resendis Antonio at the National Institute of Genomic Medicine Mexico.

MICOM allows you to construct a community model from a list on input COBRA models and manages exchange fluxes between individuals and individuals with the environment. It explicitly accounts for different abundances of individuals in the community and can thus incorporate data from biomass quantification, cytometry, ampliconsequencing, or metagenomic shotgun sequencing.

It identifies a relevant flux space by incorporating an ecological model for the trade-off between individual taxa growth and community-wide growth that shows good agreement with experimental data.

Attribution

MICOM is published in

MICOM: Metagenome-Scale Modeling To Infer Metabolic Interactions in the Gut Microbiota
Christian Diener, Sean M. Gibbons, Osbaldo Resendis-Antonio
mSystems 5:e00606-19
https://doi.org/10.1128/mSystems.00606-19

Please cite this publication when referencing MICOM. Thanks πŸ˜„

Installation

MICOM is available on PyPi and can be installed via

pip install micom

Getting started

Documentation can be found at https://micom-dev.github.io/micom .

Getting help

General questions on usage can be asked in Github Discussions
https://github.com/micom-dev/micom/discussions
We are also available on the cobrapy Gitter channel
https://gitter.im/opencobra/cobrapy
Questions specific to the MICOM Qiime2 plugin (q2-micom) can also be asked on the Qiime2 forum
https://forum.qiime2.org/c/community-plugin-support/
Comments
  • exchanges not identified correctly in CarveME models

    exchanges not identified correctly in CarveME models

    Goodmorning. I have recently used your program to analyze metabolic exchanges in a community. I have only a question: in the output, I have for each metabolite two columns one called, for example, EX_but_e and the other EX_but_m. The EX_but_m appears only in one row which is the medium one. I supposed that summing the EX_but_e for all the microbes in my community I should have got the value in the medium row at EX_but_m column. But it is not the case. Why?

    Sincerely, Arianna Basile

    opened by arianccbasile 14
  • interpretation of plot_exchanges_per_sample default behavior

    interpretation of plot_exchanges_per_sample default behavior

    The "Plotting consumed metabolites" section of the documentation describes the function plot_exchanges_per_sample with default parameter direction='import' as plotting the metabolites consumed by the microbial community.

    In the source code for plot_exchanges_per_sample, the subset with taxon='medium' is selected. But isn't the "import" for the 'medium' actually the export for the microbial community? This seems at odds with the description of plot_exchanges_per_sample(direction='import') as plotting the consumption patterns of the microbes.

    Perhaps I am misinterpreting the direction of the fluxes for the 'medium' in the output of the grow function?

    opened by mmp3 6
  • optimize_all(fluxes=True) and optimize_single(fluxes=True) do not return fluxes

    optimize_all(fluxes=True) and optimize_single(fluxes=True) do not return fluxes

    Even with fluxes=True passed into community.optimize_all(), only maximal growth rates are returned. In fact, community.optimize_single() does not accept the fluxes argument.

    Using version '0.22.6'

    opened by michaelsilverstein 6
  • Fix #37 by allowing the user to define the compression...

    Fix #37 by allowing the user to define the compression...

    …type and level when building a zipped database. Default behavior stays the same (ZIP_STORED with no compression).

    This commit reworks the behavior of the compress parameter and adds the optional parameter compresslevel to micom.workflows.build_database().

    opened by nigiord 5
  • Fraction variation influences

    Fraction variation influences "status"

    Hi! I have a question about the flag "status" which comes as an output when "cooperative_tradeoff" is run. In particular, sometimes I get as a result "optimal" but most of times I get "numeric". What does it mean?

    Sincerely, Arianna Basile

    opened by arianccbasile 5
  • Optimize community growth to match known relative abundance

    Optimize community growth to match known relative abundance

    I'm new to MICOM and COBRA, and I'd like to use MICOM to predict the metabolites similar to MAMBO and the Garza et al. 2018 paper's approach which, as I understand it, is to maximize the correlation of microbial growth with the known relative abundances. Is this possible with MICOM?

    I'd guess this would involve changing the community objective function somehow, or are alternate objective functions already supported?

    Thanks!

    opened by krcurtis 5
  • Add basic installation instructions for QP solvers to the docs

    Add basic installation instructions for QP solvers to the docs

    I'm trying to run com.cooperative_tradeoff on a community using some sbml files but cannot because I'm using the default GLPK and don't know how to use IBM Cplex instead. I downloaded it as an academic edition but do not know how to set it as the solver for micom. Thanks in advance!

    opened by jfoldi81 4
  • Error occurs when joining two models after changing their metabolite IDs

    Error occurs when joining two models after changing their metabolite IDs

    Hi, I am trying to build a community for two models and I've already changed the metabolite IDs for each model. I renamed the metabolite IDs as + such as C12145cytoplasm ('C12145' is KEGG ID and 'cytoplasm' is the name of the compartment). Firstly one thing I'd like to double-check is that do I need to change the compartment ID as well even though I didn't use the compartment ID for matching models?

    Secondly, when I tried to join the two renamed models an error occurred:

    File "/Users/wintermute/opt/anaconda3/lib/python3.7/urllib/parse.py", line 107, in <genexpr>
        return tuple(x.decode(encoding, errors) if x else '' for x in args)
    AttributeError: 'Model' object has no attribute 'decode'
    

    Here is my code:

    sc = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/yeast7.6_changeID_final.xml')
    
    kp = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/KP_changeID_final3.xml')
    
    community = [sc,kp]
    
    micom.util.join_models(community)
    

    Does anyone have any idea about that? Thanks.

    opened by wintermute221 4
  • Why does problems.solve() return None when status is not optimal?

    Why does problems.solve() return None when status is not optimal?

    Just curious since there is additional information about what did not work in the cobrapy Solution object that could be returned. Or does it just not make sense to return a CommunitySolution object when the optimization did not work? I'm most interested in knowing what the solver status is.

    opened by mmundy42 4
  • Adjust active demands when loading models

    Adjust active demands when loading models

    Problem description

    Custom models and some models from AGORA have active demand reactions for instance for biotin. This can make MICOM fail due to infeasible models in low nutrient settings.

    Code Sample

    See https://github.com/micom-dev/media/issues/2 .

    Suggested fix

    Adjust those demands so that the zero flux solution is included. Raise a warning or info in the logger if that is the case.

    opened by cdiener 3
  • build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.

    build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.

    Following your recommendation from #31 , I execute build_database as follows:

    # df has columns: 
    #     id   kingdom   phylum  class  order  family   genus     species    file
    > m = build_database( manifest = df , out_path = "/data/out" , threads = 8 )
    

    which gives warning message (but still completes successfully):

    /usr/local/lib/python3.8/dist-packages/micom/workflows/build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.
      meta.index = meta[rank].str.replace("[^\\w\\_]", "_")
    Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━  79% 0:02:12
    

    System information:

    $ python3  -c "import micom; micom.show_versions()"
    
    System Information
    ==================
    OS                  Linux
    OS-release 5.4.0-1030-aws
    Python              3.8.5
    
    Package Versions
    ================
    cobra        0.21.0
    jinja2       2.10.1
    micom        0.22.5
    pip          20.0.2
    scikit-learn 0.24.1
    scipy         1.6.1
    setuptools   45.2.0
    symengine     0.7.2
    wheel        0.34.2
    
    opened by mmp3 3
  • Error while copying model using model.copy()

    Error while copying model using model.copy()

    Discussed in https://github.com/micom-dev/micom/discussions/69

    Originally posted by anubhavdas0907 March 8, 2022 Hello Christian,

    I was trying to copy a community model to a different variable, but I get an error. I want to manipulate a model, without changing the original one.

    Following are the details.

    from micom import load_pickle
    model = load_pickle("ERR1883210.pickle")
    model_1 = model.copy()
    
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/data/Conda_base/envs/MyConda/lib/python3.7/site-packages/cobra/core/model.py", line 321, in copy
        new = self.__class__()
    TypeError: __init__() missing 1 required positional argument: 'taxonomy'
    

    Can you please suggest, what's wrong in this code, and what can be the solution?

    Regards Anubhav

    opened by cdiener 2
  • Improve the warning for media metabolites with a missing transport reaction

    Improve the warning for media metabolites with a missing transport reaction

    Checklist

    Is your feature related to a problem? Please describe it.

    The warning regarding media components with missing imports is a bit too fatalistic given that it is not a real problem in most cases. See #63 for instance.

    Describe the solution you would like.

    Maybe change it to be a an info and only trigger a warning if a large number of metabolites can not be consumed or if there are no consumable carbon and nitrogen sources.

    Describe alternatives you considered

    Just changing the text of the warning, but it may still be too verbose.

    Additional context

    opened by cdiener 0
  • Update and fix the docs

    Update and fix the docs

    Checklist

    Bundled action items

    This is issue is a collection of small things that should be fixed in the docs.

    • [ ] fix links in the docs
    • [ ] provide a section summarizing the available DBs and media
    • [ ] Improve the workflows section and maybe split it
    • [ ] give more info and the tradeoff parameter and how to choose it
    • [ ] update the theoretical/methods intro
    • [ ] document the QIIME2 artifact readers
    • [ ] update the docs for elasticities
    • [x] add solver installation to docs
    opened by cdiener 0
  • Support for GTDB taxonomy?

    Support for GTDB taxonomy?

    Checklist

    Is your feature related to a problem? Please describe it.

    The Genome Taxonomy Database (GTDB) is comprehensive (especially the new v202 release) and more robust than the NCBI microbial taxonomy, especially given that the GTDB taxonomy is completely based off of genome phylogenic relatedness.

    Although the MICOM docs are vague about the taxonomy that one must use, it appears that the NCBI taxonomy is required.

    Describe the solution you would like.

    Provide direct support for the GTDB taxonomy.

    opened by nick-youngblut 3
  • [MICOM 1.0 API] Proposed new format for fluxes

    [MICOM 1.0 API] Proposed new format for fluxes

    This is a proposal for a new format for fluxes slated for MICOM 1.0. Feel free to comment :smile:

    Checklist

    Current state

    The current format for fluxes returned by MICOM is a table in wide format:

    In [1]: from micom import Community
    
    In [2]: from micom.data import test_taxonomy
    
    In [3]: com = Community(test_taxonomy())
    Building ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
    
    In [7]: sol = com.cooperative_tradeoff(fluxes=True)
    
    In [8]: sol.fluxes
    Out[8]: 
    reaction               ACALD    ACALDt      ACKr    ACONTa    ACONTb     ACt2r          ADK1  ...     SUCDi    SUCOAS      TALA          THD2      TKT1      TKT2       TPI
    compartment                                                                                   ...                                                                          
    Escherichia_coli_1  0.049190 -0.008897 -0.004224  5.999485  5.999485 -0.004224  3.388665e-11  ...  5.017641 -5.017641  1.489184  1.924736e-10  1.489184  1.173698  7.513137
    Escherichia_coli_2 -0.079989 -0.115231  0.072559  6.001066  6.001066  0.072559  4.264225e-11  ...  5.033051 -5.033051  1.491048  1.924125e-10  1.491048  1.175562  7.495742
    Escherichia_coli_3  0.102350  0.197394 -0.100513  6.004985  6.004985 -0.100513  3.662292e-11  ...  5.083935 -5.083935  1.506075  1.926208e-10  1.506075  1.190589  7.460396
    Escherichia_coli_4 -0.071551 -0.073266  0.032177  6.023463  6.023463  0.032177  4.133342e-11  ...  5.122875 -5.122875  1.501628  1.926284e-10  1.501628  1.186143  7.440253
    medium                   NaN       NaN       NaN       NaN       NaN       NaN           NaN  ...       NaN       NaN       NaN           NaN       NaN       NaN       NaN
    
    [5 rows x 115 columns]
    
    

    This has resulted in some issues:

    1. It is incompatible with cobra.Solution.fluxes which breaks a lot of the cobra functionality like for instance summary methods.
    2. It can be pretty sparse for very divergent models (many NA entries)
    3. It mixes medium and taxa fluxes
    4. It does not specify if export fluxes denote import or export which is one of the most common help requests we receive
    5. Basically all methods using flux results in MICOM will convert them to a long format

    Proposed new API for fluxes

    CommunitySolution.fluxes will retain the cobrapy format and will superseded by new accessors that all return fluxes in long format:

    CommunitySolution.exchange_fluxes

    Similar to the previous one but with the taxa annotated.

          reaction                     name               taxon          flux direction                       micom_id
    0      EX_ac_m     ac_m medium exchange              medium  1.814984e-11    export                        EX_ac_m
    1   EX_acald_m  acald_m medium exchange              medium  1.328645e-11    export                     EX_acald_m
    2     EX_akg_m    akg_m medium exchange              medium  3.225128e-12    export                       EX_akg_m
    3     EX_co2_m    co2_m medium exchange              medium  2.280983e+01    export                       EX_co2_m
    4    EX_etoh_m   etoh_m medium exchange              medium  1.515389e-11    export                      EX_etoh_m
    ..         ...                      ...                 ...           ...       ...                           
    

    CommunitySolution.internal_fluxes

        reaction                                               name               taxon          flux                    micom_id
    0      ACALD           Acetaldehyde dehydrogenase (acetylating)  Escherichia_coli_1  1.312146e+00   ACALD__Escherichia_coli_1
    1     ACALDt                  Acetaldehyde reversible transport  Escherichia_coli_1  3.236132e+00  ACALDt__Escherichia_coli_1
    2       ACKr                                     Acetate kinase  Escherichia_coli_1 -1.304078e+00    ACKr__Escherichia_coli_1
    3     ACONTa   Aconitase (half-reaction A, Citrate hydro-lyase)  Escherichia_coli_1  5.987675e+00  ACONTa__Escherichia_coli_1
    4     ACONTb  Aconitase (half-reaction B, Isocitrate hydro-l...  Escherichia_coli_1  5.987675e+00  ACONTb__Escherichia_coli_1
    

    This will consolidate GrowthResults and CommunitySolution and gives a more readable format. All those properties are generated on the fly when accessing the property.

    Additionaly, we may also want to save the annotations in the solution but they may be large, so it might be better to have a property on the model class like Community.annotations.

    Additional context

    A similar format change is planned for Community.knockout_taxa. elasticities already uses a long format.

    feature 
    opened by cdiener 0
  • Implement more checks and help for model tables

    Implement more checks and help for model tables

    The format for the taxonomy table community model manifests can be unclear and both are often confused for one another. Provide a validation helper and better error messages.

    opened by cdiener 0
Releases(v0.32.3)
Owner
Developers of the microbial community modeling package micom.
A collection of interactive machine-learning experiments: πŸ‹οΈmodels training + 🎨models demo

πŸ€– Interactive Machine Learning experiments: πŸ‹οΈmodels training + 🎨models demo

Oleksii Trekhleb 1.4k Jan 06, 2023
A Python implementation of GRAIL, a generic framework to learn compact time series representations.

GRAIL A Python implementation of GRAIL, a generic framework to learn compact time series representations. Requirements Python 3.6+ numpy scipy tslearn

3 Nov 24, 2021
JMP is a Mixed Precision library for JAX.

Mixed precision training [0] is a technique that mixes the use of full and half precision floating point numbers during training to reduce the memory bandwidth requirements and improve the computatio

DeepMind 108 Dec 31, 2022
A GitHub action that suggests type annotations for Python using machine learning.

Typilus: Suggest Python Type Annotations A GitHub action that suggests type annotations for Python using machine learning. This action makes suggestio

40 Sep 18, 2022
Python 3.6+ toolbox for submitting jobs to Slurm

Submit it! What is submitit? Submitit is a lightweight tool for submitting Python functions for computation within a Slurm cluster. It basically wraps

Facebook Incubator 768 Jan 03, 2023
Projeto: Machine Learning: Linguagens de Programacao 2004-2001

Projeto: Machine Learning: Linguagens de Programacao 2004-2001 Projeto de Data Science e Machine Learning de anΓ‘lise de linguagens de programaΓ§Γ£o de 2

Victor Hugo Negrisoli 0 Jun 29, 2021
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 23.6k Jan 03, 2023
Python implementation of the rulefit algorithm

RuleFit Implementation of a rule based prediction algorithm based on the rulefit algorithm from Friedman and Popescu (PDF) The algorithm can be used f

Christoph Molnar 326 Jan 02, 2023
A high performance and generic framework for distributed DNN training

BytePS BytePS is a high performance and general distributed training framework. It supports TensorFlow, Keras, PyTorch, and MXNet, and can run on eith

Bytedance Inc. 3.3k Dec 28, 2022
cleanlab is the data-centric ML ops package for machine learning with noisy labels.

cleanlab is the data-centric ML ops package for machine learning with noisy labels. cleanlab cleans labels and supports finding, quantifying, and lear

Cleanlab 51 Nov 28, 2022
Turning images into '9-pan' palettes using KMeans clustering from sklearn.

img2palette Turning images into '9-pan' palettes using KMeans clustering from sklearn. Requirements We require: Pillow, for opening and processing ima

Samuel Vidovich 2 Jan 01, 2022
Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning

The mljar-supervised is an Automated Machine Learning Python package that works with tabular data. I

MLJAR 2.4k Jan 02, 2023
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models

Seldon Core: Blazing Fast, Industry-Ready ML An open source platform to deploy your machine learning models on Kubernetes at massive scale. Overview S

Seldon 3.5k Jan 01, 2023
SynapseML - an open source library to simplify the creation of scalable machine learning pipelines

Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy

Microsoft 3.9k Dec 30, 2022
Time series forecasting with PyTorch

Our article on Towards Data Science introduces the package and provides background information. Pytorch Forecasting aims to ease state-of-the-art time

Jan Beitner 2.5k Jan 02, 2023
Implementation of linesearch Optimization Algorithms in Python

Nonlinear Optimization Algorithms During my time as Scientific Assistant at the Karlsruhe Institute of Technology (Germany) I implemented various Opti

Paul 3 Dec 06, 2022
A Python package for time series classification

pyts: a Python package for time series classification pyts is a Python package for time series classification. It aims to make time series classificat

Johann Faouzi 1.4k Jan 01, 2023
A python fast implementation of the famous SVD algorithm popularized by Simon Funk during Netflix Prize

⚑ funk-svd funk-svd is a Python 3 library implementing a fast version of the famous SVD algorithm popularized by Simon Funk during the Neflix Prize co

Geoffrey Bolmier 171 Dec 19, 2022
Learning --> Numpy January 2022 - winter'22

Numerical-Python Numpy NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along

Shahzaneer Ahmed 0 Mar 12, 2022
Simulate & classify transient absorption spectroscopy (TAS) spectral features for bulk semiconducting materials (Post-DFT)

PyTASER PyTASER is a Python (3.9+) library and set of command-line tools for classifying spectral features in bulk materials, post-DFT. The goal of th

Materials Design Group 4 Dec 27, 2022