Python implementation of MULTIseq barcode alignment using fuzzy string matching and GMM barcode assignment

Last update: Feb 11, 2022

Overview

py-multi-seq

Python implementation of MULTIseq barcode alignment using fuzzy string matching and GMM barcode assignment.

The scripts in this repository are roughly analogous to the MULTI-seq R package deMULTIplex, available at https://github.com/chris-mcginnis-ucsf/MULTI-seq. The pipeline loads paired-end read data, fuzzy-matches the reads against the provided MULTIseq barcode file, and counts the reads mapping to each barcode. Expectation Maximization is then used to fit a Gaussian Mixture Model for each barcode, and each cell is assigned its most likely barcode, no barcode, or multiple barcodes (a doublet).
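The matching and counting step can be illustrated with a minimal sketch. It is not the code in BarcodeFuzzyMatching.py: it swaps in a plain Hamming-distance comparison from the standard library in place of the fuzzywuzzy and sparse_dot_topn matching the repository depends on, and the names `hamming`, `closest`, `count_reads`, `cell_len`, `sample_len` and `max_mismatches`, as well as the toy reads, are hypothetical.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def closest(seq, references, max_mismatches=1):
    """Return the single reference within max_mismatches of seq, else None.

    Ambiguous hits (two references at the same best distance) are rejected.
    """
    best, best_dist, tie = None, max_mismatches + 1, False
    for ref in references:
        if len(ref) != len(seq):
            continue
        d = hamming(seq, ref)
        if d < best_dist:
            best, best_dist, tie = ref, d, False
        elif d == best_dist and d <= max_mismatches:
            tie = True
    return None if tie else best


def count_reads(reads, cell_barcodes, sample_barcodes, cell_len=16, sample_len=8):
    """Count reads per (cell barcode, sample barcode) pair.

    reads is an iterable of (cell_sequence, sample_sequence) tuples.
    The default lengths are illustrative assumptions, not the repo's settings.
    """
    counts = defaultdict(Counter)
    for cell_seq, sample_seq in reads:
        cell = closest(cell_seq[:cell_len], cell_barcodes)
        sample = closest(sample_seq[:sample_len], sample_barcodes)
        if cell is not None and sample is not None:
            counts[cell][sample] += 1
    return counts


# Toy data (made up for illustration only).
cells = ["AAAACCCCGGGGTTTT", "TTTTGGGGCCCCAAAA"]
samples = ["ACGTACGT", "TGCATGCA"]
reads = [
    ("AAAACCCCGGGGTTTT", "ACGTACGT"),  # exact match
    ("AAAACCCCGGGGTTTA", "ACGTACGA"),  # one mismatch in each barcode
    ("TTTTGGGGCCCCAAAA", "TGCATGCA"),  # exact match
    ("GGGGGGGGGGGGGGGG", "TGCATGCA"),  # no cell barcode within one mismatch
]
counts = count_reads(reads, cells, samples)
for cell, per_sample in counts.items():
    print(cell, dict(per_sample))
```

Rejecting ties keeps a read from being credited to the wrong barcode when two references are equally close; a production implementation would also need an indexing strategy, since this loop compares every read against every reference.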

Installation

Clone this repository. The scripts require Python >= 3.7 and the following packages, which can be installed with:

pip install pandas numpy scipy fuzzywuzzy tqdm sparse_dot_topn scanpy natsort

You will need the cellranger cell barcodes file before running. To use custom barcodes in the reads, you can in theory modify MultiseqIndices.txt along with the read length parameters.

Usage example for 10X scRNAseq or Multiome + MULTIseq:

python BarcodeFuzzyMatching.py /path/to/this/repo/MultiseqSamplesExample.txt /path/to/this/repo/MultiseqIndices.txt /path/to/sampleMULTIseq_R1.fastq /path/to/cellranger/outs/filtered_feature_bc_matrix/barcodes.tsv.gz /path/to/output/dir/ 16 8 0

python RunDemuxEM.py /path/to/output/dir/ /path/to/cellranger/outs/filtered_feature_bc_matrix/

Running this pipeline outputs a matrix of read counts per barcode, as well as a CSV file listing cell barcodes and their assigned barcode(s).
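As a rough illustration of how a count matrix can be turned into per-cell assignments, the sketch below fits a two-component, one-dimensional Gaussian mixture to the log-transformed counts of each barcode by Expectation Maximization, then labels each cell by the barcodes for which it falls in the high component. This is an assumption about the general approach, not the code in RunDemuxEM.py: the log1p transform, the two-component model, the 0.5 threshold, the label strings, the function names and the toy counts are all hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_component_gmm(values, n_iter=50, min_var=0.05):
    """Fit a 1-D two-component mixture by EM.

    Returns, for each value, the responsibility of the high-mean component.
    """
    n = len(values)
    mid = (min(values) + max(values)) / 2
    # Initialise with a hard split at the midpoint of the observed range.
    resp = [1.0 if x > mid else 0.0 for x in values]
    for _ in range(n_iter):
        # M-step: update weights, means and variances from responsibilities.
        n1 = sum(resp)
        n0 = n - n1
        if n0 < 1e-9 or n1 < 1e-9:
            break  # one component is empty; keep the current responsibilities
        m0 = sum((1 - r) * x for r, x in zip(resp, values)) / n0
        m1 = sum(r * x for r, x in zip(resp, values)) / n1
        v0 = max(sum((1 - r) * (x - m0) ** 2 for r, x in zip(resp, values)) / n0, min_var)
        v1 = max(sum(r * (x - m1) ** 2 for r, x in zip(resp, values)) / n1, min_var)
        # E-step: recompute responsibilities of the high component.
        new_resp = []
        for x in values:
            p0 = (n0 / n) * normal_pdf(x, m0, v0)
            p1 = (n1 / n) * normal_pdf(x, m1, v1)
            total = p0 + p1
            new_resp.append(p1 / total if total > 0 else 0.5)
        resp = new_resp
    return resp


def assign_cells(counts, barcodes, threshold=0.5):
    """Label each cell with one barcode, 'negative', or 'doublet:...'."""
    cells = sorted(counts)
    hits = {cell: [] for cell in cells}
    for bc in barcodes:
        logged = [math.log1p(counts[cell].get(bc, 0)) for cell in cells]
        for cell, r in zip(cells, fit_two_component_gmm(logged)):
            if r > threshold:
                hits[cell].append(bc)
    labels = {}
    for cell, positive in hits.items():
        if not positive:
            labels[cell] = "negative"
        elif len(positive) == 1:
            labels[cell] = positive[0]
        else:
            labels[cell] = "doublet:" + "+".join(positive)
    return labels


# Toy counts (made up): cell -> {sample barcode -> read count}.
toy_counts = {
    "c1": {"A": 120, "B": 2},
    "c2": {"A": 95, "B": 0},
    "c3": {"A": 3, "B": 150},
    "c4": {"A": 1, "B": 110},
    "c5": {"A": 2, "B": 1},
    "c6": {"A": 100, "B": 130},
}
labels = assign_cells(toy_counts, ["A", "B"])
for cell in sorted(labels):
    print(cell, labels[cell])
```

Fitting each barcode independently is what allows a cell to come out positive for zero, one, or several barcodes. The variance floor `min_var` prevents a component from collapsing onto a handful of nearly identical values.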

Owner

MT Schmitz
