Cash in on Expressed Barcode Tags (EBTs) from NGS Sequencing Data with Python

Overview

Cash in on Expressed Barcode Tags (EBTs) from NGS Sequencing Data with Python

Cashier is a tool developed by Russell Durrett for the analysis and extraction of expressed barcode tags.

This python implementation offers the same flexibility and simple command line operation.

Like it's predecessor it is a wrapper for the tools cutadapt, fastx-toolkit, and starcode.

Dependencies

  • cutadapt (sequence extraction)
  • starcode (sequence clustering)
  • fastx-toolkit (PHred score filtering)
  • pear (paired end read merging)
  • pysam (sam file convertion to fastq)

Recommended Installation Procedure

It's recommended to use conda to install and manage the dependencies for this package

conda env create -f https://raw.githubusercontent.com/brocklab/pycashier/main/environment.yml # or mamba env create -f ....
conda activate cashierenv
pycashier --help

Additionally you may install with pip. Though it will be up to you to ensure all the non-python dependencies are on the path and installed correctly.

pip install pycashier

Usage

Pycashier has one required argument which is the directory containing the fastq or sam files you wish to process.

conda activate cashierenv
pycashier ./fastqs

For additional parameters see pycashier -h.

As the files are processed two additional directories will be created pipeline and outs.

Currently all intermediary files generated as a result of the program will be found in pipeline.

While the final processed files will be found within the outs directory.

Merging Files

Pycashier can now take paired end reads and perform a merging of the reads to produce a fastq which can then be used with cashier's default feature.

pycashier ./fastqs -m

Processing Barcodes from 10X bam files

Pycashier can also extract gRNA barcodes along with 10X cell and umi barcodes.

Firstly we are only interested in the unmapped reads. From the cellranger bam output you would obtain these reads using samtools.

samtools view -f 4 possorted_genome_bam.bam > unmapped.sam

Then similar to normal barcode extraction you can pass a directory of these unmapped sam files to pycashier and extract barcodes. You can also still specify extraction parameters that will be passed to cutadapt as usual.

Note: The default parameters passed to cutadapt are unlinked adapters and minimum barcode length of 10 bp.

pycashier ./unmapped_sams -sc

When finished the outs directory will have a .tsv containing the following columns: Illumina Read Info, UMI Barcode, Cell Barcode, gRNA Barcode

Usage notes

Pycashier will NOT overwrite intermediary files. If there is an issue in the process, please delete either the pipeline directory or the requisite intermediary files for the sample you wish to reprocess. This will allow the user to place new fastqs within the source directory or a project folder without reprocessing all samples each time.

  • Currently, pycashier expects to find .fastq.gz files when merging and .fastq files when extracting barcodes. This behavior may change in the future.
  • If there are reads from multiple lanes they should first be concatenated with cat sample*R1*.fastq.gz > sample.R1.fastq.gz
  • Naming conventions:
    • Sample names are extracted from files using the first string delimited with a period. Please take this into account when naming sam or fastq files.
    • Each processing step will append information to the input file name to indicate changes, again delimited with periods.
Parametric Bottle in CADQuery

Parametric Bottle using CADQuery The proposed code makes it possible to generate different types and sizes of 3D bottles in order to train Pixel2mesh

Ayoub EL HOUDRI 1 May 22, 2022
This is a Python 3.10 port of mock, a library for manipulating human-readable message strings.

This is a Python 3.10 port of mock, a library for manipulating human-readable message strings.

Alexander Bartolomey 1 Dec 31, 2021
A tool to help the Poly copy-reading process! :D

PolyBot A tool to help the Poly copy-reading process! :D Let's face it-computers are better are repeatitive tasks. And, in spite of what one may want

1 Jan 10, 2022
Git Hooks Tutorial.

Git Hooks Tutorial My public talk about this project at Sberloga: Git Hooks Is All You Need 1. Git Hooks 101 Init git repo: mkdir git_repo cd git_repo

Dani El-Ayyass 17 Oct 12, 2022
Py-Parser est un parser de code python en python encore en plien dévlopement.

PY - PARSER Py-Parser est un parser de code python en python encore en plien dévlopement. Une fois achevé, il servira a de nombreux projets comme glad

pf4 3 Feb 21, 2022
This package tries to emulate the behaviour of syntax proposed in PEP 671 via a decorator

Late-Bound Arguments This package tries to emulate the behaviour of syntax proposed in PEP 671 via a decorator. Usage Mention the names of the argumen

Shakya Majumdar 0 Feb 06, 2022
Exploring basic lambda calculus in Python

Lambda Exploring basic lambda calculus in Python. In this repo I have used the lambda function built into python to get a more intiutive feel of lambd

Bhardwaj Bhaskar 2 Nov 12, 2021
Python bindings for `ign-msgs` and `ign-transport`

Python Ignition This project aims to provide Python bindings for ignition-msgs and ignition-transport. It is a work in progress... C++ and Python libr

Rhys Mainwaring 3 Nov 08, 2022
Statistics Calculator module for all types of Stats calculations.

Statistics-Calculator This Calculator user the formulas and methods to find the statistical values listed. Statistics Calculator module for all types

2 May 29, 2022
A Tool to validate domestic New Zealand vaccine passes

Vaccine Validator Tool to validate domestic New Zealand vaccine passes Create a new virtual environment: python3 -m venv ./venv Activate virtual envi

8 May 01, 2022
Python plugin for Krita that assists with making python plugins for Krita

Krita-PythonPluginDeveloperTools Python plugin for Krita that assists with making python plugins for Krita Introducing Python Plugin developer Tools!

18 Dec 01, 2022
My attempt at this years Advent of Code!

Advent-of-code-2021 My attempt at this years Advent of Code! day 1: ** day 2: ** day 3: ** day 4: ** day 5: ** day 6: ** day 7: ** day 8: * day 9: day

1 Jul 06, 2022
A VirtualBox manager with interactive mode

A VirtualBox manager with interactive mode

Luis Gerardo 1 Nov 21, 2021
Ghost source since the developer of the project quit due to reasons

👻 Ghost Selfbot The official code for Ghost which was recently discontinued and released to the public. Feel free to use any of the code found in thi

xannyy 2 Mar 24, 2022
fetchmesh is a tool to simplify working with Atlas anchoring mesh measurements

A Python library for working with the RIPE Atlas anchoring mesh. fetchmesh is a tool to simplify working with Atlas anchoring mesh measurements. It ca

2 Aug 30, 2022
Data and analysis relating to the 5.8M Melbourne quake of 2021

quake2021 Data and analysis relating to the 5.8M Melbourne quake of 2021 Monash University Woodside Living Lab Building The building is located here T

Colin Caprani 6 May 16, 2022
Visualize Data From Stray Scanner https://keke.dev/blog/2021/03/10/Stray-Scanner.html

StrayVisualizer A set of scripts to work with data collected using Stray Scanner. Usage Installing Dependencies Install dependencies with pip -r requi

Kenneth Blomqvist 45 Dec 30, 2022
Very Simple Zoom Spam Pinger!

Very Simple Zoom Spam Pinger!

Syntax. 2 Mar 05, 2022
Structured, dependable legos for starknet development.

Structured, dependable legos for starknet development.

Alucard 127 Nov 23, 2022
Convert Beat Saber maps to Tesla light shows!

Tesla x Beat Saber - Light Show Converter Convert Beat Saber maps to Tesla light shows! This project requires FFMPEG and all packages from requirement

HLVM 20 Dec 21, 2022