This is a tool for inferring ancestral alleles, calculating the site frequency spectrum (SFS), and drawing its bar plot.

Last update: Dec 16, 2022

superSFS

This is a tool for inferring ancestral alleles, calculating the site frequency spectrum (SFS), and drawing its bar plot. It is easy to use and runs fast. You need to prepare three inputs: a phased VCF file containing the populations you are interested in together with the outgroup, an outgroup names file, and an annotation file. Enjoy!

It has four run modes, selected by the first argument (called the model):

0: Run the full pipeline, from the original VCF data to the SFS bar plot
1: Only infer the ancestral allele and write a new VCF file that uses the inferred allele as the reference
2: Only count the frequency of the derived allele at each SNP in each population
3: Only draw the SFS bar plot from the derived allele counts produced by the SFS calculation
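
To make models 2 and 3 more concrete, here is a minimal sketch of how an unfolded SFS can be tallied from per-SNP derived allele counts. It is not superSFS's own code: the function name `unfolded_sfs` and the input layout (one derived allele count per SNP, for a group with `n_haplotypes` phased haplotypes) are assumptions made for illustration.

```python
from collections import Counter


def unfolded_sfs(derived_counts, n_haplotypes):
    """Return a list whose entry i-1 is the number of SNPs at which the
    derived allele appears on exactly i haplotypes (i = 1 .. n_haplotypes - 1)."""
    tally = Counter(derived_counts)
    # Sites with 0 or n_haplotypes derived copies are not polymorphic
    # in this group, so they are left out of the spectrum.
    return [tally.get(i, 0) for i in range(1, n_haplotypes)]


# Hypothetical derived allele counts for six SNPs in a group of 4 haplotypes.
counts = [1, 1, 2, 3, 1, 4]
sfs = unfolded_sfs(counts, n_haplotypes=4)
print(sfs)  # [3, 1, 1]
```

The resulting list is what a bar plot of the SFS would show: one bar per derived allele count, with the bar height equal to the number of SNPs in that class.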

Example:

Model 0: python superSFS 0 ogdir threshold vcfdir annodir modir coutdir plotdir group
Model 1: python superSFS 1 ogdir threshold vcfdir outdir
Model 2: python superSFS 2 annodir modir coutdir
Model 3: python superSFS 3 coutdir plotdir group

Explanation of each parameter:

ogdir: path to the outgroup names file
threshold: a number; if the count of the variant allele across the outgroup is greater than this number, the variant allele is treated as the ancestral allele
vcfdir: path to the VCF data
annodir: path to the annotation file, with sample names in the first column and group names in the second column. This file must have a header in the first row
modir: output path for the VCF file generated with the inferred ancestral allele as the reference (written as outdir in the Model 1 example)
coutdir: output path for the derived allele counts for each SNP in each group
plotdir: output path for the SFS bar plot
group: the group that you want to analyze
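
The threshold rule can be sketched as follows. This is only an illustration under stated assumptions, not the tool's actual implementation: the function `polarize_site` is hypothetical, sites are assumed to be biallelic, and genotypes are assumed to be phased GT strings such as `0|1`, keyed by sample name.

```python
def polarize_site(ref, alt, genotypes, outgroup, threshold):
    """Decide the ancestral allele at one biallelic site.

    genotypes: dict mapping sample name -> phased GT string, e.g. "0|1".
    outgroup:  list of outgroup sample names (as listed in the ogdir file).
    Returns (ancestral, derived, genotypes recoded so that 0 = ancestral).
    """
    # Count copies of the variant (ALT) allele carried by the outgroup.
    alt_in_outgroup = sum(genotypes[s].split("|").count("1") for s in outgroup)
    if alt_in_outgroup <= threshold:
        # REF stays ancestral; genotypes are returned unchanged.
        return ref, alt, dict(genotypes)
    # ALT is treated as ancestral: swap the alleles and flip every call.
    # Missing calls (".") are passed through untouched.
    flip = {"0": "1", "1": "0"}
    recoded = {
        sample: "|".join(flip.get(a, a) for a in gt.split("|"))
        for sample, gt in genotypes.items()
    }
    return alt, ref, recoded


# Hypothetical site: the outgroup carries 3 ALT copies, above threshold 2.
gts = {"og1": "1|1", "og2": "1|0", "popA_1": "0|1", "popA_2": "0|0"}
ancestral, derived, recoded = polarize_site("A", "G", gts, ["og1", "og2"], 2)
print(ancestral, derived, recoded["popA_2"])  # G A 1|1
```

After recoding, counting the `1` alleles per group gives the derived allele count at that site.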
