NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions

Last update: Nov 26, 2022

Overview

NeoDTI

NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions (Bioinformatics).

Recent Update 09/06/2018

L2 regularization is added.

Requirements

TensorFlow (tested on versions 1.0.1 and 1.2.0)
tflearn
numpy (tested on versions 1.13.3 and 1.14.0)
scikit-learn, imported as sklearn (tested on versions 0.18.1 and 0.19.0)

Quick start

To reproduce our results:

Unzip data.zip in ./data.
Run NeoDTI_cv.py to reproduce the cross validation results of NeoDTI. Options are:
-d: The embedding dimension d. Default: 1024.
-n: Global norm threshold used for clipping. Default: 1.
-k: The dimension of the projection matrices. Default: 512.
-r: Ratio of positive to negative examples. Two choices: ten and all. ten sets positive:negative = 1:10; all treats every unknown DTI as a negative example. Default: ten.
-t: Test scenario, i.e., the DTI matrix to be tested. Default: o. Choices are:
    o: mat_drug_protein.txt will be tested.
    homo: mat_drug_protein_homo_protein_drug.txt will be tested.
    drug: mat_drug_protein_drug.txt will be tested.
    disease: mat_drug_protein_disease.txt will be tested.
    sideeffect: mat_drug_protein_sideeffect.txt will be tested.
    unique: mat_drug_protein_drug_unique.txt will be tested.
Run NeoDTI_cv_with_aff.py to reproduce the cross validation results of NeoDTI with additional compound-protein binding affinity data. Options are:
-d: The embedding dimension d. Default: 1024.
-n: Global norm threshold used for clipping. Default: 1.
-k: The dimension of the projection matrices. Default: 512.
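
The two -r settings can be illustrated with a small NumPy sketch. This is not the code from NeoDTI_cv.py; the function name sample_dti_examples, its parameters, and the toy matrix are assumptions made for illustration only. It shows one plausible way to draw ten unknown pairs per known DTI (ten) or to keep every unknown pair as a negative (all).

```python
import numpy as np


def sample_dti_examples(dti, neg_per_pos=10, seed=0):
    """Return rows of (drug_index, protein_index, label).

    neg_per_pos=10 mimics the `ten` setting (positive:negative = 1:10).
    neg_per_pos=None mimics the `all` setting (every unknown pair is a negative).
    Hypothetical helper, not part of the NeoDTI code base.
    """
    rng = np.random.default_rng(seed)
    pos = np.argwhere(dti == 1)  # known interactions
    neg = np.argwhere(dti == 0)  # unknown pairs, treated as negatives
    if neg_per_pos is not None:
        n_neg = min(len(neg), neg_per_pos * len(pos))
        keep = rng.choice(len(neg), size=n_neg, replace=False)
        neg = neg[keep]
    labels = np.concatenate([np.ones(len(pos), dtype=int),
                             np.zeros(len(neg), dtype=int)])
    return np.column_stack([np.concatenate([pos, neg]), labels])


# Toy 6 x 8 drug-protein matrix with three known interactions.
toy = np.zeros((6, 8), dtype=int)
toy[0, 1] = toy[2, 3] = toy[4, 5] = 1

examples_ten = sample_dti_examples(toy, neg_per_pos=10)
examples_all = sample_dti_examples(toy, neg_per_pos=None)
```

With the toy matrix, examples_ten holds the 3 positives plus 30 sampled negatives, while examples_all holds all 48 drug-protein pairs.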

Data description

drug.txt: list of drug names.
protein.txt: list of protein names.
disease.txt: list of disease names.
se.txt: list of side effect names.
drug_dict_map: a complete ID mapping between drug names and DrugBank ID.
protein_dict_map: a complete ID mapping between protein names and UniProt ID.
mat_drug_se.txt: Drug-SideEffect association matrix.
mat_protein_protein.txt: Protein-Protein interaction matrix.
mat_drug_drug.txt: Drug-Drug interaction matrix.
mat_protein_disease.txt: Protein-Disease association matrix.
mat_drug_disease.txt: Drug-Disease association matrix.
mat_protein_drug.txt: Protein-Drug interaction matrix.
mat_drug_protein.txt: Drug-Protein interaction matrix.
Similarity_Matrix_Drugs.txt: Drug and compound similarity scores based on chemical structures (indices [0,708) are drugs, the rest are compounds).
Similarity_Matrix_Proteins.txt: Protein similarity scores based on primary sequences of proteins.
mat_drug_protein_homo_protein_drug.txt: Drug-Protein interaction matrix, in which DTIs with similar drugs (i.e., drug chemical structure similarities > 0.6) or similar proteins (i.e., protein sequence similarities > 40%) were removed (see the paper).
mat_drug_protein_drug.txt: Drug-Protein interaction matrix, in which DTIs with drugs sharing similar drug interactions (i.e., Jaccard similarities > 0.6) were removed (see the paper).
mat_drug_protein_sideeffect.txt: Drug-Protein interaction matrix, in which DTIs with drugs sharing similar side effects (i.e., Jaccard similarities > 0.6) were removed (see the paper).
mat_drug_protein_disease.txt: Drug-Protein interaction matrix, in which DTIs with drugs or proteins sharing similar diseases (i.e., Jaccard similarities > 0.6) were removed (see the paper).
mat_drug_protein_unique: Drug-Protein interaction matrix, in which known unique and non-unique DTIs are labelled 3 and 1, respectively, and the corresponding unknown ones are labelled 2 and 0 (see the paper for the definition of unique).
mat_compound_protein_bindingaffinity.txt: Compound-Protein binding affinity matrix (measured as the negative logarithm of Ki).
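
Several of the filtered matrices above rely on a Jaccard similarity threshold of 0.6. The sketch below shows how pairwise Jaccard similarities between the rows of a binary association matrix (for example, drugs by side effects) could be computed with NumPy. The function name pairwise_jaccard and the toy data are assumptions; the exact filtering procedure is the one described in the paper, not this snippet.

```python
import numpy as np


def pairwise_jaccard(binary):
    """Jaccard similarity between every pair of rows of a 0/1 matrix.

    Hypothetical helper for illustration. Rows with no associations at all
    get similarity 0 to avoid dividing by zero.
    """
    b = np.asarray(binary, dtype=bool).astype(int)
    inter = b @ b.T                                   # |A and B| for each row pair
    sizes = b.sum(axis=1)
    union = sizes[:, None] + sizes[None, :] - inter   # |A or B|
    return np.divide(inter, union,
                     out=np.zeros(inter.shape, dtype=float),
                     where=union > 0)


# Toy association matrix: 3 drugs x 4 side effects.
toy_assoc = np.array([[1, 1, 1, 0],
                      [1, 1, 1, 1],
                      [0, 0, 0, 1]])

sim = pairwise_jaccard(toy_assoc)
# Index pairs (i < j) whose similarity exceeds the 0.6 threshold.
similar_pairs = np.argwhere(np.triu(sim > 0.6, k=1))
```

In the toy data, rows 0 and 1 share three of four associations (similarity 0.75), so they are the only pair above the threshold.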

All entities (i.e., drugs, compounds, proteins, diseases and side effects) are organized in the same order across all files. The following files are extracted from https://github.com/luoyunan/DTINet: drug.txt, protein.txt, disease.txt, se.txt, drug_dict_map, protein_dict_map, mat_drug_se.txt, mat_protein_protein.txt, mat_drug_drug.txt, mat_protein_disease.txt, mat_drug_disease.txt, mat_protein_drug.txt, mat_drug_protein.txt and Similarity_Matrix_Proteins.txt.
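
Because the ordering is shared, row and column indices of a matrix can be mapped directly to entries of the name lists. The following sketch assumes that the name files hold one name per line and that the matrices are whitespace-delimited text readable by numpy.loadtxt; both are assumptions about the file layout, and the helper names and toy file contents are made up for illustration.

```python
import os
import tempfile

import numpy as np


def load_names(path):
    """Read one entity name per line, skipping blank lines."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]


def load_checked_matrix(path, row_names, col_names):
    """Load a matrix and verify that its shape matches the two name lists."""
    mat = np.loadtxt(path, ndmin=2)
    expected = (len(row_names), len(col_names))
    if mat.shape != expected:
        raise ValueError(f"{path}: shape {mat.shape}, expected {expected}")
    return mat


# Toy stand-ins for drug.txt, protein.txt and mat_drug_protein.txt.
with tempfile.TemporaryDirectory() as tmp:
    files = {
        "drug.txt": "drug_a\ndrug_b\n",
        "protein.txt": "prot_x\nprot_y\nprot_z\n",
        "mat_drug_protein.txt": "0 1 0\n1 0 1\n",
    }
    for name, text in files.items():
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write(text)

    drugs = load_names(os.path.join(tmp, "drug.txt"))
    proteins = load_names(os.path.join(tmp, "protein.txt"))
    dti = load_checked_matrix(os.path.join(tmp, "mat_drug_protein.txt"),
                              drugs, proteins)

# Row i is drugs[i] and column j is proteins[j], so indices map straight to names.
known = [(drugs[i], proteins[j]) for i, j in np.argwhere(dti == 1)]
```

Pointing the same helpers at the unzipped ./data directory should give an early shape check before any training run, provided the real files follow the assumed layout.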

Contacts

If you have any questions or comments, please feel free to email Fangping Wan (wfp15[at]tsinghua[dot]org[dot]cn) and/or Jianyang Zeng (zengjy321[at]tsinghua[dot]edu[dot]cn).
