This is the repo for the paper `SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization'. (published in Bioinformatics'21)

Overview

SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization

This is the code for our paper ``SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization'' (published in Bioinformatics'21) [link].

Install

git clone [email protected]:yueyu1030/SumGNN.git
cd SumGNN
pip install -r requirements.txt

Example

python train.py 
    -d drugbank         # task
    -e ddi_hop3         # the name for the log for experiments
    --gpu=0             # ID of GPU
    --hop=3             # size of the hops for subgraph
    --batch=256         # batch size for samples
    --emb_dim=32        # size of embedding for GNN layers
    -b=10               # size of basis for relation kernel

You can also change the d to BioSNAP. Please change the e accordingly. The trained model and the logs are stored in experiments folder. Note that to ensure a fair comparison, we test all models on the same negative triplets.

Dataset

We provide the dataset in the data folder.

Data Source Description
Drugbank This link A drug-drug interaction network betweeen 1,709 drugs with 136,351 interactions.
TWOSIDES This link A drug-drug interaction network betweeen 645 drugs with 46221 interactions.
Hetionet This link The knowledge graph containing 33,765 nodes out of 11 types (e.g., gene, disease, pathway,molecular function and etc.) with 1,690,693 edges from 23 relation types after preprocessing (To ensure no information leakage, we remove all the overlapping edges between HetioNet and the dataset).

Knowledge Graph Embedding

We train the knowledge graph embedding based on the framework in OpenKE.

To obtain the embedding on your own, you need to first feed the triples in train.txt (edges in dataset) and relations_2hop.txt (edges in KG) as edges into their toolkit and obtain the embeddings for each node. Then, you can incorporate this embedding into our framework by modifying the line 44-45 in model/dgl/rgcn_model.py.

Cite Us

Please kindly cite this paper if you find it useful for your research. Thanks!

@article{yu2021sumgnn,
  title={Sumgnn: Multi-typed drug interaction prediction via efficient knowledge graph summarization},
  author={Yu, Yue and Huang, Kexin and Zhang, Chao and Glass, Lucas M and Sun, Jimeng and Xiao, Cao},
  journal={Bioinformatics},
  year={2021}
}

Acknowledgement

The code framework is based on GraIL.

Physics-Aware Training (PAT) is a method to train real physical systems with backpropagation.

Physics-Aware Training (PAT) is a method to train real physical systems with backpropagation. It was introduced in Wright, Logan G. & Onodera, Tatsuhiro et al. (2021)1 to train Physical Neural Networ

McMahon Lab 230 Jan 05, 2023
a reccurrent neural netowrk that when trained on a peice of text and fed a starting prompt will write its on 250 character text using LSTM layers

RNN-Playwrite a reccurrent neural netowrk that when trained on a peice of text and fed a starting prompt will write its on 250 character text using LS

Arno Barton 1 Oct 29, 2021
Galileo library for large scale graph training by JD

近年来,图计算在搜索、推荐和风控等场景中获得显著的效果,但也面临超大规模异构图训练,与现有的深度学习框架Tensorflow和PyTorch结合等难题。 Galileo(伽利略)是一个图深度学习框架,具备超大规模、易使用、易扩展、高性能、双后端等优点,旨在解决超大规模图算法在工业级场景的落地难题,提

JD Galileo Team 128 Nov 29, 2022
git《Beta R-CNN: Looking into Pedestrian Detection from Another Perspective》(NeurIPS 2020) GitHub:[fig3]

Beta R-CNN: Looking into Pedestrian Detection from Another Perspective This is the pytorch implementation of our paper "[Beta R-CNN: Looking into Pede

35 Sep 08, 2021
GDSC-ML Team Interview Task

GDSC-ML-Team---Interview-Task Task 1 : Clean or Messy room In this task we have to classify the given test images as clean or messy. - Link for datase

Aayush. 1 Jan 19, 2022
Tensorflow-Project-Template - A best practice for tensorflow project template architecture.

Tensorflow Project Template A simple and well designed structure is essential for any Deep Learning project, so after a lot of practice and contributi

Mahmoud G. Salem 3.6k Dec 22, 2022
LAnguage Model Analysis

LAMA: LAnguage Model Analysis LAMA is a probe for analyzing the factual and commonsense knowledge contained in pretrained language models. The dataset

Meta Research 960 Jan 08, 2023
Dynamics-aware Adversarial Attack of 3D Sparse Convolution Network

Leaded Gradient Method (LGM) This repository contains the PyTorch implementation for paper Dynamics-aware Adversarial Attack of 3D Sparse Convolution

An Tao 2 Oct 18, 2022
Iowa Project - My second project done at General Assembly, focused on feature engineering and understanding Linear Regression as a concept

Project 2 - Ames Housing Data and Kaggle Challenge PROBLEM STATEMENT Inferring or Predicting? What's more valuable for a housing model? When creating

Adam Muhammad Klesc 1 Jan 03, 2022
My solution for the 7th place / 245 in the Umoja Hack 2022 challenge

Umoja Hack 2022 : Insurance Claim Challenge My solution for the 7th place / 245 in the Umoja Hack 2022 challenge Umoja Hack Africa is a yearly hackath

Souames Annis 17 Jun 03, 2022
This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

Introduction This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents". If

tsc 0 Jan 11, 2022
Creating multimodal multitask models

Fusion Brain Challenge The English version of the document can be found here. Обновления 01.11 Мы выкладываем пример данных, аналогичных private test

Sber AI 43 Nov 28, 2022
Leaderboard, taxonomy, and curated list of few-shot object detection papers.

Leaderboard, taxonomy, and curated list of few-shot object detection papers.

Gabriel Huang 70 Jan 07, 2023
COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping

COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping Version 1.0 COVINS is an accurate, scalable, and versatile vis

ETHZ V4RL 183 Dec 27, 2022
KE-Dialogue: Injecting knowledge graph into a fully end-to-end dialogue system.

Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems This is the implementation of the paper: Learning Knowledge Bases with Par

CAiRE 42 Nov 10, 2022
Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.

Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.

Stephen James 51 Dec 27, 2022
The offcial repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGIR2022

CharacterBERT-DR The offcial repository for CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos, Sh

ielab 11 Nov 15, 2022
Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetu

3 Dec 05, 2022
A no-BS, dead-simple training visualizer for tf-keras

A no-BS, dead-simple training visualizer for tf-keras TrainingDashboard Plot inter-epoch and intra-epoch loss and metrics within a jupyter notebook wi

Vibhu Agrawal 3 May 28, 2021
The final project of "Applying AI to EHR Data" of "AI for Healthcare" nanodegree - Udacity.

Patient Selection for Diabetes Drug Testing Project Overview EHR data is becoming a key source of real-world evidence (RWE) for the pharmaceutical ind

Omar Laham 1 Jan 14, 2022