PyTorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective"

Last update: Sep 14, 2022

Overview

Graph Neural Topic Model (GNTM)

This is the PyTorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective".

Requirements

Python >= 3.6
PyTorch == 1.6.0
torch-geometric == 1.7.0
torch-scatter == 2.0.6
torch-sparse == 0.6.9
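
Because the versions above are pinned exactly, it can help to confirm what is installed before running anything. The following is a minimal sketch, not part of the repository: it assumes Python 3.8 or later (for importlib.metadata) and assumes the pip distribution names torch, torch-geometric, torch-scatter, and torch-sparse. The helper names installed_version and check_requirements are hypothetical.

```python
from importlib import metadata  # standard library from Python 3.8 onward

# Pins copied from the Requirements list above; the keys are the assumed
# pip distribution names.
REQUIRED = {
    "torch": "1.6.0",
    "torch-geometric": "1.7.0",
    "torch-scatter": "2.0.6",
    "torch-sparse": "0.6.9",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(required=REQUIRED, lookup=installed_version):
    """Map each package name to a (wanted, found, matches) tuple."""
    report = {}
    for package, wanted in required.items():
        found = lookup(package)
        # Ignore a local build suffix such as "+cu101" when comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[package] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in check_requirements().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{package}: wanted {wanted}, found {found} -> {status}")
```

Run as a script, this prints one line per package and flags any that are missing or at a different version.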

Dataset

The datasets are available at the following links:

The GloVe word embeddings can be downloaded from this link.

The datasets and word embeddings should be placed at the paths specified in settings.py.

Usage

Before training GNTM, preprocess the data with the following scripts. Some parameters need to be adjusted for each dataset, as described in the paper:

cd dataPrepare
python preprocess.py
python graph_data.py

Example script to train GNTM:

python main.py \
--device cuda:0 \
--dataset News20 \
--model GDGNNMODEL \
--num_topic 20 \
--num_epoch 400 \
--ni 300 \
--word \
--taskid 0 \
--nwindow 3

Here,

--dataset specifies the dataset name. Supported values are News20, TMN, BNC, and Reuters, for 20 Newsgroups, Tag My News, British National Corpus, and Reuters, respectively.
--device specifies the computation device, such as cpu or cuda:0.
--model specifies the model to use; GDGNNMODEL corresponds to GNTM.
--num_topic specifies the number of topics.
--num_epoch specifies the maximum number of training epochs.
--ni specifies the dimension of the word embeddings.
--taskid corresponds to the random seed.
--nwindow specifies the window size used to construct document graphs.
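
To illustrate what a window size means for document graphs, the sketch below links words that co-occur within a sliding window over a tokenized document. It is an assumption-laden illustration, not the code in graph_data.py: the function name build_document_graph is hypothetical, and the paper's actual construction (edge direction, weighting, vocabulary filtering) may differ.

```python
from collections import Counter


def build_document_graph(tokens, window=3):
    """Build an undirected, weighted co-occurrence graph for one document.

    Two distinct words are linked when they fall inside the same span of
    `window` consecutive tokens. The edge weight counts how often that
    happens. This is an illustrative construction only.
    """
    edges = Counter()
    for i, left in enumerate(tokens):
        # Pair token i with the next (window - 1) tokens.
        for right in tokens[i + 1 : i + window]:
            if left != right:
                # Sort the pair so (a, b) and (b, a) share one key.
                edges[tuple(sorted((left, right)))] += 1
    return edges


if __name__ == "__main__":
    doc = ["graph", "neural", "topic", "model"]
    for (u, v), weight in sorted(build_document_graph(doc, window=3).items()):
        print(u, v, weight)
```

With window=3, "graph" is linked to "neural" and "topic" but not to "model", which lies outside its window. A larger window produces a denser graph.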

Reference

If you find our methods or code helpful, please kindly cite the paper:

@inproceedings{shen2021topic,
  title={Topic Modeling Revisited: A Document Graph-based Neural Network Perspective},
  author={Shen, Dazhong and Qin, Chuan and Wang, Chao and Dong, Zheng and Zhu, Hengshu and Xiong, Hui},
  booktitle={Proceedings of Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS-2021)},
  year={2021}
}

Owner

Dazhong Shen
