This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

Overview

Introduction

This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

If you find this code useful, please cite the following paper:

@article{tan2022coherence,
  title = {Coherence-Based Distributed Document Representation Learning for Scientific Documents},
  author = {Tan, Shicheng and Zhao, Shu and Zhang, Yanping},
  journal = {arXiv},
  year = {2022},
  type = {Journal Article}
}

Run

  1. Installation environment (ref. requirements.txt)
  2. Download data: Link: https://pan.baidu.com/s/1EEJk0_P55Ov5ReXsmyVZPA Password: rkh0
  3. python _av_CTE.py

信息检索数据运行指南

  1. 数据处理(4个文件):使用“...data helper-IR.py”获取3份数据,原始数据处理暂存文件、原始数据处理暂存文件的语料、构建的数据集,然后使用“_aj_get dataset corpus.py”获得 构建的数据集的语料
  2. 词向量训练(4个文件):使用“_ak_get word embedding.py”训练第一步的2个语料得到2个词表和2个词向量文件,glove需要去除后缀名“.txt”
  3. 运行5次“_al_em-avg.py”得到5个结果,avg-word2vec、avg-word2vec(globe)、avg-glove、avg-glove(globe)、random embedding
  4. 运行“_ac_tf-idf.py”得到一个距离矩阵和1个结果,矩阵用于CTE方法
  5. LDA、doc2vec、BM25、LSI、GPT2、XLNet、GPT、Transformer-XL、XLM 对应文件各运行一次得到9个结果
  6. 运行“_ah_WMD.py”4次得到4个结果,WMD-word2vec、WMD-word2vec(globe)、WMD-glove、WMD-glove(globe)
  7. 运行“_at_BERT.py”2次得到2个结果,BERT-Large uncased、BERT-Large uncased(wwm)
  8. 运行“_at_ELMo.py”2次得到2个结果,ELMo-Original(5.5B)、ELMo-Original(5.5B,级联)
  9. 运行“_av_CET.py”13次得到13个结果,基于 random embedding 等13种基础词向量
Owner
tsc
Artificial general intelligence
tsc
The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".

Code for "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval" (ACL 2021, Long) This is the repository for baseline m

Akari Asai 25 Oct 30, 2022
Breast cancer is been classified into benign tumour and malignant tumour.

Breast cancer is been classified into benign tumour and malignant tumour. Logistic regression is applied in this model.

1 Feb 04, 2022
[ECCV'20] Convolutional Occupancy Networks

Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page | Blog Post This repository contains the implementation o

622 Dec 30, 2022
Construct a neural network frame by Numpy

本项目的CSDN博客链接:https://blog.csdn.net/weixin_41578567/article/details/111482022 1. 概览 本项目主要用于神经网络的学习,通过基于numpy的实现,了解神经网络底层前向传播、反向传播以及各类优化器的原理。 该项目目前已实现的功

24 Jan 22, 2022
A ssl analyzer which could analyzer target domain's certificate.

ssl_analyzer A ssl analyzer which could analyzer target domain's certificate. Analyze the domain name ssl certificate information according to the inp

vincent 17 Dec 12, 2022
CT-Net: Channel Tensorization Network for Video Classification

[ICLR2021] CT-Net: Channel Tensorization Network for Video Classification @inproceedings{ li2021ctnet, title={{\{}CT{\}}-Net: Channel Tensorization Ne

33 Nov 15, 2022
Composing methods for ML training efficiency

MosaicML Composer contains a library of methods, and ways to compose them together for more efficient ML training.

MosaicML 2.8k Jan 08, 2023
DGL-TreeSearch and the Gurobi-MWIS interface

Independent Set Benchmarking Suite This repository contains the code for our maximum independent set benchmarking suite as well as our implementations

Maximilian Böther 19 Nov 22, 2022
Syllabic Quantity Patterns as Rhythmic Features for Latin Authorship Attribution

Syllabic Quantity Patterns as Rhythmic Features for Latin Authorship Attribution Abstract Within the Latin (and ancient Greek) production, it is well

4 Dec 03, 2022
Official code release for "GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis"

GRAF This repository contains official code for the paper GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis. You can find detailed usage i

349 Dec 29, 2022
Listing arxiv - Personalized list of today's articles from ArXiv

Personalized list of today's articles from ArXiv Print and/or send to your gmail

Lilianne Nakazono 5 Jun 17, 2022
Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation", Haoxiang Wang, Han Zhao, Bo Li.

Bridging Multi-Task Learning and Meta-Learning Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Trainin

AI Secure 57 Dec 15, 2022
x-transformers-paddle 2.x version

x-transformers-paddle x-transformers-paddle 2.x version paddle 2.x版本 https://github.com/lucidrains/x-transformers 。 requirements paddlepaddle-gpu==2.2

yujun 7 Dec 08, 2022
Meandering In Networks of Entities to Reach Verisimilar Answers

MINERVA Meandering In Networks of Entities to Reach Verisimilar Answers Code and models for the paper Go for a Walk and Arrive at the Answer - Reasoni

Shehzaad Dhuliawala 271 Dec 13, 2022
PyMatting: A Python Library for Alpha Matting

Given an input image and a hand-drawn trimap (top row), alpha matting estimates the alpha channel of a foreground object which can then be composed onto a different background (bottom row).

PyMatting 1.4k Dec 30, 2022
Neural Style and MSG-Net

PyTorch-Style-Transfer This repo provides PyTorch Implementation of MSG-Net (ours) and Neural Style (Gatys et al. CVPR 2016), which has been included

Hang Zhang 904 Dec 21, 2022
Official PyTorch Implementation of Convolutional Hough Matching Networks, CVPR 2021 (oral)

Convolutional Hough Matching Networks This is the implementation of the paper "Convolutional Hough Matching Network" by J. Min and M. Cho. Implemented

Juhong Min 70 Nov 22, 2022
The final project of "Applying AI to 3D Medical Imaging Data" from "AI for Healthcare" nanodegree - Udacity.

Quantifying Hippocampus Volume for Alzheimer's Progression Background Alzheimer's disease (AD) is a progressive neurodegenerative disorder that result

Omar Laham 1 Jan 14, 2022
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation This reposi

First Person Vision @ Image Processing Laboratory - University of Catania 1 Aug 21, 2022
Omnidirectional camera calibration in python

Omnidirectional Camera Calibration Key features pure python initial solution based on A Toolbox for Easily Calibrating Omnidirectional Cameras (Davide

Thomas Pönitz 12 Nov 22, 2022