For encoding a text longer than 512 tokens, for example 800. Set max_pos to 800 during both preprocessing and training.

Overview

LongScientificFormer

For encoding a text longer than 512 tokens, for example 800. Set max_pos to 800 during both preprocessing and training.

Some codes are borrowed from ONMT(https://github.com/OpenNMT/OpenNMT-py)

Data Preparation

Option 1: download the processed data

Pre-processed data

Put all files into raw_data directory

Step 2. Download Stanford CoreNLP

We will need Stanford CoreNLP to tokenize the data. Download it here and unzip it. Then add the following command to your bash_profile:

export CLASSPATH=/path/to/stanford-corenlp-4.2.2/stanford-corenlp-4.2.2.jar

replacing /path/to/ with the path to where you saved the stanford-corenlp-4.2.2 directory.

step 3. extracting sections from GROBID XML files

python preprocess.py -mode extract_pdf_sections -log_file ../logs/extract_section.log

step 4. extracting text from TIKA XML files

python preprocess.py -mode get_text_clean_tika -log_file ../logs/extract_tika_text.log

step 5. Tokenize texts from papers and slides using stanfordCoreNLP

python preprocess.py -mode tokenize  -save_path ../temp -log_file ../logs/tokenize_by_corenlp.log

Step 6. Extract source, section, and target from tokenized files

python preprocess.py -mode clean_paper_jsons -save_path ../json_data/  -n_cpus 10 -log_file ../logs/build_json.log

Step 7. Generate BERT .pt files from source, sections and targets

python preprocess.py -mode format_to_bert -raw_path ../json_data/ -save_path ../bert_data  -lower -n_cpus 40 -log_file ../logs/build_bert_files.log

Model Training

First run: For the first time, you should use single-GPU, so the code can download the BERT model. Use -visible_gpus -1, after downloading, you could kill the process and rerun the code with multi-GPUs.

Train

python train.py  -ext_dropout 0.1 -lr 2e-3  -visible_gpus 1,2,3 -report_every 200 -save_checkpoint_steps 1000 -batch_size 1 -train_steps 100000 -accum_count 2  -log_file ../logs/ext_bert -use_interval true -warmup_steps 10000

To continue training from a checkpoint

python train.py  -ext_dropout 0.1 -lr 2e-3  -train_from ../models/model_step_99000.pt -visible_gpus 1,2,3 -report_every 200 -save_checkpoint_steps 1000 -batch_size 1 -train_steps 100000 -accum_count 2  -log_file ../logs/ext_bert -use_interval true -warmup_steps 10000

Test

python train.py -mode test  -test_batch_size 1 -bert_data_path ../bert_data -log_file ../logs/ext_bert_test -test_from ../models/model_step_99000.pt -model_path ../models -sep_optim true -use_interval true -visible_gpus 1,2,3 -alpha 0.95 -result_path ../results/ext 
Owner
Athar Sefid
Athar Sefid
Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation (ICCV 2021)

Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation Home | PyTorch BigGAN Discovery | TensorFlow ProGAN Regulariza

Yuxiang Wei 54 Dec 30, 2022
[NeurIPS 2021] "Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found within Randomly Initialized Networks" by Yonggan Fu, Qixuan Yu, Yang Zhang, Shang Wu, Xu Ouyang, David Cox, Yingyan Lin

Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found within Randomly Initialized Networks Yonggan Fu, Qixuan Yu, Yang Zhang, S

12 Dec 11, 2022
TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers.

TransMVSNet This repository contains the official implementation of the paper: "TransMVSNet: Global Context-aware Multi-view Stereo Network with Trans

旷视研究院 3D 组 155 Dec 29, 2022
Implementation of StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation in PyTorch

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation Implementation of StyleSpace Analysis: Disentangled Controls for StyleGAN Ima

Xuanchi Ren 86 Dec 07, 2022
Mercer Gaussian Process (MGP) and Fourier Gaussian Process (FGP) Regression

Mercer Gaussian Process (MGP) and Fourier Gaussian Process (FGP) Regression We provide the code used in our paper "How Good are Low-Rank Approximation

Aristeidis (Ares) Panos 0 Dec 13, 2021
Pytorch code for our paper "Feedback Network for Image Super-Resolution" (CVPR2019)

Feedback Network for Image Super-Resolution [arXiv] [CVF] [Poster] Update: Our proposed Gated Multiple Feedback Network (GMFN) will appear in BMVC2019

Zhen Li 539 Jan 06, 2023
A PyTorch implementation for Unsupervised Domain Adaptation by Backpropagation(DANN), support Office-31 and Office-Home dataset

DANN A PyTorch implementation for Unsupervised Domain Adaptation by Backpropagation Prerequisites Linux or OSX NVIDIA GPU + CUDA (may CuDNN) and corre

8 Apr 16, 2022
😮The official implementation of "CoNeRF: Controllable Neural Radiance Fields" 😮

CoNeRF: Controllable Neural Radiance Fields This is the official implementation for "CoNeRF: Controllable Neural Radiance Fields" Project Page Paper V

Kacper Kania 61 Dec 24, 2022
Rule based classification A hotel s customers dataset

Rule-based-classification-A-hotel-s-customers-dataset- Aim: Categorize new customers by segment and predict how much revenue they can generate This re

Şebnem 4 Jan 02, 2022
A simple API wrapper for Discord interactions.

Your ultimate Discord interactions library for discord.py. About | Installation | Examples | Discord | PyPI About What is discord-py-interactions? dis

james 641 Jan 03, 2023
A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution

DRSAN A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution Karam Park, Jae Woong Soh, and Nam Ik Cho Environments U

4 May 10, 2022
Build upon neural radiance fields to create a scene-specific implicit 3D semantic representation, Semantic-NeRF

Semantic-NeRF: Semantic Neural Radiance Fields Project Page | Video | Paper | Data In-Place Scene Labelling and Understanding with Implicit Scene Repr

Shuaifeng Zhi 243 Jan 07, 2023
Self-supervised spatio-spectro-temporal represenation learning for EEG analysis

EEG-Oriented Self-Supervised Learning and Cluster-Aware Adaptation This repository provides a tensorflow implementation of a submitted paper: EEG-Orie

Wonjun Ko 4 Jun 09, 2022
CBREN: Convolutional Neural Networks for Constant Bit Rate Video Quality Enhancement

CBREN This is the Pytorch implementation for our IEEE TCSVT paper : CBREN: Convolutional Neural Networks for Constant Bit Rate Video Quality Enhanceme

Zhao Hengrun 3 Nov 04, 2022
Full Stack Deep Learning Labs

Full Stack Deep Learning Labs Welcome! Project developed during lab sessions of the Full Stack Deep Learning Bootcamp. We will build a handwriting rec

Full Stack Deep Learning 1.2k Dec 31, 2022
The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

TDANet: Text-Guided Neural Image Inpainting, MM'2020 (Oral) MM | ArXiv This repository implements the paper "Text-Guided Neural Image Inpainting" by L

LisaiZhang 75 Dec 22, 2022
PyTorch implementation for SDEdit: Image Synthesis and Editing with Stochastic Differential Equations

SDEdit: Image Synthesis and Editing with Stochastic Differential Equations Project | Paper | Colab PyTorch implementation of SDEdit: Image Synthesis a

536 Jan 05, 2023
DL & CV-based indicator toolset for the vehicle drivers via live dash-cam footage.

Vehicle Indicator Toolset Deep Learning and Computer Vision based indicator toolset for vehicle drivers using live dash-cam footages. Tracking of vehi

Alex Xu 12 Dec 28, 2021
PyTorch implementation of the implicit Q-learning algorithm (IQL)

Implicit-Q-Learning (IQL) PyTorch implementation of the implicit Q-learning algorithm IQL (Paper) Currently only implemented for online learning. Offl

Sebastian Dittert 27 Dec 30, 2022
Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

SegSwap Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery" [PDF] [Project page] If our project

xshen 41 Dec 10, 2022