Tensorflow implementation of Character-Aware Neural Language Models.

Overview

Character-Aware Neural Language Models

Tensorflow implementation of Character-Aware Neural Language Models. The original code of author can be found here.

model.png

This implementation contains:

  1. Word-level and Character-level Convolutional Neural Network
  2. Highway Network
  3. Recurrent Neural Network Language Model

The current implementation has a performance issue. See #3.

Prerequisites

Usage

To train a model with ptb dataset:

$ python main.py --dataset ptb

To test an existing model:

$ python main.py --dataset ptb --forward_only True

To see all training options, run:

$ python main.py --help

which will print

usage: main.py [-h] [--epoch EPOCH] [--word_embed_dim WORD_EMBED_DIM]
              [--char_embed_dim CHAR_EMBED_DIM]
              [--max_word_length MAX_WORD_LENGTH] [--batch_size BATCH_SIZE]
              [--seq_length SEQ_LENGTH] [--learning_rate LEARNING_RATE]
              [--decay DECAY] [--dropout_prob DROPOUT_PROB]
              [--feature_maps FEATURE_MAPS] [--kernels KERNELS]
              [--model MODEL] [--data_dir DATA_DIR] [--dataset DATASET]
              [--checkpoint_dir CHECKPOINT_DIR]
              [--forward_only [FORWARD_ONLY]] [--noforward_only]
              [--use_char [USE_CHAR]] [--nouse_char] [--use_word [USE_WORD]]
              [--nouse_word]

optional arguments:
  -h, --help            show this help message and exit
  --epoch EPOCH         Epoch to train [25]
  --word_embed_dim WORD_EMBED_DIM
                        The dimension of word embedding matrix [650]
  --char_embed_dim CHAR_EMBED_DIM
                        The dimension of char embedding matrix [15]
  --max_word_length MAX_WORD_LENGTH
                        The maximum length of word [65]
  --batch_size BATCH_SIZE
                        The size of batch images [100]
  --seq_length SEQ_LENGTH
                        The # of timesteps to unroll for [35]
  --learning_rate LEARNING_RATE
                        Learning rate [1.0]
  --decay DECAY         Decay of SGD [0.5]
  --dropout_prob DROPOUT_PROB
                        Probability of dropout layer [0.5]
  --feature_maps FEATURE_MAPS
                        The # of feature maps in CNN
                        [50,100,150,200,200,200,200]
  --kernels KERNELS     The width of CNN kernels [1,2,3,4,5,6,7]
  --model MODEL         The type of model to train and test [LSTM, LSTMTDNN]
  --data_dir DATA_DIR   The name of data directory [data]
  --dataset DATASET     The name of dataset [ptb]
  --checkpoint_dir CHECKPOINT_DIR
                        Directory name to save the checkpoints [checkpoint]
  --forward_only [FORWARD_ONLY]
                        True for forward only, False for training [False]
  --noforward_only
  --use_char [USE_CHAR]
                        Use character-level language model [True]
  --nouse_char
  --use_word [USE_WORD]
                        Use word-level language [False]
  --nouse_word

but more options can be found in models/LSTMTDNN and models/TDNN.

Performance

Failed to reproduce the results of paper (2016.02.12). If you are looking for a code that reproduced the paper's result, see https://github.com/mkroutikov/tf-lstm-char-cnn.

loss

The perplexity on the test sets of Penn Treebank (PTB) corpora.

Name Character embed LSTM hidden units Paper (Y Kim 2016) This repo.
LSTM-Char-Small 15 100 92.3 in progress
LSTM-Char-Large 15 150 78.9 in progress

Author

Taehoon Kim / @carpedm20

Owner
Taehoon Kim
ex OpenAI
Taehoon Kim
Face Mask Detection on Image and Video using tensorflow and keras

Face-Mask-Detection Face Mask Detection on Image and Video using tensorflow and keras Train Neural Network on face-mask dataset using tensorflow and k

Nahid Ebrahimian 12 Nov 11, 2022
Real-time multi-object tracker using YOLO v5 and deep sort

This repository contains a two-stage-tracker. The detections generated by YOLOv5, a family of object detection architectures and models pretrained on the COCO dataset, are passed to a Deep Sort algor

Mike 3.6k Jan 05, 2023
Udacity Suse Cloud Native Foundations Scholarship Course Walkthrough

SUSE Cloud Native Foundations Scholarship Udacity is collaborating with SUSE, a global leader in true open source solutions, to empower developers and

Shivansh Srivastava 34 Oct 18, 2022
Supervised domain-agnostic prediction framework for probabilistic modelling

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data

The Alan Turing Institute 112 Oct 23, 2022
A self-supervised learning framework for audio-visual speech

AV-HuBERT (Audio-Visual Hidden Unit BERT) Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Robust Self-Supervised A

Meta Research 431 Jan 07, 2023
Predicting future trajectories of people in cameras of novel scenarios and views.

Pedestrian Trajectory Prediction Predicting future trajectories of pedestrians in cameras of novel scenarios and views. This repository contains the c

8 Sep 03, 2022
Simultaneous Detection and Segmentation

Simultaneous Detection and Segmentation This is code for the ECCV Paper: Simultaneous Detection and Segmentation Bharath Hariharan, Pablo Arbelaez,

Bharath Hariharan 96 Jul 20, 2022
When BERT Plays the Lottery, All Tickets Are Winning

When BERT Plays the Lottery, All Tickets Are Winning Large Transformer-based models were shown to be reducible to a smaller number of self-attention h

Sai 16 Nov 10, 2022
Exploration of some patients clinical variables.

Answer_ALS_clinical_data Exploration of some patients clinical variables. All the clinical / metadata data is available here: https://data.answerals.o

1 Jan 20, 2022
Pytorch modules for paralel models with same architecture. Ideal for multi agent-based systems

WideLinears Pytorch parallel Neural Networks A package of pytorch modules for fast paralellization of separate deep neural networks. Ideal for agent-b

1 Dec 17, 2021
Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts

Face mask detection Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts in order to detect face masks in static im

Vaibhav Shukla 1 Oct 27, 2021
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

Jiwoon Ahn 337 Dec 15, 2022
This repository contains a pytorch implementation of "StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision".

StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision | Project Page | Paper | This repository contains a pytorch implementation of "St

87 Dec 09, 2022
The dynamics of representation learning in shallow, non-linear autoencoders

The dynamics of representation learning in shallow, non-linear autoencoders The package is written in python and uses the pytorch implementation to ML

Maria Refinetti 4 Jun 08, 2022
Massively parallel Monte Carlo diffusion MR simulator written in Python.

Disimpy Disimpy is a Python package for generating simulated diffusion-weighted MR signals that can be useful in the development and validation of dat

Leevi 16 Nov 11, 2022
Official PyTorch Implementation of Unsupervised Learning of Scene Flow Estimation Fusing with Local Rigidity

UnRigidFlow This is the official PyTorch implementation of UnRigidFlow (IJCAI2019). Here are two sample results (~10MB gif for each) of our unsupervis

Liang Liu 28 Nov 16, 2022
TransReID: Transformer-based Object Re-Identification

TransReID: Transformer-based Object Re-Identification [arxiv] The official repository for TransReID: Transformer-based Object Re-Identification achiev

569 Dec 30, 2022
This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT).

Dynamic-Vision-Transformer (Pytorch) This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT). Not All Ima

210 Dec 18, 2022
Code for Learning Manifold Patch-Based Representations of Man-Made Shapes, in ICLR 2021.

LearningPatches | Webpage | Paper | Video Learning Manifold Patch-Based Representations of Man-Made Shapes Dmitriy Smirnov, Mikhail Bessmeltsev, Justi

Dima Smirnov 22 Nov 14, 2022