This project deals with a simplified version of a more general problem of Aspect Based Sentiment Analysis.

Overview

Aspect_Based_Sentiment_Extraction

Created on: 5th Jan, 2022.

This project deals with an important field of Natural Lnaguage Processing - Aspect Based Sentiment Analysis (ABSA). But the problem statement here is rather a simplified version of the more general ABSA.
Aspect-Based Sentiment analysis is a type of text analysis that categorizes opinions by aspect and identifies the sentiment related to each aspect. Aspects are important words that are of importance to a business or organization, where they want to be able to provide their customers with insights on how their customers feel about these important words.
The general ABSA problem, which is an active area of machine learning research, is about finding all the possible aspects and the corresponding sentiments associated with those aspects in a given text or a document. For example, given a sentence like “I like apples very much, but I hate kiwi”, an ideal absa system should be able to identify aspects like apples and kiwi with correct sentiments of positive and negative respectively.
But here, in the problem statement that this project deals with, an aspect word/phrase is already given from the given text, which means that our problem is rather simplified and we don’t need to worry about the complex task of identifying aspects as well in the text, at least for this problem statement that I am dealing with. In future, I will be working with the more general version of this problem, where aspects are also needed to be indentified.


A brief description of approach

This article explores the use of a pre-trained language model, BERT (Bidirectional Encoder Representaton from Transformers), for the purpose of solving the aforementioned problem. BERT offers very robust contextual embeddings which are useful to solve the variety of problems. Therefore, the sole idea here is to explore the modelling capabilities of the BERT embeddings, by making use of the sentence pair input for the aspect sentiment prediction task. The model which I came up with was able to achieve 99.40% accuracy on the training data and 96.16% accuracy on the test data.

Instructions to run and test files

Clone this repository and navigate to the project folder:
git clone https://github.com/stardust-88/Aspect_Based_Sentiment_Extraction.git
cd Aspect_Based_sentiment_Extraction

To install the dependencies:
pip3 install -r requirements.txt

To train:
Navigate to the src folder and run the below command:
python train.py

For inference:
Navigate to the src folder and run the below command:
python inference.py

Instructions for using trained model weights

I have saved my trained weights to google drive and generated the link, which can be used to download the same. This can be done through below steps.

  1. Navigate to the the models directory.
  2. When inside the models directory, run the file download_model.py: python download_model.py

So, if the user wants to do the inference using pre-trained weights, first download the weights following above two steps, then then run the inference.py script.

Results from the model

  1. Accuracy curve:

  1. Loss curve:

  1. Classification report:

  1. Confusion matrix:

Owner
Naman Rastogi
An undergraduate in Computer Science and Engineering. Trying to discover fundamental patterns with machine learning.
Naman Rastogi
Topic Modelling for Humans

gensim – Topic Modelling in Python Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ

RARE Technologies 13.8k Jan 02, 2023
Simple program that translates the name of files into English

Simple program that translates the name of files into English. Useful for when editing/inspecting programs that were developed in a foreign language.

0 Dec 22, 2021
Code repository of the paper Neural circuit policies enabling auditable autonomy published in Nature Machine Intelligence

Code repository of the paper Neural circuit policies enabling auditable autonomy published in Nature Machine Intelligence

9 Jan 08, 2023
Code for the project carried out fulfilling the course requirements for Fall 2021 NLP at NYU

Introduction Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization,

Sai Himal Allu 1 Apr 25, 2022
Official source for spanish Language Models and resources made @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

Spanish Language Models 💃🏻 Corpora 📃 Corpora Number of documents Size (GB) BNE 201,080,084 570GB Models 🤖 RoBERTa-base BNE: https://huggingface.co

PlanTL-SANIDAD 203 Dec 20, 2022
Final Project for the Intel AI Readiness Boot Camp NLP (Jan)

NLP Boot Camp (Jan) Synopsis Full Name: Prameya Mohanty Name of your School: Delhi Public School, Rourkela Class: VIII Title of the Project: iTransect

TheCodingHub 1 Feb 01, 2022
Python code for ICLR 2022 spotlight paper EViT: Expediting Vision Transformers via Token Reorganizations

Expediting Vision Transformers via Token Reorganizations This repository contain

Youwei Liang 101 Dec 26, 2022
Creating a chess engine using GPT-3

GPT3Chess Creating a chess engine using GPT-3 Code for my article : https://towardsdatascience.com/gpt-3-play-chess-d123a96096a9 My game (white) vs GP

19 Dec 17, 2022
Google's Meena transformer chatbot implementation

Here's my attempt at recreating Meena, a state of the art chatbot developed by Google Research and described in the paper Towards a Human-like Open-Domain Chatbot.

Francesco Pham 94 Dec 25, 2022
Arabic speech recognition, classification and text-to-speech.

klaam Arabic speech recognition, classification and text-to-speech using many advanced models like wave2vec and fastspeech2. This repository allows tr

ARBML 177 Dec 27, 2022
News-Articles-and-Essays - NLP (Topic Modeling and Clustering)

NLP T5 Project proposal Topic Modeling and Clustering of News-Articles-and-Essays Students: Nasser Alshehri Abdullah Bushnag Abdulrhman Alqurashi OVER

2 Jan 18, 2022
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

ParlAI (pronounced “par-lay”) is a python framework for sharing, training and testing dialogue models, from open-domain chitchat, to task-oriented dia

Facebook Research 9.7k Jan 09, 2023
StarGAN - Official PyTorch Implementation

StarGAN - Official PyTorch Implementation ***** New: StarGAN v2 is available at https://github.com/clovaai/stargan-v2 ***** This repository provides t

Yunjey Choi 5.1k Dec 30, 2022
Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning

Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning English | 中文 ❗ Now we provide inferencing code and pre-training models

164 Jan 02, 2023
NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations

NL-Augmenter 🦎 → 🐍 The NL-Augmenter is a collaborative effort intended to add transformations of datasets dealing with natural language. Transformat

684 Jan 09, 2023
Mycroft Core, the Mycroft Artificial Intelligence platform.

Mycroft Mycroft is a hackable open source voice assistant. Table of Contents Getting Started Running Mycroft Using Mycroft Home Device and Account Man

Mycroft 6.1k Jan 09, 2023
Torchrecipes provides a set of reproduci-able, re-usable, ready-to-run RECIPES for training different types of models, across multiple domains, on PyTorch Lightning.

Recipes are a standard, well supported set of blueprints for machine learning engineers to rapidly train models using the latest research techniques without significant engineering overhead.Specifica

Meta Research 193 Dec 28, 2022
Convolutional Neural Networks for Sentence Classification

Convolutional Neural Networks for Sentence Classification Code for the paper Convolutional Neural Networks for Sentence Classification (EMNLP 2014). R

Yoon Kim 2k Jan 02, 2023
Crowd sourced training data for Rasa NLU models

NLU Training Data Crowd-sourced training data for the development and testing of Rasa NLU models. If you're interested in grabbing some data feel free

Rasa 169 Dec 26, 2022
Machine Learning Course Project, IMDB movie review sentiment analysis by lstm, cnn, and transformer

IMDB Sentiment Analysis This is the final project of Machine Learning Courses in Huazhong University of Science and Technology, School of Artificial I

Daniel 0 Dec 27, 2021