This Repository is an up-to-date version of Harvard nlp's Legacy code and a Refactoring of the jupyter notebook version as a shell script version.

Overview

2022_the_annotated_transformer

Goal

This Repository is an up-to-date version of Harvard nlp's Legacy code and a Refactoring of the jupyter notebook version as a shell script version.

Key points

  • We have re-factored Harvard NLP's Annotated Trasformer into a shell script version.

  • Dataset utilized Multi30K. (The dataset is small, so you can see the results quickly even on computers with low specifications.)

  • We provide the Colab version along with the shell script version, making it easy to modify the model and test the method.

    https://colab.research.google.com/drive/1SrRmC_Ti8IepeHFNBZBjNxl_wkTSJReC?usp=sharing

  • Loss Graph can be drawn.

  • BLEU Score can be measured.

file structure

├── models
│   ├── __init__.py
│   ├── blocks
│   │   ├── __init__.py
│   │   ├── decoder_layer.py
│   │   ├── encoder_layer.py
│   ├── embedding
│   │   ├── __init__.py
│   │   ├── positional_encoding.py
│   │   └── token_embedding.py
│   ├── layers
│   │   ├── __init__.py
│   │   ├── layer_norm.py
│   │   ├── multi_headed_attention.py
│   │   ├── position_wise_feed_forward.py
│   │   └── sublayer_connection.py
│   ├── model
│   │   ├── __init__.py
│   │   ├── decoder.py
│   │   ├── encoder_decoder.py
│   │   ├── encoder.py
│   │   ├── generator.py
│   └── util.py
├── result
│   ├── loss_graph.png
│   ├── train_loss.txt
│   └── valid_loss.txt
├── saved
├── utils
    ├── __init__.py
    ├── batch.py
    ├── batch_size_fn.py
    ├── bleu.py
    ├── data_loader.py
    ├── epoch_time.py
    ├── greedy_decode.py
    ├── label_smoothing.py
    ├── make_model.py
    ├── NoamOpt.py
    ├── run_epoch.py
    ├── simple_loss_compute.py
    └── tokenizer.py
├── README.md
├── test.py
├── train.py
├── config.py
├── data.py
└── graph.py

Training Result

Train Validation loss graph

image

Test set(unseen data) Translation Example

image

Test set(unseen data) BLEU Score Average: 35.870847920953594

Reference

https://nlp.seas.harvard.edu/2018/04/03/attention.html

https://jalammar.github.io/illustrated-transformer/

https://www.facebook.com/groups/TensorFlowKR/permalink/1618169785190740/

https://github.com/hyunwoongko/transformer

Owner
신재욱
신재욱
automatically crawl every URL and find cross site scripting (XSS)

scancss Fastest tool to find XSS. scancss is a fastest tool to detect Cross Site scripting (XSS) automatically and it's also an intelligent payload ge

Md. Nur habib 30 Sep 24, 2022
WhPhisher: a Phishing tool With Python

WhPhisher Herramienta para hacer phishing con muchos métodos de túneling -----Como Instalarlo------- pkg install python3 pkg install git git clone htt

WhBeatZ 80 Jan 02, 2023
CVE-2021-22005 - VMWare vCenter Server File Upload to RCE

CVE-2021-22005 - VMWare vCenter Server File Upload to RCE Analyze Usage ------------------------------------------------------------- [*] CVE-2021-220

r0cky 224 Aug 05, 2022
Bug Alert: a service for alerting security and IT professionals of high-impact and 0day vulnerabilities

Bug Alert Bug Alert is a service for alerting security and IT professionals of h

BugAlert.org 208 Dec 15, 2022
A simple automatic tool for finding vulnerable log4j hosts

Log4Scan A simple automatic tool for finding vulnerable log4j hosts Installation pip3 install -r requirements.txt Usage usage: log4scan.py [-h] (-f FI

Federico Rapetti 20018955 6 Mar 10, 2022
xp_CAPTCHA(白嫖版) burp 验证码 识别 burp插件

xp_CAPTCHA(白嫖版) 说明 xp_CAPTCHA (白嫖版) 验证码识别 burp插件 安装 需要python3 小于3.7的版本 安装 muggle_ocr 模块(大概400M左右) python3 -m pip install -i http://mirrors.aliyun.com/

算命縖子 588 Jan 09, 2023
Show apps recorded storage files by jailbreak

0x101 Show registered storage files of apps by jailbreak Legal disclaimer: Usage of insTof for attacking targets without prior mutual consent is illeg

0x 4 Oct 24, 2022
Threat research and reporting from IronNet's Threat Research Teams

IronNet Threat Research 🕵️ Overview This repository contains IronNet's Threat Research. Research & Reporting 📝 Project Description Cobalt Strike Res

36 Dec 02, 2022
All in One CRACKER911181's Tool. This Tool For Hacking and Pentesting. 🎭

All in One CRACKER911181's Tool. This Tool For Hacking and Pentesting. 🎭

Cracker 331 Jan 01, 2023
SPV SecurePasswordVerification

SPV SecurePasswordVerification Its is python module for doing a secure password verification without sharing the password directly. Features The passw

Merwin 1 Feb 12, 2022
Notebooks, slides and dataset of the CorrelAid Machine Learning Winter School

CorrelAid Machine Learning Spring School Welcome to the CorrelAid ML Spring School! In this repository you can find the slides and other files for the

CorrelAid 12 Nov 23, 2022
SonicWall SMA-100 Unauth RCE Exploit (CVE-2021-20038)

Bad Blood Bad Blood is an exploit for CVE-2021-20038, a stack-based buffer overflow in the httpd binary of SMA-100 series systems using firmware versi

Jake Baines 80 Dec 29, 2022
Separation of Mainlobes and Sidelobes in the Ultrasound Image Based on the Spatial Covariance (MIST) and Aperture-Domain Spectrum of Received Signals

Separation of Mainlobes and Sidelobes in the Ultrasound Image Based on the Spatial Covariance (MIST) and Aperture-Domain Spectrum of Received Signals

Rehman Ali 3 Jan 03, 2023
We protect the privacy of the data on your computer by using the camera of your Debian based Pardus operating system. 🕵️

Pardus Lookout We protect the privacy of the data on your computer by using the camera of your Debian based Pardus operating system. The application i

Ahmet Furkan DEMIR 19 Nov 18, 2022
On-demand scanning for container registries

Lacework registry scanner Install & configure Lacework CLI Integrate a Container Registry Go to Lacework Resources Containers Container Image In

Will Robinson 1 Dec 14, 2021
Make your own huge Wordlist with advanced options

#It's my first tool i hope to be useful for everyone, Make your own huge Wordlist with advanced options, You need python3 to run this tool, If you hav

0.1Arafa 6 Dec 08, 2022
Writing and posting code throughout my new journey into python!

bootleg-productions consider this account to be a journal for me to record my progress throughout my python journey feel free to copy codes from this

1 Dec 30, 2021
这次是可可萝病毒!

可可萝病毒! 事情是这样的,我又开始不干正事了。 众所周知,在Python里,0x0等于0,但是不等于可可萝。 这很不好,我们得把它改成可可萝! 效果 一般的Python—— Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC

黄巍 29 Jul 14, 2022
HashDB API hash lookup plugin for IDA Pro

HashDB IDA Plugin Malware string hash lookup plugin for IDA Pro. This plugin connects to the OALABS HashDB Lookup Service. Adding New Hash Algorithms

OALabs 237 Dec 21, 2022
Colin O'Flynn's Hacakday talk at Remoticon 2021 support repo.

Hardware Hacking Resources This repo holds some of the examples used in Colin's Hardware Hacking talk at Remoticon 2021. You can see the very sketchy

Colin O'Flynn 19 Sep 12, 2022