Kaggle G2Net Gravitational Wave Detection : 2nd place solution

Overview

Solution writeup: https://www.kaggle.com/c/g2net-gravitational-wave-detection/discussion/275341

Instructions

1. Download data

Download the competition dataset from the competition website and place the files in the input/ directory.

┣ input/
┃   ┣ training_labels.csv
┃   ┣ sample_submission.csv
┃   ┣ train/
┃   ┣ test/
┃
┣ configs.py
┣ ...
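
As a quick sanity check before running anything else, the layout above can be verified with a few lines of Python. This is only a sketch: `missing_input_entries` is a hypothetical helper, not part of the repository, and it checks nothing beyond the four entries shown in the tree.

```python
from pathlib import Path

# Entries listed in the directory tree above.
EXPECTED_ENTRIES = ("training_labels.csv", "sample_submission.csv", "train", "test")


def missing_input_entries(input_dir):
    """Return the expected entries that are absent from input_dir (hypothetical helper)."""
    root = Path(input_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_input_entries("input")
    if missing:
        print("Missing from input/:", ", ".join(missing))
    else:
        print("input/ looks complete.")
```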

(Optional:) Add your hardware configurations

# configs.py
HW_CFG = {
    'RTX3090': (16, 128, 1, 24),  # CPU count, RAM amount (GB), GPU count, GPU RAM (GB)
    'A100': (9, 60, 1, 40),
    'Your config': (128, 512, 8, 40),  # add your hardware config!
}
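
The four positions in each tuple are easy to mix up, so it may help to read them through named fields. The sketch below assumes only the field order given in the comment above; `Hardware` and `describe_hardware` are hypothetical names, and how the training scripts actually consume HW_CFG is not shown in this README.

```python
from collections import namedtuple

# Field order follows the comment in configs.py:
# CPU count, RAM amount (GB), GPU count, GPU RAM (GB).
Hardware = namedtuple("Hardware", ["cpu_count", "ram_gb", "gpu_count", "gpu_ram_gb"])

HW_CFG = {
    "RTX3090": (16, 128, 1, 24),
    "A100": (9, 60, 1, 40),
}


def describe_hardware(name, table=HW_CFG):
    """Look up a hardware entry and wrap it in a named tuple (hypothetical helper)."""
    return Hardware(*table[name])


hw = describe_hardware("A100")
print(f"{hw.cpu_count} CPUs, {hw.ram_gb} GB RAM, {hw.gpu_count} GPU(s), {hw.gpu_ram_gb} GB GPU RAM")
```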

2. Setup python environment

conda

conda env create -n kumaconda -f=environment.yaml
conda activate kumaconda

docker

WIP

3. Prepare data

Two new files, input/train.csv and input/test.csv, will be created.

python prep_data.py

(Optional:) Prepare waveform cache

You can optionally speed up training by building a waveform cache.
This is not recommended if your machine has less than 32 GB of RAM.
input/train_cache.pickle and input/test_cache.pickle will be created.

python prep_data.py --cache
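
To judge whether a cache is likely to fit in memory, a rough size estimate can help. The sketch below is not taken from prep_data.py: the sample count, channel count, and waveform length are assumptions about the competition data (560,000 training samples, 3 detectors, 4,096 points each), and the storage precision of the cache is not documented here, so several options are shown.

```python
def estimate_cache_gib(n_samples, n_channels, n_points, bytes_per_value):
    """Estimate the in-memory size, in GiB, of a dense waveform array."""
    n_bytes = n_samples * n_channels * n_points * bytes_per_value
    return n_bytes / 2**30


# Assumed dataset dimensions; adjust them to match your copy of the data.
N_TRAIN, N_CHANNELS, N_POINTS = 560_000, 3, 4096

for label, width in (("float16", 2), ("float32", 4), ("float64", 8)):
    size = estimate_cache_gib(N_TRAIN, N_CHANNELS, N_POINTS, width)
    print(f"{label}: {size:.1f} GiB")
```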

Then add the cache paths to the Baseline class in configs.py.

# configs.py
class Baseline:
    name = 'baseline'
    seed = 2021
    train_path = INPUT_DIR/'train.csv'
    test_path = INPUT_DIR/'test.csv'
    train_cache = INPUT_DIR/'train_cache.pickle' # here
    test_cache = INPUT_DIR/'test_cache.pickle' # here
    cv = 5
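
For orientation, here is a minimal, self-contained sketch of how such a class-based config can be extended. It assumes that INPUT_DIR is a pathlib.Path and that experiment classes inherit from Baseline; the Nspec16 body, RESULTS_DIR, and `results_dir` are illustrative stand-ins, not code copied from configs.py.

```python
from pathlib import Path

INPUT_DIR = Path("input")      # assumed definition of INPUT_DIR
RESULTS_DIR = Path("results")  # assumed; outputs are written to results/{name}/


class Baseline:
    name = "baseline"
    seed = 2021
    train_path = INPUT_DIR / "train.csv"
    test_path = INPUT_DIR / "test.csv"
    train_cache = INPUT_DIR / "train_cache.pickle"
    test_cache = INPUT_DIR / "test_cache.pickle"
    cv = 5


class Nspec16(Baseline):
    # Hypothetical subclass: only the overridden attribute is shown.
    name = "nspec_16"


def results_dir(config):
    """Directory where an experiment's outputs would land (hypothetical helper)."""
    return RESULTS_DIR / config.name


print(results_dir(Nspec16))
```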

4. Train neural network

Each experiment class has a name (e.g. name for Nspec16 is nspec_16).
Outputs of an experiment are

  • outoffolds.npy : (train size, 1) np.float32
  • predictions.npy : (cv fold, test size, 1) np.float32
  • {name}_{timestamp}.log : training log
  • foldx.pt : pytorch checkpoint

All outputs will be created in results/{name}/.
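
Given the shapes listed above, the two .npy files can be post-processed with NumPy alone. The sketch below averages test predictions over folds and scores out-of-fold predictions with ROC AUC, the competition's evaluation metric. Both function names are hypothetical, and simple fold averaging is an assumption rather than a description of the authors' ensembling.

```python
import numpy as np


def average_folds(predictions):
    """Average an array of shape (cv fold, test size, 1) over folds -> (test size,)."""
    return np.asarray(predictions).mean(axis=0).reshape(-1)


def roc_auc(labels, scores):
    """ROC AUC from the Mann-Whitney U statistic; tied scores get average ranks."""
    labels = np.asarray(labels).reshape(-1).astype(bool)
    scores = np.asarray(scores, dtype=np.float64).reshape(-1)
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    cumulative = np.cumsum(counts)
    # Average 1-based rank shared by every occurrence of each unique score.
    average_rank = cumulative - (counts - 1) / 2.0
    ranks = average_rank[inverse]
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Usage with real outputs (paths follow the results/{name}/ layout):
#   oof = np.load("results/nspec_16/outoffolds.npy")     # (train size, 1)
#   preds = np.load("results/nspec_16/predictions.npy")  # (cv fold, test size, 1)
#   test_scores = average_folds(preds)
```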

python train.py --config {experiment class}
# [Options]
# --progress_bar    : Everyone loves progress bar
# --inference       : Run inference only
# --tta             : Run test time augmentations (FlipWave)
# --limit_fold x    : Train a single fold x. You must run inference again by yourself.
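
When folds are trained one at a time with --limit_fold, a final --inference run is still needed. The following sketch builds that sequence of commands. It assumes folds are numbered from 0 and that there are 5 of them (cv = 5 in Baseline); `fold_commands` is a hypothetical helper, not part of the repository.

```python
def fold_commands(config, n_folds=5, script="train.py"):
    """Build one training command per fold, then a final inference-only command."""
    commands = [
        ["python", script, "--config", config, "--limit_fold", str(fold)]
        for fold in range(n_folds)  # assumes zero-based fold indices
    ]
    commands.append(["python", script, "--config", config, "--inference"])
    return commands


for command in fold_commands("Nspec16"):
    print(" ".join(command))
    # To execute instead of printing:
    #   import subprocess; subprocess.run(command, check=True)
```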

5. Train neural network again (pseudo-label)

For experiments whose names start with Pseudo, you must use train_pseudo.py.
Outputs and options are the same as for train.py.
Make sure the dependent experiment (see the table below) has been run successfully.

python train_pseudo.py --config {experiment class}

Experiments

#   Experiment   Dependency       Frontend  Backend          Input size  CV       Public LB  Private LB
1   Pseudo06     Nspec12          CWT       efficientnet-b2  256 x 512   0.8779   0.8797     0.8782
2   Pseudo07     Nspec16          CWT       efficientnet-b2  128 x 1024  0.87841  0.8801     0.8787
3   Pseudo12     Nspec12arch0     CWT       densenet201      256 x 512   0.87762  0.8796     0.8782
4   Pseudo13     MultiInstance04  CWT       xcit-tiny-p16    384 x 768   0.87794  0.8800     0.8782
5   Pseudo14     Nspec16arch17    CWT       efficientnet-b7  128 x 1024  0.87957  0.8811     0.8800
6   Pseudo18     Nspec21          CWT       efficientnet-b4  256 x 1024  0.87942  0.8812     0.8797
7   Pseudo10     Nspec16spec13    CWT       efficientnet-b2  128 x 1024  0.87875  0.8802     0.8789
8   Pseudo15     Nspec22aug1      WaveNet   efficientnet-b2  128 x 1024  0.87846  0.8809     0.8794
9   Pseudo16     Nspec22arch2     WaveNet   efficientnet-b6  128 x 1024  0.87982  0.8823     0.8807
10  Pseudo19     Nspec22arch6     WaveNet   densenet201      128 x 1024  0.87831  0.8818     0.8804
11  Pseudo17     Nspec23arch3     CNN       efficientnet-b6  128 x 1024  0.87982  0.8823     0.8808
12  Pseudo21     Nspec22arch7     WaveNet   effnetv2-m       128 x 1024  0.87861  0.8831     0.8815
13  Pseudo22     Nspec23arch5     CNN       effnetv2-m       128 x 1024  0.87847  0.8817     0.8799
14  Pseudo23     Nspec22arch12    WaveNet   effnetv2-l       128 x 1024  0.87901  0.8829     0.8811
15  Pseudo24     Nspec30arch2     WaveNet   efficientnet-b6  128 x 1024  0.8797   0.8817     0.8805
16  Pseudo25     Nspec25arch1     WaveNet   efficientnet-b3  256 x 1024  0.87948  0.8820     0.8803
17  Pseudo26     Nspec22arch10    WaveNet   resnet200d       128 x 1024  0.87791  0.881      0.8797
18  PseudoSeq04  Seq03aug3                  ResNet1d-18      -           0.87663  0.8804     0.8785
19  PseudoSeq07  Seq12arch4                 WaveNet          -           0.87698  0.8796     0.8784
20  PseudoSeq03  Seq09                      DenseNet1d-121   -           0.86826  0.8723     0.8703
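
The Dependency column above implies a run order: the dependency is trained first with train.py, then the pseudo-label experiment with train_pseudo.py. A minimal sketch of that ordering follows. The mapping covers only three rows of the table for brevity, and `run_plan` is a hypothetical helper, not a script shipped with the repository.

```python
# Subset of the table: pseudo-label experiment -> experiment it depends on.
DEPENDENCIES = {
    "Pseudo06": "Nspec12",
    "Pseudo12": "Nspec12arch0",
    "PseudoSeq04": "Seq03aug3",
}


def run_plan(experiment, dependencies=DEPENDENCIES):
    """Return the commands for one experiment, with its dependency first."""
    plan = []
    dependency = dependencies.get(experiment)
    if dependency is not None:
        plan.append(["python", "train.py", "--config", dependency])
    # Experiments whose names start with "Pseudo" use train_pseudo.py.
    script = "train_pseudo.py" if experiment.startswith("Pseudo") else "train.py"
    plan.append(["python", script, "--config", experiment])
    return plan


for command in run_plan("Pseudo06"):
    print(" ".join(command))
```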
Owner
Hiroshechka Y
ML Engineer | Kaggle Master | Public Health