ICLR 2022 Paper submission trend analysis

Last update: Dec 06, 2022

Overview

Visualize ICLR 2022 OpenReview Data

ICLR 2022 Paper submission analysis from https://openreview.net/group?id=ICLR.cc/2022/Conference

Requirements

pip install wordcloud nltk pandas imageio selenium tqdm

Download the required NLTK packages:

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')
nltk.download('stopwords')

If calling webdriver.Edge('msedgedriver.exe') fails, try the following:

- Delete msedgedriver.exe, since the bundled copy may only work on my computer (Windows).
- Install Microsoft Edge (Chromium): To confirm that Microsoft Edge (Chromium) is installed, go to edge://settings/help in the browser and verify that the version number is 75 or later.
- Download Microsoft Edge Driver:
  - Go to edge://settings/help to get the version of Edge.
  - Navigate to the Microsoft Edge Driver downloads page and download the driver that matches the Edge version number.

Source: https://stackoverflow.com/questions/63529124/how-to-open-up-microsoft-edge-using-selenium-and-python
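The version checks described above can be sketched in a few lines of Python. This is only an illustration: the helper names (parse_version, is_chromium_edge, driver_matches) are hypothetical and not part of this repo, and comparing only the major version is an assumption; the instructions above simply say to download the driver that matches the Edge version number.

```python
def parse_version(version):
    """Turn a dotted version string such as '108.0.1462.42' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def is_chromium_edge(edge_version, minimum_major=75):
    """Return True if the Edge version is 75 or later, as the README requires."""
    return parse_version(edge_version)[0] >= minimum_major


def driver_matches(edge_version, driver_version):
    """Return True if browser and driver share the same major version.

    Assumption: agreement on the major version is treated as a match here.
    A stricter check could compare the full version tuples instead.
    """
    return parse_version(edge_version)[0] == parse_version(driver_version)[0]


if __name__ == "__main__":
    # Example version strings, typed in by hand from edge://settings/help
    # and from the driver download page.
    edge = "108.0.1462.42"
    driver = "108.0.1462.46"
    print("Edge is Chromium-based:", is_chromium_edge(edge))
    print("Driver matches browser:", driver_matches(edge, driver))
```

If driver_matches returns False, downloading a driver build for the installed Edge version is the likely fix.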

Crawl Data

Run crawl_paperlist.py to crawl the list of papers (~0.5h).

Paper List (3,407 submissions in total)

crawl_paperlist.py only crawls 3,000 papers, although there are 3,407 submissions in total. The full paper list is as follows:

Visualization

Keywords Frequency

The 50 most common keywords (case-insensitive) and their frequencies:
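A minimal sketch of how such a case-insensitive keyword count could be computed, using only the standard library. The function name keyword_frequency and the sample data are illustrative assumptions, not the repo's actual code, and counting each keyword at most once per paper is a design choice made here.

```python
from collections import Counter


def keyword_frequency(papers, top_n=50):
    """Count keywords across papers, ignoring case and surrounding whitespace.

    `papers` is an iterable of keyword lists, one list per submission.
    Each keyword is counted at most once per paper.
    """
    counts = Counter()
    for keywords in papers:
        normalized = {kw.strip().lower() for kw in keywords if kw.strip()}
        counts.update(normalized)
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ["Deep Learning", "Reinforcement Learning"],
        ["deep learning", "Graph Neural Network"],
        ["Representation Learning", "deep learning "],
    ]
    for keyword, count in keyword_frequency(sample, top_n=3):
        print(f"{keyword}: {count}")
```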

Keywords Cloud

The word cloud formed from submission keywords shows hot topics such as deep learning, reinforcement learning, representation learning, and graph neural networks.

Title Keywords Frequency

The 50 most common title keywords (case-insensitive) and their frequencies:
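Titles are free text, so they need tokenizing and stop-word filtering before counting. The sketch below shows one way this might look. It deliberately swaps the NLTK tokenizer, lemmatizer, and stop-word list from the Requirements section for a regular expression and a tiny hand-written stop-word set, so it runs without downloads; both substitutes, and the name title_keyword_frequency, are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; the repo installs NLTK's
# 'stopwords' corpus, which is far more complete.
STOPWORDS = {"a", "an", "the", "of", "for", "in", "on", "with", "and", "to", "via", "by"}

# Matches lowercase words, keeping hyphenated terms such as 'self-supervised' whole.
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def title_keyword_frequency(titles, top_n=50):
    """Count non-stop-word tokens across paper titles, ignoring case."""
    counts = Counter()
    for title in titles:
        tokens = TOKEN_PATTERN.findall(title.lower())
        counts.update(token for token in tokens if token not in STOPWORDS)
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Made-up titles, for illustration only.
    titles = [
        "Learning Representations for Graph Neural Networks",
        "A Study of Reinforcement Learning with Graph Priors",
    ]
    for word, count in title_keyword_frequency(titles, top_n=5):
        print(f"{word}: {count}")
```

Because no lemmatization is applied here, "network" and "networks" would be counted separately; lemmatizing with WordNet would merge them.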

Title Keywords Cloud

The word cloud formed from submission title keywords:

Acknowledgment

Inspired by this repo: https://github.com/evanzd/ICLR2021-OpenReviewData

Owner

Jintang Li