Tools and data for measuring the popularity & growth of various programming languages.

Overview

growth-data

Install the dependencies

$ pip install -r requirements.txt

Example queries

Number of (non-fork) repositories

sqlite> .mode column
sqlite> SELECT
    ds,
    github_search_q AS q,
    MAX(github_search_total_count) AS num_repos
  FROM github_search
  GROUP BY 1, 2
  ORDER BY 3;
ds          q                                  num_repos
----------  ---------------------------------  ---------
2021-12-22  language:tla and fork:false        64       
2021-12-22  language:lean and fork:false       75       
2021-12-22  language:idris and fork:false      140      
2021-12-22  language:agda and fork:false       192      
2021-12-22  language:ada and fork:false        438      
2021-12-22  language:coq and fork:false        509      
2021-12-22  language:erlang and fork:false     2260     
2021-12-22  language:ocaml and fork:false      2278     
2021-12-22  language:fortran and fork:false    3196     
2021-12-22  language:verilog and fork:false    3882     
2021-12-22  language:assembly and fork:false   8654     
2021-12-22  language:haskell and fork:false    10052    
2021-12-22  language:terraform and fork:false  10254    
2021-12-22  language:rust and fork:false       21906    
2021-12-22  language:go and fork:false         67601    
2021-12-22  language:r and fork:false          114942   
2021-12-22  language:c and fork:false          174439   
2021-12-22  language:c++ and fork:false        270351   
2021-12-22  language:python and fork:false     762729   
2021-12-22  language:java and fork:false       943381   
sqlite> 
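
The same query can also be run from Python with the standard-library `sqlite3` module. The following is a minimal sketch, not the project's own tooling: the table and column names come from the query above, but the `CREATE TABLE` statement, the `obj_id` primary key type, and the helper names `repo_counts` and `make_demo_db` are assumptions made for illustration. The demo uses an in-memory database with a few rows copied from the output above, so it does not depend on the real database file.

```python
import sqlite3

# Same query as in the shell session above.
REPO_COUNT_QUERY = """
SELECT
    ds,
    github_search_q AS q,
    MAX(github_search_total_count) AS num_repos
FROM github_search
GROUP BY 1, 2
ORDER BY 3
"""


def repo_counts(conn):
    """Return (ds, q, num_repos) tuples, smallest count first."""
    return conn.execute(REPO_COUNT_QUERY).fetchall()


def make_demo_db():
    """Build an in-memory stand-in for the real database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE github_search (
            obj_id INTEGER PRIMARY KEY,
            ds TEXT,
            github_search_q TEXT,
            github_search_total_count INTEGER
        )
        """
    )
    rows = [
        ("2021-12-22", "language:lean and fork:false", 75),
        ("2021-12-22", "language:tla and fork:false", 64),
        # A second row for the same (ds, q); MAX() collapses it to one result.
        ("2021-12-22", "language:tla and fork:false", 64),
    ]
    conn.executemany(
        "INSERT INTO github_search "
        "(ds, github_search_q, github_search_total_count) VALUES (?, ?, ?)",
        rows,
    )
    return conn


if __name__ == "__main__":
    for ds, q, num_repos in repo_counts(make_demo_db()):
        print(ds, q, num_repos)
```

To run it against real data, replace `make_demo_db()` with `sqlite3.connect(...)` pointing at the database file produced by the tools in this repository.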

Stats about the average (non-fork) repository

sqlite> .mode column
sqlite> SELECT
    github_search.ds AS ds,
    github_search_q AS q,
    COUNT(*) AS repos,
    SUM(github_repo_has_issues) AS repos_with_issues,
    SUM(github_repo_has_wiki) AS repos_with_wiki,
    SUM(github_repo_has_pages) AS repos_with_pages,
    SUM(github_repo_license_name != '') AS repos_with_license,
    SUM(github_repo_size) AS sum_repo_size,
    SUM(github_repo_stargazers_count) AS sum_stars,
    AVG(github_repo_stargazers_count) AS avg_stars,
    AVG(github_repo_forks_count) AS avg_forks,
    AVG(github_repo_size) AS avg_size,
    AVG(github_repo_open_issues_count) AS avg_open_issues
  FROM github_search INNER JOIN github_search_repo
  ON github_search.obj_id = github_search_obj_id
  GROUP BY 1, 2
  ORDER BY 3;
ds          q                              repos  repos_with_issues  repos_with_wiki  repos_with_pages  repos_with_license  sum_repo_size  sum_stars  avg_stars         avg_forks         avg_size          avg_open_issues  
----------  -----------------------------  -----  -----------------  ---------------  ----------------  ------------------  -------------  ---------  ----------------  ----------------  ----------------  -----------------
2021-12-22  language:tla and fork:false    64     63                 61               1                 23                  1393879        1937       30.265625         2.34375           21779.359375      0.359375         
2021-12-22  language:lean and fork:false   75     73                 72               5                 22                  1119783        1475       19.6666666666667  1.85333333333333  14930.44          1.61333333333333 
2021-12-22  language:idris and fork:false  140    139                136              4                 63                  108818         1242       8.87142857142857  0.85              777.271428571429  0.728571428571429
2021-12-22  language:agda and fork:false   192    188                187              9                 51                  394233         1725       8.984375          0.90625           2053.296875       0.291666666666667
2021-12-22  language:ada and fork:false    438    421                406              12                155                 2387761        2210       5.04566210045662  1.13926940639269  5451.50913242009  1.09360730593607 
2021-12-22  language:coq and fork:false    509    502                493              42                204                 2894476        4304       8.45579567779961  1.50098231827112  5686.59332023576  0.846758349705305
sqlite>
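
As a sanity check on the output above, each `avg_*` column should equal the matching `sum_*` column divided by `repos`, provided no row has a NULL in that column (SQLite's `AVG` skips NULLs). The sketch below verifies this for the star and size columns. The numbers are copied from the table above; the names `ROWS` and `is_consistent` are illustrative and not part of the project.

```python
import math

# (language, repos, sum_stars, avg_stars, sum_repo_size, avg_size),
# copied from the query output above.
ROWS = [
    ("tla", 64, 1937, 30.265625, 1393879, 21779.359375),
    ("lean", 75, 1475, 19.6666666666667, 1119783, 14930.44),
    ("idris", 140, 1242, 8.87142857142857, 108818, 777.271428571429),
    ("agda", 192, 1725, 8.984375, 394233, 2053.296875),
    ("ada", 438, 2210, 5.04566210045662, 2387761, 5451.50913242009),
    ("coq", 509, 4304, 8.45579567779961, 2894476, 5686.59332023576),
]


def is_consistent(row, rel_tol=1e-9):
    """Check that the printed averages match sum / count for one row."""
    _, repos, sum_stars, avg_stars, sum_size, avg_size = row
    return (
        math.isclose(sum_stars / repos, avg_stars, rel_tol=rel_tol)
        and math.isclose(sum_size / repos, avg_size, rel_tol=rel_tol)
    )


if __name__ == "__main__":
    for row in ROWS:
        print(row[0], "ok" if is_consistent(row) else "MISMATCH")
```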

Stats about the average recently-updated (non-fork) repository

sqlite> .mode column
sqlite> SELECT
    github_search.ds AS ds,
    github_search_q AS q,
    COUNT(*) AS repos,
    SUM(github_repo_has_issues) AS repos_with_issues,
    SUM(github_repo_has_wiki) AS repos_with_wiki,
    SUM(github_repo_has_pages) AS repos_with_pages,
    SUM(github_repo_license_name != '') AS repos_with_license,
    SUM(github_repo_size) AS sum_repo_size,
    SUM(github_repo_stargazers_count) AS sum_stars,
    AVG(github_repo_stargazers_count) AS avg_stars,
    AVG(github_repo_forks_count) AS avg_forks,
    AVG(github_repo_size) AS avg_size,
    AVG(github_repo_open_issues_count) AS avg_open_issues
  FROM github_search INNER JOIN github_search_repo
  ON github_search.obj_id = github_search_obj_id
  WHERE github_repo_updated_at >= '2021-01-01T00:00:00Z'
  GROUP BY 1, 2
  ORDER BY 3;
ds          q                              repos  repos_with_issues  repos_with_wiki  repos_with_pages  repos_with_license  sum_repo_size  sum_stars  avg_stars         avg_forks         avg_size          avg_open_issues  
----------  -----------------------------  -----  -----------------  ---------------  ----------------  ------------------  -------------  ---------  ----------------  ----------------  ----------------  -----------------
2021-12-22  language:tla and fork:false    33     32                 30               1                 18                  1322462        1921       58.2121212121212  4.39393939393939  40074.6060606061  0.636363636363636
2021-12-22  language:idris and fork:false  44     44                 43               3                 23                  33576          1052       23.9090909090909  2.22727272727273  763.090909090909  1.61363636363636 
2021-12-22  language:lean and fork:false   46     44                 43               3                 14                  1116533        1442       31.3478260869565  2.93478260869565  24272.4565217391  2.58695652173913 
2021-12-22  language:agda and fork:false   77     74                 75               8                 24                  310115         1520       19.7402597402597  1.93506493506494  4027.46753246753  0.376623376623377
2021-12-22  language:ada and fork:false    168    165                148              10                82                  1615474        2065       12.2916666666667  2.67261904761905  9615.91666666667  2.80357142857143 
2021-12-22  language:coq and fork:false    211    206                201              32                113                 1962100        4018       19.042654028436   3.22748815165877  9299.05213270142  1.89099526066351 
sqlite> 
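
Comparing the two tables above shows what fraction of each language's repositories was updated recently: for TLA, 33 of 64, or roughly 52%. The sketch below computes that share in a single query by summing the boolean `github_repo_updated_at >= :cutoff`. It is an illustration under assumptions: the column names are taken from the queries above, but the `CREATE TABLE` statements, the demo rows, and the helper names `recent_share` and `make_demo_db` are made up for the example. The comparison relies on the timestamps being ISO 8601 strings in one consistent format, so that string order matches time order.

```python
import sqlite3

RECENT_SHARE_QUERY = """
SELECT
    github_search.ds AS ds,
    github_search_q AS q,
    COUNT(*) AS repos,
    SUM(github_repo_updated_at >= :cutoff) AS recent_repos
FROM github_search INNER JOIN github_search_repo
ON github_search.obj_id = github_search_obj_id
GROUP BY 1, 2
ORDER BY 3
"""


def recent_share(conn, cutoff="2021-01-01T00:00:00Z"):
    """Return (ds, q, repos, recent_repos, share) per search query."""
    rows = conn.execute(RECENT_SHARE_QUERY, {"cutoff": cutoff}).fetchall()
    # SUM() yields NULL if every timestamp in a group is NULL; treat that as 0.
    return [
        (ds, q, repos, recent or 0, (recent or 0) / repos)
        for ds, q, repos, recent in rows
    ]


def make_demo_db():
    """In-memory stand-in with an assumed, minimal schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE github_search ("
        "obj_id INTEGER PRIMARY KEY, ds TEXT, github_search_q TEXT)"
    )
    conn.execute(
        "CREATE TABLE github_search_repo ("
        "github_search_obj_id INTEGER, github_repo_updated_at TEXT)"
    )
    conn.execute(
        "INSERT INTO github_search VALUES "
        "(1, '2021-12-22', 'language:tla and fork:false')"
    )
    conn.executemany(
        "INSERT INTO github_search_repo VALUES (1, ?)",
        [
            ("2019-06-30T12:00:00Z",),  # older than the cutoff
            ("2021-03-01T08:30:00Z",),
            ("2021-11-15T17:45:00Z",),
            ("2021-12-20T09:00:00Z",),
        ],
    )
    return conn


if __name__ == "__main__":
    for ds, q, repos, recent, share in recent_share(make_demo_db()):
        print(f"{ds}  {q}  {recent}/{repos} = {share:.1%}")
```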