This code is part of the reproducibility package for the SANER 2022 paper "Generating Clarifying Questions for Query Refinement in Source Code Search".

Last update: Dec 04, 2021

Overview

Clarifying Questions for Query Refinement in Source Code Search

The package consists of five folders:

codesearch/ - API to access the CodeSearchNet datasets and a neural bag-of-words code retrieval method.
cq/ - Implementation of the ZaCQ system, including an implementation of the TaskNav development task extraction algorithm and two baseline query refinement methods.
data/ - Includes a pretrained code search model and config files for task extraction.
evaluation/ - Scripts to run and evaluate ZaCQ.
interface/ - Backend and Frontend servers for a search interface implementing ZaCQ.

Setup

Clone the CodeSearchNet package into the root directory and download the CSN datasets:

cd ZaCQ
git clone https://github.com/github/CodeSearchNet.git
cd CodeSearchNet/scripts
./download_and_preprocess

Use a CSN model to create vector representations for candidate code search results. A pretrained Neural BoW model is included in this package.

cd codesearch
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python _setup.py

This will save and index vectors in the data folder. It will also generate search results for the 99 CSN queries.
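The idea behind vector-based retrieval can be illustrated with a minimal sketch. This is not the package's implementation: the toy word vectors, the function names (`embed`, `cosine`, `search`), and the brute-force ranking are all assumptions made for illustration. A bag-of-words encoder averages the vectors of the tokens it knows, and search ranks candidate functions by cosine similarity to the query vector.

```python
import math

# Hypothetical 2-dimensional word vectors; a real model learns these.
WORD_VECTORS = {
    "read": [1.0, 0.0],
    "file": [0.9, 0.1],
    "sort": [0.0, 1.0],
    "list": [0.1, 0.9],
}


def embed(tokens, word_vectors, dim):
    """Average the vectors of known tokens (a bag-of-words encoding)."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, index, k=10):
    """Return the names of the k indexed items most similar to the query."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]),
                    reverse=True)
    return ranked[:k]


# Toy index mapping function names to their vectors.
INDEX = {
    "read_file": embed(["read", "file"], WORD_VECTORS, 2),
    "sort_list": embed(["sort", "list"], WORD_VECTORS, 2),
}

print(search(embed(["read"], WORD_VECTORS, 2), INDEX, k=1))
```

A brute-force scan like this is only practical for small collections; indexing the vectors ahead of time, as the setup step does, avoids recomputing them for every query.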

Task extraction is fairly quick for small sets of code search results, but it is expensive to do repeatedly. To expedite the evaluation, we cache the extracted tasks for the results of the 99 CSN queries, as well as keywords for all functions in the datasets.

cd cq
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python _setup.py

Cached tasks and keywords are stored in the data folder.
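The caching pattern described above can be sketched as follows. This is an illustrative assumption, not the package's code: the JSON file format, the `load_or_compute` helper, and the stand-in `extract_tasks` function are all hypothetical. The expensive computation runs once per key, and later calls read the stored result.

```python
import json
from pathlib import Path


def load_or_compute(cache_path, key, compute):
    """Return the cached value for key, computing and storing it if absent."""
    path = Path(cache_path)
    cache = json.loads(path.read_text()) if path.exists() else {}
    if key not in cache:
        cache[key] = compute(key)            # expensive step, run only once
        path.write_text(json.dumps(cache))   # persist for later runs
    return cache[key]


def extract_tasks(query):
    """Hypothetical stand-in for an expensive task extraction step."""
    return [word.lower() for word in query.split()]
```

For example, `load_or_compute("tasks.json", "Read File", extract_tasks)` would compute the value on the first call and read it from `tasks.json` on every later call.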

Evaluation

To evaluate ZaCQ and the other query refinement methods on the CSN queries, run:

cd evaluation
python run_queries.py
python evaluate.py

The run_queries script determines the subset of CSN queries that can be automatically evaluated, and simulates interactive refinement sessions for all valid questions for each language in CSN. For ZaCQ, the script runs through a set of predefined hyperparameter combinations. The script calculates NDCG, MAP, and MRE metrics for each refinement method and hyperparameter configuration, and stores them in the data/output folder.
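Two of the metrics named above, NDCG and MAP, can be sketched for a single ranked list as follows. This is a minimal sketch under stated assumptions, not the evaluation script itself: relevance is binary, NDCG is normalized against the best reordering of the same list, and average precision is normalized by the number of relevant items that appear in the list. The package's scripts may define these differently.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal ordering of the same list."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0


def average_precision(relevances):
    """Mean of precision@k over the ranks k that hold a relevant item."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """MAP: average precision averaged over several queries."""
    if not ranked_lists:
        return 0.0
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)
```

Each input list holds one relevance value per search result, in ranked order, so `[0, 1, 0, 1]` means the second and fourth results were relevant.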

The evaluate script averages the metrics across all languages after 1 to N rounds of refinement. For ZaCQ, it also records the best-performing hyperparameter combination after n rounds of refinement.
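The aggregation step can be sketched as below. The data layout, the configuration labels, and the function names are assumptions for illustration and do not reflect the actual format written to data/output. For each configuration, one metric is averaged across languages round by round, and the best configuration after n rounds is the one with the highest average at that round.

```python
from statistics import mean


def average_across_languages(results):
    """results maps config -> language -> [metric after round 1, ..., round N].

    Returns config -> [mean metric across languages after each round].
    """
    averaged = {}
    for config, per_language in results.items():
        per_round = zip(*per_language.values())  # group values by round
        averaged[config] = [mean(values) for values in per_round]
    return averaged


def best_config(averaged, n):
    """Return the config with the highest averaged metric after n rounds."""
    return max(averaged, key=lambda config: averaged[config][n - 1])


# Hypothetical scores for two configurations over two refinement rounds.
RESULTS = {
    "config_a": {"python": [0.5, 0.6], "java": [0.3, 0.4]},
    "config_b": {"python": [0.2, 0.8], "java": [0.4, 0.6]},
}

AVERAGED = average_across_languages(RESULTS)
print(best_config(AVERAGED, 1), best_config(AVERAGED, 2))
```

In this toy data the best configuration changes between rounds, which is why the best combination is recorded per round count rather than once overall.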

Interface

To run the interactive search interface, you need to run two backend servers and start the GUI server:

cd interface/cqserver
python ClarifyAPI.py

cd interface/searchserver
python SearchAPI.py

cd interface/gui
npm start

By default, you can access the GUI at localhost:3000.

Owner

Zachary Eberhart
