GIANT

Code and data for the paper "GIANT: Scalable Creation of a Web-scale Ontology"

https://arxiv.org/pdf/2004.02118.pdf

If this project is helpful to your work or research, please cite our paper. Thanks.

How to run

Download files: get Stanford CoreNLP (https://stanfordnlp.github.io/CoreNLP/download.html) and the Chinese word embedding (https://ai.tencent.com/ailab/nlp/embedding.html). For the word embedding, see the note at the bottom.

Revise paths and put files in the appropriate locations: file paths are defined in common/constants.py, so open that file and change the paths to match your own setup. Do the same for any other paths defined in the source files.
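Before running, it can help to confirm that the paths you configured actually exist. The following is a minimal sketch, not part of this repository: the helper `find_missing_paths` and the dictionary keys and placeholder paths are hypothetical, so substitute the values you set in common/constants.py.

```python
from pathlib import Path


def find_missing_paths(paths):
    """Return the names of entries whose path does not exist on disk."""
    return [name for name, p in paths.items() if not Path(p).exists()]


if __name__ == "__main__":
    # Hypothetical entries; replace them with the values from common/constants.py.
    configured = {
        "corenlp_dir": "/path/to/stanford-corenlp",
        "embedding_file": "/path/to/chinese_word_embedding.txt",
    }
    for name in find_missing_paths(configured):
        print(f"Missing path for {name}: {configured[name]}")
```

If nothing is printed, every listed path exists; otherwise fix the reported entries before starting the test run.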
Test run:

python3 GIANT_main.py \
    --data_type concept \
    --train_file "../../../../Datasets/original/concept/concepts.json" \
    --emb_tags \
    --task_output_dims 2 \
    --tasks "phrase" \
    --edge_types_list "seq" "dep" "contain" "synonym" \
    --d_model 32 \
    --layers 3 \
    --num_bases 5 \
    --epochs 10 \
    --mode train \
    --debug

Note: adding --processed_emb to the command above can help avoid re-processing the word embeddings, which is time-consuming. In that case you also do not need to download the Chinese word embedding file, which is quite large. In our experience, adding word embeddings to the node features is not very helpful for our tasks, so it should be safe to ignore the word embedding features in your experiments. If you do not use word embeddings, you may need to revise data_loader.py to avoid some runtime errors. You can still try to improve results with word embeddings.
