CJK computer science terms comparison / 中日韓電腦科學術語對照 / 日中韓のコンピュータ科学の用語対照 / 한·중·일 전산학 용어 대조

Overview

This repository contains the source code of the website.

Greater China, Japan, and Korea, the so-called Sinosphere (漢字文化圈; literally "Chinese character cultural sphere"), have borrowed many concepts from the West through Sino-Xenic vocabulary since the modern era. Some regions coined their own translations, while others imported translations from neighboring countries. Some translations combine native and foreign stems. As a result, the countries of the Sinosphere share many words, yet each also has words of its own. Computer science terminology is no exception.

This page contains tables comparing how computer science terms, most of them derived from English, are translated and named in different regions of the Sinosphere.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Introduction

Cognates

Cognates are words that derive one from the other or that share a common etymology.

For example, the English word computer and the Korean word 컴퓨터 are cognates, as are the Japanese word 計算科学 (keisan kagaku) and the Chinese word 計算科學 (jìsuàn kēxué), both of which mean computational science.

Cognates are indicated by the same colored border.

Calque (loan translation)

A calque is a word or phrase borrowed from another language by literal word-for-word or root-for-root translation.

For example, the Chinese word 軟件 is a calque of the English word software: it translates the English roots soft (軟 ruǎn; soft or flexible) and ware (件 jiàn; clothes or item) one by one.

Words or roots that match across languages in this way are underlined with the same color and shape.
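
The root-for-root structure of a calque can be sketched as a small data model. This is only an illustration in Python, not code from the website: the `RootPair` class and the `compose` and `source_word` helpers are hypothetical names, and the only data used is the 軟件 example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootPair:
    """One root-for-root correspondence inside a calque (hypothetical model)."""
    source_root: str   # root in the source language, e.g. English "soft"
    target_root: str   # root chosen to translate it, e.g. "軟"
    romanization: str  # pronunciation of the target root, e.g. "ruǎn"


def compose(pairs):
    """Join the translated roots, in order, into the calqued word."""
    return "".join(pair.target_root for pair in pairs)


def source_word(pairs):
    """Join the source roots, in order, into the original word."""
    return "".join(pair.source_root for pair in pairs)


# The example from the text: software -> 軟件
software = [
    RootPair("soft", "軟", "ruǎn"),
    RootPair("ware", "件", "jiàn"),
]

print(source_word(software), "->", compose(software))
```

Keeping each root pair as a separate record is what would let a renderer underline matching roots with the same color and shape in every language column.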

Homophonic translations

For a root transcribed from a foreign word, the original word is displayed on the root.

For example, because the Japanese word コンピュータ (konpyu-ta) is a transcription of the English word computer, it is displayed like this: コンピュータcomputer.

Romanized pronunciation

The pronunciation of each word is shown in Latin letters in parentheses below the word. The transcription system for each language is as follows:

Mandarin (China & Taiwan): Hanyu Pinyin

Cantonese (Hong Kong): Jyutping (Linguistic Society of Hong Kong Cantonese Romanization Scheme)

Japanese: Hepburn romanization

Korean: Revised Romanization of Korean
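
The list above amounts to a lookup from region to transcription system. A minimal sketch in Python, assuming hypothetical locale keys (`zh-CN`, `zh-TW`, `zh-HK`, `ja`, `ko`) that are not taken from the website's source; only the system names come from the list:

```python
# Hypothetical locale keys; the transcription systems are those listed above.
ROMANIZATION_SYSTEMS = {
    "zh-CN": "Hanyu Pinyin",                 # Mandarin (China)
    "zh-TW": "Hanyu Pinyin",                 # Mandarin (Taiwan)
    "zh-HK": "Jyutping",                     # Cantonese (Hong Kong)
    "ja": "Hepburn romanization",            # Japanese
    "ko": "Revised Romanization of Korean",  # Korean
}


def romanization_system(locale: str) -> str:
    """Return the transcription system used for the given locale key."""
    try:
        return ROMANIZATION_SYSTEMS[locale]
    except KeyError:
        raise ValueError(f"no romanization system listed for {locale!r}") from None


print(romanization_system("zh-HK"))
```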

Basic terms

Units

Fields of study

Computer programming

Tools

Theory of computation

*[CJK]: Chinese, Japanese, and Korean languages

Owner
Hong Minhee (洪 民憙)
A software engineer from Seoul. An advocate of F/OSS, Open Web, and Cypherpunk. Hack into East Asian languages.