Backend for the Autocomplete platform. An AI assisted coding platform.

Overview

Introduction

A custom predictor allows you to deploy your own prediction implementation, useful when the existing serving implementations don't fit your needs. If migrating from Cortex, the custom predictor work exactly the same way as PythonPredictor does in Cortex. Most PythonPredictors can be converted to custom predictor by copy pasting the code and renaming some variables.

The custom predictor is packaged as a Docker container. It is recommended, but not required, to keep large model files outside of the container image itself and to load them from a storage volume. This example follows that pattern. You will need somewhere to publish your Docker image once built. This example leverages Docker Hub, where storing public images are free and private images are cheap. Google Container Registry and other registries can also be used.

Make sure you use a GPU enabled Docker image as a base, and that you enable GPU support when loading the model.

Getting Started

After installing kubectl and adding your CoreWeave Cloud access credentials, the following steps will deploy the Inference Service. Clone this repository and folder, and execute all commands in there. We'll be using all the files.

Sign up for a Docker Hub account, or use a different container registry if you already have one. The free plan works perfectly fine, but your container images will be accessible by anyone. This guide assumes a private registry, requiring authentication. Once signed up, create a new repository. For the rest of the guide, we'll assume that the name of the new repository is gpt-6b.

Build the Docker image

  1. Enter the custom-predictor directory. Build and push the Docker image. No modifications are needed to any of the files to follow along. The default Docker tag is latest. We strongly discourage you to use this, as containers are cached on the nodes and in other parts of the CoreWeave stack. Once you have pushed to a tag, do not push to that tag again. Below, we use simple versioning by using tag 1 for the first iteration of the image.
    export DOCKER_USER=thotailtd
    docker build -t $DOCKER_USER/gpt-6b:v1alpha1 .
    docker push $DOCKER_USER/gpt-6b:v1alpha1

Set up repository access

  1. Create a Secret with the Docker Hub credentials. The secret will be named docker-hub. This will be used by nodes to pull your private image. Refer to the Kubernetes Documentation for more details.

    kubectl create secret docker-registry docker-hub --docker-server=https://index.docker.io/v1/ --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>
  2. Tell Kubernetes to use the newly created Secret by patching the ServiceAccount for your namespace to reference this Secret.

    kubectl patch serviceaccounts default --patch "$(cat image-secrets-serviceaccount.patch.yaml)"

Download the model

As we don't want to bundle the model in the Docker image for performance reasons, a storage volume needs to be set up and the pre-trained model downloaded to it. Storage volumes are allocated using a Kubernetes PersistentVolumeClaim. We'll also deploy a simple container that we can use to copy files to our newly created volume.

  1. Apply the PersistentVolumeClaim and the manifest for the sleep container.

    $ kubectl apply -f model-storage-pvc.yaml
    persistentvolumeclaim/model-storage created
    $ kubectl apply -f sleep-deployment.yaml
    deployment.apps/sleep created
  2. The volume is mounted to /models inside the sleep container. Download the pre-trained model locally, create a directory for it in the shared volume and upload it there. The name of the sleep Pod is assigned to a variable using kubectl. You can also get the name with kubectl get pods.

    The model will be loaded to Amazon S3 soon. Now I directly uploaded it to CoreWeave
    
    export SLEEP_POD=$(kubectl get pod -l "app.kubernetes.io/name=sleep" -o jsonpath='{.items[0].metadata.name}')
    kubectl exec -it $SLEEP_POD -- sh -c 'mkdir /models/sentiment'
    kubectl cp ./sleep_383500 $SLEEP_POD:/models/sentiment/
  3. (Optional) Instead of copying the model from the local filesystem, the model can be downloaded from Amazon S3. The Amazon CLI utilities already exist in the sleep container.

    $ export SLEEP_POD=$(kubectl get pod -l "app.kubernetes.io/name=sleep" -o jsonpath='{.items[0].metadata.name}')
    $ kubectl exec -it $SLEEP_POD -- sh
    $# aws configure
    $# mkdir /models/sentiment
    $# aws s3 sync --recursive s3://thot-ai-models /models/sentiment/

Deploy the model

  1. Modify sentiment-inferenceservice.yaml to reference your docker image.

  2. Apply the resources. This can be used to both create and update existing manifests.

     $ kubectl apply -f sentiment-inferenceservice.yaml
     inferenceservice.serving.kubeflow.org/sentiment configured
  3. List pods to see that the Predictor has launched successfully. This can take a minute, wait for Ready to indicate 2/2.

    $ kubectl get pods
    NAME                                                           READY   STATUS    RESTARTS   AGE
    sentiment-predictor-default-px8xk-deployment-85bb6787d7-h42xk  2/2     Running   0          34s

    If the predictor fails to init, look in the logs for clues kubectl logs sentiment-predictor-default-px8xk-deployment-85bb6787d7-h42xk kfserving-container.

  4. Once all the Pods are running, we can get the API endpoint for our model. The API endpoints follow the Tensorflow V1 HTTP API.

    $ kubectl get inferenceservices
    NAME        URL                                                                          READY   DEFAULT TRAFFIC   CANARY TRAFFIC   AGE
    sentiment   http://sentiment.tenant-test.knative.chi.coreweave.com/v1/models/sentiment   True    100                                23h

    The URL in the output is the public API URL for your newly deployed model. A HTTPs endpoint is also available, however this one bypasses any canary deployments. Retrieve this one with kubectl get ksvc.

  5. Run a test prediction on the URL from above. Remember to add the :predict postfix.

     $ curl -d @sample.json http://sentiment.tenant-test.knative.chi.coreweave.com/v1/models/sentiment:predict
    {"predictions": ["positive"]}
  6. Remove the InferenceService. This will delete all the associated resources, except for your model storage and sleep Deployment.

    $ kubectl delete inferenceservices sentiment
    inferenceservice.serving.kubeflow.org "sentiment" deleted
    ```# thot.ai-Back-End
Owner
Tatenda Christopher Chinyamakobvu
Tatenda Christopher Chinyamakobvu
iBOT: Image BERT Pre-Training with Online Tokenizer

Image BERT Pre-Training with iBOT Official PyTorch implementation and pretrained models for paper iBOT: Image BERT Pre-Training with Online Tokenizer.

Bytedance Inc. 435 Jan 06, 2023
用Resnet101+GPT搭建一个玩王者荣耀的AI

基于pytorch框架用resnet101加GPT搭建AI玩王者荣耀 本源码模型主要用了SamLynnEvans Transformer 的源码的解码部分。以及pytorch自带的预训练模型"resnet101-5d3b4d8f.pth"

冯泉荔 2.2k Jan 03, 2023
A 30000+ Chinese MRC dataset - Delta Reading Comprehension Dataset

Delta Reading Comprehension Dataset 台達閱讀理解資料集 Delta Reading Comprehension Dataset (DRCD) 屬於通用領域繁體中文機器閱讀理解資料集。 本資料集期望成為適用於遷移學習之標準中文閱讀理解資料集。 本資料集從2,108篇

272 Dec 15, 2022
Grover is a model for Neural Fake News -- both generation and detectio

Grover is a model for Neural Fake News -- both generation and detection. However, it probably can also be used for other generation tasks.

Rowan Zellers 856 Dec 24, 2022
A sentence aligner for comparable corpora

About Yalign is a tool for extracting parallel sentences from comparable corpora. Statistical Machine Translation relies on parallel corpora (eg.. eur

Machinalis 128 Aug 24, 2022
中文空间语义理解评测

中文空间语义理解评测 最新消息 2021-04-10 🚩 排行榜发布: Leaderboard 2021-04-05 基线系统发布: SpaCE2021-Baseline 2021-04-05 开放数据提交: 提交结果 2021-04-01 开放报名: 我要报名 2021-04-01 数据集 pa

40 Jan 04, 2023
translate using your voice

speech-to-text-translator Usage translate using your voice description this project makes translating a word easy, all you have to do is speak and...

1 Oct 18, 2021
Long text token classification using LongFormer

Long text token classification using LongFormer

abhishek thakur 161 Aug 07, 2022
CCF BDCI 2020 房产行业聊天问答匹配赛道 A榜47/2985

CCF BDCI 2020 房产行业聊天问答匹配 A榜47/2985 赛题描述详见:https://www.datafountain.cn/competitions/474 文件说明 data: 存放训练数据和测试数据以及预处理代码 model_bert.py: 网络模型结构定义 adv_train

shuo 40 Sep 28, 2022
GVT is a generic translation tool for parts of text on the PC screen with Text to Speak functionality.

GVT is a generic translation tool for parts of text on the PC screen with Text to Speech functionality. I wanted to create it because the existing tools that I experimented with did not satisfy me in

Nuked 1 Aug 21, 2022
Kurumi ChatBot

KurumiChatBot Just another Telegram AI chat bot written in Python using Pyrogram. A public running instance can be found on telegram as @TokisakiChatB

Yoga Pranata 3 Jun 28, 2022
Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech

epub2audiobook Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech Input examples qual a pasta do seu

7 Aug 25, 2022
초성 해석기 based on ko-BART

초성 해석기 개요 한국어 초성만으로 이루어진 문장을 입력하면, 완성된 문장을 예측하는 초성 해석기입니다. 초성: ㄴㄴ ㄴㄹ ㅈㅇㅎ 예측 문장: 나는 너를 좋아해 모델 모델은 SKT-AI에서 공개한 Ko-BART를 이용합니다. 데이터 문장 단위로 이루어진 아무 코퍼스나

Dawoon Jung 29 Oct 28, 2022
CMeEE 数据集医学实体抽取

医学实体抽取_GlobalPointer_torch 介绍 思想来自于苏神 GlobalPointer,原始版本是基于keras实现的,模型结构实现参考现有 pytorch 复现代码【感谢!】,基于torch百分百复现苏神原始效果。 数据集 中文医学命名实体数据集 点这里申请,很简单,共包含九类医学

85 Dec 28, 2022
Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers.

Linear Transformers Are Secretly Fast Weight Programmers This repository contains the code accompanying the paper Linear Transformers Are Secretly Fas

Imanol Schlag 77 Dec 19, 2022
Implementation of paper Does syntax matter? A strong baseline for Aspect-based Sentiment Analysis with RoBERTa.

RoBERTaABSA This repo contains the code for NAACL 2021 paper titled Does syntax matter? A strong baseline for Aspect-based Sentiment Analysis with RoB

106 Nov 28, 2022
Fixes mojibake and other glitches in Unicode text, after the fact.

ftfy: fixes text for you print(fix_encoding("(ง'⌣')ง")) (ง'⌣')ง Full documentation: https://ftfy.readthedocs.org Testimonials “My life is li

Luminoso Technologies, Inc. 3.4k Dec 29, 2022
Higher quality textures for the Metal Gear Solid series.

Metal Gear Solid: HD Textures Higher quality textures for the Metal Gear Solid series. The goal is to maximize the quality of assets that the engine w

Samantha 6 Dec 06, 2022
An assignment on creating a minimalist neural network toolkit for CS11-747

minnn by Graham Neubig, Zhisong Zhang, and Divyansh Kaushik This is an exercise in developing a minimalist neural network toolkit for NLP, part of Car

Graham Neubig 63 Dec 29, 2022
Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation

Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation Official Code Repository for the paper "Unsupervised Documen

NLP*CL Laboratory 2 Oct 26, 2021