FlagAI: Toolkit for Large-Scale General AI Models

FlagAI is a fast, easy-to-use, and extensible toolkit for large-scale models. It supports training, fine-tuning, and deploying models on a range of multi-modal downstream tasks. It provides an API to quickly download pre-trained models and fine-tune them on multiple datasets. It also enables parallel training with fewer than 10 lines of code and includes a prompt-learning toolkit for few-shot tasks.

Features

Quickly Download Models via API

Download over 30 models via the API, such as Aquila, AltCLIP, AltDiffusion, and WuDao GLM, for Chinese and English tasks.

Parallel training with fewer than 10 lines of code

FlagAI integrates PyTorch, DeepSpeed, Megatron-LM, and BMTrain for easy data/model parallelism in fewer than 10 lines of code.

Use few-shot learning tools easily

FlagAI offers a prompt-learning toolkit for few-shot tasks.

Good at Chinese tasks

The supported models process Chinese and English text for tasks such as text classification, information extraction, question answering, summarization, and text generation. They are especially well suited to Chinese tasks.

Getting Started 🚀

Requirements

Python version >= 3.8
PyTorch version >= 1.8.0
[Optional] For training/testing models on GPUs, you’ll also need to install CUDA and NCCL
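
The version minimums above can be checked from Python before installing FlagAI. The following is a minimal sketch that uses only the standard library; the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are illustrative and not part of FlagAI, and the parser assumes version strings begin with dotted integers, as in `2.1.0+cu118`.

```python
import re
import sys
from importlib import metadata


def parse_version(text):
    """Return the leading dotted integers of a version string as a tuple.

    Short versions are padded to three components so "1.8" equals "1.8.0".
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, component by component."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements():
    """Report whether Python and PyTorch satisfy the minimums listed above."""
    python_ok = sys.version_info >= (3, 8)
    try:
        torch_version = metadata.version("torch")
        torch_ok = meets_minimum(torch_version, "1.8.0")
    except metadata.PackageNotFoundError:
        # PyTorch is not installed in this environment.
        torch_version, torch_ok = None, False
    return {
        "python_ok": python_ok,
        "torch_version": torch_version,
        "torch_ok": torch_ok,
    }


if __name__ == "__main__":
    print(check_requirements())
```

The numeric comparison matters because a plain string comparison would rank `1.13.1` below `1.8.0`.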

Installation

To install FlagAI with pip:

pip install -U flagai

[Optional] To install FlagAI and develop locally:

git clone https://github.com/FlagAI-Open/FlagAI.git
cd FlagAI
python setup.py install

[Optional] For faster training, install NVIDIA’s apex

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

[Optional] For ZeRO optimizers, install DeepSpeed (>= 0.7.7)

git clone https://github.com/microsoft/DeepSpeed
cd DeepSpeed
DS_BUILD_CPU_ADAM=1 DS_BUILD_AIO=1 DS_BUILD_UTILS=1 pip install -e .
ds_report # check the DeepSpeed status

[Optional] For BMTrain training, install BMTrain (>= 0.2.2)

git clone https://github.com/OpenBMB/BMTrain
cd BMTrain
python setup.py install

[Optional] For BMInf low-resource inference, install BMInf

pip install bminf

[Optional] For Flash Attention, install Flash-attention (>=1.0.2)

pip install flash-attn
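
After installing any of the optional packages above, it can help to confirm which ones Python can actually find. The sketch below uses only the standard library. The import names in `OPTIONAL_MODULES` are assumptions based on how these packages are commonly imported (they are not stated in this README), and `find_optional_modules` is an illustrative helper, not a FlagAI function.

```python
from importlib import util

# Assumed import names for the optional packages listed above.
OPTIONAL_MODULES = {
    "apex": "NVIDIA apex (faster training)",
    "deepspeed": "DeepSpeed (ZeRO optimizers)",
    "bmtrain": "BMTrain training",
    "bminf": "BMInf low-resource inference",
    "flash_attn": "Flash Attention",
}


def find_optional_modules(modules=OPTIONAL_MODULES):
    """Map each module name to True if it can be located, without importing it."""
    return {name: util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in find_optional_modules().items():
        status = "installed" if found else "missing"
        print(f"{name:<12} {status:<10} {OPTIONAL_MODULES[name]}")
```

This only checks that a package is discoverable; it does not verify that compiled extensions were built correctly, which is what `ds_report` does for DeepSpeed.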

To access your Docker environment on a single node, you have to configure the port for SSH. For example, use root@127.0.0.1 with port 7110.

>>> vim ~/.ssh/config
Host 127.0.0.1
    Hostname 127.0.0.1
    Port 7110
    User root
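
When several Docker nodes are involved, writing one stanza per node by hand becomes repetitive. As a rough sketch, the entry above could be generated from Python; `render_ssh_host` and `render_ssh_config` are hypothetical helpers, and the second address in the usage example is a made-up node included only for illustration.

```python
def render_ssh_host(host, port, user="root", hostname=None):
    """Return one ssh config stanza in the same layout as the entry above."""
    lines = [
        f"Host {host}",
        f"    Hostname {hostname or host}",
        f"    Port {port}",
        f"    User {user}",
    ]
    return "\n".join(lines) + "\n"


def render_ssh_config(nodes):
    """Join stanzas for several (host, port) pairs, separated by blank lines."""
    return "\n".join(render_ssh_host(host, port) for host, port in nodes)


if __name__ == "__main__":
    # The second node is hypothetical and only shows the multi-node case.
    print(render_ssh_config([("127.0.0.1", 7110), ("192.168.0.2", 7110)]))
```

The printed text can be reviewed and then appended to ~/.ssh/config manually.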

To enable secure communication between Docker nodes, generate an SSH key pair and copy the public key to each node (under ~/.ssh/).

>>> ssh-keygen -t rsa -C "xxx@xxx.com"

Load model and tokenizer

With the AutoLoader class from FlagAI, you can easily load the model and tokenizer you need, for example:

from flagai.auto_model.auto_loader import AutoLoader

auto_loader = AutoLoader(
    task_name="title-generation",
    model_name="BERT-base-en"
)
model = auto_loader.get_model()
tokenizer = auto_loader.get_tokenizer()

The task_name parameter selects the task the model is built for. This example uses the title-generation task; you can then fine-tune or test the returned model and tokenizer on that task.
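
If the same loading pattern is needed for several task and model combinations, it can be wrapped in a small function. This is only a sketch: `load_model_and_tokenizer` and its `loader_cls` parameter are hypothetical and not part of FlagAI. It relies solely on the `AutoLoader` constructor arguments and the two getter methods shown above, and imports FlagAI lazily so the function can be defined (and exercised with a stand-in loader) on a machine without FlagAI installed.

```python
def load_model_and_tokenizer(task_name, model_name, loader_cls=None):
    """Return (model, tokenizer) for the given task and model names.

    loader_cls is a hypothetical hook for substituting a stand-in loader,
    for example in unit tests; by default FlagAI's AutoLoader is used.
    """
    if loader_cls is None:
        # Imported lazily so this function can be defined without FlagAI.
        from flagai.auto_model.auto_loader import AutoLoader
        loader_cls = AutoLoader
    loader = loader_cls(task_name=task_name, model_name=model_name)
    return loader.get_model(), loader.get_tokenizer()
```

Calling `load_model_and_tokenizer("title-generation", "BERT-base-en")` would be equivalent to the example above, assuming FlagAI is installed and the model can be downloaded.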

Toolkits and Pre-trained Models

The code is partially based on GLM, Transformers, timm, and DeepSpeedExamples.

Toolkits

Name	Description	Examples
`GLM_custom_pvp`	Customizing PET templates	README.md
`GLM_ptuning`	p-tuning tool	——
`BMInf-generate`	Accelerating generation	README.md

Pre-trained Models

Model	Task	Train	Finetune	Inference/Generate	Examples
Aquila	Natural Language Processing	✅	✅	✅	README.md
ALM	Arabic Text Generation	✅	❌	✅	README.md
AltCLIP	Image-Text Matching	✅	✅	✅	README.md
AltCLIP-m18	Image-Text Matching	✅	✅	✅	README.md
AltDiffusion	Text-to-Image Generation	❌	❌	✅	README.md
AltDiffusion-m18	Text-to-Image Generation, supporting 18 languages	❌	❌	✅	README.md
BERT-title-generation-english	English Title Generation	✅	❌	✅	README.md
CLIP	Image-Text Matching	✅	❌	✅	——
CPM3-finetune	Text Continuation	❌	✅	❌	——
CPM3-generate	Text Continuation	❌	❌	✅	——
CPM3_pretrain	Text Continuation	✅	❌	❌	——
CPM_1	Text Continuation	❌	❌	✅	README.md
EVA-CLIP	Image-Text Matching	✅	✅	✅	README.md
Galactica	Text Continuation	❌	❌	✅	——
GLM-large-ch-blank-filling	Blank Filling	❌	❌	✅	TUTORIAL
GLM-large-ch-poetry-generation	Poetry Generation	✅	❌	✅	TUTORIAL
GLM-large-ch-title-generation	Title Generation	✅	❌	✅	TUTORIAL
GLM-pretrain	Pre-Train	✅	❌	❌	——
GLM-seq2seq	Generation	✅	❌	✅	——
GLM-superglue	Classification	✅	❌	❌	——
GPT-2-text-writting	Text Continuation	❌	❌	✅	TUTORIAL
GPT2-text-writting	Text Continuation	❌	❌	✅	——
GPT2-title-generation	Title Generation	❌	❌	✅	——
OPT	Text Continuation	❌	❌	✅	README.md
RoBERTa-base-ch-ner	Named Entity Recognition	✅	❌	✅	TUTORIAL
RoBERTa-base-ch-semantic-matching	Semantic Similarity Matching	✅	❌	✅	TUTORIAL
RoBERTa-base-ch-title-generation	Title Generation	✅	❌	✅	TUTORIAL
RoBERTa-faq	Question-Answer	❌	❌	✅	README.md
Swinv1	Image Classification	✅	❌	✅	——
Swinv2	Image Classification	✅	❌	✅	——
T5-huggingface-11b	Train	✅	❌	❌	TUTORIAL
T5-title-generation	Title Generation	❌	❌	✅	TUTORIAL
T5-flagai-11b	Pre-Train	✅	❌	❌	——
ViT-cifar100	Pre-Train	✅	❌	❌	——