FlagAI

FlagAI: Toolkit for Large-Scale General AI Models


FlagAI is a fast, easy-to-use, and extensible toolkit for large-scale models. It supports training, fine-tuning, and deploying models on a variety of downstream tasks across multiple modalities. It provides an API to quickly download pre-trained models and fine-tune them on multiple datasets. It also enables parallel training with fewer than 10 lines of code and includes a prompt-learning toolkit for few-shot tasks.

Features

Quickly Download Models via API

Download over 30 models via the API, such as Aquila, AltCLIP, AltDiffusion, and WuDao GLM, for Chinese and English tasks.

Parallel training with fewer than 10 lines of code.

FlagAI integrates PyTorch, DeepSpeed, Megatron-LM, and BMTrain for easy data/model parallelism in fewer than 10 lines of code.

Use few-shot learning tools easily.

FlagAI offers a toolkit for prompt-learning that enables few-shot performance.

Good at Chinese tasks

These models handle Chinese and English text for various tasks, such as text classification, information extraction, question answering, summarization, and text generation. They are especially well suited to Chinese tasks.

Getting Started 🚀

Requirements

  • Python version >= 3.8
  • PyTorch version >= 1.8.0
  • [Optional] For training/testing models on GPUs, you’ll also need to install CUDA and NCCL
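The version requirements above can be checked before installing anything else. The following is a minimal sketch, not part of FlagAI: the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are illustrative assumptions, and it only reads the installed `torch` package metadata without importing PyTorch.

```python
import sys

try:
    from importlib import metadata  # standard library from Python 3.8 on
except ImportError:  # older interpreters already fail the Python check
    metadata = None


def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements():
    """Report whether Python >= 3.8 and PyTorch >= 1.8.0 are available."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    report = {"python": meets_minimum(python_version, "3.8"), "torch": False}
    if metadata is not None:
        try:
            report["torch"] = meets_minimum(metadata.version("torch"), "1.8.0")
        except metadata.PackageNotFoundError:
            pass  # PyTorch is not installed
    return report


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing or too old'}")
```

The sketch does not check CUDA or NCCL, which are optional and depend on the GPU setup.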

Installation

  • To install FlagAI with pip:
pip install -U flagai
  • [Optional] To install FlagAI and develop locally:
git clone https://github.com/FlagAI-Open/FlagAI.git
cd FlagAI
python setup.py install
  • [Optional] For faster training, install NVIDIA’s apex:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
  • [Optional] For ZeRO optimizers, install DeepSpeed (>= 0.7.7)
git clone https://github.com/microsoft/DeepSpeed
cd DeepSpeed
DS_BUILD_CPU_ADAM=1 DS_BUILD_AIO=1 DS_BUILD_UTILS=1 pip install -e .
ds_report # check the DeepSpeed status
  • [Optional] For BMTrain training, install BMTrain (>= 0.2.2)
git clone https://github.com/OpenBMB/BMTrain
cd BMTrain
python setup.py install
  • [Optional] For BMInf low-resource inference, install BMInf
pip install bminf

  • [Optional] For Flash Attention, install flash-attn
pip install flash-attn
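Since most of the packages above are optional, it can help to see which ones are actually present in the current environment. The sketch below is an illustration only: the helper name `installed_extras` is hypothetical, and the import names (`apex`, `deepspeed`, `bmtrain`, `bminf`, `flash_attn`) are assumed to be the module names those packages install.

```python
from importlib.util import find_spec

# Assumed import names for the optional packages, mapped to what they are used for.
OPTIONAL_MODULES = {
    "apex": "faster training",
    "deepspeed": "ZeRO optimizers",
    "bmtrain": "BMTrain training",
    "bminf": "BMInf low-resource inference",
    "flash_attn": "Flash Attention",
}


def installed_extras(modules=OPTIONAL_MODULES):
    """Map each module name to True if it can be found, without importing it."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, present in installed_extras().items():
        status = "installed" if present else "not installed"
        print(f"{name:<12} {status:<14} ({OPTIONAL_MODULES[name]})")
```

Using `find_spec` avoids importing heavy packages just to detect them; it reports presence only, not the installed version.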

To access your docker environment on a single node, configure the port used for SSH. For example, use root@127.0.0.1 with port 7110.

>>> vim ~/.ssh/config
Host 127.0.0.1
    Hostname 127.0.0.1
    Port 7110
    User root
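When several containers need similar entries, the stanza above can be generated rather than typed by hand. This is a minimal sketch with a hypothetical helper, `render_ssh_host`; it only builds the text and leaves writing to `~/.ssh/config` to the reader.

```python
def render_ssh_host(host, port, user="root"):
    """Return one ~/.ssh/config entry as text, in the layout shown above."""
    if not 1 <= int(port) <= 65535:
        raise ValueError(f"port out of range: {port}")
    lines = [
        f"Host {host}",
        f"    Hostname {host}",
        f"    Port {int(port)}",
        f"    User {user}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the entry for review; append it to ~/.ssh/config by hand.
    print(render_ssh_host("127.0.0.1", 7110), end="")
```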

To enable secure communication between docker nodes, create SSH keys and distribute the public key to each node (in ~/.ssh/):

>>> ssh-keygen -t rsa -C "your_email@example.com"

Load model and tokenizer

With the AutoLoader class from FlagAI, you can easily load the model and tokenizer you need, for example:

from flagai.auto_model.auto_loader import AutoLoader

auto_loader = AutoLoader(
    task_name="title-generation",
    model_name="BERT-base-en"
)
model = auto_loader.get_model()
tokenizer = auto_loader.get_tokenizer()

The task_name parameter selects the task the model is loaded for, and it can be changed to target a different task. This example uses the title-generation task. You can then fine-tune or test the loaded model and tokenizer on this task.
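If you expect to switch tasks often, the loading step can be wrapped in a small function. The sketch below is one possible arrangement, not FlagAI's own API: `load_for_task` is a hypothetical helper that uses only the `AutoLoader` calls shown above, and it imports FlagAI lazily so the module can be imported even where FlagAI is not installed.

```python
def load_for_task(task_name, model_name):
    """Return (model, tokenizer) for one task/model pair via AutoLoader."""
    # Imported inside the function so that merely importing this module
    # does not require FlagAI; calling the function does.
    from flagai.auto_model.auto_loader import AutoLoader

    loader = AutoLoader(task_name=task_name, model_name=model_name)
    return loader.get_model(), loader.get_tokenizer()
```

Calling `load_for_task("title-generation", "BERT-base-en")` reproduces the example above. Which other task and model names are valid depends on the installed FlagAI version, so check its documentation before changing them.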

Toolkits and Pre-trained Models

The code is partially based on GLM, Transformers, timm, and DeepSpeedExamples.

Toolkits

Name | Description | Examples
GLM_custom_pvp | Customizing PET templates | README.md
GLM_ptuning | p-tuning tool | ——
BMInf-generate | Accelerating generation | README.md

Pre-trained Models

Model | Task | Train | Finetune | Inference/Generate | Examples
Aquila | Natural Language Processing | ✅ | ✅ | ✅ | README.md
ALM | Arabic Text Generation | ✅ | ❌ | ✅ | README.md
AltCLIP | Image-Text Matching | ✅ | ✅ | ✅ | README.md
AltCLIP-m18 | Image-Text Matching | ✅ | ✅ | ✅ | README.md
AltDiffusion | Text-to-Image Generation | ❌ | ❌ | ✅ | README.md
AltDiffusion-m18 | Text-to-Image Generation, supporting 18 languages | ❌ | ❌ | ✅ | README.md
BERT-title-generation-english | English Title Generation | ✅ | ❌ | ✅ | README.md
CLIP | Image-Text Matching | ✅ | ❌ | ✅ | ——
CPM3-finetune | Text Continuation | ❌ | ✅ | ❌ | ——
CPM3-generate | Text Continuation | ❌ | ❌ | ✅ | ——
CPM3_pretrain | Text Continuation | ✅ | ❌ | ❌ | ——
CPM_1 | Text Continuation | ❌ | ❌ | ✅ | README.md
EVA-CLIP | Image-Text Matching | ✅ | ✅ | ✅ | README.md
Galactica | Text Continuation | ❌ | ❌ | ✅ | ——
GLM-large-ch-blank-filling | Blank Filling | ❌ | ❌ | ✅ | TUTORIAL
GLM-large-ch-poetry-generation | Poetry Generation | ✅ | ❌ | ✅ | TUTORIAL
GLM-large-ch-title-generation | Title Generation | ✅ | ❌ | ✅ | TUTORIAL
GLM-pretrain | Pre-Train | ✅ | ❌ | ❌ | ——
GLM-seq2seq | Generation | ✅ | ❌ | ✅ | ——
GLM-superglue | Classification | ✅ | ❌ | ❌ | ——
GPT-2-text-writting | Text Continuation | ❌ | ❌ | ✅ | TUTORIAL
GPT2-text-writting | Text Continuation | ❌ | ❌ | ✅ | ——
GPT2-title-generation | Title Generation | ❌ | ❌ | ✅ | ——
OPT | Text Continuation | ❌ | ❌ | ✅ | README.md
RoBERTa-base-ch-ner | Named Entity Recognition | ✅ | ❌ | ✅ | TUTORIAL
RoBERTa-base-ch-semantic-matching | Semantic Similarity Matching | ✅ | ❌ | ✅ | TUTORIAL
RoBERTa-base-ch-title-generation | Title Generation | ✅ | ❌ | ✅ | TUTORIAL
RoBERTa-faq | Question-Answer | ❌ | ❌ | ✅ | README.md
Swinv1 | Image Classification | ✅ | ❌ | ✅ | ——
Swinv2 | Image Classification | ✅ | ❌ | ✅ | ——
T5-huggingface-11b | Train | ✅ | ❌ | ❌ | TUTORIAL
T5-title-generation | Title Generation | ❌ | ❌ | ✅ | TUTORIAL
T5-flagai-11b | Pre-Train | ✅ | ❌ | ❌ | ——
ViT-cifar100 | Pre-Train | ✅ | ❌ | ❌ | ——


Belmechri

I am an IT engineer, content creator, and proud father with a passion for innovation and excellence. In both my personal and professional life, I strive for excellence and am committed to finding innovative solutions to complex problems.