FemtoGPT
Minimal Generative Pretrained Transformer
FemtoGPT is a minimal Generative Pretrained Transformer written entirely in Rust.
It does not use any external libraries for tensor operations or model training/inference. It follows the same architecture as nanoGPT, which was explained by Andrej Karpathy in his video lecture.
FemtoGPT is a useful resource for anyone who wants to learn more about how large language models work at a low level.
FemtoGPT is a minimalistic implementation of a GPT model that relies only on a few libraries. It uses rand/rand-distr for random generation, serde/bincode for data serialization, and rayon for parallel computing.
However, FemtoGPT is not very efficient, because it uses naive algorithms for basic operations like matrix multiplication.
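To make "naive" concrete, here is a minimal sketch of a textbook triple-loop matrix multiplication in Rust. It is not taken from the FemtoGPT source. The function name matmul_naive and the flat row-major layout are assumptions chosen for illustration.

```rust
/// Multiply an (m x k) matrix `a` by a (k x n) matrix `b`.
/// Both matrices are stored row-major in flat slices.
/// NOTE: hypothetical helper for illustration, not FemtoGPT's actual code.
fn matmul_naive(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k, "a must hold m * k elements");
    assert_eq!(b.len(), k * n, "b must hold k * n elements");

    let mut out = vec![0.0f32; m * n];
    for i in 0..m {
        for j in 0..n {
            // Dot product of row i of `a` with column j of `b`.
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            out[i * n + j] = acc;
        }
    }
    out
}
```

This performs m * k * n multiply-adds with no blocking or vectorization, which is why such an approach tends to be much slower than tuned linear-algebra libraries on large matrices.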
The gradients are verified using a gradient-checking technique, but the layer implementations may still contain some errors.
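Gradient checking usually means comparing a hand-written (analytic) gradient against a finite-difference estimate. The sketch below shows one common form using central differences. The function names, the f64 type, and the relative-error formula are assumptions for illustration and may differ from what FemtoGPT actually does.

```rust
/// Central-difference estimate of the gradient of `f` at `x`.
/// NOTE: hypothetical helper, not part of FemtoGPT's API.
fn numerical_gradient<F: Fn(&[f64]) -> f64>(f: F, x: &[f64], eps: f64) -> Vec<f64> {
    let mut probe = x.to_vec();
    let mut grad = Vec::with_capacity(x.len());
    for i in 0..x.len() {
        let original = probe[i];
        probe[i] = original + eps;
        let plus = f(&probe);
        probe[i] = original - eps;
        let minus = f(&probe);
        probe[i] = original; // restore before moving to the next coordinate
        grad.push((plus - minus) / (2.0 * eps));
    }
    grad
}

/// Largest element-wise relative error between two gradients.
fn max_relative_error(analytic: &[f64], numeric: &[f64]) -> f64 {
    analytic
        .iter()
        .zip(numeric)
        .map(|(a, n)| {
            // Guard against division by zero when both entries are ~0.
            let denom = a.abs().max(n.abs()).max(1e-12);
            (a - n).abs() / denom
        })
        .fold(0.0, f64::max)
}
```

For example, for f(x) = sum of x_i squared, the analytic gradient is 2 * x_i, and the relative error against the numerical estimate should be tiny. A large error would point to a bug in the analytic backward pass.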
To train your GPT model, you need to create a file named dataset.txt and fill it with your desired text. The text should have a low diversity of characters for optimal results.
Then run this command:
cargo run --release
The model will begin training and store the data in the train_data directory. You can pause the training and resume it anytime!
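Because the dataset should have a low diversity of characters, it can help to check how many distinct characters it contains before training. The sketch below assumes a character-level vocabulary, which the advice above suggests but does not confirm. The helper name char_vocabulary is hypothetical.

```rust
use std::collections::BTreeSet;

/// Collect the distinct characters of `text` in sorted order.
/// NOTE: hypothetical helper; assumes a character-level vocabulary.
fn char_vocabulary(text: &str) -> Vec<char> {
    let set: BTreeSet<char> = text.chars().collect();
    set.into_iter().collect()
}
```

One way to use it is to load the file with std::fs::read_to_string("dataset.txt"), pass the contents to char_vocabulary, and inspect the length of the result. If the count is unexpectedly large, the text probably contains stray symbols worth cleaning up.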
Output samples
This is the result of training a 300k-parameter model on the Shakespeare dataset for several hours:
LIS:
Tore hend shater sorerds tougeng an herdofed seng he borind,
Ound ourere sthe, a sou so tousthe ashtherd, m se a man stousshan here hat mend serthe fo witownderstesther s ars at atheno sel theas,
thisth t are sorind bour win soutinds mater horengher
This is not as good as expected, but on the positive side, it seems like it has generated words that are easy to pronounce.
The team's current task is to train a 10M-parameter model and verify the correctness of the code.
UPDATE:
This is the result of further training on a comparable model for several hours:
What like but wore pad wo me che nogns yous dares,
As supt it nind bupart 'the reed:
And hils not es
The model demonstrates some knowledge of vocabulary and syntax!