Falcon LLM
Falcon LLM is a family of large language models (LLMs) whose flagship, Falcon-40B, has 40 billion parameters and can generate natural language and code. It was developed by the Technology Innovation Institute (TII) in Abu Dhabi and is open-sourced under the Apache License, Version 2.0. Falcon LLM can be used for tasks such as chatbots, content creation, language translation, and sentiment analysis. It is presented as one of the most efficient and capable open LLMs, and is reported to outperform GPT-3 and other state-of-the-art models on several benchmarks.
Falcon-40B is a decoder-only model that generates text autoregressively, one token at a time. It was trained on 1 trillion tokens using 384 GPUs on AWS over two months.
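As a rough sanity check on these figures, the sketch below works out the average throughput they imply. It assumes that "two months" means 60 days of uninterrupted training, and it uses the common rule of thumb of about 6 FLOPs per parameter per token for training compute. Both are assumptions made for illustration, not numbers published by TII.

```python
# Back-of-the-envelope arithmetic from the training figures quoted above.
PARAMS = 40e9        # 40 billion parameters
TOKENS = 1e12        # 1 trillion training tokens
GPUS = 384           # GPUs used for training
TRAINING_DAYS = 60   # ASSUMPTION: "two months" taken as 60 days


def gpu_seconds(gpus: int, days: float) -> float:
    """Total GPU-seconds available if every GPU runs for the whole period."""
    return gpus * days * 24 * 60 * 60


def tokens_per_gpu_second(tokens: float, gpus: int, days: float) -> float:
    """Average number of training tokens processed per GPU per second."""
    return tokens / gpu_seconds(gpus, days)


def approx_training_flops(params: float, tokens: float) -> float:
    """ASSUMPTION: ~6 FLOPs per parameter per token (forward + backward pass)."""
    return 6 * params * tokens


if __name__ == "__main__":
    rate = tokens_per_gpu_second(TOKENS, GPUS, TRAINING_DAYS)
    flops = approx_training_flops(PARAMS, TOKENS)
    print(f"Average throughput: {rate:.0f} tokens per GPU-second")
    print(f"Approximate training compute: {flops:.2e} FLOPs")
```

Under these assumptions the run averages on the order of 500 tokens per GPU per second. The real figure depends on the actual training duration and any downtime, which the text does not specify.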
The pretraining data for Falcon came from public web crawls that were filtered, deduplicated, and cleaned to remove machine-generated text and adult content. The resulting pretraining dataset had almost five trillion tokens.
This dataset was supplemented with curated sources such as academic papers and social media conversations to broaden Falcon's skills.
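To make the filtering and deduplication steps concrete, here is a minimal sketch of what such a cleaning pass can look like. It is not TII's actual pipeline: the function names, the quality heuristic, and the thresholds are hypothetical, and it only removes exact duplicates after normalization, whereas a production pipeline would use far more elaborate filters.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def looks_low_quality(text: str, min_words: int = 5) -> bool:
    """Hypothetical heuristic: reject very short or highly repetitive text."""
    words = normalize(text).split()
    if len(words) < min_words:
        return True
    most_common = max(words.count(w) for w in set(words))
    # Reject if a single word makes up more than half of the document.
    return most_common / len(words) > 0.5


def filter_and_dedup(docs: list[str]) -> list[str]:
    """Keep the first copy of each document that passes the quality filter."""
    seen: set[str] = set()
    kept: list[str] = []
    for doc in docs:
        if looks_low_quality(doc):
            continue
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate after normalization
        seen.add(digest)
        kept.append(doc)
    return kept


if __name__ == "__main__":
    sample = [
        "The quick brown fox jumps over the lazy dog.",
        "the  quick brown fox jumps over the lazy dog.",
        "buy buy buy buy buy buy now",
        "too short",
    ]
    print(filter_and_dedup(sample))
```

Hashing the normalized text keeps memory use proportional to the number of unique documents rather than their total size, which is why this pattern is a common starting point for deduplicating large crawls.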
Finally, Falcon’s performance was evaluated on open-source benchmarks such as EAI Harness, HELM, and BigBench.
Artefact | Link | Type | Details |
---|---|---|---|
🥇 Falcon-40B | Here | pretrained model | 40B parameters trained on 1,000 billion tokens. |
Falcon-40B-Instruct | Here | instruction/chat model | Falcon-40B finetuned on the Baize dataset. |
🥈 Falcon-7B | Here | pretrained model | 6.7B parameters trained on 1,500 billion tokens. |
Falcon-7B-Instruct | Here | instruction/chat model | Falcon-7B finetuned on the Baize, GPT4All, and GPTeacher datasets. |
📀 RefinedWeb | Here | pretraining web dataset | ~600 billion “high-quality” tokens. |
Falcon-RW-1B | Here | pretrained model | 1.3B parameters trained on 350 billion tokens. |
Falcon-RW-7B | Here | pretrained model | 7.5B parameters trained on 350 billion tokens. |
Is Falcon LLM free?
Yes, Falcon LLM is free for research and commercial use under the Apache License Version 2.0.
How can I access Falcon LLM?
You can access Falcon LLM through its Hugging Face page, where the model weights are published.
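As an illustration, the sketch below maps the model names from the table above to their repositories under the `tiiuae` organization on the Hugging Face Hub and generates text with the `transformers` library's `pipeline` helper. The helper names `repo_id` and `generate` are hypothetical, and the sketch assumes a recent `transformers` release with PyTorch installed and enough memory for the chosen model.

```python
# Hugging Face Hub repository IDs for the Falcon models listed in the table.
FALCON_REPOS = {
    "Falcon-40B": "tiiuae/falcon-40b",
    "Falcon-40B-Instruct": "tiiuae/falcon-40b-instruct",
    "Falcon-7B": "tiiuae/falcon-7b",
    "Falcon-7B-Instruct": "tiiuae/falcon-7b-instruct",
    "Falcon-RW-1B": "tiiuae/falcon-rw-1b",
    "Falcon-RW-7B": "tiiuae/falcon-rw-7b",
}


def repo_id(name: str) -> str:
    """Return the Hub repository ID for a Falcon model name (hypothetical helper)."""
    try:
        return FALCON_REPOS[name]
    except KeyError:
        raise ValueError(f"Unknown Falcon model: {name!r}") from None


def generate(prompt: str, name: str = "Falcon-7B-Instruct",
             max_new_tokens: int = 50) -> str:
    """Download the model on first use and return the generated text."""
    # Imported lazily so the lookup helper works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=repo_id(name))
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"]


if __name__ == "__main__":
    print(repo_id("Falcon-40B"))
    # Uncomment to run generation; this downloads the model weights.
    # print(generate("Write a haiku about falcons."))
```

The smaller models are the practical choice for a first experiment, since Falcon-40B needs considerably more memory than a typical single GPU provides.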
What is Falcon-40B?
Falcon-40B is a general-purpose large language model with 40 billion parameters trained on one trillion tokens of text and code. It is one of the four versions of Falcon LLM available.
Can I submit a use case proposal?
Yes, you can submit a use case proposal for Falcon LLM to the Technology Innovation Institute (TII) through its website. TII is offering training compute power and commercialization opportunities for exceptional use cases.