MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs

Jonathan Frankle, Chief Scientist @MosaicML, just announced the latest entry in the MosaicML Foundation Series: MPT-7B.

There's a lot to absorb about this one. MosaicML trained the model from scratch on 1 trillion tokens; training took 9.5 days and cost $200,000. It's Apache-2.0 licensed and the model weights are available today.
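Since the weights are openly licensed you can pull them down and run them yourself. Here's a minimal sketch of what that might look like with Hugging Face transformers, assuming the weights are published on the Hub under an ID like mosaicml/mpt-7b (check the release for the exact repo); MPT ships a custom model class, so trust_remote_code=True would be needed:

```python
# A minimal sketch, assuming a Hub repo ID of "mosaicml/mpt-7b" -
# verify the exact ID against MosaicML's announcement.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mosaicml/mpt-7b")
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    trust_remote_code=True,  # MPT uses a custom model class not built into transformers
)

# Generate a short continuation from a prompt
inputs = tokenizer("MosaicML just released", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```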

They're accompanying the base model with an instruction-tuned model called MPT-7B-Instruct (licensed for commercial use) and a non-commercially licensed MPT-7B-Chat, trained using OpenAI-generated data. They also announced MPT-7B-StoryWriter-65k+, "a model designed to read and write stories with super long context lengths", with a previously unheard-of 65,000 token context length.

They're releasing these models mainly to demonstrate how inexpensive and powerful their custom model training service is. It's a very convincing demo!
