A New Era of NovelAI Begins Now
TLDR
The transcript introduces updates to NovelAI's image and text generation. NovelAI is adding inpainting to its image generation, letting users modify images and replace selected elements through the image-to-image interface. Two new V2 modules are being introduced for Sigurd and Euterpe. The highlight is Cleo, NovelAI's first custom-made model, developed in-house with a focus on storytelling. Trained on 1.5 trillion tokens, Cleo has a LAMBADA score of 73% and an 8192-token context while having only 3 billion parameters. Although still experimental, Cleo demonstrates NovelAI's ability to train its own language models. It is initially available to Opus subscribers and is expected to reach all users in two weeks, with more features in the pipeline.
Takeaways
- Inpainting is being integrated into image generation, offering users a new creative avenue.
- A cute dog appears as a brief distraction, keeping the tone of the update lighthearted.
- The image-to-image interface allows users to customize images and replace elements in them.
- The official release of the inpainting feature is announced for two days after the recording, on a Thursday.
- Sigurd and Euterpe each receive a new V2 text generation module, a complete replacement for the original.
- Cleo, a custom-made model, is introduced as the first in-house creation trained from scratch with a custom tokenizer and dataset.
- Cleo has been trained on 1.5 trillion tokens, giving it a broad general knowledge base.
- Cleo achieves a LAMBADA score of 73 percent, surpassing other models of similar size.
- During fine-tuning, Cleo reached a LAMBADA score of 74 percent, exceeding expectations.
- Cleo features an 8192-token context, a significant increase from previous models.
- Cleo is a 3-billion-parameter model, balancing size and performance.
- Cleo serves as a proof of concept, demonstrating the company's capability to train large language models.
- Training of larger models has already begun, hinting at future advancements.
- Opus subscribers get early access to Cleo, while the general release is planned for two weeks later.
- The team thanks the community for its patience and enthusiasm and promises more developments to come.
Q & A
What is the main announcement regarding image generation?
-The main announcement is the introduction of inpainting to image generation, which allows users to modify images and replace selected elements with something else.
When is the inpainting feature officially dropping?
-The inpainting feature is set to be officially released in two days, on Thursday.
What new developments are there for text generation?
-Two new V2 modules are being introduced for text generation, one each for Sigurd and Euterpe. They are complete replacements for the original modules.
Who is Cleo and what makes her special?
-Cleo is the first custom-made model created entirely in-house, trained from scratch with a custom tokenizer, a 6-terabyte pretraining dataset, and a custom fine-tune. She was trained on 1.5 trillion tokens, which gives her strong general knowledge, and she achieves a LAMBADA score of 73 percent.
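For context on the score quoted above: LAMBADA is a benchmark in which a model must predict the final word of a passage, and the score is the fraction of passages where that prediction is correct. The following is a minimal sketch of that accuracy calculation only. The function name and the toy predictions are hypothetical, and this is not NovelAI's evaluation code or data.

```python
def last_word_accuracy(predictions, targets):
    """Fraction of passages whose predicted final word matches the target.

    Exact string match after stripping whitespace is an assumption made
    for this sketch; real evaluation harnesses may normalize differently.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("need at least one example")
    correct = sum(p.strip() == t.strip() for p, t in zip(predictions, targets))
    return correct / len(targets)


# Toy, made-up data purely for illustration (not real model output).
toy_targets = ["door", "river", "sword", "home"]
toy_predictions = ["door", "river", "shield", "home"]
print(f"{last_word_accuracy(toy_predictions, toy_targets):.0%}")
```

On the toy data, three of four predictions match. By the same reading, a score of 73 percent means roughly 73 of every 100 passages are completed with the correct final word.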
What is the significance of Cleo's token context size?
-Cleo features an 8192-token context, a significant increase over previous models, which lets her take much more of the preceding text into account when generating.
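To make the idea of a token context concrete, here is a minimal sketch of how a fixed window limits what a model can see: only the most recent tokens fit, and anything older falls out of view. The function name is hypothetical, and the placeholder token strings stand in for real tokenizer output, which depends on Cleo's custom tokenizer and is not reproduced here.

```python
CONTEXT_TOKENS = 8192  # context size quoted for Cleo in the summary


def fit_to_context(tokens, limit=CONTEXT_TOKENS):
    """Keep only the most recent `limit` tokens of a sequence."""
    if limit <= 0:
        return []
    # Negative slicing keeps the tail; shorter inputs are returned whole.
    return list(tokens[-limit:])


# Placeholder tokens for a long story (assumption: 10,000 tokens).
story = [f"tok{i}" for i in range(10_000)]
visible = fit_to_context(story)
print(len(visible), visible[0])
```

With a 10,000-token story, the first 1,808 tokens no longer fit, so the window starts at `tok1808`. A larger context pushes that cutoff further back in the story.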
How big is Cleo in terms of parameters?
-Cleo is a 3-billion-parameter model, which is small for a language model yet powerful for its size.
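One way to relate the two figures quoted in this summary, 1.5 trillion training tokens and 3 billion parameters, is the ratio of training tokens to parameters. The short sketch below only does that arithmetic; the function name is hypothetical and the ratio is a derived illustration, not a number stated in the video.

```python
TRAINING_TOKENS = 1.5e12  # 1.5 trillion tokens, as quoted in the summary
PARAMETERS = 3e9          # 3 billion parameters, as quoted in the summary


def tokens_per_parameter(tokens, parameters):
    """Training tokens seen per model parameter."""
    if parameters <= 0:
        raise ValueError("parameters must be positive")
    return tokens / parameters


ratio = tokens_per_parameter(TRAINING_TOKENS, PARAMETERS)
print(ratio)
```

This works out to 500 tokens per parameter, which is consistent with the summary's description of a compact model trained on a large amount of data.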
What is the current status of Cleo's availability?
-Cleo is still somewhat experimental, and is initially available for Opus subscribers to play with. The general public can expect to access Cleo in two weeks.
Why is Cleo considered a proof of concept model?
-Cleo serves as proof that the team can successfully train a large language model, which is a significant achievement and a sign of better times to come.
What are the benefits of using Cleo for training larger models?
-Using Cleo, the team was able to finalize the training process, fix any dataset issues, and smooth out any problems before moving on to train even larger models.
What does the team have planned for the future?
-The team has more exciting developments in the pipeline for the year, with larger models being trained and more advancements in AI technology.
How does the team show appreciation to their audience?
-The team expresses their gratitude to the audience for their patience and for keeping them motivated, promising even more cool features and improvements in the future.
What is the current status of the larger models that the team is working on?
-The script mentions that the team has already begun training much larger models, but more details on this will be shared in the future.
Outlines
Introducing Inpainting to Image Generation and New Text Gen Modules
The speaker greets the audience and quickly moves to the first topic, the introduction of inpainting to image generation, with a humorous detour to a cute dog. The feature is available through the image-to-image interface and lets users modify and personalize images. The speaker then introduces two updates to text generation. The first is a pair of new V2 modules for Sigurd and Euterpe, which replace the original versions. The second is Cleo, a custom-made model developed in-house, which is a significant achievement for the team. Cleo is highlighted for its training on a large dataset, its strong general knowledge, and a LAMBADA score that surpasses other models such as Krake. The speaker also mentions Cleo's large token context and compact size, positioning it as a proof of concept for future advancements. Lastly, the speaker expresses gratitude to the team and the audience for their patience and support.
Upcoming Excitement and Subscriber Previews
The speaker teases that more developments are in the pipeline for the current year, aiming to match or exceed the enthusiasm of the audience. Opus subscribers will be able to try the new features first, as a reward for their support. The segment closes with music.
Keywords
Image Generation
Text Generation
Modules
Tokenizer
Pre-trained Data Set
Fine Tune
Parameter Count
LAMBADA Score
Token Context
Proof of Concept
Subscribers
Highlights
Introducing inpainting to image generation, a new feature in NovelAI.
The feature allows users to modify images and replace selected elements with something else.
Official release of the inpainting feature is scheduled for two days from now, on Thursday.
Sigurd and Euterpe are receiving brand new V2 modules.
Cleo, NovelAI's first custom-made model, is introduced.
Cleo was created in-house with a custom tokenizer and trained on a 6 terabyte dataset.
Cleo has been trained on 1.5 trillion tokens, offering better general knowledge.
Cleo achieved a LAMBADA score of 73 percent, surpassing similarly sized models.
During fine-tuning, Cleo reached a LAMBADA score of 74 percent.
Cleo features an 8192 token context, a significant increase from previous models.
Cleo is a 3 billion parameter model, offering a compact yet powerful package.
Cleo serves as a proof of concept for NovelAI's ability to train large language models.
Training on Cleo has helped finalize training processes and resolve dataset issues.
NovelAI has begun training even larger models following the success with Cleo.
Opus subscribers will have early access to Cleo, while the general release is in two weeks.
The NovelAI team expresses gratitude for the patience and support of their community.
More exciting developments are planned for the year ahead.
Subscribers will have the opportunity to experiment with Cleo first.