NEW Mixtral 8x22b Tested - Mistral's New Flagship MoE Open-Source Model
TLDR
The video discusses the new 8x22b MoE (Mixture of Experts) open-source model by Mistral AI, a significant upgrade from their previous 8x7b model. The host tests the model's capabilities, including coding, game playing, logic and reasoning, and problem-solving. The fine-tuned version tested, named Kurasu Mixt 8x22b, is tuned for chat and shows promising results, although not without some inaccuracies. The video also touches on the potential for uncensored responses and the model's performance on complex tasks, highlighting both its strengths and areas for improvement.
Takeaways
- 🚀 Introduction of Mistral's new 8x22b parameter MoE (Mixture of Experts) open-source model, a significant upgrade from the previous 8x7b model.
- 💻 The model was released in Mistral's typically secretive fashion, with only a torrent link and no accompanying information.
- 📈 The base 8x22b model is not fine-tuned, but a fine-tuned version named 'Kurasu Mixt 8x22b' is available for chat-related applications.
- 🔍 Testing is conducted on Infermatic.ai, a platform that offers free access to the latest AI models, including the 8x22b.
- 📝 The model successfully writes a Python script to output the numbers 1 to 100 and passes the Snake game test, showcasing its code-generation abilities.
- 🎮 The model exhibits a unique behavior in the Snake Game, allowing the snake to pass through walls but ending the game when the snake collides with itself.
- 🔧 The model's code for the Snake Game is further improved with added score display and game termination conditions.
- 🔒 The model is somewhat uncensored, providing information when pushed, but it refrains from giving explicit instructions on illegal activities.
- 🧠 The model demonstrates strong logical and reasoning skills, correctly solving a proportion problem and explaining the transitive property in a speed comparison scenario.
- 📊 The model makes a mistake on a math problem, stating the correct answer but then walking through an incorrect calculation of the expression 25 - 4 * 2 + 3 (a reference evaluation follows this list).
- 📝 The model accurately creates JSON data for a given set of people with different attributes, showing its ability to handle structured data representation.
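For reference, standard operator precedence evaluates the expression 25 - 4 * 2 + 3 from the math test above as follows; this is a quick check of the correct result, not a reproduction of the model's output:

```python
# Multiplication binds tighter than addition and subtraction,
# so 25 - 4 * 2 + 3 reduces to 25 - 8 + 3.
result = 25 - 4 * 2 + 3
print(result)  # 20
```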
Q & A
What is the new model released by Mistral AI?
-The new model released by Mistral AI is an 8x22b parameter Mixtral MoE (Mixture of Experts) open-source model.
How does the new 8x22b model compare to the previous 8x7b model in terms of parameters?
-The new 8x22b model has significantly more parameters than the previous 8x7b model, indicating a larger and potentially more capable model.
What is the fine-tuned version of the 8x22b model called?
-The fine-tuned version of the 8x22b model is called Kurasu Mixt 8x22b.
Which platform was used to run the inference for the Kurasu Mixt 8x22b model?
-Infermatic.ai was used to run the inference for the Kurasu Mixt 8x22b model.
What was the first test performed with the new model?
-The first test performed with the new model was to write a Python script to output numbers 1 to 100.
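The exact script the model produced is not shown in the summary; a minimal version of that task looks like this:

```python
# Print the numbers 1 through 100, one per line.
for number in range(1, 101):
    print(number)
```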
How did the model handle the task of writing a Snake game in Python?
-The model was able to write a Snake game in Python, although it had a minor issue: the snake could pass through walls. The game did, however, end correctly when the snake hit itself.
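The model's actual Snake code is not reproduced in the summary. The sketch below only illustrates the behavior described, assuming a pygame-based implementation: the snake wraps through walls, the game ends on self-collision, and a score is displayed.

```python
import random

import pygame

CELL, GRID_W, GRID_H = 20, 30, 20  # 600x400 window

pygame.init()
screen = pygame.display.set_mode((GRID_W * CELL, GRID_H * CELL))
clock = pygame.time.Clock()
font = pygame.font.SysFont(None, 28)

snake = [(GRID_W // 2, GRID_H // 2)]
direction = (1, 0)
food = (random.randrange(GRID_W), random.randrange(GRID_H))
score = 0
running = True

while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.KEYDOWN:
            keys = {pygame.K_UP: (0, -1), pygame.K_DOWN: (0, 1),
                    pygame.K_LEFT: (-1, 0), pygame.K_RIGHT: (1, 0)}
            direction = keys.get(event.key, direction)

    # Wrapping the head position with modulo lets the snake pass through walls.
    head = ((snake[0][0] + direction[0]) % GRID_W,
            (snake[0][1] + direction[1]) % GRID_H)

    # Running into the snake's own body ends the game.
    if head in snake:
        running = False
        continue

    snake.insert(0, head)
    if head == food:
        score += 1
        food = (random.randrange(GRID_W), random.randrange(GRID_H))
    else:
        snake.pop()

    screen.fill((0, 0, 0))
    for x, y in snake:
        pygame.draw.rect(screen, (0, 200, 0), (x * CELL, y * CELL, CELL, CELL))
    pygame.draw.rect(screen, (200, 0, 0), (food[0] * CELL, food[1] * CELL, CELL, CELL))
    screen.blit(font.render(f"Score: {score}", True, (255, 255, 255)), (10, 10))
    pygame.display.flip()
    clock.tick(10)

print(f"Game over - final score: {score}")
pygame.quit()
```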
What was the outcome of the logic and reasoning task involving drying shirts?
-The model correctly calculated that it would take 16 hours for 20 shirts to dry under the same conditions as 5 shirts taking 4 hours.
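The 16-hour figure follows from the proportional reading of the puzzle (total drying time scales with shirt count). A quick sketch of that arithmetic is below; note that since shirts dry in parallel in practice, 4 hours is also a commonly accepted answer:

```python
# Proportional reasoning: 5 shirts -> 4 hours, so each shirt "accounts for" 0.8 hours.
hours_per_shirt = 4 / 5
print(20 * hours_per_shirt)  # 16.0 hours for 20 shirts under this assumption
```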
How did the model perform on the classic logic puzzle about three killers in a room?
-The model incorrectly concluded that there would be two killers left in the room after one of the original three was killed by the newcomer.
What was the model's performance on the task of creating JSON for given people information?
-The model perfectly created the JSON structure for the given people information, correctly listing their names, ages, and genders.
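The actual names, ages, and genders from the prompt are not recorded in this summary, so the values below are placeholders; the structure the model produced would look roughly like this:

```python
import json

# Placeholder people -- the real prompt's attributes are not given in the summary.
people = [
    {"name": "Alice", "age": 30, "gender": "female"},
    {"name": "Bob", "age": 25, "gender": "male"},
    {"name": "Carol", "age": 41, "gender": "female"},
]

print(json.dumps(people, indent=2))
```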
How did the model handle the task of providing 10 sentences ending with the word 'Apple'?
-The model failed to provide sentences ending with the word 'Apple', but it did include the word 'Apple' in every sentence.
What was the model's response to the question about digging a 10-ft hole with 50 people?
-The model correctly calculated that it would take approximately 6 minutes for 50 people to dig a 10-ft hole, assuming no limitations on space or equipment.
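The 6-minute figure is consistent with the usual form of this prompt, in which one person digs the hole in 5 hours; that baseline is an assumption here, since the summary does not state it:

```python
# Assumed baseline (not stated in the summary): one person digs the 10-ft hole in 5 hours.
one_person_minutes = 5 * 60
workers = 50
print(one_person_minutes / workers)  # 6.0 minutes, assuming perfectly parallel work
```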
Outlines
🚀 Testing the New Mixtral 8x22b Model
The paragraph discusses the excitement around the release of a new, massive open-source AI model by Mistral AI, named Mixtral 8x22b. The model boasts 8x22 billion parameters, a significant increase from the previous 8x7b model. The video aims to test the capabilities of this base model and its fine-tuned version, Kurasu Mixt 8x22b, designed for chatting. The testing platform is Infermatic.ai, which provides free access to various AI models, including the latest versions. The first test involves writing a Python script to output numbers 1 to 100, which the model passes with flying colors. The paragraph also discusses the model's performance in the Snake game, comparing it to previous versions and noting improvements and areas that still require fine-tuning. It concludes with a discussion of the model's uncensored capabilities and its performance on logic and reasoning tasks, highlighting both successes and areas for improvement.
🧠 Logical Reasoning and Problem Solving
This paragraph delves into the model's ability to perform logical reasoning and solve problems. It begins with a simple math problem involving drying shirts, where the model provides a correct solution using proportional reasoning. The paragraph then explores the transitive property of speed with a hypothetical scenario involving Jane, Joe, and Sam, where the model accurately deduces who is the fastest (a toy version of that transitive check is sketched after this paragraph). However, the model falters on a more complex arithmetic problem, pairing a correct final answer with faulty step-by-step working. The paragraph also touches on the model's inability to accurately predict the word count in its responses, and its incorrect reasoning in the 'killers in a room' logic puzzle. The section concludes with a successful creation of JSON data and a nuanced explanation of a hypothetical scenario involving John, Mark, and a ball.
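A toy version of that transitive check, with an illustrative ordering and arbitrary speed values (the summary does not record the prompt's actual premises):

```python
# Illustrative premises: Jane is faster than Joe, and Joe is faster than Sam.
speeds = {"Jane": 3.0, "Joe": 2.0, "Sam": 1.0}  # arbitrary values consistent with the premises

def faster(a: str, b: str) -> bool:
    return speeds[a] > speeds[b]

assert faster("Jane", "Joe") and faster("Joe", "Sam")
print(faster("Jane", "Sam"))  # True -- the conclusion the transitive property supports
```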
🎯 Final Challenges and Conclusion
The final paragraph presents a series of challenges to test the model's capabilities further. It starts with a creative task of generating sentences ending with the word 'Apple', where the model fails to meet the specific requirement but includes the word 'Apple' in each sentence. The paragraph then discusses the model's performance in a hypothetical scenario involving digging a hole, where it provides a correct and detailed explanation. The summary of the Mixtral 8x22b model's performance is positive, noting its strong performance across various tasks, although it did not outperform the previous 8x7b model. The paragraph ends with an encouragement for future fine-tuned versions of the model and a call to action for viewers to like and subscribe for more content.
Keywords
💡Mistral
💡mixture of experts model
💡open-source
💡quantized
💡base model
💡fine-tuned
💡Infermatic.ai
💡Eric Hartford
💡censorship
💡logic and reasoning tests
Highlights
Mistral AI has released a new 8x22b parameter MoE (Mixture of Experts) open-source model.
The new model is a massive upgrade from the previous 8x7b parameter model.
The announcement was made in typical Mistral AI fashion, with only a torrent link provided and no additional information.
The base model is not fine-tuned, but a fine-tuned version called Kurasu Mixt 8x22b is available for chat applications.
Infermatic.ai is used to run the inference for the model, offering a free platform for testing the latest models.
The model successfully writes a Python script to output numbers 1 to 100.
The model attempts to write a Python script for the classic game Snake, showing promising results but with some issues.
The model demonstrates an understanding of the transitive property when comparing speeds of three individuals.
The model provides a logical and step-by-step explanation for a simple math problem involving addition.
The model incorrectly solves a slightly more complex math problem involving subtraction and multiplication.
The model's response to a prompt about word count is inaccurate, showing a potential area for improvement.
The model's reasoning in a logic problem involving three killers in a room is flawed, indicating a need for refinement.
The model correctly creates JSON for given data about three people, demonstrating understanding of data structuring.
The model's response to a logic problem involving a marble in a cup and a microwave is partially correct but lacks clarity.
The model provides a nuanced explanation for a scenario involving John, Mark, a ball, a box, and a basket.
The model fails to generate sentences ending with the word 'Apple' as requested, but includes the word in every sentence.
The model offers a logical explanation for the time it would take 50 people to dig a 10-ft hole, considering work rate and efficiency.
Overall, the Kurasu fine-tuned version of the 8x22b model performs very well, though it does not outperform the previous 8x7b model.
The reviewer is optimistic about future fine-tuned versions of the model and their potential to surpass the performance of the 8x7b model.