Here’s something to look forward to: Google has officially released Gemini, its most advanced AI model yet. It compares favorably with GPT-4 in various tests, but we should remain cautiously optimistic until independent tests are conducted.
Google’s timing for the launch of Gemini looks ideal, as OpenAI, the developer of GPT-4, is dealing with internal struggles. The coincidence is probably unintended, but it leaves OpenAI needing more time to process and respond to the news.
Google is fully engaged in promoting Gemini, releasing numerous videos on YouTube and Twitter along with an extended post on its blog, all showcasing the AI’s impressive capabilities. However, it’s important to remember that Google is a for-profit company and will naturally put its products in the best possible light.
Some questions have arisen about what Gemini really is (beyond the zodiac sign). The best way to understand Gemini’s abilities is to see them in action, so take a look at the video below.
— Sundar Pichai (@sundarpichai) December 6, 2023
Setting aside any disclaimers, the video post by Sundar Pichai (above) is likely the best demonstration of Gemini’s capabilities. In the video, a Gemini-infused chatbot shows its ability to understand various types of input, primarily audio and visual in this example. Gemini is described as “multimodal,” meaning it can understand text, image, and video inputs.
It can accurately identify objects in photos or videos, transcribe spoken words into text, and provide a meaningful response to complex queries. It can compare different communication modes and identify the significance when multiple inputs are used simultaneously. It can also respond using various forms of output.
The AI model is available in three sizes: Gemini Ultra, the largest and most capable model, intended for highly complex tasks; Gemini Pro, intended for scaling across a wide range of tasks; and Gemini Nano, designed for “on-device tasks.” For example, Google has announced plans to integrate Gemini Nano into the Pixel 8 Pro.
Understanding Google’s benchmarking can be quite challenging unless you closely follow AI training and development. DeepMind CEO Demis Hassabis highlights the most important aspects in Google’s blog post.
In the MMLU (Massive Multitask Language Understanding) benchmark, which tests knowledge and problem solving across 57 subjects such as mathematics, physics, law, and ethics, Gemini Ultra scored an industry-high 90 percent, surpassing GPT-4’s score of 86.4 percent. If the result holds up, it suggests that Gemini has a strong understanding of language across various subjects, potentially making it more versatile and useful in diverse applications.
Hassabis also claims that Gemini surpasses GPT-4 in the new MMMU (Massive Multidiscipline Multimodal Understanding and Reasoning) benchmark, scoring 59.4 percent compared to 56.8 percent.
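To put those headline numbers in perspective, here is a minimal sketch that works out Gemini's lead in percentage points on each benchmark. The scores are the ones Google reported, as quoted above; the names `reported_scores` and `margin_in_points` are made up for this illustration and do not come from Google.

```python
# Scores (percent) as reported by Google and quoted in this article.
# These are vendor-supplied figures, not independent measurements.
reported_scores = {
    "MMLU": {"Gemini": 90.0, "GPT-4": 86.4},
    "MMMU": {"Gemini": 59.4, "GPT-4": 56.8},
}


def margin_in_points(scores):
    """Return Gemini's lead over GPT-4 in percentage points, per benchmark."""
    return {
        name: round(result["Gemini"] - result["GPT-4"], 1)
        for name, result in scores.items()
    }


margins = margin_in_points(reported_scores)
for name, points in margins.items():
    print(f"{name}: Gemini leads by {points} percentage points")
```

By this arithmetic the gap is 3.6 points on MMLU and 2.6 points on MMMU, a lead narrow enough that independent testing will matter.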