Google recently released Gemini, its most powerful AI model to date. It quickly took over as the company's flagship AI model, replacing Bard completely. And while Gemini is only about two months old, Google has already introduced us to Gemini 1.5, the next generation of the company's AI tech.
If you've been living under a rock for the past couple of weeks, here's a quick refresher. Google Bard is gone; the company replaced it with Gemini, which you can access through the website and the official app. And if you're looking for a more advanced version of the model, you can try Gemini Advanced, the chatbot that uses the Gemini Ultra model.
So, with Bard gone, it looks like Google has its sights set on eventually replacing Google Assistant as well. We're not sure when that will happen, but we're already seeing signs of it. For example, you can now use the Gemini app as a replacement for Google Assistant on your phone.
Google introduced Gemini 1.5 with an insane context window
Being the 1.5 version of Gemini, this model promises a much more powerful experience than version 1.0. The company announced it in a Google blog post, in which both Google CEO Sundar Pichai and Google DeepMind CEO Demis Hassabis explained why Gemini 1.5 is superior to the first model.
What’s a context window? What are tokens?
Before hopping into what makes this iteration more powerful, here's a refresher on tokens and context windows. A token is a small unit of information that a model can process. It could be a piece of a word, a snippet of audio, a slice of video, or part of an image. For example, even a word as simple as “Toaster” is split into more than one token.
A model can only take in a certain number of tokens at once, and that limit is called the context window. The larger the context window, the more information you can fit into a single query.
Say you paste your 2,000-word college report into Gemini to summarize (roughly 3,000 tokens, since a word typically works out to a bit more than one token). As long as the context window is larger than that, Gemini can take in every bit of information in your report.
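To make that math concrete, here's a minimal sketch in Python. It assumes a rough rule of thumb of about 0.75 words per token (not an official Gemini figure; real tokenizers vary), and the helper functions are purely illustrative:

```python
# Minimal sketch: estimate whether a document fits in a model's context
# window. The 0.75 words-per-token ratio is a rough heuristic, not an
# official Gemini number; actual token counts depend on the tokenizer.

def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Roughly estimate the token count of a text from its word count."""
    return int(len(text.split()) / words_per_token)

def fits_in_context(text: str, context_window: int) -> bool:
    """True if the estimated token count fits inside the context window."""
    return estimate_tokens(text) <= context_window

# Simulate a 2,000-word college report.
report = "word " * 2000
print(estimate_tokens(report))          # ~2,666 tokens for 2,000 words
print(fits_in_context(report, 32_000))  # True: comfortably inside a 32K window
```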
Gemini 1.5 could have a tremendous context window
The most significant change is the massive context window. The company is rolling out Gemini 1.5 Pro for early testing with an impressive context window of 128,000 tokens. To put that into perspective, Gemini 1.0 has a context window of 32,000 tokens, so that's four times as many.
It doesn't stop there, as a small group of testers will get access to a version of Gemini 1.5 with a context window of up to 1 million tokens. With a window that size, you can feed it roughly 700,000 words of text, 30,000 lines of code, 11 hours of audio, or an hour of video, and it will take in every bit of it. To put that in book terms, the first four books of Stephen King's Dark Tower series would fit: more than 609,000 words across over 2,000 pages.
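As a rough sanity check, the same words-per-token heuristic from the sketch above (an assumption, not Google's own figure) puts the announced windows into familiar terms:

```python
# Back-of-the-envelope arithmetic for the announced context windows.
# The 0.75 words-per-token ratio is a rough heuristic, not Google's figure.

WORDS_PER_TOKEN = 0.75

windows = [
    ("Gemini 1.0 Pro", 32_000),
    ("Gemini 1.5 Pro (standard rollout)", 128_000),
    ("Gemini 1.5 Pro (early testers)", 1_000_000),
]

for name, tokens in windows:
    print(f"{name}: {tokens:,} tokens ~ {int(tokens * WORDS_PER_TOKEN):,} words")

# The 1 million-token window works out to roughly 750,000 words, which is
# in the same ballpark as the ~700,000-word figure quoted above.
```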
Google even said that it had tested up to 10 million tokens internally, but that won't be making its way to the public anytime soon. In any case, it's great to see Google pushing the envelope with AI technology this far, this quickly.
Other improvements
Along with the larger context window, you can expect other improvements like better reasoning, better learning skills, and improved ethics and safety. The blog post goes into much more detail about what's powering this AI model, so if you're an AI enthusiast, it's worth a read.