October 4, 2024

For more than a year, Google has raced to build technology that could match ChatGPT, the eye-opening chatbot offered by the San Francisco artificial intelligence start-up OpenAI.

On Wednesday, the tech giant took another step in the ongoing race, releasing a new version of its own chatbot, Google Bard. Available to English speakers in more than 170 territories and countries, including the United States, beginning immediately, the updated bot is underpinned by new A.I. technology called Gemini, which the company has been developing since the start of the year.

“This is the beginning of the Gemini era,” Sundar Pichai, Google’s chief executive, said in an interview. “It’s the realization of the vision we had when we set up Google DeepMind,” the company’s A.I. lab. He said that Google would roll three different versions of the technology into a wide range of products and services in the coming months.

Mr. Pichai and Demis Hassabis, who oversees Google DeepMind, said Gemini was more powerful than Google’s previous chatbot technologies, and that it could generate more-accurate responses and come closer to mimicking human reasoning in some situations.

“We’re superhappy with Gemini’s performance,” Dr. Hassabis said.

When OpenAI wowed the world with the A.I. chatbot ChatGPT late last year, Google was caught flat-footed. The tech giant had spent years developing similar technology, but like other tech giants — most notably Meta — it was reluctant to release a technology that could generate biased, false or otherwise toxic information.

In March, Google released its own chatbot, Bard, to middling reviews. A month later, the company announced that it had combined its two A.I. labs — Google Brain and DeepMind — bringing together more than 2,000 researchers and engineers. And in May, at its flagship Google I/O conference, it announced that the new Google DeepMind lab had started developing Gemini.

After founding the Brain lab in 2011, Google acquired DeepMind in 2014, paying $650 million for the London A.I. start-up. DeepMind operated largely independently of the Brain lab and the rest of Google for a decade and even tried to spin out of the company in 2017. But as Google struggled to catch up with OpenAI, Mr. Pichai combined the two labs under Dr. Hassabis, a neuroscientist who co-founded DeepMind.

Google released benchmark test results claiming that Gemini’s most powerful version outperformed OpenAI’s latest technology, GPT-4, in several key areas. It is better at generating computer code than previous Google technologies, Mr. Pichai said, and it can more accurately summarize news articles and other text documents.

Gemini was also designed to analyze images and sounds, but those skills will not be rolled into the Bard chatbot until a later date.

Google has built three versions of Gemini with three different sets of skills. The largest, Ultra, is designed to tackle complex tasks and will debut next year. Pro, the mid-tier offering, will be rolled out to numerous Google services, starting Wednesday, with the Bard chatbot. Nano, the smallest version, will power some features on the Pixel 8 Pro smartphone, such as summarizing audio recordings and offering suggested text responses in WhatsApp starting Wednesday.

Gemini is what scientists call a large language model, or L.L.M., a complex mathematical system that can learn skills by analyzing vast amounts of data, including digital books, Wikipedia articles and online bulletin boards. By identifying patterns in all that text, an L.L.M. learns to generate text on its own. That means it can write term papers, generate computer code and even carry on a conversation.

With Gemini, Google has also trained the technology on digital images and sounds. It is what researchers call a “multimodal” system, meaning it can analyze and respond to both images and sounds. If you give it a math problem that includes lines, shapes and other images, for example, it can answer in much the way a high school student would.

That portion of the technology, however, will not be available to consumers until sometime next year. Google also acknowledged that like similar systems, Gemini is prone to mistakes. It can get facts wrong or even “hallucinate” — make stuff up.

Google Cloud, which offers A.I. and computing services to other companies, has been eager to provide clients with Gemini as it competes for deals against OpenAI and Microsoft. After OpenAI briefly forced out Sam Altman, its chief executive, last month, leaving the company in limbo, Google Cloud created a migration plan in an attempt to poach its rival’s customers.

Clients could pay Google the same price as their current OpenAI rate and get cloud credits, or discounts, thrown in.

Google said that cloud customers would have access to Gemini Pro — the mid-tier offering — on Dec. 13. Mr. Pichai said that some outsiders were now testing Gemini Ultra — the most powerful version of the technology.

Though Google has spent the last year racing to retake the A.I. lead from OpenAI, Mr. Pichai said there was enough room in the market for all the A.I. providers.

“It’s so far from a zero-sum game,” Mr. Pichai said. “We have a sense of excitement at what we’re launching. We also realize we’re in very early days because we can see the follow-up progress we are making.”