October 4, 2024

First, there were talking digital assistants like Siri, Alexa and Google Assistant. Then there were online chatbots like ChatGPT and Google Bard. Now, the two are merging.

On Thursday, Google introduced Gemini, a smartphone app that behaves like a talking digital assistant as well as a conversational chatbot. Responding to voice and text requests, it can answer questions, write poetry, generate images, draft emails, analyze personal photos and take other actions, like setting a timer or placing a phone call.

Immediately available to English speakers in more than 150 countries and territories, including the United States, Gemini replaces Bard and Google Assistant. It is underpinned by artificial intelligence technology that the company has been developing since early last year.

The new app is designed to do an array of tasks, including serving as a personal tutor, helping computer programmers with coding tasks and even preparing job hunters for interviews, Google said.

“It can help you role-play in a variety of scenarios,” said Sissie Hsiao, a Google vice president in charge of the company’s Google Assistant unit, during a briefing with reporters.

When ChatGPT arrived from OpenAI at the end of 2022, wowing the public with the way it answered questions, wrote term papers and generated computer code, Google found itself playing catch-up. Like other tech giants, the company had spent years developing similar technology but had not released a product as advanced as ChatGPT.

(The New York Times sued OpenAI and its partner, Microsoft, in December, claiming copyright infringement of news content related to A.I. systems.)

Google released its own chatbot, Bard, in March of last year to middling reviews. In the weeks that followed, the company merged its two leading A.I. labs, Google Brain and DeepMind, and announced that the combined lab was developing new A.I. technology called Gemini.

Gemini is what researchers call a large language model, or L.L.M., a mathematical system that can learn skills by analyzing vast amounts of data, including books, computer programs and online chatter. By identifying patterns in all that text, an L.L.M. can learn to generate text on its own. That means it can write poetry, generate computer code and even carry on a conversation.

It is also prone to mistakes. It can get facts wrong or “hallucinate” — make stuff up.

Gemini is a “multimodal” system, meaning it can respond to both images and sounds. After analyzing a math problem that included graphs, shapes and other images, it could answer the question much the way a high school student would.

In December, Google used a limited version of this technology to upgrade Bard. Now, the company has retired the Bard name and is releasing a more powerful version of the technology through the Gemini app, which is available on Android phones and the web. A version for iPhones will arrive “in the coming weeks,” Google said.

Google created a free but limited version of the Gemini app. A more powerful version — called Gemini Advanced and underpinned by a version of Google’s Ultra language model — is available for a $19.99 monthly subscription. Google offers a free two-month trial.

Google has released benchmark test results claiming that Ultra outperformed OpenAI’s latest technology, GPT-4, in several key areas, including generating computer code and summarizing news articles.

The Gemini app can also generate, analyze and respond to images. Users can upload a photo from their Super Bowl party, for instance, and ask the app to generate a caption.

Google also said it would offer similar technology through the Google Workspace and Google Cloud business services. This will allow customers to use the technology alongside apps like Gmail and Google Docs.

On Android phones, the new app will replace Google Assistant if users download Gemini. Like Google Assistant, it can respond to voice commands, though it also responds to text commands.

Google said it would also continue to offer and improve Google Assistant.

Last year, OpenAI released a similar version of its ChatGPT chatbot that can respond to voice commands. Most industry insiders believe that the A.I. technology that drives chatbots like ChatGPT will merge with and replace digital assistants like Apple’s Siri and Amazon’s Alexa.