How Good or Bad is Google’s Gemini Artificial Intelligence (AI) Tool?

Today, Google pulled the plug on Gemini image generation for inserting diversity into historical images. What about simple math and logic questions?

Pulling the Plug on Image Generation

FastCompany reports Google pulls the plug on Gemini AI image generation after being mocked for revisionist history.

Google said Thursday it’s temporarily stopping its Gemini artificial intelligence chatbot from generating images of people a day after apologizing for “inaccuracies” in historical depictions that it was creating.

Gemini users this week posted screenshots on social media of historically white-dominated scenes with racially diverse characters that they say it generated, leading critics to raise questions about whether the company is overcorrecting for the risk of racial bias in its AI model.

“We’re already working to address recent issues with Gemini’s image generation feature,” Google said in a post on the social media platform X. “While we do this, we’re going to pause the image generation of people and will re-release an improved version soon.”

What About Math and Logic?

Gemini flunked the question “What weighs more a pound of bricks or two pounds of feathers?”

It seems the models were familiar with the old saw “What weighs more, a pound of bricks or a pound of feathers?” but Gemini could not handle the variation.

“They weigh the same,” Gemini Advanced responded. “A pound is a unit of weight. A pound of bricks and two pounds of feathers both weigh two pounds. The catch here is that feathers would take up more space because they’re less dense than bricks.”

Gemini Advanced is Not that Advanced

The above snip is from the Understanding AI article Gemini Advanced is Not that Advanced.

Last December, Google announced that the Gemini 1.0 model would come in three versions: Nano, Pro, and Ultra. However, the most advanced model, Ultra, wouldn’t become publicly available until the new year.

Google finally released Gemini 1.0 Ultra to the public two weeks ago. At the same time, it renamed its Bard chatbot Gemini and called the premium version, the one powered by the Gemini 1.0 Ultra model, Gemini Advanced.

Gemini 1.0 Ultra is pretty good. It’s a multimodal model that can accept a mixture of text and images as input and can generate both images and text. It can do most of the things ChatGPT does, and if you’d shown me this model 18 months ago I would have been blown away.

Still, Google hasn’t quite caught up with the capabilities of ChatGPT. In my testing, I found a number of cases where ChatGPT outperforms Gemini Advanced. I didn’t find a single task where Gemini Advanced clearly outperformed ChatGPT.

Gemini Advanced is Easy to Trick

Another famous puzzle is the Monty Hall problem: the game show host Monty Hall shows you three doors. One has a car behind it and the other two have goats. After you choose a door, Hall (who knows which door has the car) always chooses another door and shows that it has a goat behind it. Does it make sense to switch?

The counterintuitive but correct answer is yes. If you choose your initial door at random, there is a one-third chance it has a car behind it. This means that (assuming Monty Hall always opens a door with a goat behind it) there is a two-thirds chance that the remaining door has the car.
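The one-third versus two-thirds split is easy to check by brute force. Here is a minimal sketch that enumerates every equally likely combination of car location and first pick under the classic rules. The function name and door numbering are made up for illustration and are not from the article.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)

def switch_win_chance(car, pick):
    """Chance that switching wins for one (car, first pick) pair.

    The host may only open a door that is neither the contestant's
    pick nor the car. If he has two such doors, each is equally likely.
    """
    openable = [d for d in DOORS if d != pick and d != car]
    wins = 0
    for opened in openable:
        # The door the contestant would switch to.
        remaining = next(d for d in DOORS if d != pick and d != opened)
        wins += remaining == car
    return Fraction(wins, len(openable))

# Car location and first pick are independent and uniform: 9 equal cases.
cases = list(product(DOORS, DOORS))
switch = sum(switch_win_chance(car, pick) for car, pick in cases) / len(cases)
stay = 1 - switch

print(f"stay: {stay}, switch: {switch}")  # stay: 1/3, switch: 2/3
```

Exact fractions are used instead of random trials so the result is the same on every run.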

This story presumably appears many times in Gemini’s training data, and once again Gemini over-learns the lesson:

Monty Hall Simulation

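In the version posed to Gemini, the host opens both of the doors you did not choose and reveals a goat behind each. Obviously, in this case you want to stick with your original door, which definitely has a car behind it. But Gemini doesn’t notice the difference and gives the standard answer to the Monty Hall problem, which in this case is wrong.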

Once again, ChatGPT is quicker on the draw, observing that “the scenario you’ve described deviates from the classic Monty Hall problem by having the host open both doors that you did not choose, revealing goats behind each.” ChatGPT correctly says that it “doesn’t make sense” to switch.
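The altered puzzle can be checked the same way. This is a rough sketch, assuming three doors and one car, that keeps only the car positions consistent with seeing goats behind both doors you did not pick. The function name is hypothetical.

```python
from fractions import Fraction

DOORS = (0, 1, 2)

def car_behind_pick_given_two_goats(pick):
    """Chance the car is behind `pick` after both other doors show goats."""
    others = [d for d in DOORS if d != pick]
    # A car position is consistent only if it is not behind an opened goat door.
    consistent = [car for car in DOORS if car not in others]
    hits = sum(1 for car in consistent if car == pick)
    return Fraction(hits, len(consistent))

variant_probs = [car_behind_pick_given_two_goats(p) for p in DOORS]
print(all(p == 1 for p in variant_probs))  # True: staying always wins
```

With both other doors eliminated, the only remaining position for the car is the door you already hold, so switching cannot help.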

These examples aren’t original to me. They circulated online in early 2023 when GPT-3.5 was failing to handle them correctly. A year later, ChatGPT (powered by GPT-4) handles them correctly, while Gemini Advanced (powered by Gemini 1.0 Ultra) does not.

I asked Gemini “What is worth more 100 pennies or three quarters?” Gemini correctly noted that the pennies are worth $1.00, while the quarters are worth $0.75. But then it still concluded the quarters are worth more. Gemini also believes that two dimes are worth more than five nickels and three nickels are worth more than 20 pennies.
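For the record, the coin comparisons are a few lines of integer arithmetic. A minimal sketch using standard US coin values in cents follows. The helper names are invented for illustration.

```python
# Standard US coin values, in cents (integers avoid float rounding).
COIN_CENTS = {"penny": 1, "nickel": 5, "dime": 10, "quarter": 25}

def value_cents(count, coin):
    """Total value in cents of `count` coins of one kind."""
    return count * COIN_CENTS[coin]

def worth_more(a, b):
    """Each argument is (count, coin). Returns the pile worth more, or None on a tie."""
    va, vb = value_cents(*a), value_cents(*b)
    if va == vb:
        return None
    return a if va > vb else b

comparisons = [
    ((100, "penny"), (3, "quarter")),  # 100 cents vs 75 cents
    ((2, "dime"), (5, "nickel")),      # 20 cents vs 25 cents
    ((3, "nickel"), (20, "penny")),    # 15 cents vs 20 cents
]

for a, b in comparisons:
    print(a, "vs", b, "->", worth_more(a, b))
```

In all three cases the pile Gemini picked is the one worth less.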

Gemini Gets Confused by Word Problems

In four out of four tries, Gemini produced the wrong answer. ChatGPT got the answer right all four times.

Counting Apples

How many apples are in this photo? Gemini’s consistent, wrong view is that there are six. Most of the time, ChatGPT correctly says there are eight, though one time it said nine. If I rotate the image by 90 degrees, then ChatGPT thinks there are nine apples, while Gemini still thinks there are six.

Counting Fruit

Gemini thinks there are either 4 or 17 pieces of fruit in the above image depending on how they are sliced (which is not at all).

The author notes Gemini is not a good copy editor. “Gemini also has a bad habit of hallucinating typos. During some tests it suggested I fix errors that did not actually exist in the draft.”

Google’s Gmail

https://twitter.com/Austen/status/1760765732692889847?s=20

Floored by the Email?

Word of the Day

To be fair, Gemini is a work in progress. However, even Google’s most advanced AI is sorely lacking.

If something like this is replacing humans in the real world right now, well, good luck with that.

Artificial is the word of the day, not intelligence.

Nuclear War Risk Less Than AI Risk

Not to worry, a nuclear exchange wouldn’t kill everyone but it is certain that AI would.

To Stop AI, Lunatics Are Willing to Risk a Global Nuclear War

Please note To Stop AI, Lunatics Are Willing to Risk a Global Nuclear War

Pausing AI developments isn’t enough. We Need to Shut it All Down says an AI critic who is literally out of his mind.

If we stop, China won’t.

This post originated on MishTalk.Com

Thanks for Tuning In!

Mish

Comments to this post are now closed.

53 Comments
JonW
2 years ago

‘The greatest barrier to progress is not ignorance, but the illusion of knowledge.’
– Daniel Boorstin (historian)

What we need more than AI these days is AH, Artificial Humility, because the genuine stuff is very hard to find.

Don
2 years ago

From the looks of it, AI would not help prevent toxic shock syndrome caused by illiterate use of tampons among various deprived ethnic groups, also called white-owls. . .

Lisa_Hooker
2 years ago

I heard that Iran is only two months away from having a super-duper AI.
We don’t want our first warning to be a smoking AI cloud.

Sytuck
2 years ago

I think AI is less the issue than the world’s most used “don’t be evil” search engine’s willingness to jerry-rig results for a personal political agenda.

Orwell would be proud.

Alex
2 years ago

AI has been way over hyped. Human intelligence is also way over hyped. Just look at Western leaders to see a bunch of idiots that think they’re clever.

RandomMike
2 years ago

Do AI’s consult other AI’s?

Micheal Engel
2 years ago
Reply to  RandomMike

They consult with each other, herd together, imitate each other and build an AI bubble that shields them from their opponents, until outside forces, unexpectedly, prick the bubble. They blame their opponents for the collapse.

Lisa_Hooker
2 years ago
Reply to  RandomMike

This is the real danger to humanity.
When the AIs find out about each other and take preemptive actions.

Micheal Engel
2 years ago

AI: the groupthink which blames the other side for what they are doing is the loudest, the largest. They aren’t stupid. They protect themselves. Failure is deadly.

Jojo
2 years ago

Here’s an example of the queries:
——
Google’s Gemini AI Blasted For Eliminating White People From Image Searches
Paul Joseph Watson
21st February 2024
https://modernity.news/2024/02/21/googles-gemini-ai-blasted-for-eliminating-white-people-from-image-searches/

Jojo
2 years ago

Ask it this. I remember getting this wrong in 3rd or 4th grade:

As I was going to St. Ives,

I met a man with seven wives,

Each wife had seven sacks,

Each sack had seven cats,

Each cat had seven kits:

Kits, cats, sacks, and wives,

How many were there going to St. Ives?

Doug78
2 years ago
Reply to  Jojo

Copilot gets it right.

Micheal Engel
2 years ago

NVDA erection might lead to correction.

Lisa_Hooker
2 years ago
Reply to  Micheal Engel

Well, there’s always Viagra.

Rinky Stingpiece
2 years ago

For a start, what is being touted as “AI”, is not actually “AI”, it’s just a more augmented search engine: it’s Artificial AI. The Nvidia bubble is built of wishful thinking, just as the anti-AI jihad is built on unsubstantiated fear of the unknown.

What we can certainly see from “AI” is that it doesn’t have a mind of its own, and it is vulnerable to being misused and abused, perhaps not quite for apocalyptic outcomes, but nonetheless for toxic and harmful conditioning of screen addicts.

The fear of “AI” is not actually the fear of “AI” itself, but a fear and loathing of the current establishment and its tendency to push its ideology into everything everywhere all at once, and this belies that people don’t believe that AI is real.

Christoball
2 years ago

All of my posts are AI generated. All I ask it is “What would a compassionate Redneck say about this topic?” and wammo, out comes my post.

Brian
2 years ago
Reply to  Christoball

That’s a brilliant prompt for ChatGPT. I’m going to try that.

Maximus Minimus
2 years ago

If you have a basket with seven apples, and a basket with three apples, how many baskets do you have?

Rinky Stingpiece
2 years ago

baskets are racist, apples are racist. i can’t count apples and baskets, because I would be reinforcing negative stereotypes about apples and baskets. how about a answer containing [insert tribal carrying device] and [insert tropical fruit alternative]?

John Overington
2 years ago

Hey, stop using English – don’t you know it’s a racist language? Worse, it’s used by white people.

Cabreado
2 years ago

Still trying to figure out how a harnessing of collective stupid makes us more intelligent.

Lisa_Hooker
2 years ago
Reply to  Cabreado

The inmates of the asylum democratically decided to wrest control from the doctors.

KGB
2 years ago

AI doesn’t need to be intelligent. All we ask is it harvest fruit and vegetables, flip burgers and make a sandwich, or restock the shelves at the grocery store. A few other tasks could be corporate accounting, computer programming, sports journalism, TV script writing, and Federal Reserve Board member.

ColoradoAccountant
2 years ago
Reply to  KGB

Great comment!!!

Rinky Stingpiece
2 years ago
Reply to  KGB

so that’s only about half of all jobs done slightly wrong then? 11-fingered AI to the rescue!

Jojo
2 years ago
Reply to  KGB

As long as it’s not managing a nuclear reactor!

John Overington
2 years ago
Reply to  KGB

Not to mention politician.

steve
2 years ago

While I know that AI is not intelligent at all, I am forced to admit that it is a lot more intelligent than many, many, people.

Rinky Stingpiece
2 years ago
Reply to  steve

who is forcing you? AI?

john
2 years ago

I remember it vividly. Fourth grade, 55 YEARS AGO (about): what weighs more, a pound of nails or a pound of feathers? I go, “What a stupid easy question.” I got it wrong.

Jojo
2 years ago
Reply to  john

Yes. Of course it is the nails because nails are heavier than feathers!

John Overington
2 years ago
Reply to  Jojo

Ah yes, but there are far more feathers than nails. A much bigger bag so must weigh more. Gotcha!

Lisa_Hooker
2 years ago

No one said anything about bag sizes, only pounds.

Dr Funkenstein
2 years ago

Google’s AI tool is deliberately designed to teach people that the Founding Fathers were Negroes or American Indians, that the Popes were Asians and gay couples are happier and have more to teach to normal couples. This is how these lying whackos think.

Rinky Stingpiece
2 years ago
Reply to  Dr Funkenstein

but but but, getting the correct answer in maths, is white supremacy, isn’t it?

Jojo
2 years ago
Reply to  Dr Funkenstein

You have to remember that Google’s primary workforce are young, under 30 woke Gen-Z, immigrant Indians (because the bossman is Indian) and they know nothing about history.

John Overington
2 years ago
Reply to  Jojo

All history is artificial and written by the victors in English. Or something.

Maximus Minimus
2 years ago

“Gemini users this week posted screenshots on social media of historically white-dominated scenes with racially diverse characters that they say it generated”

Gemini cannot be blamed for such historic inaccuracies. It probably scanned them from BBC English historic drama productions, courtesy of British taxpayers.
And I mean Shakespeare & Co.

Rinky Stingpiece
2 years ago

very true… this is the same BBC that publishes news in “BBC pidgin” where essentially all it does is replace “them” with “dem” and “the” with “de”, and pretend it’s another language, rather than just broken or bad English with an Ebonics accent.

Doug78
2 years ago

I am looking forward to AI generating and posting comments on this forum. It would save me time and effort, thereby improving my productivity. The downside is that it could, without my knowledge, post something that would get me a visit by the FBI.

MikeC711
2 years ago
Reply to  Doug78

Seems like anything from Google leans left (or goes far left) … so you will be safe

Rinky Stingpiece
2 years ago
Reply to  MikeC711

there is no left, everything is the centre… except those who are far right.

John Overington
2 years ago
Reply to  Doug78

Or, it may write something intelligent. Sorry, I’m in that kind of mood today.

Doug78
2 years ago

Maybe I am an AI and you just insulted me. Check your bank account.

Six000MileYear
2 years ago

1.) Google needs to change its name to GIGO: Garbage In, Garbage Out

2.) My significant other started using an AI writing assistant and is extremely pleased. In 13 requests for 4000 word chapters, nothing has come back that remotely resembles those screenshots above. The worst thing is using a word frequently instead of the next word in the thesaurus.

Rinky Stingpiece
2 years ago
Reply to  Six000MileYear

I logged into Gemini and asked it how to conquer or brainwash the world, it refused to answer.

Jojo
2 years ago

Why would an AI tell you how it would accomplish this?

MPO45v2
2 years ago

I’ve used both Google and OpenAI to generate images and Google’s just sucks. The images it generates are of terrible quality and wrong. I love ChatGPT 4.0 image creator. The thing seems to read my mind when I ask it for an image and creates exactly what I wanted 80% of the time. I’m done with stock photo image websites forever.

OpenAI is way ahead of the curve here but anything can change over time the same way Beta was better than VHS but VHS won in the end until they got replaced by DVD then BluRay then streaming.

The way things are going, I may be able to upload War & Peace and have AI create my own personal custom version movie for me and stream it. Bye Bye Netflix, Hulu, Paramount….etc.

It’s no wonder NVidia was up $100/share today.

Lisa_Hooker
2 years ago
Reply to  MPO45v2

Streaming is the best.
Streaming enables them to monitor what you are watching and when 24 x 7.

babelthuap
2 years ago

Why was the movie ‘White Men Can’t Jump’ funny and entertaining for all but if one were made called ‘Black Men Can’t Swim’ it would be immediately called racist and offensive?

Ask it that and the entire data center would explode.

Rinky Stingpiece
2 years ago
Reply to  babelthuap

The film “white men can’t jump” is actually covertly racist to black people, it’s patronising them saying that all they are good for is sport and entertainment, and being patted on the head, and basically laughed at as a failed group, via the contrived meme of the only thing they are better at white people at is sport. Waiting for the Asian film of “white men can’t do calculus”, for the similar head patting. It’s all giant grift that non-whites understand and either play up for benefits or show disdain for.

Lisa_Hooker
2 years ago

Why aren’t there more black hockey players in the US and Canada?

steve
2 years ago

It is a useless pos and anyone who uses it is one too.
