The Impossible Artificial Intelligence Challenge
Why Google's Gemini AI and others like it have struggled
Generative artificial intelligence lacks the common sense you and I have.
Which is why Google’s Gemini model recently produced images of an African American George Washington. And why it created ethnic minority Nazis in 1940s Germany, along with a litany of other odd and concerning image generations.
You and I know that George Washington was white.
In fact, we even know exceptions to universal rules that we have rarely or never encountered.
As this AI researcher explained using the example “birds can fly”, artificial intelligence struggles with common sense.
Yes, birds can generally fly, but not all of them can. Penguins, for example, cannot. Roosters prefer to stay on the ground. And even a seagull stuck in an oil spill cannot fly very easily.
Imagine trying to train a generative AI model on all possible exceptions to the general rule that birds can fly.
As humans, we can reason through various scenarios and exceptions even if we’ve never directly encountered a seagull stuck in an oil spill. It’s a remarkable skill if you think about it. Artificial intelligence models struggle here because, unless they’re fed a comprehensive data set that includes every possibility and nuance, their output will be incomplete and likely inaccurate.
It will also be missing that very human concept that’s difficult to define: common sense.
Generative AI tools like Gemini and ChatGPT should therefore offer a range of possibilities. They should acknowledge that universal truths are rare, but certain historical facts are undisputed (until evidence suggests otherwise).
So while responses from these tools to queries like “Can all birds fly?” should be appropriately nuanced, answers to questions of historical fact like “Was George Washington white?” should be narrowly defined.
This is common sense for most of us, but hardly common or easy for artificial intelligence.
It shouldn’t surprise anyone that even Google - one of the best-resourced companies in the world - failed to capture common sense accurately. They admitted as much themselves:
“It’s clear that this feature missed the mark. Some of the images generated are inaccurate or even offensive. We’re grateful for users’ feedback and are sorry the feature didn't work well.”
As for their trouble tuning for nuanced queries versus narrowly defined ones (including historical facts):
“[O]ur tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range.”
History. The laws of physics.
Certain queries should never show a diverse range of possibilities unless someone can prove otherwise - say, with evidence that the speed of light is not 299,792,458 meters per second.
It’s an almost impossible artificial intelligence challenge: tuning for diverse possibilities in some cases, but for none in others.
Google simply overcorrected and applied too much diversity in their tuning across the board.
As much as the recent Gemini images and results raised eyebrows, they should not have shocked anyone. And I highly doubt they’re the product of anyone with a political agenda. If anything, Google was overly cautious, tuning its LLMs to present a broader array of possibilities than reality supports.
And it’s not like other generative AI tools like ChatGPT don’t make mistakes. ChatGPT hallucinates regularly, as many lawyers have learned the hard way (don’t cancel your legal research tools yet!).
The core reason I suspect this happens - and will keep happening - is that these LLMs do not actually understand what they’re saying. They are presenting probabilities.
What is the next word most likely to appear in this sequence?
That’s the question LLMs consistently ask themselves when responding to a query. It’s not that they grasp the basics of bird wing aerodynamics. They just know that the word most likely to follow the sequence “birds can ___” is “fly.” So they spit that out.
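To make that concrete, here’s a minimal sketch of next-word prediction in Python. The phrases and probabilities in the table are invented purely for illustration - real LLMs learn distributions over tens of thousands of tokens from massive training data - but the mechanics are the same: pick the next word according to its probability, with no understanding of aerodynamics involved.

```python
import random

# Toy next-word probability table. These phrases and numbers are invented
# for illustration only; a real LLM learns distributions over a huge
# vocabulary from its training data.
NEXT_WORD_PROBS = {
    "birds can": {"fly": 0.90, "sing": 0.06, "swim": 0.04},
    "penguins can": {"swim": 0.85, "waddle": 0.10, "fly": 0.05},
}

def predict_next_word(context: str) -> str:
    """Sample the next word in proportion to its estimated probability."""
    probs = NEXT_WORD_PROBS[context]
    words = list(probs.keys())
    weights = list(probs.values())
    return random.choices(words, weights=weights)[0]

print(predict_next_word("birds can"))     # almost always "fly"
print(predict_next_word("penguins can"))  # usually "swim"
```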
More people need to appreciate the impossible task that generative AI systems are trying to perform. There is a constant battle between nuance and certainty, one that continually evolves as new information arrives.
These models will continue to make mistakes as their tuning is continuously tweaked and refined. So puff pieces from the likes of Business Insider should not outrage us with the fact that Google “is trying to influence the way its AI produces results,” as if it were some terrible crime.
Of course Google is trying to influence the way its artificial intelligence produces results. OpenAI does the same with ChatGPT.
All the more reason why we need guardrails around artificial intelligence to reasonably ensure safety and soundness. Not to mention best efforts at communicating truth, fact, and nuance.
But that is for another article.
For more of my writing and content, follow me here:
Website: https://polispandit.com
Medium: https://medium.com/@johnny-p