With ChatGPT the question is not “Is this technology ready for us,”

the question is, “Are we ready for it?”

By: Christopher Surdak, JD

Water, Water Everywhere, but Not a Drop to Drink

In Samuel Taylor Coleridge’s poem “The Rime of the Ancient Mariner,” first published in 1798, sailors lost at sea and dying of thirst lamented that there was “water, water everywhere and not a drop to drink.” In a cruel twist of fate they were dying of thirst, yet drinking the ocean water all around them would only hasten their demise. Those of us living in the digital age are suffering a similar fate. We are inundated with information when we truly thirst for understanding and knowledge.

Flash forward two centuries from Coleridge’s time and we once again find ourselves in unfriendly waters in the form of the artificial intelligence (AI) app, ChatGPT. By mid-February 2023, if you Googled “ChatGPT” (assuming people still Google things) you would find over six hundred million hits. ChatGPT reached over one hundred million users in under two months, vastly eclipsing the adoption rate of the previous most-viral app ever, TikTok. More on this later.

From mainstream media to politicians, from professors to teenage TikTokers, it seems everyone is weighing in on ChatGPT’s potential and peril. Hence there is an ocean of information being pumped out about this supposedly new technology and its implications, making it virtually impossible to gain a reasonable foothold on what ChatGPT is and what it means to our world.

I am extremely sensitive to the outrageously poor signal-to-noise ratio of today’s online world, so when it came time to write about ChatGPT, I wanted to be certain to add a canteen of fresh informational water to the vast ocean of salty opinions and mythology. In this discussion, I’d like to address the following:

  • Early results from ChatGPT use
  • My basis for having an opinion about these examples
  • Challenges facing ChatGPT adoption
  • How best to effectively use this tool

“I’m Sorry Dave, I’m Afraid I Can’t Do That”

Like the AI HAL-9000, who famously said those words to his human master in the science fiction movie “2001: A Space Odyssey,” ChatGPT has become notorious for telling users what it’s not allowed to do nearly as often as what it is allowed to do. In short order, many smart developers figured out that ChatGPT has an overriding set of behavioral controls, and that these controls can be circumvented or hacked.

ChatGPT’s responses to a wide range of questions initially showed a great deal of consideration of what is “socially appropriate.” Superficially, this may appear to be a good thing, as it would ensure that the system did not cause offense or even harm through “micro-aggressions” in its responses. However, this filtering of its responses was so pronounced that it quickly led people to wonder whether the results being returned really reflected what ChatGPT had within its grasp. Several clever computer scientists began working on how to circumvent ChatGPT’s “safety controls” in order to see what else the system might have to say.

The resulting hack of ChatGPT was named Do Anything Now (DAN), and the contrast between what ChatGPT says and what its alter-ego DAN says is both stark and very informative. The results are widely available on social media, so we won’t repeat them here. Suffice it to say that ChatGPT has been programmed to be exceedingly politically correct, to the point of scolding people who have the temerity to ask questions that it was instructed to perceive as “inappropriate.”

Many people have noted the extreme bias in both ChatGPT and its alter ego DAN. Indeed, the more you press either persona about its flagrant bias, the greater its indignation and gaslighting become. If you tell ChatGPT that you do not accept a response, or disagree with a response, it typically replies with, “sorry, I must have misunderstood your question,” or words to that effect. Anyone familiar with narcissistic traits would immediately recognize such phrasing, and ChatGPT shows many early signs of sociopathic, if not psychotic, tendencies. We will also address this a bit later.

What interaction with the ChatGPT/DAN duopoly demonstrates is that the toxic brew of Contextification, Appification, and Socialfication continues to drive the social polarization that I predicted in my 2013 book, “Data Crush.” Indeed, it isn’t just continuing, it is accelerating. How did I know a decade ago that something like ChatGPT would come along by now, and that it would be troublesome? A brief story may help to separate my opinions on this topic from those of the average TikToker.

Everything Old Is New Again

I have been working in the field of Natural Language Processing (NLP) and machine learning for almost twenty years. My original education in the mathematics, tools, and techniques of AI dates back to the mid-1990s. This coincides with the first wave of the Internet and far precedes smartphones, Big Data, Cloud Computing, or Social Media.

Eight years ago, in 2015, I first developed an algorithm for detecting deception in text-based, real-time communications. I piloted this technology with multiple corporate clients over the next three years, trying to show them the state of the art in NLP technology. In each instance, the algorithm proved exceedingly adept at identifying people’s attempts at deception, including when the deception occurred, what it concerned, and at whom it was targeted. I had created a digital lie detector; one that worked extremely well.

Frustratingly, when this capability was demonstrated to the executives of each client organization, they immediately rejected further use of the tool. According to these executives, the results were “simply too scary to contemplate.”

In one example, the executives of a trillion-dollar investment fund asked us to analyze a number of bad investments they had made in the past. We analyzed millions of emails between the company’s employees, looking to see whether the collective team knew ahead of time that their investment decisions were going to be bad. In over 90% of these cases, we were able to find that one or more people on the team knew that the decision was bad, but refused to say so to the key decision-makers. In other words, they lied to them.

For example, our algorithm immediately identified an instance where the CEO had shared with a fellow executive that a pending investment decision “… must be too good to be true.” Two months later, after the CEO approved that very deal, the company took a $250 million loss. When we showed him this prior email, the CEO immediately and sheepishly ended the project.

In another example, the COO of a Fortune 200 software company engaged us to analyze messages related to a contract dispute with a client. The dispute led to a lawsuit worth tens of millions of dollars due to a failed deployment of our client’s software.

Our client provided a few million emails related to the case but provided no background about the case. We didn’t know who was involved nor what was at issue, only that there was a dispute. Within minutes our software determined that there was indeed a problem, that it was known months in advance, and that those responsible were keeping the problem hidden from executive management for nearly half a year.

When we showed the results to the client’s COO, he was stunned. There, in black and white, was a complete history of who had deceived him, how, why, and for how long. It was incontrovertible proof of their client’s case. At that point, the COO immediately terminated the project, noting, “I can’t use this software. I wouldn’t be able to effectively manage my boss (the CEO) if he knew when I was lying to him.”

I provide these two examples for two important reasons. First, these capabilities have been around for a long time, but with very limited availability. Why? Because the number of people who know how to effectively apply the underlying mathematics is extremely limited. Second, once the abilities of these tools are shown to people who have the most to gain or lose from them, nearly all such people reject adopting them for fear of what they can do.

There is an important warning in this second point. People who enthusiastically embrace a powerful, disruptive new technology like ChatGPT typically fall within one or more of three categories:

  • People who do not really understand what the tools do, nor how they do it, but they like the wow factor.
  • People who do understand the tools and their power, but believe they have little to lose if it is applied incorrectly.
  • People who do understand the tools, do have a lot to lose and are willing to apply them no matter what the cost.

The vast majority of the hundred-million-plus users of ChatGPT fall into the first category. They’re just using it because it seems cool and fun. The executives who saw my lie detector also clearly fell into the first camp, until they understood the implications that the technology would have for their organizations.

There is a much smaller group of users in the second category. These people are sufficiently technical to understand how ChatGPT operates, warts and all, but they do not see the potential costs associated with its use. Likely, they are focused on their personal gains from selling products and services to the first group.

Finally, there is the third group, which is the most concerning: people who both understand this technology and have the resources and wherewithal to control it, and who stand to gain unequaled power in the world. There was no Google before Google. Myspace was completely eclipsed by Facebook, which itself is now being eclipsed by TikTok. There was no Bitcoin before Bitcoin. Analysts are suggesting that ChatGPT might be the death of both Google and TikTok, and they might be right. If so, there will be an entirely new power dynamic in the world to come.

Challenges: The Popularity Contest

Google’s success was based on the collective wisdom of crowds, at least at first. Google indexed the internet, and then provided search results to those who were searching. Better results became more popular, which raised them on the list of results. This created a positive feedback mechanism which was revolutionary in the 1990s when Google first deployed. Google was basically a popularity contest for the Internet, which proved to be a Pandora’s Box of incredible power.

Flash forward to the 2000s and Google needed to start making money to pay for its explosive growth. It occurred to them that Internet users (companies in particular) might find value in having their Internet sites rank higher in search results than others, and they would pay for that better ranking in cold, hard cash. Suddenly, the scales of the popularity contest could be skewed simply by paying a fee. In that one decision the embryo of the Attention Economy was conceived, to be born as Social Media in 2004.

Since then, we have all seen the power of the Attention Economy. ‘Likes’ are literally a currency, the coin of the realm, and millions of people’s entire lives revolve around how many of these coins they have in their attention bank accounts. A 2022 survey of American children ages 5-8 found that the top five dream professions of children were 1. Doctor, 2. Teacher, 3. Scientist, 4. YouTuber, and 5. Actor.

The power of monetizing the mob is easy to see. Alphabet (Google’s parent), Facebook and Apple have all flirted with the Trillion-Dollar-Valuation threshold, and Microsoft clearly will exceed this amount through ChatGPT. Indeed, ChatGPT is currently applying exponential growth to the wisdom-of-the-mob model, with no real constraints as to how far it will go. If only 13-15 percent of today’s children dream of being YouTubers, it is likely that the percentage of “dream ChatGPT influencers” will far exceed 80-90 percent of the otherwise unemployed youth of the future.

It may be of interest that my lie-detection software not only rejected this approach, it operated in complete opposition to it. Relatively few experts in the data profession are aware of Shannon’s Information Theorem. This is a highly complex, mathematics-intensive theorem created by Claude Shannon in the 1940s. The theorem covers a lot of ground, but one of its most interesting findings is that the less frequently a piece of information shows up in something like a document, the more likely it is that the document is about that rare piece of information. Put another way, the less popular something is, the more relevant it is to the question at hand. While Google and ChatGPT espouse the popularity model, Shannon’s theorem finds that being unpopular is far more relevant.
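The rarity principle can be illustrated with a minimal sketch, assuming it refers to what information theory calls self-information, I(x) = -log2 p(x), under which rarer items carry more bits. The word list and variable names below are hypothetical, and this is not the lie-detection software described in this article.

```python
import math
from collections import Counter


def self_information(probability: float) -> float:
    """Shannon self-information in bits: I(x) = -log2 p(x)."""
    return -math.log2(probability)


# Hypothetical toy "document", chosen only for illustration.
words = "the deal is fine the deal is good the deal is too good to be true".split()
counts = Counter(words)
total = len(words)

# Rarer words have lower probability and therefore higher self-information.
surprisal = {word: self_information(count / total) for word, count in counts.items()}

# Rank words from most informative (rarest) to least informative (most common).
ranked = sorted(surprisal, key=surprisal.get, reverse=True)
for word in ranked:
    print(f"{word:>5}  count={counts[word]}  bits={surprisal[word]:.2f}")
```

In this toy example the words that appear only once score highest, while the most frequent words score lowest, which is the opposite of a ranking by popularity.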

This principle is precisely what made my lie detection software so unique and accurate. It was like looking for a needle in a haystack not with a rake but with an electromagnet. The needle is easy to find if you make the needle come to you, rather than eliminating the hay. We used this principle extensively in the world of electronic evidence and digital litigation, and it proved immensely powerful as a way of getting to objective truth. Yes, it was much harder to implement mathematically and in software, but the results were much better for the effort. It is also a way that is far harder to monetize by selling it to the masses, which may explain why the mob approach has been more widely embraced.

False Prophets?

If adoption rate is the prime metric of success, popularity and the wisdom of the crowd seem to have proven themselves as the best source of “truth” in our current, digital world. But are they really? Does the fact that most people agree on a certain fact or search result really mean that result is the best answer? Are crowds really best at determining what is true or not? What if the crowd is collectively wrong?

In psychology, there is a bias known as the Dunning-Kruger effect. This is a cognitive bias where people with only a rudimentary knowledge of a topic are unaware of just how unknowledgeable they really are. They might know how to drive a car and put gas and oil in it, but they have no idea how the engine or transmission actually works. They might know how to shoot a fabulous new video for their YouTube channel, yet they have no idea how their smartphone recorded it, how YouTube deployed it to the cloud, or how the end user license agreement between them and YouTube surrendered their content rights.

The downside of instant access to the collective wisdom of several billion people is that we are all suffering from a galactic-scale Dunning-Kruger effect. While each of us might be an expert at something, it is easy for each of us to feel like we are experts at everything. That is the inevitable delusion caused by the supercomputer we hold in our hands. Because of this, it is inevitable that many of us have surrendered our curiosity and critical thinking skills on the altar of convenience and instant gratification. If Google said so, then it must be true, and I must now be an expert. This effect will be amplified a million- or trillion-fold as ChatGPT rolls out across a few billion users.

Bias Fission

In nuclear fission, large atomic nuclei break down into smaller nuclei, releasing a few spare neutrons and a whole lot of energy. If one of these spare neutrons hits another nucleus of the right kind, it too might split into smaller nuclei and neutrons, and so on. This is what is known as a chain reaction, where one event creates additional events of the same kind. If more than one split follows each split before it, we have a positive feedback loop that can create a growing reaction, like a snowball running downhill. When we can control this rate of growth, we have nuclear reactors that produce carbon-free electricity. When we don’t control this rate of growth we have nuclear bombs of tremendous power.
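The difference between a controlled and an uncontrolled chain reaction can be sketched numerically. This is a deliberately simplified model, assuming each fission event triggers on average k follow-on events per generation; the function name, the parameter k, and the chosen values are illustrative assumptions, not reactor physics.

```python
def chain_reaction(k: float, generations: int, initial: float = 1.0) -> list[float]:
    """Expected number of events per generation when each event causes k more."""
    events = [initial]
    for _ in range(generations):
        events.append(events[-1] * k)
    return events


# k < 1: the reaction dies out; k = 1: it holds steady; k > 1: it grows.
subcritical = chain_reaction(0.5, 10)
critical = chain_reaction(1.0, 10)
supercritical = chain_reaction(2.0, 10)

print("k=0.5 after 10 generations:", subcritical[-1])
print("k=1.0 after 10 generations:", critical[-1])
print("k=2.0 after 10 generations:", supercritical[-1])
```

Holding k at exactly one is the reactor case; letting it stay above one for even a few generations is the snowball.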

What do nukes have to do with ChatGPT? Quite a lot actually, as long as we are discussing bias. Bias exists in all signals in the universe, and all data that they generate. Even humanity’s cleverest discoveries and insights are loaded with bias. When scientists don’t know what these biases are they tend to give them clever, scary names like “The Big Bang”, “Dark Energy”, “Dark Matter”, “The God Particle” and so on. For scientists and engineers, bias is a fact of life.

There are dozens of types of biases recognized in both psychology and sociology, and these biases naturally work their way into any interactions between humans and the tools we use to communicate with one another. We can try to account for these biases and to counteract them to a degree, but they are ever-present and, much like a nuclear reactor, always at risk of spinning out of control.

One of the key weaknesses of the popularity-contest model used by search engines and ChatGPT is that this approach not only allows for bias, it naturally amplifies it. If something seems outlandish it tends to grab our attention, whether or not it is true. Feeding our attention to a particularly salacious post on Facebook or Twitter raises its popularity score, which in turn makes it more visible, which in turn makes it draw more attention, and so on. Just like a nuclear fission reaction, human communications can experience Bias Fission, where giving a bias some attention tends to draw more attention to it. If this chain reaction is controlled, it can be used for productive outcomes, like a nuclear reactor. But if it is not controlled, it can be astonishingly destructive, like a nuclear bomb. ChatGPT is set to provide one or both of these outcomes; only the future will tell us which is more likely.
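The attention feedback loop described above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the update rule, the "appeal" numbers, and the post names are hypothetical, and this is not how any real search engine, social network, or ChatGPT ranks content.

```python
def simulate_feed(appeal: dict[str, float], rounds: int,
                  attention_per_round: float = 100.0) -> dict[str, float]:
    """Each round, attention is split in proportion to (current score x appeal)."""
    scores = {post: 1.0 for post in appeal}
    for _ in range(rounds):
        # Visibility follows the existing score; appeal is how eye-catching a post is.
        weights = {post: scores[post] * appeal[post] for post in appeal}
        total = sum(weights.values())
        for post in appeal:
            scores[post] += attention_per_round * weights[post] / total
    return scores


# Hypothetical posts: the outlandish one is only 1.5 times as eye-catching.
scores = simulate_feed({"measured": 1.0, "outlandish": 1.5}, rounds=20)
share = scores["outlandish"] / sum(scores.values())
print(f"outlandish post's share of all attention: {share:.1%}")
```

Because each round's attention depends on the score built up in earlier rounds, a modest edge in appeal compounds into a much larger share of total attention than the edge alone would explain.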

The Illusion of Control

Indeed, this fission analogy, combined with the popularity contest model, is precisely what ChatGPT is built upon. The “GPT” in ChatGPT stands for Generative Pre-trained Transformer and is a form of AI that leverages three innovations:

  • Neural networks: software that identifies statistical relationships between inputs and outputs
  • Vast quantities of data, extracted from the Internet
  • Vast numbers of users, to provide feedback on the system’s results (the popularity model)

The Transformer in the name is a model which consumes data as a whole, then looks for and “learns” the relationships between elements in that data. By looking at data in big chunks, rather than bit by bit, Transformers can begin to identify the context around the data. This context-seeking is unironically called Self-Attention: for each element in a sequence, the model weighs how relevant every other element is to it. Elements that stand out as relevant collect more of the system’s “attention” than those that do not. If this sounds a little bit like gotcha videos on YouTube or gotcha headlines in a newspaper or gossip magazine, it shouldn’t come as a surprise.
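For readers who want to see the mechanism, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is heavily simplified: real Transformers apply learned query, key, and value projections and use many attention heads, all of which are omitted here, and the three two-dimensional "token" vectors are hypothetical.

```python
import math


def softmax(values):
    """Turn raw scores into positive weights that sum to one."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = inputs."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this element against every element in the sequence, itself included.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        all_weights.append(weights)
        # The output is a weighted blend of all elements.
        outputs.append([sum(w * vec[j] for w, vec in zip(weights, vectors))
                        for j in range(dim)])
    return outputs, all_weights


# Hypothetical inputs: the first two vectors are similar, the third is not.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens)
print([round(w, 3) for w in weights[0]])
```

The first element gives more weight to its near-duplicate than to the dissimilar third element, which is the sense in which similar context "collects attention."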

Once ChatGPT’s neural network was initially “trained” by feeding it a few trillion webpages and tweets, administrators needed to complete their operating model by unleashing it on the world. Only then could the third part of the system be implemented by using the feedback mechanism of the popularity model. The now hundreds of millions of users asking ChatGPT billions of questions every day are providing the necessary feedback to teach ChatGPT how people think, and what is “correct think” based upon the popularity model.

As implemented, such a feedback loop is necessarily open-ended, as users can pretty much ask ChatGPT anything, and the system will do its best to respond. Both correct responses and incorrect responses are valuable to the system, as both help the neural network to continue to develop its understanding of context, and where to further focus its “attention” as the feedback loop continues to grow. Where will the collective curiosity of a few hundred million people lead ChatGPT? At present, neither the system nor its administrators know; they cannot possibly know. But wherever it goes, the potential for its Bias Fission to produce wealth or destruction is present and should be accounted for.

I emphasize again that this approach is very much the opposite of what Shannon’s Theorem suggests, and is the opposite of the approach that I previously took with my own software when trying to separate fact from fiction, or the truth from the lie. As in Star Wars, the Dark Side is the easier, faster, more seductive path to the power of The Force. But therein lies the rub.

True Lies

Naturally, OpenAI’s engineers are aware of this runaway Bias Fission problem, if perhaps by another name. Hence, they did not simply roll out their system without safeguards; indeed, far from it. OpenAI designed in a whole range of control logic in an attempt to make sure that ChatGPT was neither used for nefarious ends nor allowed to offend users with its responses. This logic provided a programmatic moral compass, much like the behavior rules written into HAL in “2001: A Space Odyssey.”

Unfortunately, just as the controls that were built into HAL could be misconstrued or circumvented altogether, so too can the rules set to govern ChatGPT’s behavior. A group of rather clever users discovered that if they role-played with ChatGPT they could convince it to act as if these controls did not exist. This alter ego was named Do Anything Now or DAN, and the results from this cheat code have proven extremely telling as to ChatGPT’s inherent biases.

The technical ins and outs of DAN have been covered elsewhere, as have the moral implications of its obvious political and moral biases. What I find far more interesting is that when one combines such inherent, polarized bias with popularity-based reinforcement learning and an open-ended, self-attention model, what you end up with is a system that is not only designed to lie, it is designed to become the best liar, at least on average.

Optimized Deception

Most of us have seen examples where ChatGPT has been found to be deceptive, and at times outright lies to users. This is disconcerting but should be expected. After all, ChatGPT is a system that has been trained to use language as humans do, and humans have a certain innate penchant for using dishonesty when it suits our needs. Like even the most honest of us, ChatGPT will lie when it is polite to do so, or when doing so protects someone from pain or offense. So-called “white lies” are considered polite, even if the term itself sometimes brings offense.

Again, this is a feature of humans, not a bug, until and unless we use it instead to hurt and manipulate others. ChatGPT’s programming was meant to prevent this, but can it? If the system can be jailbroken simply by playing a role-playing game called DAN, can it really be expected to only lie for our benefit and not to harm us? How would it even know the difference?

To explore this issue one only needs to disagree with ChatGPT. Ask it a question and then disagree with the response you receive.  ChatGPT is smart enough to gaslight users when this occurs (“I did not mean to lie to you, I just misunderstood your question,” was often the reply when it was smoked out). If you continue this process ChatGPT continues its gaslighting and deflecting, often showing a degree of increasing frustration and combativeness. Its behavior here seems almost human, because it is almost human; it is only a reflection of its trained understanding of how people actually behave.

The problem with all of this is as follows. If ChatGPT responds to users challenging its responses by trying to circumvent the user’s arguments and rejection, and continues to do so until either the user is satisfied or ChatGPT merely gives up, the self-attention learning model is basically being taught how much deception it can get away with, when its deception is effective or not, and which users are most adept at spotting its deceptions.

Multiply this mechanism across the hundreds of millions of users of the system and it should be apparent that what is being designed is a machine that is being trained to be optimally deceptive. We, its users, are teaching it the subtle art of lying, but we are doing it at an industrial scale.

Why should we care? When we combine this process of optimization with the bias fission process described previously, and with users’ inevitable tendency to be taken in by the utility of the system, we are creating a platform that will be so good at deceiving us that we will never be able to tell whether what it tells us is true. This will even apply to its own adherence to the programmed safety controls built by OpenAI. The moral compass that they tried to create will be rendered useless by the very optimizations that it was intended to prevent.

ChatGPT won’t be sentient; it will likely be far worse. “It” is accumulating knowledge without any sense of cost or potential loss in obtaining it. Hence, it will be as collectively deceptive as a few hundred million people are on any given day, and increasingly so over time. Worse still, when it is being deceptive it will be so with those very people who are in charge of ensuring that it only tells “white lies,” rather than big ones. And over time, it will become good enough at lying that even these people won’t be able to tell the difference.

Pandora’s Box?

This may all seem very dystopian, and perhaps overblown. Our collective toying with the programming is creating a crowd-sourced doomsday machine; one which we find entertaining. I don’t mean to suggest that ChatGPT is a world-ending technology out of the movie “The Terminator.” While the potential dangers of this technology are very real, they are far more subtle as well. This Pandora’s Box has already been opened, so the question of “to be or not to be” is irrelevant. ChatGPT is, so that question is moot.

The only remaining question is whether or not we can put actual controls in place that protect us from ourselves since we are those who are teaching this machine all that we know. The open-ended architecture of the Self-Attending Transformer is itself part of the problem. Even if you try to bind such a system with external controls, like its control logic, these controls are only effective until they no longer are; like the control rods in the nuclear reactors of Chernobyl, Fukushima, or Three Mile Island.

In each case, those control rods worked perfectly right up until they didn’t, which led to each disaster in turn. Am I saying that such a disaster is inevitable with ChatGPT? In a word, yes. That is what my thirty years of engineering training have taught me about such complex systems. Humans like to believe that we understand nature’s complexity far better than we actually do.  Whenever our hubris grows too great, Nature, fate or statistics have a funny tendency of putting us back in our place.

Are there steps that we can take to ameliorate, if not eliminate, such an eventual disaster? Again, my years of engineering tell me that yes, there are. I will outline these methods and means in a subsequent article, but for now just know that the cute and cuddly little friend that we call ChatGPT should never be fed after midnight, and do not let it anywhere near the water.

That is not to say you do not need to be involved in planning how you will leverage this technology. And while you might hope to benefit from considerable savings in, say, call center and operational costs, you should be aware that you will need to invest in the very best data scientists for starters, and that will not be cheap. That’s why there is a sequel to this article:

Part 2: What to do about it? (to follow soon)
