The power and pitfalls of large language models
The domain of artificial intelligence has been synonymous with exponential growth for the past 10 years. For an innovation to stand out in this field, where disruption is business-as-usual, it must be nothing short of spectacular. And spectacular is exactly how we can describe the recent progress in language understanding & generation through innovations such as ChatGPT and GPT-4.
All this is impressive, but it raises a simple question: what is in it for you?
In what follows, we will aim to answer exactly this question by discussing the disruptive potential of these technologies. We will also explore the limitations of these models and focus on how we can overcome these challenges. Because after all:
“The pessimist sees difficulty in every opportunity. The optimist sees opportunity in every difficulty”
ChatGPT and GPT-4 have taken the world of natural language processing by storm. Enthusiasts see the dawn of human-level AI, researchers see nothing special and skeptics see a reason to start stockpiling toilet paper.
Before we explore how we can leverage these breakthroughs to gain a strategic edge, let’s consider why we are seeing this progress in the first place.
It may come as a surprise to many that the working principle behind ChatGPT and GPT-4, called the “transformer”, has been around since 2017. As the “4” in GPT-4 suggests, what we are seeing now is not so much a radical shift in how to generate text as it is the culmination of six years’ worth of incremental improvements to the same method.
In the words of Steve Jobs:
“If you look really closely, most overnight successes took a long time”
That said, the question remains: why now?
The answer to that question is threefold:
👉 Bigger and better
👉 Incorporating human feedback with RLHF
👉 People can’t buy what they don’t know
Bigger and better
A simple truth about understanding natural language is that you have to account for a vast number of subtle nuances. If GPT-4’s rumored parameter count is to be believed, we are talking about 100 trillion of these nuances.
The bigger a model, the more capacity it has to understand finer and finer nuances. In turn, ever-growing computational resources and the data available on the internet allow us to put this capacity to use.
It is an open secret in AI that the same model, made bigger, will inevitably be better. ChatGPT and especially GPT-4 are much bigger than their predecessors, which has significantly boosted their performance.
This trend has its limits though. Adding more capacity will fail to add value once we have reached enough parameters to capture even the finest nuances of human language. Needless to say, this limit is not yet in sight.
Incorporating human feedback with RLHF
The biggest difference between ChatGPT & GPT-4 and their predecessors is that they incorporate human feedback. The method used for this is Reinforcement Learning from Human Feedback (RLHF).
It is essentially a cycle of continuous improvement: the system generates a text; the user gives implicit or explicit feedback about what it could do better (e.g., give a more detailed response); the system uses this information to improve.
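To make the cycle concrete, here is a toy sketch of that feedback loop. It is purely illustrative — the candidate responses, scores, and update rule are invented for this example, and the real RLHF pipeline trains a reward model and fine-tunes the language model with reinforcement learning rather than scoring fixed responses.

```python
# Toy illustration of the RLHF feedback cycle (not the actual pipeline):
# the system proposes a response, a user scores it, and the scores steer
# future choices toward preferred outputs.

# Hypothetical candidate responses the "model" can produce for one prompt.
CANDIDATES = [
    "Short answer.",
    "A more detailed answer with reasoning and examples.",
    "An overly long answer that rambles.",
]

def generate(scores):
    """Pick the candidate with the highest learned preference score."""
    return max(CANDIDATES, key=lambda c: scores[c])

def update(scores, response, feedback, lr=0.5):
    """Nudge the score of a response up or down based on user feedback (+1 / -1)."""
    scores[response] += lr * feedback
    return scores

# Simulate a user who prefers detailed (but not rambling) responses.
scores = {c: 0.0 for c in CANDIDATES}
for _ in range(5):
    response = generate(scores)
    feedback = 1 if "detailed" in response and "rambles" not in response else -1
    scores = update(scores, response, feedback)

print(generate(scores))  # the detailed answer wins after a few rounds of feedback
```

The key point the sketch captures is that the improvement signal comes from users, not from more internet text — which is why a massive user base matters so much.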
The idea of RLHF has also been around since 2017. 2017 was a great year for 2023 breakthroughs. It has not been widely used until recently because collecting human feedback was always seen as a bottleneck in the era of big data. After all, a computer can churn through significant proportions of the internet within a matter of days whereas we humans get distracted after reading half of one article.
The only way to really leverage RLHF at scale would be if ChatGPT and GPT-4 had a massive user base providing constant feedback.
It looks like that is where you and I come in.
People can’t buy what they don’t know
While the rise of ChatGPT may have seemed to come out of left field for many, the NLP community has made tremendous breakthroughs over the past years, many of which had the potential to garner a similar level of interest from the general public.
Until now, however, these breakthroughs were never able to create real awareness among the wider public. In the words of Warren Buffett: “if you can’t communicate, it’s like winking at a girl in the dark — nothing happens”.
What OpenAI has done masterfully is (i) packaging its technology in an accessible & intuitive application and (ii) spreading mass awareness about what they are doing.
They did this partly to increase the adoption of AI in impactful use cases and position themselves as the frontrunner in the process, and partly to keep the RLHF flywheel spinning and make their products better and better.
The power of AI
Where AI models used to be targeted problem solvers, we are now seeing broad & versatile AI systems that have abilities ranging from creative writing to computer programming. We are moving from many individual tools in our AI toolbelt to a few AI Swiss Army knives.
This new paradigm unlocks a plethora of possibilities to gain a competitive advantage. Even though the release of ChatGPT and GPT-4 is barely in the rearview mirror, we can already see early adopters embracing these technologies and leveraging their disruptive potential to position themselves at the forefront of their respective industries.
Some concrete examples include:
Customer support: Intercom leverages ChatGPT as a writing aid for support operators to accelerate and improve the customer support process (link). Soon, it will launch its customer support chatbot Fin (link) built on GPT-4 which will allow them to provide a fast & qualitative solution to a major problem faced by many businesses.
Education: Duolingo aims to revolutionize the language learning process by offering its new Duolingo Max application (link). It leverages GPT-4’s (multi-)language capabilities to offer a highly personalized yet scalable learning platform which is at the core of their strategy.
Technology: Whimsical integrated the creative potential of large language models in their AI mind maps (link). They provide a creative and collaborative experience to their users — significantly facilitating and speeding up the brainstorming process.
Looking beyond current use cases, we foresee ChatGPT and GPT-4 making a big impact on knowledge discoverability.
We expect companies and governments to create a question-answering layer on top of their internal data to allow for the intuitive retrieval of knowledge present in the organization.
In particular, we foresee an especially major impact in knowledge-driven sectors such as legal, media and government.
To illustrate with some examples, this can:
Allow legal firms to use the knowledge accumulated in past rulings & cases to make the most informed decisions going forward.
Allow media companies to enable their users to derive the concrete insights they are looking for from their media coverage, thus creating a more engaging and personalized user experience.
Allow governments to turn their legislation from archaic archived documents into an interactive knowledge base that the wider public can query to remain informed.
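A question-answering layer like this typically works in two steps: retrieve the most relevant internal documents, then let a large language model answer the question from them. The sketch below shows only the retrieval step, using naive keyword overlap; the document snippets are invented, and a production system would use embedding-based search and then insert the retrieved passages into a prompt for a model like GPT-4.

```python
# Minimal sketch of the retrieval step in a QA layer over internal documents.
# Documents and scoring are purely illustrative (keyword overlap, not embeddings).

DOCUMENTS = {
    "ruling-2021-04": "The court ruled that the contract clause was unenforceable.",
    "policy-handbook": "Employees may work remotely up to three days per week.",
    "press-archive": "The company announced record revenue in the last quarter.",
}

def tokenize(text):
    """Lowercase and split, stripping trailing punctuation from each word."""
    return {w.strip(".,?") for w in text.lower().split()}

def retrieve(question, documents):
    """Return the id of the document sharing the most words with the question."""
    q_words = tokenize(question)
    return max(documents, key=lambda doc_id: len(q_words & tokenize(documents[doc_id])))

doc_id = retrieve("How many days per week can employees work remotely?", DOCUMENTS)
print(doc_id)  # -> policy-handbook
```

In practice, the retrieved passage would then be combined with the user’s question into a prompt, so the model answers from the organization’s own knowledge rather than from its training data alone.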
Is the sky the limit?
In order to fully understand the limitations of technologies like ChatGPT, it is important to understand their high-level working principle.
Winston Churchill once said: “out of intense complexities, intense simplicities emerge”. In the case of ChatGPT and GPT-4, we argue quite the opposite: “out of intense simplicities, intense complexities emerge”.
It will come as a surprise to many that virtually any modern text generation system works under a very simple premise: it predicts the next word in a sequence. Nothing more, nothing less.
These systems chain words together one after another, aiming to construct the most statistically likely sequence of words given the original prompt.
At ML6, we like to visualize this process as a random walk of words.
This implies that technologies like ChatGPT never truly reason about the message they want to convey. This lack of explicit reasoning leads to limitations in terms of reliability, controllability and ethics.
Significant effort is being put into building guardrails around what these systems can produce in order to circumvent this fundamental issue. Another promising avenue is linking generated information to sources in order to allow for convenient fact-checking.
Nevertheless, it is important to keep in mind that such remedies, while quite effective, are still symptomatic solutions that aim to patch a fundamental limitation.
In the long term, we believe that we are still one paradigm shift away from having fundamentally trustworthy and reliable large language models; a paradigm where language is seen as more than a statistically likely sequence of words and text is generated starting from a preconceived idea.
Furthermore, the current dependence on OpenAI (and Microsoft) should not be underestimated. While we believe that open-source alternatives to OpenAI’s services will inevitably appear, we should be mindful of the new barriers to entry in the era of large language models. For one, open-source initiatives will be confronted with the need for more computational resources in order to train models with a comparable size to ChatGPT and GPT-4. In addition, the lack of access to the human feedback OpenAI has amassed over time will pose a challenge to approaching comparable performance.
What’s in it for you?
In conclusion, technologies like ChatGPT and GPT-4 are here to stay.
There will be businesses that perceive this as a threat to fight against; they will inevitably get stuck in the status quo.
There will be businesses that perceive this as an opportunity; they will gain a strategic edge over their competitors. The first of these are already showing themselves.
We believe that the companies that:
(i) Recognize both the power & limitations of large language models and (ii) Embed these technologies into their core business
will succeed in establishing themselves as the frontrunners in the ever-evolving digital landscape.
We will leave you with one last quote (we promise), from Disney’s CEO Bob Iger:
“The riskiest thing we can do is just maintain the status quo”