Why ChatGPT will only become stronger with GPT-4


Hardly a day goes by without a mention of ChatGPT, the wunderkind artificial intelligence (AI)-powered chatbot from OpenAI that has taken the online world by storm. Microsoft, for instance, announced on 16 January that it would add ChatGPT to its Azure cloud services in the near future. Microsoft chairman and CEO Satya Nadella tweeted, “ChatGPT is coming soon to the Azure OpenAI Service, which is now generally available, as we help customers apply the world’s most advanced AI models to their own business imperatives.”

ChatGPT is powered by Microsoft Azure, and Microsoft is reportedly in talks to invest $10 billion in OpenAI, according to a report by online media outlet Semafor, even as it is mulling the integration of ChatGPT with its search engine Bing. Microsoft, incidentally, already invested $1 billion in OpenAI in 2019.

ChatGPT was released to the public on 30 November for testing and feedback, after which netizens were overwhelmed with this smart chatbot’s prowess at engaging with them while answering questions, writing code, poems, and essays, among other things. It is a GPT-3.5 series model. To be sure, even the third iteration of Generative Pre-trained Transformer (GPT-3) with 175 billion parameters impressed many with its potential to write human-like poems, articles, books, tweets, resumes, and even code.

Limitations: GPT-3 is trained to predict the next word on a large dataset of internet text, but it can also generate untruthful and toxic comments, spread misinformation and spam, and write fraudulent academic essays. OpenAI, co-founded by Tesla, SpaceX and Twitter owner Elon Musk (who is no longer associated with OpenAI), is attempting to address these limitations with ChatGPT by using Reinforcement Learning from Human Feedback (RLHF) to make it “more truthful and less toxic” with the help of human supervisors.

ChatGPT does have limitations. OpenAI points out that the models may not have knowledge of current events since the default models were trained on data till the end of 2021. OpenAI also acknowledges that ChatGPT “sometimes writes plausible-sounding but incorrect or nonsensical answers”.

Regardless, content creators and voice actors have their work cut out with intelligent software mimicking their writings, art, voice, and emotions.

Consider these developments. OpenAI’s DALL-E can generate realistic art and images from plain text prompts, ChatGPT can write poems, articles, books and even code, and Microsoft’s text-to-speech AI model, VALL-E, can simulate a person’s voice from just a 3-second recording.

Initial results show that VALL-E can also preserve the speaker’s emotional tone. According to the paper’s authors, VALL-E was pre-trained on 60,000 hours of English speech data, which the paper claims is “hundreds of times larger than existing systems”.

OpenAI’s WebGPT prototype uses a text-based browser to submit search queries, follow links, scroll web pages, and also cite sources.

How they work: Large language models (LLMs) like GPT-3 and chatbots like ChatGPT are trained on billions of words from the internet, books, and other sources, including Common Crawl and Wikipedia, which makes them more knowledgeable than most humans. LLMs use transformer neural networks to read many words (sentences and paragraphs, too) at a time, figure out how they relate, and predict the following word.
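To get a feel for what “predict the following word” means, here is a minimal sketch in Python. It is emphatically not how GPT-3 works: a simple table of word-pair counts stands in for the transformer neural network, and the one-line corpus and the function names train_bigram and predict_next are made up for illustration. The principle is the same, though: learn from text which word tends to come next.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus; real models train on billions of words.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" only once
```

A real LLM differs in scale and in kind: it looks at a long stretch of preceding text rather than a single word, and its billions of learned parameters let it produce a probability for every possible next word.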

However, while LLMs such as GPT-3 and models like ChatGPT may outperform humans at some tasks, they do not understand what they read or write, unlike humans. Moreover, these models use human supervisors to make them more sensible and less toxic.

OpenAI explains on its website that ChatGPT is a sibling model to InstructGPT, which is trained to follow the instruction in a prompt and provide a detailed response. The model was trained using Reinforcement Learning from Human Feedback (RLHF), with the same methods as InstructGPT but with tweaks to the data collection setup. ChatGPT is fine-tuned from a model in the GPT-3.5 series that finished training in early 2022. ChatGPT and GPT-3.5 were trained on an Azure AI supercomputing infrastructure. GPT-3.5 refers to the series of models trained on a mix of text and code from before the fourth quarter of 2021, according to OpenAI.

ChatGPT, as I explained above, is based on the GPT-3.5 series. Alphabet-owned DeepMind (which Google acquired in 2014) has a similar AI-powered chatbot called Sparrow, which has not made waves. Sparrow, too, is described as a “dialogue agent that’s useful and reduces the risk of unsafe and inappropriate answers. The AI agent is designed to talk with a user, answer questions, and search the internet using Google when it’s helpful to look up evidence to inform its responses”.

DeepMind acknowledges that training a conversational AI is challenging because it’s hard to pin down what makes a dialogue effective. DeepMind used a form of reinforcement learning (RL) based on people’s feedback to address the issue. The participants’ preference feedback was used to train a model of how useful an answer is. The model was also trained to desist from making “threatening statements” and “hateful or insulting comments”. DeepMind researchers also provided rules around possibly harmful advice and not claiming to be a person.
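How can people’s preferences be turned into something a machine can learn from? The sketch below shows one common approach, a pairwise (Bradley-Terry style) preference model, and should be read as an illustration under stated assumptions rather than as DeepMind’s or OpenAI’s actual method. Each answer is reduced to two hypothetical hand-made features (whether it cites evidence, and whether it contains an insult), and the training pairs are invented. Real systems score raw text with a large neural network instead.

```python
import math


def reward(weights, features):
    """Score an answer as a weighted sum of its features."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(pairs, n_features, lr=0.5, epochs=200):
    """Fit weights so that human-preferred answers score above rejected ones.

    pairs: list of (preferred_features, rejected_features) tuples.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Probability the current model assigns to the human's choice.
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on the log-probability of that choice.
            for i in range(n_features):
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights


# Hypothetical features: [cites_evidence, contains_insult].
# Each pair is (answer the human preferred, answer the human rejected).
pairs = [
    ([1, 0], [0, 0]),  # an answer with evidence beats one without
    ([0, 0], [0, 1]),  # a polite answer beats an insulting one
    ([1, 0], [1, 1]),  # with equal evidence, the polite answer still wins
]
weights = train_reward_model(pairs, n_features=2)
print(weights)  # first weight ends up positive, second negative
```

In RLHF, a learned scorer of this kind then serves as the reward signal: the chatbot is adjusted, through reinforcement learning, to produce answers that the scorer rates highly.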

Sparrow currently focuses on English, but it is evident that DeepMind researchers are working to ensure similar results across other languages and cultural contexts. And though it was unveiled in September 2022, Sparrow is not well known because, unlike ChatGPT, it has not been released for public feedback and testing. ChatGPT is free for testing (though we may have to pay for it soon), which has made it a darling of the masses.

What next? LLMs are increasing in size. Google’s BERT, for instance, has 340 million parameters in its largest version, while GPT-3 has 175 billion parameters. Megatron-Turing NLG, a model released in 2022 by Nvidia and Microsoft, has 530 billion parameters. Google’s Pathways Language Model (PaLM) consists of 540 billion parameters, while Google Brain’s open-sourced ‘Switch Transformer’ natural language processing (NLP) AI model scales up to 1.6 trillion parameters. A team from OpenAI, creators of the GPT-3 model, found that NLP performance does scale with the number of parameters, which are essentially the parts of the model that are learned from historical training data and that define its skill on a problem, such as generating text. A model can acquire more granular knowledge and improve its predictions with an increase in parameters.

All these developments indicate that such AI-powered chatbots will transform content creation methods even as they make many content creators redundant, especially if the latter do not figure out ways to complement and use these tools. Institutions and lecturers around the world have begun urging authorities to review the way in which courses are assessed over concerns that students are using ChatGPT to write papers, which can be tantamount to plagiarism since there is no mention of sources and links. It’s also important to note that just as the internet rewired our brains, tools such as ChatGPT and Sparrow may do likewise. With the advent of calculators, for instance, few can do maths in their heads today.

The release of GPT-4, which is expected to significantly outperform GPT-3 with its rumoured 100 trillion parameters, will only raise more such questions by making tools like ChatGPT write better than most humans. But there will be room for original ideas and content creators. Well! At least for now.
