The development process and limitations of GPT models

AI Network
6 min read · Feb 19, 2021

What is GPT?

A software engineer who participated in OpenAI's research illustrated the difference between traditional software engineering and deep learning as follows: in traditional development, a person writes the software, feeds it input, and the program produces the result; in deep learning, the person only intervenes at the training phase, and the trained model effectively creates the software itself. That, the engineer said, is the big difference. The world is changing.

Source: OpenAI

XiaoIce, the Chinese AI chatbot developed by Microsoft, wrote more than 100 poems after studying thousands of works by 519 poets for 100 hours. The resulting collection is titled 'Sunshine Misses Windows', and XiaoIce has grown into a service used by more than 600 million users in China.

We're sure everyone has had at least one experience of talking to an AI speaker at home out of boredom. When we can't make sense of what the AI says, teasing the speaker is half the fun. But what if artificial intelligence could speak and write like a human being? Maybe it could become a friend we share our daily lives with.

Language is the biggest social tool that distinguishes humans from other animals. So what does it mean when artificial intelligence learns to speak?

There are two main ways to build a language model for artificial intelligence: 1) a statistical approach, which models human language with word-frequency statistics, and 2) an approach using artificial neural networks. As you may have heard, the neural-network approach has shown much better performance recently. Before going further, it's worth talking about what it means for us that AI language models can now write and speak almost like humans.
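
To make the 'statistical' approach concrete, here is a toy sketch (our own illustration, not something from OpenAI) of a bigram model: it picks the next word purely from how often word pairs appeared in a tiny hand-made corpus.

```python
from collections import Counter, defaultdict
import random

# A toy corpus; real statistical models are built from huge text collections.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word (bigram counts).
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word(word):
    """Sample the next word in proportion to how often it followed `word`."""
    counts = bigrams[word]
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

print(next_word("the"))  # e.g. 'cat', 'mat' or 'fish', weighted by frequency
```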

Today's main theme is GPT, a language model built on artificial neural networks. AI Writer, created by AI Network, is also based on GPT 2. GPT is a model made and released by OpenAI, a research lab co-founded by Elon Musk as a non-profit. Since the first version, a second and a third version have followed, and there is already talk of GPT 4.

Timeline of GPT development

A language model such as GPT is, at its core, a conditional probability predictor: given the words so far, it predicts the next one. GPT doesn't understand and compose sentences the way humans do; it analyzes text data to assemble plausible sentences. There is no mechanism of intense deliberation behind the writing. When we enter a keyword, GPT simply draws on everything it has learned that relates to that keyword and generates the text people ask for. There is a saying that writing reveals the writer, but that doesn't seem to apply to artificial intelligence.
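
As an illustration of this next-word mechanism, here is a minimal sketch that samples a continuation from the publicly released GPT 2 model, assuming the Hugging Face transformers package (with PyTorch) is installed; the prompt and settings are our own example, not AI Writer's actual setup.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Artificial intelligence will"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# At each step the model scores every token in its vocabulary and samples
# one of the most likely ones -- it never "understands" the sentence.
output = model.generate(
    input_ids,
    max_length=40,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```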

Nevertheless, the secret to writing that is indistinguishable from a human's is the enormous amount of training. GPT 1, first released by OpenAI in 2018, was trained with 117 million parameters. A year later, in 2019, OpenAI released GPT 2 in four stages; depending on the model size, the parameter count ranged from about 124 million to 1.5 billion, roughly 10 times the previous version, and it is said to be able to produce a human-like page of a book in just 10 seconds. This wasn't the end. GPT 3 has 175 billion parameters, making it roughly 1,000 times bigger than GPT 1 and 100 times bigger than GPT 2, and its performance improved significantly along the way. GPT 3 can solve various language-related problems, write freely on a topic, do simple arithmetic, translate, and even produce simple web code from given sentences. It has given people the fear that artificial intelligence could replace humans even in the field of writing.
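
A quick back-of-the-envelope check of those size ratios (our own rounding):

```python
gpt1 = 117_000_000        # GPT 1 parameters
gpt2 = 1_500_000_000      # largest GPT 2 model
gpt3 = 175_000_000_000    # GPT 3

print(round(gpt3 / gpt1))  # ~1496 -> on the order of 1,000 times GPT 1
print(round(gpt3 / gpt2))  # ~117  -> roughly 100 times GPT 2
```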

Who are the competitors for GPT 3?

GPT 3 is based on a deep learning architecture called the Transformer. The concept was first introduced in the 2017 Google Brain paper 'Attention Is All You Need'. Transformers have become the foundation for a variety of models because they can learn from vast datasets and be trained efficiently. You can see this from the fact that, after the paper was published, a race began to build ever-larger models that can handle a wide range of language tasks. Google's BERT, Microsoft's Turing NLG, and OpenAI's GPT 3 are all recent models based on the Transformer. Before GPT 3, the biggest language model was Turing NLG, introduced by Microsoft in 2020, with 17 billion parameters, about 10 times smaller than GPT 3. Microsoft gave up on competing and instead obtained an exclusive license to GPT 3. (Of course, Musk criticized it.) Google also has its own language model, BERT, which helps it understand the nuance and context of words and produce more accurate search results.
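
For readers curious what the 'attention' in that paper actually computes, here is a minimal sketch of scaled dot-product attention, assuming PyTorch; the tensor sizes are arbitrary and purely illustrative.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / d_k ** 0.5
    weights = F.softmax(scores, dim=-1)  # how much each token "attends" to the others
    return weights @ value

# Toy example: a batch of 1 sequence with 4 tokens, each an 8-dimensional vector.
q = k = v = torch.randn(1, 4, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 4, 8])
```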

Why was GPT 3 released through an API?

OpenAI had been reluctant to release the GPT 3 model as open source, contrary to its founding purpose. Here is what they said about releasing it through an API instead: they believe it can fund further AI development, and they want to collaborate with small and medium-sized companies by sharing huge resources through the API. They also think that API access reduces misuse, because once a model is released as open source, malicious use cases become difficult to prevent.
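
In practice, 'releasing through the API' means the model weights stay on OpenAI's servers and developers only send prompts and receive text back. A minimal sketch, assuming the openai Python client of that period and an issued API key; the prompt is our own example.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # access is granted (and can be revoked) by OpenAI

# Only the prompt and the generated text travel over the API;
# the 175-billion-parameter model itself never leaves OpenAI's servers.
response = openai.Completion.create(
    engine="davinci",
    prompt="Write a short poem about sunshine and windows:",
    max_tokens=60,
)
print(response.choices[0].text)
```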

What are the limits of GPT 3?

But there are still limits. First of all, the number of parameters keeps having to increase, which requires a huge amount of computing power. GPT 3 alone has 175 billion parameters, more than 100 times as many as GPT 2's 1.5 billion. So many parameters are evidence of tremendous performance, but they make a model hard to train and hard to use, and it's worth asking whether collaboration in the true sense is even possible at this scale.
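
To get a rough sense of why so many parameters are hard to use, here is our own back-of-the-envelope estimate, assuming the weights are stored in 16-bit precision:

```python
params = 175_000_000_000      # GPT 3 parameter count
bytes_per_param = 2           # 16-bit (fp16) storage, a common choice for inference

weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB just to hold the weights")  # ~350 GB

# A single GPU with 16-32 GB of memory cannot even load the model,
# let alone train it, which is why clusters like the one below are needed.
```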

For example, the Azure AI supercomputer that Microsoft provides to OpenAI has more than 285,000 CPU cores and 10,000 GPUs, connected over a 400 Gbps network. Since a single large machine learning model performs better than many small individual AI models, development will inevitably keep drifting away from 'AI for all'.

People are worried and afraid that GPT 3 has emerged as a massive replacement for human jobs. But GPT 3 only produces the next word that statistically best matches the words before it, and there is criticism that it is not writing through understanding. Thinking and understanding are questions for philosophy; what is clear is that we humans did not learn language by predicting the next word.

What about GPT 4?

Lex Fridman of MIT estimated that training an artificial intelligence with as many connections as the human brain would cost about 2.6 billion dollars as of 2020, but could cost around 80,000 dollars by 2040. So GPT 4, which will follow GPT 3, is likely to be even more surprising at a lower cost. Still, training artificial intelligence will inevitably require a large budget and huge resources.

What if we did it with AI Network? The problem might be solved more easily than we think. We could make 'open AI' in the true sense by creating a computer that connects the world. Wouldn't it be possible for developers and resource providers around the globe to work together to create computing power that surpasses Microsoft's supercomputer? With collective intelligence, we could all collaborate to defeat the bad AI that OpenAI worries about.
We're curious what you think! Please share your opinion. ^^

AI Network is a blockchain-based platform that aims to innovate the AI development environment. It serves as a global back-end infrastructure for millions of open-source projects deployed live.
