Team | Mar 27, 2023
The long-running rumors about OpenAI releasing GPT-4 finally ended last week when the Microsoft-backed company released the much-awaited model. GPT-4 is being hailed as the company’s most advanced system yet, and it promises to provide safer and more useful responses to its users. For now, GPT-4 is available on ChatGPT Plus and as an API for developers.
The newly launched GPT-4 can generate text and accept both image and text inputs. As per OpenAI, GPT-4 has been designed to perform at a level that can be compared to humans across several professional and academic benchmarks. The new ChatGPT-powered Bing runs on GPT-4. GPT-4 has been integrated with Duolingo, Khan Academy, Morgan Stanley, and Stripe, OpenAI added.
This announcement follows the success of ChatGPT, which launched just four months ago and became the fastest-growing consumer application in history. During the developer livestream, Greg Brockman, President and Co-Founder of OpenAI, said that OpenAI has been building GPT-4 since the company was founded.
OpenAI also mentioned that a lot of work still has to be done. The company is looking forward to improving the model “through the collective efforts of the community building on top of, exploring, and contributing to the model.”
So, what makes GPT-4 stand out from its predecessors? Let us find out:
One of the biggest upgrades for GPT-4 has been its multimodal abilities. This means that the model can process both text and image inputs seamlessly.
As per OpenAI, GPT-4 can interpret and comprehend images just like text prompts. The feature is not limited to any specific image type or size. The model can understand and process all kinds of images, from hand-drawn sketches to documents containing text and images to screenshots.
OpenAI assessed the performance of GPT-4 on traditional benchmarks created for machine learning models. The findings have shown that GPT-4 surpasses existing large language models and even outperforms most state-of-the-art models.
As many ML benchmarks are written in English, OpenAI sought to evaluate GPT-4’s performance in other languages too. OpenAI says it used Azure Translate to translate the MMLU benchmark into other languages.
Image: OpenAI
OpenAI mentions that in 24 out of 26 languages tested, GPT-4 surpassed the English-language performance of GPT-3.5 and other large language models like Chinchilla and PaLM, including for low-resource languages like Latvian, Welsh, and Swahili.
To differentiate between the capabilities of GPT-4 and GPT-3.5, OpenAI ran multiple benchmark tests, including simulated exams originally designed for human test-takers. The company used publicly available tests, such as Olympiads and AP free-response questions, and also obtained the 2022-2023 editions of practice exams. According to OpenAI, the model received no specific training for these tests.
Here are the results:
Image Source: OpenAI
OpenAI dedicated six months to enhancing GPT-4’s safety and alignment with the company’s policies. Here is what it came up with:
1. According to OpenAI, GPT-4 is 82% less likely than GPT-3.5 to respond to requests for disallowed content.
2. It is 29% more likely to respond to sensitive requests in a way that aligns with the company’s policies.
3. It is 40% more likely to provide factual responses compared to GPT-3.5.
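The three figures above are relative changes, so what they mean in absolute terms depends on the starting rate, which the article does not give. The short Python sketch below illustrates the arithmetic only. The baseline rates in it are hypothetical values chosen for illustration, not numbers reported by OpenAI, and `apply_relative_change` is a helper name made up for this example.

```python
def apply_relative_change(baseline_rate: float, relative_change: float) -> float:
    """Return the rate that results from applying a relative change.

    relative_change is a signed fraction: -0.82 means "82% less likely",
    +0.40 means "40% more likely".
    """
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    new_rate = baseline_rate * (1.0 + relative_change)
    # Clamp so the result is still a valid probability.
    return min(max(new_rate, 0.0), 1.0)


# HYPOTHETICAL baselines for illustration only (not OpenAI's data).
# Each entry maps a label to (assumed baseline rate, relative change from the list above).
EXAMPLES = {
    "responds to disallowed requests": (0.10, -0.82),
    "handles sensitive requests per policy": (0.60, +0.29),
    "gives a factual response": (0.50, +0.40),
}

for label, (baseline, change) in EXAMPLES.items():
    new_rate = apply_relative_change(baseline, change)
    print(f"{label}: {baseline:.1%} -> {new_rate:.1%}")
```

With a different assumed baseline, the same relative change yields a different absolute rate, which is why an "82% less likely" claim on its own does not say how often the behavior actually occurs.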
OpenAI also mentioned that GPT-4 is not “infallible” and can “hallucinate,” so it is important not to rely on it blindly.
OpenAI has been at the forefront of natural language processing advancements, starting with its GPT-1 language model in 2018. GPT-2 followed in 2019 and was considered state-of-the-art at the time.
In 2020, OpenAI released GPT-3, which was trained on a larger text dataset and delivered improved performance. Finally, ChatGPT came out a few months ago.
Generative Pre-trained Transformers (GPT) are machine learning models that can produce human-like text. These models have a wide range of applications, including answering queries, creating summaries, translating text into various languages (even low-resource ones), generating code, and producing various types of content like blog posts, articles, and social media posts.