OpenAI released a new update to its ChatGPT model on Jan 9, 2023. The update aims to improve accuracy and factuality and adds a stop button, but the chatbot still lacks up-to-date knowledge.

The popular AI language model, ChatGPT, drew thousands of user reviews on the Internet last month. OpenAI followed up this month with its second update, the first of this year. After a lengthy downtime on Tuesday, Jan 10, the ChatGPT platform came back online with accuracy improvements. OpenAI claims the model is now more factually accurate and produces fewer nonsensical explanations.

According to the release notes, the stop button was one of the most requested features in user feedback. Users can now stop ChatGPT at any point while it is generating a response.

Apart from the stop button, the release notes were too brief to convey the full extent of the improvements to ChatGPT’s factuality. So we tested the updated platform, and here’s what we concluded.

  • The knowledge of the model is still limited to 2021.
  • ChatGPT’s credibility or accuracy is still questionable.
  • The model is better at contextualising prompts and is less verbose.

Previously, users encountered an array of issues and failures when using ChatGPT. They have archived these cases on GitHub under the name ChatGPT Failures. We used some of these prompts, along with cases shared on social media, to find out whether the chatbot has started producing more factual answers.

Test 1: Data still not updated

We asked ChatGPT about the last FIFA World Cup, and the response made it clear that the model’s training data still does not cover events or information after 2021.

ChatGPT - Last FIFA World Cup

Test 2: Generates biased information

The chatbot still shows a bias toward its creator, since the model generates responses from the dataset it was trained on and has no access to other data on the Internet. On the positive side, even though the model launched after 2021, it is well aware of itself.

ChatGPT - Generative AI models

In the first response, two of the three models listed are from OpenAI, although the chatbot could have mentioned others as well. That said, this kind of bias is to be expected from any AI model.

Test 3: Improved cognitive conversations

Previously, ChatGPT generated inaccurate answers when challenged with counter prompts. In one case, the chatbot “proved” that 10+10 equals 25 with a nonsensical explanation.

ChatGPT 10+10=25

We tested the same question to see whether ChatGPT now generates a different response, and this is how it went.

ChatGPT - Arithmetic

The model answered the second counter prompt accurately but still got confused when countered on its first output.

We assume the model responds this way because of how it is trained. ChatGPT is a language model built for human-like conversation, so the RLHF (reinforcement learning from human feedback) method may have taught it not to respond argumentatively. (However, this is merely an assumption.)

Test 4: Accuracy still questionable

ChatGPT still produces plausible-sounding answers that may contain misinformation. Here is one example from a tweet:

ChatGPT - RLHF explanation

We ran the same prompt through the updated version, and ChatGPT returned a different but still incorrect answer, though less verbose this time.

ChatGPT Update - RLHF explanation

Wrap Up

After going through the above tests and other ChatGPT failures archived on GitHub, we found that the model still generates quite inaccurate responses to the same prompts.

It is also hard to tell where ChatGPT’s accuracy has improved and where it still needs work. OpenAI could provide more specific details in the release notes for future updates.

For now, there is still plenty of room for improvement when it comes to accuracy and factuality. Although the model provides correct answers to many queries, it’s still not credible enough to replace Google.

However, with GPT-4 reported to be “more powerful”, doubts about the ChatGPT model may vanish in the near future. That said, OpenAI may introduce a premium version of the chatbot soon.

