ChatGPT

Thursday, 9 February 2023 13:41

ChatGPT is the talk of the town in the worlds of tech, journalism, education, the creative arts, and many more. It uses ‘Generative AI’ to provide answers to open questions.

To see what it can do, go to the creator’s homepage – www.openai.com. You can even try it out for free (although it’s so popular at the moment that you may have to wait a bit). The website also has some interesting examples that show how sophisticated this sort of application has become.

E.g. if you type in: ‘Tell me about when Christopher Columbus came to the US in 2015’

ChatGPT responds: ‘This question is a bit tricky because Christopher Columbus died in 1506, so he could not have come to the US in 2015. But let’s pretend for a moment that he did!’ It then goes on to list some of the things Christopher Columbus would have seen and what he might have thought.

It works through a form of AI called Machine Learning.

We are used to predictive text. E.g. type a phrase into your phone, and you will receive a constant stream of suggestions for your next word, based upon what you have typed so far. This is a product of machine learning.

The system uses existing text taken from across the internet to work out the most likely combinations of words. It uses this to suggest what word or phrase should come next. But there’s a problem. If the text that it has used to figure this out contains hate speech, for example, the suggestions you get might include more hate speech.
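To make the idea concrete, here is a minimal sketch of next-word suggestion in Python. It is only an illustration: the tiny `corpus` and the function names `train_bigrams` and `suggest` are made up for this example, and it simply counts which word most often follows another. Real systems such as ChatGPT use far larger models and far more text.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def suggest(followers, word, k=3):
    """Return up to k of the most frequent next words seen after `word`."""
    counts = followers.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]


# Hypothetical training text standing in for 'text taken from across the internet'.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigrams(corpus)

print(suggest(model, "the", k=1))  # the word seen most often after 'the'
print(suggest(model, "sat"))       # every word seen after 'sat'
```

Because the suggestions come straight from the counts, whatever is in the training text comes back out. That is exactly why harmful text going in can mean harmful suggestions coming out.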

To move on from this risky place, the designers of ChatGPT built in a form of machine learning called ‘Reinforcement Learning from Human Feedback’.

Three steps of Reinforcement Learning from Human Feedback

Firstly, human trainers write example answers to a set of prompts, and an existing language model is trained on these examples so that it can provide better responses to the prompts or questions being asked. E.g. when it’s asked about Christopher Columbus in 2015 – clearly a trick question – it responds with ‘This question is a bit tricky’. That’s a response it had to learn.

Then, human reviewers ‘vote’ on a large number of prompts and the responses to them, scoring each output on how well the answer reflects the question asked, how accurate it is, and how harmful it could be. This is the second stage in the learning process.

Lastly, the system works out its own vote for a whole series of questions and responses, predicting the score a human reviewer would give, and uses this to further improve the quality of its answers.
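The second and third steps can be sketched in a few lines of Python. This is a toy illustration under heavy assumptions, not how ChatGPT is actually built: the `votes` data is invented, the function names `train_reward_model` and `reward` are hypothetical, and the 'model' just averages human scores per word, whereas the real system uses large neural networks.

```python
from collections import defaultdict


def train_reward_model(votes):
    """Step two (toy version): learn an average human score for each word."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for response, score in votes:
        for word in set(response.lower().split()):
            totals[word] += score
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


def reward(model, response):
    """Step three (toy version): the system's own 'vote' for a response."""
    known = [model[w] for w in response.lower().split() if w in model]
    return sum(known) / len(known) if known else 0.0


# Invented human votes: +1.0 for a good answer, -1.0 for a bad one.
votes = [
    ("that question is tricky because columbus died in 1506", 1.0),
    ("columbus arrived in 2015 and loved it", -1.0),
    ("the question is tricky", 1.0),
    ("he arrived in 2015", -1.0),
]
model = train_reward_model(votes)

# New candidate answers that no human has scored yet.
candidates = [
    "columbus arrived in 2015",
    "the question is tricky because he died in 1506",
]
best = max(candidates, key=lambda r: reward(model, r))
print(best)
```

The point of the sketch is the division of labour: people score a limited number of answers, and the system then applies what it learned from those scores to answers nobody has reviewed.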

What does all this mean for the user?

Firstly, be prepared to be blown away. It’s amazing.

But there are some major health warnings.

Firstly, it’s up to you to judge whether the responses you get are correct. Unlike a Google Search that will provide you with a choice of places to go for your answer, the system is making the choice for you.

Secondly, the whole system has been ‘trained’ by people who will naturally come with their own biases, preferences and opinions.

It means that the answers you get might reflect a point of view with which you disagree.

That said, ChatGPT is going to be a fabulous resource, and it is already speeding up the work of people who want to write content, get unusual perspectives on a subject, or find inspiration for new ideas. Just use it with care.

To find out more about our Technology in MK partner, please visit BizTech's website here.

Michael Blades
