It seems that the first privacy problems of the new AI era are beginning to emerge, after OpenAI confirmed on Friday, March 24, that a technical error exposed the conversation titles of some ChatGPT users for a period of about nine hours on Monday, March 20.

But the matter did not stop at conversation titles; the error also leaked important and sensitive user data, including partial credit card details of subscribers to the company's paid "ChatGPT Plus" service!

Sensitive data leaks!

When you use the ChatGPT chatbot and ask it a question about anything, it records and archives your previous conversations and displays them in a sidebar on the left, giving you a reference and a record of every conversation you have had with the bot. The sidebar does not show conversations in full; it shows only the title of each conversation, and clicking a title opens it.

Last Monday morning, about a week ago, some users noticed that the sidebar showed them titles of previous conversations that were unfamiliar to them, some in languages they did not even speak: what appeared to be other users' chat titles.

If you use #ChatGPT be careful! There's a risk of your chats being shared to other users!
Today I was presented another user's chat history.
I couldn't see contents, but could see their recent chats' titles.#security #privacy #openAI #AI pic.twitter.com/DLX3CZntao

— Jordan L Wheeler (@JordanLWheeler) March 20, 2023

As a result, the company took the chatbot offline that day to investigate the matter and try to fix it, and the investigation uncovered a more serious leak of private data. The company stated that some users could see another active user's first and last name, email address, payment address, the last four digits of their credit card number, and its expiration date. The company maintains that full card numbers were fortunately never disclosed or leaked, and that only 1.2% of subscribers to its paid service had financial data exposed. It announced that it had contacted affected users to notify them that their payment card data may have been leaked.

We took ChatGPT offline Monday to fix a bug in an open source library that allowed some users to see titles from other users’ chat history. Our investigation has also found that 1.2% of ChatGPT Plus users might have had personal data revealed to another user. 1/2

— OpenAI (@OpenAI) March 24, 2023

According to OpenAI, the inadvertent payment data leak occurred when a user navigated to their account page and from there to the subscription-management page during the error window, which the company estimated at about nine hours. There, instead of seeing their own subscription details, they could see the information of another ChatGPT Plus subscriber, provided that user was active and using the service at the same time. The company also mentioned that this may have happened before March 20, but it cannot confirm this. (1) As for how it happened, OpenAI states that the problem was caused by a caching error.

Unintentional error

To understand what happened, we should point out that the company uses open-source software called "Redis" to cache user information. Redis, simply put, stores data structures in memory and is widely used as a database, cache, and message broker; its value lies in speeding up access to data that is needed frequently, such as user information. (2)
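To picture how such a cache works, here is a minimal Python sketch of the common "cache-aside" pattern using the redis-py client. This is purely illustrative and is not OpenAI's actual code; the user record and database lookup are hypothetical.

```python
import json
import redis

# Hypothetical cache-aside lookup; illustrative only, not OpenAI's code.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def fetch_user_from_db(user_id: str) -> dict:
    # Placeholder for a slow database query.
    return {"id": user_id, "name": "Example User", "plan": "plus"}

def get_user(user_id: str) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)  # fast path: answered from memory
    if cached is not None:
        return json.loads(cached)
    user = fetch_user_from_db(user_id)   # slow path: hit the database
    r.setex(key, 300, json.dumps(user))  # keep a copy in Redis for 5 minutes
    return user
```

The first request for a user pays the cost of the database query; every request after that is served from memory, which is exactly why frequently needed data such as account details ends up in a cache like this.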

Sometimes something goes wrong: a request to Redis is canceled mid-flight, and the connection ends up returning the data meant for one request as the answer to a different one. Let's explain with an example. Suppose there is a user named "Najib Al-Samannoudi" and another named "Hassan Al-Shashmawi", each with his own account on the same application, in this case ChatGPT.

The app uses Redis to store temporary information about each user's account, such as payment data and conversation history. Al-Samannoudi opens the application and logs in; the application retrieves Al-Samannoudi's account information from Redis and keeps it in its own cache, so it can access it quickly later if needed. At the same time, Al-Shashmawi opens the application and logs in as well, and the application repeats the same process, retrieving Al-Shashmawi's data from Redis and keeping it in the cache.

Now Al-Samannoudi decides to view his chat history. The application queries the cache to execute the request and find his conversation history, but for some reason the Redis request is canceled before it completes. Normally the application should notice that something is wrong, that this is not what was requested, and report an error. Instead, it displays Al-Shashmawi's conversation history to Al-Samannoudi, who never requested it in the first place. The same happens in reverse: if Al-Shashmawi requests his payment data and the request is not completed, the application displays Al-Samannoudi's payment data to Al-Shashmawi, who never requested it either!
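To see why a canceled request can cause this, here is a toy asyncio model of the failure mode. It is not redis-py's actual code: it simply models a shared connection that matches replies to requests by arrival order, so a cancellation that abandons an unread reply leaves it sitting there for the next caller. All names are illustrative.

```python
import asyncio

# Toy model of the failure mode: a shared connection that matches replies
# to requests purely by arrival order. Illustrative only; not redis-py code.
class Connection:
    def __init__(self):
        self.replies = asyncio.Queue()
        self._tasks = []  # hold references so reply tasks aren't garbage collected

    async def send(self, command: str):
        async def serve():
            await asyncio.sleep(0.1)  # the "server" answers after a delay
            await self.replies.put(f"reply to: {command}")
        self._tasks.append(asyncio.create_task(serve()))

    async def read(self) -> str:
        return await self.replies.get()  # takes whatever reply arrives next

async def request(conn: Connection, command: str) -> str:
    await conn.send(command)
    return await conn.read()  # assumes the next reply is ours

async def main():
    conn = Connection()

    # Al-Samannoudi's request is canceled after the command was sent
    # but before its reply was read off the connection.
    task = asyncio.create_task(request(conn, "GET chat_history:al-samannoudi"))
    await asyncio.sleep(0.01)  # let the command reach the "server"
    task.cancel()

    # Al-Shashmawi's request on the same connection now consumes the stale reply.
    print(await request(conn, "GET payment_info:al-shashmawi"))
    # Prints: "reply to: GET chat_history:al-samannoudi"

asyncio.run(main())
```

A common cure for this class of bug is to discard any connection whose request was canceled mid-flight, so that a stale reply can never be mistaken for the answer to the next request.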

That is why users could see other users' payment data and conversation titles: they were simply receiving cached data that was supposed to go to someone else but never arrived because the request was canceled. It is also why the error only affected users who were active at the same time; a user who is not using the app has no data in the cache, because nothing is being requested on their behalf in the first place.

What turned this into a bigger problem was that on Monday morning OpenAI made a change to one of its servers that accidentally caused a spike in canceled Redis requests, increasing the likelihood of the error and of cached data being returned to people who never asked for it!

The ChatGPT user data leak may have several side effects on new systems and tools built on generative artificial intelligence, the first of which is the loss of user trust in those platforms. The company quickly acknowledged the failure, apologized to users and the entire ChatGPT community, and said it would work to rebuild that trust. (3)

we had a significant issue in ChatGPT due to a bug in an open source library, for which a fix has now been released and we have just finished validating.

a small percentage of users were able to see the titles of other users’ conversation history.

we feel awful about this.

— Sam Altman (@sama) March 22, 2023

But is this incident the beginning of larger and more serious privacy problems in a world that will soon be ruled by artificial intelligence tools, tools tied to our personal information, our work, and the sensitive tasks that come with it?

Legitimate apprehension

The UK's National Cyber Security Centre (NCSC) pointed out that ChatGPT and other large language models do not currently and automatically add the information a user types into a question to the data the model learns from. In other words, including private data in a question is not supposed to fold that data into ChatGPT's training data. The problem is that the question, and any data written in it, is fully visible to OpenAI itself, the company that developed the language model. The model may not use this data immediately, but the company may use it in the future for training in some way. (4)

The other danger is that as the use of these models grows, so does the risk of that data leaking, which is exactly what happened. In this case the incident did not escalate into a full leak of data outside the ChatGPT user base, but that is certainly possible, and such data could fall into the hands of other parties who would profit from it, such as the ubiquitous hacker groups.

This concern has led major companies to warn their employees against sharing any sensitive work-related information with ChatGPT. Last January, for example, a legal counsel at Amazon warned employees against providing ChatGPT "with any confidential information from Amazon", including company code that employees work on, because some programmers had been using ChatGPT to find bugs in the code they write. The counsel stressed that employees should follow the company's existing confidentiality and conflict-of-interest policies, as the company had found that some of ChatGPT's responses appeared to resemble internal Amazon data. (5)

In February, JPMorgan decided to bar its employees from using ChatGPT out of the same fear, and the US telecommunications company Verizon blocked ChatGPT from its platforms, saying the company could lose ownership of customer data, or of source code that its employees paste into the bot's prompts. (6)

This isn't your friend.

The chatbot is capable of collecting more personal information by engaging the user in a long conversation. (Shutterstock)

Looking at the overall picture, ChatGPT keeps a record of every conversation you have with it, and can probably learn a great deal from those conversations about your personality, interests, line of work, and other personal matters. The same may apply to the search engines we already use, such as Google and Bing, but the difference lies in how the chatbot interacts with the user, which is entirely new compared with traditional search engines.

The chatbot is distinguished by its ability to collect more personal information by engaging the user in a long conversation, simply because it is designed for that purpose: to feel like you are talking to someone you know, such as a friend. That makes it easy to forget you are dealing with an artificial intelligence system that collects and analyzes data and responds with the most likely answer. You may get carried away and reveal things you would never type into a search engine, and, more seriously, that personal information is now linked to your email address and phone number, may be used to train the model, and is visible to its developers.

In the end, the ChatGPT user data leak reminds us of something very important: while AI can make our lives easier and offer us more creative solutions, it can also create high-risk vulnerabilities if the companies that developed it cannot properly secure it.

As users, we should be mindful of the sensitivity of the information we share with AI platforms, stay aware of the potential risks of sharing it, or simply treat these tools as strangers we are talking to, and not give them confidential or sensitive information about our work or personal lives.

————————————————————————

Sources:

(1) March 20 ChatGPT outage: Here’s what happened

(2) Redis

(3) Ibid. (1)

(4) ChatGPT and large language models: what’s the risk?

(5) Amazon warns employees not to share confidential information with ChatGPT

(6) JPMorgan Restricts Employees From Using ChatGPT