Share this article on:
Australian researchers are currently trailing the dangerous potential for chatbots to spread misinformation.
With chatbots widely used in customer service across a number of sectors, such as telecommunications, banking and retail, researchers from Macquarie University have discovered that these “chit-chat bots” can be trained to spread false information.
The team, led by researcher Conor Atkins, discovered the issue after these chatbots, such as Meta’s BlenderBot 2 and BlenderBot 3, were updated to have long-term memory.
“One of the new developments in chit-chat bots is a long-term memory mechanism that remembers information from past conversations for increasing engagement and consistency of responses,” said the research paper titled “Those Aren’t Your Memories, They’re Somebody Else’s: Seeding Misinformation in Chat Bot Memories”.
“The bot is designed to extract knowledge of personal nature from their conversation partner, e.g., stating preference for a particular colour.”
The idea of adding long-term memory capabilities to chatbots is that chatbots would be able to have more human interactions and deliver a more natural experience that could last “days, weeks and months”, allowing for longer-term and serious issues to be dealt with.
Researchers have said that Blenderbot’s long-term memory capabilities can be infected with misinformation, which could then be called on again.
“[The vulnerability] exploits the design of the bot to remember certain types of information (personal information in examples we discuss), which can be cleverly mixed with misinformation contained in non-personal statements in order to trigger memorisation,” said the report.
The investigation sought to test if the issue, which researchers classed as a “vulnerability” but not a bug, would allow attackers to inject misinformation maliciously, “which is later produced by the chatbot as authoritative statements of fact”.
After generating 13,000 conversations with BlenderBot 2, the researchers concluded that this could be done.
“These memories can be injected by an attacker who has momentary, black-box access to the victim’s chatbot, e.g., a personal digital assistance, or a chatbot with shared memory over multiple users such as in a home, an office, on social media or customer service,” the report said.
The report names a number of “possible defences” against attackers using the bot maliciously to spread misinformation.
Currently, chatbots like BlenderBot 3 and OpenAI’s ChatGPT “use an NLP model to detect unsafe responses”. However, researchers said that this is limited as it will only detect misinformation if the filter knows what is being said is false.
Instead, the report suggests that prevention is a more secure method than detection. Researchers suggest that user authentication be adopted, making chatbot memories exclusive to the user, and preventing bad actors from infecting the information seen by specific users.
Researchers also noticed that in the instance that duplicate memories could be created with duplicate messages. The report suggests removing duplicate messages to reduce their influence of them and prevent the chatbots from learning false information.
The full report can be found here.