Cyber Daily

Human Rights Watch finds photos of Aussie kids in AI training dataset

The candid images of nearly 200 Australian children were found being used to train AI – without their or their parents’ consent.

David Hollingworth
Wed, 03 Jul 2024

The photos of 190 Australian children were found as part of a massive dataset being used to train artificial intelligence tools, according to a new report from Human Rights Watch.

The photos were scraped off various websites and services without the consent of either the children or their parents, Human Rights Watch said.

The analysis focused on LAION-5B, a dataset built by the German non-profit organisation LAION, largely by scraping the internet, and used to train a range of AI tools.

In many cases, the children are easily identifiable, as names, locations, and even the time the photo was taken are included in either captions or associated URLs. For instance, one photo of two preschoolers includes their names and ages and the name of their preschool. More worryingly, while some photos can be found elsewhere on the internet, this particular one, along with others, cannot be found anywhere other than in the LAION-5B dataset.

Other alarming images depict First Nations children from the Anangu, Arrernte, Pitjantjatjara, Pintupi, Tiwi, and Warlpiri peoples, which raises particular concerns given cultural restrictions on displaying or reproducing images of the deceased during times of mourning. As the report noted, “current AI models cannot forget data they were trained on, even if the data was later removed from the training data set”.

The wider fear is that any tool trained on the data may be able to reproduce specific, individual data, such as a child’s identity. Human Rights Watch said that LAION-trained AI tools had already been used to create child sexual abuse material, “as well as explicit imagery of child survivors whose images of sexual abuse were scraped into LAION-5B”.

The organisation is also concerned that 190 images are just the tip of the iceberg, as Human Rights Watch was only able to review “0.0001 per cent of the 5.85 billion images and captions” within the dataset.

The Human Rights Watch report was published on 2 July. LAION confirmed the existence of the photos ahead of publication and promised to remove them. The organisation denied that AI trained on the LAION-5B dataset could reproduce the data it had ingested, but also laid the responsibility for protecting the pictures of children on their parents and guardians. It’s up to them, LAION contended, to remove such pictures if their misuse is of concern.

It’s a long bow to pull, given that some of the 190 images are not publicly accessible anywhere online.

“Children should not have to live in fear that their photos might be stolen and weaponised against them,” Hye Jung Han, children’s rights and technology researcher and advocate at Human Rights Watch, said.

“The Australian government should urgently adopt laws to protect children’s data from AI-fueled misuse.”

Han further said: “Generative AI is still a nascent technology, and the associated harm that children are already experiencing is not inevitable.

“Protecting children’s data privacy now will help to shape the development of this technology into one that promotes, rather than violates, children’s rights.”

To that end, Human Rights Watch is calling on the Australian government to prohibit the scraping of the internet for the personal data of children as part of the Children’s Online Privacy Code and to include data privacy protections for everyone as part of any AI regulatory regime – particularly when it comes to protecting children.

David Hollingworth

David Hollingworth has been writing about technology for over 20 years, and has worked for a range of print and online titles in his career. He is enjoying getting to grips with cyber security, especially when it lets him talk about Lego.
