A lot of popular AI image-generation systems have been released recently, with systems such as Midjourney and DALL-E attracting attention for how polished many of their automatically generated images are.

However, these systems have drawn a lot of pushback, largely because the training data they were fed apparently included lots of art that was used without the creators' consent. DeviantArt and other image-sharing platforms were apparently scraped for training data without regard for the original creators' consent or for licensing.

Since these images were used as training data without the appropriate licensing, the AI systems that benefit from that data could be used to generate images that are then used commercially, which would violate licenses such as CC BY-NC-SA. In addition, no attribution or credit has been provided to the original artists.

How can this problem be avoided when training an AI system? How can you ethically compile a comprehensive training data set for AI image generation?

Mithical
