If Public Data is Available Freely Online, Can I Use it in an AI System?

If public data is freely available online, can we use it in an AI system?

Remember: much of what appears online is protected by copyright, even if it is freely accessible. Free access does not automatically give you the right to copy the material, train on it, or otherwise use it in AI.

Web scraping, downloading, or ingesting online content into an AI system creates copies that may infringe copyright unless you have permission or another valid legal basis for doing so.

If a work is in the public domain, or is available under an open license that actually allows this kind of reuse, using it may be acceptable.

But you need to verify that before using it.

The big takeaway? Always check the terms of use or rely on trusted licensed sources before using online material in any AI workflow.

Author: Roanie Levy

Roanie Levy, Licensing and Legal Advisor at CCC, combines over 20 years of intellectual property and copyright law expertise with a strong entrepreneurial and technological background. As Access Copyright's former President and CEO, Levy successfully navigated complex legal landscapes while driving innovation and growth. Her deep understanding of technology's impact on the creative industries informs her current focus on the ethical and responsible use of AI. At CCC, she supports initiatives to develop licensing frameworks that balance technological advancement with protecting creators' rights, ensuring that AI technologies are deployed transparently and fairly.