OPINION | Are You Training AI Without Knowing It?
By Dr Santhosh Sivasubramani
Every time you click "I agree" on a terms of service document, you might be signing away more than you realise. That innocent-looking consent form could be giving companies permission to train their systems on your personal data, often without you fully understanding what that means.
Consider what happened when parents discovered their children's artwork, shared on popular creative platforms, was being used to train image generators. The platforms had quietly updated their terms to include language about using uploaded content for "machine learning purposes." Most users never noticed the change buried in lengthy policy updates. This scenario plays out millions of times daily across the internet. From the emails we write to the photos we share, from our search histories to our online shopping patterns, all of it potentially becomes training data for systems that may eventually sell services built on our own information back to us.
"When you share data freely today, you're contributing to the evolution of technology that will transform how we interact with digital services," I often explain when discussing current development trends. Think about your last doctor's appointment where you filled out digital forms, or when you uploaded family photos to a cloud service, or those customer service chats where you explained your problem in detail. Many companies now include clauses in their privacy policies allowing them to use this data for training purposes.
The impact extends beyond abstract privacy concerns. Graphic designers report discovering that image generators can reproduce their distinctive techniques after they've shared their work online. Medical professionals see patient interaction data helping train diagnostic systems. Even casual social media posts contribute. Your viral tweet might train sentiment analysis tools. Product reviews teach systems to write convincing content. Vacation photos help systems generate travel imagery of destinations their users may never visit. Look at the recent "Nano Banana Saree" trend that went viral. These systems learned to create such images from the countless photos people uploaded of their traditional clothing, cultural celebrations, and personal style. The communities sharing these cultural elements never explicitly consented to having their traditions digitised and replicated by commercial systems, raising questions about cultural representation and data ownership.
"The digital age promised free services in exchange for our data, but nobody mentioned we'd be training systems that fundamentally change how we access information and services," as I frequently discuss in technology policy courses. Companies benefiting from this arrangement argue they're building tools that help everyone. Sometimes they do. But we deserve transparency about how our data is used and a genuine choice about whether to participate. Fortunately, you can take practical steps to protect yourself. Start by actually reading those privacy updates, however tedious they seem. Look specifically for terms like "machine learning," "training," or "model development." Many services now offer opt-out options, but they're typically buried deep in settings menus.
Consider privacy-focused alternatives for sensitive activities. For creative work, watermarking serves a new purpose beyond attribution: it helps protect your content from unauthorised use. Some creators now use tools that subtly alter their images so they are difficult for training algorithms to process while looking unchanged to human viewers. Most platforms now offer granular privacy controls, so take time to explore your account settings; you'll often find options to limit how your data is used while still enjoying the service's core features. Depending on where you live, data protection laws such as Europe's GDPR or India's Digital Personal Data Protection Act may give you additional rights over how your personal information is collected and used.
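The cloaking tools creators use are sophisticated and beyond a short example, but the simpler idea of a visible watermark is easy to sketch. The snippet below uses the Pillow imaging library; the filenames, caption, and placement are assumptions for illustration, and it is not a substitute for the anti-training tools mentioned above.

```python
# A minimal sketch of a visible watermark using Pillow (pip install pillow).
# The filenames, caption, and placement below are assumptions for illustration;
# this is far simpler than the cloaking tools creators use against training.
from PIL import Image, ImageDraw

image = Image.open("artwork.png").convert("RGBA")
overlay = Image.new("RGBA", image.size, (0, 0, 0, 0))
draw = ImageDraw.Draw(overlay)

# Draw semi-transparent text near the lower-left corner of the image.
caption = "(c) Your Name - not licensed for model training"
draw.text((10, image.height - 30), caption, fill=(255, 255, 255, 140))

# Blend the overlay onto the original and save a flattened copy.
watermarked = Image.alpha_composite(image, overlay)
watermarked.convert("RGB").save("artwork_watermarked.jpg")
```

A visible mark mainly asserts ownership and deters casual reuse; it does not by itself stop an automated scraper from collecting the file.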
"We're contributing our time and creativity to systems that could significantly change how digital services work in the future," I note when teaching about data ethics. This reality deserves more public attention and debate. Understanding these dynamics helps you make informed choices about which services to use and how to configure your privacy settings. The relationship between personal data and technological development represents an opportunity for informed participation rather than concern. Many organisations are developing approaches that respect privacy while enabling advancement. Some use techniques where systems improve without directly accessing your personal data.
The next time you're about to click "agree," pause and consider whether you understand what you're consenting to. Your data has value. Understanding the true cost of sharing it empowers you to make choices that align with your comfort level. The decisions we make today about our digital footprints will determine whether technological development remains transparent and respectful of user preferences. Your awareness and informed participation help ensure that progress benefits everyone while respecting individual privacy choices.
(The author is an IEEE Senior Member)
Disclaimer: The opinions, beliefs, and views expressed by the various authors and forum participants on this website are personal and do not reflect the opinions, beliefs, and views of ABP Network Pvt. Ltd.