Data aggregators are third parties that collect and license data from various sources. They are responsible for ensuring the lawful collection and proper usage rights of the data they distribute, especially when such data is used to train foundational AI models.
From the AI Governance in Practice Report 2025:
“As organizations have neither proximity to how third-party data was first collected nor direct control over the data governance practices of third parties, an organization can benefit from carrying out its own legal due diligence and third-party risk management.” (p. 19)
“Legal due diligence may include verification of the personal data's lawful collection by the data broker...” (p. 19)
This confirms that data aggregators bear the legal and ethical burden to verify that data has been lawfully collected and is appropriately licensed for use, including in AI training.
A. The marketing agency and D. its client may use data, but they rely on upstream providers for its lawful origin.
B. The tech company may train the model but depends on lawful sourcing by data aggregators.