Connecting Trackers to Companies #307

Open
nxfxcom opened this issue Dec 14, 2022 · 2 comments

Comments

nxfxcom commented Dec 14, 2022

On pages like this one:
https://whotracks.me/websites/pagesix.com.html

it shows the connection between a tracker and a company. How do I reproduce that from the raw data? I don't see a company column in the trackers or sites_trackers files.

Thank you

@philipp-classen
Member

Apart from the CSV files, we also publish metadata. The format might change in the future, but it is currently available in the trackerdb.sql file that you can find here:
https://github.com/whotracksme/whotracks.me/tree/master/whotracksme/data/assets

For instance, it has mappings such as:

$ grep -i onetrust whotracksme/data/assets/trackerdb.sql
INSERT INTO "companies" VALUES('onetrust','OneTrust','Evolving data privacy regulations create consistent challenges for website owners. EU cookie laws based on the ePrivacy Directive and GDPR, require organizations to inform website visitors about the data that’s being collected from them and to provide them with the choice over sharing their information.\r\n \r\nOneTrust provides website owners with a transparent mechanism for obtaining required cookie consent from website visitors and respecting Do Not Track requests, helping organizations comply with EU Cookie Laws. Our comprehensive cookie compliance solution includes continuous website scanning against a 5.5M cookie database, flexible interface for managing visitor consent, and customizable visitor preferences center.','https://www.cookielaw.org/privacy-policy/','https://www.onetrust.com/','4754','us','[email protected]','3/9/17 CH: Vendor created. ');
INSERT INTO "tracker_domains" VALUES('onetrust','onetrust.com',NULL);
INSERT INTO "trackers" VALUES('onetrust','OneTrust',5,'https://www.onetrust.com/','onetrust','3740',NULL,NULL);
INSERT INTO "trackers" VALUES('optanaon','Optanaon by OneTrust',5,'https://www.cookielaw.org/','onetrust','3742',NULL,NULL);
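
In the rows above, the fifth value of each "trackers" row ('onetrust') matches the first value of the "companies" row, so a join on that value appears to give the tracker-to-company mapping. The following is a minimal sketch of that join using Python's built-in sqlite3 module. The column names (id, name, company_id, tracker, domain) are assumptions inferred from the INSERT statements, not taken from the actual schema, and the tables are cut down to the columns needed here. With the real file you would load trackerdb.sql itself via executescript and adjust the query to whatever column names its CREATE TABLE statements define.

```python
import sqlite3

# Hypothetical, reduced schema: column names are assumptions inferred from
# the INSERT statements in the grep output, not the real trackerdb.sql schema.
SCHEMA_AND_ROWS = """
CREATE TABLE companies (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE trackers (id TEXT PRIMARY KEY, name TEXT, company_id TEXT);
CREATE TABLE tracker_domains (tracker TEXT, domain TEXT);
INSERT INTO companies VALUES ('onetrust', 'OneTrust');
INSERT INTO trackers VALUES ('onetrust', 'OneTrust', 'onetrust');
INSERT INTO trackers VALUES ('optanaon', 'Optanaon by OneTrust', 'onetrust');
INSERT INTO tracker_domains VALUES ('onetrust', 'onetrust.com');
"""


def tracker_companies(conn):
    """Return (tracker_id, tracker_name, company_name) rows, sorted by tracker id."""
    query = """
        SELECT t.id, t.name, c.name
        FROM trackers AS t
        LEFT JOIN companies AS c ON c.id = t.company_id
        ORDER BY t.id
    """
    return conn.execute(query).fetchall()


# Build an in-memory database from the sample rows and run the join.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA_AND_ROWS)
rows = tracker_companies(conn)
for row in rows:
    print(row)
```

The LEFT JOIN keeps trackers that have no company assigned (their company name comes back as None). If the tracker identifiers in the CSV files match the first column of "trackers", which is an assumption worth checking against the data, the same lookup can be applied to the sites_trackers rows.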

@philipp-classen
Member

We recently opened all the data: https://github.com/ghostery/trackerdb. It's now the recommended place to start and will replace the trackerdb.sql file (#315).
