Common Crawl maintains a free,open repository of web crawl data that can be used by anyone.
Common Crawl is a 501(c)(3) non–profit founded in 2007. We make wholesale extraction, transformation and analysis of open web data accessible to researchers.
The Common Crawl team attended the 63rd Annual Meeting of the Association of Computational Linguistics in Vienna, presenting recent published work and strengthening links with the research community.
Laurie Burchell
Laurie is a Senior Research Engineer with Common Crawl.