Lumen
A research project collecting and publishing legal takedown notices for online content transparency
Lumen, formerly called “Chilling Effects”, is a research project at Harvard University’s Berkman Klein Center for Internet & Society that, as of 2025, houses more than 35 million takedown notices and grows by roughly 40,000 each week. It aggregates legal and extralegal removal demands from platforms such as Google, GitHub, Reddit, Medium, Vimeo, Wikipedia, Meta (Facebook & Instagram) and others. (Twitter/X halted submissions in April 2023 while it reviews third-party data-sharing.)
Notices Repository: Lumen keeps a collection of notices, including DMCA claims, defamation complaints, privacy requests, trademark claims, and court orders. Each notice records the sender and recipient, that is, who requested the content removal and which hosting or search service received the request. It also gives a brief overview of the reasons for the request and lists the URLs of the content in question.
Search and Filtering: The website provides a powerful search interface. You can filter by notice type, date received, sender, recipient, and more. Casual users see truncated URLs for privacy, while researchers may request additional access to view full URLs and attachments.
API for Advanced Research: Researchers and investigative journalists can obtain API credentials to automate queries for large-scale data analysis. The API supports:
Searching by keywords, date ranges, parties involved, etc.
Retrieving entire notices as JSON for customized analytics.
Programmatic data collection over time to identify trends in takedown requests.
Lumen’s overarching goal is to bring transparency to the ecosystem of online content removal requests. It aims to be an independent, research-driven clearinghouse of takedown notices, allowing journalists, researchers, and the public to see who is requesting that specific web content or links be taken down and why. Lumen does not validate or endorse these requests; rather, it archives them to facilitate academic study, watchdog journalism, and informed civic debate.
Investigating Patterns of Censorship or Overreach:
Example: A politician repeatedly using DMCA notices to remove critical blog posts.
Outcome: You can uncover if the same entity has filed multiple notices across different platforms to silence certain views.
Checking the Legitimacy of a Takedown Claim:
Example: You receive a tip that content on YouTube was flagged for copyright infringement, but you suspect it’s fair use.
Outcome: A search of Lumen might reveal a DMCA notice that either lacks a credible claim or is suspiciously similar to notices flagged as fraudulent.
Examining Geopolitical or Government Interventions:
Example: A ministry in Country X demands the removal of “defamatory” content from Google or a social network.
Outcome: Lumen’s archive can reveal the scope of government requests, including how often and for what reasons they are made.
Researching Corporate Takedown Practices:
Example: You want to see if a major film studio issues an unusually high number of DMCA requests for minor social-media posts.
Outcome: By aggregating data in Lumen, patterns may emerge, helping you question potentially excessive takedowns.
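The cross-platform pattern described in these use cases can be sketched in a few lines of Python. This is a minimal illustration, not part of Lumen itself: it assumes you already hold a list of notice records as dictionaries, that each record carries sender_name and recipient_name keys (recipient_name is an assumption here), and that redacted senders appear as the literal string "Redacted". The sample records are invented for the demonstration.

```python
from collections import defaultdict


def senders_across_recipients(notices, min_recipients=2):
    """Return senders that filed notices with at least `min_recipients` platforms.

    `notices` is a list of dicts. The keys "sender_name" and "recipient_name"
    are assumptions about the record layout, not a confirmed schema.
    """
    recipients_by_sender = defaultdict(set)
    for notice in notices:
        sender = (notice.get("sender_name") or "").strip()
        recipient = (notice.get("recipient_name") or "").strip()
        # Skip records with missing or redacted parties.
        if not sender or not recipient or sender.lower() == "redacted":
            continue
        recipients_by_sender[sender].add(recipient)
    return {
        sender: sorted(recipients)
        for sender, recipients in recipients_by_sender.items()
        if len(recipients) >= min_recipients
    }


# Invented sample records, for illustration only.
sample_notices = [
    {"sender_name": "Example Studio", "recipient_name": "Google"},
    {"sender_name": "Example Studio", "recipient_name": "Vimeo"},
    {"sender_name": "Solo Sender", "recipient_name": "Google"},
    {"sender_name": "Redacted", "recipient_name": "Google"},
    {"sender_name": "Redacted", "recipient_name": "Medium"},
]

print(senders_across_recipients(sample_notices))
```

A sender that shows up here is only a lead for further reporting; repeated notices across platforms can be entirely legitimate.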
Full or partially redacted copies of notices:
Sender (often a rights-holder, law firm, or government agency)
Recipient (Google, Vimeo, Medium, or others)
Target URLs and reason (e.g. copyright, defamation, trademark)
Contextual metadata:
Date sent/received
Claimed legal grounds (DMCA, local law, court order)
Whether the recipient took action (partial or none)
Research Tools:
Searchable indexes and filtering (dates, keywords, notice type)
API access for data mining and in-depth analysis
Below is a short sample of how a journalist or researcher might retrieve data via Lumen’s API (assuming they have an API token and some familiarity with command-line tools like curl). This example queries for notices that contain the term “fraud” in their text or metadata, from a date range, and returns the first page of JSON results.
Explanation:
term=fraud: Searches for notices containing the word “fraud” in their text.
date_received_facet=1672531200000..1704067200000: Limits results to notices received between two Unix timestamps given in milliseconds (here, 01 Jan 2023 to 01 Jan 2024, UTC).
page=1: Retrieves the first page of results.
-H "X-Authentication-Token: YOUR_API_TOKEN": Authenticates you as a recognized researcher.
Accept: application/json: Ensures responses are returned as JSON, which is easily parsed by scripts or data-analysis tools.
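For readers who prefer a script to the command line, the same request can be assembled in Python with only the standard library. This is a hedged sketch: the endpoint URL below is an assumption and should be checked against Lumen's current API documentation, the function name is hypothetical, and YOUR_API_TOKEN is a placeholder. The code only builds the request and does not contact the server.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed search endpoint; verify against Lumen's API documentation.
SEARCH_URL = "https://lumendatabase.org/notices/search.json"


def build_search_request(term, start_ms, end_ms, page=1, token="YOUR_API_TOKEN"):
    """Build (but do not send) a search request using the parameters described above."""
    params = {
        "term": term,
        # Range of millisecond Unix timestamps, joined by two dots.
        "date_received_facet": f"{start_ms}..{end_ms}",
        "page": page,
    }
    headers = {
        "X-Authentication-Token": token,
        "Accept": "application/json",
    }
    return Request(f"{SEARCH_URL}?{urlencode(params)}", headers=headers)


request = build_search_request("fraud", 1672531200000, 1704067200000)
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch runs without a token or network access.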
You will receive a JSON object listing matching takedown notices, each with fields like id, title, date_received, sender_name, and more. By iterating through page values, or narrowing your date range, you can gather larger sets of notices over time.
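The page-by-page collection described above might look like the following Python sketch. It assumes each response is a JSON object whose matching notices sit under a top-level "notices" key, which is an assumption about the response layout, and it takes a hypothetical fetch_page callable so the paging logic can be shown without real credentials. The fake pages are invented sample data.

```python
# Fields named in the text above; real responses contain more.
FIELDS = ("id", "title", "date_received", "sender_name")


def collect_notices(fetch_page, max_pages=5):
    """Walk result pages 1..max_pages and keep a few fields from each notice.

    `fetch_page(page)` must return the parsed JSON for that page as a dict.
    Collection stops early when a page contains no notices.
    """
    rows = []
    for page in range(1, max_pages + 1):
        payload = fetch_page(page)
        notices = payload.get("notices", [])  # assumed top-level key
        if not notices:
            break
        for notice in notices:
            rows.append({field: notice.get(field) for field in FIELDS})
    return rows


# Invented stand-in for real API responses.
fake_pages = {
    1: {"notices": [{"id": 1, "title": "Notice A", "date_received": "2023-02-01",
                     "sender_name": "Sender A", "extra": "ignored"}]},
    2: {"notices": [{"id": 2, "title": "Notice B", "date_received": "2023-03-01",
                     "sender_name": "Sender B"}]},
}


def fake_fetch(page):
    return fake_pages.get(page, {"notices": []})


rows = collect_notices(fake_fetch)
print(len(rows), [row["id"] for row in rows])
```

In real use, fetch_page would send an authenticated request for the given page and decode the JSON body; keeping max_pages small and pausing between calls is a sensible courtesy toward a research service.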
Lumen is a free and open source project.
Beginner-Friendly (Web Search) to Moderate (API Research)
Basic searching requires only a web browser and minimal familiarity with search filters.
API usage or large-scale analysis requires intermediate data-handling or scripting skills (JSON, command line, etc.).
Website Usage: No registration to browse truncated info; an email-based request is needed to view unredacted URLs or attachments.
API Usage: Researchers must apply for an API key by emailing Lumen’s team with a brief description of intended usage.
Partial Coverage: Only notices shared voluntarily by participating platforms (Google, Vimeo, Medium, etc.) or directly submitted by third parties. Some sites do not contribute data, and others may stop contributing.
Redacted Fields: Personally identifying information and entire text explanations may be redacted. For unregistered visitors, full URLs are truncated.
No Bulk Export via Website: For large-scale or automated retrieval, you must use the API.
Date & Result Limits: Extremely large or unfiltered searches might be capped or require date-slicing.
No Guarantee of Accuracy: Lumen does not confirm or endorse the validity of a notice; some notices may be fraudulent or contain misinformation.
Coverage gaps: Twitter/X paused data-sharing on 15 Apr 2023; some other services (e.g. Stack Exchange) stopped years earlier.
Google omits sender names from defamation notices for privacy reasons, so those fields will read “Redacted”.
Although the software is GPL-2.0, individual notice texts remain under the terms set by the submitter; bulk redistribution of raw data may require permission.
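The date-slicing mentioned in the limits above can be sketched as a small helper that splits a long period into shorter windows, each formatted as a millisecond range like the date_received_facet value used in the sample query. The function names and the 30-day window are illustrative choices, not Lumen requirements.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ms(moment):
    """Convert a timezone-aware datetime to integer Unix milliseconds."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def date_slices(start, end, days=30):
    """Split [start, end] into consecutive windows of at most `days` days.

    Each window is returned as a "start_ms..end_ms" string.
    """
    step = timedelta(days=days)
    slices = []
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        slices.append(f"{to_ms(cursor)}..{to_ms(upper)}")
        cursor = upper
    return slices


windows = date_slices(
    datetime(2023, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(len(windows), windows[0])
```

Adjacent windows share a boundary timestamp, so a notice received exactly on a boundary could appear in two slices; de-duplicating collected notices by id avoids double counting.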
Potential Privacy Risks: Notices sometimes include sensitive info (names, allegations, etc.). Even though Lumen redacts personal data, some details may still appear in the body or attachments. Handle carefully.
Possibility of Misuse: Some takedown requests are abusive or ‘fake DMCA’ attempts, aiming to silence speech or censor legitimate content.
Caution with Publication: If you cite Lumen notices, consider verifying with additional sources. A notice alone is not proof of wrongdoing or infringement.
*Certain elements of the user interface may have been altered since publishing
Current major contributors (2025): Google Search, YouTube, Meta (Facebook & Instagram), GitHub, Reddit, Wikipedia, Medium, Vimeo, DuckDuckGo, WordPress, University of California systems. Twitter/X is not currently contributing.
Martin Sona
Steve Vondran, YouTube.*
Dan Morrill, YouTube.
A research-focused center studying the intersection of law, technology, and society.