January 26 2023 GM

From TCU Wiki
Glitter Meetups

Glitter Meetup is the weekly town hall of the Internet Freedom community at the IF Square on the TCU Mattermost, at 9am EST / 2pm UTC. Do you need an invite? Learn how to get one here.

How to use the world's largest online library of secrets

We will introduce Distributed Denial of Secrets (DDoSecrets), a group publishing large datasets which come from either leaks or hacks. Most of DDoSecrets' datasets are available to the public for download via peer-to-peer decentralized torrents. To protect the privacy of regular people who get caught up in data breaches, some of our data must first be requested. Since launching in 2018, DDoSecrets has published more than 100 terabytes of data that has taught us about international money laundering, fascist organizing, tax havens and the people who use them, military and police corruption, surveillance and attacks against journalists and activists, and much more.

Bio: Lorax Horne is an editor of the Spanish site of DDoSecrets

Notes

Could you introduce yourself to the members who are entering the channel?
  • Lorax (@lorax on Mattermost): I'm Lorax Horne. I am one of the journalist members of DDoSecrets. I used to be on the board of advisors, and now I work full time answering queries and talking to other journalists who want to know how to use the archive. I've worked before in newsrooms in Ecuador and Canada, and I have been a freelancer for most of my journalism career.
  • Annalise (@annalise on Mattermost): I'm Annalise, I serve on the advisory board for DDoSecrets. I joined the collective back in 2019, and I'm based in Washington, DC. My day job focuses on anticorruption/transparency so quite a few of the DDoSecrets datasets are very relevant to my work (check out our Bankers Box series!). I help with coordination and fundraising, and try to spread the word of our collective as much as possible.
What is DDoSecrets.com?
  • DDoSecrets is a non-profit publisher that started in 2018 to archive datasets that are released by whistleblowers and by hackers. A lot of data circulates on the internet without a stable place to find it. A lot of hacking forums exist where people can buy or trade for data.
  • We believe this data should be able to be used by the public and by journalists, who often can’t afford to pay for data that is being sold on forums. DDoSecrets.com is a stable repository for data that we think is useful for journalism and research. We work with sources to receive and process datasets.
  • Some examples of things that we have published over the years include the 32 terabytes of videos that were archived from the social media site Parler after the coup attempt on January 6, and, most recently, the 90 megabyte spreadsheet of the No Fly List, with names of people who are banned from getting on airplanes because they are on an FBI terrorism watch list.
  • You can check out all of our releases here.
  • Our archive has been pretty active in recent years. Last year, a new hacktivist group called Guacamaya emerged. They first attacked mining and extractivist companies in Latin America, and then focused on the military and police, who often act as the security forces of the mining and oil companies. Lorax says: Some of the journalism I did in my early career was about oil extraction in Ecuador, trying to convince politicians to tell me what was really happening, struggling for years to get a document or a straight answer from people who were destroying the rainforest. As one example, Guacamaya just handed us an archive of all the emails from the Ecuador mining ministry, which decides which land parcels get sold off for mineral rights to international companies.
  • In a later Guacamaya release, we saw six terabytes of emails from the Mexican military, which uncovered things like who was responsible for the massacre of the students in Ayotzinapa, sexual assault in the military, and the use of Pegasus software to spy on journalists and activists…
About your release BlueLeaks, which covers information on local police across the US, do you know if it has caused any change in the system? Was there a collaboration with a local newspaper to talk about their police forces?
  • Annalise says: I think I mentioned before, I'm particularly fond of our Bankers Box series, which includes datasets from various company service providers, investment firms, and corporate registries once restricted from public access. It shines a light on the murky underworld of illicit finance, like our 29Leaks release - a company service provider (Formations House) in the UK that was setting up shell companies to facilitate money laundering. Also, the Merit Kapital/Merit Servus release is extremely interesting. A few outlets have already released stories of how this Cypriot company service provider was creating layers of shell/holding companies for oligarchs like Roman Abramovich.
  • To this, Lorax adds: BlueLeaks had a few effects, but it was also heavily censored. Twitter banned all of our links after the release of BlueLeaks, and it is still impossible to share our releases on Twitter.com. Our public data server with our search engine was also seized in Germany after the release of BlueLeaks, which made it harder for people to use that dataset for research. Still, it has had some long term effects. A bill was put forward in Maine to shut down their fusion center, citing BlueLeaks documents. It's not very often that you see legislators offering to shut down law enforcement structures, so that was definitely impactful, even though the bill did not pass.
Considering there is this recent buzz around the Twitter Files and internal reporting on how censorship was working... Do you know if your case has been covered in some of these files?
  • Working in the anticorruption community, I've seen one document from BlueLeaks really make the rounds - it was an internal FBI memo stating that US private equity is at high risk for money laundering. It was one of the first definitive acknowledgements from a federal agency of the risk of private equity. It's been really helpful in my own advocacy work with trying to get anti-money laundering/due diligence rules passed for investment advisors.
  • The Twitter Files reporting has not covered the banning of DDoSecrets at all.
Could you explain how to participate in DDoSecrets? How journalists can contact you for access to your archives or how a source could contact you?
  • We are a community project, so we are always looking for volunteers who might want to join our board to provide advice to the collective, and people who can download our torrents and seed them so that others can also download and use our data. And we are a non-profit, so people can donate to us using this link to help us continue to run.
  • Some of the long term projects we would like to do include translating our documentation at ddosecrets.com into new languages, so that the datasets can reach the most appropriate local audiences. We would love to meet more people from Localization Lab, we know you do this work extremely well.
  • And we are always looking for new sources, so if you are a whistleblower or a security researcher and you come across a large collection of documents that you think the public should know about, get in touch.
  • So, it'd be good to explain that while we have a lot of datasets available for public access, some contain sensitive/personally identifiable information (PII), so we restrict access to them (limited distribution datasets). We ask that journalists/researchers/academics write to us explaining their request.
  • Because of this issue we have had with censorship after BlueLeaks, we encourage most people who download our datasets to use the torrents. This is the best protection against censorship that we know of: the crowd hosts the data, and each person contributes their copy of it to help other people download it too.
You already mentioned torrents, but what other tools or software do you use to share this information and stay safe?
  • We prefer end-to-end encryption when talking to sources, so the main point of contact for sources has been using Cwtch messenger, a project by Open Privacy. We also use Signal, PGP, and other encrypted chat services, depending on what the source requires.
  • We also use things like Dangerzone, to open PDFs, and OnionShare, to send documents over the Tor network.
  • Other tools we use and recommend are listed here.
  • For stuff like our Cyberwar category where there could be malicious links and such, it's always recommended to go through those files in a virtual machine. But we partner with groups like OCCRP who host some of our datasets in their search engine, Aleph. We actually relaunched our instance of Aleph pretty recently. It OCRs the docs and makes it easy to search across multiple datasets simultaneously.
  • We have our version of Aleph here. It doesn't have all of our data loaded in it yet, mainly the most recent Guacamaya hacks, BlueLeaks, and one or two of the Russian datasets, but it's growing.
What does the community of users look like around DDoSecrets? Do journalists consult it when working on a project/to find a potential story idea? Do you flag materials that might be of interest to your networks? All of the above + much more?
  • Lorax answers: I would say a lot of the people who use the datasets are journalists, as well as academic researchers, legal teams, government agencies, and regular people who publish research. We don't have a large capacity to give out suggestions to others for stories to find in the datasets. We encourage people to learn how to use the datasets by doing.
  • Annalise adds: I've been able to flag a few leads for outlets, but it's definitely been from just perusing some of the datasets for my own curiosity. Not often though.
  • When there is a journalist who we know is looking at a particular topic, and we know that this topic is represented in a dataset, we might be able to flag it for them, but this isn't typical, as we often don't review every portion of a dataset before we list it in our archive.
  • We see ourselves as a library and we are like librarians... so, you wouldn't expect a librarian to have read every book and know every footnote, but they might be able to guide you in which book to check out.
  • Annalise says: I think it really is neat to see the global coverage of our datasets, and we've been able to work with groups like Justice for Myanmar after receiving datasets tied to the junta there - they've extensively covered a lot of corporate/financial ties between multinationals and the junta. We also worked with a newspaper in Equatorial Guinea, Diario Rombe, on GQ Leaks.
  • Some of the interesting datasets lately have been limited distribution, like the MeritKapital data, and the No Fly list. While we do publish most of our data publicly, we also have a category of data that is only available upon request by people who have a history of publishing. These requests can come from a journalist, or really from any kind of researcher who has a history of publishing their research.
  • The best email for requests is press@ddosecrets.com or someone can also directly open a ticket at this link.
  • Stuff that has a lot of personal information, like the No Fly list for example, is only available in the request section. And someone doesn’t have to be a published journalist to make a request, but it does help if they have previously published some sort of research that illustrates their ability to handle data.
Do you have some new topics of interest that you could share with us?
  • The MeritKapital data came from a Cyprus firm that sets up shell companies for sanctioned oligarchs.
  • Cellebrite was also a pretty notable recent release.
Do you have datasets that are particularly interesting for people looking at digital security intrusions?
  • The Cellebrite leak is probably a good one to look at for that. It is recent and I have flagged it for the researchers at Amnesty who have covered Cellebrite's abuses in the past.
  • There are also smaller companies like ODIN, which sells its apps to police agencies as an intelligence gathering tool.
Do you have recommendations of ways that newcomers can lend a hand with existing research projects as a way of learning to do this work? How would they connect?
  • We recommend people start with one of our public datasets, download it from a torrent, and see what interesting conclusions they can draw from it. Then maybe publish that research, and get in touch to request one of the more restricted datasets. These can often be even more rewarding to research: because they are not published, fewer people have had the chance to find out what is interesting in them.
How can the community reach you or support your efforts? Could you leave us your emails or social media accounts where we can follow you?
  • You can email Annalise: annalise.burkhart@ddosecrets.com