ProPublica

Journalism in the Public Interest

The ProPublica Nerd Blog

Heart of Nerd Darkness: Why Updating Dollars for Docs Was so Difficult

Photo by Dan Nguyen/ProPublica

Last week we published a big update to Dollars for Docs, our interactive news application tracking payments made to U.S. health care providers by 15 pharmaceutical companies. Compared to when we launched the project in 2010, the amount of data we’re collecting has grown enormously: The list of payments increased from around 750,000 to almost 2 million, and the grand total of the payments grew from around $750 million to just under $2 billion.

Compiling the data for it has been an enormous project right from the beginning. After we published the first version, the original developer on the project, Dan Nguyen, compiled all of the things he had to learn into a guide to scraping data. This year’s update took more than eight months of full-time work by me, working with other news-app developers, and at times with our CAR team, a researcher, two editors and two health care reporters. It was a massive effort and presented huge technical and journalistic challenges.

No Windows. One Exit. Free Drinks: Casino-Driven Design for Crowdsourcing

ProPublica Honored with Best Map, Two Medals at Malofiej 21

Between Human and Machine: Thoughts on Malofiej 21 Day 2

Outsider CAR: Quick Thoughts on Malofiej 21 Day 1

Casey Thomas, P5 Resident

Other Ways to Serve an App

Everything You’ve Ever Wanted to Know About Our News App Tech

ProPublica’s News Apps and Data Guides

RIP EveryBlock

A New Way to ‘Check In’ on Education Inequality

How To Edit 52,000 Stories at Once

New Year’s Resolution: Learn to Code

Use Our Nursing Home Inspect Widget on Your Site

Why (and How) We Use Creative Commons for Our Stories

Anatomy of a News Map

Pair Programming Participant #2: Ricardo Brom

New Open Source Project: Daybreak, a Simple Key/Value Database for Ruby

P5 Project Application

Pair Programming Participant #1: Julius Troeger

Election Day Interactives We’re Watching

How ProPublica’s Message Machine Reverse Engineers Political Microtargeting

Introducing a Free the Files API

Get a Free the Files Widget

Adventure Awaits: Another ProPublica News Apps Fellowship

The ProPublica Pair Programming Project

Knight Foundation Grant to Support ProPublica’s News Applications Desk

How a Map That Wasn’t a Map Became a Map

Useful Code Snippets

Untangling a Web of FEC Data

Introducing StateFace

Some Thoughts on Timelines

Announcing Simpler Tiles

When Are 190 Emails Like Six Emails?

Showing You the Money (Faster)

Introducing Simple Tiles: Our New Mapping Library

Anatomy of a Stepper Graphic

SOPA Opera Update: Opposition Surges

SOPA Opera: Which Legislators Support SOPA and PIPA?

Announcing the ProPublica News Apps Fellowship

Adaptive Design, Fixed Widths and Tablets

Explore Sources: A New Feature to “Show Our Work”

How We Made ProPublica.org Look Better on Your Smartphone

Introducing DocDiver

Facebook for News Apps: How We Harnessed the Social Network for ‘The Opportunity Gap’

TimelineSetter: Easy Timelines From Spreadsheets, Now Open to All

TimelineSetter: A New Way to Display Timelines on the Web

Scraping for Journalism: A Guide for Collecting Data

The Coder’s Cause in “Dollars for Docs”

Chapter 3: Turning PDFs to Text

Chapter 2: Reading Data from Flash Sites

Colophon
