Toolbox Newsletter, Issue VIII

Sept. 8, 2020

Data Scraping Resources

One of the more popular topics I teach in my journalism school and newsroom trainings is how to find and scrape data from websites, .PDFs and other documents.

It’s a valuable skill for journalists today, as many government officials love locking data in formats that render the information useless, yet still meet public records laws. Requesting it in a spreadsheet or other format may take weeks and even require a FOIA request.

Journalist’s Toolbox dedicates an entire section of its data journalism category to data scraping and cleaning. We also have a video training that teaches you how to scrape data from web pages using nothing more than a basic formula in Google Sheets.
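If you'd rather see the idea in code than in a spreadsheet, here is a minimal sketch in Python. It is not the formula from the video; it only illustrates the same concept of pulling the rows of an HTML table into something you can work with. The names TableScraper, scrape_table and SAMPLE_HTML are made up for this example, and the sample table is invented data. It uses only Python's standard library and assumes a simple table with no nested tables.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper for illustration; assumes no nested tables.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None when outside a <tr>
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table cells found in an HTML string as a list of rows."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample data standing in for a downloaded web page.
SAMPLE_HTML = """
<table>
  <tr><th>Agency</th><th>Budget</th></tr>
  <tr><td>Parks</td><td>1,200</td></tr>
  <tr><td>Libraries</td><td>950</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

From there, the rows could be written out with Python's csv module and opened in any spreadsheet program. Real pages are often messier than this sample, so treat it as a starting point rather than a finished scraper.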

And do you hate .PDFs with the power of a thousand suns like I do? This nine-minute video shows you how to scrape a .PDF using Tabula. The benefit of downloading the free Tabula is that you can scrape from your desktop rather than uploading the document to a web tool like CometDocs or ScraperWiki. Tabula is very popular among investigative reporters and editors.

We also have a full video playlist of trainings on how to find, scrape and post data to the web.

Hey, Can I Use That Photo on My Site, Prof. Reilley?

My answer is usually no, because it’s protected by copyright law. Even with fair use, taking photos off other sites is a dangerous game to play. But there are ways to find rights-free photos online, which I cover in a short video.

Around the Web

10 Journalism Jobs and a Photo of My Dog is a curated newsletter list of journalism jobs from across the U.S. with links back to the original source so you can follow up with potential employers. You also get a pic of a cute yellow lab with each newsletter, which drops on Monday mornings.

Other tools:

  • has a browser extension for fact-checking and flagging misinformation. Find more fact-checking tools.

  • Kapwing is a free and simple online video editor. Add subtitles to videos, turn articles into social media, combine clips, edit recordings and turn audio into video.

  • The Journalism Education Association posted a handy little guide on how to conduct remote interviews.

In Quotes …

“We must not confuse dissent with disloyalty.”

— Edward R. Murrow


Follow us: @journtoolbox | JournalistsToolbox | Subscribe on YouTube

Copyright 2020 | Society of Professional Journalists