Data Scraping Resources
One of the more popular topics I teach in my journalism school and newsroom trainings is how to find and scrape data from websites, PDFs and other documents.
It’s a valuable skill for journalists today, as many government officials love locking data in formats that render the information useless, yet still meet public records laws. Requesting it in a spreadsheet or other format may take weeks and even require a FOIA request.
Journalist’s Toolbox dedicates an entire section of its data journalism category to data scraping and cleaning. We also have a video training that teaches you how to scrape data from web pages using nothing more than a basic formula in Google Sheets.
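For readers who would rather script the same idea, here is a minimal sketch in Python. It assumes the Sheets formula in question is something like IMPORTHTML, which pulls the Nth table out of a page; the post does not name the formula, so treat that as a guess. The class name TableExtractor, the function scrape_table and the sample HTML are all made up for illustration, and the sketch does not handle tables nested inside other tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []     # finished tables
        self._table = None   # rows of the table currently being read
        self._row = None     # cells of the row currently being read
        self._cell = None    # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._table.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = None


def scrape_table(html, index=1):
    """Return the index-th table on the page (1-based) as a list of rows."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables[index - 1]


# Hypothetical page content; a real script would download this from a URL.
SAMPLE_HTML = """
<table>
  <tr><th>Agency</th><th>Budget</th></tr>
  <tr><td>Parks</td><td>1,200</td></tr>
  <tr><td>Police</td><td>9,800</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)
for row in rows:
    print(", ".join(row))
```

To run this against a live page, you could fetch the HTML with urllib.request.urlopen from the standard library and pass the decoded text to scrape_table. Check the site's terms of use before scraping it.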
And do you hate PDFs with the power of a thousand suns like I do? This nine-minute video shows you how to scrape a PDF using Tabula (tabula.technology). The benefit of downloading the free Tabula app is that you can scrape from your desktop rather than uploading the document to a web tool like CometDocs or ScraperWiki. Tabula is very popular among investigative reporters and editors.
We also have a full video playlist of trainings on how to find, scrape and post data to the web.
Hey, Can I Use That Photo on My Site, Prof. Reilley?
My answer is usually no, because it’s protected by copyright law. Even with fair use, taking photos off other sites is a dangerous game to play. But there are ways to find rights-free photos online, which I cover in this short video.
Around the Web
10 Journalism Jobs and a Photo of My Dog is a curated newsletter list of journalism jobs from across the U.S. with links back to the original source so you can follow up with potential employers. You also get a pic of a cute yellow lab with each newsletter, which drops on Monday mornings.
Our.news has a browser extension for fact-checking and flagging misinformation. Find more fact-checking tools.
Kapwing is a free and simple online video editor. Add subtitles to videos, turn articles into social media posts, combine clips, edit recordings and turn audio into video.
The Journalism Education Association posted a handy little guide on how to conduct remote interviews.
In Quotes …
“We must not confuse dissent with disloyalty.”
— Edward R. Murrow
Copyright 2020 | Society of Professional Journalists