Tips for Using AI When Writing Data Stories

May 12, 2026

Editor’s note: This is an excerpt from the upcoming 2nd edition of “Data + Journalism: A Story-Driven Approach to Learning Data Reporting”. You can preorder the book from Routledge this summer. The book is expected to drop in August, just before classes start.

Generative AI tools should be used with great caution in the reporting and editing process. But over the past few years, newsrooms have found innovative and economical workflows that produce sound data journalism.

An example: the Missing in Chicago project from the Invisible Institute and City Bureau, two Chicago newsrooms. This series of stories on missing and murdered Black women and girls in the Windy City earned the 2024 Pulitzer Prize for Local Reporting.

The newsrooms built a custom machine learning tool named “Judy” to comb through thousands of police misconduct records that showed bias and systemic failures. The project also showcased AI’s power in investigative journalism: It prompted legislation for a missing persons task force in Chicago.

According to a story by Harvard’s Nieman Lab, Trina Reynolds-Tyler, the data director at the Invisible Institute, began building the machine learning tool in 2021 using files that spanned 2011 to 2015. Reynolds-Tyler, who shared the Pulitzer with City Bureau reporter Sarah Conway, brought Chicago community members into Judy’s development process. Eventually, 200 volunteers read and manually labeled the misconduct files, creating Judy’s training data.

Exercise: Brainstorm Story Ideas

Getting started: Ask a Large Language Model (LLM) to brainstorm story ideas with you. Assign it a specific role in the prompt:

You are an investigative reporter for an Oregon-based digital publication and you are developing hyperlocal story angles on how AI is draining water from Oregon rivers and lakes to cool AI data centers. Help develop four local angles for the story with links to background research.

Load that prompt into not one LLM, but several: ChatGPT, Gemini, Claude, Microsoft Copilot and Perplexity. The free versions offer good results, but paid models offer deep research options that produce more robust results.

Soon you’ll have a list of ideas, links to resources, even sources if you want them. For research, we strongly recommend Perplexity.ai, whose free version includes five “Pro” (deep research) searches daily. Perplexity by far produces the best research results, because it’s designed primarily as a research tool. It also provides footnotes and a “Sources” tab with links to the material behind its answers.

A word of caution: Always vet the sources when you are doing research. Never pull anything directly from an LLM and use it in a story. Apply your editorial standards and fact-check all unvetted material from AI.
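One way to keep the vetting step organized is to pull every link out of an LLM answer into a checklist that you then verify by hand. The snippet below is a minimal sketch in Python, not a fact-checking tool: the function name `extract_links`, the URL pattern and the sample text are all hypothetical, and the pattern is a rough approximation that may miss or mangle unusual links.

```python
import re

# Rough URL pattern (an assumption for illustration): stops at whitespace,
# closing brackets and quotes so links inside parentheses are not swallowed.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def extract_links(answer_text: str) -> list[str]:
    """Return the unique URLs in an LLM answer, in order of first appearance."""
    seen = set()
    links = []
    for raw in URL_PATTERN.findall(answer_text):
        url = raw.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


# Hypothetical pasted answer; example.org and example.com are placeholders.
sample_answer = (
    "Data centers can use large volumes of water (https://example.org/report). "
    "See also https://example.com/water-use, and again https://example.org/report."
)

# Build a plain-text checklist; every box still has to be ticked by a human.
checklist = [f"[ ] {url}" for url in extract_links(sample_answer)]
print("\n".join(checklist))
```

The script only collects the links. Opening each one, confirming it exists and confirming it says what the LLM claims remains the reporter’s job.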

Exercise: Interview Question Prompts

Reporters always develop a list of questions for experts and other sources they plan to interview, using research as a baseline. But once you have the list done, take it one step further: Write a prompt in a few LLMs (ChatGPT, Claude, etc.) asking each one to generate questions for the source. Make sure your privacy settings are on in each of the LLMs, then build a base prompt from this template:

Base prompt template: You are a #role in #broadtopic. I need you to formulate some questions to ask someone to get them to think about #specifictopic. The questions should not be broad. The user must be able to easily give a specific answer. I need them to be asked in a way that helps them understand what #specifictopic actually is. Real-life examples need to be posed in the question.

The completed, detailed prompt, which you can tailor to fit your role and story topic:

You are an investigative journalist for an Oregon-based digital newsroom covering environmental issues with AI. I need you to formulate some questions to ask an expert source to get them to think about AI draining environmental resources in Oregon to cool data centers. The questions should not be broad. The user must be able to easily give a specific answer. I need them to be asked in a way that helps them understand what environmental issues actually are. Real-life examples need to be posed in the question.
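If you reuse the base template often, you can fill in the #placeholders with a few lines of code instead of retyping the prompt. This is a minimal sketch in Python under stated assumptions: the function name `fill_prompt` and the example values are hypothetical, and the template text is copied from the base prompt above.

```python
import re

# The base prompt template from this exercise, with its #placeholders intact.
BASE_TEMPLATE = (
    "You are a #role in #broadtopic. I need you to formulate some questions "
    "to ask someone to get them to think about #specifictopic. "
    "The questions should not be broad. "
    "The user must be able to easily give a specific answer. "
    "I need them to be asked in a way that helps them understand what "
    "#specifictopic actually is. "
    "Real-life examples need to be posed in the question."
)


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace each #placeholder in the template with its supplied value."""

    def swap(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            # Fail loudly rather than send a prompt with a blank in it.
            raise KeyError(f"No value supplied for #{key}")
        return values[key]

    return re.sub(r"#(\w+)", swap, template)


# Hypothetical values; tailor them to your own role and story topic.
prompt = fill_prompt(
    BASE_TEMPLATE,
    {
        "role": "reporter",
        "broadtopic": "environmental journalism",
        "specifictopic": "how AI data centers use water from Oregon rivers and lakes",
    },
)
print(prompt)
```

The filled-in text can be pasted into each LLM as is. A generated prompt is a starting point, so read it over and adjust the wording the way the completed example above does.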

If you assign a role to the LLM and give it specific details about the topic and audience, you will get a robust result. Typically, my students and I will mine a few questions from each of the LLM results and add them to our interview questions. Personally, I like the questions ChatGPT and Claude build the best.

I’ve had veteran reporters who attend my AI trainings tell me that using AI helps them step back and see a “larger picture” with the questions they craft, particularly beat reporters who have covered a specific topic for many years.

Video

Watch a training video on how to create quick instructional videos from your stories using NotebookLM, a free Google tool:

Sponsor

Be sure to check out the incredible production tools suite with our new sponsor at HeyNota.com

More Tools and Research

Sourcebase
Dashboard to search through curated collections of documents that cite directly to the source material and not the wider web.
AI Tools Box
Searchable listing of various AI tools.
Reference to Video
Turns your static images and source media into high-quality, controllable AI videos in minutes. Access state-of-the-art models like Kling, Sora, and Wan. No complex prompt engineering needed. Privacy-first, watermark-free downloads.
GIJN: New Tools for Geolocation, Collaboration, Illicit Finance, Deportation Data, and Bad-Actor Tracking from NICAR 26
AI Glossary
Database of dozens of AI terms, written in plain language so we can understand them.

Textbooks

Data + Journalism, 2nd Edition

Samantha Sunne and I co-authored the 2nd Edition of the textbook, “Data + Journalism: A Story-Driven Approach to Learning Data Reporting,” which will be available in August through Routledge and other booksellers (pre-order here starting in July). It’s an introductory- to intermediate-level guide to learning data storytelling from A to Z. The second edition features new tools, datasets, exercises and AI tools.

The Journalist’s Toolbox

My book, “The Journalist’s Toolbox: A Guide to Digital Reporting and AI,” was published by Routledge in 2023 and focuses on concepts and tools still used today. You can order it here.

In Quotes …

“By far, the greatest danger of Artificial Intelligence is that people conclude too early that they understand it.” – Eliezer Yudkowsky, co-founder and research fellow at the Machine Intelligence Research Institute

Follow me @itsmikereilley | @journtoolbox | Subscribe on YouTube | Subscribe to this newsletter

JournalistsToolbox.ai
