Getting Started in Data Journalism

My friends at PenPlusBytes asked me to speak at their second annual bootcamp for student journalists.  There are many people doing great work in this field, so I drew on their experience to build a short talk. I gave my thoughts, examples of techniques for data-driven journalism, and some tips & tricks.  Working with journalists has always been part of the Data Therapy project, but it was nice to get a chance to focus on it more!

Here is an audio recording of the talk, and the Prezi I used to show visuals.

 

 

Another Webinar Coming Up on Wed 2/29!

Register Now!

Is your data getting you down? Clunking up your reports? Stressing you and your colleagues out? If so and you’d like some help, we’d love to cure your data presentation woes at our “Data Therapy” webinar. MIT researcher Rahul Bhargava will walk you through some of the best practices for making creative presentations of your data.

Topics will include:

  • selecting an appropriate data presentation technique based on your goals and audience
  • developing criteria for evaluating data presentations
  • implementing compelling methods that will grab your audience’s attention
  • tips and tricks for designing your presentation

This webinar is brought to you by the MIT Center for Civic Media and the Regional Center for Healthy Communities (Metrowest).  After registering you will receive a confirmation email with details about the webinar.


Webinar Q&A

A few folks have requested the questions and answers from the webinar.  Since the recording failed (grrrrrrrrr), below are the questions and what I remember of the responses I gave.

Q: Interestingly, your diverging colors went from red to green, which itself is worth noting.

This refers to the first chart in my presentation, a bar graph of a diverging dataset. It showed people’s responses to the question “How comfortable do you feel right now with your ability to present your data?”. I viewed “I’m lost” as a negative response, and “I’m an expert” as a positive response, so I decided to color it from red to green, building on the widely used traffic light color scheme.  I got the specific colors from the ColorBrewer tool, where I found a nice palette for diverging data with five values.
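For anyone who wants to reuse the idea in their own charts, here is a rough sketch of pairing five ordered responses with a five-color diverging palette. The hex values are ColorBrewer's 5-class "RdYlGn" scheme as I recall them; that is an assumption, since I'm not certain it is the exact palette from the talk, so check the colors against the tool before relying on them. The function name is made up for illustration.

```python
# Survey responses, ordered from most negative to most positive.
RESPONSES = [
    "I'm completely lost",
    "I'm a little lost",
    "I'm ok, but could use more help",
    "I'm pretty good",
    "I'm an expert",
]

# ASSUMPTION: ColorBrewer 5-class "RdYlGn" diverging palette (red -> green).
# Verify these hex codes on the ColorBrewer site before using them.
RDYLGN_5 = ["#d7191c", "#fdae61", "#ffffbf", "#a6d96a", "#1a9641"]


def colors_for_responses(responses, palette):
    """Pair each ordered response with the palette color at the same position."""
    if len(responses) != len(palette):
        raise ValueError("need exactly one color per response")
    return dict(zip(responses, palette))


RESPONSE_COLORS = colors_for_responses(RESPONSES, RDYLGN_5)
```

The resulting dictionary can be handed to whatever charting tool you use, so each bar gets the color that matches its place on the negative-to-positive scale.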

Q: If we’d like to preserve graphs, we’d rather use open source applications. Do you know of any that will produce creative charts, for example.

I’m a fan of open source, but not addicted to it. There are many open-source options for making charts and graphics. You could use OpenOffice instead of Microsoft Excel. You could use Inkscape instead of Adobe Illustrator. You could use GIMP instead of Adobe Photoshop. If you’re a super data nerd, the R Project is totally open too (for complex data analysis tasks).  However, if your goal is to ensure you can open and edit files down the road, you could think about saving the files in open formats.  For instance, Illustrator works well with the open EPS format (and nearly everything supports EPS).

Q: How many people are in this webinar?

There were about 50 people in total who moved in and out of the webinar.  Probably about 25 people were there the whole time.

Q: Which technique do you think is more engaging for the general public? What makes people ask things and engage in conversation when presenting data?

It depends on the setting you’re engaging the public in (i.e. knowing your audience).  I’m a big fan of physicalizing your data for public display.  Embedding your data in a physical object is a great way to make it something people can interact with and react to.  You need some kind of spark to get people interested. Of course doing an interactive game can also be fun – I have a healthy respect for the near-universal appeal of carnival games. A real key is that you have to design for multiple levels of engagement… one person may be interested in a poster, while another may want to speak with you. This is hard, but there are lots of lessons to be learned from museum exhibits, which have a similar problem. You have to design your presentation for both 30 seconds and 20 minutes of engagement.  Focusing on the first 30 seconds is key, because that wins you the time to elaborate to someone.  Don’t forget the crowd dynamics too – people always go to where there is a crowd.  So once you have one or two people engaged with your data presentation, more will come over.

Q: what’s the source from that religion x writing proficiency level graph?

When talking about building to complexity, I reference an example from the OkCupid Blog. The specific post I pull from is called “The Real Stuff White People Like”.  At the bottom of that post you can find the graph about religious beliefs and writing proficiency.

Q: Does this color brewer is colorblind proof?

When talking about colors, I mentioned the amazing resource called ColorBrewer.  On their color selection tool you can see that they do have a checkbox for “color blind safe” palettes.  The tool was made for cartographers, but their excellent color palettes are useful for almost any visual data presentation.

Q: Sorry, what was the second color tool that you mentioned after colorbrewer?

For graphic design (i.e. not data representation) color schemes, I like to use ColourLovers. You can think of it as Twitter for colors.  Graphic designers post their favorite color palettes and then you can use them to get nice color schemes in your designs.  This is one of the “shortcut” online tools I like to point to that helps alleviate the tip-of-the-iceberg problem, because data presentation does build on so many fields of study.  Steal a palette from ColourLovers and you don’t have to be a graphic design whiz!

Q: where does this mapping software come from?

The map example I showed was made with the ManyEyes tool. It has a bit of a learning curve, but basically you can upload a spreadsheet and make a map out of the data it contains (among other things).

Q: when you think of creative charts, are you thinking that these are things to do with vector graphics? Or, that have to be done by hand?

As I mentioned a few times, I think that hand-made charts and pictures are vastly underrated in northern cultures. We have a tendency to think that unless graphics are “professional” looking, they can’t be accurate.  We need to get over this.  Hand-drawn things are often easier to engage with, and can help break down any intimidation that a novice viewer may have. They work especially well for the general public, where you need to be concerned about the accessibility of your presentation to audiences with lots of different levels of visual literacy.

Q: any suggestions on map related tools beyond many-eyes?

It depends on the need.  For mapping things, another great tool is BatchGeo.  One of the things that distinguishes it from other “geo-location” tools on the web is that it does more than just put points on a map.  If you zoom out it will aggregate the geo-located points based on some attributes you select from the data you upload.  This can be really useful for seeing geographic clusters or breakdowns of data.

Q: Perhaps i missed, but what tool have you used for your presentation?

I created the presentation with Prezi.  I find it to be a great tool for visuals to help with telling your story.  The Microsoft PowerPoint model of flipping between slides is rooted in the days of overhead projectors.  Many of the constraints it enforces aren’t valid anymore. I find Prezi to be a better fit for non-linear storytelling, and for sharing the path of a presentation with the viewers.  The animations can sometimes make you dizzy, but it is otherwise easy to learn and free.

Q: What about http://geocommons.com ? Is is to complicated?

I like GeoCommons a lot!  I think it lets you make maps quickly and is very powerful.  That said, I find that the quicker path for Massachusetts maps by town is ManyEyes – they have a special feature just for that.  I recommend GeoCommons for people who need to do deeper analysis – not just make a quick map.

Q: Have you seen any of these techniques work better than others when presenting data in a written report (e.g. grant report)?

I’ll stress again here that it is really about your audience.  If you are writing a grant report for your funders, you can assume that they are already invested in your project, but don’t have a lot of time.  Some successful examples I’ve seen recently have mixed qualitative and quantitative data (which is always a good thing).  For instance, you could share the story of one person impacted by some program and include their picture, and underneath you could have some high-level stats about the total population you’ve reached out to. These types of combinations can make scanning easier, but also allow for deeper dives into the material for those who are interested.  To summarize, start by identifying your audience and seeing where they are coming from, then come up with something creative!

Measuring Our Impact

Like many organizations, we continue to work on how to best evaluate the impact of the trainings we do.  As a baseline at any workshop we do a pre/post survey asking people about their comfort with presenting their data.  Here are the results of our recent webinar and summer workshops.

November Webinar

This is a little crayon chart of people’s answers to the question “How comfortable do you feel right now with your ability to present your data?”.  Their answers before the webinar are on top, and their answers after the webinar are below.

You can see that a lot of people moved from the “I’m lost” side towards the “I’m an expert” side! So our webinar increased people’s comfort level in their ability to creatively present their data.  This is really helpful feedback for us.  Longer term impacts are harder to judge, because folks haven’t taken the ideas back to their data presentation problems yet.

July Workshops

That said, after holding the July workshops we followed up with the same question in August – so we got people’s answers to the same question before the workshop, after it, and one month later.  Here are those results, again to the question “How comfortable do you feel right now with your ability to present your data?”:

Same result – people were more comfortable with the idea of presenting their data after the workshop.  On top of that, we are seeing the impact being “sticky” – even one month later people still feel more comfortable.

Concrete Examples of Impact?

Of course, measuring people’s confidence is just one way to think about this.  I’m particularly interested in it because other data tells me that a lack of comfort and confidence is a big barrier to trying out some of these techniques.  However, this question doesn’t address the concrete impacts of the training.

Towards that goal we tried to collect some success stories to hear about the impact on real data presentations… but we only got a few responses.  One participant got inspiration from the evocative image technique, saying:

I was struck by the example in the workshop where the boring health pamphlet stand was juxtaposed with the attractive ice cream machine, and used a similar approach for designing our advocacy piece. This piece will involve showing sets of side-by-side pictures comparing the environments of housed and homeless children, punctuated each time by a line graph that progressively shows how each environmental factor widens the gap in educational preparedness.

Doing assessment is always a challenge, but we feel like we’re off to a good start by integrating a variety of simple forms into our work already.  If we think of other novel ways to collect and present this you’ll hear about it for sure!

PS: for reference, here is my diverging-data crayon color palette of choice:

  • I’m an expert – shamrock
  • I’m pretty good – magic mint
  • I’m ok, but could use more help – orange
  • I’m a little lost – purple pizazz
  • I’m completely lost – maroon

Webinar Follow-Up

Thanks to all those who attended – I think the webinar was a success!  We had about 50 people join us, and discussed a lot of great questions.  It went well, except the computer we set up to record it crashed!  So I’m upset that we don’t have a recording. In any case, below is some follow-up information.

Presentation

You can click to see the presentation I made on prezi.com, but here it is too:

Please note that this isn’t meant to stand alone as a presentation, but can serve as a handy reference if you want to remind yourself or share the ideas with someone.

Tools & Resources

Here are links to the things I mentioned:

  • Prezi: Present content in a way that more closely models how presenters and audiences understand things
  • BatchGeo: Quickly go from a spreadsheet of addresses to a map, including options to aggregate content by some field of your data
  • ColorBrewer2: Pick color palettes appropriately based on the type of data you are showing (then use those colors in Excel)
  • ColourLovers: Find nice color palettes for your graphic design projects (I use this one all the time!)
  • Many Eyes: Upload your data and create lots of different types of interactive visualizations of it
  • Comic Life / SuperLame: Create comic strips by adding talking bubbles to your photos
  • Handmade Visualizations: Use regular craft tools to bring your data representation back into the real world
  • Tagxedo / Wordle: Create a picture out of a large body of text, where the size of each word is determined by the number of times that word is used in the text (read some more thoughts I have about this)
  • Jing: Make narrated videos of your visualizations, and mark up screenshots with text and arrows
  • Google Fusion Tables: Upload your data in spreadsheet form and visualize it in lots of different ways
  • Visualizing Information for Advocacy: Guide to creating info-graphics to support activism
  • Tools for Online Storytelling: Big list of websites that help you tell your story online with various media
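If you are curious what the word-picture tools in the list above are doing under the hood, the core idea is just counting words and scaling each word's size by its count. Here is a minimal sketch of that idea. The function name, the font-size range, and the linear scaling are my own assumptions for illustration; the real tools almost certainly do more (dropping common words, laying words out, and so on).

```python
import re
from collections import Counter


def word_sizes(text, min_size=10, max_size=60):
    """Return a {word: font size} mapping, scaled linearly by word count.

    min_size and max_size are hypothetical font sizes, not values taken
    from any particular word-cloud tool.
    """
    # Lowercase the text and pull out simple words (letters, optional apostrophe).
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # The most frequent word gets max_size; rarer words shrink toward min_size.
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


sizes = word_sizes("data tells a story and data starts a conversation")
```

In this example "data" and "a" each appear twice, so they get the largest size, and every other word gets a smaller one.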

Feedback

If you attended, please leave a comment with any feedback or ideas.  This was our first webinar, so we’re open to suggestions about the format, content, or anything else.  We’re trying to measure our impact, so I’ll have another post coming soon about our assessment results from this webinar and previous workshops.

Upcoming Webinar

Register Now!

Is your data getting you down? Clunking up your reports? Stressing you and your colleagues out? If so and you’d like some help, we’d love to cure your data presentation woes at our “Data Therapy” webinar. MIT researcher Rahul Bhargava will walk you through some of the best practices for making creative presentations of your data.

Topics will include:

  • selecting an appropriate data presentation technique based on your goals and audience
  • developing criteria for evaluating data presentations
  • implementing compelling methods that will grab your audience’s attention
  • tips and tricks for designing your presentation

This webinar is brought to you by the MIT Center for Civic Media and the Regional Center for Healthy Communities (Metrowest).  After registering you will receive a confirmation email with details about the webinar.
