Storytelling Lecture in Brazil

I just returned from a fascinating week in Belo Horizonte (Brazil), where we ran multiple workshops to build capacity to work with data in creative ways.  The trip was organized by the Office of Strategic Priorities of the State of Minas Gerais (they are members of the MIT Media Lab).  This post is one in a series about the workshops we ran there.

Data Therapy is usually about small hands-on workshops, but the “Storytelling with Data” workshop we scheduled in Belo Horizonte had 700 people sign up!  This event is part of a series of “Conecta” lectures the government has been hosting with guests from the MIT Media Lab.  Since the signups were so strong, we scrambled to find a larger venue and turned it into a lecture!  Clearly there is a need to start conversations and build community around the idea of data-driven storytelling.

Content

Connecting to the Media Lab’s approach, we introduced the ideas of sketching and playing with data as the way to empower people.  We framed it as opportunities to improve your work, to help your colleagues, and to help your community.  We ran through the pieces of the process:

  1. asking yourself some questions to define your audience, goals, etc. (handout)
  2. asking your data some questions, to explore what it is telling you
  3. finding stories in your data, based on some templates of types of stories (handout)
  4. picking a data presentation technique, based on all the previous steps (handout)

Sounds boring when I write it like that, but in fact we have hands-on activities that make it fun along the way. More importantly, these activities open the process to non-data people in empowering ways (i.e. building the concept of “popular data”).  However, because this was a lecture, we were only able to sprinkle in short pair-and-share activities along the way.  These actually got the participants talking to each other.

Here is the English copy of the presentation:

Reflections

I’m not a huge fan of the lecture format – it sort of makes you feel more important than you really are.  One of our key goals was to improve connections within the community in Belo interested in this topic.  It turned out the short pair-and-shares that we did after each section of the talk worked super-well for this… so well that we had a hard time bringing people back after each!

In addition, people responded well to the “you, your organization, your community” framing we laid out.  It let folks who weren’t specifically focused on empowerment still connect to the content.

Movimento Minas wrote up some notes and linked to the presentation in Portuguese.  Raquel Carmargo also posted some pictures on Facebook.

Data-Driven Storytelling for Entrepreneurs

We’ve always included a diverse set of audiences in the Data Therapy workshops.  However, I’ve never had a chance to really connect with the entrepreneurial and startup communities.  This changed last Monday in Belo Horizonte, when we had a chance to run a workshop for members of the first class of startups accepted to the SEED accelerator program run by the state (@SeedMG).  The government created the program to foster an ecosystem of innovation.  They host the startups in a startup-y co-working space, give them a little money, and offer them access to mentors and such for about 6 months.

Content

We introduced them to our basic Data Therapy brand of story-finding and story-telling.  In addition, we introduced some evocative examples to think about a few ways businesses use data:

  • getting useful feedback
  • understanding product use
  • improving product use
  • validating assumptions
  • improving process
  • surprising and delighting customers

This workshop included two participatory components (both learnt from the Tactical Tech Collective last summer):

  • visualization reverse engineering – we hang up visualizations and have small groups walk around trying to identify things like audience, visual technique, data used, goals, etc.
  • convince me – we introduce some sample data, select volunteers to play personas (like CEO, funder, potential customer), and have everyone try to use the data to convince them of a goal

Here’s the presentation content, for any folks that might be interested.

 

Reflections

The group of about 15 entrepreneurs enjoyed the chance to focus on their data problems. In particular, the “convince me” activity sparked a great discussion about how and why data can be used to talk to different audiences.  This connected really well with their natural entrepreneurial instincts to home in on customer personas and narrow focus.  A handful were particularly interested because they had presentations to make to potential investors that day!

The SEED blog has a short writeup of our workshop in Portuguese.

PS: You can tell it is a startup space, because they have ridiculous things like a giant pool of plastic balls you can play in!

Thoughts on “Big Data” & “Small Data”

I’ve seen a lot of writing lately on Big Data vs. Small Data.  I know this is something I should pay attention to, because they are capitalizing words that you usually don’t capitalize! Here are some still-forming thoughts…

Rufus Pollock, Director of the Open Knowledge Foundation, recently wrote on Al Jazeera that:

“Size doesn’t matter.  What matters is having the data, of whatever size, that helps us solve a problem or addresses the question we have – and for many problems and questions, Small Data is enough.”

He argues that Small Data is about the enabling potential of the laptop computer, combined with the communicative ability unleashed by the internet. I was sparked by his post, and others, to jot down some of my own thoughts about these newly capitalized things.

How do I Define Big Data?

Big Data is getting loads of press.  Supporters are focusing in on the idea that ginormous sets of data reveal hidden patterns and truths otherwise impossible to see.  Many critics respond that supporters are missing inherent biases and ignoring ethical considerations, and remind us that the data never holds absolute truths.  In any case, data literacy is on people’s minds, and getting funding.

My working definition of Big Data is focused more on the “how” of it all.  For one, most Big Data projects run on implicit, unknown, or purposely hidden data collection.  Cell phone providers don’t exactly advertise that they are tracking everywhere you go.  Another aspect of the “how” of Big Data is that the datasets are large enough that they require computer-assisted analysis.  You can’t sit down and draw raw Big Data on a piece of paper on a wall.  You have to use tools that perform algorithmic computations on the raw data for you.  And what do people use these tools for?  They try to describe what is going on, and they try to predict what might happen next.

So What Does Small Data Mean to Me?

Small Data is the new term many are using to argue against Big Data – as such it has a malleable definition based on each person’s goal!  For me, Small Data is the thing that community groups have always used to do their work better in a few ways:

  1. Evaluate: Groups use Small Data to evaluate programs so they can improve them
  2. Communicate: Groups use Small Data to communicate about their programs and topics with the public and the communities they serve
  3. Advocate: Groups use Small Data to make evidence-based arguments to those in power

The “how” of Small Data is very different than the ideas I laid out for Big Data.  Small Data runs on explicitly collected data – the data is collected in the open, with notice, and on purpose.  Small Data can be analyzed by interested laypeople.  Small Data doesn’t depend on technology-assisted analysis, but can engage it as appropriate.

So What?

Do my definitions present a useful distinction?  I imagine that is what you’re thinking right now.  Well, for me the primary difference is around the activities I can do to empower people to play with data.  My workshops and projects focus on finding stories, and telling stories, with data.  With Small Data, I have techniques for doing both.  With Big Data, I don’t have good hands-on activities for understanding how to find stories.

I connect this primarily to the fact that Big Data relies on algorithmic investigations, and I haven’t thought about how to get around that.  Algorithms aren’t hands-on.  You can do engaging activities to understand how they work, but not to actually do them.  In addition –  most of the community groups, organizations, and local governments I work with don’t have Big Data problems.

Put those two things together and you’ll see why I don’t focus on Big Data in my work. Philosophically, I want to empower people to use information to make the change they want, and right now that means using Small Data.  That’s my current thought, and guides my current focus.

Workshops in Brazil

Data Therapy is heading off to Brazil for a series of workshops next week!

We’re excited to announce the partnership with the government of Minas Gerais, to deliver a series of workshops and paint a mural in Belo Horizonte.

We’ve worked with our partners in the Office of Strategic Priorities and put together an agenda that includes multiple workshops, public lectures, and a Data Mural (with youth at the PlugMinas school)!

Know someone in Brazil?  Send them this information and see if they can join us!

Being the Data (i.e. data & body syntonicity)

Recently I’ve seen a number of new examples of physically-embodied data presentations – examples where each person participates with their body representing the data that they are.  Using your body to act as the data in this way is not only fun, but reminds me of the work I used to do with the concept of “body syntonicity” here at the MIT Media Lab’s Lifelong Kindergarten group.  Seymour Papert coined this term to describe how children would program and predict a LOGO Turtle’s motion by imagining they were the Turtle (1).

Some kids kick it old school with a real LOGO Turtle at the MIT AI Lab!

A Corporate Example

The first connection I saw recently was a video ad for Prudential while I listened to Pandora Radio.  They are trying to tell a data story about how long people live after retirement, with the goal of getting them to set up a retirement plan with Prudential. The campaign is very appealing from a data-presentation point of view.  In one ad they asked people how much money they thought they needed for retirement, then gave each a length of ribbon, and had them walk from the center of a circle to the length of the ribbon:

Another let people put a sticker on a big chart to build a histogram of the oldest person they knew:

These are cool, and look fun.  Letting people be the data connects them with the information in a real, body-syntonic way.  I’m sure this makes the people more likely to be interested in Prudential’s product offerings and planning services.

An Academic Example

In the academic realm – my colleague Nathan recently went to the Computer-Supported Cooperative Work (CSCW) conference, where he learned about the MyPosition project from Nina Valkanova, Robert Walker and others.  Valkanova’s recent work revolves around concepts of presenting information in public spaces.  Here’s an academic paper describing the MyPosition project.  It allows people to stand in front of a projected poll and add their vote by holding up their hand:

Their findings in the paper around social pressure are interesting, as is the fact that people got around the fancy tech to actually engage in the question they were polling.  Also the idea that people used it more when it showed real people’s faces is interesting.  All in all, it presents a fascinating example and some usable insights into how to design these types of public interactive data presentations.

A Community Example

My colleague Sasha Costanza-Chock recently pointed me at the Crossing Boundaries project from the local Urbano Project.  Artists Alison Kotin and Risa Horn worked with 10 local high school students to gather data about local transit and create art pieces that told the data stories they found.

Their pieces are embodied data sculptures – wearable objects that represented the data story they wanted to tell.  This example is a fantastic piece of empowerment, data literacy, and art work.  I enjoy it in so many ways and look forward to talking with the creators sometime in the future.

Be the Turtle

So what’s the takeaway?  As a young participant in a robotics workshop I ran years ago said – “Be the turtle”.  Think about ways you can engage people to actively be the data in the story you’re trying to tell.

(1) Papert built on Freud’s notion of “ego syntonicity”, which concerned the mind.  This presentation I found online digs into this more in relation to computer programming.

Sketching Data Driven Stories

Last fall Ethan Zuckerman and I were invited to co-teach a “data acquisition and visualization” module for graduate students in the MIT Comparative Media Studies program.  The module was five sessions covering the following topics:

  1. Data visualization, from acquisition to storytelling. Deep dive on data “shopping” – the process of acquisition and interrogation
  2. Workshop on five methods of data scraping and cleaning
  3. A taxonomy of visualization, and sketches of the story we want to tell with data
  4. Workshop on Fusion Tables/Maps and Tableau, and lots of other tools for data visualization
  5. Mapping and Unmapping, the politics of data, and presentation of student work

For session 3, I wanted to dig in on ways to sketch a data-driven story and was reminded of an activity I participated in last summer at the 2013 Info-Activism Camp.  There was a track of visual presentation workshops facilitated by Angela Morelli, Tom Halsør and Mushon Zer Aviv.  One of the activities they ran felt perfect for this class – creating a short paper story book that sketched out the data story.  The goal was to flesh out your story before doing all the work to make a final version.

They started by folding a piece of paper into a ’zine.  Click here for the best instructions I’ve found online.  Here’s one I made based on data about where I spent my time at the Tactical Tech Camp:

As you can see, the folding created a short book.  Writing out my data story in this low-tech way forced me to focus on the narrative structure of my data story, rather than the data details or computational options.  Even if you know you’re going to present it as a single graphic, teasing apart the narrative is a crucial step in crafting a strong story.  This hands-on sketching exercise is one of the best ways I’ve seen to do that.

So I totally copied their technique and used it in session 3 of the module I was describing earlier (thank you all!).  The students dove right into it, drawing the concepts they saw in their own data stories.  It worked pretty well, helping them pull apart the crucial waypoints in their story.

At a deeper level, this activity is another one for learning data literacy and data presentation that pulls from the world of the arts.  I’m constantly trying to build the toolbox of activities that can be used for my Data Therapy workshops, and this one just got added!  I strongly believe the arts are the most fertile ground to borrow from when creating engaging data literacy activities.  My underlying motivation is to push forward a better description of my half-formed “Popular Data” concept.

Have you tried this activity?  Are there others that spring to mind?

Focusing in on the Mural part of Data Murals

Last year we finished our Data Mural pilot projects, and have been very happy with how the evaluation has looked.  We’ve judged things against our logic model, to assess how we’re doing against our desired outcomes. One of the outcomes we listed was a “more beautiful community”.  Now, of course that is subjective, but if you look at the murals we painted I think most would say they are nice community art pieces.

That said, we focused so much on the capacity building outcomes that we didn’t make time to innovate on the artistic output.

I’ve recently been wondering if we can bring some new technologies back into this in a useful way.  One idea was to explore conductive paints in the mural.  I think there’d be some really cool interactions we could make to help tell the data story.  Bringing some of my museum exhibit design experience to bear in this space would be fun. Maybe by picking a handprint to touch on the mural you light up some part of the mural that applies to you?

I dug around and the best example I can find of using conductive paint in a mural is the Light of Human Kindness project (here’s a great video about it).  The community project collected stories of good deeds and built them into a mural and website.  The mural is on a big wall with tons of lights embedded in it, each representing a story.  When you submit a story online one of the lights blinks.  They added an interesting interaction around the idea of people holding hands in front of it.  If a lot of folks hold hands, with one person in the chain touching the mural, then the lights play a pattern.

Yes, it is a lightweight interaction, but it still carries that sense of magic.  I love that they linked the overall message of the mural with the interaction (ie. holding hands and helping each other).  I think it’s time to start doing some experiments here, because conductive paint feels like a way to innovate in support of the data story but still be true to the mural form.

Building Your Toolbox of Techniques

One of the things I emphasize in my workshop is building a toolbox of presentation techniques.  With a toolbox ready at hand, it is a lot less intimidating to pick an appropriate technique for a specific audience and goal.  I’ve defined my own list of techniques, but it is by no means the only way to slice up the space.

One other particularly useful list comes from a classic academic paper called “Narrative Visualization:  Telling Stories with Data” by Edward Segel and Jeffrey Heer (download it here).  The paper meticulously reviews about 60 online visualizations, mostly from newspapers, to define some recurring genres.  If you can stomach the academic prose, the paper is worth a read.

Their “genres” focus on 2-d visual presentations of data stories, to be expected based on the title of the paper and the examples they pull from.  However, within that space it is a particularly wonderful list:

  • magazine style: “an image embedded in a page of text”
  • annotated chart: a traditional chart or graph with textual callouts highlighting specific data points
  • partitioned poster: a “multiple view visualization”
  • flow chart: a directed series of pieces of information
  • comic strip: multiple frames in a linear path
  • slide show: a series of visuals presented one at a time to assemble a narrative
  • film/video/animation: fairly self-descriptive

These vary based on the number of “frames” (visuals) presented, and how they are shown over time.  This list breaks down the set of techniques differently than I usually do, and that’s a nice thing so I thought I’d share it!

From there they move to a discussion of author- vs. reader-driven approaches.  That’s a wonderful reminder to decide early on whether you are building an exploratory or explanatory presentation.  Are you trying to tell a strong narrative, or showing information and letting the viewer take away a story?

Towards a Concept of “Popular Data”

I was recently invited to give a Skype keynote for the first hackathon hosted by the state of Minas Gerais in Brazil.  The talk was a wonderful provocation to revisit the writing of another Brazilian I used to study – Paulo Freire and his vision of popular education.  This led me to wonder… what would a model of “popular data” look like? Answering this requires an agreement that there is a problem, and agreement that the problem merits a popular education approach.  This post is an exploration, so I end by proposing a few grounding principles for a concept of “popular data”.  Is this a useful concept?

The Problem

Governments large and small are speaking of open-data platforms and data-informed decision making.  They share with us a vision of responding to citizen concerns more accurately and efficiently based on data.  Data is a language these governments are speaking, but most people don’t understand it.  This is the core problem that I address with my Data Therapy project.

Can Popular Education Help?

If you don’t speak the language used by your government to make decisions, then you can’t participate in those decisions.  This disempowers people, and popular education is an approach for rectifying disempowering situations.  The city I live in, Somerville, MA, has a program called “ResiStat” that is intended to “bring data-driven discussions and decision-making to residents and promote civic engagement via the internet and regular community meetings”.

This data-centered effort can only engage those that already understand the charts, graphs, and terms they use.  Don’t get me wrong – they don’t deliver a dry academic lecture at their community meetings.  However, they do rapidly run through reams of data analysis with an expectation that most in the audience can handle the information-centered explanation.  This leaves out the many residents who don’t speak data at all.

What is Popular Education?

Philosophical definitions are always debated, but here are a few guiding principles most practitioners of popular education would adhere to:

  • participation from all parties
  • learner guided explorations
  • facilitation over teaching
  • accessibility to a diverse set of learners
  • focus on real problems in the community

If you consider this list a litmus test for governmental data programs, few (if any) would pass.  So how do we change this?

Popular Data?

Now that you’re (hopefully) on board with my problem statement, and the idea that popular education can help, let’s play out how. Popular data is my name for engaging, participatory approaches to data-driven presentation and decision-making.  Not a great name, but from an academic point of view it puts my work in the right family tree so I’ll use it for now. How do you structure data programs to practice popular data? Let’s run through each of the tenets listed above and look at some examples.

Participation from All Parties

Popular Data suggests a “big tent” approach; you should get everyone at the table.  For instance, far too many open-data initiatives end at the release of the data.  The smart ones realize they are the scaffolding for larger efforts, and make a strong effort to bring non-profits, constituents, and the data makers to the table in order to encourage activity around the data.  Sometimes this looks like a hackathon that makes sure to invite lots of segments of society (e.g. the White House hackathon). Sometimes this looks like a presentation of results back to the people the data is about (e.g. Somerville’s ResiStat meetings).  There are lots of ways to involve those in power positions and those outside of them.

Learner Guided Explorations

Most data presentations are about as engaging as a conversation with your dentist! You kind of have to do it, but it’s booooring. Flipping the model invites your audience to find their own stories in the data. My Data Murals work does just that – our initial “story-finding” workshop shares a small portion of the data about a topic and then lets teams of participants find stories they want to tell.  Participants own these stories and advocate for them.  That is an empowerment story – our evaluations show people come away feeling more capable of finding stories in data, and are less intimidated by data in general.

Facilitation Over Teaching

In my Data Therapy workshops I use a number of activities for building visual literacy. All of these are ways to facilitate a discussion of data presentation, and build a shared language for describing data.  When data scientists introduce ideas they too often fall back on big words.  These words alienate those who haven’t studied data.  My first step is to use language a normal person would use.  Then I help the group construct their own language for describing data, which they fully understand.

Accessibility to a Diverse Set of Learners

I spent years designing interactive museum exhibits. Museums are the hardest setting I’ve ever designed for.  At a museum you know nothing about your audience; your object has to support 30 second interactions with a single person, but also 1 hour interactions facilitated by a knowledgeable docent.  This is hard.  Really, really hard.  Data presentations and activities need to be designed the same way. I address this by starting simple, and building to complexity.  In data presentations I break people into small groups and seed each with one person who does speak data to help the other folks understand technical issues.

Focus on Real Problems in the Community

This one is easy! Make the data you are working with or presenting relevant to the communities you are working with. In the workshops I lead in the Boston area, I use the Somerville happiness survey as my silly example data set.  I wouldn’t do that for a group of public health wonks (I’d use something from the WHO).  People are naturally inclined to be engaged about the community they live in – no need to introduce data from some far off community they have no relation to.

Is this Useful?

Ok, so I’ve made my argument – I see every dataset as an opportunity for engagement.  Engagement with the public, the people the data is about, the people who collected it, everyone. If you’re reading this, it’s up to you to use a Popular Data approach to seize the opportunity for engagement a dataset gives you.  I find this framework useful for structuring my data presentations and workshops.  Let me know what you think!  Am I just naming something obvious? Am I being too academic?

crossposted to my Civic Media blog