Tuesday, September 9, 2014

Considering Data

In Asimov's Foundation novels, the character Hari Seldon develops a new science called 'psychohistory', which allows him to predict the course of large-scale events. When I first read those books, they seemed fantastical - no closer to reality than Gulliver's Lilliputians. Then, in 2008, Nate Silver's fivethirtyeight.com called not only the presidential election but every Senate race. How did he do it? Data. Months before the election, Silver had a statistical model that proved eerily accurate, and weeks before the election he had predictions that would give even Asimov goose-flesh.

Where does fivethirtyeight.com think the next generation of data miners will come from? UCLA's DataFest. Give students 48 hours locked away with data and have them try to make sense of it. To sweeten the deal, have the data owners present. To quote fivethirtyeight.com's article: “It’s really rare to get really current data actually being used in the real-live corporate world, so that makes it really special,” Robert Gould, DataFest’s founder and a professor of statistics at UCLA, the host for this weekend’s event, said in a phone interview. “Somehow it’s just not that thrilling for the students to learn all we’ve done is point them to a public data set. There’s something really special to have someone who owns the data present the challenge. It makes the students feel they’re being paid attention to and listened to.”

What a fantastic idea. Create a relatively low-pressure environment, provide loads of data, and just see what undergraduate students can make of it. I've been working with data for so many years, and in such predictable ways, that I sometimes wonder if I can still think outside the box. Big data can show us so much, for good and bad. We've all read the story of Target working out, from data on what newly expectant mothers are likely to buy, that a teen was pregnant before her father even knew. That's the creepier side of big data. On the brighter side, imagine aggregating statistics on childhood cancers, incidents of asthma, the spread of disease, or electricity usage in a region. The next few years promise to be exciting times.
 
