Searching for Sadness in New York: Is the Foursquare API Living Up to Its Potential?
As explained in this blog post, Foursquare needed a way for its business staff to run reports based on its data without slowing down production servers and without learning technologies such as Scala and MongoDB. The company decided to make its data available to business staff through a Hadoop cluster hosted by Amazon Web Services. Foursquare’s data miners could then query it using Hive, which provides a SQL-like query language for Hadoop.
As a proof of concept, the company has produced a report on the rudest cities in the world, based on the number of tips that contain profanity. That's pretty cool (apart from the assumption that profanity use = rudeness), but it makes me realize just how under-utilized geolocation APIs are.
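To make the idea concrete, here is a minimal Python sketch of that kind of report. It is not Foursquare's actual Hive query, which the post does not show. The word list, the function name `profanity_rate_by_city`, and the sample tips are all made up for illustration, and dividing by the total number of tips per city is my own assumption about how you might normalize the counts.

```python
import re
from collections import defaultdict

# Placeholder word list (assumption); a real report would use a much larger lexicon.
PROFANITY = {"darn", "heck"}

def profanity_rate_by_city(tips):
    """Take (city, tip_text) pairs and return {city: share of tips with a listed word}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for city, text in tips:
        # Lowercase and split into words so matching ignores case and punctuation.
        words = set(re.findall(r"[a-z']+", text.lower()))
        totals[city] += 1
        if words & PROFANITY:
            flagged[city] += 1
    return {city: flagged[city] / totals[city] for city in totals}

# Made-up sample tips, for illustration only.
sample_tips = [
    ("City A", "Darn good coffee"),
    ("City A", "Try the pie"),
    ("City B", "What the heck is this line"),
    ("City B", "Heck of a view"),
]

rates = profanity_rate_by_city(sample_tips)
# Rank cities from highest to lowest share of flagged tips.
for city, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(city, rate)
```

In Hive, the same thing would presumably be a GROUP BY over a tips table, which is the sort of query a business analyst can write without touching Scala or MongoDB.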
Here are the results of Foursquare’s profanity-mining:
And here’s how Foursquare’s data analysis system works:
Some more practical business applications for the data mining staff might include answering questions such as:
Which venues are fakes or duplicates (so we can delete them), what areas of the country are drawn to which kinds of venues (so we can help them promote themselves), and what are the demographics of our users in Belgium (so we can surface useful information)?
Of course, this sort of check-in data is available only for Foursquare’s internal use. But it makes me wonder whether you could pull together information like this through the Foursquare API if you built your own data warehouse for analysis.
I wonder what services like Fourwhere (which we covered here) could learn if they cached all the data they retrieved from various APIs and ran sentiment analysis on it. What could MisoTrendy (coverage) tell us about a venue based on its long-term trend patterns? Is there something in Foursquare’s terms of service that prevents people from doing this? I guess we’re back to that old question: what would you do with the massive data sets from persistent location tracking?
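As a rough sketch of what "cache it and analyze it" might look like, the Python below stores tips in a local SQLite table and then filters one city's tips against a tiny word list. Everything here is hypothetical: the table layout, the function names, the `SAD_WORDS` list, and the sample rows are my own, and it does not call the Foursquare API at all. Matching a handful of words is a crude stand-in for real sentiment analysis, and whether the terms of service allow this kind of caching is exactly the open question above.

```python
import sqlite3

# Tiny placeholder lexicon (assumption); real sentiment analysis needs far more than this.
SAD_WORDS = {"sad", "lonely", "miss", "closed"}

def open_cache(path=":memory:"):
    """Open (or create) a local cache of tips. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tips ("
        "tip_id TEXT PRIMARY KEY, city TEXT, venue TEXT, text TEXT)"
    )
    return conn

def cache_tips(conn, tips):
    """Store (tip_id, city, venue, text) rows retrieved from an API."""
    # INSERT OR IGNORE keeps repeated fetches from duplicating rows.
    conn.executemany("INSERT OR IGNORE INTO tips VALUES (?, ?, ?, ?)", tips)
    conn.commit()

def search_for_sadness(conn, city):
    """Return (venue, text) for cached tips in a city that contain a listed word."""
    rows = conn.execute("SELECT venue, text FROM tips WHERE city = ?", (city,))
    hits = []
    for venue, text in rows:
        words = {w.strip(".,!?").lower() for w in text.split()}
        if words & SAD_WORDS:
            hits.append((venue, text))
    return hits

# Made-up rows, for illustration only; the third repeats the first on purpose.
conn = open_cache()
cache_tips(conn, [
    ("t1", "New York", "Example Diner", "So sad this place closed."),
    ("t2", "New York", "Example Park", "Great spot for lunch"),
    ("t1", "New York", "Example Diner", "So sad this place closed."),
])
print(search_for_sadness(conn, "New York"))
```

Keeping the raw text in the cache, rather than only a score, means you could rerun the analysis later with a better lexicon or a proper classifier without fetching anything again.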
This feels like it could be a first step toward accomplishing what was described in the opening lines of the Headmap Manifesto:
there are notes in boxes that are empty
every room has an accessible history
every place has emotional attachments you can open and
you can search for sadness in new york