Methodology
Here are the basic steps I took to build this recommendation engine. Initially I wanted to search Twitter for everyone who mentioned SXSW over the course of the festival, but I was limited by the number of search results returned (1,500). So instead I searched Twitter for SXSW plus each of the 1,984 bands listed on sxsw.com. That resulted in 13,735 unique Twitter users and 34,569 messages, with bands mentioned 41,337 times (some messages reference more than one band).
I often forgot to add SXSW to a message, so I didn't want to limit myself to just these messages. Next I searched Twitter and pulled all messages posted during SXSW by each of the 13,735 Twitter users. This took quite a while; I ran the search at night to (hopefully) limit the impact on Twitter. This gave me over a million messages. I then searched those million messages for each of the bands. The end result has 371,501 messages referencing SXSW bands 470,575 times. Messages that don't reference any band are ignored by the Recommendation Engine.
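To make the band-matching step concrete, here is a minimal sketch of how messages could be scanned for band names and tallied. This is not the actual code behind the site: the function names, the (user, text) message format, and the case-insensitive whole-phrase matching rule are all assumptions made for illustration.

```python
import re
from collections import Counter


def build_band_patterns(band_names):
    """Compile one case-insensitive, whole-phrase pattern per band name.

    Assumption: a band counts as referenced only when its full name
    appears, not as part of a longer word.
    """
    return {
        name: re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)", re.IGNORECASE)
        for name in band_names
    }


def find_band_references(messages, band_names):
    """Return (user, band) pairs, one per band found in each message.

    `messages` is an iterable of (user, text) tuples (hypothetical format).
    A single message can yield several pairs if it names several bands.
    """
    patterns = build_band_patterns(band_names)
    references = []
    for user, text in messages:
        for band, pattern in patterns.items():
            if pattern.search(text):
                references.append((user, band))
    return references


def summarize(messages, band_names):
    """Count messages with at least one band, total references, and per-band totals."""
    patterns = build_band_patterns(band_names)
    messages_with_bands = 0
    per_band = Counter()
    for _user, text in messages:
        found = [band for band, pattern in patterns.items() if pattern.search(text)]
        if found:  # messages that name no band are ignored
            messages_with_bands += 1
            per_band.update(found)
    return {
        "messages_with_bands": messages_with_bands,
        "total_references": sum(per_band.values()),
        "per_band": per_band,
    }
```

Because a message can name more than one band, the reference count can be higher than the message count, which is why the totals above differ.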
Limitations
There are a few limitations with my process.
- All band references are considered equal. So if someone likes band X and does not like band Y, the two bands are still considered related within the Recommendation Engine.
- Misspelled or incomplete band names are not matched. So if someone didn't type out Shout Out Out Out in full, their reference is currently ignored.
- Anyone who never included SXSW with any of the bands they twittered about is left out, but I figure this scenario is pretty rare. Unfortunately Twitter doesn't allow cross-message searching, at least not that I'm aware of.
- Retweets are currently counted twice. I'm looking into removing retweets.
- The cell phone networks were severely bogged down during SXSW, which left me retrying some messages and sometimes inadvertently posting the same message multiple times. Just like with retweets, I'm going to look into removing duplicates.
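For the retweet and duplicate cleanup mentioned in the last two items, here is a rough sketch of one way it could work. It is an assumption, not something already built: it treats a message as a retweet only if it starts with "RT @username", and it treats two messages as duplicates only if the same user posted the same text after lowercasing and collapsing whitespace. The function names and the (user, text) message format are hypothetical.

```python
import re

# Assumption: retweets follow the manual "RT @username ..." convention.
RETWEET_PREFIX = re.compile(r"^\s*RT\s+@\w+", re.IGNORECASE)


def normalize(text):
    """Lowercase and collapse whitespace so near-identical reposts compare equal."""
    return " ".join(text.lower().split())


def remove_retweets_and_duplicates(messages):
    """Drop retweets and repeated posts from an iterable of (user, text) tuples.

    The first copy of each message is kept and the original order is preserved.
    """
    seen = set()
    kept = []
    for user, text in messages:
        if RETWEET_PREFIX.match(text):
            continue  # skip retweets so the original is counted only once
        key = (user, normalize(text))
        if key in seen:
            continue  # same user, same text: likely an accidental repost
        seen.add(key)
        kept.append((user, text))
    return kept
```

This would miss retweets that quote the original mid-message (for example "Great show! RT @user ..."), so it would only reduce the double counting rather than eliminate it.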
Coming Soon:
In the next 2 weeks you'll be able to search by band hometown, such as Austin, TX or Portland, OR.
You'll also be able to browse Twitter messages by user, so if you like what someone has written you can jump directly to what they've written about other bands.