What follows is one of the projects/topics that I have the most fun with. Today being election day, I decided to pull some quick data on elections via Twitter. Some interesting facts: I started pulling tweets containing the word “election” at 6:29 am Central and stopped at 3:43 pm Central. There was a 40-minute gap around 9:00 am where I don’t have any data, so in the end we have about 8.5 hours of tweets total.
In that time there were 85,457 tweets from 63,440 people, indicating that the vast majority of folks made just one tweet. There were 44 accounts that made more than 20 “election” tweets and 2,000 accounts that had more than 4 tweets.
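For anyone who wants to try this at home, here is a minimal sketch of how per-account counts like these could be tallied in Python. It is not the code behind the numbers above; the function name `account_activity`, the `(screen_name, text)` tweet format, and the sample data are all made up for illustration.

```python
from collections import Counter

def account_activity(tweets, thresholds=(4, 20)):
    """Summarize how many tweets each account made.

    `tweets` is assumed to be an iterable of (screen_name, text) pairs;
    that layout is a hypothetical choice for this sketch.
    """
    # Count tweets per account.
    per_account = Counter(name for name, _ in tweets)
    summary = {
        "tweets": sum(per_account.values()),
        "accounts": len(per_account),
        "single_tweet_accounts": sum(1 for n in per_account.values() if n == 1),
    }
    # Count the accounts that exceed each threshold (strictly more than).
    for t in thresholds:
        summary[f"more_than_{t}"] = sum(1 for n in per_account.values() if n > t)
    return summary

# Made-up sample data, just to show the shape of the input.
sample = [
    ("farmer_a", "election day!"),
    ("farmer_a", "go vote, election today"),
    ("voter_b", "election results tonight"),
]
print(account_activity(sample, thresholds=(1,)))
```

With the real data, the default thresholds of 4 and 20 would correspond to the “more than 4” and “more than 20” counts mentioned above.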
Moving on to the tweets themselves, they contained around 1.5 million words. Here is a look at some of the more popular words (outside of things like “the”, “and”, etc.):
The Democrat/Republican numbers make for an interesting analysis on their own, as mentioning a term in no way indicates vote preference (if it does, lots of folks are going to be VERY surprised later this evening).
In total we count 93,730 different words used in these tweets. Of course, that includes things like links, each mention counted as a word, misspellings, etc.
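A rough sketch of this kind of word counting in Python might look like the following. It is only an illustration under my own assumptions, not the exact process used for the figures above: the `word_stats` function, the tiny `STOP_WORDS` list, and the sample tweets are all hypothetical. Words are split on whitespace only, so links, mentions, and misspellings each count as their own “word”, just as described.

```python
from collections import Counter

# A tiny, hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "a", "to", "of", "in", "is", "for", "on", "it"}

def word_stats(texts, top_n=10):
    """Return (total words, distinct words, most popular non-stop words)."""
    counts = Counter()
    for text in texts:
        # Lowercase and split on whitespace; links and @mentions stay whole.
        counts.update(text.lower().split())
    total_words = sum(counts.values())
    distinct_words = len(counts)
    # Popular words, skipping things like "the" and "and".
    popular = [(w, n) for w, n in counts.most_common() if w not in STOP_WORDS]
    return total_words, distinct_words, popular[:top_n]

# Made-up sample tweets, just to show the idea.
sample_texts = [
    "The election is today",
    "Election results and the vote http://example.com",
    "@someone go vote in the election",
]
total, distinct, popular = word_stats(sample_texts, top_n=2)
print(total, distinct, popular)
```

Note that the stop words are still included in the total and distinct counts here; they are only left out of the “popular” list.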
In the end, this was a quick, fun project on election day. Most of my work of this nature has been around agriculture and agvocacy efforts. I think the real power lies in the massive amount of data that social media makes available, and in the speed and ease with which it can be processed on personal computers. No longer is “data analysis” something confined to the deep, dark corners of large corporations or government agencies; fitting for the new “open” web world we live in, it is available as well to simple Kansas farm folk like me!