It's safe to assume that most people agree that alcohol and tweeting are probably not a good combination, but researchers at the University of Rochester have examined the practice more closely and created a machine-learning algorithm that can spot alcohol-related tweets. The team, led by researcher Nabil Hossain, hopes that its creation will have a positive impact on the way we understand and respond to alcohol-related public health issues and activities.
The algorithm was trained to pinpoint tweets that relate to alcohol, as well as those sent by people who were drinking at the time. It also finds Twitter users' home locations with a degree of accuracy never before accomplished, ultimately determining whether they are drinking at home.
Hossain and his team gathered geotagged tweets sent throughout the year up to July 2014 from New York City and Monroe County on the northern border of the state, and separated the tweets that were alcohol-related from those that weren't. Afterwards, they enlisted workers on Amazon's Mechanical Turk crowdsourcing service to examine the tweets in more detail and determine whether the tweeters were drinking and whether they sent the messages while they were drinking.
Using approximately 11,000 geolocated tweets that were deemed to be associated with alcohol, the team trained a machine-learning algorithm to spot these kinds of tweets on its own. They then determined a list of words and phrases that were likely to be used by tweeters at home and combined this information with the previous alcohol-related findings in order to help the algorithm determine when and where people are drinking.
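To give a feel for what "training an algorithm to spot these kinds of tweets" can look like, here is a minimal sketch of a word-count text classifier (naive Bayes) in Python. It is an illustration only: the article does not say which model the team used, and the example tweets, the labels and the `NaiveBayes` class are all made up for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Toy multinomial naive Bayes classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training tweets
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus log likelihood with add-one (Laplace) smoothing.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training tweets; the real study used about 11,000 labeled tweets.
texts = [
    "drinking a cold beer on the couch",
    "wine night at home with friends",
    "so drunk right now at the bar",
    "stuck in traffic on the way to work",
    "great game last night",
    "coffee and a book this morning",
]
labels = ["alcohol", "alcohol", "alcohol", "other", "other", "other"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("having a beer at the bar"))
```

The same idea could, in principle, be repeated with a second set of labels (for example "at home" versus "not at home") to mirror the list of home-related words and phrases the team describes.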
The results of the study reveal that people in New York City send alcohol-related tweets more often than those in Monroe County.
"One possible explanation is that a crowded city such as NYC with highly dense alcohol outlets and many people socializing is likely to have a higher rate of drinking," the team said.
In addition, the team found that more people in New York City drink at home than in Monroe County, where people are more likely to drink more than a kilometer from their home.
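The "more than a kilometer from their home" comparison boils down to measuring the distance between two points on a map. As a rough sketch, and assuming a tweet's geotag and a user's estimated home are both available as latitude and longitude pairs, the check could be written with the standard haversine formula. The function names, the coordinates and the 1 km default below are illustrative assumptions, not details taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drinking_away_from_home(home, tweet, threshold_km=1.0):
    """True if the tweet was sent farther than threshold_km from home.

    `home` and `tweet` are (latitude, longitude) tuples; both are
    hypothetical inputs for this sketch.
    """
    return haversine_km(*home, *tweet) > threshold_km


# Made-up coordinates: the second point is about 2.2 km north of the first.
home = (40.0, -74.0)
tweet = (40.02, -74.0)
print(drinking_away_from_home(home, tweet))
```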
The algorithm results also revealed patterns in alcohol usage that the team hopes can be of help in identifying regions with unusual drinking activities and alcohol consumption.
"Such analyses can teach us who is and isn't referencing alcohol on Twitter, and in what settings, to evaluate the degree of self-reporting biases, and also help to create a tool for improving a community's health, given social networks can become a resource to spread positive health behavior," the team concludes.
The findings were posted to the preprint server arXiv on March 10.