MIT researchers have created a new big-data analysis system that outperformed most human teams in a set of data science competitions.
Big-data analysis is driven by a search for patterns that have some sort of "predictive power," but choosing which features of the data to analyze has always required human intuition. For example, in a database containing the beginning and end dates of sales promotions and weekly profits, the important information could be the spans between the dates rather than the dates themselves. The new system not only searches for patterns but also designs the feature set itself.
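The sales-promotion example can be made concrete with a short sketch. The Python below is an illustration only, not the researchers' code: the records, the `promotion_span_days` helper, and the field names are all hypothetical. It shows the kind of derived feature a human analyst would normally think to compute by hand, turning raw start and end dates into a span in days.

```python
from datetime import date

# Hypothetical promotion records: (start date, end date, weekly profit).
# These values are invented for illustration; they are not from the study.
promotions = [
    (date(2015, 3, 2), date(2015, 3, 16), 1200.0),
    (date(2015, 6, 1), date(2015, 6, 8), 800.0),
    (date(2015, 9, 7), date(2015, 10, 5), 1500.0),
]


def promotion_span_days(start, end):
    """Derived feature: the length of a promotion in days."""
    return (end - start).days


# Replace the two raw dates with the single derived feature.
features = [
    {"span_days": promotion_span_days(start, end), "weekly_profit": profit}
    for start, end, profit in promotions
]

for row in features:
    print(row)
```

A predictive model would then be trained on `span_days` instead of the raw dates. The point of the Data Science Machine, as described above, is that it proposes features like this without a person having to suggest them.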
To test their "Data Science Machine" prototype, the researchers entered it in three data science competitions against human teams. Of the 906 teams involved in the competitions, the system performed better than 615, or roughly 68 percent. In two of the three competitions, its predictions were 94 percent and 96 percent as accurate as the leading submissions. Where most human teams spent months on their analyses, the system completed its analyses in two to 12 hours.
"We view the Data Science Machine as a natural complement to human intelligence," said Max Kanter, whose MIT master's thesis in computer science is the basis of the Data Science Machine. "There's so much data out there to be analyzed. And right now it's just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving."
The findings were presented at the IEEE International Conference on Data Science and Advanced Analytics.