The Ethics of Data Science: Where is the Line Drawn?

Data science is a booming industry. The market for big data is estimated to reach $103 billion by 2023, with over 97% of companies investing in the technology. Users create an estimated 2.5 quintillion bytes of data daily; another commonly quoted estimate puts the figure at 1.7 megabytes per second, per person.

Data science - the field of studying and analyzing that data - is on the rise as well. Data scientist ranks #3 in job demand in the US and UK, with 37% annual growth, and enrollments in data science degree programs have risen steadily in recent years.

But data science can have a dark side as well. In the wrong hands, or if used poorly, big data can create ethical problems, which leads to the question - where is the line drawn?

The Rise of Big Data

Data is being generated and collected faster than ever before. Roughly 2.5 quintillion bytes are produced daily, and companies gather that data from social media networks, online streaming media, online purchase transactions, mobile phone location data, and myriad other sources.

Benefits of Data Science

Because so much of this collected data comes in raw, unorganized form, the capability to analyze and glean meaning from it is critically important. Data is useless until it's organized and rendered understandable - and that's where data science comes in.

Data science is the process of mining large datasets and identifying hidden patterns, in order to gain insight, make predictions, and inform future decisions based on that data.

Data science is used to target and understand the behavior of customers to create predictive models for future behavior. Data scientists can use similar models to optimize political campaigns. Data science can also help optimize business processes like stock market investment, supply chains and delivery routes, or talent acquisition.

Big data and data science have also had a profound impact on health, from the implementation of personal health technology like fitness trackers, to the ability to decode DNA strings, predict patterns of disease, and monitor phenomena like flu outbreaks in real time from existing medical records and social media analytics. Big data is also at work in places like the CERN Large Hadron Collider, where the massive amounts of data generated are analyzed by thousands of computers.

Ethical Questions Around Data Science

There are plenty of benefits to data science, when it's used properly. But there are also some serious ethical concerns:

Who owns your data? There's a common assumption that an individual's data "belongs" to that person, but in practice, that isn't always the case. Under EU law, personal data can't truly be "owned." Users are often informed of what rights they have over their data, though that information is sometimes buried in EULAs few people read.

Privacy and security concerns. There are also questions about what data is collected, whether or not we're informed about its collection, and how secure that data is. In theory, most companies offer users the chance to give informed consent about their data, but in practice, users may not know what they're giving away.

Manipulation of public opinion. Big data, especially within the framework of social media, can also shape public opinion in unforeseen ways, as it did during the 2016 US presidential election.

Manipulating information presented to users. In 2014, it came to light that Facebook had been deliberately manipulating what users saw in order to influence their emotions, experimenting on them without their knowledge or permission and using that data to inform business decisions.

Sharing of data with third parties. Many services dutifully inform users of any third-party sharing, or offer the user a chance to opt out - but this behavior isn't universal, and opting out may apply only to certain data rather than all of it.

Stolen and misused data. Last but not least, there are valid concerns about what happens to your information in case of a data breach, or if it's sold or given to an unethical third party.

Laws and Ethics in the Age of Data Science

So what protections are in place for users and their personal data? There's the aforementioned informed consent, which many companies offer in the form of a license agreement or simple checkbox in an app or piece of software. A good example of this is the "accept cookies" prompt now seen on websites everywhere, a phenomenon driven largely by the EU's ePrivacy Directive and GDPR (General Data Protection Regulation) and reinforced by laws like the CCPA (California Consumer Privacy Act).

But not all data is collected with consent. Public surveillance cameras, tracking of mobile phones, data collected from social media trends, etc. don't typically offer a chance to opt out of one's data being used, sold, or given away.

Further, not all sensitive or private data is handled ethically by companies. One infamous example is ride share company Uber's "god view" tool, which allowed customers' private data not only to be tracked, but to be used for entertainment at company parties.

Privacy and data laws have come a long way toward addressing some of these ethical concerns, but there is still much work to be done, and the companies that profit off the ethically questionable handling of big data are unlikely to change unless forced.