The Power Of Big Data

With the growing use of technology in recent decades, companies have gained access to a vast amount of information. “Big data” is the new buzzword in multinational corporations all over the world, and people who can manage that data are in higher demand than ever before. Analysis of customer purchasing decisions and relevant social media trends gives companies a competitive advantage over their rivals.

Companies can use data in many ways. Netflix, for example, uses piracy statistics to determine which shows are the most sought after by the general consumer. In doing so, they gain insight into which new additions to their streaming collection may net the most attention and profit [1]. Similarly, Amazon uses customer data to simplify customer service calls – they’re able to eliminate the need for stating name, address, etc. and can predict what recent purchases may be giving you trouble [2]. Even Starbucks uses customer habits to select new store locations, leading to the phenomenon of successful locations within mere miles of each other. [3]

However, when companies go too far in their collection and usage of data, their strategies can backfire. Although most people know that their data is being collected in some manner, few truly care until a company strays into “creepy” territory. In 2012, Target received some flak after sending a high school student coupons for diapers and other baby products. Her father complained that the company was “trying to encourage her to get pregnant” before later realizing that she already was. The father later apologized to Target. It turns out the company used purchasing patterns to maintain a list of customers with a high probability of being pregnant and sent relevant ads in response. This list also contained due dates for each customer estimated through further analysis of purchasing history and information purchased from other companies. [4]

An analyst from Target responded to customer concerns by saying that they were following all privacy laws, but noted that “even if you’re following the law, you can do things where people get queasy.” In discussing the situation with the pregnant student, he revealed that the company has since revised its coupon distribution techniques and now mixes in relevant discounts with irrelevant ones. “We’d put an ad for a lawn mower next to diapers. That way, it looked like all the products were chosen by chance. As long as we don’t spook her, it works.” [5]

Target: As long as their advertising doesn’t spook you, it works.

This marks a change in how companies have to use their wealth of information on their customers. Nobody wants Big Brother looking over their purchasing or web browsing habits, especially when it’s a company trying to squeeze out some more profit. Everyone has had the experience of looking at one product on Amazon and having it follow you in advertisements all across the internet – the marketing stops being effective and just becomes downright creepy.

As such, these companies have to be careful in how much they reveal to their customers. Once a corporation is labeled as having disturbing data collection practices, the PR disaster can affect sales heavily.

Aside from data obtained in-house, access to real-time analysis of social media can give companies advance warning of developments in virtually any topic. Dataminr, a “leading real-time information discovery company,” capitalizes on this brand new style of corporate analysis. They purport to be able to alert users about breaking news “5 to 10 minutes” before any conventional news source.

In the past, Dataminr has warned stockholders about an impending drop in Apple’s stock price after finding negative tweets about the company. They also managed to send alerts about the death of Osama Bin Laden nearly half an hour before a single news network caught wind of it. [6]

For a company, problems as absurd and unforeseen as Starbucks’ “red cup controversy” can pop up at any time.

Similarly, a company could use the service to warn about potential PR disasters and address them early. As social media has become popular, angry customers have increasingly turned to news outlets and social media to create a viral outrage in the hopes of receiving better service. Especially for the more popular users of social media, a bad review of a product (even made in passing) can be devastating for a company’s public image. The earlier a company can quell these complaints, the better for their public image.


IBM pioneers many advances in the use of big data analysis, particularly in the medical field.

Clearly, big data has great potential for growth in the near future. Imagine being able to pick up on the next big trend in the movie industry or an upcoming fashion fad. Such prediction tools might be a reality in the near future. Engineers at IBM have already developed methods to predict potentially fatal infections in premature babies. They monitor vital signs thousands of times per second to detect any deviations from healthy standards.[7]
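To make "detecting deviations from healthy standards" a little more concrete, here is a minimal sketch in Python. It is not IBM's method, which the source does not describe. It simply assumes a set of known-healthy baseline readings and flags any new reading that sits more than a few standard deviations away from the baseline average. The function name, the threshold of 3, and the heart-rate numbers are all invented for illustration.

```python
from statistics import mean, stdev


def find_deviations(readings, baseline, threshold=3.0):
    """Return (index, value) pairs for readings far from the baseline average.

    A reading is flagged when its distance from the baseline mean, measured
    in baseline standard deviations (a z-score), exceeds `threshold`.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation, cannot compute z-scores")
    return [
        (i, value)
        for i, value in enumerate(readings)
        if abs(value - mu) / sigma > threshold
    ]


# Hypothetical heart-rate samples in beats per minute (illustrative only).
baseline = [140, 142, 139, 141, 143, 140, 138, 141]
readings = [141, 139, 175, 140]

print(find_deviations(readings, baseline))  # [(2, 175)]
```

A real monitoring system would work on a continuous stream and combine many signals, but the core idea is the same: learn what normal looks like, then watch for readings that stray from it.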

This technique is an implementation of “predictive analytics” and it has great potential for corporations all over the globe. Computers are able to pick up on trends to predict huge breakthroughs or disasters where humans simply cannot. If this up-and-coming technology is used along with current analysis tools, it could give companies a “crystal ball” to the near future. Obviously, such a tool would be immensely useful, but only time will tell if companies can become the fortune tellers of the modern age.

User Data: A New Commodity for an Interconnected Age

Earlier this year in a bankruptcy deal, RadioShack sold its assets for $26.2 million.[1] Some of this was for the user data of 67 million of its customers. Initially, this included credit-card data, Social Security numbers, dates of birth, and phone numbers for 117 million people, but various courts reduced access to only seven of 170 fields of data.[2] Among these reduced fields are names, addresses, purchasing history, and email addresses.

In today’s world, such user data is well sought after by companies, and for good reason: analysis of customer behavior can allow for more effective advertising, the creation of more successful product lines, a greater understanding of customer satisfaction, and much more. As a result, the purchase and sale of such data is more commonplace than ever before, especially on the internet. This has only been spurred on by the abundance of free entertainment available on the web – websites that provide content for free must make a profit somehow and it turns out that the cliché “If you’re not paying for it, you are the product” is a reality all across the web.

Lightbeam for Firefox

A screenshot of Lightbeam, an add-on for Firefox that lets you see what third-party sites you’ve connected to during your web browsing. After opening the front pages of Fox News, Buzzfeed, CNN, and The Washington Post, we’ve been connected to 206 third-party sites. White lines indicate new connections and purple lines indicate new browsing cookies.

For example, merely by accessing the front page of Buzzfeed, a user is connected to about 30 third-party websites, including Google, Facebook, Twitter, Adobe, and over a dozen sites devoted to gathering user information for advertising purposes. Ten of these websites add cookies that continue to track a user’s web browsing habits long after they navigate away from the page (and many never bother to remove them). One of these sites belongs to Lotame, a company that allows anyone to sell user data and pay for “instant access to a pool of more than three billion cookies and a billion mobile device IDs.”[3]
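The kind of tally behind a figure like "about 30 third-party websites" can be sketched in a few lines of Python. This is not how Lightbeam itself works internally. It is a rough illustration that assumes you already have a list of the URLs a page requested, and it treats any host outside the page's own site as a third party. The helper names and the `.test` domains are made up, and the "strip a leading www." rule is a deliberate simplification of how sites are really grouped.

```python
from urllib.parse import urlparse


def _site(host):
    """Crude site name: drop a leading 'www.' (a simplifying assumption)."""
    return host[4:] if host.startswith("www.") else host


def third_party_hosts(page_url, request_urls):
    """Return the sorted, distinct hosts contacted that are not the page's own site."""
    page_site = _site(urlparse(page_url).hostname)
    hosts = set()
    for url in request_urls:
        host = urlparse(url).hostname
        if host is None:
            continue  # skip anything that is not a normal URL
        if host == page_site or host.endswith("." + page_site):
            continue  # same site or one of its subdomains: first party
        hosts.add(host)
    return sorted(hosts)


# Hypothetical requests made while loading a hypothetical news page.
requests_seen = [
    "https://static.example-news.test/app.js",
    "https://ads.tracker-one.test/pixel.gif",
    "https://cdn.tracker-two.test/lib.js",
    "https://ads.tracker-one.test/sync",
]

print(third_party_hosts("https://www.example-news.test/", requests_seen))
```

Here the page's own static files are ignored, and the two outside hosts are each counted once no matter how many requests they receive.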

This data may not contain personally identifying information, but it can certainly be traced back to a specific, anonymous user. Even without the use of cookies, nearly everyone has a fairly unique “fingerprint” through their web browser. The Electronic Frontier Foundation offers an online demo of this.
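The basic idea of a browser fingerprint can be sketched as follows. This is only an illustration and not the EFF's technique: it assumes a site can read a handful of browser settings, joins them in a fixed order, and hashes the result into a single identifier. The attribute names and values below are invented, and real fingerprinting draws on far more signals.

```python
import hashlib


def fingerprint(attributes):
    """Hash a dict of browser attributes into one stable identifier.

    Keys are sorted first so the same settings always give the same hash,
    regardless of the order in which they were collected.
    """
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values for two visitors (illustrative only).
visitor_a = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "UTC-5",
    "fonts": "Arial,Helvetica,Times",
}
visitor_b = dict(visitor_a, timezone="UTC+1")

print(fingerprint(visitor_a))
print(fingerprint(visitor_a) == fingerprint(visitor_b))  # False
```

No cookie is stored anywhere in this process. The more settings that go into the hash, the less likely it is that two visitors share the same fingerprint, which is what makes the technique hard to avoid.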

Beyond this “passive” tracking of users, a huge amount of information is available through social media. On Twitter alone, it’s estimated that about 500 million tweets are sent per day[4] (that’s nearly 6,000 every second!) and as such, the vast majority of them are never seen by anyone.[5] With access to the “Twitter Firehose,” an expensive developer tool, you gain access to the data on every single tweet. This allows a company to scrape the internet, searching for positive or negative reactions to certain products or a recent announcement. It also lets them preempt any possible PR disasters by starting damage control early.
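As a rough picture of what "searching for positive or negative reactions" might look like, here is a small Python sketch. It does not use Twitter's API or any real sentiment model. It assumes the posts have already been collected as plain strings and scores them with two tiny, hand-picked word lists. The word lists, the function names, and the "WidgetPhone" product are all hypothetical.

```python
import re
from collections import Counter

# Tiny hand-picked word lists (an assumption; real tools use far richer models).
POSITIVE = {"love", "great", "awesome"}
NEGATIVE = {"hate", "broken", "terrible"}


def classify(post):
    """Label a post positive, negative, or neutral by counting listed words."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def tally(posts, product):
    """Count reactions among the posts that mention the product by name."""
    counts = Counter()
    for post in posts:
        if product.lower() in post.lower():
            counts[classify(post)] += 1
    return counts


# Hypothetical posts about a hypothetical product.
posts = [
    "I love my new WidgetPhone, great camera",
    "WidgetPhone screen is broken again, terrible",
    "Just bought a WidgetPhone",
    "I hate Mondays",
]

print(tally(posts, "WidgetPhone"))
```

A sudden jump in the negative count is the kind of signal that could prompt early damage control. Word lists like these miss sarcasm and context entirely, so treat the output as a crude first pass.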

Monthly Active Users -- Facebook

Facebook is home to more than 1.5 billion active monthly users and has become a prime target for advertisers.

Monthly Active Users -- Twitter

Twitter now boasts over 300 million active users, all posting data that could be potentially valuable for companies.

Realistically, smaller companies don’t need to pay for the “Twitter Firehose.” Even though Twitter’s site only gives access to as little as 1% of tweets in real time, scraping Twitter, Facebook, Instagram, Tumblr, and other social media sites can provide all the consumer opinions a company could ever wish for. After all, if billions of users are willingly putting their valuable data out in the open, why not capitalize on it?

