Little thoughts on Big Data

Bryan Dollery

The amount of data we collect increases exponentially, year after year. This made us wonder: does bigger data really mean better data?

Everyone is excited about big data, and there is reason to be. Big data can mean big potential in the right hands, hands capable of sifting through huge volumes of info before fishing out anything useful. There are two main reasons why we should be thinking about collecting the right data, rather than just collecting more:

Less could be more

On average, only 5% of collected data gets analysed. It’s pretty safe to assume that something useful is lost amongst the other 95%. Not only that, but think of the resources wasted collecting that data in the first place. This month, an article in the Credit Union Journal pointed out that “the downside to collecting big data is the inability to effectively analyse the information”. It’s almost refreshing to hear this kind of thinking amidst the hype around big data and its value. So, what can be done?


Visualise a big data framework. Doing this will give you a clear idea of what kinds of data there are, and which ones are relevant to you. This example from Ivey Business Journal shows four areas, each of which falls into two of four categories: non-transactional, transactional, measurement and experimentation. Understanding which types of data are valuable to you is key when forming your data collection strategy.


The internet is slowing down

While trackers are valuable tools for collecting data, people are looking for ways to speed up their browsing, and they’re turning to tools like Ghostery. The free plugin lets users disable selected trackers; it currently has millions of users and a database of over 2,000 trackers. The tool doesn’t advocate an “all or nothing” approach, however. With the ability to blacklist or whitelist individual trackers and entire sites, the idea is simply to give users more control. So not only could an overabundance of trackers hamper your site’s performance, it could also prompt users in the know to turn them off altogether.



  1. The average website size has doubled in three years.(1)
  2. In mid-2015, over 90% of the world's data had been generated in the previous two years.(2)
  3. Only around 0.5% of all data is ever processed.(2)





An interesting approach to the problem of trackers is emerging in the form of a browser called Brave. The idea was cooked up by Brendan Eich, a co-founder of Mozilla, the organisation behind Firefox, who is setting out to improve the system of the ad-funded web. Currently in its beta stage, Brave blocks trackers and ads altogether, greatly improving overall browsing speed. Eventually, Brave will collect and offer history and usage data to advertisers and publishers, meaning better targeted ads for the user coupled with faster load times. If Brave catches on, it could appeal to advertisers and end-users alike.

While the amount of data we produce continues to increase, we’ll be keeping a lookout for trends that make collecting that data more efficient. Watch this space.