Data is the blueprint of innovation. It is no longer a passive entity used for filling up archives, but the most powerful tool organisations use to make far-reaching, fact-based decisions. It is the platform on which realities are formed. People generate large volumes of data daily by interacting with each other through various electronic channels. This data may come in the form of purchase records from stores and retail outlets, phone calls, self-administered surveys, field observations, interviews and experiments.
Big Data has proven to be an asset to both tech and non-tech organisations. In businesses where information technology is not the core activity, Big Data is explored through predictive analysis to forecast future opportunities and risks. Telecom firms use this technique constantly to identify the subscribers most likely to churn, that is, to leave their network. Insurance companies rely heavily on regression analysis to estimate the credit standing of policyholders and the number of claims to expect in each period. In the banking industry, regression analysis is used to segment customers according to their likelihood of repaying loans.
Information management is vital for a data analyst, who must organise data into an understandable and manageable form, extract the relevant and useful data from the large pool available, and standardise it in a fixed form. This standardised data can then be used to find underlying patterns and trends. Data extraction is now faster and less cumbersome thanks to the combination of IoT (Internet of Things) and Big Data. The value of data increases by the day, with companies working specifically to collect and sell it. Research suggests that accurate interpretation of Big Data can improve retail operating margins by as much as 60%.
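To make the idea of standardising data "in a fixed form" concrete, here is a minimal sketch in Python. The records, field names and date formats are all hypothetical, invented purely for illustration; a real pipeline would need rules that match its own sources.

```python
from datetime import datetime

# Hypothetical raw purchase records from two sources that disagree on
# field names, capitalisation and date format (all values are made up).
RAW_RECORDS = [
    {"Customer": " Alice ", "amount": "12.50", "date": "2019-07-01"},
    {"customer_name": "BOB", "Amount": "7", "date": "01/07/2019"},
]

# Assumed date formats; extend this tuple for other sources.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each assumed format in turn and return a date object."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {text!r}")


def standardise(record):
    """Map one raw record onto a fixed schema: customer, amount, date."""
    lowered = {key.lower(): value for key, value in record.items()}
    name = lowered.get("customer") or lowered.get("customer_name")
    return {
        "customer": name.strip().title(),
        "amount": float(lowered["amount"]),
        "date": parse_date(lowered["date"]).isoformat(),
    }


CLEAN_RECORDS = [standardise(record) for record in RAW_RECORDS]
```

Once every record shares the same keys, types and date format, the records can be compared, aggregated and searched for patterns without special cases for each source.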
Machines are equipped with algorithms and fed continuously with data so that insights can be drawn from the patterns that emerge; this is how models are built. Machine Learning models are developed using supervised, unsupervised or reinforcement learning algorithms.
A model is purely a mathematical function (created as a product of training) that can take instances of a dataset and predict their class membership. During training, several variables are fed into the algorithm so that it can learn from the “known” in order to predict the “unknown.” At this point, the resulting model is said to have “learned” the best way of making sense of the data.
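The idea of a model as a trained function that predicts class membership can be sketched with a small supervised example. The code below is only an illustration, assuming a made-up loan dataset and a simple logistic regression trained by gradient descent; the feature names, learning rate and number of epochs are arbitrary choices, not a recommended setup.

```python
import math

# Hypothetical training data: (income, existing debt) in arbitrary units,
# labelled 1 if the customer repaid a loan and 0 otherwise.
TRAINING_DATA = [
    ((1.0, 3.0), 0),
    ((1.5, 2.5), 0),
    ((2.0, 3.5), 0),
    ((4.0, 1.0), 1),
    ((4.5, 0.5), 1),
    ((5.0, 1.5), 1),
]


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(data, learning_rate=0.1, epochs=2000):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def make_model(weights, bias):
    """Return the trained model: a function from features to a class."""
    def model(features):
        score = sum(w * x for w, x in zip(weights, features)) + bias
        return 1 if sigmoid(score) >= 0.5 else 0
    return model


WEIGHTS, BIAS = train(TRAINING_DATA)
model = make_model(WEIGHTS, BIAS)
```

After training, `model` is just a function: calling it with the features of a new, unseen customer returns a predicted class, which is the step from the "known" to the "unknown" described above.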
Securing data poses a significant challenge in the world today. Our activities, e.g. shopping, social media discussions and digital content consumption, are continuously monitored and recorded by people we may not know.
Some organisations are set up with the sole aim of gathering data and trading it with partners who use it for business purposes. However, many infractions have been witnessed over the years.
In 2013, Edward Snowden, a former contractor for the NSA (US National Security Agency), leaked details of how the organisation carried out extensive internet and phone surveillance of citizens without the public’s knowledge, which was a severe privacy breach. His disclosures revealed numerous global surveillance programs run by the NSA and the Five Eyes Intelligence Alliance (an anglophone intelligence alliance comprising Australia, Canada, New Zealand, the United Kingdom and the United States) with the cooperation of telecommunication companies and European governments, and prompted a cultural discussion about national security and individual privacy.
In July 2019, the United States Federal Trade Commission approved a fine of about $5 billion against Facebook for mishandling users’ personal information, a landmark settlement that signalled a newly aggressive stance by regulators toward the country’s most influential technology companies. The case involved the social media giant’s dealings with Cambridge Analytica (a British political consulting firm), through which the personal information of up to 87 million users was collected without their knowledge. It is claimed that this data, collected from 2014, was instrumental in the 2016 US presidential campaign that led to the victory of Donald Trump.
Using data without the knowledge of its owner is a breach of trust, yet it is done rampantly. It is therefore essential to create a data-driven world in which privacy is given greater priority.
Big Data is real, and it has come to stay. No nation or organisation can afford to shut its eyes and ignore this phenomenon. It requires a vast pool of talent and industries well positioned to join the ‘Big Data platform’ bandwagon. The Big Data explosion has not only brought massive growth in data but has also stimulated ingenious innovations that will hold the world spellbound forever.
The number of new database platforms introduced in the last few years is mind-blowing. Though it may seem daunting to stay abreast of this rate of technological advancement, the innovation itself is encouraging for the data ecosystem. Investment in Big Data solutions has yielded an extensive range of technology options for anyone to consider, and the field is still evolving.
In the words of Clive Humby, UK Mathematician and Architect of Tesco’s Clubcard, 2006 (widely credited as the first to coin the phrase): “Data is the new oil. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc to create a valuable entity that drives profitable activity; therefore, data must be broken down, analysed for it to have value.”
- Paschal Anuforo