Big Data Value Chain
Posted: May 5th, 2020
Companies and organizations use software and computer applications to store and retrieve data sets that are too voluminous for traditional databases. Big data has great potential to solve large problems within an organization and so deliver transformational benefits. The big data value chain describes the flow of information within a big data system as a series of steps required to generate value from the data. For big data to be effective, organizations have to follow the technical procedures within the value chain carefully.
The first step of the value chain is data acquisition, which is the process of gathering, filtering, and cleaning the data prior to storage. The process is guided by the volume, velocity, value, and variety of the data. In most scenarios, the data is assumed to be high in volume, variety, and velocity but low in value, which makes meaningful gathering, filtering, and cleaning algorithms essential. These algorithms ensure that only high-value data fragments are processed. For acquisition to be effective, the supporting infrastructure has to deliver low and predictable latency in capturing data and executing queries, and it must be able to handle large data volumes in a distributed environment.
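The gather, filter, and clean sequence can be illustrated with a minimal Python sketch. The record fields (`id`, `text`, `value_score`), the function name `acquire`, and the `min_value` threshold are all assumptions made for illustration; a real acquisition pipeline would run on distributed infrastructure rather than in a single process.

```python
from typing import Iterable, Iterator


def acquire(raw_records: Iterable[dict], min_value: float = 0.5) -> Iterator[dict]:
    """Filter and clean incoming records before they are stored.

    The field names and the value threshold are hypothetical.
    """
    seen_ids = set()
    for record in raw_records:
        # Filter: skip records that lack the fields we need.
        if record.get("id") is None or record.get("text") is None:
            continue
        # Filter: skip records we have already accepted.
        if record["id"] in seen_ids:
            continue
        # Filter: skip low-value fragments.
        score = record.get("value_score", 0.0)
        if score < min_value:
            continue
        seen_ids.add(record["id"])
        # Clean: collapse repeated whitespace in the text field.
        yield {
            "id": record["id"],
            "text": " ".join(record["text"].split()),
            "value_score": score,
        }
```

Because `acquire` is a generator, records are processed one at a time as they arrive, which is one simple way to keep memory use flat when the input is large.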
The second step is data analysis, which involves inspecting, cleansing, and transforming raw data with the aim of extracting meaningful information that can be used to inform conclusions and support decisions. Analytical and statistical tools are used in the analysis, and the methods include data mining, text analytics, data visualization, and business intelligence. Data mining is a knowledge discovery technique used for prediction. Data visualization is the creation and study of data represented visually through statistical graphs, information graphics, and plots. Business intelligence covers analysis that relies heavily on aggregation and focuses on business-related information.
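The aggregation that business intelligence relies on can be sketched in a few lines of Python. The function name `summarize_by_group` and the sales records in the usage example are hypothetical and serve only to show the idea of grouping rows and computing summary statistics per group.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_group(rows, group_key, value_key):
    """Group rows by one field and aggregate a numeric field per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    # One summary per group: how many rows, their total, and their average.
    return {
        group: {"count": len(values), "total": sum(values), "mean": mean(values)}
        for group, values in groups.items()
    }


# Hypothetical usage with made-up sales records.
sales = [
    {"region": "north", "amount": 100},
    {"region": "north", "amount": 300},
    {"region": "south", "amount": 50},
]
summary = summarize_by_group(sales, "region", "amount")
```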
The third step is data curation, which is the management of data to ensure that it is of sufficient quality and meets the requirements for its intended use. It involves a variety of activities such as classification, validation, transformation, and preservation of the data. Data is managed throughout its lifecycle, from the point of creation to deletion. The main aim of curation is to ensure that data is readily available for retrieval when the need arises. Data curators are responsible for ensuring that data is trustworthy, accessible, retrievable, and reusable.
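Validation and transformation, two of the curation activities named above, might look like the following Python sketch. The function `curate`, the notion of a list of required fields, and the sample records are assumptions for illustration rather than a standard curation procedure.

```python
def curate(records, required_fields):
    """Split records into curated and rejected lists.

    A record is rejected when any required field is missing or empty.
    Accepted records have surrounding whitespace stripped from strings.
    """
    curated, rejected = [], []
    for record in records:
        # Validation: collect the required fields that are absent or empty.
        missing = [f for f in required_fields if record.get(f) in (None, "")]
        if missing:
            rejected.append((record, missing))
            continue
        # Transformation: normalise string values, leave other types alone.
        curated.append(
            {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
        )
    return curated, rejected


# Hypothetical usage with made-up records.
good, bad = curate(
    [{"name": " Ada ", "year": 1843}, {"name": "", "year": 1950}],
    required_fields=["name", "year"],
)
```

Keeping the rejected records together with the reason for rejection, instead of silently dropping them, lets a curator review and repair them later.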
The fourth step is data storage, which involves the persistence and management of data in a manner that meets the needs of applications that require quick data access. For over forty years, Relational Database Management Systems (RDBMS) have been the major solution for data storage. Data storage integrates both software and hardware systems and includes information held in applications, data warehouses, backup devices, archives, and cloud storage.
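As a small illustration of relational storage with fast lookups, the sketch below uses SQLite through Python's standard `sqlite3` module. The table name `readings`, its columns, and the function `store_and_fetch` are hypothetical, and an in-memory database stands in for a production RDBMS.

```python
import sqlite3


def store_and_fetch(readings, sensor_id):
    """Persist (sensor_id, ts, value) tuples and read one sensor back in time order."""
    conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
    try:
        conn.execute(
            "CREATE TABLE readings ("
            "sensor_id TEXT NOT NULL, ts INTEGER NOT NULL, value REAL NOT NULL)"
        )
        # The index supports quick access by sensor, ordered by timestamp.
        conn.execute("CREATE INDEX idx_readings_sensor ON readings (sensor_id, ts)")
        conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", readings)
        conn.commit()
        cursor = conn.execute(
            "SELECT ts, value FROM readings WHERE sensor_id = ? ORDER BY ts",
            (sensor_id,),
        )
        return cursor.fetchall()
    finally:
        conn.close()
```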
The last step is data usage, which covers the data-driven business activities that require access to the data, its analysis, and the tools needed to integrate that analysis into the business activity. Using data in business decision making improves the competitiveness of the enterprise. Such decisions may concern cost reduction, an increase in added value, or any other parameter that can be measured against performance criteria.
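Measuring parameters against performance criteria can be sketched as a simple comparison of metrics with targets. Everything in the example below, including the function name `evaluate_against_criteria`, the `at_most` and `at_least` labels, and the metric names, is a hypothetical illustration and not a prescribed method.

```python
def evaluate_against_criteria(metrics, criteria):
    """Report, per metric, whether it meets its target.

    `criteria` maps a metric name to (direction, target), where direction
    is "at_most" (lower is better) or "at_least" (higher is better).
    """
    results = {}
    for name, (direction, target) in criteria.items():
        value = metrics[name]
        if direction == "at_most":
            results[name] = value <= target
        elif direction == "at_least":
            results[name] = value >= target
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return results


# Hypothetical usage: a cost target that is met and a value target that is not.
outcome = evaluate_against_criteria(
    {"unit_cost": 4.2, "added_value": 1.3},
    {"unit_cost": ("at_most", 5.0), "added_value": ("at_least", 1.5)},
)
```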
In conclusion, the big data value chain is useful for describing the flow of data within a big data system for the purpose of generating meaningful insight.
Reference
Curry, E. (2016). The big data value chain: definitions, concepts, and theoretical approaches. In New horizons for a data-driven economy (pp. 29-37). Springer, Cham.