There are many definitions of Big Data. Tim Kraska considers Big Data to be the class of data that the technology currently in use cannot exploit with acceptable cost, time, and quality. The McKinsey Global Institute refers to Big Data as datasets whose size exceeds the capabilities of typical database applications to capture, store, manage, and analyze them. IDC focuses on obtaining value from data, extending the Big Data concept to the set of new technologies and architectures designed to extract value from large volumes and varieties of data quickly, by facilitating their capture, processing, and analysis. This last definition is perhaps the one that best represents the Big Data concept, since it combines technologies and data for the purpose of obtaining value, characterizing the data by the volume, variety, and velocity of their generation: the three V's (3Vs) by which Big Data has traditionally been characterized.
The Volume dimension is perhaps the most characteristic feature of the Big Data concept. Estimates of the data being generated indicate unprecedented growth, driven by social networks and by the mobility enabled by wireless networks and mobile telephony. This increase in data implies a change of scale, from terabytes to petabytes and zettabytes of information, making it increasingly difficult to store and analyze. However, depending on how it is used, much of this information may have a very short value life cycle, becoming obsolete very quickly. That kind of valuation is linked to the Velocity dimension.
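To put the change of scale in perspective, a minimal sketch of the unit ladder involved (the decimal SI convention of factor-1000 steps is an assumption here; binary units of factor 1024 also exist):

```python
# Decimal (SI) byte units: each step up the ladder is a factor of 1000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def human_readable(num_bytes: float) -> str:
    """Express a byte count in the largest unit whose value is >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000

# One zettabyte is a billion terabytes:
print(human_readable(10**21))       # 1.0 ZB
print(human_readable(1.5 * 10**12)) # 1.5 TB
```

The jump from terabytes to zettabytes is thus nine orders of magnitude, which is why storage and analysis strategies that work at one scale break down at the next.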
The velocity with which data are created has increased considerably, requiring a correspondingly rapid response in their processing and analysis. This response velocity is needed to cope with data obsolescence: data are generated so quickly that what was valid moments before can become obsolete; hence the distributed and parallel processing of the technologies that support the Big Data concept. On the other hand, a data analyst needs to identify, for each application, which data have a very short value life cycle and which have a longer one; this determination is fundamental to making profitable and optimal use of the data, increasing the accuracy and quality of the results.
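The idea of discarding data whose value life cycle has expired can be sketched as a simple freshness filter; the record structure and the time-to-live threshold below are illustrative assumptions, not part of any particular Big Data stack:

```python
import time
from dataclasses import dataclass

@dataclass
class Record:
    payload: str
    created_at: float  # Unix timestamp of generation

def fresh_records(records, ttl_seconds: float, now=None):
    """Keep only records whose value life cycle (ttl) has not yet expired."""
    now = time.time() if now is None else now
    return [r for r in records if now - r.created_at <= ttl_seconds]

# Example: with a 60-second life cycle, only the recent record survives.
t0 = 1_000_000.0
batch = [Record("stale", t0 - 300), Record("recent", t0 - 10)]
print([r.payload for r in fresh_records(batch, ttl_seconds=60, now=t0)])  # ['recent']
```

In a real streaming system the threshold would be application-specific, which is exactly the analyst's determination described above.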
Variety in Big Data stems from the diversity of data types and the different sources from which data are collected. Data may be structured, semi-structured, or unstructured, and come from sources such as text and image files, web data, tweets, sensor data, audio, video, click streams, log files, and so on. This diversity is part of the richness that the Big Data concept entails. However, this potential richness increases the degree of complexity of both storage and processing and analysis.
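The three data types can be illustrated with a small sketch (the sample values are invented for illustration): a CSV row follows a fixed schema, JSON is self-describing but flexible, and free text has no schema at all, so structure must be inferred.

```python
import csv
import io
import json

# Structured: fixed schema, e.g. a CSV row with known columns.
structured = io.StringIO("user_id,age\n42,31\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing but with a flexible schema, e.g. JSON.
semi = json.loads('{"user_id": 42, "tags": ["mobile", "web"]}')

# Unstructured: free text; any structure (here, a naive word count)
# must be extracted by analysis rather than read from a schema.
unstructured = "Sensor 7 reported an anomalous reading at 14:02."
word_count = len(unstructured.split())

print(rows[0]["age"], semi["tags"][0], word_count)  # 31 mobile 8
```

Each step down this ladder (structured → semi-structured → unstructured) demands more processing effort to extract value, which is the complexity cost mentioned above.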
One of the characteristics associated with data quality is the veracity of the data. Veracity can be understood as the degree of trust that can be placed in the data for a given use. Within the characterization of Big Data, veracity is regarded as its fourth dimension, and it is of great importance for a data analyst, since the veracity of the data determines the quality of the results and the confidence placed in them. A high volume of information, created at very high speed, based on structured and unstructured data, and coming from a great variety of sources, makes it inevitable to question its degree of veracity. Therefore, depending on the application, veracity may be essential, or it may become an act of trust without being vital.
From the point of view of harvesting and exploiting data, the Value dimension represents the most relevant aspect of Big Data. It is observed that, as the volume and complexity of the data increase, their marginal value decreases considerably, owing to the difficulty of exploiting them.

[Figure: Marginal value of the data]
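The diminishing marginal value described above can be sketched with one illustrative model, assuming (purely for illustration) that total value grows logarithmically with volume, so each additional unit of data adds less value than the one before:

```python
import math

def total_value(volume: float) -> float:
    """Illustrative assumption: total value grows logarithmically with volume."""
    return math.log(1 + volume)

def marginal_value(volume: float, delta: float = 1.0) -> float:
    """Value added by the next `delta` units of data at a given volume."""
    return total_value(volume + delta) - total_value(volume)

# The same extra unit of data is worth far less at large volumes:
print(marginal_value(10))      # ~0.087
print(marginal_value(10_000))  # ~0.0001
```

Any concave value curve would show the same qualitative behavior; the logarithm is only one convenient choice.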
Facilitating the exploitation of data to obtain value remains the fundamental objective of Business Intelligence, and now of Big Data technologies. Increasing the marginal value of data is one of the current technological challenges: extracting that value quickly, immediately, and precisely, ahead of the competition. Thus the evolution of the dimensions of Big Data passes from an academic interpretation of three dimensions (volume, variety, and velocity), through the analyst's view, where the veracity of the data appears as a fundamental dimension with respect to the quality of the results, to the manager's view, where the interpretation of value becomes basic to decision making.
Finally, social networks, together with the immediacy of wireless networks and mobile telephony, new cloud storage services, and so on, have led to an ever-increasing volume of data, generated very quickly, coming from few or many information sources, whose veracity is difficult to verify and whose period of validity may not be very long. Faced with such scenarios, and as the experience of Internet-based companies shows, learning to see them not as a difficulty but as a competitive advantage is one of the current challenges in implementing the technology associated with the Big Data concept.