Big data - the new challenge facing business

Today there are many sources that create large volumes of data. On the one hand, these are the people themselves, who are able to generate information on their own using the Internet and social networks. On the other hand, with the advent of the Internet of Things it is not only users that generate...

Full description

Saved in:
Bibliographic Details
Published in:Економічний вісник Донбасу
Date:2016
Main Author: Tashkova, M.
Format: Article
Language:English
Published: Інститут економіки промисловості НАН України 2016
Subjects:
Online Access:https://nasplib.isofts.kiev.ua/handle/123456789/114921
Tags: Add Tag
No Tags, Be the first to tag this record!
Journal Title:Digital Library of Periodicals of National Academy of Sciences of Ukraine
Cite this:Big data - the new challenge facing business / M. Tashkova // Економічний вісник Донбасу. — 2016. — № 4 (46). — С. 164–167. — Бібліогр.: 7 назв. — англ.

Institution

Digital Library of Periodicals of National Academy of Sciences of Ukraine
_version_ 1859829298908626944
author Tashkova, M.
author_facet Tashkova, M.
citation_txt Big data - the new challenge facing business / M. Tashkova // Економічний вісник Донбасу. — 2016. — № 4 (46). — С. 164–167. — Бібліогр.: 7 назв. — англ.
collection DSpace DC
container_title Економічний вісник Донбасу
description Today there are many sources that create large volumes of data. On the one hand, these are the people themselves, who are able to generate information on their own using the Internet and social networks. On the other hand, with the advent of the Internet of Things it is not only users that generate information, but also most of the devices that are used daily. The power of big data technologies involves working and making business decisions using the maximum amount of data. The more data, the more substantiated analyses and forecasts. The challenge now is - what technologies these large volumes of data could be managed with. Сьогодні існує багато джерел, які створюють великі обсяги даних. З одного боку, - це самі люди, які здатні генерувати інформацію самостійно, використовуючи Інтернет і соціальні мережі. З іншого боку, з появою Інтернету речей ними стають не тільки користувачі, які генерують інформацію, але й більшість пристроїв, які використовуються користувачами щодня. Сила технологій переробки великих даних передбачає роботу і прийняття бізнес-рішень, використовуючи максимальну кількість даних. Чим більше даних, тим більше обґрунтовані аналізи і прогнози. Виклик полягає у такому: за допомогою яких технологій переробка цих великих обсягів даних може здійснюватися. Сегодня существует множество источников, которые создают большие объемы данных. С одной стороны, это сами люди, которые способны генерировать информацию самостоятельно, используя Интернет и социальные сети. С другой стороны, с появлением Интернета вещей ими становятся не только пользователи, которые генерируют информацию, но и большинство устройств, которые используются пользователями ежедневно. Мощь технологий обработки больших данных подразумевает работу и принятие бизнес-решений, используя максимальное количество данных. Чем больше данных, тем более обоснованными будут анализы и прогнозы. Вызов заключается в следующем: с помощью каких технологий переработка этих больших объемов данных может осуществляться.
first_indexed 2025-12-07T15:31:50Z
format Article
fulltext M. Tashkova 164 Економічний вісник Донбасу № 4(46), 2016 UDC 311.21:004 M. Tashkova, PhD (Economics), Head Assistant, D. A. Tsenov Academy of Economics, Bulgaria BIG DATA – THE NEW CHALLENGE FACING BUSINESS It is almost certain that in the following years Big Data will be among the leading information technolo- gies when discussing IT trends. Moreover, it is one of the pillars of the ‘third platform’1 as defined by IDG. While in a sense this term is new, large volumes of data are not infrequently handled in computer processing. Companies are aware that data – internal and external – is an important source of self-knowledge, which would help them to improve their business processes and productivity. Therefore, they are looking for technolo- gies that will enable them to collect and analyze this data. This makes the issue of information technologies designed to manage large volumes of data both topical and important in the future. Specifics of Big Data The term ‘Big Data’ has been in use quite recently. ‘Big Data’ was first mentioned in 1997 in the paper ‘Ap- plication-Controlled Demand Paging for Out-of-Core Visualization’, presented at the eighth Conference on Visualization organized by IEEE2. In a very short time it gained popularity, but big data and related technolo- gies became popular after 2008. The catalyst of this pop- ularity is the thematic issue of the Nature Journal in 2008, which was entirely devoted to big data. The initi- ative was followed by a number of prominent journals which came out with separate issues devoted to Big Data3. In 2011, Gartner analysts emphasized Big Data in their traditional report. In it they noted that big data technologies focus attention, but even the IT industry it- self cannot predict what potential lies in them. Several years later, these technologies turned from emerging into promising for the world of technology. Today there are many sources that create large vol- umes of data. On the one hand, these are the people themselves, who are able to generate information on their own using the Internet and social networks. An 1 The ‘Third Platform’ concept was presented by the IDG analysts in 2012 and the aim was to highlight the global trans- formation of information technologies. The four technologies gaining popularity – mobile applications and devices, cloud ser- vices, analyses of large volumes of data and social networks – underlie the platform. 2 The idea of ‘Big Data’, however, emerged considerably earlier. In 1975 the Japanese Ministry of Posts and Telecommu- nications started a quantitative study of the information flow in Japan, but the idea for this quantification was proposed as early as 1969. 3 These were CACM (2008); TheEconomist (2010); Science (2011). 4 McKinsey& Company, Big data: The next frontier for innovation, competition, and productivity [Electronic resource]. – Available at: http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation. 5 ЕВ – exabyte of data, where 1 EB =1018 B. 6 The Library of Congress of the USA stores 235 TB of data. example of how much information is created today and at that globally is the fact that the information generated all through 2000 was less than the information generated in one minute during the past 2016, and nearly 90% of the files available in the world were created in the last two years. On the other hand, with the advent of the Internet of Things it is not only users that generate information, but also most of the devices that are used daily. Sensors are a great source of data. Only in half an hour the sen- sors of jet engines generate about 10 TB of data. Roughly the same data streams are generated by sensors installed on drilling oil rigs. In order to make the above stated more vivid, it should be added that worldwide4: − corporate users store nearly 7 EB5 of infor- mation and personal users – 6 EB annually; − 30 billion new sources of information are pub- lished in the social network Facebook each month; − one of the leading sources of big data is the Twitter short message service, through which data of about 8 TB per day is created, despite the restrictions on message length (140 characters); − the mobile operators networks worldwide serve more than 5 billion phones and manage the audio-video stream generated by them; − only US companies from 15 sectors create data that is larger than the data in the Library of Congress6. All this logically leads to the question – what is meant by the term ‘big data’? A simple question with many answers, since different users see ‘big data’ dif- ferently. For one company, a large volume is 10 TB, while for another it may be 100 TB. And does only the volume determine whether certain data is big? In order to give a simple definition of the nature of big data the paradigm Big Data ‘5V’ is usually referred M. Tashkova 165 Економічний вісник Донбасу № 4(46), 2016 to. According to it data that has simultaneously at least two of the following specific features1 is considered big data: − Volume of data. The possibility to handle large volumes of information is a key feature of Big Data, and the sources for generating information are different – business transactions, social media, information from sensors, data transmitted from machine to machine; − Velocity of data accumulation. No less im- portant parameter in big data is the frequency of their creation and change, as well as the speed of processing them and obtaining the results in near real-time; − Variety of data. Big data is not always struc- tured. It can also include video, audio, email, unstruc- tured documents, messages from social services and me- dia. That is why organizing it in relational databases is in fact already very difficult; − Veracity of data. This refers to the purity and authenticity of the information generated. Big data must have the necessary reliability in order for the analysis to be accurate. This is achieved by pre-filtering of data to overcome the noise and anomalies in it; − Value of data. This feature renders how effec- tive data processing is in relation to investments made since the realization of infrastructure, the systems for storage and processing of big data are a relatively large expense for companies. In a synthesized form, however, big data could be defined as hardware and software products for pro- cessing huge volumes of structured and unstructured data, which are characterized by great diversity. Using different methods of processing and analyzing data new hypotheses and models are discovered, which could be used effectively to optimize the business processes of companies. IT for big data management It is a fact that large volumes of data are generated today and this is an upward trend. The challenge now is – what technologies these large volumes of data could be managed with. The popular technologies for handling big data involve: • NoSQL databases NoSQL technologies gained recognition after the large IT manufacturers – Google with Big Table; Face- book with Cassandra; Amazon with SimpleDB; Twitter and SourceForge with MongoDB; Yahoo with Sherpa; Adobe with Hbase – got interested in them. These IT manufacturers practically need to overcome the limita- tions of relational databases and to move to a new model of data organization and storage, as unlike relational da- tabases, NoSQL is characterized by: 1 The paradigm passed through the 3V and then 4V stages, and in less than ten years two new features characterizing big data were added. In its study ‘Big and open data: A growth engine or a missed opportunity’ the Warsaw Institute for Economic Research added these two new features in order to measure the need for the ‘Big Data’ accumulation in the EU economy. − absence of a predefined schema. With NoSQL there is no need to know in advance what data will be stored in the database and hence it is not compulsory to create a schema. This allows NoSQL to maintain dy- namic data changes as it is not tied to a specific struc- ture; − easy scalability. NoSQL has distributed and fault-tolerant architecture as the information sites are stored on several servers. The advantage of NoSQL is that it can easily be expanded horizontally by adding ad- ditional servers without using additional logic applica- tions for this; − finer control over available information. According to their purpose NoSQL databases can be: − Key-value. These databases provide generally one operation – retrieving a unit value through its key. − Column. In these databases the records are stored in columns rather than in rows as in relational da- tabases. − Graph. Here each element can be connected to an unlimited number of relationships; − Document. In document-based databases the values in the primary key value are referred to as a doc- ument. An identifier is used in place of the key. NoSQL databases share common principles, but solve various needs. Therefore a certain NoSQL is as- signed to perform one or more specific tasks, but the core functionality is implemented with a relational data- base or another NoSQL database. • Apache Hadoop The Hadoop platform is designed to organize dis- tributed processing of large volumes of data, and uses the model of separation and collection, i.e. each task is divided into smaller parts and each part of the set is per- formed on a separate node of the cluster. Hadoop is writ- ten in Java and consists of different components. The main ones are: − MapReduce is a combination of two interre- lated functions performing data processing on a particu- lar totality. First the Map-function is performed, which reads data from an input file, performs the necessary fil- tering and transformation, then generates a set of input records consisting of data and its assigned keys. Each of the Map programs operates independently of the others on its node in the cluster. Its task is to retrieve data, search and sort. Reduce has the opposite task – to unite, summarize, filter or modify the data processed and to record the results. − Hadoop Distributed File System (HDFS) is a distributed file system for data storage. HDFS is de- signed to store very large files with streaming access M. Tashkova 166 Економічний вісник Донбасу № 4(46), 2016 data patterns. It is also designed to work on clusters of inexpensive hardware, ensuring data availability with- out losses, even when some servers do not work. − Pig and Hive – SQL-like languages for con- structing MapReduce applications on large volumes of data. Pig requires MapReduce and HDFS, while Hive – Data Warehouse and HDFS. The use of Hadoop technologies in handling large volumes of data reduces the time for their processing and their equipment costs, increases sustainability and performs painless horizontal scalability. Applied aspects of big data in business A year ago, the major users of big data technolo- gies were IT giants such as Facebook or Yahoo, willing to analyze their Web users’ routing information. Today these technologies are used by any company success- fully positioned in the market. Modern electronic users readily leave their personal data in the digital space. Their purpose is to obtain certain information or to gain access to the desired content. Through forums, social networks, online polls, they make their likes and dislikes public. Using this data by means of the big data tools allows for carrying out various business analyses and forecasts. The major economic industries that are in- creasingly interested in the possibilities of big data at the moment are: • commerce. Or rather retailing in realizing sales to the end user. Here, the use of big data is directed to: − automatic forecasting of consumer demand; − optimal planning of promotional campaigns; − conducting an effective pricing policy and mar- keting strategies; − responding to any market fluctuation if neces- sary. • banks. In banks big data is used in: − assessing the creditworthiness of a bank’s cus- tomer; − offering banking products personally; − receiving information promptly; − preventing suspicious transactions. • telecommunications. Here big data is used for: – customer segmentation; – studying the preferences and assessing the prof- itability of different user groups; – managing customer loyalty. The power of big data technologies involves work- ing and making business decisions using the maximum amount of data. The more data, the more substantiated analyses and forecasts. In this regard some good prac- tices in the use of big data can be presented. • Amazon. Creating and changing business mod- els. 1 The grounds for this are the patents, which are registered by Amazon. 2 Weaknesses of random sampling – up to 3 % error; lack of detail. 3 Data warehouse (DW) is often called online analytical processing (OLAP). Until recently shopping from Amazon was the typ- ical online shopping. The power of big data, however, made it possible to offer customers an electronic shop assistant through which information about each transac- tion with each buyer is stored. This information is used in real time to identify and offer what the customer needs. Amazon is expected to provide a service for pre- delivery of goods1. It is used to send goods for which the Big Data analyst has forecasted that the customer will also pay. • Facebook and Google. Data collection and analysis. Facebook and Google have turned big data collec- tion and analysis into business models. Both giants use the available data to attract analysts and advertisers. Much of the revenues of the companies are namely from advertising and sale of data to carry out market research. The difference between the business models of the two IT giants is in the data source. With Facebook this is the personal information posted by users in the social net- work. With Google the data is from the free services that the corporation offers its users such as GoogleSearch, Gmail, YouTube, GoogleTalk, GoogleDocs, Google+, GoogleMaps. Conclusion The problem is not that organizations create huge volumes of data, but the fact that they cannot use them optimally. A large part of this data is not structured and the traditional relational databases do not have tools to process them effectively. Taking into account the fre- quent data update, it could be concluded that an alterna- tive to the traditional methods of data analysis today is the technologies to manage big data. Knowing and using those leads to the following advantages: • the possibility to analyze huge volumes of in- formation, unlike the currently known approach to draw conclusions on limited data (excerpts)2; • management of information in its actual state and not of purified (ideal) data3; • finding out the interdependences in the data and not looking for a specific causality. Big data is and will continue to be important for the business and the IT sector. The combination of social data, mobile applications and CRM records, allows to create forecasts and successful models of corporate be- haviour. Using the technologies for processing big data, companies get the IT tools which make it possible to: • analyze large volumes of data related to: – profitability and customer behaviour; – operational analyses. • create predictive models for: – the customer’s market interests; M. Tashkova 167 Економічний вісник Донбасу № 4(46), 2016 – the volumes of production and sales of goods and services; – the creditworthiness of corporate and personal customers. • optimize business processes based on predic- tive models. References 1. Atanasova, G. et al. Tehnologichni aspekti na modela ‘3Vs’ za predstavyane na golemi obemi ot danni. // Scientific Works of the University of Ruse, 2014, Vol. 53, series 6.1. 2. Krasteva, N. Platformata Apache Hadoop v1.0 priema predizvikatelstvoto na ‘go- lemite danni’ // CIO, Issue 2-2012. 3. Leboeuf, K. The 5 vs of Big Data: predictions for 2016. 4. http://www.ex- celacom.com/resources/blog/the-5-vs-of-big-data-pre- dictions-for-2016. 5. McKinsey & Company, Big data: The next frontier for innovation, competition, and productivity http://www.mckinsey.com/insights/busi- ness_technology/big_data_the_next_frontier_for_inno- vation. 6. Yossi, A. What happens in an Internet minute? How to capitalize on the Big Data explosion, 2015 http://www.excelacom.com/resources/blog/what-hap- pens-in-an-internet-minute-how-to-capitalize-on-the- big-data-explosion. 7. Apache Hadoop, http://ha- doop.apache.org/. Ташкова М. А. Великі дані – новий виклик, що стоїть перед бізнесом Сьогодні існує багато джерел, які створюють великі обсяги даних. З одного боку, – це самі люди, які здатні генерувати інформацію самостійно, вико- ристовуючи Інтернет і соціальні мережі. З іншого боку, з появою Інтернету речей ними стають не тільки користувачі, які генерують інформацію, але й більшість пристроїв, які використовуються користу- вачами щодня. Сила технологій переробки великих даних передбачає роботу і прийняття бізнес-рішень, використовуючи максимальну кількість даних. Чим більше даних, тим більше обґрунтовані аналізи і прогнози. Виклик полягає у такому: за допомогою яких технологій переробка цих великих обсягів да- них може здійснюватися. Ключові слова: великі дані, обсяг, швидкість, різноманітність, достовірність, вартість. Ташкова М. А. Большие данные – новый вы- зов, стоящий перед бизнесом Сегодня существует множество источников, которые создают большие объемы данных. С одной стороны, это сами люди, которые способны генери- ровать информацию самостоятельно, используя Ин- тернет и социальные сети. С другой стороны, с по- явлением Интернета вещей ими становятся не только пользователи, которые генерируют инфор- мацию, но и большинство устройств, которые ис- пользуются пользователями ежедневно. Мощь тех- нологий обработки больших данных подразумевает работу и принятие бизнес-решений, используя мак- симальное количество данных. Чем больше данных, тем более обоснованными будут анализы и прог- нозы. Вызов заключается в следующем: с помощью каких технологий переработка этих больших объе- мов данных может осуществляться. Ключевые Слова: большие данные, объем, ско- рость, разнообразие, достоверность, стоимость. Tashkova M. A. Big data – the new challenge facing business Today there are many sources that create large vol- umes of data. On the one hand, these are the people themselves, who are able to generate information on their own using the Internet and social networks. On the other hand, with the advent of the Internet of Things it is not only users that generate information, but also most of the devices that are used daily. The power of big data technologies involves working and making business de- cisions using the maximum amount of data. The more data, the more substantiated analyses and forecasts. The challenge now is – what technologies these large vol- umes of data could be managed with. Keywords: big data, NoSQL, hadoop, volume, ve- locity, variety, veracity, value. Received by the editors: 02.12.2016 and final form 28.12.2016
id nasplib_isofts_kiev_ua-123456789-114921
institution Digital Library of Periodicals of National Academy of Sciences of Ukraine
issn 1817-3772
language English
last_indexed 2025-12-07T15:31:50Z
publishDate 2016
publisher Інститут економіки промисловості НАН України
record_format dspace
spelling Tashkova, M.
2017-03-19T12:24:45Z
2017-03-19T12:24:45Z
2016
Big data - the new challenge facing business / M. Tashkova // Економічний вісник Донбасу. — 2016. — № 4 (46). — С. 164–167. — Бібліогр.: 7 назв. — англ.
1817-3772
https://nasplib.isofts.kiev.ua/handle/123456789/114921
311.21:004
Today there are many sources that create large volumes of data. On the one hand, these are the people themselves, who are able to generate information on their own using the Internet and social networks. On the other hand, with the advent of the Internet of Things it is not only users that generate information, but also most of the devices that are used daily. The power of big data technologies involves working and making business decisions using the maximum amount of data. The more data, the more substantiated analyses and forecasts. The challenge now is - what technologies these large volumes of data could be managed with.
Сьогодні існує багато джерел, які створюють великі обсяги даних. З одного боку, - це самі люди, які здатні генерувати інформацію самостійно, використовуючи Інтернет і соціальні мережі. З іншого боку, з появою Інтернету речей ними стають не тільки користувачі, які генерують інформацію, але й більшість пристроїв, які використовуються користувачами щодня. Сила технологій переробки великих даних передбачає роботу і прийняття бізнес-рішень, використовуючи максимальну кількість даних. Чим більше даних, тим більше обґрунтовані аналізи і прогнози. Виклик полягає у такому: за допомогою яких технологій переробка цих великих обсягів даних може здійснюватися.
Сегодня существует множество источников, которые создают большие объемы данных. С одной стороны, это сами люди, которые способны генерировать информацию самостоятельно, используя Интернет и социальные сети. С другой стороны, с появлением Интернета вещей ими становятся не только пользователи, которые генерируют информацию, но и большинство устройств, которые используются пользователями ежедневно. Мощь технологий обработки больших данных подразумевает работу и принятие бизнес-решений, используя максимальное количество данных. Чем больше данных, тем более обоснованными будут анализы и прогнозы. Вызов заключается в следующем: с помощью каких технологий переработка этих больших объемов данных может осуществляться.
en
Інститут економіки промисловості НАН України
Економічний вісник Донбасу
Management of Innovations
Big data - the new challenge facing business
Великі дані - новий виклик, що стоїть перед бізнесом
Большие данные - новый вызов, стоящий перед бизнесом
Article
published earlier
spellingShingle Big data - the new challenge facing business
Tashkova, M.
Management of Innovations
title Big data - the new challenge facing business
title_alt Великі дані - новий виклик, що стоїть перед бізнесом
Большие данные - новый вызов, стоящий перед бизнесом
title_full Big data - the new challenge facing business
title_fullStr Big data - the new challenge facing business
title_full_unstemmed Big data - the new challenge facing business
title_short Big data - the new challenge facing business
title_sort big data - the new challenge facing business
topic Management of Innovations
topic_facet Management of Innovations
url https://nasplib.isofts.kiev.ua/handle/123456789/114921
work_keys_str_mv AT tashkovam bigdatathenewchallengefacingbusiness
AT tashkovam velikídanínoviiviklikŝostoítʹperedbíznesom
AT tashkovam bolʹšiedannyenovyivyzovstoâŝiiperedbiznesom