Where Big Data analytics systems are used and how to implement them

The areas of activity with the highest demand for data analytics, both predictive and prescriptive, are the following:

  • Medicine – making a diagnosis based on disease symptoms, identifying factors that provoke disease, determining the propensity to become ill in the future, forming recommendations and prescribing drugs to treat and prevent illnesses.
  • Advertising and marketing – measuring the effectiveness of promotional campaigns, identifying the most effective channels and formats for presenting information (personalized targeting), building recommendation systems, generating demand based on users' interests and online behavior, predicting and preventing customer churn (Churn Rate), and optimizing pricing.
  • Insurance and lending – determining the exact amount of compensation or credit and scoring the client. For example, this is already implemented in a joint project between banks and Yandex, in which banks evaluate the solvency of a potential borrower based on the history of their search queries.
  • Industry – identifying key factors that affect product quality and the performance of production processes, predicting equipment failures, scheduling preventive inspections and equipment repairs, forecasting product demand, optimizing production capacity utilization and warning of future emergencies.
  • Finance and security – detection and prevention of fraudulent operations (anti-fraud systems), detection of malicious programs and data leakage cases.
  • Human resource management (HR) – identifying key factors that influence employee competencies, creating a professional competency model, forecasting layoffs, preventing professional burnout and workplace conflicts.

Implementing a Big Data analytics system is a complex, multi-stage project that is often carried out as part of business digitalization. Prescriptive analytics sits at the top of the pyramid and relies on the levels beneath it: predictive, diagnostic, and descriptive. Therefore, to make optimal data-driven management decisions, an organization must first accumulate enough relevant data to train Machine Learning algorithms correctly.

Some analytical tasks can be solved with modern BI tools, either commercial platforms (Oracle Data Mining, SAP BusinessObjects Predictive Analysis, SAP Predictive Maintenance and Service, IBM Predictive Insights) or open-source solutions (KNIME, Orange, RapidMiner). In practice, many enterprises that have embarked on digital transformation build their own Big Data analytics systems. They combine a variety of Big Data technologies, for example Apache Hadoop for storing information (in HDFS or HBase), Kafka for collecting data from various sources, and Spark or Storm for fast analytical processing of streaming information. In particular, this is how the recommendation system of the streaming service Spotify is implemented, which we have described separately. Thus, organizing predictive and, even more so, prescriptive data analytics is one of the key challenges of business digitalization.
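To make the layering of analytics levels more concrete, here is a minimal Python sketch based on the equipment-maintenance use case from the industry bullet. It is only an illustration under stated assumptions: the sensor readings, the failure threshold, and all function names (describe, fit_line, prescribe) are hypothetical and do not come from any of the platforms mentioned above. The descriptive step summarizes past readings, the predictive step fits a simple least-squares trend, and the prescriptive step turns the forecast into a recommended action.

```python
from statistics import mean

# Hypothetical toy data: hours in service and vibration level for one machine.
hours = [100, 200, 300, 400, 500]
vibration = [1.0, 1.5, 2.0, 2.5, 3.0]
FAILURE_THRESHOLD = 4.0  # assumed vibration level at which failure becomes likely


def describe(values):
    """Descriptive level: summarize what has already been observed."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def fit_line(xs, ys):
    """Predictive level: least-squares line ys ~ slope * xs + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def prescribe(slope, intercept, current_hours, threshold, margin_hours=100):
    """Prescriptive level: recommend an action based on the forecast."""
    if slope <= 0:
        # No upward trend, so the threshold is never reached in this model.
        return "continue operation"
    hours_at_threshold = (threshold - intercept) / slope
    remaining = hours_at_threshold - current_hours
    if remaining <= margin_hours:
        return "schedule inspection"
    return "continue operation"


if __name__ == "__main__":
    print(describe(vibration))
    slope, intercept = fit_line(hours, vibration)
    print(prescribe(slope, intercept, hours[-1], FAILURE_THRESHOLD))
```

In a real system each level would run on far larger data and on the storage and streaming stack described above; the point of the sketch is only that the recommendation step cannot exist without the forecast, and the forecast cannot exist without accumulated historical data.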