Predictive Analytics translates Big Data into meaningful, usable business information. Numerous techniques can be used for Predictive Analytics, but 10 lead the way. The Data Warehousing Institute surveyed 373 companies to find out which Predictive Analytics models and algorithms are used most. The results:
Decision trees and linear regression are the most common. Both methods are straightforward and relatively easy to understand. Linear regression, widely used in statistics, models the relationship between variables by fitting a line to the observed data. It uses the past relationship between variables to predict future values. For example, the price of a product might be related to demand.
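To make the line-fitting idea concrete, here is a minimal sketch of ordinary least squares with a single predictor, written in plain Python. The price and demand figures are invented for illustration and do not come from the survey; the helper names `fit_line` and `predict_demand` are likewise hypothetical.

```python
# Hypothetical observations (illustrative only): unit price and units sold.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
demand = [10.0, 8.0, 6.0, 4.0, 2.0]


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(prices, demand)


def predict_demand(price):
    """Use the fitted line to estimate demand at a price not yet observed."""
    return slope * price + intercept


print(f"demand = {slope:.2f} * price + {intercept:.2f}")
print("estimated demand at price 6:", predict_demand(6.0))
```

Real data rarely falls exactly on a line, so in practice the fitted line is an approximation and its predictions carry error, especially outside the observed range.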
Decision trees are often used for prediction because they are also fairly easy to understand, even by a non-statistician. A decision tree is a supervised learning approach that uses a branching, tree-like structure to model specific target variables or outcomes of interest. For instance, an outcome might be leave or stay; respond to or ignore a promotion; buy or not buy.
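The branching idea can be sketched in a few lines of plain Python. The example below grows a small tree for a leave-or-stay outcome by repeatedly choosing the split that minimises Gini impurity, which is one common criterion among several. The customer records, the two features (months as a customer, support tickets), and all function names are assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split until groups are pure or max_depth is reached."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority outcome
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the branches from the root until a leaf (an outcome) is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical customers: (months as a customer, support tickets filed).
rows = [(2, 5), (3, 4), (4, 6), (24, 1), (30, 0), (36, 2)]
labels = ["leave", "leave", "leave", "stay", "stay", "stay"]

tree = build_tree(rows, labels)
print(tree)
print(predict(tree, (1, 3)), predict(tree, (28, 1)))
```

Part of the appeal for non-statisticians is visible in the printed tree: each node is a plain question about one variable, so the path to a prediction can be read off directly.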
Clustering and time series are also popular. In fact, relative to clustering, time series analysis seems to be more popular among those investigating the technology than among those actually using it. This might be because clustering is very useful in market segmentation, and marketing and sales are popular areas for predictive analytics among current users.
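As an illustration of clustering applied to market segmentation, the following is a minimal k-means sketch in plain Python. It covers only clustering, not time series analysis. K-means is one clustering method among many and is not named in the survey summary; the customer figures, the fixed starting centroids, and the function names are all assumptions chosen to keep the example deterministic.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        segments = assign(points, centroids)
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, s in zip(points, segments) if s == i]
            # Keep the old centroid if no point was assigned to it.
            new_centroids.append(mean_point(members) if members else old)
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Hypothetical customers: (monthly spend in hundreds, store visits per month).
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
             (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Starting centroids are picked by hand here; real implementations usually
# choose them randomly or with a seeding heuristic.
centroids, segments = kmeans(customers, [customers[0], customers[3]])
print(segments)
print(centroids)
```

Each resulting segment groups customers with similar behaviour, which is the kind of output a marketing team could target with different campaigns.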
Ensemble modeling is not widely used (yet). In ensemble modeling, predictions from a group of models are combined to generate more accurate results. Only a small percentage of respondents cited ensemble modeling as the most popular technique used, but it can be powerful and should garner more attention in the future.
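One simple way to combine a group of models is majority voting, sketched below in plain Python. The three rule-based "models" and the customer fields are hypothetical stand-ins for trained models; practical ensembles typically combine learned models through methods such as bagging or boosting, which this sketch does not implement.

```python
from collections import Counter


# Three deliberately simple, hypothetical models for a buy / not buy outcome.
def model_visits(customer):
    return "buy" if customer["visits"] >= 3 else "not buy"


def model_history(customer):
    return "buy" if customer["past_purchases"] >= 1 else "not buy"


def model_email(customer):
    return "buy" if customer["opened_email"] else "not buy"


def ensemble_predict(models, customer):
    """Majority vote: return the outcome predicted by the most models."""
    votes = Counter(model(customer) for model in models)
    return votes.most_common(1)[0][0]


models = [model_visits, model_history, model_email]

frequent_visitor = {"visits": 5, "past_purchases": 0, "opened_email": True}
lapsed_customer = {"visits": 1, "past_purchases": 2, "opened_email": False}

print(ensemble_predict(models, frequent_visitor))
print(ensemble_predict(models, lapsed_customer))
```

An odd number of models avoids ties when there are only two outcomes. The vote tends to help most when the individual models make different kinds of mistakes, so that one model's error is outvoted by the others.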
Association Rule Learning