Data Science and Analytics Defined

I’ve spoken to many seasoned market research professionals, and there seems to be some confusion about what these terms mean and how they are applied. My aim here is to simplify the definition and application of data science and analytics for the benefit of market researchers.

Analytics is the discovery and communication of meaningful patterns in data. It makes use of information technology, statistics and mathematical algorithms to develop knowledge, to quantify performance or to make predictions. It uses the insights gained from this process to recommend action or to guide decision making.

Analytics is best thought of as a research procedure for decision making, not simply as isolated tools or steps in a process.

Data science, in general terms, is the extraction of knowledge from data through the use of mathematics, statistics, and computer science.

For operational purposes, it is sometimes helpful to break down the analytic procedure into eight basic components:

  1. Defining Objectives
  2. Data Collection
  3. Data Preparation and Cleaning
  4. Model Building
  5. Model Evaluation
  6. Interpretation
  7. Scoring New Data or Simulations Using the Model
  8. Communication of Results and Implications to Decision Makers

Another, and perhaps more common, way to look at analytics is as a set of statistical procedures, essentially steps 4-7 in the process outlined above.
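To make steps 3 through 7 of the outline above more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data are invented, the function names (clean, fit_line, r_squared, score) are hypothetical, and a one-variable least-squares line stands in for whatever model a real project would call for.

```python
# Illustrative sketch only: invented data, hypothetical function names.

def clean(rows):
    """Step 3 (data preparation): drop records with a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_line(rows):
    """Step 4 (model building): least-squares fit of y = intercept + slope * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(rows, intercept, slope):
    """Step 5 (model evaluation): share of variance in y explained by the line."""
    mean_y = sum(y for _, y in rows) / len(rows)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in rows)
    ss_tot = sum((y - mean_y) ** 2 for _, y in rows)
    return 1 - ss_res / ss_tot


def score(x, intercept, slope):
    """Step 7 (scoring new data): predict y for a case the model has not seen."""
    return intercept + slope * x


# Made-up records: (store visits per month, monthly spend)
raw = [(1, 12.0), (2, 19.0), (3, None), (4, 41.0),
       (5, 48.0), (None, 30.0), (6, 62.0)]

data = clean(raw)
intercept, slope = fit_line(data)
fit = r_squared(data, intercept, slope)
predicted_spend = score(7, intercept, slope)

# Step 6 (interpretation): the slope is the estimated change in monthly
# spend associated with one additional visit per month.
print(f"slope={slope:.2f}, R^2={fit:.3f}, predicted spend at 7 visits={predicted_spend:.2f}")
```

Steps 1, 2 and 8 (objectives, data collection and communication) do not appear in the code, which is one reason analytics is better thought of as a research procedure than as a set of tools.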

There are countless methods for analysing data. The following list provides some general categories of statistical methods (with brief illustrations given in parentheses):

  • Descriptive and Exploratory Analysis (frequencies, means, bar charts)
  • Models that Predict (predicting consumption frequency of new customers)
  • Models that Explain (identifying brand choice drivers)
  • Analysis of Cross-Sectional Data (data collected at a single point in time)
  • Analysis of Longitudinal or Time Series Data (data collected at several points in time)
  • Models with Quantitative Dependent Variables (monthly spend)
  • Models with Categorical Dependent Variables (product user/non-user)
  • Time-to-Event Models (customer churn analysis)
  • Methods that Group Variables (factor analysis of attribute ratings)
  • Methods that Group Cases (cluster analysis of consumers)
  • Text Mining (analysis of social media conversations)
  • Simulations and Forecasts (sales forecasts under various marketing mix scenarios)
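As a small illustration of the first category in the list above, descriptive analysis, the following Python sketch computes frequencies and group means using only the standard library. The respondent records are invented, and the function names (frequencies, mean_spend_by_status) are hypothetical.

```python
from collections import Counter
from statistics import mean

# Made-up survey records: (product usage status, monthly spend)
respondents = [
    ("user", 40.0),
    ("non-user", 0.0),
    ("user", 55.0),
    ("user", 25.0),
    ("non-user", 5.0),
]


def frequencies(records):
    """Count and proportion of respondents in each usage group."""
    counts = Counter(status for status, _ in records)
    total = len(records)
    return {status: (n, n / total) for status, n in counts.items()}


def mean_spend_by_status(records):
    """Average monthly spend within each usage group."""
    groups = {}
    for status, spend in records:
        groups.setdefault(status, []).append(spend)
    return {status: mean(values) for status, values in groups.items()}


freq = frequencies(respondents)
means = mean_spend_by_status(respondents)

for status, (n, share) in freq.items():
    print(f"{status}: n={n}, share={share:.0%}, mean spend={means[status]:.2f}")
```

Summaries like these are usually the first step before any of the modelling categories in the list are attempted.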

I trust you will find the information above helpful and look forward to your input.

This information is attributed to Kevin Gray, of Data Science for Marketing, and Jeff Leek, author of “Simply Statistics”.