
CPG Companies Fishing for Answers in Data Lakes


What is a data lake? A data lake is a centralized repository that allows you to store all structured and unstructured data at any scale.

CPGs are looking for better ways to turn their data into insights, which is leading them to explore data lakes.


Data Lakes

A data lake is a massive repository for all organizational data.

Data lakes create the ability to understand what data is in the lake through crawling, cataloging, and indexing. Businesses that extract value from data lakes will outperform peers.
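To make crawling, cataloging, and indexing concrete, here is a minimal sketch in Python. It assumes the lake is simply a directory tree of raw files. The function names (`build_catalog`, `index_by_format`) and the catalog fields are hypothetical, chosen for illustration only, and do not describe any particular product's catalog.

```python
from pathlib import Path


def build_catalog(root):
    """Crawl `root` and return one catalog entry per file found.

    Assumption: the "lake" is a plain directory tree of raw files.
    """
    catalog = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            catalog.append({
                # Path relative to the lake root, with forward slashes.
                "path": path.relative_to(root).as_posix(),
                # File extension stands in for the data format.
                "format": path.suffix.lstrip(".").lower() or "unknown",
                "bytes": path.stat().st_size,
            })
    return catalog


def index_by_format(catalog):
    """Index catalog entries so files can be looked up by format."""
    index = {}
    for entry in catalog:
        index.setdefault(entry["format"], []).append(entry["path"])
    return index
```

Calling `index_by_format(build_catalog("/path/to/lake"))` on a hypothetical lake directory would return a dictionary that maps each format, such as `csv` or `json`, to the files stored in that format.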

A 2017 Aberdeen study showed businesses that implemented data lakes outperformed similar companies by 9% in organic revenue growth.

Data in a data lake is not prearranged; instead, it is kept in the raw format in which it enters company systems. Because data remains in its native format, a bigger (and quicker) stream of data is available for analysis. Data lakes’ flexibility and size allow for much more storage of raw data streams.

For example, data scientists may not know exactly what they are looking for, but can find and access data quickly, regardless of format. Data can be collected and later sampled for ideas and tapped for real-time analytics.

Deloitte notes that data lakes are special, in part, because they provide business users with direct access to raw data without significant IT involvement. Because they store the full spectrum of an enterprise’s data, data lakes can break down the challenge of data silos that often confuse internal users.

Given the overwhelming volume of data available, it can take some serious manual work to fish out any insights. The problem is that data lakes can become stagnant data repositories, polluted by the continued addition of big data, instead of meaningful sources of business growth.

Like any environment, a data lake requires regular cleansing and movement to remain a productive source of growth.

[Image: Blacksmith TPM analytics screenshot]

CPG organizations must have a way to refresh, extract and harmonize data into meaningful, easily usable segments for analysis and future decision making.

Since that process has historically been resource intensive, many CPGs are hesitant to adopt practices and solutions that use data for advanced analytics or predictive planning. While these hesitations are understandable, the reality is that solutions can automate harmonization, offer easy-to-use cleansing functionality, and provide experience-based guidance to your team.

Promotion analysis and planning are areas with great revenue-generating potential, but they are heavily affected by data quality and usage challenges.

Any forecasting or ROI calculation is highly dependent on third-party syndicated data, which is considered unreliable by most. Furthermore, the inability to easily cleanse and harmonize your syndicated data with shipment and spending data leaves companies with a flawed understanding of their (costly) trade investment.
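As a rough illustration of what harmonizing syndicated, shipment, and spending data involves, the Python sketch below joins three record sets on a cleaned product and week key, then computes a simple promotion ROI. Every field name (`baseline_units`, `promo_units`, `unit_margin`, `trade_spend`), the sample values, and the ROI definition (incremental profit minus trade spend, divided by trade spend) are assumptions made for this example, not a description of how any specific solution works.

```python
def normalize_key(product, week):
    """Harmonize identifiers: trim whitespace and upper-case product codes."""
    return (product.strip().upper(), week.strip())


def harmonize(syndicated, shipments, spending):
    """Join three record lists on (product, week); keep only complete rows."""
    merged = {}
    for rows in (syndicated, shipments, spending):
        for row in rows:
            key = normalize_key(row["product"], row["week"])
            fields = {k: v for k, v in row.items() if k not in ("product", "week")}
            merged.setdefault(key, {}).update(fields)
    # Hypothetical set of fields needed for the ROI calculation below.
    required = {"baseline_units", "promo_units", "unit_margin", "trade_spend"}
    return {k: v for k, v in merged.items() if required <= v.keys()}


def promotion_roi(row):
    """Assumed ROI: (incremental profit - trade spend) / trade spend."""
    incremental_units = row["promo_units"] - row["baseline_units"]
    incremental_profit = incremental_units * row["unit_margin"]
    return (incremental_profit - row["trade_spend"]) / row["trade_spend"]


# Illustrative records only; note the inconsistent product code formats.
syndicated = [{"product": " sku-1 ", "week": "2024-W01",
               "baseline_units": 100, "promo_units": 160}]
shipments = [{"product": "SKU-1", "week": "2024-W01",
              "shipped_units": 170, "unit_margin": 2.0}]
spending = [{"product": "Sku-1", "week": "2024-W01", "trade_spend": 80.0}]

rows = harmonize(syndicated, shipments, spending)
print(promotion_roi(rows[("SKU-1", "2024-W01")]))  # prints 0.5
```

The three sources only line up once the product codes are normalized; without that step the join would produce three incomplete rows and no ROI figure at all.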

This is in part because, while the data lake may have the correct data sets, there is no tool pulling the data together in an operable and understandable way. A trade promotion optimization solution centralizes all of your data to provide a single version of the truth. That singular view serves as the foundation for analysis, predictive planning, and constraint-based modeling, and can directly affect your organization's revenue generation.

CPGs that invest in data lake technology have taken an important step in their journey toward better understanding their business.

But the sheer volume of data makes quickly understanding and applying learnings almost impossible. CPGs are prioritizing data-driven decision making, predictive analytics, and AI, all potentially revenue-impacting objectives, but are paralyzed by their inability to turn data lakes into a stream of intelligence.




When companies put off investment in advanced analytics capabilities because “the data is already in order,” they miss opportunities to make smarter decisions, be more competitive, and sustain better results. In reality, most companies don’t have a data problem; they have a problem turning data into understandable, accurate, and actionable intelligence.

While it is easy to see yourself drowning in a data lake, it’s just as easy to invest in an optimization solution that will enable you to ride the waves to optimal outcomes.