Goldilocks Criteria: Customer Data Platforms

This is the second in a series of posts designed to help managers think about business requirements for selecting enterprise vendors and software.  Please also check out my first post on Business Intelligence platforms.

Customer Data Platforms (CDPs) inspire a lot of confusion.  Best to begin with what they are and what they are not.

CDPs are:

  • A centralized platform for storing all of the user data about all of your users
  • A platform that can be used by non-technical employees to activate / act upon user data
  • A safe haven for secure user data management, compliant with the latest regulations and best practices
  • A bridge to combine your user data with external data sets
  • A rules engine for user segment management.  Want to build a cohort of users who opened an email and clicked on a Facebook ad?  No problem
  • A platform for collaboration, breaking down individual business unit data silos
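
To make the rules-engine bullet concrete, here is a minimal sketch of a segment rule in Python.  It is not how any particular CDP works internally; the `UserProfile` class, the event names, and the `in_segment` helper are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    user_id: str
    events: set = field(default_factory=set)  # names of events seen for this user


def in_segment(profile, required_events):
    """Return True when the user has performed every required event."""
    return required_events <= profile.events


# Hypothetical rule: opened an email AND clicked a Facebook ad.
ENGAGED = frozenset({"email_open", "facebook_ad_click"})

users = [
    UserProfile("u1", {"email_open", "facebook_ad_click"}),
    UserProfile("u2", {"email_open"}),
    UserProfile("u3", {"facebook_ad_click", "page_view"}),
]

# Only users matching both conditions land in the cohort.
cohort = [u.user_id for u in users if in_segment(u, ENGAGED)]
print(cohort)
```

A real platform would let a non-technical user build the same rule through a UI rather than code, which is the point of the Administrative Usability criterion below.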

CDPs are not:

  • CRM solutions designed for sales or support teams to manage intricate customer interactions and workflows
  • DMP solutions focused only on anonymous cookied / IDed users (though they are coming close to covering this feature set)
  • Tag Management solutions designed to wire up various vendor libraries and SDKs.  Many CDPs started out as Tag Managers, but I think a historic focus on tag management is a disadvantage when trying to be a best-of-breed CDP.  Just because you were a horse, it doesn’t make you a better car

And why do people integrate Customer Data Platforms?  Centralizing user data, strengthening the intelligence around it, and democratizing access to it should impact business goals across the board, from decreased systems costs to improved conversion rates.

The basic ins and outs of a CDP.

Given all of this, let’s review my Goldilocks (“just right”) criteria for picking a Customer Data Platform:

Connectivity and I/O

Customer Data Platforms are only as good as the pipes that bring data in and out of them.  You want many different roads into the platform, from plug-and-play SDKs / libraries to full read / write APIs.  You also want pre-built connectors into the most popular data sources (CRM, event ticketing platforms, etc.) and data activation endpoints (ad networks, social media channels, email service providers, etc.).

Security and Compliance

As we’ve learned over and over recently, user data security and governance are no easy task.  Outsourcing this to a vendor may be a hard decision to make, but it’s often much harder to manage and maintain secure and compliant user data solutions internally.  You want a partner with a track record of secure data management, comparable customers that you trust, and no fear of security audits from your team or others.  You also want a partner that is quick to adapt to changing industry rules and regulations (ex. GDPR).  Internally, you want robust rules, roles, and permission settings to partition off sensitive data for specific users and use cases.

Administrative Usability

CDPs are designed to democratize data-driven activities for non-technical users.  As such, you should require a modern, usable UX for non-engineers to get busy with the data.  Some providers require light scripting for segment creation or segment activation.  No good.  Best to trial the administrative user experience with some of your least technical colleagues before pulling the trigger on a vendor solution.

Identity Management and Identity Resolution

There are a number of features in this functionality bucket, but in short, you want your CDP to consolidate literally all of your available user data into a singular user profile.  This might mean partnering with a device or identity-graph provider to stitch emails to cookies.
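
For the curious, the stitching idea can be sketched in a few lines of Python.  This is only an illustration under my own assumptions: the `stitch` function and the `cookie:` / `email:` / `device:` identifier prefixes are hypothetical, and I’m using a plain union-find grouping, which is one common way to merge linked identifiers, not necessarily what any vendor or identity-graph provider does.

```python
def stitch(links):
    """Group identifiers that are linked directly or transitively (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    # Each link says "these two identifiers belong to the same person".
    for a, b in links:
        parent[find(a)] = find(b)

    # Collect every identifier under its group's root to form one profile each.
    profiles = {}
    for ident in parent:
        profiles.setdefault(find(ident), set()).add(ident)
    return list(profiles.values())


# Hypothetical observed links between identifiers.
links = [
    ("cookie:abc", "email:ann@example.com"),
    ("email:ann@example.com", "device:ios-1"),
    ("cookie:xyz", "email:bob@example.com"),
]

for profile in stitch(links):
    print(sorted(profile))
```

Note that `cookie:abc` and `device:ios-1` end up in the same profile even though they were never linked directly; the shared email bridges them.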

This also means flexible data storage limits so that you don’t have to discard potentially valuable user data.  At Viacom, a certain % of the US population visits our websites or volunteers their email addresses.  That said, our TV signals reach the homes and mobile devices of a much larger user base.  We need systems that allow us to pull all of our data together without worrying about a vendor’s storage costs or historic architectural limits.

Real Time Segmentation Updates

Your users’ profiles and segments should update in real time as they take actions on and offline.  Many CDPs update segments hourly – which is no bueno.  If a user views / interacts with your website or an online ad, their profile should update immediately so they can be activated toward the next event in your funnel.  Many of the CDPs that came from legacy industries (again, Tag Management) are just not architected to support real-time updates.  This is of growing importance.
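
Here is a rough sketch of the difference in Python, assuming a toy in-memory engine.  The `SegmentEngine` class, the segment name, and the event names are all hypothetical.  The point is simply that segment membership is re-evaluated inside the ingest call for each event, rather than by an hourly batch job.

```python
from collections import defaultdict


class SegmentEngine:
    def __init__(self, rules):
        self.rules = rules                # segment name -> set of required event names
        self.events = defaultdict(set)    # user_id -> event names seen so far
        self.members = defaultdict(set)   # segment name -> user_ids in the segment

    def ingest(self, user_id, event):
        """Record one event and re-evaluate that user's segments right away."""
        self.events[user_id].add(event)
        for name, required in self.rules.items():
            if required <= self.events[user_id]:
                self.members[name].add(user_id)


engine = SegmentEngine({"cart_abandoner": {"add_to_cart", "ad_click"}})
engine.ingest("u1", "ad_click")
engine.ingest("u1", "add_to_cart")  # u1 qualifies as soon as this call returns
print(engine.members["cart_abandoner"])
```

A production system would do this over a streaming pipeline at far greater scale, but the question to ask a vendor is the same: does membership change when the event arrives, or when the next batch runs?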

Integrated and Automated Machine Learning

The next generation CDPs go further than data storage and segment storage.  The best support unstructured data and use machine learning to automatically create useful user segments.  Some even crawl and categorize your content (pages, emails, posts) to find interesting patterns and apply those as dynamic segments to your users.  This is the type of thinking you want to see from your Customer Data Platform partners.

The platform should also support custom data science models – whether run internally within the CDP or through easy and performant read / write APIs.

ML fanboy alert – this is one of my very top considerations when reviewing partners.

Smart Orchestration

Getting your users through a funnel from start to conversion is never easy.  Your CDP should monitor and track your progress and where possible add dynamic intelligence to usher users through funnel events and towards your target goal.  The alternative is intricate manual workflow creation and management, which is hard to set up and even harder to manage against other initiatives.

This dynamic orchestration allows for truly personalized, omni-channel user journeys – experiences and messages that change based on the individual user’s profile properties and the best likelihood of conversion.

Industry Momentum

There is a ton of investment in the CDP space right now.  You’ll want to pick a horse with recent major funding from venture capital or a strategic investor.  Many of these companies will not be in business in two years’ time.


Goldilocks Criteria: Selecting Modern Business Intelligence (BI) Platforms

This is the first of a new series of posts dedicated to helping people select data tools and infrastructure. I’ve listed out the ‘perfect’ feature set for a dream product. Of course, these features rarely exist in a single solution, but if they did, I’d use it! First up: business intelligence platforms.

For the purposes of this discussion, let’s define Business Intelligence (BI) platforms as data platforms that allow non-technical business users to explore, prepare, and present data germane to their work. BI tools are crucial for making informed decisions.

Turns out this is ancient stuff. The term “business intelligence” dates back to the ’60s – the 1860s – when Richard Devens coined it in his Cyclopædia of Commercial and Business Anecdotes to describe how Sir Henry Furnese, a banker, gained an advantage over his competitors by using and acting upon the information surrounding him.

We’ve also come a long way since the 1960s, evolving from cumbersome on-premise solutions to nimble, cloud-based platforms. Today’s BI tools are designed for accessibility, allowing professionals without a technical background to glean insights easily.

BI tools serve a specific purpose: analyzing and reporting on data consolidated from various sources. Some BI platforms sit on top of separate data warehouses and some modern platforms serve as the data aggregator/data store as well. BI tools pack a ton of functionality but are typically narrow-scoped. They don’t execute tasks but rather provide insights that inform actions taken on other platforms.

You will also see BI in the form of Embedded Analytics within various tools – like your CRM system or your Web Analytics platform. Generally, Embedded Analytics helps inform micro insights, like which email subject line performed best, as opposed to providing a holistic view of data across multiple use cases.

So how does this work in practice? A great use case for BI platforms is to create easy-to-digest OKR dashboards for your company, teams, and individuals. Your BI platform should allow teammates from different business units to pull up live views of their progress towards their outcomes/goals… anytime/anywhere… on their phones… without support from business analysts or IT.

OK, enough preamble. Here are the goldilocks (aka “just right”) criteria I look for in BI platforms:

Integrated data warehouse

Traditionally, BI tools sit on top of separate data platforms managed by engineering teams. More recently, a new class of products has emerged that allows you to upload/connect to your data without engineering support. I find this to be a huge advantage as it allows moderately technical users to get up and running without distracting/relying on external resources. (Self-service also leads to challenges with data governance but that’s another story.)

As an example, imagine easily joining together all of the spreadsheets you store in Google Drive / Dropbox with live data connections to Google Analytics / Meta Analytics / financial data / more and then exploring and visualizing this data as you choose. That’s what these new platforms do, all without the help of data engineering resources.

Data engineering for dummies

Some of the best data scientists I’ve worked with estimate that they spend 80-90% of their time on data hygiene before they can begin analysis and exploration.

What does that mean for BI tools? Any functionality that supports easy data manipulation for the sake of improved clarity is awesome. That means joining data together via drag and drop, changing data types with a click, and deduplicating rows without writing SQL are all a huge value add, extending the range of users who can go deep with the data without external assistance.
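
To show what those clicks are replacing, here is a small Python sketch of the same three chores done by hand: changing data types, deduplicating rows, and joining two data sets. The sample rows, the `clean` and `join_region` helpers, and the choice to dedupe on `order_id` are all assumptions made up for this example.

```python
# Messy export: everything is text, and order 1 appears twice.
raw_orders = [
    {"order_id": "1", "email": "Ann@Example.com ", "total": "19.99"},
    {"order_id": "1", "email": "ann@example.com", "total": "19.99"},
    {"order_id": "2", "email": "bob@example.com", "total": "5"},
]
# Second data set to join against, keyed by normalized email.
regions = {"ann@example.com": "US", "bob@example.com": "UK"}


def clean(rows):
    """Cast types, normalize emails, and keep the first row per order_id."""
    seen, out = set(), []
    for row in rows:
        order_id = int(row["order_id"])        # change data type: text -> integer
        if order_id in seen:                   # deduplicate on order_id
            continue
        seen.add(order_id)
        email = row["email"].strip().lower()   # normalize so the join key matches
        out.append({"order_id": order_id, "email": email, "total": float(row["total"])})
    return out


def join_region(rows, lookup):
    """Left join: attach a region to each order, or None when there is no match."""
    return [{**r, "region": lookup.get(r["email"])} for r in rows]


orders = join_region(clean(raw_orders), regions)
for order in orders:
    print(order)
```

None of this is hard, but it adds up, and every step is a place where a non-technical user would otherwise have to wait on someone else.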

Live data! From the cloud! On your phone!

Data that arrives attached to an email is DOA. This is one of my absolute pet peeves. Further, once people begin discussing and editing the data offline, multiple inaccurate versions/views of the same data set become commonplace.

BI tools need to pull from a live backend at all times. When I pull up a link to view a dashboard, the data should be (pseudo) real-time, up-to-date, and clearly timestamped with the date of the last run.

This also means the platform should be mobile-centric. Old-timers still want their landscape printouts, but there is nothing more powerful than conversing with colleagues and pulling up live data views on your phone à la minute. 

AI / ML ready

I don’t want to overstate this one as we’re in the very earliest of innings, but your platform should have the foundation for supporting automated machine-learning-driven insights. You may not find these immediately valuable (they rarely are out of the box) but in a few years, you’ll be getting voice alerts when your data spikes unpredictably in ways you may not have imagined. There is no sense in investing in a platform that is not actively working on automated data insights.

As a start, I’d like to see my platform present basic statistics around the data that I’ve onboarded. This means simple distribution and correlation reports. As you play with these statistics you’ll be able to more easily wrap your arms around the data at hand, steering deeper analysis and insights. Simple predictive analytics is another good baby step before full-blown AI.
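
As a point of reference, the kind of basic distribution and correlation report I have in mind looks something like this Python sketch. The numbers are invented, and `summarize` and `pearson` are hypothetical helper names; the correlation is the standard Pearson coefficient computed by hand.

```python
import statistics
from math import sqrt


def summarize(values):
    """Basic distribution report for one numeric column."""
    return {
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up weekly figures for illustration only.
ad_spend = [100, 200, 300, 400, 500]
signups = [12, 25, 31, 48, 55]

print(summarize(signups))
print(round(pearson(ad_spend, signups), 3))
```

A coefficient near 1 or -1 flags a pair of columns worth a closer look; it says nothing about cause, which is where the deeper analysis comes in.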

This all said, you separately need to invest in training your teams to take advantage of these statistical insights. Leveling up the data fluency of your team is always more valuable than standing up a whiz-bang technology solution.

Narrative & collaboration focused

A perfect platform would support metrics-backed storytelling – and not just the sharing of pie charts. That means as a product owner, I can use a BI platform to explore a set of data and then build a coherent, sharable narrative around it. That could manifest itself as an online presentation with live data at different altitudes, supported by text, images, video, and other added insights. It also means that I should be able to draw / pin annotations within the data itself.

Further, the presentation should support active conversation around what’s being presented. Unlimited named user accounts, threaded comments, open annotations, next-step action items, @ mentions, and more are a natural fit here.

Governance gone wild

Sad to say, this is critical. Like supercritical. Like, as soon as you create your second dashboard you need extreme governance. Otherwise you’ll never find it again or know if the data set that powers it is up to date, approved, and official.

I’ve seen smart approaches here and they center around clear labeling of the data, its origins, similar/duplicative data, and more. Having an easy way to validate data as “best” or “official” helps too. Ultimately, ML/AI will be a huge help in this arena.

An integrated, dynamic “data catalog” that shows you the breadth of your data, its lineage, stamps of approval, and error reporting is also a must-have.
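
To give a feel for what a catalog entry might track, here is a minimal Python sketch. The `CatalogEntry` fields, the one-day freshness window, and the `trusted` helper are all my own assumptions for illustration, not a description of any real product’s schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class CatalogEntry:
    name: str
    owner: str
    sources: list = field(default_factory=list)  # lineage: upstream data set names
    official: bool = False                       # stamp of approval
    last_refreshed: datetime = None

    def is_stale(self, now, max_age=timedelta(days=1)):
        """Stale when never refreshed or refreshed longer ago than max_age."""
        return self.last_refreshed is None or now - self.last_refreshed > max_age


def trusted(catalog, now):
    """Names of data sets that are both approved and fresh."""
    return [e.name for e in catalog if e.official and not e.is_stale(now)]


# Fixed clock so the example is reproducible.
now = datetime(2024, 1, 2, tzinfo=timezone.utc)
catalog = [
    CatalogEntry("sales_daily", "finance", ["crm_export"], True, now - timedelta(hours=2)),
    CatalogEntry("sales_daily_copy", "marketing", ["sales_daily"], False, now - timedelta(hours=2)),
    CatalogEntry("old_report", "finance", [], True, now - timedelta(days=30)),
]

print(trusted(catalog, now))
```

The unofficial copy and the month-old report both drop out, which is exactly the filtering you want a dashboard author to get for free.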

User-level data FTW

BI tools typically play in the aggregated, anonymous altitude. You can see how all your site visitors behave, customer acquisition by location, sales by campaign, etc. Data is viewed on the content, page, campaign, and location level – rarely at a user level. In a perfect world, a graph model would be deployed at the atomic data layer allowing pivots by the above altitudes but also on the user level.

A new breed of system called Customer Data Platforms is jumping into the fray here, promising a single view of the user. These CDPs are being leveraged today by Marketing and Sales teams, but the potential of applying this granular view to more typical BI use cases is immense. Perhaps CDPs are the topic of the next post in this series…