Unlock Hidden Revenue with AI-Powered Upsell Scoring and Census GPT Columns

Katy Yuan
6 December 2024

What if you could use AI to identify which customers have the potential to spend more — before they even know it themselves? With Census GPT Columns, this is easy to do in just a few steps, and we’re giving you the full GPT prompts and SQL to do it yourself. Let’s dive in!

Check out the video walkthrough below, or read on for a step by step:

Overview

In our example, we’re a gym with four tiers of membership, with $29.99 being the lowest and $129.99 being the highest. Our goal is to identify which of our customers have potential to become premium customers, aka high-paying customers, so we can unlock hidden revenue. 

What makes premium customers unique? And more importantly, which lower-paying customers are exhibiting identical patterns and have potential to upgrade?

First, we need to understand premium customer characteristics like age, location, visit frequency, and personal training usage to create an Upsell Score, then we can send targeted marketing communications to individuals with the highest scores. 

Process

  1. Identify our highest paying premium customers and the traits commonly associated with those customers.
  2. Analyze all of our customers by comparing each individual’s traits to create an Upsell Score.
  3. Operationalize the score and our customers' interests by sending them to marketing channels like email, SMS, and advertising.

Step 1: Identifying Traits in Premium Customers

The first step is straightforward - understanding what distinguishes premium customers from others. Using SQL and Census GPT columns, here's how to structure data to figure out the likelihood of each trait being associated with the highest tier plan:


WITH user_data AS (
         SELECT
             user_id,
             LOWER(gender) AS gender,
             age,
             age_range,
             LOWER(location) AS location,
             membership_plan,
             membership_price,
             fitness_goal,
             personal_training_usage,
             visit_frequency,
             preferred_workout_type,
             loyalty_status,
             signup_date
         FROM census_on_census.production.gym_users_data
         WHERE signup_date BETWEEN
             DATEADD('day', -120, CURRENT_DATE())
             AND DATEADD('day', -90, CURRENT_DATE())
      ),


      grouped_user_data AS (
         SELECT
             membership_price,
             TO_JSON(
                 ARRAY_AGG(
                     OBJECT_CONSTRUCT(
                         'user_id', user_id,
                         'attributes', OBJECT_CONSTRUCT(
                             'gender', gender,
                             'age_range', age_range,
                             'location', location,
                             'fitness_goal', fitness_goal,
                             'personal_training_usage', personal_training_usage,
                             'visit_frequency', visit_frequency,
                             'preferred_workout_type', preferred_workout_type,
                             'loyalty_status', loyalty_status,
                             'signup_date', signup_date
                         )
                     )
                 )
             ) AS user_traits_json
         FROM user_data
         GROUP BY membership_price
      ),


      pivoted_data AS (
         SELECT
             1 AS join_key,
             MAX(CASE WHEN membership_price = 49.99 THEN user_traits_json END) AS "49.99 Plan",
             MAX(CASE WHEN membership_price = 79.99 THEN user_traits_json END) AS "79.99 Plan",
             MAX(CASE WHEN membership_price = 29.99 THEN user_traits_json END) AS "29.99 Plan",
             MAX(CASE WHEN membership_price = 129.99 THEN user_traits_json END) AS "129.99 Plan"
         FROM grouped_user_data
      )


      SELECT *
      FROM pivoted_data;
Copy

Then create a Census GPT column to analyze these traits:


You are a data analyst. Using the customer data from the highest plan  plan and the lowest plan , create a trait-by-trait analysis showing the likelihood of being a high-value customer.


For each trait in the data:
1. List each possible value within that trait
2. Calculate the ratio of that value appearing in the high-tier vs low-tier plan
3. Sort traits by their predictive strength (strongest correlation to weakest)


Format your output as:


TRAIT: [Name of Trait]
- VALUE: [Specific Value]
 - High Plan: [%]
 - Low Plan: [%]
 - LIKELIHOOD RATIO: [X]x more likely to be high-value
 Only include traits where there is at least a 10% difference between plans.
Sort values within each trait by likelihood ratio (highest to lowest).
```


This should generate output like:


```
TRAIT: Age Range
- VALUE: 18-25
 - High Plan: 40%
 - Low Plan: 28%
 - LIKELIHOOD RATIO: 1.43x more likely to be high-value


TRAIT: Visit Frequency
- VALUE: Daily
 - High Plan: 38%
 - Low Plan: 24%
 - LIKELIHOOD RATIO: 1.58x more likely to be high-value
```
Copy
 

The technical setup is simple, but the insights it enables are powerful. When analyzing customer traits – from preferences to typical behaviors – clear patterns emerge. These aren't just random characteristics; they're powerful predictors of which customers would be most interested in and likely to benefit from premium offerings.

Step 2: Creating an Upsell Score

These insights can be transformed into a scoring system that automatically identifies customers showing premium patterns. By using GPT Columns to analyze traits and behaviors against known indicators of premium preferences, we'll create two columns for each customer: an Upsell Score and an Upsell Top Trait that indicates their top interest.

This automated analysis runs continuously, ensuring opportunities aren't missed as customer behaviors evolve. 

First we want to join the trait analysis we created in the last step to our Users table to compare each individual’s traits against their propensity to upgrade. Here's the query structure:


WITH analysis AS (
    SELECT 
        census_third_party_unique_id_column AS join_key, 
        census_third_party_result_column AS upsell_score_analysis 
    FROM 
        census.census.dataset_column_gpt_280cae77_ca31_4248_8cd5_abe1c249c1c5
),

traits AS (
    SELECT
        1 AS join_key,
        user_id,
        LOWER(gender) AS gender,
        age,
        age_range,
        LOWER(location) AS location,
        membership_plan,
        membership_price, 
        fitness_goal,
        personal_training_usage,
        visit_frequency,
        preferred_workout_type,
        loyalty_status,
        signup_date
    FROM 
        census_on_census.production.gym_users_data
    WHERE 
        membership_price != 129.99
)

SELECT 
    * 
FROM 
    traits 
LEFT JOIN 
    analysis 
USING (join_key);
Copy

 

And the GPT prompt that creates the Upsell Score and Upsell Top Trait columns for each individual:


Using only the following benchmark data, calculate the customer's upsell score by adding the likelihood ratios of their matching traits found in this analysis:


Using the traits:
AGE: 
AGE_RANGE: 
FITNESS_GOAL: 
GENDER: 
LOCATION: 
LOYALTY_STATUS: 
MEMBERSHIP_PLAN: 
MEMBERSHIP_PRICE: 
PERSONAL_TRAINING_USAGE: 
PREFERRED_WORKOUT_TYPE: 
SIGNUP_DATE: 
USER_ID: 
VISIT_FREQUENCY: 
	1.	Sum the exact likelihood ratios for each matching trait.
	2.	Identify the top matching trait from the list based on the highest likelihood ratio.

Then output only the following:
	•	Upsell Score: one word: High, Medium, or Low
	•	Top Trait: the name of the top matching trait. The complete list is:
	1.	Visit Frequency
	2.	Personal Training Usage
	3.	Loyalty Status
	4.	Gender
	5.	Fitness Goal
	6.	Age Range
	7.	Location

Output format:

Upsell Score: [High/Medium/Low]  
Top Trait: [Top_Trait_Name] 
Copy

 

Operationalizing Upsell Score in Marketing Campaigns

After you’ve identified the customers with the highest upsell potential, you want to get them into the hands of your marketing team who will send thoughtful, targeted communications.

Simply create a Census Sync to connect to 200+ marketing automation and advertising tools, such as HubSpot, Braze, Iterable, Attentive, Google Ads, and Meta Ads.

Transforming Infinite Data into Actionable Insights

Every business has valuable data about their customers, and turning oceans of data into insights is even easier with tools like LLMs and Census GPT Columns. Think of it as having a clear window into your customer base - one that reveals patterns and opportunities that would otherwise remain hidden.

Instead of casting a wide net with broad campaigns, sales and marketing teams can precisely target customers whose traits signal readiness for premium offerings. These targeted interactions lead to more meaningful conversations based on actual characteristics that indicate interest, rather than guesswork.

Beyond immediate marketing benefits, the process uncovers valuable insights about customer traits and patterns. Businesses discover common characteristics of high-value customers, identify key indicators of premium potential, and better understand what truly differentiates various customer segments. 

Looking Ahead

Today we covered a consumer fitness example, but upsell scoring can be used for any industry or business model. For a B2B company, this might be a customer who’s ready to upgrade to a higher tier plan or add on more features. For a B2C company, this might be a consumer who purchases a more expensive product or service, or adds more items to a purchase. 

Ready to try it out for yourself? Census users can get 1 million free OpenAI credits to use with GPT Columns 🎉 See if you qualify by applying here.

Start a free trial of Census or get a demo today.