Customer Value & Retention Intelligence

Case Objective:
To segment customers based on behavior and revenue contribution, track how their engagement changes over time, and visualize these dynamics using SQL, Power BI, and Python.

Technologies
- PostgreSQL (CTE, Window Functions)
- Cloud DB: Neon
- Power BI (Power Query, DAX metrics)
- Python (Pandas, NumPy, Matplotlib)
Business Value

Identify, retain and grow your most valuable customer segments. Map cohort-based retention strategies and revenue dynamics across lifecycle.
Dashboard Highlights
- RFM 3D Segment Explorer
- Chord Diagram of Segment Transitions
- Cohort Growth Ribbon
- Bubbles: Size × Growth × Revenue

Pipeline

Neon.tech

Cloud Infrastructure

Deployed an Azure-backed serverless PostgreSQL instance on Neon.
Imported a synthetic orders dataset and predefined logic for 10 RFM segments with DBeaver.

PostgreSQL

SQL Views

Defined customer cohorts by assigning each user to their first purchase quarter.
Calculated current RFM segments based on the last 6 months of customer activity.
Tracked historical RFM segments at the end of each past quarter.

Power BI

Interactive Dashboard

Connected to PostgreSQL and loaded pre-aggregated SQL views via Power Query.
Built a dynamic date dimension table for time intelligence.
Created DAX measures (L3M growth, distinct counts, and cohort-based KPIs).
Designed visual dashboards using Ribbon, Bubble, Treemap, and Chord charts.
Published the report to PBI Service with public web access and a live DB connection.

Python

Custom Visualization

Developed 3D RFM segment matrix visualizations using matplotlib.voxels.

A structured view of the full data pipeline — from CSV data ingestion to PostgreSQL view modeling and integration with PBI.

Data Flow Architecture

Quarterly Cohort in PostgreSQL

This SQL view assigns each customer to a cohort based on the quarter of their first purchase. It uses MIN(order_date) combined with DATE_TRUNC('quarter') and TO_CHAR() to label cohorts in a readable format like 2024q1_Cohort.

-- =============================================
-- View: public.customer_cohort_quarterly
-- Purpose: Assigns each customer to the quarter of their first purchase.
--          Provides both a timestamp and a formatted cohort label.
-- =============================================

CREATE OR REPLACE VIEW public.customer_cohort_quarterly AS
SELECT
    customer_id,

    -- The start of the quarter when the customer made their first purchase
    DATE_TRUNC('quarter', MIN(order_date)) AS cohort_start_date,

    -- A formatted label like "2024q1_Cohort" for use in reporting and visualizations
    TO_CHAR(DATE_TRUNC('quarter', MIN(order_date)), 'YYYY"q"Q') || '_Cohort' AS cohort_label

FROM orders
GROUP BY customer_id;

RFM (historical) in PostgreSQL

This SQL view calculates Recency, Frequency, and Monetary scores for each customer at the end of past quarters. Using a rolling 6-month window, it captures historical segment dynamics and assigns each customer to a segment based on RFM scoring logic. The result enables longitudinal analysis of customer value evolution over time.

-- =============================================
-- View: public.rfm_historical_segments
-- Purpose: Historical RFM segmentation per customer at end of each quarter
-- =============================================

CREATE OR REPLACE VIEW public.rfm_historical_segments AS

WITH
-- 🔹 Define fixed snapshot dates (end of quarters)
quarter_snapshots AS (
    SELECT DATE '2023-12-31' AS snapshot_date
    UNION ALL
    SELECT DATE '2024-03-31'
),

-- 🔹 Raw RFM metrics per customer within 6 months before each snapshot
rfm_historical_raw AS (
    SELECT
        o.customer_id,
        q.snapshot_date,
        COUNT(o.order_id) AS frequency,
        SUM(o.amount) AS monetary,
        DATE_PART(
            'day',
            AGE(q.snapshot_date::timestamp, MAX(o.order_date)::timestamp)
        ) AS recency
    FROM orders o
    JOIN quarter_snapshots q
      ON o.order_date BETWEEN (q.snapshot_date - INTERVAL '6 months') AND q.snapshot_date
    GROUP BY o.customer_id, q.snapshot_date
),

-- 🔹 Apply R, F, M scoring using NTILE partitioned by snapshot
rfm_scored AS (
    SELECT
        *,
        NTILE(3) OVER (PARTITION BY snapshot_date ORDER BY recency)        AS r_score,
        NTILE(3) OVER (PARTITION BY snapshot_date ORDER BY frequency DESC) AS f_score,
        NTILE(3) OVER (PARTITION BY snapshot_date ORDER BY monetary DESC)  AS m_score
    FROM rfm_historical_raw
),

-- 🔹 Combine scores and assign RFM code
rfm_final AS (
    SELECT
        customer_id,
        snapshot_date,
        'historical'::text AS segment_type,
        recency,
        frequency,
        monetary,
        r_score,
        f_score,
        m_score,
        CONCAT(r_score, f_score, m_score) AS rfm_code
    FROM rfm_scored
)

-- 🔹 Final output: RFM metrics and mapped segment name
SELECT
    rf.customer_id,
    rf.snapshot_date,
    rf.segment_type,
    rf.rfm_code,
    rf.recency,
    rf.frequency,
    rf.monetary,
    rf.r_score,
    rf.f_score,
    rf.m_score,
    sd.segment
FROM rfm_final rf
LEFT JOIN segment_definitions sd ON rf.rfm_code = sd.rfm_code;

Samples of Visualization in Power BI

Explore interactive dashboards that reflect real-time segment dynamics, cohort behavior, and revenue trends.

RFM Segment Matrix Rendered with Python

This matrix-style visualization of RFM segments was created using Python and Matplotlib. It provides a clear spatial view of how customer segments are distributed across Recency, Frequency, and Monetary scores.

Python Script for Custom Visuals in PBI

# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script: 

# dataset = pandas.DataFrame(F, M, R, segment)
# dataset = dataset.drop_duplicates()

# Paste or type your script code here:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.patches import Patch

#Segments to display
df = dataset[
    (dataset['segment'] == 'Champions') |
    (dataset['segment'] == 'Loyal Customers') |
    (dataset['segment'] == 'Potential Loyalists') |
    (dataset['segment'] == 'New Customers')
]

# Define the 3D grid dimensions
grid_size = (3, 3, 3)
voxels = np.zeros(grid_size, dtype=bool)
colors = np.empty(grid_size, dtype=object)

# Define segment colors
segment_color_map = {
    'Champions': '#FFEE00',
    'Loyal Customers': '#B0D0E0',
    'Potential Loyalists': '#CD7F32',
    'New Customers': '#50C878'
}

# Fill voxel grid and assign colors
for _, row in df.iterrows():
    r, f, m, segm = row['R'], row['F'], row['M'], row['segment']
    voxels[r-1, f-1, m-1] = True  # Adjust for zero-based indexing
    colors[r-1, f-1, m-1] = segment_color_map[segm]

fig = plt.figure(figsize=(9, 9))
ax = fig.add_subplot(111, projection='3d')

ax.voxels(voxels, facecolors=colors, edgecolors=None, alpha=0.63, linewidth=0.1)

ax.view_init(elev=20, azim=160)

ax.set_xlabel('Recency (R)')
ax.set_ylabel('Frequency (F)')
ax.set_zlabel('Monetary (M)')
ax.set_xlim(0,3)
ax.set_ylim(0,3)
ax.set_xticks([0, 1, 2, 3])
ax.set_yticks([0, 1, 2, 3])
ax.set_zticks([0, 1, 2, 3])

legend_patches = [Patch(facecolor=color, label=segm) for segm, color in segment_color_map.items()]
ax.legend(handles=legend_patches, bbox_to_anchor=(1.05, 1))

plt.tight_layout()
plt.show()

Explore the Code on GitHub!

I’m open to new job opportunities, projects, and collaborations — feel free to reach out.
Aleksei Ogarkov

Feel free to contact me.

Aleksei Ogarkov

Home

Customer Value & Retention Intelligence

Technologies

Business Value

Dashboard Highlights