Our Motivation

Motivation

Companies perform better when they know their customers well. A key part of that knowledge is being able to predict when customers have become disengaged or dissatisfied and, ultimately, when they are about to leave. Helping companies do so is the core focus of our project.

Our Solution

Vision

To that end, we built a customer loyalty prediction engine, which estimates which customers are at risk of leaving and when they are most likely to do so. The results are delivered to the company via a highly customizable, highly interactive dashboard, which allows an in-house marketing team or a team of AI agents to intervene before it is too late.

Demo of our Solution

Process Diagram: How Customers Interact

Our system captures each player's activity in real time through their casino card, tracking gaming sessions, food & beverage purchases, spa visits, and special events. This data flows through our inference engine to generate churn predictions, which are then delivered to the customer retention team so they can plan targeted interventions.

Transaction Flow Diagram

Technical Overview: Making Inference Come True

To deliver this, we built, tested, and deployed our prediction engine in the following steps:

Data Source

  • Real, anonymized data from Commerce Casino in Los Angeles
  • 23,000 customers over 2 years (June 2023 to June 2025)
  • Data stored in NDJSON (newline-delimited JSON) files, where each record is a single game session
  • Features: session start/end, hands played, game played, game limits
  • Derived features: session length, time of day played, time between sessions, hands/session length over periods of time, inactivity streak, active days, recency of activity, session length trends
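
As an illustration of how derived features like these can be computed from session records, here is a minimal sketch using only the Python standard library. The field names (`player_id`, `start`, `end`, `hands`), the sample records, and the cutoff date are assumptions made for the example; they are not the project's actual schema or data.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical NDJSON sample: one game session per line.
# Field names are assumptions, not the project's actual schema.
SAMPLE_NDJSON = """\
{"player_id": "p1", "start": "2024-01-01T20:00:00", "end": "2024-01-01T22:30:00", "hands": 75}
{"player_id": "p1", "start": "2024-01-04T09:00:00", "end": "2024-01-04T10:00:00", "hands": 40}
{"player_id": "p2", "start": "2024-01-02T13:00:00", "end": "2024-01-02T14:00:00", "hands": 30}
"""


def load_sessions(text):
    """Parse NDJSON text into session dicts with datetime start/end fields."""
    sessions = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rec = json.loads(line)
        rec["start"] = datetime.fromisoformat(rec["start"])
        rec["end"] = datetime.fromisoformat(rec["end"])
        sessions.append(rec)
    return sessions


def derive_features(sessions, as_of):
    """Compute per-player derived features relative to the cutoff `as_of`."""
    by_player = defaultdict(list)
    for s in sessions:
        by_player[s["player_id"]].append(s)

    features = {}
    for pid, rows in by_player.items():
        rows.sort(key=lambda r: r["start"])
        # Session length in hours.
        lengths_h = [(r["end"] - r["start"]).total_seconds() / 3600 for r in rows]
        # Time between consecutive sessions in days.
        gaps_d = [
            (b["start"] - a["end"]).total_seconds() / 86400
            for a, b in zip(rows, rows[1:])
        ]
        features[pid] = {
            "n_sessions": len(rows),
            "mean_session_hours": sum(lengths_h) / len(rows),
            "hands_per_hour": sum(r["hands"] for r in rows) / sum(lengths_h),
            "mean_gap_days": sum(gaps_d) / len(gaps_d) if gaps_d else None,
            "active_days": len({r["start"].date() for r in rows}),
            "recency_days": (as_of - rows[-1]["end"]).total_seconds() / 86400,
            "start_hours": [r["start"].hour for r in rows],  # time of day played
        }
    return features


features = derive_features(load_sessions(SAMPLE_NDJSON), datetime(2024, 1, 10))
```

Computing every feature relative to an explicit cutoff (`as_of`) is one way to keep information from after the prediction date out of the inputs.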

Data Pipeline

Model Selection & Performance

Casino Model: "Customer Baseline"

  • Rule-based segmentation, typical in casinos, using RFM (Recency, Frequency, Monetary) scores
  • Transparent and interpretable using real-world business logic
  • Fixed thresholds and ad hoc labels limit predictive precision
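
To make the idea of rule-based RFM segmentation concrete, here is a minimal sketch. The thresholds, score bands, and segment labels are all hypothetical values chosen for illustration; they are not the rules used by the casino or by this project.

```python
from datetime import date

# Hypothetical fixed thresholds; a real casino would set these from business rules.
RECENCY_DAYS = [(7, 3), (30, 2)]          # (max days since last visit, score)
FREQUENCY_VISITS = [(12, 3), (4, 2)]      # (min visits in the period, score)
MONETARY_SPEND = [(5000, 3), (1000, 2)]   # (min spend in the period, score)


def score_at_most(value, bands):
    """Return the score of the first band whose limit is >= value, else 1."""
    for limit, score in bands:
        if value <= limit:
            return score
    return 1


def score_at_least(value, bands):
    """Return the score of the first band whose limit is <= value, else 1."""
    for limit, score in bands:
        if value >= limit:
            return score
    return 1


def rfm_segment(last_visit, visits, spend, today):
    """Score a player on recency, frequency, and monetary value, then label them."""
    r = score_at_most((today - last_visit).days, RECENCY_DAYS)
    f = score_at_least(visits, FREQUENCY_VISITS)
    m = score_at_least(spend, MONETARY_SPEND)
    if r == 1:
        label = "at risk"   # long absent, regardless of past value
    elif r + f + m >= 8:
        label = "loyal"
    else:
        label = "casual"
    return (r, f, m), label


scores, label = rfm_segment(date(2025, 6, 1), visits=15, spend=6000, today=date(2025, 6, 5))
```

The sketch also shows the limitation noted above: a player 30 days absent and one 31 days absent land in different bands, even though their behavior is nearly identical.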

Logistic Regression: "Modeling Baseline"

  • Baseline ML model using LogisticRegressionCV with 5-fold cross-validation, no temporal dimensions
  • Fast to train, interpretable, identifies key player attributes
  • Ignores time structure and underperforms on sequence-driven churn
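
A baseline of this kind can be sketched with scikit-learn as follows. Only `LogisticRegressionCV` with 5-fold cross-validation comes from the description above; the synthetic data, the feature names, the scaling step, the `class_weight="balanced"` setting, and F1 as the selection metric are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-ins for per-player aggregates (assumed feature names).
recency_days = rng.exponential(20, n)
sessions_per_month = rng.poisson(6, n)
mean_session_hours = rng.gamma(2.0, 1.5, n)
X = np.column_stack([recency_days, sessions_per_month, mean_session_hours])

# Synthetic label: churn is more likely with long absence and few sessions.
logit = 0.08 * recency_days - 0.4 * sessions_per_month + 0.5
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale features, then pick the regularization strength by 5-fold CV.
model = make_pipeline(
    StandardScaler(),
    LogisticRegressionCV(cv=5, scoring="f1", class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

test_f1 = f1_score(y_test, model.predict(X_test))
# One coefficient per feature; the sign shows the direction of its effect on churn.
coefs = model.named_steps["logisticregressioncv"].coef_[0]
```

Because each player is collapsed into a single row of aggregates, the order of sessions is lost, which is the limitation the last bullet describes.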

Temporal Convolutional Network (TCN)

  • Deep convolutional model with causal, dilated 1-dimensional convolutions
  • Models long-term trends without recurrence; parallelizable and fast
  • Sensitive to hyperparameters and prone to overfitting on imbalanced classes
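
The core operation of a TCN, a causal dilated 1-dimensional convolution, can be sketched in NumPy as below. This is a single-channel illustration with hand-picked weights, not the project's network; the function names are hypothetical.

```python
import numpy as np


def causal_dilated_conv1d(x, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation], with zeros before t = 0."""
    x = np.asarray(x, dtype=float)
    pad = (len(kernel) - 1) * dilation
    # Padding on the left only means y[t] never sees x[t + 1] or later.
    padded = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        start = pad - k * dilation
        y += w * padded[start:start + len(x)]
    return y


def receptive_field(kernel_size, dilations):
    """Number of past time steps (including the current one) a stack of layers can see."""
    return 1 + (kernel_size - 1) * sum(dilations)


# With kernel [1, 1] and dilation 2, each output is x[t] + x[t - 2].
y = causal_dilated_conv1d([1, 2, 3, 4, 5], kernel=[1.0, 1.0], dilation=2)
# y is [1, 2, 4, 6, 8]

# Three layers with kernel size 2 and dilations 1, 2, 4 cover 8 time steps.
rf = receptive_field(kernel_size=2, dilations=[1, 2, 4])
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth, which is how a TCN reaches long histories without recurrence.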

Bidirectional Long Short-Term Memory (Bi-LSTM)

  • Multi-layer sequence model with forward-backward memory and attention
  • Captures long-range dependencies and temporal weighting
  • Slower training; requires tuning to avoid overfitting on rare classes
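
The "temporal weighting" step can be illustrated with attention pooling over per-time-step hidden states. The sketch below uses random arrays in place of real Bi-LSTM outputs and a random query vector in place of learned parameters; the shapes and names are assumptions for the example.

```python
import numpy as np


def attention_pool(hidden, query, mask):
    """Pool a sequence of hidden states into one vector using softmax weights.

    hidden: (T, D) hidden states, query: (D,) scoring vector,
    mask: (T,) booleans, True for real time steps and False for padding.
    """
    scores = hidden @ query
    scores = np.where(mask, scores, -np.inf)  # padded steps get zero weight
    scores = scores - scores.max()            # subtract max for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ hidden, weights


rng = np.random.default_rng(1)
T, H = 5, 3  # 5 time steps, 3 hidden units per direction (arbitrary sizes)

# Stand-ins for the forward and backward LSTM outputs at each time step.
forward_states = rng.normal(size=(T, H))
backward_states = rng.normal(size=(T, H))
hidden = np.concatenate([forward_states, backward_states], axis=1)  # (T, 2H)

query = rng.normal(size=2 * H)  # learned in a real model
mask = np.array([True, True, True, False, False])  # last two steps are padding

context, weights = attention_pool(hidden, query, mask)
```

The weights form a distribution over sessions, so they can also be inspected to see which parts of a player's history drove a prediction.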

Baseline Transformer

  • Shallow self-attention model with 2-layer encoder and positional embeddings
  • Parallel processing with strong recall and attention maps
  • Performance sensitive to data size and class imbalance without tuning
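
For readers unfamiliar with self-attention, here is a minimal single-head sketch in NumPy. The dimensions, the random weights, and the additive position embeddings are stand-ins for parameters that a real model would learn; this is not the project's encoder.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for one head over a (T, D) sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    attn = softmax(scores, axis=-1)  # row t: how much step t attends to each step
    return attn @ v, attn


rng = np.random.default_rng(0)
T, D = 6, 8  # 6 sessions, 8-dimensional embeddings (arbitrary sizes)

session_features = rng.normal(size=(T, D))
position_embeddings = rng.normal(scale=0.1, size=(T, D))  # learned in a real model
x = session_features + position_embeddings  # inject order information

w_q, w_k, w_v = (rng.normal(scale=D ** -0.5, size=(D, D)) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
```

Every time step is processed in one matrix product, which is the source of the parallelism, and the `attn` matrix is the kind of attention map mentioned above.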

Full Sequence Transformer (Selected Model)

  • Deep transformer with 5 layers, 4 heads, engineered features, and focal loss
  • Strongest overall performance across all metrics
  • Requires significant training time, sequence padding, and architectural tuning
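
Two of the ingredients named above, focal loss and sequence padding, can be sketched in plain Python. The focal loss follows the standard binary formulation; the default `gamma` and `alpha` values and the helper names are assumptions, not the project's settings.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class, y is the true label (0 or 1).
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def pad_sequences(seqs, pad_value=0.0):
    """Right-pad variable-length sequences to a common length and return a mask."""
    max_len = max(len(s) for s in seqs)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in seqs]
    return padded, mask


# A confidently correct prediction contributes far less than a confident mistake.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)

# Players have different numbers of sessions, so sequences are padded and masked.
padded, mask = pad_sequences([[1.0, 2.0, 3.0], [4.0]])
```

The `(1 - p_t)**gamma` factor shrinks the loss on easy examples, so the many easy non-churners do not dominate training; the mask lets the model ignore padded positions.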

Customer Segmentation Model

The clustering model segments casino players into behavioral groups by analyzing their gaming patterns, creating five distinct clusters:

High-Intensity Players

Frequent, long sessions

Regular Hold'Em Players

Consistent poker enthusiasts

Mixed-Game Monthlies

Monthly variety players

Infrequent Visitors

Occasional players

One-offs

Single-session players

The model incorporates table time, game preferences, and betting limits, using PCA for dimensionality reduction and K-means clustering optimized by silhouette score. This integration allows prediction of not just who will churn, but how churn patterns vary across different player segments.
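
A pipeline of this shape (PCA, then K-means with the number of clusters chosen by silhouette score) can be sketched with scikit-learn as follows. The synthetic player data, the three feature columns, the standardization step, and the search range for k are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic players drawn around three behavioral centers.
# Columns (assumed): monthly table hours, share of Hold'Em play, mean betting limit.
centers = np.array([[40.0, 0.9, 200.0], [10.0, 0.5, 40.0], [1.0, 0.2, 10.0]])
X = np.vstack(
    [c + rng.normal(scale=[2.0, 0.05, 5.0], size=(100, 3)) for c in centers]
)

# Standardize so no feature dominates, then reduce to two principal components.
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=2, random_state=0).fit_transform(X_scaled)

# Fit K-means for a range of k and keep the k with the highest silhouette score.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_reduced)
    scores[k] = silhouette_score(X_reduced, labels)

best_k = max(scores, key=scores.get)
final_labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X_reduced)
```

The resulting cluster label can then be joined to each player's churn prediction to compare churn patterns across segments.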

Key Learnings & Impact

Time Series Complexity

Successfully incorporated the richness of temporal patterns without overfitting through careful feature engineering and model architecture selection.

Performance Challenges

Overcame limitations of traditional non-neural network approaches by leveraging transformer architecture for superior pattern recognition.

Future Work

  • Model Improvements: Iteratively improve the segmentation and churn models
  • Cloud Infrastructure: Build production-quality cloud infrastructure
  • Dashboard Visualizations: Add further user-facing visualizations to the dashboard
  • Agentic AI Integration: Develop AI agents to personalize marketing interactions with churning customers, leveraging individual behavioral profiles for targeted retention strategies
  • Industry Expansion: Test generalizability of our framework across other industries with customer loyalty programs, adapting to varying data density levels

Our Contributions

Ayushi Joshi

EDA, Feature Engineering, Model Development

Edwin Figueroa

Data Pipeline, Stakeholder Communication, Visualization Design

Michael Drexler

Algorithm Design, Dashboard Integration, Performance Optimization

Michael Eisenberg

EDA, Feature Engineering, Infrastructure Setup, Website Development

All team members contributed to cross-functional reviews of EDA, modeling, infrastructure, and visualizations.

Acknowledgements

We would like to thank our professors Zona Kostic and Morgan Ames for their invaluable guidance throughout this project. Special thanks to the Commerce Casino team members who provided insights and access to the anonymized dataset that made this analysis possible.

Meet the Team

Michael Drexler

Michael Eisenberg

Edwin Figueroa

Ayushi Joshi
