We Predicted 9 of Our Last 11 Churns. Here's the Signal Stack.

Product · Akif Kartalci · 16 min read
Tags: churn signal stack, churn prediction signals, early warning signs churn, customer health score signals, product usage churn indicators, churn prevention, B2B SaaS
Of the last eleven accounts we tracked as at-risk, nine churned within the window we predicted. Two didn’t.

The nine aren’t the impressive part. Each one tripped a combination of SaaS churn prediction signals we’d been refining for about two years: usage frequency drops, feature breadth regression, champion disengagement, commercial pressure building at renewal. The churn signal stack caught them 60 to 90 days before the cancellation email arrived. That lead time changed the save math completely: an analytics platform we worked with saw save rates go from 14% to 51% after adding composite health scoring, because CS had real runway to intervene rather than scrambling two weeks before contract expiration.

The two we missed taught us more. Not because the stack failed to fire. It fired late and weakly on both. Those churns lived in a layer product telemetry can’t reach: one involved a champion being restructured out of the company six weeks before renewal, and one involved a finance-driven consolidation decision made in a board meeting we had no visibility into. No Mixpanel event captures either of those. No Amplitude dashboard surfaces them.

Between 70% and 80% of customers who eventually churn show measurable warning signals at least 30 days before they cancel. The gap between “we lost this account” and “we saw it coming” isn’t a data problem. It’s a signal architecture problem. Most teams are watching one or two signal types, usually login frequency and NPS, when prediction accuracy comes from stacking five layers simultaneously.

This post covers the stack: what goes into each layer, how to weight the composite, how to build it without enterprise tooling, and what we learned from the two accounts that got through.

Why Churn Looks Unpredictable When It Isn’t

I’ve read the post-mortems. “We had no idea they were going to leave.” “Their usage looked fine until last month.” “The QBR went well.”

Every time I hear those phrases, I ask to see the data. What I find is usually the same thing: teams measuring aggregate activity (logins, tickets, page views) without any structured weighting of what those signals mean for churn probability. Or they’ve built a health score that’s really just a login frequency tracker with green/yellow/red labels on it.

The fundamental problem is signal selection. Most teams default to easy-to-collect metrics rather than predictive ones. Login frequency is easy to collect. It’s also a lagging indicator: by the time login frequency collapses, the customer has been mentally out for six to eight weeks. NPS is easy to collect and nearly useless as a standalone churn predictor. A customer who scores you an 8 and uses three features every week is a completely different risk profile from a customer who scores you an 8 and has had two users log in all quarter. The score looks the same. The churn probability doesn’t.

The research bears this out. Multi-factor models achieve roughly 79% prediction accuracy at 90 days before renewal, compared to 52% for usage-only models and 41% for support-only models. Adding a single relational layer to a behavioral model improves accuracy by 15% to 25%. A well-built four-dimensional health score outperforms a single-dimension model by 34%, per Forrester’s 2025 Customer Success Technology analysis.

These aren’t marginal gains. The difference between catching a churn at day 90 and catching it at day 10 is the difference between a real save attempt and a consolation call.

If you want the cohort-level context first (which customer segments are structurally at risk, before you go account by account), we covered that in the cohort retention analysis framework. This post picks up at the account level: once you know you have a retention problem, here’s how to see which specific accounts are about to become part of it.

The 5-Layer Signal Stack

Here’s how the stack is organized. Each layer captures a different dimension of churn risk. None of them alone is sufficient. The prediction accuracy comes from the composite.

| Layer | Category | What It Tracks | Detection Window | Starting Weight |
| --- | --- | --- | --- | --- |
| 1 | Product behavior | Login frequency, core action rate, session depth | 45-90 days | 30-35% |
| 2 | Engagement depth | Feature breadth, seat activation, collaboration | 30-60 days | 25-30% |
| 3 | Relationship | Champion tenure, CSM last live conversation, contact activity | 60-120 days | 20-25% |
| 4 | Commercial | Renewal proximity, contract trend, billing signals | 14-45 days | 10-15% |
| 5 | Conversational | Support sentiment, call language, escalation phrases | Variable | 10-15% |

These weights are a starting point. They should be calibrated against your actual churn history: run the signals backward across 12 months of churned accounts and weight toward the signals that fired first and most reliably. The more core-action-dependent your product is, the higher you’ll weight Layer 1. The more relationship-driven your retention is (enterprise accounts with frequent CS involvement), the higher you’ll weight Layer 3.
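One practical wrinkle: the starting ranges in the table don’t sum to exactly 100% (the midpoints add up to 107.5%), so they need rescaling before they go into a composite. Here’s a minimal sketch of one way to do that, assuming you take the midpoint of each range. The layer keys and the `normalized_weights` helper are illustrative names, not part of any tool.

```python
# Starting weight ranges from the table above, as (low, high) percentages.
STARTING_RANGES = {
    "product_behavior": (30, 35),
    "engagement_depth": (25, 30),
    "relationship": (20, 25),
    "commercial": (10, 15),
    "conversational": (10, 15),
}


def normalized_weights(ranges):
    """Take each range's midpoint, then rescale so the weights sum to 1.0."""
    midpoints = {layer: (low + high) / 2 for layer, (low, high) in ranges.items()}
    total = sum(midpoints.values())
    return {layer: mid / total for layer, mid in midpoints.items()}


LAYER_WEIGHTS = normalized_weights(STARTING_RANGES)
```

Once you calibrate against your own churn history, replace the midpoints with whatever your data supports and rerun the normalization.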

Layer 1: Product Behavior Signals

This is where most teams start. It’s also where they stop, which is why their models underperform.

The most predictive product behavior signal isn’t raw login frequency. It’s the rate of change in core action completion over a trailing 30-day window compared to the prior period. A customer who logs in five times a week but stopped running the report, firing the campaign, or creating the deal that represents your product’s core value delivery is more at risk than a customer who logs in twice a week but completes the core action every time.

Specific thresholds that hold up across B2B SaaS products:

  • 40% or more drop in login frequency over trailing 30 days: High risk tier. This fires roughly 45 to 60 days before most cancellations.
  • 50% or more decline in core action completions: Critical risk. Often correlates with a friction event, a team change, or a competing tool being evaluated.
  • Three consecutive weeks of declining activity trend: Even if individual week numbers look acceptable, the direction matters. An account at 70% of peak usage heading down is a worse signal than an account at 60% that has stabilized.
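The three thresholds above can be sketched as a small rule function. This is an illustration under assumptions: the input names, the tier labels ("watch" in particular), and the choice to check the core action decline first are mine, and the numbers should be recalibrated against your own churn history.

```python
def pct_drop(previous, current):
    """Fractional decline from previous to current (0.0 if no decline or no baseline)."""
    if previous <= 0:
        return 0.0
    return max(0.0, (previous - current) / previous)


def declining_streak(weekly_counts):
    """Number of consecutive week-over-week declines ending at the most recent week."""
    streak = 0
    pairs = zip(reversed(weekly_counts[:-1]), reversed(weekly_counts[1:]))
    for earlier, later in pairs:
        if later < earlier:
            streak += 1
        else:
            break
    return streak


def layer1_risk(logins_prev_30d, logins_last_30d,
                core_prev_30d, core_last_30d, weekly_activity):
    """Apply the Layer 1 thresholds; tier names are illustrative."""
    if pct_drop(core_prev_30d, core_last_30d) >= 0.50:
        return "critical"  # 50%+ decline in core action completions
    if pct_drop(logins_prev_30d, logins_last_30d) >= 0.40:
        return "high"      # 40%+ drop in login frequency
    if declining_streak(weekly_activity) >= 3:
        return "watch"     # three consecutive declining weeks
    return "ok"
```

Note that three consecutive declines need four weekly data points, oldest first, in `weekly_activity`.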

The silent failure alert belongs in this layer too. If an account is more than five business days old with zero core action completions, it should flag immediately. Activation failure is the earliest product behavior signal in the stack, and also the cheapest to fix. We covered the mechanics in detail in the 72-hour activation window framework: the accounts you don’t activate in the first three days rarely activate at all.
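The silent failure rule is simple enough to express directly. A minimal sketch, assuming business days means Monday through Friday with no holiday calendar; the function names are illustrative.

```python
from datetime import date, timedelta


def business_days_between(start, end):
    """Count weekdays (Mon-Fri) after `start`, up to and including `end`."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            days += 1
    return days


def is_silent_failure(signup_date, today, core_actions_completed):
    """Flag accounts older than five business days with zero core actions."""
    return (core_actions_completed == 0
            and business_days_between(signup_date, today) > 5)
```

If your customers span regions with different holidays, the five-day cutoff will be slightly off for some of them; for an activation alert, erring toward flagging early is probably the safer direction.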

Layer 2: Engagement Depth Signals

This layer gets the most value per unit of instrumentation. It’s also the most underused.

Customers using a single core feature churn at roughly three times the rate of customers using five or more features. More precisely: users who don’t engage with secondary features in their first 30 days are 60% more likely to churn than those who do. Breadth is a proxy for value realization. The more of your product’s surface area a customer uses, the harder you are to replace.

Three specific signals in this layer:

  • Feature breadth contraction: A customer who was using seven features six months ago and is now consistently using three is showing retrograde adoption. Contraction is almost always a symptom of something changing inside the account: a team member who owned those features left, a competing tool started handling part of the workflow, or the champion stopped pushing internal adoption.
  • Power user ratio: The percentage of licensed seats with weekly or daily activity. In most B2B SaaS products, a healthy account has 40% to 60% of seats actively engaged. Below 20% and the product is being used by one or two individuals without organizational buy-in. Those accounts churn when those individuals move on.
  • Collaborative usage decline: Most B2B products have a collaboration layer: shared reports, comment threads, exported documents, invited teammates. A decline in these signals means the product is shifting from a team tool to an individual one. Individual tools are easier to cancel because the decision involves fewer stakeholders.
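Here’s a rough sketch of the three Layer 2 checks as spreadsheet-style rules. The assumptions: features are tracked as sets of feature names, any shrink in the count is treated as contraction, and any drop in collaboration events counts as a decline. You’d likely want tolerance bands on all three in practice.

```python
def layer2_flags(features_then, features_now,
                 weekly_active_seats, licensed_seats,
                 collab_prev_30d, collab_last_30d):
    """Return the Layer 2 signals that fired for one account."""
    flags = []

    # Feature breadth contraction: fewer distinct features than six months ago.
    if len(features_now) < len(features_then):
        flags.append("feature_breadth_contraction")

    # Power user ratio: seats with weekly activity divided by licensed seats.
    ratio = weekly_active_seats / licensed_seats if licensed_seats else 0.0
    if ratio < 0.20:
        flags.append("low_power_user_ratio")

    # Collaborative usage decline: shared reports, comments, invites, exports.
    if collab_last_30d < collab_prev_30d:
        flags.append("collaboration_decline")

    return flags
```

Keeping features as sets also lets you see which ones dropped out (`features_then - features_now`), which is often the faster route to the underlying cause.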

Layer 3: Relationship Signals

This is the layer most teams don’t instrument at all. It’s also the layer where the champion departure we missed was hiding, which I’ll cover in detail below.

The relationship layer tracks what’s happening to the human connections between your team and the account. Product data tells you what people are doing. Relationship data tells you who’s doing it and whether the people who matter are still there.

Three core relationship signals:

  • Champion last live conversation date: Not email. Not an automated NPS survey. An actual conversation (call, Slack message, video meeting) between a CS or sales contact and someone at the account with real organizational authority. If that date is more than 45 days ago on a strategic account, something needs to happen before the next renewal window.
  • Contact-level activity vs. account-level activity: An account-level login count can look fine while a single user drives all of it. Segment your usage data by contact, not just account. When one contact is responsible for 80% or more of activity on a 10-seat account, you have a champion dependency risk that product telemetry won’t surface until that one person goes quiet.
  • Champion tenure alert: Track how long the primary contact has been in their role. When a champion changes through promotion, departure, or restructuring, churn probability spikes. A 2,500-account study by Kumo.ai found that when a primary power user goes inactive, churn probability increases by 8x. The $22M ARR HR tech company I’ll reference in the miss section learned this the hard way: once they built automated relationship health signals to detect champion activity changes, they caught departures 45 days before cancellation and preserved $1.8M in ARR in the first year.
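The first two relationship signals reduce to a share calculation and a date difference. A minimal sketch, assuming you can export activity counts per contact and store the last live conversation date in the CRM; the function names and input shapes are illustrative.

```python
from datetime import date


def champion_dependency(activity_by_contact, threshold=0.80):
    """Return the contact driving at least `threshold` of activity, else None."""
    total = sum(activity_by_contact.values())
    if total == 0:
        return None
    top_contact, top_count = max(activity_by_contact.items(),
                                 key=lambda item: item[1])
    return top_contact if top_count / total >= threshold else None


def conversation_is_stale(last_live_conversation, today, max_days=45):
    """True when the last real conversation is more than `max_days` old."""
    return (today - last_live_conversation).days > max_days
```

For example, `champion_dependency({"ana": 85, "ben": 10, "cy": 5})` names `"ana"` as the single point of failure on that account.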

Layer 4: Commercial Signals

This layer is the most time-bound. The signals here confirm risk rather than predict it: by the time commercial pressure signals fire, you’re in a 30 to 60 day window and need CS actively engaged.

Key commercial signals to track:

  • Renewal proximity combined with low health score: Renewal pressure amplifies existing risk. An account at 55 health score with 90 days to renewal is a very different situation from an account at 55 health score with 11 months to go. The former needs an intervention now.
  • Contract value contraction: Any downgrade request, seat reduction, or tier change is a signal. Not just for that account: contraction requests often cluster around specific customer profiles before a broader churn wave shows up in monthly metrics. When you see three downgrade requests in 30 days from a specific ICP segment, something has changed for that cohort.
  • Billing contact emergence: When someone from procurement or finance appears in your billing communications for the first time, it usually means a cost review is underway. This is one of the two silent churn signals behind our misses; I’ll detail it in the misses section.
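The contraction-clustering check (three downgrade requests in 30 days from one segment) is easy to miss by eye and easy to automate. A sketch, assuming each downgrade request is logged with a segment label and a date; the segment names in the usage example are made up.

```python
from collections import Counter
from datetime import date, timedelta


def segments_with_downgrade_cluster(downgrade_requests, today,
                                    window_days=30, min_requests=3):
    """downgrade_requests: iterable of (segment, request_date) pairs."""
    cutoff = today - timedelta(days=window_days)
    recent = Counter(
        segment
        for segment, requested_on in downgrade_requests
        if cutoff <= requested_on <= today
    )
    return sorted(seg for seg, count in recent.items() if count >= min_requests)


requests = [
    ("smb_agency", date(2024, 6, 3)),
    ("smb_agency", date(2024, 6, 12)),
    ("smb_agency", date(2024, 6, 25)),
    ("mid_market", date(2024, 6, 20)),
    ("smb_agency", date(2024, 3, 1)),  # outside the 30-day window
]
clustered = segments_with_downgrade_cluster(requests, today=date(2024, 6, 30))
```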

Layer 5: Conversational Signals

This layer is qualitative, which is why it gets underweighted by default. Quantifying it requires discipline, not tooling.

The most powerful signal in this layer: when a CSM hears or reads the phrase “we’re evaluating options” or anything semantically close to it, that account’s churn probability jumps 4x to 6x within a 90-day window. One phrase in a renewal call is worth more signal than three months of product usage data. But only if someone logs it.

Build a simple tagging system: CSMs flag calls and emails where they encounter any of these signal phrases and enter them in the CRM with a timestamp. This doesn’t require AI or NLP. It requires 30 seconds of discipline per call.
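If your CSMs already paste call notes into the CRM, a plain substring match can pre-fill the tag so the 30 seconds becomes a confirmation rather than data entry. This is a sketch, not a replacement for the CSM’s judgment: exact matching misses paraphrases, and only the first phrase below comes from this post. The other two, plus the `SignalTag` structure, are assumptions you’d replace with your own.

```python
from dataclasses import dataclass
from datetime import datetime

# Only "evaluating options" is from the post; the rest are placeholder examples.
SIGNAL_PHRASES = (
    "evaluating options",
    "looking at alternatives",
    "reviewing our tools",
)


@dataclass
class SignalTag:
    account_id: str
    phrase: str
    logged_at: datetime
    source: str  # e.g. "call" or "email"


def tag_note(account_id, note_text, source, logged_at):
    """Return one SignalTag per signal phrase found in the note (case-insensitive)."""
    lowered = note_text.lower()
    return [
        SignalTag(account_id, phrase, logged_at, source)
        for phrase in SIGNAL_PHRASES
        if phrase in lowered
    ]
```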

Two other conversational signals worth tracking:

  • Support ticket sentiment, not volume: High ticket volume can be a positive signal (active engagement, investment in getting value). What matters is sentiment over time. A customer whose tickets shift from “how do I do X” to “why doesn’t X work” to “this is a problem for our team” is showing a deteriorating experience that won’t appear in login data.
  • Zero tickets from a complex product: If an account has been live for four months on a product with genuine complexity and hasn’t filed a single support ticket, they’re not satisfied. They’ve stopped trying.

Scoring the Churn Signal Stack: How to Weight Five Layers

The five layers produce a composite score. Here’s how to build it without enterprise tooling.

The three-stage maturity model for operationalizing churn prediction:

| Stage | Account Count | Method | Accuracy vs. Full ML |
| --- | --- | --- | --- |
| Rule-based | Under 200 accounts | Spreadsheet scorecard, weighted manually | Within 15% |
| Statistical | 500+ accounts with 12+ months history | Logistic regression on historical churn | Within 8% |
| ML/AI | 2,000+ accounts | Feature-importance models, automatic signal discovery | Baseline |

For most companies in the $50K to $150K MRR range: a well-built spreadsheet outperforms a poorly implemented ML model. You do not need Gainsight. You do not need a data science team. You need a spreadsheet with five input columns, clear signal definitions, and someone reviewing it weekly. Rule-based scorecards perform within 15% of ML accuracy for companies with fewer than 200 accounts. The complexity of ML adds real value at 2,000+ accounts, where you have enough historical churns for the model to learn which signal combinations matter for your specific product.

The scoring structure for a rule-based model:

| Health Tier | Score Range | Default Action |
| --- | --- | --- |
| Healthy | 80-100 | Expansion candidate, quarterly touchpoint |
| Neutral | 60-79 | Monthly engagement, usage review |
| At Risk | 40-59 | Active CS intervention, bi-weekly check-in |
| Critical | Below 40 | Executive escalation, save plan initiated |

Score each signal on a 1-to-10 scale within each layer (average them if a layer has several signals), multiply by the layer weight, and sum the results. Pick layer weights that sum to 100%; the weighted sum then lands between 1 and 10, so multiply it by 10 to place it on the 0-100 tier scale above. Calibrate the weights quarterly against actual churn outcomes: which signals fired first on accounts that churned, and which ones were false positives on accounts that held. After two quarters of calibration, the model’s accuracy improves significantly because it’s fit to your specific product’s churn patterns rather than generic benchmarks.
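Here’s the scoring arithmetic as a short sketch. Two assumptions are baked in: the weights are one pick from inside the starting ranges that happens to sum to 100%, and the weighted 1-to-10 result is multiplied by 10 so it lands on the 0-100 tier scale. The layer keys are illustrative.

```python
# One choice of weights from inside the starting ranges; they sum to 1.0.
LAYER_WEIGHTS = {
    "product_behavior": 0.30,
    "engagement_depth": 0.25,
    "relationship": 0.20,
    "commercial": 0.125,
    "conversational": 0.125,
}


def composite_health(layer_scores):
    """layer_scores maps each layer to a 1-10 health score (10 = healthiest)."""
    weighted = sum(LAYER_WEIGHTS[layer] * layer_scores[layer]
                   for layer in LAYER_WEIGHTS)
    return round(weighted * 10, 1)  # scale 1-10 up to the 0-100 tier range


def health_tier(score):
    """Map a 0-100 composite score to the tiers in the table above."""
    if score >= 80:
        return "Healthy"
    if score >= 60:
        return "Neutral"
    if score >= 40:
        return "At Risk"
    return "Critical"
```

In a spreadsheet this is one SUMPRODUCT over five columns plus a nested IF; the code is just the same logic in a form that’s easier to test.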

For context on where churn prediction fits in the broader business health picture, the five SaaS metrics that actually predict growth covers how retention signals compound with acquisition and expansion metrics into a complete company health model.

The Two Churns We Didn’t Catch

The nine churns we predicted share a signature: at least two layers fired simultaneously, with one of them being Layer 1 or Layer 2. The signal stack caught them 60 to 90 days out. CS had runway.

The two misses are different.

Miss 1: The Champion Departure We Couldn’t See

The account had a strong health score: 77 across all five layers. Product usage was stable. Feature breadth was solid. The CSM had a 7 out of 10 NPS response from the primary contact three weeks before renewal.

Six weeks before renewal, the primary contact was restructured out of the company as part of a reorg. We found out when a new person showed up to what was supposed to be a renewal call and said they’d been asked to review all software contracts.

The “user went dark” alert fired 11 days after the champion left. It looked like vacation. By the time it fired, the new contact had already started a competitor evaluation.

This is a structural limitation of product telemetry. Usage data captures behavior, not organizational changes. LinkedIn job changes don’t live in Mixpanel. The only instrument that reaches this layer is relationship tracking: specifically, whether your team has had a live conversation with the primary contact recently enough to notice behavioral changes that precede a departure.

The $22M ARR HR tech company I referenced earlier lost $400K in ARR from champion departures before they built a relationship health signal for it. After they instrumented champion activity changes and built an alert for primary contacts who hadn’t been active or hadn’t been in a live conversation with CS in 45+ days, they caught departures before the replacement showed up. First year result: $1.8M in ARR preserved.

The instrument we built after this miss: a weekly process to cross-reference primary contacts against job change signals and a CRM field that tracks “champion last contact date” and “champion role tenure months.” Anything under 18 months in role gets flagged at renewal minus 120 days. At our account count, this is a manual process. At 500+ accounts, it should be automated.
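The tenure rule can be sketched as a single check over those two CRM fields. The assumptions here: tenure is stored in whole months, and the flag is active from renewal minus 120 days through the renewal date itself. The function name is illustrative.

```python
from datetime import date, timedelta


def tenure_flag_due(role_tenure_months, renewal_date, today,
                    min_tenure_months=18, lead_days=120):
    """True inside the renewal-minus-120-days window when tenure is under 18 months."""
    window_start = renewal_date - timedelta(days=lead_days)
    in_window = window_start <= today <= renewal_date
    return in_window and role_tenure_months < min_tenure_months
```

This only covers the tenure half of the process. The cross-reference against job change signals stays manual unless you have a data source for it.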

Miss 2: The Silent Budget Decision

The second miss had no champion departure. The champion was still there, still engaged, still using the product. Health score: 71. No red flags in any of the five layers.

What happened: finance ran a Q3 cost consolidation exercise and the tool appeared on a board-level review list that had nothing to do with product performance. Nobody in the account told CS. Usage continued normally until the contract date approached and a cancellation came through from an email address we didn’t recognize.

The only pre-signal that existed: about three weeks before the cancellation email, a procurement contact we’d never heard of opened a renewal pricing email we’d sent. We didn’t have a trigger on first-time billing contact engagement from an unknown address. We do now.
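A sketch of what that trigger might look like, assuming your email tool can export events as (address, event type, email category) and you keep a list of known contacts per account. The event and category labels are placeholders; they are not any specific vendor’s schema.

```python
ENGAGEMENT_EVENTS = {"open", "click", "reply"}


def unknown_billing_engagers(known_contacts, email_events):
    """Return addresses not on file that engaged with billing or renewal emails.

    email_events: iterable of (email_address, event_type, email_category).
    Each unknown address is reported once, in first-seen order.
    """
    known = {address.lower() for address in known_contacts}
    seen = set()
    alerts = []
    for address, event_type, category in email_events:
        normalized = address.lower()
        if (category == "billing"
                and event_type in ENGAGEMENT_EVENTS
                and normalized not in known
                and normalized not in seen):
            seen.add(normalized)
            alerts.append(normalized)
    return alerts
```

Forwarded emails are the usual way an unknown address ends up opening a renewal message, so the alert is really a proxy for "someone new is looking at what this costs."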

Silent budget decisions originate above your product’s organizational reach. Your champion isn’t the CFO. But the CFO’s cost review is what terminates contracts. The instruments that reach this layer are narrow: broad relationship mapping (knowing who procurement and finance are in each account, not just the champion from the original sale), and billing engagement tracking at the individual contact level rather than the aggregate account level.

Both misses share the same root cause: the signal lived in a layer product telemetry doesn’t reach. You can optimize Layers 1, 2, 4, and 5 to near-maximum. Layer 3 is where prediction either extends further out or stops.

Building Your Signal Stack in 30 Days

This doesn’t require a major tooling investment. The build sequence:

Week 1: Instrument Layers 1 and 2

Pull product usage data into a spreadsheet. Define your “core action” specifically: the one thing a customer does that delivers the value they paid for. Track completion rate per account over 30-day windows. Add a column for feature breadth count (distinct features used in last 30 days). Add a column for power user ratio (seats with weekly activity divided by total seats). You now have the foundation of Layers 1 and 2 working.

Week 2: Build the Relationship Layer

Add a CRM field: “Last live conversation date.” Actual conversations, not automated emails. Add a column for “Champion tenure months” for your primary contact. Add a note field for any contact that has changed roles or companies in the last six months. None of this requires new tooling beyond what you already have.

Week 3: Add Commercial and Conversational Signals

Set up a renewal proximity alert: any account within 90 days of renewal that falls below a 60 health score gets flagged in a weekly review. Brief your CS team on four or five specific signal phrases to log in CRM notes when they appear on calls. Run a support ticket volume and sentiment check for each at-risk account.
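The renewal proximity alert is a two-condition filter. A minimal sketch, assuming each account row carries a name, a composite health score, and a renewal date; the field names are illustrative.

```python
from datetime import date


def renewal_alerts(accounts, today, window_days=90, min_health=60):
    """Return names of accounts within 90 days of renewal and below 60 health."""
    flagged = []
    for account in accounts:
        days_to_renewal = (account["renewal_date"] - today).days
        if 0 <= days_to_renewal <= window_days and account["health"] < min_health:
            flagged.append(account["name"])
    return flagged


accounts = [
    {"name": "Acme", "health": 55, "renewal_date": date(2024, 9, 1)},
    {"name": "Globex", "health": 55, "renewal_date": date(2025, 5, 1)},
    {"name": "Initech", "health": 72, "renewal_date": date(2024, 8, 15)},
]
weekly_review = renewal_alerts(accounts, today=date(2024, 7, 1))
```

Acme and Globex have the same score, but only Acme is inside the renewal window, which is the point of combining the two conditions.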

Week 4: Calibrate Against History

Take your last 12 months of churned accounts and run the signals backward. Which layers fired? How early? Which fired and were false positives? Adjust weights based on what actually predicted churn in your specific product. This calibration step is what separates a useful signal stack from a generic health score template.
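The backward run boils down to three numbers per layer: how often it fired on accounts that churned, how often it fired on accounts that stayed, and how early it fired. A sketch under an assumed data shape: each historical account records, per layer, how many days before the outcome the layer first fired, or `None` if it never did.

```python
from statistics import median


def layer_report(history, layer):
    """Summarize one layer's track record across historical accounts."""
    churned = [a for a in history if a["churned"]]
    retained = [a for a in history if not a["churned"]]
    lead_days = [a["fired"][layer] for a in churned
                 if a["fired"].get(layer) is not None]
    false_positives = sum(1 for a in retained
                          if a["fired"].get(layer) is not None)
    return {
        "hit_rate": len(lead_days) / len(churned) if churned else 0.0,
        "false_positive_rate": false_positives / len(retained) if retained else 0.0,
        "median_lead_days": median(lead_days) if lead_days else None,
    }


history = [
    {"churned": True, "fired": {"product_behavior": 70, "relationship": None}},
    {"churned": True, "fired": {"product_behavior": 50, "relationship": 90}},
    {"churned": False, "fired": {"product_behavior": 30, "relationship": None}},
    {"churned": False, "fired": {"product_behavior": None, "relationship": None}},
]
behavior = layer_report(history, "product_behavior")
relationship = layer_report(history, "relationship")
```

Layers with a high hit rate, a low false positive rate, and long lead times are the ones to weight up. With only a handful of churned accounts these rates are noisy, so treat them as direction rather than precision.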

After four weeks you have a working system. After two calibration cycles, you have an accurate one. Once your account count crosses 500 with a year of churn history, add statistical validation. Once you’re at 2,000+ accounts, consider ML. Until then, the rule-based stack outperforms anything more complex because you simply don’t have enough churned accounts for a model to learn the patterns.

Once your signal stack is firing at the right accounts, you need a response playbook for each risk tier: what CS actually does at each alert level, when to escalate, how to frame the intervention conversation. That’s covered in detail in the SaaS churn prevention framework, including specific playbooks for neutral, at-risk, and critical tiers.

What the Nine Had in Common

The nine churns we caught shared this signature: at least two layers fired simultaneously, with one of them being Layer 1 or Layer 2. Single-layer signals produce false positives. Multi-layer convergence produces actual prediction.
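That signature translates into a one-line alerting rule. A sketch, assuming layers are referred to by number as in the table earlier in the post; the function name is illustrative.

```python
def is_convergent_alert(fired_layers):
    """True when at least two layers fired and one of them is Layer 1 or Layer 2."""
    fired = set(fired_layers)
    return len(fired) >= 2 and bool(fired & {1, 2})
```

Using this as the gate for CS outreach, rather than any single layer crossing a threshold, is one way to keep false positives from burning the team’s attention.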

The accounts we ultimately saved had something else in common: the CS intervention happened within 48 hours of the stack crossing a risk threshold. Not a templated email. A real conversation where someone said: “We’re seeing X and Y in your usage data. Can we get 20 minutes to understand what’s happening?” That framing changed the dynamic. It positioned the call as consultative rather than reactive, because we weren’t calling because they submitted a cancellation request. We were calling 60 days before they’d made the decision.

That’s the operational value of a signal stack. Not the prediction itself, but the conversation it lets you have earlier.

If you want help building the signal architecture or the CS response layer that acts on it, book a free growth audit and we’ll map your specific account base, retention risk profile, and where your current model is most likely to miss.