Healthcare

Healthcare Data Warehouse: Unifying GA4, App Analytics and CRM in BigQuery

How we built a BigQuery data warehouse for a European healthcare clinic, unifying GA4, mobile app analytics, CRM and 10+ paid media platforms to expose onboarding drop-off and enable churn modeling.

Challenge

A healthcare clinic selling B2C and B2B digital subscription plans (personal, family and corporate health benefit packages) was running its customer acquisition across a wide mix of paid media, mobile apps (iOS, Android and Huawei AppGallery) and a desktop portal. Its marketing and product teams had data in a dozen disconnected tools and no way to answer basic questions end-to-end.

Specifically, the clinic could not:

  • See the full journey from paid ad click to app install, onboarding completion, first active subscription and retention.
  • Compare drop-off rates across subscription tiers ($150 / $300 / $400 / $800 plans) and across devices (iOS vs Android) on the same data.
  • Keep a longer history than GA4's default 14-month retention, which made year-over-year and cohort analysis impossible.
  • Combine CRM data (corporate contracts, renewals) with behavioral app data on the same customer ID.

The ask was not "another dashboard". It was to build the analytical foundation that would make segmentation, funnel optimization and churn prediction possible.

Solution

We built a centralized data warehouse on Google Cloud Platform's BigQuery, ingesting and normalizing data from 10+ sources into a single, analyst-ready model.

BigQuery data warehouse architecture integrating GA4, Firebase, AppsFlyer, Huawei AppGallery, InnerTrends, paid media and CRM sources feeding Looker and Tableau.

Architecture highlights

  • Ingestion layer
    • Native GA4 and Firebase exports to BigQuery, scheduled API pulls from AppsFlyer, Huawei AppGallery Connect, Kareo, Cbox, Actito and Mailchimp, plus a Funnel.io pipeline consolidating paid media spend across all acquisition channels.
  • In-app behavioral tracking
    • InnerTrends for product analytics events (onboarding steps, scan ID, account validation, activation) with ETLs landing clean event tables in BigQuery.
  • Normalization layer
    • GA4's nested event parameters were flattened and typed so analysts did not have to reinvent UNNEST logic in every query. Media data was normalized to a common channel/campaign/cost schema across platforms.
  • Modeling layer
    • Dimensional models in GCP Dataform joining behavioral events, CRM contacts, subscription plans and media spend on unified customer identifiers.
  • Consumption layer
    • Looker Studio for self-service exploration, Tableau Desktop for analyst-built deep dives, and SQL access in BigQuery for data scientists working on churn and conversion models.

Engineering details

The operational side was deliberately minimal: containerized ingestion jobs (Docker) running on Compute Engine with Instance Scheduler to keep costs down, code versioned in GitHub, and monitoring on refresh freshness and row-count anomalies. This was designed to run without a full data engineering team on the client side.

Result

With the warehouse in production, the clinic can now analyze the full acquisition and activation funnel in one place. The onboarding funnel below is a live example of the kind of report that was previously impossible to produce:

Healthcare app onboarding funnel in Tableau showing 28,000 first app opens, 5,165 active users, 81.55% overall drop rate, and drop-off at each step: Created Account, Scan ID, Validated Account, Became Active - with breakdowns by subscription plan, device and migration type.

Concrete outcomes

  • Full-funnel visibility, cohort-based. In a single view the team can see 28,000 first app opens translating to 5,165 active users, with drop-off rates at every step (First Open -> Created Account: 61.2%, Created Account -> Scan ID: 29.2%, Scan ID -> Validated: 31.6%, Validated -> Active: 1.8%). The overall cohort drop rate of 81.55% was previously hidden across four disconnected tools.
  • Plan-level insight. Active users can be segmented by plan ($150 -> 2,289, $300 -> 1,374, $400 -> 1,350, $800 -> 152), making it obvious that the $800 tier either has a product-market fit issue or a pricing-page problem worth investigating.
  • Device-level comparison on the same dataset. iOS and Android activation are roughly balanced (50.5% / 49.5%), which ended a long-running internal debate that had been based on anecdotal evidence.
  • Migration type attribution. Active users can be broken down by migration type (B2C new, employee, spouse, child, referral), which lets the clinic attribute activation to the right acquisition motion and measure the true cost per active user per channel.
  • Historical data beyond GA4 limits. Events are now retained indefinitely in BigQuery, enabling year-over-year cohort analysis that GA4's 14-month default retention makes impossible.
  • Foundation for churn and conversion modeling. With behavioral, CRM and subscription data co-located on unified IDs, the data science team now has the clean feature store required to build churn prediction and LTV models - work that is ongoing.

Why this architecture matters for healthcare

Healthcare subscription businesses have a specific data problem: the customer lifecycle is long, governance requirements are stricter than in retail, and the decision journey crosses paid media, web, mobile app and often a CRM-driven B2B sales motion. A BigQuery-centric warehouse solves this because it lets you keep behavioral data, first-party CRM data and media spend in one governed environment, with fine-grained access control, and still run the machine learning you need for retention work without moving data out.

Technologies and Tools

Google Analytics 4, Firebase, AppsFlyer, Huawei AppGallery Connect, Kareo, Cbox, Actito, Mailchimp, Funnel.io, InnerTrends, GCP Dataform, BigQuery, Google Cloud Platform, Compute Engine, Instance Scheduler, Docker, GitHub, Tableau Desktop, Looker Studio.

Ready to Transform Your Data?

Let us help you turn raw data into actionable insights. Schedule a free consultation to discuss your analytics needs.