Deploying the Churn Prediction Model in Production
Deploying a churn prediction model into production using BigQuery ML enables businesses to generate actionable insights through batch and real-time predictions. Batch predictions are cost-effective and ideal for periodic analysis, while real-time predictions provide instant insights during customer interactions. This article outlines how to use SQL-based prediction queries for both approaches and integrate them with Cloud Functions and Pub/Sub for automation. Additionally, strategies like scheduling batch predictions, limiting real-time triggers, and optimizing costs with BigQuery slot reservations ensure an efficient and scalable deployment. By leveraging GCP tools, companies can align machine learning capabilities with their business objectives.

Todd Bernson
2024-11-20

Overview
Deploying a churn prediction model into production is the final step in leveraging machine learning insights to drive business decisions. BigQuery ML simplifies this process by enabling both batch and real-time predictions using SQL queries. This article outlines the deployment steps, compares batch and real-time prediction approaches, and discusses cost management strategies to ensure efficient operations.
Batch Predictions vs. Real-Time Predictions
Batch Predictions
Batch predictions are suitable for periodic updates where predictions don’t need to be generated instantly. This approach is ideal for churn prediction in scenarios like:
- Weekly or monthly churn reports.
- Customer retention campaigns.
Advantages:
- Lower costs as predictions are run less frequently.
- Simpler implementation, as they rely on scheduled queries.
Disadvantages:
- Not suitable for use cases requiring immediate insights.
Real-Time Predictions
Real-time predictions are triggered for specific events, such as when a customer interacts with a service. This approach is ideal for:
- Dynamic churn prediction during customer interactions.
- Triggering personalized offers or interventions.
Advantages:
- Instant insights allow for immediate action.
- Enhances customer experience with timely responses.
Disadvantages:
- Higher costs, because every triggering event runs its own prediction query.
- Requires more complex infrastructure, such as event-driven architectures.
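To make the cost difference concrete, here is a rough sketch of on-demand cost arithmetic, assuming billing by bytes scanned. The price per TiB, the table size, and the query counts are placeholders chosen for illustration, not quoted prices or measurements; check current BigQuery pricing before relying on any figure.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def on_demand_cost(bytes_scanned_per_query, queries_per_month, price_per_tib):
    """Estimate the monthly on-demand cost of a repeated query.

    Every input is an assumption supplied by the caller; price_per_tib is a
    placeholder, not an official price.
    """
    return bytes_scanned_per_query * queries_per_month / TIB * price_per_tib


# Hypothetical workload: a 2 GiB customer table and a placeholder price.
table_bytes = 2 * 1024 ** 3
placeholder_price = 5.0

# One nightly batch run over the whole table.
batch_cost = on_demand_cost(table_bytes, 30, placeholder_price)

# Worst case for real time: each of 100,000 events scans the whole table.
realtime_cost = on_demand_cost(table_bytes, 100_000, placeholder_price)

print(f"batch: {batch_cost:.2f}, real time: {realtime_cost:.2f}")
```

The worst case matters because filtering on customerID does not by itself reduce the bytes scanned. Unless the input table is partitioned or clustered in a way that lets BigQuery prune data, a single-customer query can still read the full columns it references.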
Prediction Queries in BigQuery
Batch Prediction Query
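The following query predicts churn for all customers and writes the results to a BigQuery table for downstream analysis. It assumes the model's label column is named label, so that ML.PREDICT returns the columns predicted_label and predicted_label_probs:
CREATE OR REPLACE TABLE `<DATASET_NAME>.predictions` AS
SELECT
  customerID,
  predicted_label AS churn_prediction,
  -- Probability attached to the predicted class
  (SELECT p.prob FROM UNNEST(predicted_label_probs) AS p
   WHERE p.label = predicted_label) AS churn_probability,
  -- Remaining columns, without repeating the ones selected above
  * EXCEPT (customerID, predicted_label, predicted_label_probs)
FROM
  ML.PREDICT(MODEL `<DATASET_NAME>.<MODEL_NAME>`,
    TABLE `<DATASET_NAME>.<INPUT_TABLE_NAME>`);
Explanation:
- ML.PREDICT: Generates predictions using the specified model.
- predicted_label: Indicates whether a customer is predicted to churn. The column is named predicted_ followed by the model's label column name, so adjust it if your label column is not called label.
- predicted_label_probs: An array of label and probability pairs, one per class. The subquery extracts the probability of the predicted class as the confidence score.
- * EXCEPT: Keeps the input columns while avoiding duplicate column names, which CREATE TABLE does not allow.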
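For classification models, ML.PREDICT reports per-class probabilities as an array of label and probability pairs in a column named predicted_ plus the label column name plus _probs. When results are read in Python, a small helper can pull out the probability for a given class. This is a minimal sketch: the row below is a hand-written stand-in for one result row, and the label values "Yes" and "No" are assumptions about how churn is encoded.

```python
def probability_for(probs, label):
    """Return the probability attached to `label`, or None if it is absent."""
    for entry in probs:
        if entry["label"] == label:
            return entry["prob"]
    return None


# Hypothetical row shaped like one ML.PREDICT result, assuming the model's
# label column is named "label" and churn is encoded as "Yes" / "No".
row = {
    "customerID": "CUST123",
    "predicted_label": "Yes",
    "predicted_label_probs": [
        {"label": "Yes", "prob": 0.82},
        {"label": "No", "prob": 0.18},
    ],
}

churn_probability = probability_for(row["predicted_label_probs"], "Yes")
print(row["customerID"], churn_probability)
```

Looking up the "Yes" class explicitly, rather than the predicted class, gives a score that can be compared across customers regardless of which label the model predicted.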
Real-Time Prediction Query
For real-time predictions, a single customer’s data can be used as input. As in the batch query, this assumes the model's label column is named label:
SELECT
  customerID,
  predicted_label AS churn_prediction,
  -- Probability attached to the predicted class
  (SELECT p.prob FROM UNNEST(predicted_label_probs) AS p
   WHERE p.label = predicted_label) AS churn_probability
FROM
  ML.PREDICT(MODEL `<DATASET_NAME>.<MODEL_NAME>`,
    (SELECT * FROM `<DATASET_NAME>.<INPUT_TABLE_NAME>` WHERE customerID = 'CUST123'));
Explanation:
- Filters the input table to focus on a specific customer (CUST123).
- Returns the churn prediction for that customer, which can drive personalized actions.
Integration with Cloud Functions
To enable real-time predictions, you can use Cloud Functions and Pub/Sub to automate the process. For example, trigger a Cloud Function whenever a customer event is logged.
Cloud Function Code Example
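import os

from google.cloud import bigquery


def predict_churn(event, context):
    """Background Cloud Function triggered by a Pub/Sub message."""
    client = bigquery.Client()
    dataset_name = os.environ.get("DATASET_NAME")
    model_name = os.environ.get("MODEL_NAME")
    customer_id = event['attributes']['customerID']
    # Dataset and model names come from deployment configuration. The customer
    # ID comes from the event, so it is passed as a query parameter instead of
    # being interpolated into the SQL string. Assumes the label column is "label".
    query = f"""
        SELECT
          predicted_label AS churn_prediction,
          (SELECT p.prob FROM UNNEST(predicted_label_probs) AS p
           WHERE p.label = predicted_label) AS churn_probability
        FROM
          ML.PREDICT(MODEL `{dataset_name}.{model_name}`,
            (SELECT * FROM `{dataset_name}.customers` WHERE customerID = @customer_id))
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("customer_id", "STRING", customer_id)
        ]
    )
    results = client.query(query, job_config=job_config).result()
    for row in results:
        print(f"CustomerID: {customer_id}, Churn Prediction: {row.churn_prediction}, Probability: {row.churn_probability}")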
Explanation:
- Event-Driven Architecture: The function is triggered by a Pub/Sub event containing the customerID.
- BigQuery Integration: Executes a query to predict churn for the specified customer.
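Because the function depends on the customerID attribute being present, it can help to validate the event before querying. The following is a minimal sketch, assuming the event is a dict with an "attributes" mapping as in the function above; the helper name extract_customer_id and the sample event are illustrative, not part of any Google Cloud API.

```python
def extract_customer_id(event):
    """Return the customerID attribute from a Pub/Sub-style event dict.

    Raises ValueError when the attribute is missing or empty, so that a
    malformed message fails before any query is run.
    """
    attributes = (event or {}).get("attributes") or {}
    customer_id = attributes.get("customerID")
    if not customer_id:
        raise ValueError("Pub/Sub message has no customerID attribute")
    return customer_id


# Hand-written sample event for local testing (not real traffic).
sample_event = {"data": "", "attributes": {"customerID": "CUST123"}}
print(extract_customer_id(sample_event))
```

Failing early keeps malformed messages from triggering a billable query that would return no rows.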
Cost Management
Running predictions at scale can incur significant costs. Here are some strategies to optimize spending:
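- Batch Predictions: Schedule predictions during off-peak hours so they do not compete with interactive queries for slots. On-demand pricing is based on bytes processed rather than time of day, so the savings come mainly from running less often and scanning less data.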
- Selective Real-Time Predictions: Limit real-time predictions to high-value customers or critical interactions.
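- BigQuery Slot Reservations: For frequent predictions, consider capacity-based pricing with slot reservations, which bills for slot capacity instead of bytes processed and can cost less than on-demand queries at high volume.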
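The selective real-time strategy can be sketched as a small gate that runs before any query is issued. This is only an illustration: the attribute names eventType and monthlyCharges, the event names, and the threshold are hypothetical and would need to match whatever your events actually carry.

```python
def should_predict_in_real_time(
    attributes,
    min_monthly_charges=80.0,
    critical_events=frozenset({"cancel_page_view", "support_complaint"}),
):
    """Decide whether an event justifies a real-time prediction.

    Returns True for critical interactions or high-value customers and
    False otherwise, leaving the rest to the nightly batch run.
    """
    if attributes.get("eventType") in critical_events:
        return True
    try:
        # Pub/Sub attributes arrive as strings, so convert before comparing.
        charges = float(attributes.get("monthlyCharges"))
    except (TypeError, ValueError):
        return False
    return charges >= min_monthly_charges


print(should_predict_in_real_time({"eventType": "login", "monthlyCharges": "95.5"}))
```

Customers who do not pass the gate are still covered by the batch predictions, so skipping them only delays their score rather than dropping it.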
Code Snippets for Batch and Real-Time Predictions
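Batch Prediction Scheduling with BigQuery Scheduled Queries
BigQuery scheduled queries can run the batch prediction on a recurring schedule. The command below creates one with the bq command-line tool; the SQL is wrapped in single quotes so the shell does not interpret the backticks:
bq mk --transfer_config \
  --data_source=scheduled_query \
  --display_name="churn-batch-predictions" \
  --schedule="every day 02:00" \
  --params='{"query": "CREATE OR REPLACE TABLE `<DATASET_NAME>.predictions` AS SELECT * FROM ML.PREDICT(MODEL `<DATASET_NAME>.<MODEL_NAME>`, TABLE `<DATASET_NAME>.<INPUT_TABLE_NAME>`)"}'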
Explanation:
- Runs the batch prediction query nightly at 2 AM UTC.
- Stores results in the predictions table for analysis.
Deploying a churn prediction model involves careful consideration of use cases, cost management, and infrastructure complexity. By leveraging BigQuery ML’s integration with Cloud Functions and Pub/Sub, you can implement both batch and real-time predictions tailored to business needs.