Solo engineer · 2026

Vertex AI Model Monitoring PoC

An end-to-end proof-of-concept running ML model monitoring across three deployment paradigms — custom sklearn, AutoML Tabular, and BigQuery ML — on Google Vertex AI with Terraform-provisioned infrastructure.

Problem

Teams evaluating Vertex AI Model Monitoring V2 had no single reference showing how drift detection actually differs across custom-trained, AutoML, and BigQuery ML models. Each path has different serving, logging, and drift-validation mechanics, and stitching them together from documentation alone was slow and error-prone.

Contribution

Built an end-to-end PoC covering all three paradigms, from synthetic data generation through training (sklearn / AutoML / BQML), endpoint and batch deployment, and Model Monitoring V2 jobs, including BigQuery-native drift detection via ML.VALIDATE_DATA_DRIFT and attribution drift via ML.EXPLAIN_PREDICT. Provisioned the full GCP footprint in Terraform (GCS, BigQuery, a Pub/Sub retraining gate, provisioner and runtime service accounts, IAM, logging), with layered configuration that resolves from env vars → Terraform outputs → defaults.
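As an illustration of the BigQuery-native drift check, the sketch below builds a query that compares a training baseline against a recent serving window. It is not the PoC's actual code: the function name, the table names, and the threshold values are assumptions chosen for the example, and only the two default-threshold options of ML.VALIDATE_DATA_DRIFT are shown.

```python
def build_drift_query(
    base_table: str,
    study_table: str,
    categorical_threshold: float = 0.3,  # illustrative value, tune per dataset
    numerical_threshold: float = 0.3,    # illustrative value, tune per dataset
) -> str:
    """Return a BigQuery SQL string that compares a baseline table to a study table.

    `base_table` and `study_table` are fully qualified names such as
    "my-project.monitoring.training_baseline" (hypothetical).
    """
    for name in (base_table, study_table):
        # Reject backticks so a table name cannot break out of the identifier quoting.
        if "`" in name:
            raise ValueError(f"invalid table name: {name!r}")
    return (
        "SELECT *\n"
        "FROM ML.VALIDATE_DATA_DRIFT(\n"
        f"  TABLE `{base_table}`,\n"
        f"  TABLE `{study_table}`,\n"
        "  STRUCT(\n"
        f"    {categorical_threshold} AS categorical_default_threshold,\n"
        f"    {numerical_threshold} AS numerical_default_threshold\n"
        "  )\n"
        ")"
    )


if __name__ == "__main__":
    # Hypothetical table names; print the query instead of running it.
    print(build_drift_query(
        "my-project.monitoring.training_baseline",
        "my-project.monitoring.serving_window",
    ))
```

The resulting string can be passed to whichever BigQuery client the project already uses. Keeping query construction separate from execution makes the drift check easy to unit-test without GCP credentials.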

Outcome

A reference PoC with full technical documentation.

Python · Vertex AI · Terraform · BigQuery ML · scikit-learn · Pub/Sub · GCP

The value of a PoC like this lies in covering the breadth cleanly: three genuinely different MLOps paradigms (a custom sklearn model, an AutoML Tabular model, and a BQML logistic regression) unified behind one configuration system and one Terraform footprint. Without a reference that spans all three, teams evaluating the platform end up testing one path and assuming the others behave the same way. They don’t.

The Pub/Sub topic standing in as a retraining gate reflects an intentional architectural choice: rather than triggering retraining automatically on any drift signal, the gate gives a human or downstream system the chance to validate the signal before committing to a retraining run. That’s the pattern production MLOps actually needs, and the PoC demonstrates it rather than simplifying it away.
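A minimal sketch of that gate, under stated assumptions: drift results arrive as rows with `input` and `is_anomaly` fields (mirroring the shape of BigQuery drift output), and `publish` is any callable that sends bytes plus string attributes. The function names, the payload layout, and the `review_requested` action label are hypothetical. The point is that the code only announces the signal and never starts training itself.

```python
import json
from typing import Callable, Iterable, Mapping, Optional


def drifted_features(rows: Iterable[Mapping[str, object]]) -> list[str]:
    """Return the sorted names of features whose drift row is flagged anomalous."""
    return sorted(str(row["input"]) for row in rows if row.get("is_anomaly"))


def request_retraining_review(
    rows: Iterable[Mapping[str, object]],
    publish: Callable[..., object],
    model_id: str,
) -> Optional[bytes]:
    """Publish a review request when drift is present; return the payload sent.

    Returns None, and publishes nothing, when no feature is flagged.
    """
    features = drifted_features(rows)
    if not features:
        return None
    payload = json.dumps(
        {
            "model_id": model_id,
            "drifted_features": features,
            # The message asks for review; a subscriber decides whether to retrain.
            "action": "review_requested",
        },
        sort_keys=True,
    ).encode("utf-8")
    publish(payload, model_id=model_id)
    return payload


if __name__ == "__main__":
    sample_rows = [
        {"input": "age", "is_anomaly": True},
        {"input": "income", "is_anomaly": False},
    ]
    # Stand-in publisher that prints instead of sending to Pub/Sub.
    request_retraining_review(sample_rows, lambda data, **attrs: print(data, attrs), "sklearn-v1")
```

With the google-cloud-pubsub client, `publish` could plausibly be `functools.partial(publisher.publish, topic_path)`, since `PublisherClient.publish` takes the topic, the message bytes, and keyword attributes. Injecting the callable keeps the gate logic testable offline.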

The config-resolution order (env vars → Terraform outputs → defaults) means the PoC can be pointed at a different GCP project by changing a single variable — a small detail that makes the difference between a demo that only works in its original environment and a reference that teams can actually adopt.
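The resolution order can be sketched as a single lookup function. This is an illustration rather than the PoC's implementation: the function name, the setting name `project_id`, and the convention of upper-casing the setting name to find its environment variable are all assumptions. The Terraform side assumes the dictionary produced by parsing `terraform output -json`, where each output is an object with a `value` field.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    tf_outputs: Mapping[str, object],
    defaults: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve a setting from env vars, then Terraform outputs, then defaults."""
    env = os.environ if env is None else env

    # 1. Environment variable, e.g. PROJECT_ID for the setting "project_id".
    env_value = env.get(name.upper())
    if env_value:
        return env_value

    # 2. Terraform output, shaped like {"project_id": {"value": "..."}}.
    entry = tf_outputs.get(name)
    if isinstance(entry, Mapping) and entry.get("value") is not None:
        return str(entry["value"])

    # 3. Hard-coded default.
    if name in defaults:
        return defaults[name]

    raise KeyError(f"no value found for setting {name!r}")


if __name__ == "__main__":
    tf = {"project_id": {"value": "tf-project", "type": "string", "sensitive": False}}
    print(resolve_setting("project_id", tf, {"project_id": "default-project"}))
```

Setting `PROJECT_ID` in the shell overrides whatever Terraform reports, which is what allows the same code to be retargeted without editing files.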