Case Studies

Deep dives into production data engineering solutions with measurable business impact.

Data Warehouses

Tampa Rent Signals Data Pipeline

Data Engineer, 2024

Production-ready data engineering pipeline integrating Zillow, ApartmentList, and FRED data with dbt Core, Great Expectations, and Dagster orchestration on Snowflake.

Key Results

Orchestration: 15 Assets
Data Quality: 100+ Rules
Asset Checks: 12
API Endpoints: 9

Technology Stack

Snowflake, dbt Core, Dagster, Great Expectations, FastAPI, Python, AWS S3, Docker, Render

ML/AI

LeaseRadar - Full-Stack Rental Market Platform

Full-Stack Engineer, 2024

Full-stack application combining a Next.js 14 frontend with a FastAPI backend, featuring real-time market analytics, interactive data visualizations, and rental market insights powered by a Snowflake data warehouse.

Key Results

Frontend Components: 15+
API Endpoints: 9
Performance: 5min Cache
Accessibility: WCAG AA
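
The 5-minute cache listed above can be illustrated with a small time-to-live (TTL) cache. This is only a sketch under assumptions: the page does not say where the cache lives (frontend, API layer, or query layer) or how it is implemented, and the `TTLCache` class and its `get_or_compute` method are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,  # 5 minutes, matching the figure above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (time the value was stored, value).
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value if it is still fresh, else recompute it."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute()
        self._store[key] = (now, value)
        return value
```

An API handler could wrap an expensive warehouse query in `get_or_compute`, so repeated requests within five minutes reuse the stored result instead of querying again.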

Technology Stack

Next.js 14, TypeScript, FastAPI, Snowflake, Tailwind CSS, Zustand, Recharts, Radix UI, Python, Docker, Render

Real-time Streaming

Real-Time Offers Engine — Kafka → Spark → S3 → SQS/Lambda

Data Engineer, 2024

Built a sub-minute streaming pipeline that enriches card transactions and triggers SMS-style offers, backed by a PII-safe data lake in S3.

Key Results

End-to-End Latency: 30-60s
Processing Speed: 10K+/min
System Uptime: 99.9%
Monthly Cost: ~$50

Technology Stack

Apache Spark, Confluent Kafka, AWS S3, AWS Lambda, AWS SQS, Python, Real-time Streaming

Data Warehouses

E-commerce Data Warehouse

Data Engineer, 2024

Production-ready cloud-native data warehouse showcasing modern data engineering practices with Medallion Architecture on Google Cloud Platform.

Key Results

Records Processed: 116,294
Data Deduplication: 58%
Test Coverage: 18/19
Pipeline Speed: 8 seconds
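
As a rough illustration of what a deduplication step in a medallion-style pipeline might do, the sketch below keeps the most recent row per key and reports the share of rows removed. It is an assumption-laden example: the page does not describe how the 58% figure is computed, the project itself uses BigQuery and dbt rather than plain Python, and the function name `deduplicate_latest` and the field names in the usage note are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def deduplicate_latest(
    rows: Iterable[dict], key: str, order_by: str
) -> Tuple[List[dict], float]:
    """Keep the row with the largest order_by value for each key.

    Returns the surviving rows and the fraction of input rows removed.
    On ties, the row seen first is kept.
    """
    all_rows = list(rows)
    latest: Dict[Hashable, dict] = {}
    for row in all_rows:
        current = latest.get(row[key])
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    kept = list(latest.values())
    removed_share = 0.0 if not all_rows else 1 - len(kept) / len(all_rows)
    return kept, removed_share
```

For example, calling `deduplicate_latest(rows, key="order_id", order_by="updated_at")` on made-up input where every order appears twice would keep one row per order and report a removed share of 0.5.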

Technology Stack

GCP, BigQuery, dbt Core, GitHub Actions, Workload Identity, Mermaid, Python

Lakehouses

Earthquakes Lakehouse (Azure)

Data Engineer, 2024

ADF → ADLS Gen2 → Databricks Delta (Bronze/Silver/Gold) with dbt marts and Azure Monitor alerts.

Key Results

ADF Schedule: 18:00
Medallion: Bronze/Silver/Gold
Quality: dbt tests
Alerts: Azure Monitor

Technology Stack

Azure, ADF, ADLS Gen2, Databricks/Delta, dbt, GitHub Actions, Azure Monitor

Lakehouses

Taxi Lakehouse (Azure)

Data Engineer, 2024

ADF Copy to ADLS, Databricks Bronze/Silver, and dbt Gold fact fct_taxi_daily with email alerts and CI integration.

Key Results

ADF Schedule: 06:00
Layers: Bronze/Silver
Gold: fct_taxi_daily
Alerts: Email

Technology Stack

Azure, ADF, ADLS Gen2, Databricks/Delta, dbt, GitHub Actions

Data Warehouses

SEC EDGAR Financials Warehouse

Data Engineer, 2024

Production-style lakehouse architecture processing SEC financial data with BigQuery, dbt, and automated data quality validation.

Key Results

dbt Tests Passed: 14/14
GE Validations: 100%
Query Cost Reduction: 80-90%
Automation: Daily 06:00 UTC

Technology Stack

GCP, BigQuery, dbt, Great Expectations, GitHub Actions, Looker Studio, Python

ETL Pipelines

Cloud-Native Crypto ETL

Data Engineer, 2024

Serverless cryptocurrency data pipeline using Cloud Run Jobs, BigQuery, and Terraform with automated cost optimization.

Key Results

Monthly Cost: ~$2.36
Architecture: Serverless
Automation: Daily/6-hourly
Infrastructure: 100% Code

Technology Stack

Python, Docker, Cloud Run Jobs, Cloud Scheduler, BigQuery, GCS, Secret Manager, dbt, Terraform, Great Expectations

ML/AI

Recruit Reveal - Low-Latency Model Serving

ML Engineer / Data Engineer, 2024

Machine learning model deployment on Databricks with Python SQL UDFs for real-time NFL draft predictions.

Key Results

Model Type: XGBoost Multi-class
Feature Engineering: Reproducible
Model Tracking: MLflow
Serving: SQL UDFs

Technology Stack

Databricks, PySpark, SQL Warehouse, Unity Catalog, Python, XGBoost, MLflow, Node.js, Express, Next.js