Technology Stack

A high-performance analytics and ad delivery pipeline built with Rust, Kafka, ClickHouse, and MongoDB, plus an ML layer that uses dbt for feature preparation and XGBoost with MLflow for modeling.

Event Tracking Pipeline

How page views, clicks, and ad events flow from client to ClickHouse.

Tracking & Storage Flow

graph LR
    subgraph Clients["Clients"]
        NW["Web Browser"]
        NB["Mobile App"]
    end
    subgraph Tracker["Analytics Tracker"]
        TV["Page Events"]
        TAV["Ad Events"]
    end
    subgraph Server["API Server"]
        EP1["Page Tracking\n(async)"]
        EP2["Ad Tracking\n(synchronous)"]
    end
    subgraph Kafka["Message Queue"]
        K1["Page Event Stream"]
        K2["Ad Event Stream"]
    end
    C["Stream Consumer"]
    subgraph CH["ClickHouse"]
        T1["Page Events"]
        T2["Ad Events"]
        T3["User Profiles"]
    end
    NW --> TV
    NB --> TV
    NW --> TAV
    NB --> TAV
    TV --> EP1
    TAV --> EP2
    EP1 -->|buffered| K1
    EP2 -->|buffered| K2
    EP1 -->|direct| T1
    EP2 -->|direct| T2
    K1 --> C
    K2 --> C
    C --> T1
    C --> T2
    T1 -->|aggregated| T3
    T2 -->|aggregated| T3

Async: Page tracking is fire-and-forget. The server responds immediately and processes the event in the background, which keeps client latency minimal.

Sync: Ad tracking is synchronous. The server confirms the write before responding, so impressions and clicks are reliably recorded.
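The difference between the two tracking modes can be illustrated with a minimal sketch. It is written in Python with asyncio for brevity, while the actual server is Rust / Axum. The handler names `track_page` and `track_ad`, the in-memory `stored` list, and the response payloads are all hypothetical stand-ins, not the real API.

```python
import asyncio

# Hypothetical in-memory stand-in for the real write path (Kafka or ClickHouse).
stored = []
background_tasks = set()


async def write_event(event):
    """Simulate a slow write to the event store."""
    await asyncio.sleep(0.01)
    stored.append(event)


async def track_page(event):
    """Fire-and-forget: schedule the write, then respond immediately."""
    task = asyncio.create_task(write_event(event))
    background_tasks.add(task)  # hold a reference so the task is not garbage-collected
    task.add_done_callback(background_tasks.discard)
    return {"status": "accepted"}


async def track_ad(event):
    """Synchronous: respond only after the write has completed."""
    await write_event(event)
    return {"status": "recorded"}


async def demo():
    stored.clear()
    page_resp = await track_page({"type": "page_view"})
    after_page = len(stored)  # the page write has not run yet
    ad_resp = await track_ad({"type": "ad_impression"})
    after_ad = len(stored)  # the ad write is guaranteed to be in the store
    await asyncio.gather(*background_tasks)  # drain pending background writes
    return page_resp, after_page, ad_resp, after_ad
```

Calling `asyncio.run(demo())` shows the trade-off: the page handler returns before anything is stored, whereas the ad handler returns only once its event is in the store. The price of the fire-and-forget path is that a failed background write is never reported to the client.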

Ad Delivery Flow

How ads are served from MongoDB to clients in real time.

Ad Serving (SSE)

graph LR
    MG["Ad Database\n(MongoDB)"]
    subgraph Server["API Server"]
        ADS["Ad Stream\n(real-time SSE)"]
        LIST["Ad Listings\n(on demand)"]
    end
    subgraph Clients["Clients"]
        NW["Web Browser"]
        NB["Mobile App"]
    end
    MG --> Server
    ADS -->|"real-time push"| NW
    LIST -->|"on-demand fetch"| NB
    NW -->|"impression & click events"| Server
    NB -->|"impression & click events"| Server

SSE: The web client opens a persistent Server-Sent Events (SSE) connection and receives ads as they rotate every 30 seconds. The mobile app instead fetches ads on launch and rotates them locally on a timer.
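As a rough illustration of what the server pushes over that connection, the sketch below encodes one ad as an SSE message and picks the ad for the current 30-second slot. The wire format (`event:` and `data:` fields terminated by a blank line) is standard SSE. The event name `ad`, the ad fields, and the round-robin selection are assumptions, since the source does not describe them.

```python
import json

ROTATION_SECONDS = 30  # rotation interval stated in the text


def sse_frame(event, payload):
    """Encode one Server-Sent Events message, terminated by a blank line."""
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n"


def ad_for_time(ads, elapsed_seconds):
    """Pick the ad for the current 30-second slot, cycling through the list.

    Round-robin order is an assumption made for this sketch.
    """
    slot = int(elapsed_seconds // ROTATION_SECONDS)
    return ads[slot % len(ads)]


# Hypothetical ad documents; the real MongoDB schema is not shown in the source.
ads = [
    {"id": "ad-1", "format": "banner"},
    {"id": "ad-2", "format": "video"},
]

# 45 seconds after the connection opened, the second slot is active.
frame = sse_frame("ad", ad_for_time(ads, elapsed_seconds=45))
```

The mobile app could reuse the same slot calculation locally after its one-time fetch, which is why it needs no persistent connection.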

Analytics & ML Pipeline

How raw events become dashboards and trained models.

Analytics & ML Flow

graph LR
    subgraph CH["ClickHouse"]
        RAW["Raw Events\n(page views, clicks,\nad impressions)"]
    end
    subgraph DBT["dbt Transform Layer"]
        STG["Staging\n(cleaned events)"]
        INT["Sessions\n(user sessions)"]
        MART["Feature Datasets\n(CTR, booking, user)"]
    end
    PARQUET["Exported Datasets\n(Parquet)"]
    subgraph ML["ML Training"]
        XGB1["CTR Model\n(XGBoost)"]
        XGB2["Conversion Model\n(XGBoost)"]
    end
    MLFLOW["Experiment Tracker\n(MLflow)"]
    DASH["Analytics Dashboards"]
    RAW --> STG
    STG --> INT
    INT --> MART
    MART -->|"export"| PARQUET
    PARQUET --> XGB1
    PARQUET --> XGB2
    XGB1 --> MLFLOW
    XGB2 --> MLFLOW
    RAW -->|"live queries"| DASH
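The staging, sessions, and feature steps in the diagram can be sketched in miniature. In the real pipeline these are dbt SQL models. The Python below only illustrates the logic, and the 30-minute inactivity timeout, the event field names (`user`, `ts`, `type`, `ad`), and the event type strings are assumptions that the source does not specify.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout, not taken from the source


def sessionize(events):
    """Assign a per-user session index; a gap longer than the timeout starts a new session."""
    last_seen = {}
    session_no = defaultdict(int)
    out = []
    for e in sorted(events, key=lambda e: (e["user"], e["ts"])):
        user = e["user"]
        if user in last_seen and e["ts"] - last_seen[user] > SESSION_GAP_SECONDS:
            session_no[user] += 1
        last_seen[user] = e["ts"]
        out.append({**e, "session": session_no[user]})
    return out


def ctr_by_ad(events):
    """Click-through rate per ad: clicks divided by impressions."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for e in events:
        if e["type"] == "ad_impression":
            impressions[e["ad"]] += 1
        elif e["type"] == "ad_click":
            clicks[e["ad"]] += 1
    return {ad: clicks[ad] / n for ad, n in impressions.items()}


# Hypothetical raw events with timestamps in seconds.
events = [
    {"user": "u1", "ts": 0, "type": "page_view"},
    {"user": "u1", "ts": 600, "type": "ad_impression", "ad": "a1"},
    {"user": "u1", "ts": 610, "type": "ad_click", "ad": "a1"},
    {"user": "u1", "ts": 4000, "type": "ad_impression", "ad": "a1"},
    {"user": "u2", "ts": 50, "type": "ad_impression", "ad": "a2"},
]
```

With this sample, user `u1` gets a second session for the event at `ts=4000`, and ad `a1` ends up with one click over two impressions. Rows like these, exported to Parquet, are the kind of input the CTR model would train on.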

Components

Every piece of the stack and what it does.

Server

Rust / Axum

A fast, reliable backend that handles all tracking requests and serves ads to clients.

Messaging

Apache Kafka

A message queue that absorbs high volumes of events, ensuring no data is lost under load.
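One way to picture the queue's role is a consumer that drains buffered events and writes them to storage in batches, which suits ClickHouse because it favors fewer, larger inserts. The sketch below uses a `deque` as a stand-in for a Kafka topic. The `BatchingConsumer` class, its `sink` callback, and the batch size are hypothetical and do not reflect the actual stream consumer.

```python
from collections import deque


class BatchingConsumer:
    """Drain a queue of events and hand them to a sink in fixed-size batches."""

    def __init__(self, sink, batch_size=3):
        self.sink = sink  # callable that receives one list of events per batch
        self.batch_size = batch_size
        self.pending = []

    def consume(self, queue):
        """Pull every queued event, flushing whenever a batch fills up."""
        while queue:
            self.pending.append(queue.popleft())
            if len(self.pending) >= self.batch_size:
                self._flush()

    def close(self):
        """Flush any partial batch so no buffered event is dropped."""
        if self.pending:
            self._flush()

    def _flush(self):
        self.sink(list(self.pending))
        self.pending.clear()


# Seven hypothetical events waiting in the queue.
queue = deque({"id": i} for i in range(7))
batches = []
consumer = BatchingConsumer(sink=batches.append, batch_size=3)
consumer.consume(queue)
consumer.close()
```

Here the seven events reach the sink as batches of three, three, and one. A production consumer would also flush on a timer so that a slow trickle of events does not sit in `pending` indefinitely.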

Analytics Storage

ClickHouse

A high-performance analytics database purpose-built for querying billions of events in real time.

Ad Storage

MongoDB

A flexible document store for ad definitions, supporting banner and video ad formats.

Tracker

naboria-tracker.js

A lightweight JavaScript library that integrates into any web or mobile app with a single init() call.

Data Pipeline

dbt

Transforms raw events into clean, analytics-ready datasets using version-controlled SQL.

Machine Learning

XGBoost + MLflow

Predictive models for ad click-through rate and booking conversion, with full experiment tracking.

Dashboards

Analytics UI

Real-time dashboards for views, clicks, ad performance, and user activity — refreshed automatically.