High Level Design
End-to-end view of every layer — from public ingress to the analytics database tier.
Platform Services
Every service in the Webtrends Optimize production platform, its role, and what it connects to.
Main web application server. Serves the Webtrends Optimize dashboard, tag delivery, and the platform UI. 2-node Availability Set cluster.
Authentication and authorisation gateway for all platform APIs. Validates tokens and enforces role-based access. Backed by RAS DB and REDIS.
Persistent data layer abstraction. Handles all read/write operations to underlying databases. Sits between the application tier and storage tier.
Processes and serves A/B test tags to CDN and Origin. Has a dedicated internal Load Balancer. 2-node Availability Set (AVST) cluster.
High-throughput event ingestion endpoint. Receives visitor tracking data from client JS tags across all regions. 4 VMs per region, 3 regions.
Captures conversion and interaction events. Streams into KAFKA, which feeds ClickHouse. 4 VMs per region, 3 regions.
Internal orchestration bus. Routes messages between services and triggers schedulers, trial provisioning, and social campaigns.
Generates experiment reports and data exports. Runs on Reports VM Nodes 1–2 plus a dedicated Extracts VM for batch export jobs.
Distributed event streaming for PAT and Collect pipelines. 3 VMs per region across EUW, EUN, UE2. Coordinated by the Zookeeper cluster.
3-node cluster providing distributed coordination and leader election for KAFKA. Required for KAFKA quorum to function correctly.
Primary analytics database. Columnar storage for high-speed event queries. 4-node cluster with zone redundancy (A, B, C, D). Fronted by an internal Load Balancer.
In-memory cache and session store. GEO-distributed (2 VMs × 3 regions) plus a central REDIS STACK cluster (VM 1–3) for the application tier.
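The per-region VM counts quoted above can be tallied with a short script. This is only a sketch: the `GeoTier` class and the tier labels are illustrative names, not identifiers from the platform, and the quorum helpers assume the usual majority rule for a coordination ensemble (more than half the nodes must be reachable), which this document does not itself spell out.

```python
from dataclasses import dataclass

# Region codes as listed in the service descriptions.
REGIONS = ("EUW", "EUN", "UE2")


@dataclass(frozen=True)
class GeoTier:
    """A regionally replicated tier (hypothetical helper type)."""
    name: str
    vms_per_region: int
    regions: int = len(REGIONS)

    @property
    def total_vms(self) -> int:
        return self.vms_per_region * self.regions


# Labels are descriptive placeholders; counts come from the descriptions above.
GEO_TIERS = [
    GeoTier("event ingestion endpoint", 4),
    GeoTier("conversion/interaction capture", 4),
    GeoTier("KAFKA", 3),
    GeoTier("REDIS GEO", 2),
]


def majority(nodes: int) -> int:
    """Smallest node count that forms a strict majority (assumed quorum rule)."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Nodes that can be lost while a majority is still available."""
    return nodes - majority(nodes)


for tier in GEO_TIERS:
    print(f"{tier.name}: {tier.vms_per_region} x {tier.regions} = {tier.total_vms} VMs")

print("GEO VM total:", sum(t.total_vms for t in GEO_TIERS))
print("3-node coordination cluster tolerates", tolerated_failures(3), "node failure(s)")
```

Under the majority assumption, the 3-node coordination cluster keeps quorum with one node down but not with two, which is why it is described as required for KAFKA to function.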
How Requests Move Through the Platform
Traced end-to-end for each major traffic type — public ingress to storage write.
Component Connections
All defined data flows between platform components, from the HLD diagram.
| From | Direction | To | Traffic / Purpose |
|---|---|---|---|
| Traffic Manager (Collect) | → | Application Gateway (EUW / EUN / UE2) | Geo-DNS load balancing |
| Traffic Manager (PAT) | → | Application Gateway (EUW / EUN / UE2) | Geo-DNS load balancing |
| Azure CDN Service | → | Origin VM Cluster | CDN origin pull for static assets |
| Application Gateway | → | Origin VM Cluster | Dashboard HTTPS requests |
| Application Gateway | → | Collect GEO VM Pools | Event ingestion traffic |
| Application Gateway | → | PAT GEO VM Pools | Test delivery traffic |
| Collect GEO VMs | → | Load Balancer · ClickHouse | Write analytics events to DB |
| PAT GEO VMs | → | KAFKA GEO | Stream test events |
| KAFKA GEO | → | ClickHouse DB Cluster | Event consumption to analytics store |
| Zoo Cluster (3 nodes) | ↔ | KAFKA GEO | Distributed coordination / quorum |
| Load Balancer · ClickHouse | → | ClickHouse DB Nodes 1–4 | DB-level traffic distribution |
| Load Balancer · TMS | → | TMS VM Node 1–2 | Tag management requests |
| Origin VM Cluster | → | ACS VM Cluster | Auth token validation |
| Origin VM Cluster | → | DSS VM Cluster | Data read / write |
| IO Service | → | Scheduler / Trial / Social VMs | Job orchestration triggers |
| IO Service | → | WASDB SQL + REDIS STACK | State + session management |
| Reporting Services | → | Q2DB GEO | Report data reads |
| Reporting Services | → | REDIS GEO | Report result caching |
| DSS VM Cluster | → | WASDB SQL | Primary application data writes |
| ACS VM Cluster | → | REDIS STACK | Session token cache |
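The connections in the table can be treated as a directed graph, which makes it straightforward to trace how a request reaches a given store. The sketch below is illustrative only and makes several assumptions: the regional Application Gateways are collapsed into a single node, "Collect GEO VM Pools" and "Collect GEO VMs" (likewise for PAT) are treated as the same component, "ClickHouse DB Nodes 1–4" is merged into "ClickHouse DB Cluster", the combined "WASDB SQL + REDIS STACK" target is split into two nodes, and the bidirectional Zoo/KAFKA link is entered as two edges. The function names `build_graph` and `find_path` are hypothetical.

```python
from collections import deque

# Directed edges (from, to), transcribed from the table under the
# naming assumptions stated in the lead-in.
EDGES = [
    ("Traffic Manager (Collect)", "Application Gateway"),
    ("Traffic Manager (PAT)", "Application Gateway"),
    ("Azure CDN Service", "Origin VM Cluster"),
    ("Application Gateway", "Origin VM Cluster"),
    ("Application Gateway", "Collect GEO VMs"),
    ("Application Gateway", "PAT GEO VMs"),
    ("Collect GEO VMs", "Load Balancer · ClickHouse"),
    ("PAT GEO VMs", "KAFKA GEO"),
    ("KAFKA GEO", "ClickHouse DB Cluster"),
    ("Zoo Cluster", "KAFKA GEO"),
    ("KAFKA GEO", "Zoo Cluster"),
    ("Load Balancer · ClickHouse", "ClickHouse DB Cluster"),
    ("Load Balancer · TMS", "TMS VM Node 1–2"),
    ("Origin VM Cluster", "ACS VM Cluster"),
    ("Origin VM Cluster", "DSS VM Cluster"),
    ("IO Service", "Scheduler / Trial / Social VMs"),
    ("IO Service", "WASDB SQL"),
    ("IO Service", "REDIS STACK"),
    ("Reporting Services", "Q2DB GEO"),
    ("Reporting Services", "REDIS GEO"),
    ("DSS VM Cluster", "WASDB SQL"),
    ("ACS VM Cluster", "REDIS STACK"),
]


def build_graph(edges):
    """Build an adjacency list; every node appears as a key, even pure sinks."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns one shortest hop path or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


GRAPH = build_graph(EDGES)
print(find_path(GRAPH, "Application Gateway", "WASDB SQL"))
print(find_path(GRAPH, "Reporting Services", "ClickHouse DB Cluster"))
```

Because the gateways are collapsed into one node, this model cannot distinguish Collect traffic from PAT traffic once it passes the gateway; a per-traffic-type trace would need the gateway split by pool or the edges labelled with their purpose.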