High-Performance Cryptocurrency Exchange Backend
A distributed matching system built in Go, split into gateway, order-service, matching-engine, and market-data-service, with a Transactional Outbox, leader election, batch ingestion, and architecture-level load validation.

Project Overview
This project is a high-performance exchange backend built from scratch in Go. The goal was not merely to expose order APIs, but to validate the hard parts of a trading system under load: consistency, failover, event flow, settlement correctness, and real-time market data delivery. The current system is split into four services (gateway, order-service, matching-engine, and market-data-service) connected through PostgreSQL, Redis, and Kafka/Redpanda to handle order intake, matching, settlement, order status updates, and WebSocket streaming. Beyond feature delivery, I also built Prometheus/Grafana observability, k6 architecture-level stress tests, and staging deployment automation on ECS, so the project can be measured and operated like a production system.
Technical Challenges & Solutions
Dual-write Consistency and Reliable Event Delivery
Placing an order must lock funds, create the order row, and publish a Kafka event. If the database commit succeeds but message publication fails, the system can end up with inconsistent trading and accounting state.
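The fix is the Transactional Outbox: the order row and the event record commit in the same database transaction, and a background relay publishes the event afterward. A minimal sketch of that TX1 shape follows; the table and column names (balances, orders, outbox) and the Order type are illustrative, not the project's actual schema.

```go
package main

import (
	"context"
	"database/sql"
	"encoding/json"
	"errors"
)

// Order is a minimal illustrative shape, not the project's actual model.
type Order struct {
	ID, UserID, Price, Qty int64
	Side                   string
}

// Cost is the amount to lock; the real rule depends on side and asset.
func (o Order) Cost() int64 { return o.Price * o.Qty }

func placeOrder(ctx context.Context, db *sql.DB, o Order) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback()

	// Lock funds conditionally; zero rows affected means insufficient balance.
	res, err := tx.ExecContext(ctx,
		`UPDATE balances SET available = available - $1, locked = locked + $1
		 WHERE user_id = $2 AND available >= $1`, o.Cost(), o.UserID)
	if err != nil {
		return err
	}
	if n, _ := res.RowsAffected(); n == 0 {
		return errors.New("insufficient funds")
	}

	if _, err := tx.ExecContext(ctx,
		`INSERT INTO orders (id, user_id, side, price, qty, status)
		 VALUES ($1, $2, $3, $4, $5, 'NEW')`,
		o.ID, o.UserID, o.Side, o.Price, o.Qty); err != nil {
		return err
	}

	// The event rides in the same commit; a background relay reads unsent
	// outbox rows, publishes them to exchange.orders, and marks them sent.
	payload, err := json.Marshal(o)
	if err != nil {
		return err
	}
	if _, err := tx.ExecContext(ctx,
		`INSERT INTO outbox (topic, payload) VALUES ('exchange.orders', $1)`,
		payload); err != nil {
		return err
	}
	return tx.Commit() // either everything is durable, or nothing is
}
```

With this shape, Kafka delivery becomes at-least-once rather than maybe-never, and duplicate publishes are absorbed by idempotent consumers downstream.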
High Availability Without Split-Brain
The matching engine needs standby replicas for failover, but if two replicas consume the same partition at once, duplicate trades and broken balances become possible.
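The approach here is lease-based leader election in PostgreSQL with a fencing token that increments on every takeover. Below is a minimal sketch against an assumed leader_lease table; the schema, TTL handling, and renewal loop are simplifications for illustration.

```go
package main

import (
	"context"
	"database/sql"
	"time"
)

// tryAcquire attempts to take (or renew) a single-row lease. Each takeover by
// a new holder bumps the fencing token, so writes from a deposed leader can
// be detected and rejected downstream.
//
// Assumed schema:
//   CREATE TABLE leader_lease (
//     id int PRIMARY KEY, holder text, fence bigint, expires_at timestamptz);
func tryAcquire(ctx context.Context, db *sql.DB, nodeID string, ttl time.Duration) (fence int64, ok bool, err error) {
	err = db.QueryRowContext(ctx, `
		UPDATE leader_lease
		SET holder     = $1,
		    fence      = CASE WHEN holder = $1 THEN fence ELSE fence + 1 END,
		    expires_at = now() + make_interval(secs => $2)
		WHERE id = 1 AND (holder = $1 OR expires_at < now())
		RETURNING fence`,
		nodeID, ttl.Seconds()).Scan(&fence)
	if err == sql.ErrNoRows {
		return 0, false, nil // another replica holds a live lease
	}
	if err != nil {
		return 0, false, err
	}
	return fence, true, nil
}
```

The active engine stamps its writes and published events with the token it holds, so anything produced by a deposed leader carries a stale fence and can be rejected rather than double-applied.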
Deadlocks and Idempotency in Asynchronous Settlement
Settlement (TX2) updates maker and taker orders, balances, locked funds, and trade records in a single transactional path. Under high concurrency, row-lock contention and duplicate Kafka delivery can lead to deadlocks or double settlement.
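Two mitigations fit this problem: acquire row locks in a deterministic order so concurrent settlements cannot deadlock, and record each trade ID so a redelivered event settles nothing twice. A hedged sketch with illustrative table names and a simplified Trade type:

```go
package main

import (
	"context"
	"database/sql"
)

// Trade is a minimal illustrative shape.
type Trade struct {
	ID                         string
	MakerOrderID, TakerOrderID int64
}

func settleTrade(ctx context.Context, db *sql.DB, t Trade) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback()

	// Idempotency guard: a redelivered trade event inserts nothing and exits.
	res, err := tx.ExecContext(ctx,
		`INSERT INTO processed_trades (trade_id) VALUES ($1)
		 ON CONFLICT (trade_id) DO NOTHING`, t.ID)
	if err != nil {
		return err
	}
	if n, _ := res.RowsAffected(); n == 0 {
		return tx.Commit() // already settled by an earlier delivery
	}

	// Deadlock avoidance: always lock the two order rows in ascending ID
	// order, so concurrent settlements agree on lock acquisition order.
	first, second := t.MakerOrderID, t.TakerOrderID
	if second < first {
		first, second = second, first
	}
	for _, id := range []int64{first, second} {
		if _, err := tx.ExecContext(ctx,
			`SELECT 1 FROM orders WHERE id = $1 FOR UPDATE`, id); err != nil {
			return err
		}
	}

	// ... update maker/taker orders, balances, locked funds, trade records ...
	return tx.Commit()
}
```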
High-frequency Batch Ingestion for Market Makers
Market makers and arbitrage bots can submit large bursts of orders. Per-request HTTP overhead and row-by-row inserts quickly cap throughput.
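A batch endpoint plus bulk loading addresses both costs: one HTTP request carries many orders, and the rows go in as a single COPY rather than one INSERT each. The sketch below assumes pgx and a JSON array body; the endpoint shape and column names are illustrative.

```go
package main

import (
	"encoding/json"
	"net/http"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

// Order is a minimal illustrative shape.
type Order struct {
	ID, UserID, Price, Qty int64
	Side                   string
}

// handleBatch accepts a JSON array of orders and loads them with a single
// COPY instead of one round trip per row.
func handleBatch(pool *pgxpool.Pool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var orders []Order
		if err := json.NewDecoder(r.Body).Decode(&orders); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		rows := make([][]any, len(orders))
		for i, o := range orders {
			rows[i] = []any{o.ID, o.UserID, o.Side, o.Price, o.Qty}
		}
		if _, err := pool.CopyFrom(r.Context(),
			pgx.Identifier{"orders"},
			[]string{"id", "user_id", "side", "price", "qty"},
			pgx.CopyFromRows(rows)); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	}
}
```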
Real-time Fanout and Observable Performance Validation
Trade events must reach large WebSocket audiences without slowing the order path, and without end-to-end metrics it is hard to tell whether bottlenecks live in the HTTP layer, Kafka, matching, or push delivery.
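On the fanout side, the standard defense is a bounded per-client queue with eviction of consumers that fall behind. A minimal sketch of that hub shape follows; buffer sizes and the disconnect-on-full policy are illustrative choices, not the project's exact tuning.

```go
package main

import "sync"

// Hub fans trade events out to WebSocket writer goroutines. Each client owns
// a bounded queue; the broadcast loop never blocks on a slow socket.
type Hub struct {
	mu      sync.RWMutex
	clients map[*Client]struct{}
}

type Client struct {
	send chan []byte // drained by a per-connection writer goroutine
}

func (h *Hub) Register(c *Client) {
	h.mu.Lock()
	h.clients[c] = struct{}{}
	h.mu.Unlock()
}

func (h *Hub) Broadcast(msg []byte) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	for c := range h.clients {
		select {
		case c.send <- msg: // healthy consumer: enqueue and move on
		default:
			// A full buffer means this client cannot keep up; evict it in
			// the background rather than stalling every other subscriber.
			go h.evict(c)
		}
	}
}

func (h *Hub) evict(c *Client) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, ok := h.clients[c]; ok {
		delete(h.clients, c)
		close(c.send) // the writer goroutine sees the close and shuts down
	}
}
```

This keeps the order path and the broadcast loop decoupled from the slowest socket, which is exactly the isolation the load tests need to verify.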
Architecture
The gateway is the single public entry point, responsible for Redis-based sliding-window rate limiting, Idempotency-Key protection, and reverse-proxy routing. order-service handles the HTTP APIs and TX1: it locks funds, creates orders, and persists outbox records before publishing to exchange.orders in the background. matching-engine runs in Active-Standby mode using PostgreSQL leader election with fencing tokens, restores live limit orders from the database on cold start, executes in-memory price-time-priority matching, and publishes settlement, trade, and orderbook events. market-data-service focuses solely on Kafka consumption and high-fanout WebSocket delivery. Locally the stack runs with Docker Compose; staging is deployed as microservices on AWS ECS via Terraform and ecspresso.
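For context, the gateway's sliding-window limiter can be sketched as one Redis sorted set per client, trimmed and counted in a pipeline. The key naming, microsecond scoring, and go-redis usage below are assumptions, not the project's exact implementation.

```go
package main

import (
	"context"
	"math/rand"
	"strconv"
	"time"

	"github.com/redis/go-redis/v9"
)

// allow implements a sliding-window check: one sorted set per client, scored
// by request time, trimmed to the window on every call.
func allow(ctx context.Context, rdb *redis.Client, clientID string, limit int64, window time.Duration) (bool, error) {
	key := "rl:" + clientID
	now := time.Now().UnixMicro()
	cutoff := now - window.Microseconds()

	var count *redis.IntCmd
	_, err := rdb.Pipelined(ctx, func(pipe redis.Pipeliner) error {
		// Drop entries older than the window, record this request under a
		// collision-resistant member, then count what remains.
		pipe.ZRemRangeByScore(ctx, key, "0", strconv.FormatInt(cutoff, 10))
		member := strconv.FormatInt(now, 10) + "-" + strconv.FormatInt(rand.Int63(), 10)
		pipe.ZAdd(ctx, key, redis.Z{Score: float64(now), Member: member})
		count = pipe.ZCard(ctx, key)
		pipe.Expire(ctx, key, window)
		return nil
	})
	if err != nil {
		return false, err
	}
	return count.Val() <= limit, nil
}
```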
Learnings
This rewrite moved me from building APIs to designing a trading system. I had to solve dual-write consistency, asynchronous settlement, split-brain prevention, batch ingestion, deadlock avoidance, slow WebSocket consumer isolation, and, just as importantly, how to prove the system works under pressure with metrics and load tests. The biggest takeaway was that throughput alone is not enough for high-concurrency systems; the real bar is whether the data stays correct, the failure modes remain understandable, and the system is still operable when parts degrade.