Microservices Architecture Design and Implementation: From Monolith to Distributed Systems
Microservices architecture is a widely adopted approach for building scalable, maintainable, and resilient enterprise applications. This guide traces the path from a monolithic application to distributed microservices, covering design principles, implementation strategies, and operational best practices.
Architecture Overview
Microservices vs Monolithic Architecture
graph TB
subgraph "Monolithic Architecture"
M1[User Interface]
M2[Business Logic]
M3[Data Access Layer]
M4[Database]
M1 --> M2
M2 --> M3
M3 --> M4
end
subgraph "Microservices Architecture"
subgraph "API Gateway"
AG[Load Balancer & Routing]
end
subgraph "User Service"
US1[User API]
US2[User Logic]
US3[User DB]
end
subgraph "Order Service"
OS1[Order API]
OS2[Order Logic]
OS3[Order DB]
end
subgraph "Payment Service"
PS1[Payment API]
PS2[Payment Logic]
PS3[Payment DB]
end
subgraph "Notification Service"
NS1[Notification API]
NS2[Message Queue]
NS3[Email/SMS]
end
AG --> US1
AG --> OS1
AG --> PS1
AG --> NS1
OS2 --> US1
OS2 --> PS1
PS2 --> NS1
end
Core Design Principles
- Single Responsibility: Each service owns a specific business capability
- Decentralized: Services manage their own data and business logic
- Fault Isolation: A failure in one service should not cascade to others
- Technology Diversity: Services can use different tech stacks
- Independent Deployment: Services can be deployed independently
Service Decomposition Strategy
Domain-Driven Design (DDD) Approach
// Domain model example for User Service
package user
import (
"time"
"errors"
)
// User aggregate root
type User struct {
ID string `json:"id"`
Email string `json:"email"`
Username string `json:"username"`
Profile Profile `json:"profile"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
Version int `json:"version"`
}
type Profile struct {
FirstName string `json:"first_name"`
LastName string `json:"last_name"`
Avatar string `json:"avatar"`
Bio string `json:"bio"`
}
// Domain events
type UserCreatedEvent struct {
UserID string `json:"user_id"`
Email string `json:"email"`
Timestamp time.Time `json:"timestamp"`
}
type UserUpdatedEvent struct {
UserID string `json:"user_id"`
Changes map[string]interface{} `json:"changes"`
Timestamp time.Time `json:"timestamp"`
}
// Repository interface
type UserRepository interface {
Create(user *User) error
GetByID(id string) (*User, error)
GetByEmail(email string) (*User, error)
Update(user *User) error
Delete(id string) error
}
// Domain service
type UserService struct {
repo UserRepository
eventBus EventBus
validator UserValidator
}
func (s *UserService) CreateUser(email, username string, profile Profile) (*User, error) {
// Validate input
if err := s.validator.ValidateEmail(email); err != nil {
return nil, err
}
if err := s.validator.ValidateUsername(username); err != nil {
return nil, err
}
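// Check if user already exists; surface lookup failures instead of ignoring them.
// Assumes GetByEmail returns (nil, nil) when no user matches.
existing, err := s.repo.GetByEmail(email)
if err != nil {
return nil, err
}
if existing != nil {
return nil, errors.New("user already exists")
}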
// Create user
user := &User{
ID: generateID(),
Email: email,
Username: username,
Profile: profile,
CreatedAt: time.Now(),
UpdatedAt: time.Now(),
Version: 1,
}
if err := s.repo.Create(user); err != nil {
return nil, err
}
// Publish domain event
event := UserCreatedEvent{
UserID: user.ID,
Email: user.Email,
Timestamp: time.Now(),
}
s.eventBus.Publish("user.created", event)
return user, nil
}
func (s *UserService) UpdateUser(id string, updates map[string]interface{}) (*User, error) {
user, err := s.repo.GetByID(id)
if err != nil {
return nil, err
}
if user == nil {
return nil, errors.New("user not found")
}
// Apply updates with validation
if email, ok := updates["email"].(string); ok {
if err := s.validator.ValidateEmail(email); err != nil {
return nil, err
}
user.Email = email
}
if username, ok := updates["username"].(string); ok {
if err := s.validator.ValidateUsername(username); err != nil {
return nil, err
}
user.Username = username
}
user.UpdatedAt = time.Now()
user.Version++
if err := s.repo.Update(user); err != nil {
return nil, err
}
// Publish domain event
event := UserUpdatedEvent{
UserID: user.ID,
Changes: updates,
Timestamp: time.Now(),
}
s.eventBus.Publish("user.updated", event)
return user, nil
}
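The listing above refers to `EventBus`, `UserValidator`, and `generateID` without defining them. The following is a minimal sketch of what they could look like, not the original implementation: the interface shape, the username rule, the ID format, and the `recordingBus` test double are all assumptions made for illustration.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"net/mail"
	"regexp"
	"sync"
)

// EventBus is the minimal publishing contract UserService relies on (assumed shape).
type EventBus interface {
	Publish(eventType string, data interface{}) error
}

// UserValidator is a hypothetical validator matching the calls made by UserService.
type UserValidator struct{}

// usernamePattern is an assumed rule: 3-32 letters, digits, or underscores.
var usernamePattern = regexp.MustCompile(`^[a-zA-Z0-9_]{3,32}$`)

// ValidateEmail accepts only a bare address such as "alice@example.com".
func (UserValidator) ValidateEmail(email string) error {
	addr, err := mail.ParseAddress(email)
	if err != nil || addr.Address != email {
		return errors.New("invalid email address")
	}
	return nil
}

// ValidateUsername checks the username against usernamePattern.
func (UserValidator) ValidateUsername(username string) error {
	if !usernamePattern.MatchString(username) {
		return errors.New("username must be 3-32 letters, digits, or underscores")
	}
	return nil
}

// generateID returns a random 128-bit identifier as 32 hex characters.
func generateID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err) // the system random source is unavailable
	}
	return hex.EncodeToString(b)
}

// recordingBus is an in-memory EventBus that remembers published event types.
type recordingBus struct {
	mu     sync.Mutex
	events []string
}

// Publish records the event type and never fails.
func (b *recordingBus) Publish(eventType string, _ interface{}) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.events = append(b.events, eventType)
	return nil
}
```

In a unit test, a `recordingBus` can stand in for the real bus so that `CreateUser` can be checked for publishing `user.created` without a running broker.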
Service Boundaries Identification
# Service decomposition matrix
services:
  user-service:
    responsibilities:
      - User registration and authentication
      - Profile management
      - User preferences
    data:
      - users
      - user_profiles
      - user_preferences
    apis:
      - POST /users
      - GET /users/{id}
      - PUT /users/{id}
      - DELETE /users/{id}
  order-service:
    responsibilities:
      - Order creation and management
      - Order status tracking
      - Order history
    data:
      - orders
      - order_items
      - order_status_history
    apis:
      - POST /orders
      - GET /orders/{id}
      - PUT /orders/{id}/status
      - GET /users/{id}/orders
    dependencies:
      - user-service (user validation)
      - inventory-service (stock check)
      - payment-service (payment processing)
  payment-service:
    responsibilities:
      - Payment processing
      - Payment method management
      - Transaction history
    data:
      - payments
      - payment_methods
      - transactions
    apis:
      - POST /payments
      - GET /payments/{id}
      - POST /payment-methods
      - GET /users/{id}/payment-methods
    dependencies:
      - user-service (user validation)
      - external payment gateways
  inventory-service:
    responsibilities:
      - Product catalog management
      - Stock management
      - Price management
    data:
      - products
      - inventory
      - pricing
    apis:
      - GET /products
      - GET /products/{id}
      - PUT /products/{id}/stock
      - GET /products/{id}/availability
Communication Patterns
Synchronous Communication (REST APIs)
// API Gateway implementation
package gateway
import (
"context"
"fmt"
"net/http"
"net/http/httputil"
"net/url"
"time"
"github.com/gorilla/mux"
"github.com/prometheus/client_golang/prometheus"
)
type Gateway struct {
router *mux.Router
services map[string]*ServiceConfig
circuitBreaker CircuitBreaker
rateLimiter RateLimiter
metrics *Metrics
}
type ServiceConfig struct {
Name string
BaseURL string
Timeout time.Duration
Retries int
HealthCheck string
}
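// Metrics holds pointers because the Prometheus constructors return pointers
// and metric vectors must not be copied.
type Metrics struct {
RequestsTotal *prometheus.CounterVec
RequestDuration *prometheus.HistogramVec
ErrorsTotal *prometheus.CounterVec
}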
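// NewGateway wires the gateway's dependencies. The circuit breaker and rate
// limiter are injected because proxyHandler calls both on every request and
// would panic on nil interfaces.
func NewGateway(cb CircuitBreaker, rl RateLimiter) *Gateway {
return &Gateway{
router: mux.NewRouter(),
services: make(map[string]*ServiceConfig),
circuitBreaker: cb,
rateLimiter: rl,
metrics: initMetrics(),
}
}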
func (g *Gateway) RegisterService(name string, config *ServiceConfig) {
g.services[name] = config
// Register routes for the service
serviceRouter := g.router.PathPrefix(fmt.Sprintf("/%s", name)).Subrouter()
serviceRouter.PathPrefix("/").HandlerFunc(g.proxyHandler(name))
}
func (g *Gateway) proxyHandler(serviceName string) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
// Rate limiting
if !g.rateLimiter.Allow(r) {
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
service, exists := g.services[serviceName]
if !exists {
http.Error(w, "Service not found", http.StatusNotFound)
return
}
// Circuit breaker check
if !g.circuitBreaker.Allow(serviceName) {
http.Error(w, "Service unavailable", http.StatusServiceUnavailable)
return
}
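// Create proxy (built per request for brevity; cache one per service in production)
target, err := url.Parse(service.BaseURL)
if err != nil {
http.Error(w, "Invalid service configuration", http.StatusInternalServerError)
return
}
proxy := httputil.NewSingleHostReverseProxy(target)
// Report upstream failures to the circuit breaker and to the client
failed := false
proxy.ErrorHandler = func(rw http.ResponseWriter, _ *http.Request, _ error) {
failed = true
g.circuitBreaker.RecordFailure(serviceName)
http.Error(rw, "Bad gateway", http.StatusBadGateway)
}
// Add timeout
ctx, cancel := context.WithTimeout(r.Context(), service.Timeout)
defer cancel()
r = r.WithContext(ctx)
// Add headers; Go exposes the incoming Host header as r.Host, not in r.Header
r.Header.Set("X-Forwarded-Host", r.Host)
r.Header.Set("X-Origin-Host", target.Host)
// Proxy the request
proxy.ServeHTTP(w, r)
if !failed {
g.circuitBreaker.RecordSuccess(serviceName)
}
// Record metrics
duration := time.Since(start)
g.metrics.RequestsTotal.WithLabelValues(serviceName, r.Method).Inc()
g.metrics.RequestDuration.WithLabelValues(serviceName, r.Method).Observe(duration.Seconds())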
}
}
// Circuit breaker implementation
type CircuitBreaker interface {
Allow(service string) bool
RecordSuccess(service string)
RecordFailure(service string)
}
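// circuitBreaker keeps per-service state. The states map is read and written
// without synchronization, so guard it with a sync.Mutex before sharing it
// across concurrent request handlers.
type circuitBreaker struct {
states map[string]*CircuitState
}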
type CircuitState struct {
State string // CLOSED, OPEN, HALF_OPEN
FailureCount int
LastFailure time.Time
Threshold int
Timeout time.Duration
}
func (cb *circuitBreaker) Allow(service string) bool {
state, exists := cb.states[service]
if !exists {
cb.states[service] = &CircuitState{
State: "CLOSED",
Threshold: 5,
Timeout: 30 * time.Second,
}
return true
}
switch state.State {
case "CLOSED":
return true
case "OPEN":
if time.Since(state.LastFailure) > state.Timeout {
state.State = "HALF_OPEN"
return true
}
return false
case "HALF_OPEN":
return true
default:
return false
}
}
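The listing above declares `RecordSuccess` and `RecordFailure` in the `CircuitBreaker` interface but only shows `Allow`. As a rough sketch of how the three methods could fit together, the hypothetical `SafeBreaker` below adds a mutex and the missing state transitions. The type name, the constructor, and the choice to let every request through while half-open are assumptions, not part of the original design.

```go
package main

import (
	"sync"
	"time"
)

// breakerState is the per-service record; state is "CLOSED", "OPEN", or "HALF_OPEN".
type breakerState struct {
	state       string
	failures    int
	lastFailure time.Time
}

// SafeBreaker is a hypothetical goroutine-safe circuit breaker.
type SafeBreaker struct {
	mu        sync.Mutex
	states    map[string]*breakerState
	threshold int           // consecutive failures that open the circuit
	timeout   time.Duration // how long the circuit stays open
}

// NewSafeBreaker builds a breaker with one threshold and timeout for all services.
func NewSafeBreaker(threshold int, timeout time.Duration) *SafeBreaker {
	return &SafeBreaker{
		states:    make(map[string]*breakerState),
		threshold: threshold,
		timeout:   timeout,
	}
}

// stateFor returns the record for a service, creating it on first use.
// Callers must hold cb.mu.
func (cb *SafeBreaker) stateFor(service string) *breakerState {
	s, ok := cb.states[service]
	if !ok {
		s = &breakerState{state: "CLOSED"}
		cb.states[service] = s
	}
	return s
}

// Allow reports whether a request may be sent to the service.
func (cb *SafeBreaker) Allow(service string) bool {
	cb.mu.Lock()
	defer cb.mu.Unlock()
	s := cb.stateFor(service)
	if s.state == "OPEN" {
		if time.Since(s.lastFailure) <= cb.timeout {
			return false
		}
		s.state = "HALF_OPEN" // timeout elapsed: let trial requests through
	}
	return true
}

// RecordSuccess closes the circuit and resets the failure count.
func (cb *SafeBreaker) RecordSuccess(service string) {
	cb.mu.Lock()
	defer cb.mu.Unlock()
	s := cb.stateFor(service)
	s.state = "CLOSED"
	s.failures = 0
}

// RecordFailure opens the circuit once the threshold is reached, or
// immediately when a trial request fails in the half-open state.
func (cb *SafeBreaker) RecordFailure(service string) {
	cb.mu.Lock()
	defer cb.mu.Unlock()
	s := cb.stateFor(service)
	s.failures++
	s.lastFailure = time.Now()
	if s.state == "HALF_OPEN" || s.failures >= cb.threshold {
		s.state = "OPEN"
	}
}
```

A stricter variant would admit only a single trial request while half-open; this sketch keeps the permissive behavior of the original `Allow`.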
Asynchronous Communication (Event-Driven)
// Event bus implementation using NATS
package eventbus
import (
"encoding/json"
"log"
"time"
"github.com/nats-io/nats.go"
)
type EventBus struct {
conn *nats.Conn
}
type Event struct {
ID string `json:"id"`
Type string `json:"type"`
Source string `json:"source"`
Data map[string]interface{} `json:"data"`
Timestamp time.Time `json:"timestamp"`
Version string `json:"version"`
}
func NewEventBus(natsURL string) (*EventBus, error) {
conn, err := nats.Connect(natsURL)
if err != nil {
return nil, err
}
return &EventBus{conn: conn}, nil
}
func (eb *EventBus) Publish(eventType string, data interface{}) error {
event := Event{
ID: generateEventID(),
Type: eventType,
Source: getServiceName(),
Data: convertToMap(data),
Timestamp: time.Now(),
Version: "1.0",
}
eventData, err := json.Marshal(event)
if err != nil {
return err
}
return eb.conn.Publish(eventType, eventData)
}
func (eb *EventBus) Subscribe(eventType string, handler func(Event)) error {
_, err := eb.conn.Subscribe(eventType, func(msg *nats.Msg) {
var event Event
if err := json.Unmarshal(msg.Data, &event); err != nil {
log.Printf("Failed to unmarshal event: %v", err)
return
}
handler(event)
})
return err
}
// Saga pattern implementation for distributed transactions
type SagaOrchestrator struct {
eventBus *EventBus
steps []SagaStep
}
type SagaStep struct {
Name string
Action func(data map[string]interface{}) error
Compensation func(data map[string]interface{}) error
}
func (so *SagaOrchestrator) Execute(sagaID string, data map[string]interface{}) error {
executedSteps := []int{}
for i, step := range so.steps {
if err := step.Action(data); err != nil {
// Compensate executed steps in reverse order
for j := len(executedSteps) - 1; j >= 0; j-- {
stepIndex := executedSteps[j]
if compErr := so.steps[stepIndex].Compensation(data); compErr != nil {
log.Printf("Compensation failed for step %s: %v",
so.steps[stepIndex].Name, compErr)
}
}
return err
}
executedSteps = append(executedSteps, i)
}
return nil
}
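// Order processing saga example. The type assertions below panic on a missing or
// mistyped key, so validate the payload before calling Execute. ProcessPayment must
// also store "transaction_id" in data, otherwise its compensation cannot refund.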
func NewOrderProcessingSaga(eventBus *EventBus) *SagaOrchestrator {
return &SagaOrchestrator{
eventBus: eventBus,
steps: []SagaStep{
{
Name: "ValidateUser",
Action: func(data map[string]interface{}) error {
userID := data["user_id"].(string)
return validateUser(userID)
},
Compensation: func(data map[string]interface{}) error {
// No compensation needed for validation
return nil
},
},
{
Name: "ReserveInventory",
Action: func(data map[string]interface{}) error {
orderItems := data["items"].([]interface{})
return reserveInventory(orderItems)
},
Compensation: func(data map[string]interface{}) error {
orderItems := data["items"].([]interface{})
return releaseInventory(orderItems)
},
},
{
Name: "ProcessPayment",
Action: func(data map[string]interface{}) error {
amount := data["amount"].(float64)
paymentMethod := data["payment_method"].(string)
return processPayment(amount, paymentMethod)
},
Compensation: func(data map[string]interface{}) error {
transactionID := data["transaction_id"].(string)
return refundPayment(transactionID)
},
},
{
Name: "CreateOrder",
Action: func(data map[string]interface{}) error {
return createOrder(data)
},
Compensation: func(data map[string]interface{}) error {
orderID := data["order_id"].(string)
return cancelOrder(orderID)
},
},
},
}
}
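To make the compensation order concrete, here is a small self-contained sketch of the same orchestration idea with in-memory steps. It is illustrative only: `Step`, `runSaga`, `stringField`, `demoSteps`, and the `txn-demo-1` value are hypothetical names, and the steps just flip flags in the payload instead of calling real services.

```go
package main

import (
	"errors"
	"fmt"
)

// Step mirrors the shape of SagaStep: an action and its compensation.
type Step struct {
	Name       string
	Action     func(data map[string]interface{}) error
	Compensate func(data map[string]interface{}) error
}

// stringField reads a required string from the payload without panicking.
func stringField(data map[string]interface{}, key string) (string, error) {
	v, ok := data[key].(string)
	if !ok || v == "" {
		return "", fmt.Errorf("saga payload: missing or non-string %q", key)
	}
	return v, nil
}

// runSaga executes steps in order. When a step fails, it compensates the
// completed steps in reverse order and returns their names with the error.
func runSaga(steps []Step, data map[string]interface{}) (compensated []string, err error) {
	for i, step := range steps {
		if err = step.Action(data); err != nil {
			for j := i - 1; j >= 0; j-- {
				if cerr := steps[j].Compensate(data); cerr != nil {
					err = fmt.Errorf("%w (compensating %s failed: %v)", err, steps[j].Name, cerr)
				}
				compensated = append(compensated, steps[j].Name)
			}
			return compensated, err
		}
	}
	return nil, nil
}

// demoSteps builds a three-step order saga whose last step always fails.
func demoSteps() []Step {
	return []Step{
		{
			Name:       "ReserveInventory",
			Action:     func(d map[string]interface{}) error { d["reserved"] = true; return nil },
			Compensate: func(d map[string]interface{}) error { d["reserved"] = false; return nil },
		},
		{
			Name: "ProcessPayment",
			Action: func(d map[string]interface{}) error {
				if _, err := stringField(d, "payment_method"); err != nil {
					return err
				}
				d["transaction_id"] = "txn-demo-1" // stored so the compensation can refund
				return nil
			},
			Compensate: func(d map[string]interface{}) error {
				id, err := stringField(d, "transaction_id")
				if err != nil {
					return err
				}
				d["refunded"] = id
				return nil
			},
		},
		{
			Name:       "CreateOrder",
			Action:     func(map[string]interface{}) error { return errors.New("order database unavailable") },
			Compensate: func(map[string]interface{}) error { return nil },
		},
	}
}
```

Calling `runSaga(demoSteps(), map[string]interface{}{"payment_method": "card"})` fails at `CreateOrder` and compensates `ProcessPayment` first, then `ReserveInventory`.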
Data Management Strategies
Database per Service Pattern
# Docker Compose for multi-database setup
version: '3.8'
services:
  user-db:
    image: postgres:15
    environment:
      POSTGRES_DB: userdb
      POSTGRES_USER: userservice
      POSTGRES_PASSWORD: userpass
    volumes:
      - user-db-data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    networks:
      - user-network
  order-db:
    image: postgres:15
    environment:
      POSTGRES_DB: orderdb
      POSTGRES_USER: orderservice
      POSTGRES_PASSWORD: orderpass
    volumes:
      - order-db-data:/var/lib/postgresql/data
    ports:
      - "5433:5432"
    networks:
      - order-network
  payment-db:
    image: postgres:15
    environment:
      POSTGRES_DB: paymentdb
      POSTGRES_USER: paymentservice
      POSTGRES_PASSWORD: paymentpass
    volumes:
      - payment-db-data:/var/lib/postgresql/data
    ports:
      - "5434:5432"
    networks:
      - payment-network
  inventory-db:
    # The official MongoDB image on Docker Hub is named "mongo"
    image: mongo:6.0
    environment:
      MONGO_INITDB_ROOT_USERNAME: inventoryservice
      MONGO_INITDB_ROOT_PASSWORD: inventorypass
      MONGO_INITDB_DATABASE: inventorydb
    volumes:
      - inventory-db-data:/data/db
    ports:
      - "27017:27017"
    networks:
      - inventory-network
  cache:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - cache-data:/data
    networks:
      - shared-network
volumes:
  user-db-data:
  order-db-data:
  payment-db-data:
  inventory-db-data:
  cache-data:
networks:
  user-network:
  order-network:
  payment-network:
  inventory-network:
  shared-network:
Event Sourcing Implementation
// Event sourcing for order aggregate
package order
import (
"errors"
"time"
)
type OrderAggregate struct {
ID string
Version int
Events []DomainEvent
State OrderState
}
type OrderState struct {
ID string
UserID string
Items []OrderItem
Status string
TotalAmount float64
CreatedAt time.Time
UpdatedAt time.Time
}
type OrderItem struct {
ProductID string
Quantity int
Price float64
}
type DomainEvent interface {
GetEventType() string
GetAggregateID() string
GetVersion() int
GetTimestamp() time.Time
}
type OrderCreatedEvent struct {
AggregateID string `json:"aggregate_id"`
Version int `json:"version"`
Timestamp time.Time `json:"timestamp"`
UserID string `json:"user_id"`
Items []OrderItem `json:"items"`
TotalAmount float64 `json:"total_amount"`
}
func (e OrderCreatedEvent) GetEventType() string { return "OrderCreated" }
func (e OrderCreatedEvent) GetAggregateID() string { return e.AggregateID }
func (e OrderCreatedEvent) GetVersion() int { return e.Version }
func (e OrderCreatedEvent) GetTimestamp() time.Time { return e.Timestamp }
type OrderStatusChangedEvent struct {
AggregateID string `json:"aggregate_id"`
Version int `json:"version"`
Timestamp time.Time `json:"timestamp"`
OldStatus string `json:"old_status"`
NewStatus string `json:"new_status"`
Reason string `json:"reason"`
}
func (e OrderStatusChangedEvent) GetEventType() string { return "OrderStatusChanged" }
func (e OrderStatusChangedEvent) GetAggregateID() string { return e.AggregateID }
func (e OrderStatusChangedEvent) GetVersion() int { return e.Version }
func (e OrderStatusChangedEvent) GetTimestamp() time.Time { return e.Timestamp }
// Event store interface
type EventStore interface {
SaveEvents(aggregateID string, events []DomainEvent, expectedVersion int) error
GetEvents(aggregateID string) ([]DomainEvent, error)
GetEventsFromVersion(aggregateID string, version int) ([]DomainEvent, error)
}
// Order aggregate methods
func (o *OrderAggregate) CreateOrder(userID string, items []OrderItem) error {
if o.State.Status != "" {
return errors.New("order already exists")
}
totalAmount := 0.0
for _, item := range items {
totalAmount += item.Price * float64(item.Quantity)
}
event := OrderCreatedEvent{
AggregateID: o.ID,
Version: o.Version + 1,
Timestamp: time.Now(),
UserID: userID,
Items: items,
TotalAmount: totalAmount,
}
o.applyEvent(event)
o.Events = append(o.Events, event)
return nil
}
func (o *OrderAggregate) ChangeStatus(newStatus, reason string) error {
if o.State.Status == newStatus {
return errors.New("status is already " + newStatus)
}
event := OrderStatusChangedEvent{
AggregateID: o.ID,
Version: o.Version + 1,
Timestamp: time.Now(),
OldStatus: o.State.Status,
NewStatus: newStatus,
Reason: reason,
}
o.applyEvent(event)
o.Events = append(o.Events, event)
return nil
}
func (o *OrderAggregate) applyEvent(event DomainEvent) {
switch e := event.(type) {
case OrderCreatedEvent:
o.State.ID = e.AggregateID
o.State.UserID = e.UserID
o.State.Items = e.Items
o.State.Status = "CREATED"
o.State.TotalAmount = e.TotalAmount
o.State.CreatedAt = e.Timestamp
o.State.UpdatedAt = e.Timestamp
case OrderStatusChangedEvent:
o.State.Status = e.NewStatus
o.State.UpdatedAt = e.Timestamp
}
o.Version = event.GetVersion()
}
func (o *OrderAggregate) LoadFromHistory(events []DomainEvent) {
for _, event := range events {
o.applyEvent(event)
}
o.Events = []DomainEvent{} // Clear uncommitted events
}
// Repository implementation
type OrderRepository struct {
eventStore EventStore
}
func (r *OrderRepository) Save(aggregate *OrderAggregate) error {
if len(aggregate.Events) == 0 {
return nil
}
expectedVersion := aggregate.Version - len(aggregate.Events)
return r.eventStore.SaveEvents(aggregate.ID, aggregate.Events, expectedVersion)
}
func (r *OrderRepository) GetByID(id string) (*OrderAggregate, error) {
events, err := r.eventStore.GetEvents(id)
if err != nil {
return nil, err
}
if len(events) == 0 {
return nil, errors.New("order not found")
}
aggregate := &OrderAggregate{ID: id}
aggregate.LoadFromHistory(events)
return aggregate, nil
}
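The `EventStore` interface above relies on `expectedVersion` for optimistic concurrency, but no implementation is shown. The sketch below is one possible in-memory version, under the assumption that event versions start at 1 and increase by 1, so a stream's current version equals its length. `StoredEvent`, `ConflictError`, and `MemoryEventStore` are hypothetical names, and `StoredEvent` is a simplified stand-in for `DomainEvent`.

```go
package main

import (
	"fmt"
	"sync"
)

// StoredEvent is a simplified stand-in for DomainEvent; only the version matters here.
type StoredEvent struct {
	Type    string
	Version int
}

// ConflictError reports that another writer appended to the stream first.
type ConflictError struct {
	Expected int
	Actual   int
}

func (e *ConflictError) Error() string {
	return fmt.Sprintf("concurrency conflict: expected version %d, stream is at %d", e.Expected, e.Actual)
}

// MemoryEventStore keeps one append-only slice of events per aggregate.
type MemoryEventStore struct {
	mu      sync.Mutex
	streams map[string][]StoredEvent
}

// NewMemoryEventStore returns an empty store.
func NewMemoryEventStore() *MemoryEventStore {
	return &MemoryEventStore{streams: make(map[string][]StoredEvent)}
}

// SaveEvents appends events only if the stream is still at expectedVersion.
func (s *MemoryEventStore) SaveEvents(aggregateID string, events []StoredEvent, expectedVersion int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	current := len(s.streams[aggregateID])
	if current != expectedVersion {
		return &ConflictError{Expected: expectedVersion, Actual: current}
	}
	// Reject gaps or duplicates in the new versions before appending anything.
	for i, e := range events {
		if want := expectedVersion + i + 1; e.Version != want {
			return fmt.Errorf("event %d has version %d, want %d", i, e.Version, want)
		}
	}
	s.streams[aggregateID] = append(s.streams[aggregateID], events...)
	return nil
}

// GetEvents returns a copy of the stream so callers cannot mutate stored history.
func (s *MemoryEventStore) GetEvents(aggregateID string) []StoredEvent {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]StoredEvent, len(s.streams[aggregateID]))
	copy(out, s.streams[aggregateID])
	return out
}
```

When two writers load the same aggregate at version 1 and both try to save version 2, the second `SaveEvents` call receives a `*ConflictError` and can reload and retry.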
Service Mesh Implementation
Istio Configuration
# Istio service mesh configuration
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: microservices-gateway
  namespace: production
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - api.company.com
      tls:
        httpsRedirect: true
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: api-tls-secret
      hosts:
        - api.company.com
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: microservices-routes
  namespace: production
spec:
  hosts:
    - api.company.com
  gateways:
    - microservices-gateway
  http:
    - match:
        - uri:
            prefix: /api/v1/users
      route:
        - destination:
            host: user-service
            port:
              number: 80
      timeout: 30s
      retries:
        attempts: 3
        perTryTimeout: 10s
    - match:
        - uri:
            prefix: /api/v1/orders
      route:
        - destination:
            host: order-service
            port:
              number: 80
      timeout: 60s
      retries:
        attempts: 3
        perTryTimeout: 20s
    - match:
        - uri:
            prefix: /api/v1/payments
      route:
        - destination:
            host: payment-service
            port:
              number: 80
      timeout: 45s
      # Retrying payment requests is only safe if the payment API is idempotent
      retries:
        attempts: 2
        perTryTimeout: 15s
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: user-service-destination
  namespace: production
spec:
  host: user-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
        maxRequestsPerConnection: 10
    loadBalancer:
      # LEAST_CONN is deprecated in current Istio releases in favor of LEAST_REQUEST
      simple: LEAST_REQUEST
    outlierDetection:
      # consecutiveErrors is deprecated; consecutive5xxErrors is its replacement
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
---
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: user-service-authz
  namespace: production
spec:
  selector:
    matchLabels:
      app: user-service
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/production/sa/api-gateway"]
        - source:
            principals: ["cluster.local/ns/production/sa/order-service"]
      to:
        - operation:
            methods: ["GET", "POST", "PUT"]
            paths: ["/api/v1/users/*"]
Monitoring and Observability
Distributed Tracing with Jaeger
// Tracing middleware
package tracing
import (
"io"
"net/http"
"github.com/opentracing/opentracing-go"
"github.com/opentracing/opentracing-go/ext"
"github.com/uber/jaeger-client-go"
"github.com/uber/jaeger-client-go/config"
)
// InitTracer builds a Jaeger tracer and registers it globally. The returned
// io.Closer flushes buffered spans; call Close during shutdown.
func InitTracer(serviceName string) (opentracing.Tracer, io.Closer, error) {
cfg := config.Configuration{
ServiceName: serviceName,
Sampler: &config.SamplerConfig{
Type: jaeger.SamplerTypeConst,
Param: 1, // sample every trace; lower this in high-traffic environments
},
Reporter: &config.ReporterConfig{
LogSpans: true,
LocalAgentHostPort: "jaeger-agent:6831",
},
}
tracer, closer, err := cfg.NewTracer()
if err != nil {
return nil, nil, err
}
opentracing.SetGlobalTracer(tracer)
return tracer, closer, nil
}
func TracingMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
spanCtx, _ := opentracing.GlobalTracer().Extract(
opentracing.HTTPHeaders,
opentracing.HTTPHeadersCarrier(r.Header),
)
span := opentracing.GlobalTracer().StartSpan(
r.URL.Path,
ext.RPCServerOption(spanCtx),
)
defer span.Finish()
ext.HTTPMethod.Set(span, r.Method)
ext.HTTPUrl.Set(span, r.URL.String())
ctx := opentracing.ContextWithSpan(r.Context(), span)
r = r.WithContext(ctx)
next.ServeHTTP(w, r)
})
}
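func TraceHTTPClient(client *http.Client) *http.Client {
base := client.Transport
if base == nil {
// A nil Transport means http.DefaultTransport; wrap it explicitly so
// RoundTrip does not call through a nil interface.
base = http.DefaultTransport
}
client.Transport = &tracingTransport{RoundTripper: base}
return client
}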
type tracingTransport struct {
http.RoundTripper
}
func (t *tracingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
span, ctx := opentracing.StartSpanFromContext(
req.Context(),
"HTTP "+req.Method,
)
defer span.Finish()
ext.HTTPMethod.Set(span, req.Method)
ext.HTTPUrl.Set(span, req.URL.String())
ext.SpanKindRPCClient.Set(span)
opentracing.GlobalTracer().Inject(
span.Context(),
opentracing.HTTPHeaders,
opentracing.HTTPHeadersCarrier(req.Header),
)
req = req.WithContext(ctx)
resp, err := t.RoundTripper.RoundTrip(req)
if err != nil {
ext.Error.Set(span, true)
span.SetTag("error.message", err.Error())
} else {
ext.HTTPStatusCode.Set(span, uint16(resp.StatusCode))
if resp.StatusCode >= 400 {
ext.Error.Set(span, true)
}
}
return resp, err
}
Prometheus Metrics
// Metrics collection
package metrics
import (
"net/http"
"strconv"
"time"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
)
var (
httpRequestsTotal = prometheus.NewCounterVec(
prometheus.CounterOpts{
Name: "http_requests_total",
Help: "Total number of HTTP requests",
},
[]string{"method", "endpoint", "status_code", "service"},
)
httpRequestDuration = prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Name: "http_request_duration_seconds",
Help: "HTTP request duration in seconds",
Buckets: prometheus.DefBuckets,
},
[]string{"method", "endpoint", "service"},
)
businessMetrics = prometheus.NewCounterVec(
prometheus.CounterOpts{
Name: "business_events_total",
Help: "Total number of business events",
},
[]string{"event_type", "service"},
)
activeConnections = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Name: "active_connections",
Help: "Number of active connections",
},
[]string{"service"},
)
)
func init() {
prometheus.MustRegister(httpRequestsTotal)
prometheus.MustRegister(httpRequestDuration)
prometheus.MustRegister(businessMetrics)
prometheus.MustRegister(activeConnections)
}
func MetricsMiddleware(serviceName string) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
// Wrap response writer to capture status code
wrapped := &responseWriter{ResponseWriter: w, statusCode: 200}
next.ServeHTTP(wrapped, r)
duration := time.Since(start)
statusCode := strconv.Itoa(wrapped.statusCode)
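// r.URL.Path is an unbounded label when paths embed IDs (for example /users/42);
// in production, label by the matched route template instead.
httpRequestsTotal.WithLabelValues(
r.Method,
r.URL.Path,
statusCode,
serviceName,
).Inc()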
httpRequestDuration.WithLabelValues(
r.Method,
r.URL.Path,
serviceName,
).Observe(duration.Seconds())
})
}
}
type responseWriter struct {
http.ResponseWriter
statusCode int
}
func (rw *responseWriter) WriteHeader(code int) {
rw.statusCode = code
rw.ResponseWriter.WriteHeader(code)
}
func RecordBusinessEvent(eventType, serviceName string) {
businessMetrics.WithLabelValues(eventType, serviceName).Inc()
}
func SetActiveConnections(count float64, serviceName string) {
activeConnections.WithLabelValues(serviceName).Set(count)
}
func MetricsHandler() http.Handler {
return promhttp.Handler()
}
Deployment and Operations
Kubernetes Deployment
# Complete microservice deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
  namespace: production
  labels:
    app: user-service
    version: v1.0.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
        version: v1.0.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
        sidecar.istio.io/inject: "true"
    spec:
      serviceAccountName: user-service
      containers:
        - name: user-service
          image: myregistry.com/user-service:v1.0.0
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 8081
              name: grpc
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: user-service-secrets
                  key: database-url
            - name: JAEGER_AGENT_HOST
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: SERVICE_NAME
              value: "user-service"
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          volumeMounts:
            - name: config
              mountPath: /app/config
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: user-service-config
---
apiVersion: v1
kind: Service
metadata:
  name: user-service
  namespace: production
  labels:
    app: user-service
spec:
  ports:
    - port: 80
      targetPort: 8080
      name: http
    - port: 8081
      targetPort: 8081
      name: grpc
  selector:
    app: user-service
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: user-service-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: user-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
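The Deployment above probes `/health` for liveness and `/ready` for readiness on port 8080, but the handlers behind those paths are not shown. A minimal sketch of what the service side might look like follows. The `Health` type, its methods, and the `probe` helper are assumptions for illustration; a real readiness check would usually also verify dependencies such as the database.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Health holds readiness state shared between startup code and the probe handlers.
type Health struct {
	ready atomic.Bool
}

// SetReady flips readiness, for example after migrations finish or when
// shutdown begins.
func (h *Health) SetReady(v bool) { h.ready.Store(v) }

// Routes registers /health (liveness) and /ready (readiness), matching the
// probe paths in the manifest.
func (h *Health) Routes() *http.ServeMux {
	mux := http.NewServeMux()
	// Liveness: the process is up and able to serve HTTP.
	mux.HandleFunc("/health", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Readiness: return 503 until the service can accept traffic.
	mux.HandleFunc("/ready", func(w http.ResponseWriter, _ *http.Request) {
		if !h.ready.Load() {
			http.Error(w, "not ready", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// probe issues an in-process GET and returns the status code, which is
// convenient for exercising the handlers without opening a port.
func probe(handler http.Handler, path string) int {
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}
```

In a service, the mux would be served with something like `http.ListenAndServe(":8080", health.Routes())`, and `SetReady(true)` would be called once startup work completes. Keeping the two endpoints separate lets Kubernetes stop routing traffic to a pod that is temporarily not ready without restarting it.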
Best Practices and Lessons Learned
1. Start Small and Evolve
- Begin with a modular monolith
- Extract services gradually based on business needs
- Use domain boundaries to guide decomposition
2. Design for Failure
- Implement circuit breakers and timeouts
- Use bulkhead patterns to isolate failures
- Plan for eventual consistency
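As one illustration of the bulkhead idea, the sketch below caps the number of concurrent calls to a single dependency with a buffered channel used as a semaphore. The `Bulkhead` type and `ErrBulkheadFull` are hypothetical names, and rejecting immediately when full (rather than queueing) is a design assumption.

```go
package main

import "errors"

// ErrBulkheadFull is returned when every slot for the dependency is in use.
var ErrBulkheadFull = errors.New("bulkhead: no capacity available")

// Bulkhead caps the number of concurrent calls to one dependency.
type Bulkhead struct {
	slots chan struct{}
}

// NewBulkhead allows at most maxConcurrent calls to run at the same time.
func NewBulkhead(maxConcurrent int) *Bulkhead {
	return &Bulkhead{slots: make(chan struct{}, maxConcurrent)}
}

// Do runs fn if a slot is free and rejects immediately otherwise, so a slow
// dependency cannot tie up all of the caller's goroutines.
func (b *Bulkhead) Do(fn func() error) error {
	select {
	case b.slots <- struct{}{}:
		defer func() { <-b.slots }() // release the slot when fn returns
		return fn()
	default:
		return ErrBulkheadFull
	}
}
```

Giving each downstream service its own `Bulkhead` means that a stalled payment service, for example, can exhaust only its own slots while calls to the user service continue to run.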
3. Observability First
- Implement distributed tracing from day one
- Use structured logging with correlation IDs
- Monitor business metrics, not just technical ones
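Structured logging with correlation IDs can be sketched as an HTTP middleware. The version below uses the standard `log/slog` package (Go 1.21 or later). The `X-Correlation-ID` header name, the function names, and the `demo` helper are assumptions chosen for this example, not an established convention of the services above.

```go
package main

import (
	"bytes"
	"context"
	"crypto/rand"
	"encoding/hex"
	"log/slog"
	"net/http"
	"net/http/httptest"
)

// correlationHeader is an assumed header name; use whatever your services agree on.
const correlationHeader = "X-Correlation-ID"

type ctxKey struct{}

// newID returns 16 random hex characters.
func newID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		panic(err) // the system random source is unavailable
	}
	return hex.EncodeToString(b)
}

// CorrelationMiddleware reuses the caller's correlation ID or creates one,
// stores it in the request context, echoes it in the response, and logs the
// request with the ID attached.
func CorrelationMiddleware(logger *slog.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get(correlationHeader)
		if id == "" {
			id = newID()
		}
		w.Header().Set(correlationHeader, id)
		ctx := context.WithValue(r.Context(), ctxKey{}, id)
		logger.Info("request", "correlation_id", id, "method", r.Method, "path", r.URL.Path)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// CorrelationID returns the ID stored by the middleware, or "" if absent.
func CorrelationID(ctx context.Context) string {
	id, _ := ctx.Value(ctxKey{}).(string)
	return id
}

// demo sends one in-process request through the middleware and returns the
// echoed header, the ID seen by the handler, and the JSON log output.
func demo(incomingID string) (header, seen, logs string) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	h := CorrelationMiddleware(logger, http.HandlerFunc(func(_ http.ResponseWriter, r *http.Request) {
		seen = CorrelationID(r.Context())
	}))
	req := httptest.NewRequest(http.MethodGet, "/users/42", nil)
	if incomingID != "" {
		req.Header.Set(correlationHeader, incomingID)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Header().Get(correlationHeader), seen, buf.String()
}
```

Outgoing calls to other services should copy `CorrelationID(ctx)` into the same header, so that log lines from every hop of one request can be joined on a single value.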
4. Data Consistency Strategies
- Embrace eventual consistency where possible
- Use saga patterns for distributed transactions
- Implement proper event sourcing for audit trails
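Eventual consistency usually comes with at-least-once delivery, so consumers need to tolerate redelivered events. The sketch below deduplicates on the event ID, which the `Event` struct in the event bus listing already carries. `IdempotentHandler` is a hypothetical helper, and the in-memory set of seen IDs is an assumption: a real consumer would persist it, ideally in the same transaction as the state change.

```go
package main

import "sync"

// Event carries a unique ID, as the event bus Event type does.
type Event struct {
	ID   string
	Type string
}

// IdempotentHandler wraps a handler so that redelivered events are applied once.
type IdempotentHandler struct {
	mu   sync.Mutex
	seen map[string]struct{}
	next func(Event) error
}

// NewIdempotentHandler wraps next with duplicate detection.
func NewIdempotentHandler(next func(Event) error) *IdempotentHandler {
	return &IdempotentHandler{seen: make(map[string]struct{}), next: next}
}

// Handle returns (true, nil) when the event was processed and (false, nil)
// when it was a duplicate and skipped. Holding the lock while next runs
// serializes processing, which keeps the sketch simple.
func (h *IdempotentHandler) Handle(e Event) (bool, error) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, dup := h.seen[e.ID]; dup {
		return false, nil
	}
	if err := h.next(e); err != nil {
		return false, err // not marked as seen, so a redelivery can retry
	}
	h.seen[e.ID] = struct{}{}
	return true, nil
}
```

With this wrapper, a broker that delivers `order.created` twice triggers the wrapped handler only once, while a failed attempt stays eligible for retry.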
5. Security Considerations
- Implement zero-trust networking
- Use service-to-service authentication
- Encrypt data in transit and at rest
Conclusion
Microservices architecture offers significant benefits in terms of scalability, maintainability, and team autonomy. However, it also introduces complexity in areas such as distributed systems management, data consistency, and operational overhead.
Success with microservices requires careful planning, robust tooling, and a strong focus on operational excellence. The patterns and practices outlined in this guide provide a foundation for building resilient, scalable microservices systems that can evolve with your business needs.
Remember that microservices are not a silver bullet – they should be adopted when the benefits outweigh the complexity they introduce. Start with a clear understanding of your domain, invest in proper tooling and monitoring, and evolve your architecture incrementally based on real-world feedback and requirements.