2026-05-24 - Redis resilience against Upstash idle disconnects
Problem
In production, AuthService.getProfile was returning User profile not found (404) for users whose profiles existed. Logs showed the underlying error: RedisError [ERR_REDIS_CONNECTION_CLOSED]: Connection has failed. Two issues stacked:
- Upstash closes idle TCP connections after a few minutes. Bun's RedisClient reconnects, but commands in flight when the socket drops (or arriving during the reconnect window) fail immediately with ERR_REDIS_CONNECTION_CLOSED. This is unlike ioredis, which queues commands during reconnects and hid the issue.
- getProfile's catch block converted any thrown error into NotFoundException, so Redis failures masqueraded as missing profiles.
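
The masking described above can be reproduced in a few lines. This is a minimal sketch, not the project's code: NotFoundException, RedisConnectionError, getProfileMasked, and failingLoader are hypothetical stand-ins chosen for illustration.

```typescript
// Hypothetical stand-ins for the framework exception and the Redis client error.
class NotFoundException extends Error {
  status = 404;
}

class RedisConnectionError extends Error {
  code = "ERR_REDIS_CONNECTION_CLOSED";
}

// Catch-all handler: whatever the loader throws is reported as "not found".
function getProfileMasked(loadProfile: () => string): string {
  try {
    return loadProfile();
  } catch {
    throw new NotFoundException("User profile not found");
  }
}

// Simulates a cache read that fails while the socket is reconnecting.
function failingLoader(): string {
  throw new RedisConnectionError("Connection has failed");
}
```

Calling getProfileMasked(failingLoader) throws the 404-style exception even though no profile lookup ever ran, which is how an infrastructure failure ends up looking like missing data.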
Fix
- CacheService now swallows all store errors. A Redis outage degrades to "always miss, never write" instead of throwing - reads fall through to Supabase, writes silently skip.
  - get / update on store error → log + treat as cache miss (return null / false).
  - set / delete / deleteByPrefix / clear on store error → log + no-op.
- RedisStore sends a PING every 60s to prevent Upstash from closing the socket, and is constructed with autoReconnect: true, maxRetries: 20, idleTimeout: 0.
- CacheService implements OnModuleDestroy to tear down the keepalive interval and close the Redis socket on shutdown.
- AuthService.getProfile no longer wraps non-domain errors. Only Supabase row-not-found produces NotFoundException; everything else surfaces as a real 500.
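
The "always miss, never write" behaviour could look roughly like the sketch below. It is an illustration under assumptions, not the actual CacheService: the Store interface, the ResilientCache class, the injected log function, and brokenStore are all hypothetical, and only get and set are shown.

```typescript
// Hypothetical store contract; the project's real CacheInterface may differ.
interface Store {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

type Log = (message: string) => void;

class ResilientCache {
  private readonly store: Store;
  private readonly log: Log;

  constructor(store: Store, log: Log) {
    this.store = store;
    this.log = log;
  }

  // Read path: a store failure is logged and reported as a cache miss.
  async get(key: string): Promise<string | null> {
    try {
      return await this.store.get(key);
    } catch (err) {
      this.log(`cache get failed for ${key}: ${String(err)}`);
      return null;
    }
  }

  // Write path: a store failure is logged and the write is skipped.
  async set(key: string, value: string): Promise<void> {
    try {
      await this.store.set(key, value);
    } catch (err) {
      this.log(`cache set failed for ${key}: ${String(err)}`);
    }
  }
}

// A store that always fails, standing in for a dropped Redis connection.
const brokenStore: Store = {
  get: async () => {
    throw new Error("ERR_REDIS_CONNECTION_CLOSED");
  },
  set: async () => {
    throw new Error("ERR_REDIS_CONNECTION_CLOSED");
  },
};
```

With brokenStore behind it, get resolves to null and set resolves without throwing, so a caller simply falls through to the primary database. Injecting the logger keeps the failures visible while leaving the request path unaffected.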
Operational notes
- REDIS_URL must use rediss:// (TLS) for Upstash.
- Watch logs for sustained Redis keepalive ping failed warnings - sporadic ones during reconnect are expected, sustained failures indicate credentials/URL/network issues.
- If ERR_REDIS_CONNECTION_CLOSED continues to appear after this fix, consider migrating RedisStore to @upstash/redis (HTTPS, no persistent socket). The CacheInterface abstraction makes this a single-file change.
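
One way to catch a misconfigured REDIS_URL early is a startup guard along these lines. This is only a sketch: assertTlsRedisUrl is a hypothetical helper that does not exist in the codebase, and the host in the usage line is a placeholder.

```typescript
// Hypothetical startup check; the function name is illustrative only.
function assertTlsRedisUrl(redisUrl: string): void {
  // "rediss://" (double s) selects TLS; plain "redis://" does not.
  if (!/^rediss:\/\//.test(redisUrl)) {
    throw new Error("REDIS_URL must use the rediss:// scheme (TLS) for Upstash");
  }
}

// Placeholder URL for illustration; passes the check without throwing.
assertTlsRedisUrl("rediss://default:secret@example.invalid:6379");
```

Failing fast at boot turns a scheme typo into an obvious configuration error rather than a stream of connection failures at runtime.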
See the Cache module docs for the full architecture and the Resilience section for the error-handling contract.