SaaS Platform unavailable

Incident Report for Lucidworks Platform

Postmortem

Summary

On December 3, 2025, at 13:30 UTC, the Lucidworks SaaS Platform experienced widespread service disruptions affecting multiple clients. The disruption affected Lucidworks AI functionality (including Neural Hybrid Search), Commerce Studio and Analytics Studio accessibility, and Connected Search. API requests failed with HTTP 401 and 403 errors carrying invalid or expired token messages, and the platform.lucidworks.com UI became inaccessible. During this time, Lucidworks infrastructure also experienced a high volume of malicious traffic from external sources.

Lucidworks Engineering resolved the issue by 22:03 UTC on December 3, 2025, restoring service for all affected products.

A similar issue occurred on December 5, 2025, beginning at 16:17 UTC. Following further coordination with our third-party identity provider (explained in detail below), this repeat incident was resolved by 19:00 UTC that day.

Root Cause

The incident was caused by a combination of three interconnected issues that occurred simultaneously on December 3, 2025. Okta, Lucidworks' identity provider for Platform authentication, experienced a service outage, and in response implemented aggressive rate-limiting measures. During this period, Lucidworks Platform IP addresses were blocked by Okta for an extended duration, preventing the Lucidworks authentication system from retrieving JSON Web Key Sets (JWKS) from Okta's servers. This resulted in HTTP 401 invalid/expired token errors for authentication requests.

Concurrent with the Okta outage, Lucidworks Platform infrastructure experienced a distributed denial of service (DDoS) attack in the form of a high rate of malicious requests from multiple geographic sources. The attack traffic significantly exceeded Okta's rate limit thresholds and contributed to Okta's decision to block the Lucidworks Platform IP addresses.

Additionally, for some Lucidworks Search customers, a configuration flag (failOnError) in the Core Package vectorization stage was inadvertently left enabled in the Production environment following earlier debugging activities. When the Okta authentication issue prevented access to Lucidworks AI services, queries using Neural Hybrid Search failed completely rather than falling back to lexical-only search, amplifying the impact for clients using Neural Hybrid Search functionality.
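
The difference between the two behaviors can be illustrated with a minimal Python sketch. This is not the Fusion pipeline code: run_query, vectorize, hybrid_search, lexical_search, and VectorizationError are hypothetical names introduced for illustration, and the fail_on_error parameter only mirrors the role of the failOnError flag described above.

```python
from typing import Callable, List


class VectorizationError(Exception):
    """Hypothetical error raised when the vectorization service is unreachable."""


def run_query(
    query: str,
    vectorize: Callable[[str], List[float]],
    hybrid_search: Callable[[str, List[float]], List[str]],
    lexical_search: Callable[[str], List[str]],
    fail_on_error: bool = False,
) -> List[str]:
    """Run a hybrid query, optionally degrading to lexical-only search."""
    try:
        vector = vectorize(query)
    except VectorizationError:
        if fail_on_error:
            # Debug-style behavior: propagate the failure, so the query fails.
            raise
        # Resilient behavior: skip the vector step and search lexically.
        return lexical_search(query)
    return hybrid_search(query, vector)
```

With fail_on_error left at False, an unreachable vectorization service costs relevance but the query still returns results. With it set to True, the same outage turns every such query into an error, which matches the amplified impact described above.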

Lucidworks determined the root causes through analysis of Okta service status, review of authentication logs showing JWKS retrieval failures, traffic analysis identifying the DDoS attack pattern, coordination with Okta support, and code review of affected query pipeline configurations. Changing the Lucidworks Platform IP address immediately restored authentication functionality, confirming that Okta's IP blocking was the primary cause of the authentication failures.

Two days later, on December 5, 2025, the Lucidworks Platform again began to experience 401 invalid/expired token errors for authentication requests. We again had to coordinate with Okta support to update their IP address allowlist in order to ensure our traffic was not blocked or unnecessarily rate-limited.

Lucidworks Actions

Lucidworks has taken the following actions as a result of this incident:

  • Implemented enhanced rate limiting, DDoS protection, and firewall rules to defend against future attacks
  • Modified affected client query pipelines to disable the failOnError flag and enable automatic lexical-only fallback when vectorization services are unavailable
  • Changed the Lucidworks Platform IP address to restore authentication services and coordinated with Okta to maintain proper IP allowlisting
  • Escalated with Okta support to resolve the IP blocking issue and ensure continuity of authentication services
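
This report does not describe how the enhanced rate limiting is implemented. As a general illustration of the technique, the sketch below shows a token bucket limiter in Python. The TokenBucket class, its parameters, and the injectable clock are assumptions made for this example, not details of the Lucidworks firewall configuration.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket: allows bursts up to `capacity`,
    then sustains at most `rate` requests per second."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In practice a limiter like this would typically be keyed per client or per source IP, so that a flood from one source exhausts only its own bucket and does not consume a shared upstream quota.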

Additionally, we intend to take the following actions to further enhance our ability to detect, withstand, and respond to similar incidents in the future:

  • Ensure debug configuration flags are removed from all Production query pipelines to prevent similar failures during future service disruptions
  • Implement caching mechanisms for JWKS to provide fallback authentication capabilities during Okta service disruptions
  • Enhance our firewall’s adaptive protection capabilities to more quickly detect and mitigate DDoS attacks
  • Implement improved synthetic monitoring with automated incident detection to reduce time to detection for similar issues
  • Establish a process to proactively notify Okta when Lucidworks Platform IP addresses change to maintain current allowlist configurations
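
The planned JWKS caching has not been specified in detail. As a rough sketch of the idea, the Python example below caches the most recently fetched key set and serves it when a refresh fails. JwksCache, the fetch callable, and the ttl_seconds and max_stale_seconds parameters are hypothetical names and values chosen for illustration. They are not part of the Okta API or of the Lucidworks implementation.

```python
import time
from typing import Any, Callable, Dict, Optional


class JwksCache:
    """Illustrative JWKS cache that falls back to stale keys on fetch failure."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, Any]],
        ttl_seconds: float = 300.0,
        max_stale_seconds: float = 86400.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # hypothetical callable that downloads the JWKS
        self._ttl = ttl_seconds
        self._max_stale = max_stale_seconds
        self._clock = clock
        self._keys: Optional[Dict[str, Any]] = None
        self._fetched_at = 0.0

    def get(self) -> Dict[str, Any]:
        now = self._clock()
        age = now - self._fetched_at
        if self._keys is not None and age < self._ttl:
            return self._keys  # fresh enough: no network call
        try:
            keys = self._fetch()
        except Exception:
            # Provider unreachable or blocking requests: serve stale keys
            # as long as they are within the allowed staleness window.
            if self._keys is not None and age < self._max_stale:
                return self._keys
            raise
        self._keys = keys
        self._fetched_at = now
        return keys
```

The staleness window is a trade-off. A longer window keeps token validation working through a longer provider outage, but it also delays the point at which a rotated or revoked signing key stops being trusted.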

Recommended Client Actions

Lucidworks recommends that all clients using Neural Hybrid Search upgrade to Fusion 5.9.15 or later as soon as possible. This version includes enhanced failsafe fallback mechanisms in the Neural Hybrid Query stage that automatically switch to lexical-only queries when Lucidworks AI services are unreachable, improving overall system resilience during service disruptions.

Lucidworks also recommends that clients subscribe to Lucidworks status updates to receive real-time notifications about Lucidworks SaaS Platform incidents. To enable this feature, click Subscribe to Updates on status.lucidworks.com.

Posted Dec 12, 2025 - 13:19 PST

Resolved

The Lucidworks SaaS Platform is now fully restored and operational. The third-party network issue has been successfully fixed and full stability has been confirmed during our monitoring period.

We thank you for your patience. A detailed postmortem will be posted here as soon as it is available.
Posted Dec 03, 2025 - 14:05 PST

Monitoring

The core issue has been successfully mitigated by our third-party vendor. The Lucidworks SaaS Platform is now entering a critical monitoring phase as our teams verify full stability and functionality across all services (Lucidworks AI, Commerce Studio, etc.). We anticipate a full resolution announcement shortly.
Posted Dec 03, 2025 - 13:01 PST

Update

The Lucidworks SaaS Platform remains unavailable due to the external third-party issue. The vendor has successfully identified the root cause within their system and their engineering teams are now actively working on the corrective fix. We are closely monitoring their progress and maintaining direct communication. We will post our next status update within 30 minutes.
Posted Dec 03, 2025 - 12:48 PST

Update

The Lucidworks SaaS Platform remains unavailable as our teams are still actively engaged with the third-party service vendor. We have successfully identified the root cause as a network infrastructure issue on the vendor's side, and we are closely collaborating with their engineering team as they prioritize and execute the necessary corrective actions to restore service stability and platform connectivity.
Posted Dec 03, 2025 - 12:16 PST

Update

The Lucidworks Platform remains unavailable due to an outage with a critical third-party service provider. Our teams are working directly with the vendor's engineers to restore connectivity and service stability as quickly as possible.
Posted Dec 03, 2025 - 11:28 PST

Identified

The Lucidworks SaaS Platform is currently experiencing an unexpected outage.

This impacts all services, including Lucidworks AI, Commerce Studio, Analytics Studio, and Connected Search.

We are actively investigating the root cause and deploying resources to restore full functionality as quickly as possible.

Next Update: We will post a full status update within 30 minutes.

We apologize for the disruption and appreciate your patience.
Posted Dec 03, 2025 - 10:30 PST

Investigating

We are currently investigating this issue.
Posted Dec 03, 2025 - 11:30 PST
This incident affected: Lucidworks Platform (User Logins & Configuration).