SaaS Platform unavailable

Incident Report for Lucidworks Platform

Postmortem

Summary

On June 12, 2025, at 18:06 UTC, Lucidworks’ internal monitoring on Google Cloud Platform (GCP) and access to the GCP console became unavailable. Shortly thereafter, the Lucidworks Platform became largely non-functional, including both the console UI and APIs. Because the GCP interface could not be used, the issue was promptly reported to Google through a non-incident channel.

Once internal teams confirmed that GCP was experiencing a widespread outage and that Lucidworks Platform components were unable to respond, mitigation actions were initiated and affected clients were alerted. The majority of Lucidworks clients were not impacted, and Lucidworks Search functionality remained operational throughout the incident.

Google began recovery efforts, and by June 12, 2025, at 19:30 UTC, Google reported that most regions were recovering, although some products and services continued to experience issues. The us-central1 region, which is critical for core Platform services, remained affected. Full functionality of the Lucidworks Platform was not restored until June 12, 2025, at 21:00 UTC, when Google implemented a full mitigation that also covered this region.

Affected client sites were verified as operational, and clients were informed that services had resumed. Interim mitigations were reverted, and full Platform functionality was restored.

Root Cause

The Lucidworks Platform runs entirely on Google Cloud Platform. While redundancy is built in at several layers throughout the system to ensure it is highly available and able to withstand the failure of components at many levels, a global GCP outage such as the one experienced in this incident is not something we have architected for at this stage.

A widespread outage affecting almost all Google Cloud Platform (GCP) services impacted core GCP functionality. The most significant impact was in the us-central1 region, which is critical for Lucidworks Platform operations because it is where our Platform UI and central configuration reside. As a result, the Platform UI at platform.lucidworks.com was unavailable throughout the incident, making it impossible to modify service configurations or to deploy new search experiences.

Customers deployed on our managed Lucidworks Search product were largely unaffected by this incident, as core indexing and search functionality does not rely upon GCP APIs in any way. However, those who have implemented Neural Hybrid Search (NHS), which uses Lucidworks AI (LWAI) to power vector embeddings for semantic search capabilities, experienced degraded service for the duration of the incident. The LWAI APIs were unavailable because that component relies on GCP for basic functionality, so we switched these environments back to pure lexical search while the incident was active. Once services recovered, we re-enabled full hybrid lexical-semantic search capabilities. In these scenarios, Lucidworks Search continues to serve search results for customers’ environments, though at a reduced level of relevance while vector embeddings cannot be retrieved from LWAI.
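
The fallback described above can be illustrated with a minimal sketch. This is not Lucidworks code: the report describes the switch to lexical search as an action taken by Lucidworks during the incident, whereas the sketch shows a hypothetical per-query version of the same idea. The names `EmbeddingUnavailable`, `SearchPlan`, `plan_search`, and the `embed` callable are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


class EmbeddingUnavailable(Exception):
    """Raised by the (hypothetical) embedding client when the service cannot be reached."""


@dataclass
class SearchPlan:
    mode: str                      # "hybrid" or "lexical"
    query: str
    vector: Optional[List[float]]  # None when running lexical-only


def plan_search(query: str, embed: Callable[[str], Sequence[float]]) -> SearchPlan:
    """Build a hybrid plan, degrading to lexical-only if the embedding call fails."""
    try:
        vector = list(embed(query))
    except EmbeddingUnavailable:
        # No vector available: search still works, with reduced relevance.
        return SearchPlan(mode="lexical", query=query, vector=None)
    return SearchPlan(mode="hybrid", query=query, vector=vector)
```

In this sketch, a caller would pass whatever function requests the embedding. When that function raises `EmbeddingUnavailable`, the query is still served using lexical matching alone.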

Customers on our Connected Search product were more significantly impacted, experiencing a search query outage from 10:49am to 11:55am PDT (17:49 to 18:55 UTC), in addition to losing the ability to manage their applications’ configurations via our Platform UI until 21:00 UTC.

Lucidworks Actions

Lucidworks has taken the following actions as a result of this incident:

  • We proactively alerted affected clients via our Support system.
  • For clients in production with Neural Hybrid Search, who were specifically affected by Lucidworks AI API downtime, we temporarily fell back to pure lexical search to prevent service disruption until the incident was resolved.

Lucidworks will take the following actions as a result of this incident:

  • Develop multi-region model deployments for LWAI within the Lucidworks Platform. This will enable more graceful and transparent fallback in scenarios where Lucidworks Search NHS capability is impacted by an LWAI issue in a particular GCP region.
  • Develop and implement more robust region failover strategies for critical services. We will perform a service-by-service audit to ensure that automated failovers are architected throughout our Platform products, and that robust response procedures are documented and exercised.
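
As a rough illustration of what region failover means in practice, the sketch below picks the first healthy region from a preference list. It is a hypothetical example, not a description of the planned Lucidworks design: `pick_region`, the `is_healthy` callback, and the use of us-east1 as a secondary region are assumptions made for illustration only.

```python
from typing import Callable, Iterable, Optional


def pick_region(preferred: Iterable[str], is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first region, in preference order, that passes the health check."""
    for region in preferred:
        if is_healthy(region):
            return region
    # No region is healthy, e.g. during a provider-wide outage.
    return None


# Hypothetical preference order; only us-central1 is named in this report.
REGIONS = ["us-central1", "us-east1"]
```

A sketch like this covers the failure of a single region. It returns `None` when every region fails its health check, which is the situation a provider-wide outage would produce.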

Recommended Client Actions

There are no recommended client actions as a result of this incident.

Posted Jun 16, 2025 - 15:48 PDT

Resolved

We have confirmed that all Lucidworks Platform products and services are once again fully functional, including LWAI and all Neural Hybrid Search functionality for Lucidworks Search customers.
Posted Jun 12, 2025 - 13:43 PDT

Monitoring

Our cloud provider has resolved the underlying issue, and Lucidworks Platform services are recovering. Lucidworks engineers are monitoring the recovery of our services and will provide another update once we've confirmed that everything is fully functional again.

Our Platform UI is once again accessible.

Connected Search functionality has fully recovered.
Posted Jun 12, 2025 - 13:27 PDT

Update

Our cloud provider is actively mitigating the issue. We are monitoring the recovery and will keep impacted customers informed of the progress.
Posted Jun 12, 2025 - 12:54 PDT

Update

Our cloud provider has indicated that they are in the process of mitigating the underlying issue affecting us. We are continuing to monitor and to notify impacted customers.
Posted Jun 12, 2025 - 12:14 PDT

Investigating

Due to a widespread outage with our public cloud provider, the SaaS Platform user console is currently unavailable. This incident is affecting Lucidworks customers' ability to use LWAI, as well as Connected Search's ability to serve queries. We are actively working with our provider to reach a resolution as soon as possible, and will provide regular updates here.
Posted Jun 12, 2025 - 11:46 PDT
This incident affected: Lucidworks Platform (User Logins & Configuration UI, Integrations) and Connected Search (US Region Data Ingest, US Region Search APIs).