Lucidworks AI Hosted LLM Service Disruption

Incident Report for Lucidworks Platform

Resolved

The service disruption affecting certain Lucidworks AI hosted models has been fully resolved. End-user functionality is restored, and all previously affected and dependent services are operating normally.

The disruption originated during routine Kubernetes maintenance involving node upgrades, when a cloud provider capacity shortfall prevented the instances hosting these models from re-launching. Stability was restored after we launched new nodes and pulled the required images. We will share a postmortem report containing the full root cause analysis within three business days.

Posted May 28, 2026 - 18:08 UTC

Monitoring

We have launched new nodes with the required LLM images and redirected Lucidworks AI traffic to them. All Lucidworks AI hosted models are fully functional again, and the prediction endpoint is returning HTTP 200 responses. End-user functionality is restored, and queries are no longer returning errors. We are continuing to monitor metrics to confirm that the system remains stable before posting a final resolution update.

Posted May 28, 2026 - 18:01 UTC

Identified

The service disruption affecting certain Lucidworks AI hosted models is ongoing, and the models are currently unavailable. The disruption occurred during routine Kubernetes maintenance involving node upgrades, when a cloud provider capacity shortfall prevented the instances hosting these models from re-launching. We have secured the needed capacity and have engaged our cloud provider to resolve the remaining delays in bringing the models back up. We will provide further updates as additional information becomes available.

Posted May 28, 2026 - 17:39 UTC

Investigating

Certain Lucidworks AI hosted models (llama-3-8b-instruct, llama-3v2-3b-instruct, and phi-4-multimodal-instruct) in the us-southcarolina region are experiencing a service disruption and are currently unavailable. Queries that rely on these Lucidworks-hosted models are returning HTTP 500 or 429 errors, so end users of services that depend on these models are encountering errors. Passthrough LLM calls remain fully functional. We are investigating the cause and are actively working to restore full availability. We will provide further updates as new information becomes available.

Posted May 28, 2026 - 17:14 UTC

This incident affected: Lucidworks AI.