
MD Express: RabbitMQ Timeouts and PRECONDITION_FAILED #158

@bluna301


We are currently running a Bone Age MAP on MONAI Deploy Express v0.6.0 in shadow mode. We are receiving clinical studies via Compass (COMPASS-SCP) and launching the MAP. We are exporting the produced DICOM SR to the MDE ORTHANC instance and back to Compass, and we are also exporting the source DICOM study to MDE ORTHANC. Workflow definition is attached.

The problem we are seeing is that a slew of RabbitMQ errors are being produced during workflow execution. They seem to come in two main types.


Timeout waiting for delivery acknowledgement - it looks like RabbitMQ expects the export service to acknowledge a message, but the acknowledgement never arrives, so the consumer times out after 30 minutes (1800000 ms):

Consumer 1 on channel 1 has timed out waiting for delivery acknowledgement. Timeout used: 1800000 ms.

basic.ack PRECONDITION_FAILED with unknown delivery tag - I believe this occurs when a consumer acknowledges a message using a delivery tag that RabbitMQ no longer recognizes on that channel:

operation basic.ack caused a channel exception precondition_failed: unknown delivery tag X
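To make the suspected relationship between the two errors concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: `ToyChannel`, `simulate_late_ack`, and the message names are hypothetical, and the model does not reproduce RabbitMQ's actual implementation or the MDE export service. It only illustrates that delivery tags are scoped to a channel, so an acknowledgement sent after the channel was closed by the timeout can refer to a tag the broker no longer knows.

```python
# Toy model only: mimics channel-scoped delivery tags, not RabbitMQ itself.
ACK_TIMEOUT_MS = 1_800_000  # value reported in the timeout log line (30 minutes)


class PreconditionFailed(Exception):
    """Stands in for the channel exception raised on an unknown delivery tag."""


class ToyChannel:
    def __init__(self):
        self.next_tag = 1
        self.unacked = {}  # delivery tag -> (message, delivery time in ms)

    def deliver(self, message, now_ms):
        """Hand a message to the consumer and return its delivery tag."""
        tag = self.next_tag
        self.next_tag += 1
        self.unacked[tag] = (message, now_ms)
        return tag

    def check_timeout(self, now_ms):
        """If any ack is overdue, drop all unacked state and return the messages to requeue."""
        overdue = any(now_ms - t > ACK_TIMEOUT_MS for _, t in self.unacked.values())
        if not overdue:
            return []
        requeued = [message for message, _ in self.unacked.values()]
        self.unacked.clear()
        return requeued

    def ack(self, tag):
        """Acknowledge a delivery; unknown tags raise, as in the second error type."""
        if tag not in self.unacked:
            raise PreconditionFailed(f"unknown delivery tag {tag}")
        del self.unacked[tag]


def simulate_late_ack():
    old = ToyChannel()
    tags = [old.deliver(f"study-{i}", now_ms=0) for i in (1, 2, 3)]
    old.ack(tags[0])
    old.ack(tags[1])  # the third delivery is never acknowledged in time

    later = ACK_TIMEOUT_MS + 1
    requeued = old.check_timeout(later)  # first error type: ack timeout

    new = ToyChannel()
    for message in requeued:
        new.deliver(message, later)  # redelivered with a fresh tag (1)

    try:
        new.ack(tags[2])  # late ack still uses the old tag (3)
    except PreconditionFailed as exc:
        return requeued, str(exc)  # second error type
    return requeued, None
```

In this model, calling `simulate_late_ack()` yields the requeued message together with an "unknown delivery tag" error, which matches the order in which the two log messages would appear if a slow consumer is indeed the cause.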

Possible error causes.


These timeouts are causing exports to either be delayed (for around 30 minutes, until the timeout occurs) or cancelled entirely (the export tasks time out, as can be seen in the Workflow Manager logs).

I have attached the Informatics Gateway, Workflow Manager, Task Manager, and RabbitMQ container logs (PHI redacted). For these logs, 5 studies were sent to the deployment in relatively quick succession (the deployment was down, so Compass queued the studies and sent them once the deployment was up again). I believe the behavior observed for these 5 studies was:

  • 2 completed exports in real time / expected time (~10 to 15 seconds)
  • 1 completed export after the 30 minute timeout period
  • 2 failed exports as the tasks timed out

The same RabbitMQ errors are observed when studies are sent in spaced intervals (i.e. hours between study sends), so the issue is not isolated to this batch send occurrence.
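For anyone comparing the batch send with the spaced sends, a small helper along these lines could tally the two error types in a RabbitMQ container log. This is a rough sketch: the function name `tally_errors` and the counter keys are hypothetical, and the patterns are built only from the two messages quoted above, so they may need adjusting to the exact log format.

```python
import re
from collections import Counter

# Patterns derived from the two error messages quoted in this issue.
TIMEOUT_RE = re.compile(
    r"timed out waiting for delivery acknowledgement\. Timeout used: (\d+) ms"
)
UNKNOWN_TAG_RE = re.compile(
    r"basic\.ack caused a channel exception precondition_failed: "
    r"unknown delivery tag (\d+)"
)


def tally_errors(lines):
    """Count occurrences of each error type across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        if TIMEOUT_RE.search(line):
            counts["ack_timeout"] += 1
        elif UNKNOWN_TAG_RE.search(line):
            counts["unknown_delivery_tag"] += 1
    return counts
```

It can be run over a saved log with something like `with open("rabbitmq.log") as f: print(tally_errors(f))`, where the file name is a placeholder for wherever the container log was exported.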


Any insights into a possible cause or solution for this problem would be greatly appreciated. As some additional context, in all cases (irrespective of export failure or not), MAPs are launched successfully, and the input DICOM study and output DICOM SR are populated in the minio and mdtm folders. The RabbitMQ issues seem to only block the export of DICOM data.
