
Storage System Performance Update

To all chip users,

The DoIT Research Computing & Data (RCD) Team is continuing to work on the storage system performance issues that have been impacting the chip cluster since early February. For some users the problem is intermittent or not even noticeable; for others it is persistent. When present, it shows up as unusually long file read, write, and load times, affecting operations as simple as “ls” or “cd” when navigating the filesystem and as involved as Slurm jobs, which run longer than expected. The RCD Team has been working with support vendors and other institutions to better understand the problem affecting the storage system. We apologize for this interruption to productivity. This is not expected behavior, and we are working to resolve it.

We believe we now have a better understanding of the issue at large, and we are continuing to work with our support vendor on mitigation strategies and benchmarking. In the meantime, we have increased the number of machines handling metadata fivefold. This spreads the workload more evenly and helps ensure that a metadata service failure does not take everything down at once and kill all client connections. Please note that this change is not expected to increase speed.

The team continues to encourage all users to report any problems that appear to stem from client connections to their data. These problems mostly show up during I/O-intensive actions such as installing large packages or making large transfers of files between directories (locally or from the internet).

The team will post weekly updates and will share substantive developments as they occur.

 

Max Breitmeyer

HPC Specialist

UMBC DoIT

Posted: May 8, 2026, 5:27 PM