
September HPCF Newsletter

Hi everyone,

I'm writing to give everyone an update regarding the UMBC HPCF and other Research Computing projects currently undertaken by DoIT.

Summary

  • Office hours! We have them! They are available by appointment at https://hpcf.umbc.edu/help/office-hours/ or as in-person drop-ins on Fridays at 1:30 PM in Engineering Building Room 102
  • We've added some new GPU partitions to make it clearer which nodes are "safe" from preemption
  • Staff will continue reaching out to PIs to schedule migrations to the new RRStor system
  • SIG meeting updates and the new "User Support" SIG group
  • SSH connection issues

Office Hours
In coordination with the ScaleS center, we are excited to announce that drop-in office hours have returned for this academic semester. They will be held weekly on Fridays at 1:30 PM in Engineering Building Room 102. In fact, we already held our first drop-in office hours on September 19th, and it was a SMASHING success.

When coming with questions, please note that drop-in hours are first come, first served, and have as much information ready as possible. Remember, the faster we can get started on your issue, the faster we can fix it. We will continue to offer one-on-one appointments for more in-depth problems or if the drop-in hours don't work for you.

You can see a listing of all drop-in office hour times on the HPCF myUMBC group events listing: https://my3.my.umbc.edu/groups/hpcf/events 

For scheduling an office hour appointment, please refer to this page: https://hpcf.umbc.edu/help/office-hours/ 

GPU Partitions
During our last downtime, one of our main goals was to implement a new GPU Slurm model, where contributing PIs have priority on machines that they helped to purchase.

To relieve some of the confusion around these changes, we've added two new partitions: gpu-general and gpu-contrib. The gpu-general partition places a job on a node that is not owned by one of the contributing PIs. When this partition is specified in your Slurm commands, the job is guaranteed to run for at least three days without being preempted. The gpu-contrib partition does not accept job submissions; it exists only to show users which nodes are owned by one (or more) PIs.

If you don't care what node you are placed on, you can still use the standard gpu partition.
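
As a rough illustration of choosing between these partitions, here is a minimal Python sketch that writes out a Slurm batch script header. The partition names come from this newsletter; everything else is an assumption made for illustration, including the helper name render_sbatch_script, the single-GPU --gres request, the three-day --time value, and the train.py command. Your account may need other options, so treat this as a starting point rather than the official submission template.

```python
def render_sbatch_script(partition, command, gpus=1, time_limit="3-00:00:00"):
    """Return the text of a Slurm batch script for the given partition.

    Hypothetical helper: the defaults (one GPU, three-day limit) are
    illustrative assumptions, not documented cluster settings.
    """
    if partition == "gpu-contrib":
        # Per the newsletter, gpu-contrib only shows which nodes are PI-owned.
        raise ValueError("gpu-contrib does not accept job submissions")
    if partition not in {"gpu", "gpu-general"}:
        raise ValueError(f"unexpected partition: {partition}")

    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",  # gpu-general avoids PI-owned nodes
        f"#SBATCH --gres=gpu:{gpus}",        # number of GPUs requested
        f"#SBATCH --time={time_limit}",      # days-hours:minutes:seconds
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a script that could be saved to a file and passed to sbatch.
    print(render_sbatch_script("gpu-general", "python train.py"))
```

Passing "gpu" instead of "gpu-general" produces the same script for the standard partition, where placement on any node is acceptable.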


Migration of Storage Volumes
While the new Ceph Storage Cluster (RRStor) is here to stay, the Isilon storage system will be sunset at the end of this calendar year. The Isilon storage system was purchased at the start of the COVID-19 Pandemic, but its proprietary file system, high cost of expansion, and high software support costs make it no longer feasible for us to maintain.

Instead, we'll be taking a portion of the saved expense from the Isilon and dedicating it to expanding RRStor and its capabilities. This will require that we migrate research storage volumes from the Isilon Storage System to Ceph before the end of the calendar year. This means carefully moving more than 2 PiB of data, so DoIT Research Computing staff will be working with researchers to schedule the migration of their storage volumes so as to minimize impact on the research enterprise. Note that the duration of each volume migration will depend on the size of the volume.

Staff will continue reaching out to PIs to schedule migrations. At this time, about 10% of all migrations are complete. We ask that you continue to work with us so that all volumes are scheduled by October 15th and the Isilon volumes can be sunset completely by Thanksgiving.

SIG
The SIG-CPU and SIG-GPU Committees have met and selected members. Please see the Shared Infrastructure Governance webpage for more information and past meeting materials. The following SIG Committees have meetings scheduled, and these meetings will be listed on the myUMBC Group Events page.

  • The CPU Committee will hold a meeting of voting members on October 2, 2025
  • The GPU Committee will meet on October 27, 2025 at 13:30 ET. All researchers using GPUs are welcome. Location TBD
  • The Cloud Committee is working to identify membership and meeting times
  • The User Experience Committee will meet on the third Tuesday of each month at 13:30 ET. These user experience meetings are open to any user of the chip HPC. Location TBD

SSH Connection Issues
We've seen an increase in the number of users reporting connection issues, specifically repeated closures of their active SSH connections. We're currently working alongside our DoIT network engineering team to determine the cause of the issue. Any information about a dropped connection you might have experienced will help us. The information we need is the following:

  • Date and time of login
  • Date and time of dropped connection
  • Are you on or off campus?
  • Are you on a wired or wireless network?
  • Are you using the VPN?
  • Which login node did you connect to initially (chip-login1/2)?
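
If it helps to keep these details together, the checklist above can be captured in a small record. The following is only a sketch: the class name DroppedConnectionReport, its field names, and the output layout are hypothetical and not an official reporting format, and the node names assume that "chip-login1/2" means chip-login1 and chip-login2.

```python
from dataclasses import dataclass
from datetime import datetime


def _yes_no(flag: bool) -> str:
    """Render a boolean as a plain yes/no answer."""
    return "yes" if flag else "no"


@dataclass
class DroppedConnectionReport:
    """One dropped SSH session, with the details requested in the list above."""
    login_time: datetime
    drop_time: datetime
    on_campus: bool
    wired: bool
    using_vpn: bool
    login_node: str  # assumed to be "chip-login1" or "chip-login2"

    def to_ticket_text(self) -> str:
        """Format the report as plain text for pasting into a ticket."""
        fmt = "%Y-%m-%d %H:%M"
        return "\n".join([
            f"Login: {self.login_time.strftime(fmt)}",
            f"Dropped: {self.drop_time.strftime(fmt)}",
            f"On campus: {_yes_no(self.on_campus)}",
            f"Network: {'wired' if self.wired else 'wireless'}",
            f"VPN: {_yes_no(self.using_vpn)}",
            f"Login node: {self.login_node}",
        ])


if __name__ == "__main__":
    # Example values are made up for illustration.
    report = DroppedConnectionReport(
        login_time=datetime(2025, 9, 30, 9, 15),
        drop_time=datetime(2025, 9, 30, 9, 42),
        on_campus=True,
        wired=False,
        using_vpn=False,
        login_node="chip-login1",
    )
    print(report.to_ticket_text())
```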

Publications
If you have any publications, presentations, theses, or other works that made use of the campus cluster(s), please submit an RT Ticket with bibliographic information so that we can accurately reflect this work in our records and on the HPCF Website. 

Need Help?
As always, please communicate any issues/questions to the Research Computing RT Queue (hpcf.umbc.edu > User Support > Request Help).

Thanks for reading,
Roy Prouty
Assistant Director for Research Computing


Posted: September 30, 2025, 1:43 PM