System status and downtime
October 11, 2023: reduced Wheeler node availability
We are temporarily reducing the number of available compute nodes in the Wheeler cluster to maintain safe temperatures in the machine room. The “qgrok” command provides up-to-date information on the state of the cluster.
The increase in temperature is due to an issue with one of the cooling units. The required parts have been ordered, and we will return Wheeler to full capacity as soon as repairs are complete.
Currently running jobs will not be interrupted, but job queue times may increase.
Preventative Maintenance 2023
Dec 6: Xena and Wheeler
Dec 13: Taos, BeeGFS filesystem (/carc/scratch), Firewall, sshgw1 & 2, mail server, Serrano3
Dec 20: Hopper