Update: 3/20/24, 10:34am

Many college infrastructure services, such as cc.gatech.edu, ta-app.gatech.edu, and mailman.cc.gatech.edu, recovered around 8:15am. At this time all servers in the CCB data center have been powered on. TSO continues to address any unresolved issues. If you encounter any issues, please submit a request to helpdesk@cc.gatech.edu.

Your patience and understanding have been greatly appreciated. 

Update: 3/20/24, 5:45am

Chilled water service to the CCB247 Data Center has been fully restored with normal flow and temperature. TSO staff will be onsite at 7:00am to start powering up systems.

Update: 3/19/24, 2:13pm

The CCB247 Data Center is not receiving the full supply of chilled water needed to keep the data center temperature in a safe range. TSO monitoring has reported that temperatures in the CCB247 Data Center have begun rising. TSO will begin shutting down systems in the Data Center.

Update: 3/17/24, 6:31pm

TSO has been working diligently throughout the day to restore our research infrastructure and some core infrastructure. All servers in the CCB data center have been powered on. If you encounter any issues, please submit a request to helpdesk@cc.gatech.edu. We will start addressing any unresolved issues tomorrow at 8:00am.

Your patience and understanding have been greatly appreciated. 

---------------------------------------------------

Update: 3/17/24, 3:10pm

TSO arrived on-site at 9:00am, but the CCB data center temperature was not yet low enough to start powering on servers. This was because the Liebert cooling units, although powered on, were not running. GT Facilities resolved the issue at approximately 11:30am. At noon, TSO began powering on servers. Due to the severity of the outage, this process is taking a while as we troubleshoot and address hardware issues. We are also addressing issues resulting from OIT’s campus network maintenance this morning.

We appreciate your patience.


---------------------------------------------------

Update: 3/17/24, 8:05am

I&S found a control valve on the service loop that was not open, which prevented CCB from receiving proper chilled water service. The valve has been manually opened and normal chilled water service has been restored.

TSO will be on-site at 9:00am.

---------------------------------------------------

Update: 3/17/24, 6:51am

GT Facilities is past the point of backing out of the maintenance. They are exploring alternative measures to increase the chilled water capacity and are also in discussions with a vendor about the possibility of using a temporary chiller.

---------------------------------------------------

WHAT’S HAPPENING? 

We have been notified that the Data Center in CCB 247 is overheating. TSO members are working to shut down servers & clusters.
 

Affected servers & equipment:
Everything inside of CCB 247/Data Center
A follow-up list of all affected servers and equipment will be provided.
 

WHEN IS IT HAPPENING? 

At roughly 03:00 AM EST on March 17th, 2024, high temperatures and VM crashes were reported.
 

WHY IS IT HAPPENING? 

Campus work on chilled water this weekend was not supposed to impact CCB; however, it has. Details are unknown at this time; Facilities and TSO members have been notified.

WHO IS AFFECTED? 

Everyone who has equipment or virtual machines inside the College of Computing Data Center (CCB 247).

Affected services:
All servers and equipment inside the College of Computing Data Center, CCB 247.
 

WHAT DO YOU NEED TO DO? 

Nothing at this time.

WHO SHOULD YOU CONTACT FOR QUESTIONS? 

Feel free to contact the TSO Help Desk (404.894.7065, helpdesk@cc.gatech.edu). 

Owner of Alert
TSO/OIT