System Status


Staff are automatically paged and respond when an issue arises.

Key: a green dot means the system is up; a red dot means the system has an issue.

System Status Marker  Faculty/Staff Email (andromeda - normal maintenance is on Tuesdays, 6AM - 8:30AM)

System Status Marker  Student Email (pegasus - normal maintenance is on Tuesdays, 6AM - 8:30AM)

System Status Marker  WebMail (maintenance follows andromeda and pegasus above)

System Status Marker  Blackboard (normal maintenance is on Fridays, 9AM - 12PM)

System Status Marker  DS Exchange (normal maintenance is on Thursdays, 9PM - 2AM)

Please keep in mind the scheduled maintenance windows as detailed above when reporting problems.
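
Before reporting a problem, one way to check whether a given time falls inside one of the windows listed above is sketched here. This is only an illustration: the `WINDOWS` table and the `in_maintenance` function are hypothetical names, the times are assumed to be local campus time, and WebMail is assumed to share the Tuesday window because its maintenance follows andromeda and pegasus.

```python
from datetime import datetime, timedelta

# Weekly windows copied from the list above:
# (weekday, start hour, start minute, length in minutes).
# Monday is 0, matching datetime.weekday().
WINDOWS = {
    "andromeda": (1, 6, 0, 150),     # Tuesdays, 6AM - 8:30AM
    "pegasus": (1, 6, 0, 150),       # Tuesdays, 6AM - 8:30AM
    "webmail": (1, 6, 0, 150),       # assumed: follows andromeda and pegasus
    "blackboard": (4, 9, 0, 180),    # Fridays, 9AM - 12PM
    "ds_exchange": (3, 21, 0, 300),  # Thursdays, 9PM - 2AM (ends on Friday)
}


def in_maintenance(system: str, when: datetime) -> bool:
    """Return True if `when` (naive local time) is inside the system's weekly window."""
    weekday, hour, minute, length = WINDOWS[system]
    # Find the most recent window start at or before `when`.
    days_back = (when.weekday() - weekday) % 7
    start = (when - timedelta(days=days_back)).replace(
        hour=hour, minute=minute, second=0, microsecond=0
    )
    if start > when:
        # Same weekday but before the start time: use last week's window.
        start -= timedelta(days=7)
    # The end time is exclusive, so 8:30AM on a Tuesday counts as outside.
    return start <= when < start + timedelta(minutes=length)
```

Working from the most recent window start, rather than comparing clock times on a single day, is what lets the DS Exchange window run past midnight into Friday.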

System Log

More information on current and past service interruptions is provided by the log below. For brief problems, explanations will not be entered until the problem has been fixed. For longer problems, we will try to keep you up to date on what is being done.


    Newark Systems Log

  • January 30, 2013 4:03PM - 4:30PM, Andromeda-hosted websites and mail slowed down due to a server issue.

  • December 3, 2012 1:18PM - 1:40PM, Andromeda account creation and SSH service interruptions due to a hardware failure.

  • November 27, 2012 10:30AM - 11:30AM, Andromeda web service interruptions.

  • November 13, 2012 10:30AM - 10:45AM, Web services had interruptions due to a MySQL failure.

  • October 20, 2012 12AM - October 22, 2012 7:14AM, mail services on one antivirus/antispam machine stopped processing mail properly.

  • October 16, 2012 7AM - 1:40PM, all services were interrupted due to maintenance gone awry, followed by a network outage.

  • September 28, 2012 10AM - 3PM, Web services were sporadically interrupted due to a denial-of-service attack.

  • August 30, 2012 5:16AM - 8:10AM, Blackboard and email were interrupted due to power issues.

  • August 21, 2012 8:30AM - 9:35AM, Some web services were interrupted due to network issues.

  • August 21, 2012 8:30AM - 9:00AM, Email services were interrupted due to network issues.

  • August 11, 2012 4:40AM - 2:00PM, All services were interrupted due to network issues.

  • May 25, 2012 12:30AM - May 25, 2012 8:20AM, Andromeda web service interruptions.

  • April 27, 2012 11:30PM - April 28, 2012 2:00AM, All services were interrupted due to network issues.

  • April 24, 2012 8:30AM - 9:45AM, Andromeda and Pegasus web services were interrupted due to network issues after upgrades during the 6AM-8:30AM scheduled maintenance.

  • April 24, 2012 8:30AM - 9:20AM, Andromeda and Pegasus email was interrupted due to network issues after upgrades during the 6AM-8:30AM scheduled maintenance.

  • April 3, 2012 2:45PM - 4:22PM, Andromeda had web service interruptions due to a MySQL server crash.

  • January 25, 2012 5:12AM - 7:20AM, Andromeda had web service interruptions.

  • January 24, 2012 8:30AM - 9:00AM, Andromeda had web service interruptions (these began during the maintenance window which lasts until 8:30am).

  • January 12, 2012 4:37AM - 1:25PM, Andromeda had web service interruptions unrelated to yesterday's issue.

  • January 12, 2012 6:37PM - 11:25PM, Andromeda had web service interruptions due to a caching issue.

  • August 25, 2011 8:20AM - 8:30AM, Andromeda and the campus webserver had service interruptions due to power loss to the drive arrays.

  • July 13, 2011 12:00AM - 8:55AM, Webmail had service interruptions due to hardware issues.

  • June 7, 2011 7:30PM - 10:13PM, Andromeda, Pegasus, Webmail, and Blackboard had service interruptions due to a load balancer issue.

  • December 13, 2010, 5:11PM - 5:35PM, Andromeda, Pegasus, Webmail, and Blackboard had service interruptions due to a networking issue.

  • November 9, 2010, 6:30PM - 10:40PM, inbound email from sites we haven't seen before would have been delayed.

  • November 5, 2010, 2:00PM - 5:20PM, Andromeda email and webserver, and the Campus webserver, were unreachable because of a switch failure.

  • October 31, 2010, 1:00AM - 1:00PM, DHCP connection interruptions.

  • October 30, 2010, 7:00AM - October 31, 2010, 3:30PM, Sporadic Andromeda and pegasus email interruptions.

  • October 30, 2010, 7:00AM - October 31, 2010, 3:30PM, Blackboard service interruptions.

  • October 30, 2010, 1:00AM - 11:00AM, DHCP connection interruptions.

  • October 30, 2010, 1:00AM - 5:00AM, Pegasus webmail was down.

  • October 29, 2010, 6:00PM - October 30, 2010, 1:00AM. Scheduled maintenance of all Central Systems Services. All services offline.

  • October 21, 2010, 12:00PM - 1:07PM. Lab login failures occurred because DHCP stopped running. It stopped running because the NOC de-registered a subnet we serve, making our configuration invalid. Better client resiliency is being looked into.

  • October 14, 2010. Several services have high latencies due to our storage device being overloaded. They continue to work but access may take several tries in order to be successful. An upcoming upgrade should resolve these issues permanently.

  • October 8 - 10, 2010. Pegasus email interruptions due to a DNS issue.

  • September 23, 2010. Wireless outage continues.

  • September 22, 2010, ~4:00PM - 5:00PM. Andromeda was overwhelmed by end-of-day email transactions, and some wireless areas lost connectivity due to a disk failure in a network bridge. The bridge is back online but there are issues in another area of the network, so there is still no wireless in those areas.

  • July 20, 2010, 9:28PM - 10:15PM. WebMail server interruptions.

  • July 6, 2010, 10:10AM - 11:50AM. Due to an abnormal flood of email from another email server there was intermittent reception of Andromeda email.

  • June 22, 2010, 2:00PM - 6:00PM. Intermittent reception of Andromeda email.

  • June 12, 2010, 3:50AM - June 13, 2010, 4:19PM. Intermittent reception of email eventually leading to no reception at all.

  • June 7, 2010, 6:24PM - 12:00AM. Intermittent connection problems with some Pegasus services due to a partial system crash.

  • May 13, 2010, 3:47PM - 4:10PM. Blackboard and some Pegasus services were unavailable due to very high usage.

  • May 10, 2010, 5:27PM - 5:36PM. Blackboard was unavailable for 9 minutes due to a database issue.

  • May 10, 2010, 1:00PM - 1:11PM. Blackboard was unavailable for 15 minutes due to a database issue.

  • May 5, 2010, 1:34PM - 1:44PM. Blackboard was unavailable for 10 minutes due to a database issue.

  • April 26, 2010, 4:30PM - 4:40PM. Pegasus and the blackboard.newark.rutgers.edu site were unavailable for 10 minutes due to a load balancer issue.

  • April 16, 2010, 2:55PM - 3:05PM. Pegasus lost power due to a power supply failure.

  • February 25, 2010, 3:38AM - 7:00AM. WebMail service was sporadic after 1:06AM, and unavailable after 3:38AM, due to a server cluster failure.

  • December 23, 2009, 5:04PM - 5:30PM. Blackboard server interruptions.

  • December 22, 2009, 3:50PM - 4:03PM. Blackboard server interruptions.

  • December 21, 2009, 1:22PM - 1:31PM and 2:19PM - 2:26PM. Blackboard server interruptions.

  • December 19, 2009, 6:00PM - 7:43PM. Blackboard server interruptions.

  • December 18, 2009, 3:00PM-3:35PM and 4:25PM-5:00PM. Blackboard server interruptions.

  • December 16, 2009, 6:10PM-6:38PM. Blackboard server interruptions.

  • November 13, 2009, 4:20AM-10:43AM. Sporadic Pegasus email login failures.

  • October 12, 2009, 10:51AM-2:19PM. Blackboard server interruptions due to load on the authentication servers.

  • October 7, 2009, early afternoon. Wireless became unavailable due to a partial controller crash which took time to identify and repair.

  • October 7, 2009, 8:00AM. Not a failure: because pegasus' load was still excessive, we added yet another server to the cluster. This, combined with refactoring the connection between WebMail and pegasus, finally allowed pegasus to scale up to meet demand. We continue to work to make it even more responsive, and we thank you for your patience.

  • October 5, 2009, 10:00AM - October 6, 2009, 4:00PM. Pegasus server interruptions due to the highest load we've yet seen from WebMail and users of pegasus email. We added another server to the pegasus cluster today, which helped with the load.

  • October 5, 2009, 10:04AM-10:32AM and 2:18PM-3:40PM. Blackboard server interruptions because LDAP services were slow to respond, due to very high pegasus usage this week.

  • October 2, 2009, 11:30AM-3:30PM. Pegasus, Wireless, and Blackboard server interruptions due to an outside company's mishap near our equipment. A critical switch was fried, and we essentially had to rewire half of our network to work around the problem. In the middle of it all, pegasus crashed, requiring manual intervention in order to boot again; however, another machine stayed running during this time, and continued to service connections.

  • September 10, 2009, 12:30PM-5:40PM. The DHCP server at 165.230.81.226 stopped responding, apparently causing issues in the labs several hours later when the leases expired. It is not yet known why the lab machines didn't find and use the DHCP service at 165.230.79.226; we are investigating the issue.

  • August 25, 2009, 7:30AM-4:30PM. Various issues affecting both DNS servers (165.230.79.226 and 165.230.81.226) at times throughout the day caused what appeared to be a slowdown in using the network.

  • July 29, 2009, Midday. The DNS service at 165.230.81.226 became unresponsive when trying to perform a critical security update to the service's software. This manifested itself to most users as a slowdown in using the network.

  • July 29, 2009, 8:30AM-9:30AM. The DNS service on andromeda became wedged when trying to perform a critical security update to the service's software. It was worked around relatively quickly but it affected all services, including a slowdown of email delivery. Blackboard showed the "unavailable" page during this time, but to most users this manifested itself as a slowdown in using the network.

  • June 20, 2009, 8:55PM-9:00PM. We experienced another apparent power problem in the same rack as on June 6, which affected mail delivery until Sunday at about 2:30PM.

  • June 14, 2009, 1:56PM-9:07PM. The storage device for Blackboard had an equipment failure, which took all of Blackboard's storage offline. A system administrator drove in and resolved the issue.

  • June 14, 2009, 1:56PM-4:35PM. The storage device for Pegasus had an equipment failure, which left some of the pegasus servers without access to user data. This affected sending and receiving email, as well as file and home directory access. A system administrator resolved the problem remotely.

  • June 6, 2009, 8:55PM-9:00PM. We experienced a power problem in one of the racks, but it was seen by staff and the systems were recovered immediately.

  • March 1, 2009, 3:22AM - March 2, 2009, 5:57PM. One of the Pegasus mailservers refused to allow users to send email through it. WebMail was not affected by this.

  • February 17, 2009, Midday. Sporadic WebMail outages.

  • February 16, 2009, 4:30AM - 9:15AM. Blackboard outage.

  • August 10, 2008, 9:33PM-11:54PM. The same rackmounted UPS tripped again, causing andromeda and the campus webserver to be powered off. The next day (Sunday) staff went in and rerouted power to prevent this from happening again.

  • July 28, 2008, 10:40AM. A rackmounted UPS's circuit breaker tripped, taking out a rack full of devices including andromeda and the campus webserver. It was reset quickly but andromeda is a large machine that takes many minutes to boot. Services were restored by 11AM.

  • January 15, 2008, 6:00AM. Not a failure: several services restarted to resolve potential security vulnerabilities.

  • January 11, 2008, 12:40PM. Not a failure: web services were restarted to resolve potential security vulnerabilities.