Service outage resolved
(earlier, blog only)
> I’ve identified and replaced the failed server, and authentication is working again. We’ve got some NFS issues lingering that I’m tracking down that are preventing people from logging into the login servers. I’ll post another update once that’s taken care of.

Update: NFS has been restored. Some machines may need a reboot. login2.mcs.anl.gov was rebooted in the process of debugging this. login1.mcs.anl.gov was done in by a combination of the service outage and a user job on the login node that ran the machine out of memory. login3 and login4 remained up.

Zimbra issues seem to have cleared up as soon as the authentication service was restored.

This outage affected:
CELS Zimbra users
MCS LDAP/Kerberos authentications (including MCS Unix workstations)
NFS
MCS License servers

We will investigate why backup authentication methods did not help, as well as why CI Zimbra users were affected. Thanks for your patience.