CELS Systems Newsletter, October 2021 (Part 2: Services and Updates)
Hi, everyone. Sorry I’m a little late with this update. There’s a fair amount in here, so I want to start with a very high-level summary so you can decide how much you want to read.
- There will be a building-wide power outage that affects our data center in building 240 during the weekend of November 19-21. The current plan has all equipment in the data center being shut down at end of day on Friday, November 19th, and brought back online the morning of Monday, November 22nd.
- New services: We have a Jenkins server in GCE, configured to run workflows on GCE compute resources. Also, we have a Globus Endpoint in GCE for transferring data into or out of GCE home directories and project directories.
- Legacy compute resources will be retired or migrated into GCE by the end of this calendar year. By January, Legacy Linux shell access will be limited to the Legacy login nodes for the purposes of copying files into GCE. If you haven’t moved your workflows into GCE, it’s time to start.
Data Center Outage:
As noted above (and tenants of the building will have already received a notice to this effect), building 240 will be undergoing a full power outage in support of the work to bring Aurora into reality. We will send detailed notices as we approach the outage, but the current plan is for all systems in the building to be powered off and offline from the evening of Friday, November 19 through the following Monday morning, when we will start to bring things back online.
The teams managing the various systems (ALCF, LCRC, etc.) will also send more detailed notices to their userbases. Please note that this newsletter is only going to ANL employees and affiliates who are supported by CELS Systems. If you have systems you know to be affected by this outage, please communicate this to your users.
As part of the work, we will be announcing a second, earlier outage. On the morning of Friday, November 19, we will be taking down the GCE nodes. This is so that we can move the primary file server from building 240 to building 386 (the Enterprise Data Center), and bring it back online there. The downside is that GCE will be down for a portion of that Friday, but when it comes up later that day, it will be able to stay up through the weekend and any other future outages the building has to endure. Obviously, any compute nodes in building 240 will not stay up through the weekend, but the login nodes as well as any of the compute nodes hosted in 386 will stay up. More details will be sent on this as we approach the outage date.
New GCE Servers and services:
For those of you who haven’t yet converted your CI/CD workflows to use GitLab, we now have a Jenkins server in GCE. We have some rudimentary documentation at help.cels.anl.gov/docs/ – just search for “Jenkins”. If you’re a Jenkins user, please check it out and let us know if there are things missing or improvements you’d like to see.
We also have a new Globus Endpoint connected to GCE. You can find instructions on how to use it at help.cels.anl.gov/docs/linux/gce-globus-endpoint/. This is a nice way to move data between GCE and other Globus endpoints. Both home and project directories are available from this endpoint.
Also, if you’re a laptop Linux user and you haven’t already, please update your print server to printers-240.cels.anl.gov, as the older print server will be going away after the above-mentioned power work. You can find instructions by visiting help.cels.anl.gov/docs and searching for printing.
Legacy environment retirement update:
With the advent of the Jenkins server in GCE, we are now ready to start migrating the remaining viable compute nodes from Legacy into GCE. For each migration we will send an announcement of the specific retirement, but the goal is to have all the Legacy compute nodes running jobs in GCE by end of 2021.
So, what does this mean? Well, for one, it means as we migrate each node, where viable, we will also relocate it to building 386, allowing it to weather building-wide outages for 240. But it also means that after we return from the holiday break in January, the only user-facing interactive pieces of the Legacy MCS Linux environment will be the virtual machines serving as login nodes.
Using those login nodes, you’ll be able to access your Legacy files until the environment is ultimately retired (date yet to be determined, but certainly in the first half of 2022). We have instructions on how to migrate into GCE (for home directories, project directories, and other services) at help.cels.anl.gov/docs/migrating-from-legacy-to-gce. If you’ve still got a Confluence space to migrate, you can find instructions there as well. Speaking of Confluence, the MCS Confluence space will be migrating in December. You generally won’t notice it when it happens, except there’ll be a new URL at the top of the browser.
And if you’ve made it this far and are thinking “What’s GCE?” I encourage you to check out help.cels.anl.gov/docs/linux and read up on the General Computing Environment.
I think that’s enough for this newsletter. There’ll be more in the coming weeks as some of these outages and retirements get closer.
Thanks for reading, and remember the best resources for help are our website at help.cels.anl.gov or dropping us a line at [email protected].