I have been wrestling with an issue with our Internet firewall recently, and our troubleshooting efforts boiled down to a simple fact: a module inside the firewall would have to be rebooted. This is a big deal because while our firewall is rebooting, network traffic cannot pass through it, effectively isolating WSU until the firewall properly initializes again.
While this is a minor annoyance when we have to do it at home with our cable/DSL modems, it has the potential to be something very nasty at an institution as large as Wayne State University. That is all time during which off-site students cannot access their Blackboard sessions, faculty cannot collaborate with other universities, and prospective students cannot browse our webpages looking for that perfect program to enroll in.
Thankfully, working with the Network Engineering group, the Information Security Office has set up multiple redundant systems for exactly this purpose. With a few keystrokes, the Internet traffic was instantly rerouted through our secondary Internet firewall, which picked up the 161,000 network connections with ease. Now that our troublesome firewall was “out of the loop”, we were able to run the diagnostic commands to restart certain modules without causing a moment of downtime. This, in turn, helped resolve several production issues that had been growing over the past few weeks.
Exercises like this should be a reminder of how important it is to build redundancy into the systems that we create. While the above was a controlled event, it is just as important to be ready for an unexpected failure, such as a power supply failing or a backhoe digging up your fiber connection. When dealing with large enterprise systems (including our Internet backbone), effective redundancy, Disaster Recovery, and Business Continuity Planning must be built into your methods and practices. Without these things, it will be impossible to deliver the quality of service that our consumers have come to expect!