IT Resilience defines procedures, creates tools, and leverages partnerships to increase emergency response preparedness and service resiliency across University IT (UIT). These efforts are a part of the framework for delivering resilient services*, contributing to a more resilient Stanford.
(*Viewable with SUNet ID)
In an emergency situation, dial 9-911 from an office phone or 911 from a mobile phone.
Protecting life safety is the top priority listed in the Stanford University Emergency Plan. As such, review the following resources and materials before an actual emergency occurs.
Department Operations Centers (DOCs) coordinate response and recovery and serve as the interface between the campus community and the Emergency Operations Center (EOC).
The UIT Department Operations Center (DOC) is a process and group that assembles to address escalated incidents that impact delivery of IT services to the Stanford community. The objective of activating the UIT DOC is to pull together the resources needed to address the incident and to coordinate communication in support of a timely, well-informed resolution and restoration of services.
Access the UIT DOC Plan (viewable to UIT only)
Learn more about DOCs
Power outages can impact IT services across the historic Stanford campus, as well as the Stanford Redwood City campus (SRWC). IT Resilience maintains a resource to provide guidance on impacts to specific services, as well as best practices for individuals in preparing for and recovering from power outages.
SCERT provides emergency support in a disaster situation to UIT staff and members of the Stanford community.
UIT IT Resilience facilitates the Business Affairs Business Impact Analysis (BIA) Program to help prepare for the consequences of disruption to business functions and processes, and gathers information needed to develop recovery strategies. This is a collaborative process to identify and mitigate potential risks, assemble needed information, and describe recovery strategies.
IT Resilience facilitates the annual UIT Disaster Recovery Plan (DRP) Program. Each cycle, a set of services is selected for new DRPs, with the goal that every UIT service requiring IT infrastructure, whether UIT supported or externally supported, has a completed formal DRP. In addition, every completed DRP is reviewed and revised each cycle.
IT Resilience facilitates an annual UIT Criticality Assessment Program to identify resiliency gaps within the portfolio of UIT services. The scope varies from year to year: a full assessment is completed every other year and includes remediation plans for a select number of services that have resiliency gaps.
IT Resilience facilitates a cross-organizational tabletop discussion on the response preparedness of Stanford Health Care and Stanford Children’s Health in the event of a major service-impacting event affecting a specified UIT service. The objective is to identify gaps in business processes, as well as opportunities to improve efficiency across the organizations.
IT conducts the Annual Disaster Recovery (DR) exercise to test the recovery processes of critical technology services in preparation for inevitable natural or human-made disasters. The cross-UIT team tests cloud services by changing firewall rules on a Virtual Private Cloud. Load and performance testing of DR instances in the cloud is also performed.
IT Resilience collaborates with the Information Security Office (ISO) and the University Privacy Office to facilitate recurring Incident Response Exercises. Scenarios are selected that allow for valuable cross-group collaboration and discussion. Guest groups are invited on a rotational basis, and exercise and readiness trends are tracked over time.
IT Resilience serves as the Problem Management Process owner for UIT. A “Problem” is defined as the unknown cause of one or more Incidents. The Problem Management process manages the lifecycle of problems. The main objective is to prevent Incidents from recurring or, if they cannot be prevented, to ensure they can be resolved as quickly as possible.
IT Resilience created several tools that service owners can use to determine the criticality level of their service, as well as the service level expectations for each of the Critical, Major, Moderate, and Minor criticality designations. These tools include a Service Criticality Infographic, Criticality Profiles, and a Major Incident Communications Matrix.
The Service Release Process (SRP) ensures IT services are production-ready upon release, and that any potential negative impact to clients is mitigated. Early in the service development process, SRP engages specific UIT workgroups, increasing confidence that services pushed to production are resilient, secure, supportable, accessible, and usable by our intended communities.
Learn more about the Service Release Process (SRP)
UIT IT Resilience manages a variety of programs intended to increase the resiliency of systems, applications, and workgroups. The IT Business Continuity/Disaster Recovery Community of Practice brings together members of the Stanford community to collectively support these efforts and spread awareness of their value. Join the discussion!
Submit questions or feedback about this page or about IT Resilience Programs via email to firstname.lastname@example.org
Last modified May 1, 2023