Considerations of a Mission Critical Site
Emergency relief and recovery, of the kind required during and after a disaster such as a significant bushfire, is a serious business.
Processes and procedures need to work. Systems need to function flawlessly within an environment characterised by huge spikes of induced demand.
The Department of Human Services (DHS) Emergency Relief and Recovery (ER&R) website is an example of such a system. Salsa recently completed the migration and build of DHS’ ER&R website. The site is now running on a next generation platform purposely engineered to support the mission critical nature of the site. This blog article provides an appreciation of some of these requirements and Salsa’s approach to delivering on them.
Notable characteristics of the ER&R site, as distinct from a non-mission critical site, include:
- Robust and scalable platform and build to accommodate peaks in demand;
- Support ease and timeliness of content management by authors and approvers under high pressure situations;
- No practical limit to browser, device, and bandwidth permutations;
- Existing site design and testing investment preserved;
- Project delivery and timelines that account for the looming high risk season.
Robust and Scalable
The tragic events of the Black Saturday fires provided a baseline of the sorts of demands the ER&R site would need to operate in. Traffic to the legacy ER&R site increased by approximately 10 times during this period.
Salsa’s approach to delivering a site capable of handling this type of demand was to engineer a system built from the ground up for robustness and scale. Salsa worked with partner Acquia (www.acquia.com) to utilise their enterprise level Drupal cloud platform as a foundation for the new site. The architecture, underpinned by AWS, is inherently scalable.
A success factor of the project was to design and execute a stress test that would demonstrate the site’s ability to function under demand. Salsa’s stress testing modelled the Black Saturday demand as well as a demand 10 times that of Black Saturday. Response times at the 90th and 95th percentiles showcased the newly engineered site’s capability to support emergency response demands. Check out how fast this site responds: http://www.recovery.vic.gov.au/
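The 90% and 95% response time figures from a stress test are commonly read as percentiles: the response time that 90% (or 95%) of requests came in at or under. The following is a minimal sketch of that calculation, not Salsa's actual test tooling. The sample timings and the function names `percentile` and `summarise` are assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number ranks stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def summarise(samples_ms):
    """Report the two percentiles quoted in the stress test results."""
    return {"p90": percentile(samples_ms, 90), "p95": percentile(samples_ms, 95)}


# Hypothetical response times in milliseconds, not measured data.
baseline_ms = [120, 135, 140, 150, 155, 160, 170, 180, 210, 400]

if __name__ == "__main__":
    print(summarise(baseline_ms))
```

Percentiles are more useful than an average here because a single slow outlier, like the 400 ms sample above, is exactly what a person in an emergency would experience, and an average would hide it.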
Extreme Content Management
Emergency response requires content to be managed under extreme pressure. Content managers need simple and reliable content processes.
In appreciation of this requirement Salsa crafted a method of templating an entire DHS hierarchy of content. In the case of an emergency a single content management action is able to produce a series of assets that are consistent with other emergency responses and populated with selected templated content. Content authors have simple and standardized methods of on-boarding new content.
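The idea of one action producing a consistent set of pages can be sketched as follows. This is an illustrative sketch only, not the Drupal implementation used on the site: the page hierarchy, the `EMERGENCY_TEMPLATE` name, the `create_emergency_section` function, and the draft status are all hypothetical.

```python
from string import Template

# Hypothetical page hierarchy for one emergency. Each entry is
# (path pattern, title pattern, body pattern); none of these names
# come from the actual ER&R site.
EMERGENCY_TEMPLATE = [
    ("$slug", "$name", "Overview of the $name response."),
    ("$slug/relief", "$name: relief", "Immediate relief information for $name."),
    ("$slug/recovery", "$name: recovery", "Recovery services available after $name."),
]


def create_emergency_section(name, slug):
    """Expand the whole template hierarchy for one emergency in a single call."""
    values = {"name": name, "slug": slug}
    pages = []
    for path, title, body in EMERGENCY_TEMPLATE:
        pages.append({
            "path": Template(path).substitute(values),
            "title": Template(title).substitute(values),
            "body": Template(body).substitute(values),
            "status": "draft",  # assumed: pages start unpublished, pending approval
        })
    return pages


if __name__ == "__main__":
    for page in create_emergency_section("Example Bushfire", "example-bushfire"):
        print(page["path"], "|", page["title"])
```

Because every emergency is expanded from the same template, an author under pressure supplies only a name and a short identifier, and the resulting pages share the same structure as those of earlier responses.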
Extended Assumptions on browser versions and devices
Emergency response does not have the luxury of broad assumptions regarding browser penetration rates and the like. The site is not able to ignore marginal devices or assume high bandwidth will be available. The site needs to function for any stakeholder, and every stakeholder, involved in the emergency response.
Salsa’s approach was to build a responsive site effective on mobile and tablet as well as desktop. Best practice Drupal methods were utilized to limit browser payloads. Extensive browser and device testing was carried out, utilizing simulators for initial issue discovery. Importantly, actual devices and browsers were used in final pass testing for ultimate validation.
Existing investments in design and testing
The project, being a migration of a legacy site, did not have the mandate to alter the creative design. In fact, the legacy site represented a very large investment in testing and validation of the behaviour of the website’s user interface. Salsa needed to engineer the most effective method of preserving this investment, given the critical nature of the site, and reuse as much of the UI as possible.
Salsa’s approach involved understanding and deconstructing the website assets so that, as far as possible, they could be transplanted into the new build. Salsa also relied heavily on DHS engineers to work collaboratively in order to understand the legacy solution. The new generation site now looks almost identical to the legacy site and maintains the investment that site had in UI development and testing, while providing a more robust technical foundation upon which the site runs.
High Risk Season - A real deadline
Emergencies are by their nature seasonal. High risk periods exist for bushfires, for floods and other types of emergencies. Salsa needed to execute on the project to have the site ready for the looming bushfire season. The deadline was very real.
Salsa’s approach to engineering a mission critical site in less than 3 months was to use many parallel streams, with a governance stream sitting across the streams. DHS were kept highly engaged during execution with workshops and detailed weekly status reports. A series of showcases were designed to demonstrate the system to DHS stakeholders and solicit feedback. Issue and risk reporting and mitigations were taken very seriously.
Emergency response and recovery is a serious business. Luckily the combination of DHS, Acquia and Salsa is a serious team. A lot of hard work and focus resulted in the project meeting an immovable deadline. A robust and scalable ER&R site is now live and ready to support bushfire response and other emergencies going forward.
If you have a digital project that has demanding requirements, particularly in the space of pure software engineering, give Salsa a call.