Risk management

A "Normal" Accident -- The Loss of the RAF Nimrod XV230: A Failure of Leadership, Culture, and Priorities

Analyzing accidents can demonstrate how much they have in common with everyday operations. Business principles and sound engineering practices can be at odds.


A few months ago, a friend sent me a link to an hour-long YouTube video of Charles Haddon-Cave speaking about his investigation of the Nimrod XV230 crash in 2006. The presentation was delivered at the Piper 25 Conference in 2013. It is remarkable. After watching the video, I downloaded the board of inquiry (BOI) report. This column is a summary of the video and report.

The Accident and Board of Inquiry Findings

Developed from the de Havilland Comet, the Nimrod aircraft first entered service in the UK Royal Air Force (RAF) in 1969. A total of 49 Nimrod planes were built. Initially the aircraft served in antisubmarine warfare, maritime reconnaissance, and marine search-and-rescue operations. In 1982, the planes were refitted with the air-to-air refueling capabilities needed for service in the Falklands War. Air-to-air refueling allowed the planes to remain in the field for extended periods of time.

More recently, the Nimrod has served as an intelligence-gathering platform in Afghanistan and Iraq. It normally carried a crew of 12 people.

On 2 September 2006, RAF Nimrod XV230 was on a routine mission over Helmand province in southern Afghanistan in support of NATO and Afghan ground forces. Shortly after air-to-air refueling, a fire was detected. Six minutes later, the plane, engulfed in flames, broke apart and crashed.

Fuel escaped during the refueling, either from an overflow from the No. 1 tank through the blowoff valve, or from a leaking coupling. The fuel tracked rearward and accumulated in the starboard No. 7 tank dry bay. The fuel was ignited by contact with exposed high-temperature ducts.

Because the crew had no access to the No. 7 tank dry bay, it had no means to fight the fire. After about 5 minutes, the fire caused the fuel in the tank to boil. The tank ruptured, and shortly thereafter the plane was engulfed in flames. The resulting crash killed all 14 crewmen.

A BOI, led by Haddon-Cave, was established to investigate the crash. It identified several major issues that contributed to the accident, including the following:

  1. Poor initial design and modifications from the 1960s onward, which created the potential for fuel to pool and contact hot piping
  2. A history of leaks in the 1970s and 1980s that did not raise alarm flags (normalization of deviance)
  3. An increase in operational tempo in the 1990s and 2000s, with heavy use in Kosovo, Afghanistan, and Iraq
  4. Problems of maintaining an aged aircraft with repeated out-of-service date extensions
  5. Distractions of major organizational change and cuts in funding in the UK Ministry of Defence (MOD) between 1998 and 2006, resulting in an organization of “Byzantine complexity”
  6. A shift in focus from airworthiness to business principles (MBAs over subject matter experts [SMEs])
  7. Outsourcing of the Nimrod Safety Case, and pathetic work by the subcontractors

Design and Modifications

The original Nimrod design incorporated a crossfeed duct. It enabled engines to be shut down and restarted in the air by routing hot bleed air from one engine to another. The crossfeed duct gave rise to a serious fire hazard, especially in the No. 7 tank dry bay. The duct was in close proximity to fuel piping and was routed through the bottom of the bay, where fuel could pool. The fuel piping was congested, contorted, and contained many couplings subject to leaks.

The addition of air-to-air refueling capacity increased the risk of leakage. It created the possibility of the fuel tank pressure relief valves opening in flight. If a tank was overfilled, the valves relieved the overpressure by venting fuel to the outside of the aircraft. When refueling on the ground, any vented fuel fell to the tarmac. In the air, the fuel blew onto the side of the plane, and some of it entered nonpressurized compartments of the plane through gaps in the panels.

In addition, air-to-air refueling occurs at a higher flow rate and higher pressures, thus increasing the likelihood of the fuel tank overfilling, overpressure, and coupling leaks.

The BOI concluded that the fuel that collected in No. 7 tank dry bay was released either from the No. 1 tank blowoff valve or from a leaking coupling.

Normalization of Deviance

The starboard No. 7 tank dry bay was a spaghetti junction of fuel pipes and other kit. The fuel pipes in the bay contained 9 couplings. In total, the fuel system on the Nimrod contained more than 400 couplings, all of which included elastomeric seals.

Fuel leaks were common and were tolerated to a significant extent. There was a prevailing belief throughout the military that the focus should be on eliminating ignition sources. Also, there was no trend analysis of maintenance data, which might have helped officials notice the large increase in fuel system leaks from 0.5 per thousand flying hours in 1980 to 3.5 in 2000.
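To put those two leak rates in perspective, here is a minimal Python sketch of the arithmetic a basic trend review might have started from. Only the two rates and their years come from the report; the function names are hypothetical, and the steady year-over-year growth is purely an illustrative assumption, since no intermediate data points are given here.

```python
def leak_rate_per_1000h(leak_count, flying_hours):
    """Normalize a raw leak count to leaks per thousand flying hours."""
    if flying_hours <= 0:
        raise ValueError("flying_hours must be positive")
    return 1000.0 * leak_count / flying_hours


def fold_increase(rate_start, rate_end):
    """How many times larger the end rate is than the start rate."""
    return rate_end / rate_start


def implied_annual_growth(rate_start, rate_end, years):
    """Compound annual growth rate, ASSUMING growth was steady (illustrative only)."""
    return (rate_end / rate_start) ** (1.0 / years) - 1.0


# The only two data points quoted above (leaks per thousand flying hours).
RATE_1980 = 0.5
RATE_2000 = 3.5

fold = fold_increase(RATE_1980, RATE_2000)
growth = implied_annual_growth(RATE_1980, RATE_2000, 2000 - 1980)
print(f"{fold:.1f}x increase, about {growth:.1%} per year if growth were steady")
```

A sevenfold rise spread over 20 years works out to roughly 10% per year under that assumption, which is small enough in any single year to pass unnoticed without a deliberate look at the long-term trend.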

A major finding from the investigation of the US space shuttle Challenger accident was normalization of deviance. The Challenger solid rocket boosters had O-ring seals that were frequently charred. Initially, the charred seals raised alarm bells, but as more experience developed, they came to be accepted as “normal.” This happened with the Nimrod as well. Frequent leaks did not lead to catastrophe, and that led to a normalization of deviance; the leaks became accepted as normal and not a cause for concern.

Pressure on the fuel system was higher during air-to-air refueling because of higher flow rates. Steady-state operating pressure during air-to-air refueling was 30 to 40 psig, still well within the system’s pressure rating. But the closing of fuel system valves caused surges in pressure (water hammer). Modeling suggests that surges may have exceeded the coupling design pressure of 110 psig. Surge analysis was not attempted until after the accident.
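For a sense of scale, the surge caused by a sudden valve closure is often estimated with the Joukowsky relation, surge pressure = density x wave speed x change in velocity. The sketch below applies it in Python. The 40 psig steady pressure and 110 psig coupling design pressure come from the text above; the fuel density and pressure-wave speed are round-number assumptions of mine, not Nimrod data, and the formula assumes an instantaneous closure. It is not the modeling performed for the BOI.

```python
PSI_TO_PA = 6894.757  # pascals per psi

# ASSUMED illustrative values, not taken from the report.
FUEL_DENSITY_KG_M3 = 800.0   # roughly typical of jet fuel
WAVE_SPEED_M_S = 1000.0      # assumed pressure-wave speed in the fuel line

# Values quoted in the text above.
STEADY_PRESSURE_PSIG = 40.0
COUPLING_DESIGN_PSIG = 110.0


def joukowsky_surge_psi(delta_v_m_s,
                        density_kg_m3=FUEL_DENSITY_KG_M3,
                        wave_speed_m_s=WAVE_SPEED_M_S):
    """Pressure rise (psi) when flow velocity drops by delta_v_m_s instantly."""
    return density_kg_m3 * wave_speed_m_s * delta_v_m_s / PSI_TO_PA


def velocity_change_to_reach(limit_psig,
                             steady_psig=STEADY_PRESSURE_PSIG,
                             density_kg_m3=FUEL_DENSITY_KG_M3,
                             wave_speed_m_s=WAVE_SPEED_M_S):
    """Velocity change (m/s) whose surge would lift steady pressure to the limit."""
    headroom_pa = (limit_psig - steady_psig) * PSI_TO_PA
    return headroom_pa / (density_kg_m3 * wave_speed_m_s)


surge_at_1_m_s = joukowsky_surge_psi(1.0)
critical_dv = velocity_change_to_reach(COUPLING_DESIGN_PSIG)
print(f"Stopping 1 m/s of flow adds about {surge_at_1_m_s:.0f} psi")
print(f"Design pressure is reached at a velocity change of about {critical_dv:.2f} m/s")
```

Under these assumed properties, abruptly stopping a flow of well under 1 m/s would be enough to use up the margin between the steady operating pressure and the coupling design pressure, which shows why a comfortable steady-state margin says little about transient loads.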

Operation of the aircraft in Iraq and Afghanistan, and proximity to the hot crossfeed ducts, exposed the seals to elevated temperatures, perhaps above 70°C. The seal elastomers experience significant stress relaxation between 70°C and 80°C.

Maintenance and Organizational Problems

The aircraft was not effectively maintained in the years leading up to the crash. The BOI attributes this to several factors: the aircraft was old, built in an earlier age without access to good maintenance technologies, and spares were dwindling; the operating budget was cut, and leaks were accepted as normal; and continual delays in the delivery of replacement aircraft caused repeated extensions of the out-of-service date.

The most withering criticism leveled by the BOI was reserved for the MOD, which underwent significant organizational changes between 1998 and 2006. The MOD shifted from an organizational structure built along functional lines to a project-oriented organization. Also, organizations within the MOD were “rolled up” to create larger “purple” management structures spanning all three military services (army, navy, and air force). This included mergers of procurement and service organizations. For example, teams with the responsibility for airworthiness no longer had the responsibility for purchasing and storing spare parts, nor for the maintenance of the aircraft.

Business principles were imposed within the MOD to the exclusion of sound engineering practices. The ministry preferred MBAs over SMEs. The imposition of unending cuts amid a steady stream of other business initiatives caused deep organizational trauma. A culture developed with too little appreciation of “hard-handed” engineering specialist skills and too great a reverence for young “soft-handed” MBAs.

The cuts and changes within the MOD diluted its safety and airworthiness cultures and distracted it from airworthiness as the top priority. In addition, the ministry outsourced responsibilities to industry as a way to cut costs.

Project Delays and Complexity

An important organizational factor was the delay in the project intended to deliver the replacement aircraft for the Nimrod. The Nimrod 2000 program, which was later renamed the MRA4, began in 1989, with the replacement aircraft originally scheduled for operation in 2000.

According to the BOI report, the current MOD airworthiness system “is of Byzantine complexity.” Haddon-Cave wrote that, in his view, the system “lacks sufficient clarity, simplicity, and transparency. Roles and responsibilities are diffuse, diluted, and opaque. Lines of authority are often attenuated, conflicting, and unclear… The collection of so many disparate regulators, each responsible for different aspects of Airworthiness, and each having different levels of authority, is an arrangement that is neither effective nor, frankly, understood by the majority of practitioners in the Service.”

An example of this complexity is the process for purchasing a simple part, the Avimo coupling seal. A serious manufacturing defect was found in the Avimo seal elastomer in 2005. Through a convoluted and dysfunctional purchasing system, the RAF had been purchasing noncompliant seals since 2000. The seals were incompatible with aircraft fuel, swelling significantly on contact. Though the problem was discovered in 2005, working through the bureaucracy proved too difficult for the mechanics; the purchase specification had still not been corrected a year later, at the time of the crash.

Fig. 1 summarizes the MOD’s process for purchasing the Avimo seal.

Fig. 1—The procurement chain for Avimo couplings and seals. Source: The Nimrod Review 2009

The Nimrod Safety Case

Safety cases originated from UK regulations following the Piper Alpha disaster in 1988. The Nimrod was designed long before that. A safety case was developed for the aircraft between 2001 and 2005. The safety case took 4 years and cost GBP 400,000.

The purpose of a safety case is to identify, assess, and mitigate potentially catastrophic hazards. A safety case is defined as “a structured argument, supported by a body of evidence that provides a compelling, comprehensible, and valid case that a system is safe for a given application in a given environment.”

The No. 7 bay contained eight fuel couplings with elastomeric seals and an exposed duct operating at a temperature above the auto-ignition temperature of jet fuel. One would have thought that it would have been a major focus of a safety case. But it was missed.

The BOI called the safety case “a lamentable job from start to finish” and “riddled with errors of fact and opinion.” It found that “it was essentially a paperwork exercise” and that “its production is a story of incompetence, complacency, cynicism.” The safety case was fatally undermined by the assumption that the Nimrod was “safe anyway,” because the fleet had flown successfully for 30 years. “They were merely documenting something they already knew,” according to the BOI report.

Closing Thoughts on “Normal” Accidents

In her book The Challenger Launch Decision (1996), Diane Vaughan makes a claim that startled me when I read it. To paraphrase:

  • If you study the engineering design organization of a project that went badly, you will find chaos (complex processes, changes and problems not properly communicated, people using outdated drawings and data, etc.)
  • And, if you study the engineering design organization of a project that went well, you will find chaos (complex processes, changes and problems not properly communicated, people using outdated drawings and data, etc.)

Could it be that the types of problems so well documented in this case are more or less normal in our projects, and that we only find out about them following an accident?

Complexity. One of the main themes of the Nimrod BOI is the “Byzantine complexity” of the MOD and how it made the system ineffective and unsafe. We have the same problems in our industry.

MBAs vs. SMEs. The focus was on business principles at the expense of technical expertise. Yeah, we do that. The pipeline leaks that BP suffered in Alaska were a direct result of cost-cutting initiatives.

Safety studies: we know it is safe anyway. The Nimrod safety case was sabotaged by the team’s assumption that the plane was “safe anyway.” They were just documenting what they already knew.

Do we suffer from the same normalization of deviance mentality when we do hazard and operability studies of familiar systems? Do our safety studies focus too much effort on the minutiae of safety and not enough on the highly unlikely but potentially catastrophic scenarios?

For Further Reading

Haddon-Cave, C. 2009. The Nimrod Review: An Independent Review Into the Broader Issues Surrounding the Loss of the RAF Nimrod MR2 Aircraft XV230 in Afghanistan in 2006. Report, The UK Stationery Office, London, https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/229037/1025.pdf (accessed 12 January 2016).

Haddon-Cave, C. 2013. Leadership and Culture, Principles and Professionalism, Simplicity and Safety—Lessons From the Nimrod Review. Presentation at the Oil and Gas UK’s Piper 25 Conference, a three-day conference held to mark the 25th anniversary year of the Piper Alpha disaster, Aberdeen, 18–20 June, https://www.youtube.com/watch?v=y99_lhFFCsk (accessed 12 January 2016).

Lustgarten, A. 2012. Run to Failure: BP and the Making of the Deepwater Horizon Disaster, first edition. W.W. Norton & Company.

Vaughan, D. 1996. The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA, first edition. University of Chicago Press.

Howard Duhon is the systems engineering manager at GATE and the SPE technical director of Projects, Facilities, and Construction. He is a member of the Editorial Board of Oil and Gas Facilities. He may be reached at hduhon@gateinc.com.