Speech to International System Safety Conference
Chair, Transportation Safety Board of Canada
8 August 2016
Check against delivery.
Good afternoon. Thank you very much for inviting me today. It’s a real pleasure to be here to speak with you. For those in today’s audience who may not be familiar with the Transportation Safety Board of Canada (or TSB), we are an independent federal agency, whose only goal is to advance transportation safety. It’s been our job, for over 25 years, to investigate accidents in the air, marine, rail and pipeline modes of federally regulated transportation … Wherever and whenever something goes wrong, we find out not just what happened, but also why, in order to make recommendations aimed at preventing it from happening again.
Over the years, our work has evolved quite a bit. It used to be, for instance, that the focus of accident investigation was on mechanical breakdowns. Then, as technology improved, investigators started looking at the contribution of crew behaviour and human-performance limitations. Nonetheless, it was still the case that people most often thought things were “safe” so long as everyone followed standard operating procedures. Don't break any rules or regulations, went the thinking. Make sure the equipment isn't going to fail. And above all: Pay attention to what you're doing and don't make any “stupid” mistakes.
That line of thinking held for quite a while. In fact, even today, in the immediate aftermath of an accident, people on the street and in the media still think accidents begin and end with the person or people at the wheel, as it were—the pilots, say, if it's an air accident; or the locomotive engineer, if there's been a train derailment. And so they jump to premature conclusions and say, “Oh, this was caused by pilot error,” for example, “or by someone who did not follow the rules,” as if that were the end of it. Case closed.
But it's never that simple. As system safety engineers likely know very well, no accident is ever caused by one person or by one factor. That's why our thinking at the TSB had to evolve, why it became critical to look deeper into an accident, to understand why people make the decisions they make. After all, no one wakes up in the morning and says, “I think I'll have an accident today.” And so if their decisions and their actions make sense to them at the time, those decisions and those actions could also make sense to others, in future. In other words, if we only focus on “human error,” we miss out on understanding the context in which the crews were operating.
Today I'd like to take a closer look at two high-profile investigations the TSB has done over the past few years. I'll share the causes and contributing factors, and then I'll illustrate how both of those investigations reveal systemic issues that go far beyond the individual who was initially the main focus of attention. Then I'll close by looking at what this means for the future, the challenges that system safety engineers will need to think about, and the risks that they will have to manage.
In July 2013, a train carrying 7.7 million litres of petroleum crude oil derailed in the centre of Lac-Mégantic, Quebec. The explosions and ensuing fire killed 47 people and destroyed much of the downtown. Immediately following the accident, the TSB deployed a team of investigators to the scene, including human factors experts. Over the course of the following year, what we learned led directly to a series of safety recommendations aimed at improving rail safety—not just on a single rail line or at a single company, but across North America. But first, to give you a sense of how things unfolded, here's a brief animation of what happened that night.
On July 5, 2013, at about 10:50 p.m., an MMA train carrying petroleum crude oil in 72 Class 111 tank cars arrived at Nantes, Quebec. In keeping with the railway's practice, the locomotive engineer parked the train for the night on a descending grade on the main track.
After shutting down four of the five locomotives, the engineer applied seven hand brakes. Railway rules require that hand brakes alone be capable of holding a train, and that this be verified by a test. That night, however, the locomotive air brakes were left on during the test, meaning the train was being held by a combination of hand brakes and air brakes. This also gave the false impression that the hand brakes alone were enough.
The engineer then contacted two rail traffic controllers: one in Farnham, Quebec, to let him know the train was secure; and the other in Bangor, Maine, to discuss the smoking lead locomotive and the problems it might cause for the next crew. As the smoke was expected to settle, it was agreed to leave the train as it was and deal with it the next morning.
Shortly after the engineer left for the night, the Nantes Fire Department responded to a 911 call reporting a fire on the train. Firefighters extinguished the blaze on the lead locomotive by shutting off the fuel. Then, following railway instructions, they turned off the electrical breakers. After discussing the situation with the rail traffic controller in Farnham and an MMA employee who had been dispatched to the scene, everyone departed.
With all the locomotives shut down, the air brake system began to lose pressure, and the brakes gradually became less and less effective. About an hour later, just before 1 a.m., the air pressure dropped to a point where the combination of air brakes and hand brakes could no longer hold the train. The train then began to roll downhill toward Lac-Mégantic, seven miles away.
As it moved down the grade, the train picked up speed, reaching a top speed of 65 miles per hour. The train derailed just past the Frontenac Street crossing.
Almost all of the derailed tank cars were damaged, many of them from large breaches, and about 6 million litres of crude oil began pouring into the streets. The fire began almost immediately, and the ensuing blaze and explosions left 47 people dead. Another 2000 citizens were forced from their homes, and much of the downtown core was destroyed.
In the aftermath of this accident, there were far more questions than answers, yet that didn't stop many people, including the public and the media, from jumping to conclusions. “The engineer didn't set enough hand brakes,” many of them said. “He didn't follow the rules.”
But it wasn't that simple. In fact, the TSB found 18 causes and contributing factors. Yes, these include deficiencies in the number of hand brakes and how the engineer tested them that night. But they also include the way the rules allowed the train to be left unattended on a descending grade, the way the locomotive was maintained and why it caught fire, and the engine shutdown that allowed the air brakes to leak off. Our investigation also cited the volatility of the crude oil, the long-standing, well-known vulnerabilities in the Class 111 tank cars, the railway company's poor safety culture, and how its ineffective management of safety risks contributed to the tragedy.
And then we clearly pointed out inadequacies in the oversight by the federal regulator, Transport Canada. For instance, whenever a company is unable or unwilling to manage its operations safely, the regulator—ideally—should step in, overseeing matters in a balanced way, using a combination of inspections for compliance, and audits for effectiveness. What we found in Lac-Mégantic, however, was that Transport Canada did not audit railways often enough, and thoroughly enough, to know how those companies were really managing—or not managing—risks.
So. Deficiencies identified at an individual level. At a company level. And at a federal government level. Change any one of the causes and contributing factors, and the accident might never have happened. If, for example, the engine hadn't caught fire. Or if there hadn't been a steep hill leading toward a small town. Or if the engineer had parked somewhere else for the night.
Just as accidents are not caused only by one person or organization, so safety cannot be allowed to depend on only a single line of defence. Instead, it's better to talk about the system, and the multiple ways it can be strengthened. And not just by administrative defences, like rules and procedures. But also by physical defences like additional wheel chocks or derails, or more modern braking technology, so that trains cannot run away, even when they are left unattended on a steep hill. Defences like: advanced route planning and risk analysis, risk mitigation for cars carrying dangerous goods, or emergency response assistance plans whenever liquid hydrocarbons are transported by rail. Also defences like stronger tank cars, with more robust designs; or a firmer hand from the regulator in auditing railways' safety management systems (or SMS), auditing them in sufficient depth and with sufficient frequency to be sure their safety systems are effective … and that corrective action is being taken when hazards are identified.
I'd now like to offer a second example of a TSB investigation, one where—just as with the Lac-Mégantic investigation—the initial conclusions seemed obvious. At least to the public, and at least until we completed our investigation.
On the morning of September 18, 2013, OC Transpo double-decker Bus No. 8017, operating as Express Route 76, arrived at the Fallowfield Bus Station in South Ottawa. The bus was en route toward downtown Ottawa along the Transitway, a private two-lane roadway dedicated to commuter bus traffic. From the bus station, the Transitway extends east to a left-hand curve which turns sharply north and runs parallel to Woodroffe Avenue. The bus was in good mechanical condition. The driver was fit for duty and familiar with the route.
Inside the bus, the driver's workstation included standard controls and several in-vehicle displays, one of which was a video monitor mounted above and to the left of the driver's seat.
The video monitor screen measured about 6 inches by 4 inches and was further divided into four smaller quadrants, each displaying a view from one of four on-board video cameras. The bottom right quadrant displayed the upper deck.
OC Transpo required drivers to monitor this screen at station stops and while in service, and to announce that no standing was permitted on the upper deck if passengers were seen standing there, although no sign was posted prohibiting standing on the upper deck.
At Fallowfield Station, passengers entered and exited the bus. The driver looked at the video screen and announced that there were empty seats on the upper deck. A passenger on the upper deck did not see any available seats and remained standing near the top of the stairs, visible on the driver's video screen.
Just prior to departing Fallowfield Station, the driver was engaged in conversation with at least one passenger regarding seating availability on the upper deck.
The bus departed about 4 minutes behind the scheduled departure time with about 95 passengers on board. At this time, the flashing lights and gate at the railway crossing were already activated. However, the driver's view was obstructed by trees, shrubs, and foliage.
As the bus proceeded along the Transitway, the driver would have overheard nearby passengers involved in conversations regarding the availability of seating on the upper deck.
While negotiating the left-hand curve—a task that requires more attention than driving on a straight road—the driver was also likely distracted by the nearby conversations and by the perceived need to make an announcement that no standing was permitted on the upper deck. During this time, the driver looked up toward the video screen.
With the bus accelerating toward the crossing, passengers began to shout, and the driver refocused attention on the road ahead and applied the brakes.
The bus then collided with VIA train 51. As a result of the collision, the train derailed. The bus was extensively damaged. The driver and five bus passengers sustained fatal injuries, nine were seriously injured, and about 25 incurred minor injuries.
Again, in the immediate aftermath of the accident the public jumped to conclusions. “It was all the driver's fault,” many of them said. “He was speeding,” they said, “and if he'd been paying attention, six people would still be alive today.”
That line of thinking, however, misses a number of important facts about the system in which the driver operated:
First, that screen showing the upper deck on the bus was small—just a few inches across. It was further divided into four quadrants, each smaller still. And company rules meant the driver had to monitor this screen while the bus was in service, even though this took his eyes away from the road.
Second, there was the issue of sightlines: navigating a curve requires more attention than driving on a straight road. Yet the view of the crossing was obscured by trees and foliage. So even though the crossing signals—lights, bells, and a descending gate—were all activated before the bus even left the station, the driver could neither see nor hear them.
Third, when the driver did eventually begin to brake, he did so in a way that was entirely consistent with how he had been trained: not suddenly or abruptly—but gradually, so as to minimize passenger discomfort and avoid potential injury.
Fourth, the company's speed monitoring and enforcement in that area were insufficient. And while the driver was in fact exceeding the speed limit, it was by a fairly small amount. Still, we found that even a small increase in speed can significantly increase the stopping distance and, in this case, might have made the difference.
At the end of the day, though, there still loomed the question of driver distraction. Was he distracted? Almost certainly. For very long? A few seconds at most. In fact, given all of the other systemic elements involved, it might not have even been that long. And so we concluded that this accident could have happened to almost any one of the bus company's drivers faced with the same circumstances.
To its credit, the city took action right away—for example, clearing branches, shrubs, and trees so that the signals are now visible from the moment the bus leaves the station. The city also enhanced signage, including the addition of an advance warning light, and reduced speed limits in both directions approaching the crossing. The bus company and the union, meanwhile, issued reminders to drivers, telling them to follow the speed limit, to watch out for flashing lights at crossings, and to always be prepared to stop.
But at the Transportation Safety Board, we don't stop at what happened, or even at why. Part of our job is to make recommendations to ensure that an accident won't happen again. In this case, the TSB made five recommendations. These were aimed at separating trains from vehicles at this crossing by either an overpass or an underpass; at providing more explicit general guidance to road authorities and railway companies on when roads and railways should be grade-separated; at providing better guidelines for the installation and use of in-vehicle video monitors to reduce the potential for distracted driving; and at implementing crashworthiness standards and event data recorders for commercial buses.
Some of the defences in our recommendations are procedural or administrative—better guidelines, for instance, on how to install video monitors, or when to use them. But the ones most likely to prevent accidents are the physical defences: grade-separating roads and train tracks, making sure there is no possible way for the train and the bus to come in contact.
Unfortunately, that's not always possible, just as it's not possible to reroute trains carrying crude oil around every town or city.
And so I come to the theme of this conference, and what I think will be one of the greatest challenges you face. For system safety engineers of the future, the greatest challenge won't be in designing a system that is capable of preventing a train carrying crude oil—and which is parked on a descending grade—from running away. Nor will the challenge be designing a system that is capable of warning bus drivers of an impending train in time to slow down before a crossing. Those systems have already been designed.
No, your challenge will be to manage the human risks within your systems. Your challenge will be to imagine how real people—people who make mistakes, who don't always follow the rules—will use the systems they have been given; how and where they will look for shortcuts, or adapt rules to their own preferences. In short, how they will react. And not only will you then have to design backup defences, but then you'll have to follow up to ensure that those defences work as intended… so that mistakes, when they inevitably occur, do not lead to a catastrophe.