121 Years of Flight: A Journey Through Aviation Safety and Innovation
Learning to land on aircraft carriers at night meant starting in the simulator. I remember having to fly the simulator manually because they didn’t know if the automation would always work, and wanted you to be able to still land safely if it did not.
After initial qualification, learning to leverage and use autopilot (pilot relief modes) and autothrottle helped me significantly improve my landing grades.
It was immediately apparent that these systems were not smart enough to know what mode to be in, what settings to use, or when not to be used at all. Still, they were very good at managing the assigned task. A dutiful assistant. With proper supervision, the combination of high-frequency task management and workload reduction meant that the combined system (human + machine) could perform better than either on its own. A fairly obvious, but key insight.
Sections:
The Evolution of Aviation Safety
Aviation Data and Safety Monitoring
Airline Operating Costs
Avionics: Technology and Certification Challenges
The Aircraft as a System (Humans + Machines)
The Future of Flight
Appendix on American Eagle (JIA) 5342 and PAT 25 Accident
The Evolution of Aviation Safety
First Flight:
Powered flight began with the Wright Brothers in 1903.
First fatality:
Just four years later, the first fatal aviation accident occurred when Army Lieutenant Thomas Selfridge died while flying with Orville Wright.
Human vs. Technical (Mechanical) Factors:
Over the past 120 years, millions of engineers have improved aviation safety, primarily through technical advancements (mechanical, material, system, and other engineering efforts). As mechanical and system failures decline, human limitations remain the biggest challenge. Over the same period, the human brain has had no equivalent engineering renaissance.
The Decline in Accident Rates:
The absolute number of accidents has dropped significantly. Over the past two decades, it has become very close to zero. While the percentage of human-related incidents has risen, overall safety continues to improve.
Improving Pilot Performance:
Human factors research enhances pilot-machine interactions. Studies focus on fatigue, reaction times, and cognitive load. Simple design changes, like tactile switches shaped like their function (for example, flap and landing gear levers shaped like the components they control), improve usability.
Chain of Events Model:
The early “chain-of-events” model assumed accidents resulted from sequential failures. Each failure in a sequence could lead to an inevitable accident if not interrupted.
Swiss Cheese Model:
The Swiss cheese model replaced the Chain-of-Events model and is now the standard. James Reason developed this model to address the complexity of aviation mishaps, which often do not follow a linear sequence of events leading to a specific tipping point. Instead, each layer in the model may contain its own gaps or holes, which, when aligned, can contribute to an accident.
Case Study: Lion Air Flight 610:
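One way to build intuition for the layered model is a small numeric sketch. It assumes each layer fails independently (real layers often do not) and uses made-up per-flight probabilities; the layer names and the function `aligned_holes_probability` are illustrative only, not part of any formal safety method.

```python
from math import prod


def aligned_holes_probability(layer_hole_probs):
    """Probability that every layer has a hole at the same time.

    Assumes the layers fail independently, which is a simplification.
    """
    return prod(layer_hole_probs)


# Hypothetical per-flight probabilities that each layer fails (illustrative only).
LAYERS = {
    "design": 1e-3,
    "maintenance": 1e-2,
    "training": 1e-2,
    "crew intervention": 1e-1,
}

baseline = aligned_holes_probability(LAYERS.values())

# Adding one more imperfect layer (fails 1 time in 10) still cuts the risk tenfold.
with_new_layer = aligned_holes_probability([*LAYERS.values(), 0.1])

print(f"baseline: {baseline:.1e}, with extra layer: {with_new_layer:.1e}")
```

The point of the sketch is the multiplication: a layer does not need to be perfect to help, and a layer with no holes at all (probability 0) stops the sequence entirely.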
The Swiss Cheese Model explains the 2018 Lion Air 737 MAX crash, where multiple failures led to disaster. The mishap may have been prevented if any layer’s holes had been closed. In many cases, a well-trained human serves as the last line of defense.
Aviation Safety Record:
The commercial aviation safety record is exceptional. However, this record is continually challenged as global air traffic rises, flights become longer, and systems become more complex. While the accident rate is low, the Swiss cheese model shows frequent interventions are required to maintain this impressive record.
Moving Toward Proactive Safety
Modern frameworks like the Human Factors Analysis and Classification System (HFACS) and Safety Management Systems (SMS) focus on identifying risks before accidents occur. By analyzing patterns and vulnerabilities, these systems help airlines and regulators take action to prevent future incidents rather than reacting after the fact.
Aviation Data and Safety Monitoring:
Analog / `Steam` Instruments:
For the first 70 years of flight, aircraft used analog instruments known as “steam gauges.” These instruments measured individual parameters and displayed them using needles. Pilots relied on direct readings, similar to traditional thermometers.
Digital / `Glass` Instruments:
With advancements in digital technology, cockpits transitioned to digital displays, commonly called “glass cockpits.” These systems integrate multiple data sources, providing features like moving maps and synthetic vision to enhance situational awareness.
Airbus A220 ‘glass’ cockpit
Garmin’s newest avionics feature synthetic vision capabilities that augment the pilots’ situational awareness.
Aircraft Data for Reactive Safety:
Early safety investigations revealed the necessity of capturing flight data to understand accident causes. This need arose because survivors’ recollections often differ due to stress, and many accidents leave no survivors. These insights led to the development of early flight data recorders, commonly known as “black boxes,” invented by Dr. David Warren in 1953.
Black Box Advancements:
Over time, black boxes have evolved alongside computing power, data transmission, and memory improvements. Early versions recorded only a few parameters for limited durations. Today, flight data recorders (FDRs) must store at least 88 parameters for 25 hours. They are now bright orange for visibility and equipped with transmitters to aid recovery. Additionally, larger aircraft carry Cockpit Voice Recorders (CVRs), ensuring voice data collection for accident reconstruction.
Flight Data for Accident Investigation:
After an accident, flight data is plotted on time-series graphs to identify trends and interactions among parameters. This plot can help tell the story of the accident but often lacks other contextual information (e.g., who moved which switch, what they observed outside the airplane, etc.). Additionally, for various reasons, sometimes the recorders fail to capture critical moments (Jeju Air) or are not recovered (Malaysia MH370), leaving investigators with significant gaps in crucial data.
Operational Control and Flight Data:
There are far more professional aircraft operators than FAA and NTSB personnel could directly oversee. To align expectations, the FAA uses Operations Specifications (OpSpec) agreements to define safety standards for Part 121 (airlines like United and Delta) and Part 135 (charter services like NetJets). These agreements delegate specific responsibilities to operators. A section in the OpSpec will likely require a robust Flight Operations Quality Assurance (FOQA) program, primarily leveraging flight data from aircraft avionics for reactive safety and trend analysis.
Monitoring Data and Quality Assurance:
FOQA programs serve two primary purposes: routine analysis and incident investigations. Routine analysis tracks systematically collected data, such as the altitude at which landing gear is deployed. It is used to observe trends, which adds a proactive element. Incident investigations focus on specific events, like analyzing throttle levels during a jet blast incident.
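To make the routine-analysis idea concrete, here is a minimal sketch of how a gear-extension check might look. The `Sample` record, the function names, and the 1,000-foot threshold are all hypothetical; real FOQA tooling and operator limits will differ.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    t_s: float          # seconds since start of the recording
    altitude_ft: float  # altitude above ground level
    gear_down: bool     # discrete "gear down and locked" signal


def gear_extension_altitude(samples):
    """Altitude at the first up-to-down gear transition, or None if there is none."""
    for prev, cur in zip(samples, samples[1:]):
        if not prev.gear_down and cur.gear_down:
            return cur.altitude_ft
    return None


MIN_GEAR_ALT_FT = 1000.0  # hypothetical operator threshold, for illustration


def is_late_gear_event(samples, min_alt_ft=MIN_GEAR_ALT_FT):
    """Flag a flight whose gear came down below the chosen altitude."""
    alt = gear_extension_altitude(samples)
    return alt is not None and alt < min_alt_ft


normal_approach = [
    Sample(0, 2500, False),
    Sample(10, 1800, False),
    Sample(20, 1400, True),
    Sample(30, 900, True),
]
late_approach = [
    Sample(0, 1500, False),
    Sample(10, 900, False),
    Sample(20, 700, True),
]

print(is_late_gear_event(normal_approach), is_late_gear_event(late_approach))
```

Run across thousands of flights, a check like this turns into a trend line, which is where the proactive value comes from.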
Collecting the Aircraft Data:
A system was needed to transfer data from avionics sensors and data buses to the black boxes. This led to the development of data acquisition units (DAUs), flight data acquisition units (FDAUs), and digital flight data acquisition units (DFDAUs). However, black boxes are not easily accessible for regular data retrieval, necessitating the introduction of Quick Access Recorders (QARs).
Quick Access Recorders and Aircraft Interface Devices:
QARs, requiring fewer certifications than black boxes, have evolved to store more data over extended periods. Historically, they recorded flight data for weeks, with mechanics manually downloading the data. Today, many commercial aircraft transmit flight data automatically via LTE, allowing operators to analyze FOQA data in real-time. More advanced aircraft interface devices (AIDs) now act as bridges between avionics and modern technologies, integrating with electronic flight bags (EFBs; tablets like iPads used by pilots) and satellite communication networks to provide continuous aircraft data monitoring.
Working Groups:
Aviation safety is a joint effort between regulators and operators. A robust safety culture (such as SMS and FOQA) also aligns with the OpSpec requirements. Many incidents or near-misses are addressed at the operator level, and findings are shared with regulators or other aviation safety working groups to enhance the system and maintain a healthy operator-regulator relationship.
Airline Operating Costs:
High-Volume, Low-Margin Business
Airlines generally operate on thin margins and rely on high traffic volumes to drive profitability.
Operating Costs are a Big Factor:
Airline operating costs can be unpredictable, fluctuating due to fuel price volatility, labor agreements, and unexpected maintenance. In 2019, the International Air Transport Association (IATA) reported fuel and labor as the two highest expense categories.
Flight Data for Safety and Cost Efficiency:
Most commercial airlines leverage flight data to optimize both safety and cost efficiency. FOQA programs help detect inefficiencies, such as unnecessary fuel burn, excessive engine wear, and suboptimal routing. Safety-related inefficiencies, like unstable approaches leading to hard landings, can also result in extra maintenance, flight delays, and increased operational costs. Flight data lets operators address these inefficiencies, improving both safety and financial performance.
Avionics: Technology and Certification Challenges:
Long Development and Certification Cycles
Avionics systems take over a decade to develop and certify for major aircraft programs. Once deployed, they often remain in service for 30 to 40 years, or more. In contrast, consumer electronics follow much faster innovation cycles, with frequent hardware and software updates. The primary avionics software standard, RTCA’s DO-178C, was last updated in 2012, while the related hardware standard, DO-254, has similar challenges.
DO-178 Certification:
DO-178 certification is an extensive, multi-step process, which has evolved from prior failures and the state of capabilities from the jet age (1970s-1980s). The framework is structured to minimize risks by ensuring every function meets rigorous validation and verification standards.
Traceability, Determinism, and Design Assurance Levels (DAL):
Avionics systems must be highly traceable and deterministic to ensure reliability. The Design Assurance Level (DAL) framework categorizes systems by failure severity, with catastrophic failures requiring a probability of less than one in a billion (10⁻⁹) per flight hour. This extreme level of reliability makes software development and verification slow and costly.
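It can help to see what 10⁻⁹ means at fleet scale. The sketch below assumes the target is read as a constant rate per flight hour (the usual convention) and models events as a Poisson process; the 100 million fleet hours is a made-up exposure figure, not data.

```python
import math


def expected_events(rate_per_flight_hour, flight_hours):
    """Expected number of events for a constant rate over a given exposure."""
    return rate_per_flight_hour * flight_hours


def prob_at_least_one(rate_per_flight_hour, flight_hours):
    """Poisson model: P(N >= 1) = 1 - exp(-lambda)."""
    lam = expected_events(rate_per_flight_hour, flight_hours)
    return -math.expm1(-lam)


CATASTROPHIC_RATE = 1e-9   # target rate per flight hour
FLEET_HOURS = 1e8          # hypothetical cumulative fleet exposure

print(expected_events(CATASTROPHIC_RATE, FLEET_HOURS))
print(prob_at_least_one(CATASTROPHIC_RATE, FLEET_HOURS))
```

Even at that rate, a large fleet flying for decades accumulates enough hours that the expected count is not negligible, which is part of why the bar is set so high.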
Simplified Data for DAL Compliance:
Avionics relies on rule-based software that processes relatively simple data to meet stringent safety requirements. For example, a system displaying a total air temperature (TAT) of 15°C or an exhaust gas temperature of 695°C must display values accurately and trigger alerts if thresholds are exceeded. The deterministic nature of these alerts maximizes the odds that pilots receive clear warnings without unpredictable system behavior.
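A rule-based alert of this kind can be sketched in a few lines. The thresholds and names below are hypothetical and chosen only so that the 695°C reading from the example sits just under the caution limit; certified avionics code is written and verified very differently.

```python
from enum import Enum


class Alert(Enum):
    NONE = "none"
    CAUTION = "caution"
    WARNING = "warning"


# Hypothetical limits for illustration; real limits are engine-specific.
EGT_CAUTION_C = 700.0
EGT_WARNING_C = 750.0


def egt_alert(egt_c: float) -> Alert:
    """Map an exhaust gas temperature to an alert level with fixed rules."""
    if egt_c >= EGT_WARNING_C:
        return Alert.WARNING
    if egt_c >= EGT_CAUTION_C:
        return Alert.CAUTION
    return Alert.NONE


for reading in (695.0, 705.0, 760.0):
    print(reading, egt_alert(reading).value)
```

The same input always produces the same output, and every branch can be traced back to a written requirement. That property is what makes this style of software certifiable, and also what limits it to simple, well-structured inputs.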
Rule-Based Software vs. Machine Learning (ML) / Artificial Intelligence (AI):
Avionics software must adhere to strict determinism and traceability. AI and ML models operate probabilistically and do not currently meet DO-178C standards. While research explores AI’s potential in aviation, rule-based software remains the only approved approach. Because rule-based software handles complexity poorly, many self-driving car companies rely more on ML/AI software to enhance overall system performance, effectively sacrificing some traceability and determinism for better handling of complexity and edge cases.
737 MAX MCAS software:
The 737 MAX accidents highlighted challenges in balancing certification costs and safety. Reports indicated that MCAS relied on a single angle-of-attack (AoA) sensor to reduce DO-178C software certification costs. This single-point failure increased the risk of erroneous inputs, contributing to the eventual crashes. This has since been revised, leveraging both of the AoA sensors on the aircraft to prevent similar failures.
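The general idea of cross-checking two sensors can be sketched as follows. This is not Boeing's implementation: the disagreement limit, the activation angle, and both function names are hypothetical and exist only to show why a second input removes the single point of failure.

```python
AOA_DISAGREE_LIMIT_DEG = 5.0  # hypothetical disagreement threshold


def validated_aoa(left_deg, right_deg, limit_deg=AOA_DISAGREE_LIMIT_DEG):
    """Return the averaged angle of attack if both sensors agree, else None."""
    if abs(left_deg - right_deg) > limit_deg:
        return None  # sensors disagree: do not trust either value
    return (left_deg + right_deg) / 2.0


def augmentation_allowed(left_deg, right_deg, activation_aoa_deg=12.0):
    """Allow the (hypothetical) augmentation only on agreeing, high AoA readings."""
    aoa = validated_aoa(left_deg, right_deg)
    return aoa is not None and aoa >= activation_aoa_deg


# One failed vane reading 25 degrees no longer triggers anything by itself.
print(augmentation_allowed(4.0, 25.0))
print(augmentation_allowed(13.0, 14.0))
```

With a single sensor, one bad vane is indistinguishable from a real high angle of attack. With two, a disagreement becomes a detectable condition that the software can refuse to act on.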
What about autopilot?
Autopilot is comparable to Level 1 vehicle automation. It manages basic tasks under strict software control. When engaged, it operates in a closed-loop system, adjusting aircraft controls to maintain desired flight parameters. This is similar to a car’s cruise control, where the system adjusts the throttle to maintain a set speed.
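The closed-loop idea is the same whether it is cruise control or a speed hold mode: measure, compare to the target, adjust, repeat. Below is a toy proportional controller on a deliberately simplified model where the command changes speed directly; the gain, time step, and function name are assumptions, and real autoflight control laws are far more involved.

```python
def hold_speed(target_kt, initial_kt, gain_per_s=0.5, dt_s=0.1, steps=600):
    """Simulate a proportional speed-hold loop and return the speed history."""
    speed_kt = initial_kt
    history = [speed_kt]
    for _ in range(steps):
        error_kt = target_kt - speed_kt         # measure and compare to target
        accel_kt_per_s = gain_per_s * error_kt  # command proportional to the error
        speed_kt += accel_kt_per_s * dt_s       # toy plant: command changes speed
        history.append(speed_kt)
    return history


speeds = hold_speed(target_kt=280.0, initial_kt=250.0)
print(round(speeds[-1], 3))
```

Notice what the loop does not do: it never asks whether 280 knots is the right target. Choosing the mode and the setting is still the human's job.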
Aren’t pilots on autopilot the whole time?
Most flights operate with autopilot engaged, except during taxi, takeoff, and landing. Autopilot and autothrottle systems manage pitch, roll, yaw, and speed, reducing the pilot's workload for routine tasks. While pilots can manually control all aspects of flight, automation enhances efficiency and minimizes errors over long durations. The human role shifts to system oversight and higher-order decision-making that these relatively simple systems cannot perform.
The Aircraft as a System (Humans + Machines)
Crew Resource Management (CRM):
Aircrew members work together using Crew Resource Management principles, similar to teamwork in a kitchen. One person may manage tasks and oversee execution/recipe, while another carries out specific actions like setting the oven to 400°F. If a mistake occurs, such as mishearing "500°F," the other team member corrects it. In this way, teams perform better than individuals and collectively have fewer uncorrected mistakes. CRM in aviation functions the same way, with the Pilot Flying (executing) and the Pilot Monitoring (monitoring, assisting, verifying) ensuring a safe and coordinated operation.
Crew Size and Technology Evolution:
In the early days of commercial aviation, multiple crew members managed flight operations, including a flight engineer, navigator, and radio operator. As technology advanced, automated systems replaced many of these manual processes. Today, in modern aircraft, two pilots handle all aspects of flight through CRM, assisted by modern avionics that streamline tasks like navigation, system monitoring, and communication.
Pilots and Design Assurance Level (DAL) Interaction
When air traffic control (ATC) requests a heading change, pilots assess their current position (high DAL), cross-check maps (high DAL), and evaluate factors like terrain and other aircraft. After confirming the request, the pilot updates the heading bug (high DAL), instructing the autopilot (high DAL) while monitoring execution.
Complexity and DAL Limitations:
A key point: DAL-certified avionics reliably handle structured inputs and outputs but struggle with complex, probabilistic tasks. Elements like speech and vision lack the determinism and traceability required by DAL standards, necessitating human involvement for those and for higher-order tasks like sensing, thinking, and acting. This distinction means pilots interact with DAL-governed systems while the pilots themselves are generally neither deterministic nor traceable. Everyone reading this has likely misheard a phone number, mistyped the PIN on their phone, or accidentally skipped a step in a recipe. Despite their considerable skill, professional pilots make human errors, too. The lack of assistance for human actions is a significant source of the human error attributed to mishaps.
The Future of Flight and Technology:
Aviation Demand is High, Placing Additional Pressure on the National Airspace System
Only about 20% of the world’s 8 billion people have flown, and this figure is closer to 5% in countries like India. As aviation demand continues to rise, the sector is expected to double in the coming decades. This growth will place immense pressure on pilot training, aircraft system design, and the overall efficiency of the National Airspace System. Scaling safely will require improvements in automation/technology, regulation, training, and infrastructure to support this surge in air travel.
Regulators and Industry:
Governments and industry leaders are heavily investing in the next generation of aviation technology, which includes faster, alternatively powered aircraft and fully autonomous systems. However, while technological advancements accelerate, regulatory frameworks evolve cautiously to maintain safety standards. This often creates tension between technologists pushing for rapid deployment and regulators ensuring new systems are safe and reliable. While AI and automation continue to improve, passengers and regulators will and should remain hesitant to remove humans from the cockpit entirely.
Improving the Aircraft as a System:
The future of aviation lies in improving the aircraft as a complete system, integrating advanced technology with human expertise. This means more reliable engines and avionics, enhanced pilot training, and more intelligent automation. AI and machine learning will be critical in supporting pilots, improving collision avoidance, verifying correct runway usage, and enhancing situational awareness. While professional aircrews today are highly skilled, human error remains a factor. The best approach is not to replace pilots but to augment them with cutting-edge technology, ensuring a safer, more efficient flight experience for all.
--------
Appendix on the Washington, D.C. Accident on January 29, 2025
Initial Thoughts:
It’s worth noting that the NTSB, FAA, DOD, and others have just started their investigations, and it will take time for all the lessons to emerge. The items in the list below are largely speculation, but they are consistent with similar mishaps that have already been investigated. More information will bring additional clarity in the coming months.
Instrument Flight Rules for Traffic Deconfliction:
Airline operations follow Instrument Flight Rules (IFR) to ensure separation, but some responsibilities shift to flight crews near terminal areas. At airports like SFO, where parallel runways are closely spaced, pilots must actively manage deconfliction. Military aircraft often operate under IFR in controlled airspace but can also use Visual Flight Rules (VFR). A 1971 midair collision between a USMC F-4 Phantom and Hughes Airwest Flight 706, which resulted in 50 fatalities, highlighted the importance of IFR guidance for military aviation.
Airport Geometry and Approach Challenges:
Most airports feature fairly long, straight final approaches, but DCA's unique location near restricted airspace forces aircraft to often turn just before landing. At other airports like LAX, similar approach geometry flown by the pilot would typically be considered unstable, requiring a go-around. Most airports have arriving and departing traffic in a `bow-tie` shape (departures for LAX below). This bow-tie shape gives crossing and transiting traffic an easier place to cross, the center of the bow, where the traffic is expected to be at/near the runway. An example of a crossing route for LAX is also included below.
Runway Change:
Changing the runway from 01 to 33 for American Eagle (JIA) 5342 meant that instead of paralleling the helicopter's landing route, the aircraft would need to maneuver to the east to line up for runway 33, which would now intersect the helicopter's path.
Potential Altitude Deviation:
The NTSB reported this week that the helicopter appeared to be 125 feet above the 200-foot corridor altitude limit. Except for intentional formation flying, no two aircraft should be within 100 feet of each other in the air. Nevertheless, if the report is accurate, those 125 feet would likely have been the difference between a near-miss and an accident.
Traffic Collision Avoidance System (TCAS):
Most military aircraft do not have TCAS, which provides traffic alerts (TA) and resolution advisories (RA). TAs notify pilots of potential conflicts, while RAs offer specific deconfliction guidance. Flight JIA 5342 would have been equipped with TCAS, but resolution advisories are inhibited at low altitude near landing to prevent unnecessary alerts. As discussed earlier, current automation struggles with complex traffic scenarios, limiting TCAS effectiveness.
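For readers curious how a TA differs from an RA in principle, here is a heavily simplified sketch based on time to closest approach (range divided by closure rate). The thresholds, the `ra_inhibited` flag, and the function names are hypothetical; real TCAS II logic also accounts for altitude, vertical rates, and sensitivity levels.

```python
TA_TAU_S = 40.0  # hypothetical traffic-alert threshold, seconds
RA_TAU_S = 25.0  # hypothetical resolution-advisory threshold, seconds


def time_to_closest_approach_s(range_nm, closure_rate_kt):
    """Seconds until the aircraft meet at the current closure rate, or None."""
    if closure_rate_kt <= 0:
        return None  # not converging
    return range_nm / closure_rate_kt * 3600.0


def advisory(range_nm, closure_rate_kt, ra_inhibited=False):
    """Return 'none', 'TA', or 'RA' for a single intruder (horizontal only)."""
    tau = time_to_closest_approach_s(range_nm, closure_rate_kt)
    if tau is None or tau > TA_TAU_S:
        return "none"
    if tau <= RA_TAU_S and not ra_inhibited:
        return "RA"
    return "TA"  # includes the case where an RA is inhibited, e.g. near landing


print(advisory(5.0, 300.0), advisory(3.0, 300.0), advisory(2.0, 300.0))
print(advisory(2.0, 300.0, ra_inhibited=True))
```

When the resolution advisory is inhibited, the most the system offers is an alert that traffic is nearby, and the deconfliction itself falls back to the humans involved.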
Mixed Radios:
Military aircraft primarily use UHF frequencies, while civilian air traffic operates on VHF. However, ATC can communicate on both. This frequency mismatch meant that American Eagle 5342 and PAT 25 likely could not hear each other’s transmissions, contributing to the misunderstanding. At Reagan National Airport, the tower operates two VHF and one UHF frequency (257.6), illustrating this divide. Even two aircraft on different VHF frequencies would [generally] not have been able to hear each other.
Communication Expectation Bias:
Aviation radio communication follows standardized patterns that enhance safety. In this case, the natural sequence may have lowered the vigilance of the American Eagle pilots. When ATC queried PAT 25 about traffic awareness, JIA 5342’s crew would have assumed ATC was working on deconfliction. The follow-up call, telling PAT 25 to pass behind JIA 5342, would only have made sense if PAT 25 had answered the prior question by saying it had JIA 5342 in sight. If PAT 25 had said they did not see JIA 5342, ATC would have made a different call, likely directing one or both aircraft to change course or altitude. The same logic applies to the silence after the directive to fly behind JIA 5342: the lack of a follow-up would mean that PAT 25 accepted the request. Since PAT 25 was reportedly on a different frequency, the American Eagle pilots would not have heard that confirmation but would have assumed it occurred, reinforcing a false sense of deconfliction.
Attention and NVGs:
While aligning for landing, the American Eagle crew likely believed the helicopter had them in sight and was following behind. Visual deconfliction is a common occurrence. Meanwhile, PAT 25’s Army crew reportedly used night vision goggles (NVGs), which amplify light but can impair depth perception (I’ve used the same/similar NVGs before). Flying over a well-lit urban area makes identifying JIA 5342 harder, as does having multiple aircraft in the same visual sector, which makes it easier for a crew to believe they have the correct aircraft in sight when they do not. We don’t know if this happened here, but it can happen in situations similar to this. If the helicopter crew was scanning for additional traffic, or was generally ‘eyes out,’ they may not have been watching their instruments as closely, which may have contributed to an inadvertent climb.
Air Traffic Control (ATC):
ATC relies on high-assurance software but still requires manual oversight, making controllers susceptible to errors for the same reasons covered above. As you can see in the radar scope image, they also work with technology that has aged over the decades. Additionally, controllers face fatigue, distraction, and cognitive overload at times. While ATC has prevented countless accidents (and been very helpful to me in my flying career), mistakes can occur. The radar scope for this accident flagged a ‘Conflict Alert’ (CA), an alert that is of limited use in the example covered above, where PAT 25 said they saw JIA 5342 and would fly behind it. The technology also struggles with complex, low-altitude scenarios. Future advancements in ATC automation may help mitigate these limitations.
Pilot Fatigue:
Fatigue was not identified as a factor in this accident, but it plays a significant role in many incidents. Studies equate 18 hours of wakefulness to a 0.05% blood alcohol level, impairing cognitive function. Pilots operating on minimal rest may misinterpret information, overlook critical details, or fail to recognize deconfliction cues. The same risks apply to air traffic controllers managing congested airspace. The FAA, NASA, DOD, and others work on fatigue research and mitigation.
New Layers to the Swiss Cheese Model Could Prevent Similar Accidents:
This accident exemplifies the Swiss Cheese Model, where multiple small failures aligned to create a hazardous scenario. Closing any one of those gaps, or others we do not yet know about, might have prevented this outcome. Lessons will be learned, safety measures will be refined, and technology will improve. These changes will add new protections against similar accidents. Aviation remains one of the safest modes of transportation, and continuous advancements will further enhance safety for all.
I’ve been meaning to write this post for over a year. Writing about aviation safety is not always the most exhilarating topic; however, I am confident that reading this will have you in the top few percent of people thinking about how it all works, especially in light of last week’s unfortunate accident.
This post combines aviation history, advancements in aviation safety, how some aviation data works, a bit of where things are today, and where I think the space is heading. I provide a little thought on last week’s Washington D.C. accident at the end. I tried to balance accuracy and length, but as a result, in a few places, there may be inaccuracies in the form of simplification / generalization.
By background, I’ve been flying for about 20 years. Three of my grandparents served in/around WWII. One was a pilot. From a young age, I thought I would serve and fly for the country if I could. I joined the US Navy and served nearly 11 years. I flew F/A-18Cs, albeit as a very mediocre pilot - certainly nobody asked me to go to TOPGUN. After serving in the Navy, I worked on self-driving car software at Cruise. I now work directly on building technology to enhance aviation safety and efficiency at Beacon AI using my and my teammates’ lived experience.