Worst software defects in history

A software bug is the common term used to describe an error, flaw, mistake, failure, or fault in a computer program or system that produces an incorrect or unexpected result, or causes it to behave in unintended ways.

So what are the worst software defects in history?

  1. July 28, 1962 — Mariner 1 space probe. A bug in the flight software for the Mariner 1 causes the rocket to divert from its intended path on launch. Mission control destroys the rocket over the Atlantic Ocean. The investigation into the accident discovers that a formula written on paper in pencil was improperly transcribed into computer code, causing the computer to miscalculate the rocket’s trajectory.
  2. 1982 — Soviet gas pipeline. Operatives working for the Central Intelligence Agency allegedly plant a bug in a Canadian computer system purchased to control the trans-Siberian gas pipeline. The Soviets had obtained the system as part of a wide-ranging effort to covertly purchase or steal sensitive U.S. technology. The CIA reportedly found out about the program and decided to make it backfire with equipment that would pass Soviet inspection and then fail once in operation. The resulting event is reportedly the largest non-nuclear explosion in the planet’s history.
  3. 1985-1987 — Therac-25 medical accelerator. A radiation therapy device malfunctions and delivers lethal radiation doses at several medical facilities. Based upon a previous design, the Therac-25 was an “improved” therapy system that could deliver two different kinds of radiation: either a low-power electron beam (beta particles) or X-rays. The Therac-25’s X-rays were generated by smashing high-power electrons into a metal target positioned between the electron gun and the patient. A second “improvement” was the replacement of the older Therac-20’s electromechanical safety interlocks with software control, a decision made because software was perceived to be more reliable. What engineers didn’t know was that both the 20 and the 25 were built upon an operating system that had been kludged together by a programmer with no formal training. Because of a subtle bug called a “race condition,” a quick-fingered typist could accidentally configure the Therac-25 so the electron beam would fire in high-power mode but with the metal X-ray target out of position. At least five patients die; others are seriously injured.
  4. 1988 — Buffer overflow in Berkeley Unix finger daemon. The first internet worm (the so-called Morris Worm) infects between 2,000 and 6,000 computers in less than a day by taking advantage of a buffer overflow. The culprit is a standard input/output library routine called gets(), used here to read a line of text sent over the network. Unfortunately, gets() has no provision to limit its input, and an overly large input allows the worm to take over any machine to which it can connect. Programmers respond by attempting to stamp out the gets() function in working code, but it stays in the C programming language’s standard input/output library for decades and is not removed from the C standard until C11.
  5. 1988-1996 — Kerberos Random Number Generator. The authors of the Kerberos security system neglect to properly “seed” the program’s random number generator with a truly random seed. As a result, for eight years it is possible to trivially break into any computer that relies on Kerberos for authentication. It is unknown if this bug was ever actually exploited.
  6. January 15, 1990 — AT&T Network Outage. A bug in a new release of the software that controls AT&T’s #4ESS long distance switches causes these mammoth computers to crash when they receive a specific message from one of their neighboring machines — a message that the neighbors send out when they recover from a crash. One day a switch in New York crashes and reboots, causing its neighboring switches to crash, then their neighbors’ neighbors, and so on. Soon, 114 switches are crashing and rebooting every six seconds, leaving an estimated 60 thousand people without long distance service for nine hours. The fix: engineers load the previous software release.
  7. 1993 — Intel Pentium floating point divide. A silicon error causes Intel’s highly promoted Pentium chip to make mistakes when dividing floating-point numbers that occur within a specific range. For example, dividing 4195835.0/3145727.0 yields 1.33374 instead of 1.33382, an error of 0.006 percent. Although the bug affects few users, it becomes a public relations nightmare. With an estimated 3 million to 5 million defective chips in circulation, at first Intel only offers to replace Pentium chips for consumers who can prove that they need high accuracy; eventually the company relents and agrees to replace the chips for anyone who complains. The bug ultimately costs Intel $475 million.
  8. 1995/1996 — The Ping of Death. A lack of sanity checks and error handling in the IP fragmentation reassembly code makes it possible to crash a wide variety of operating systems by sending a malformed “ping” packet from anywhere on the internet. Most obviously affected are computers running Windows, which lock up and display the so-called “blue screen of death” when they receive these packets. The attack also affects many Macintosh and Unix systems.
  9. June 4, 1996 — Ariane 5 Flight 501. Working code for the Ariane 4 rocket is reused in the Ariane 5, but the Ariane 5’s faster engines trigger a bug in an arithmetic routine inside the rocket’s flight computer. The error is in the code that converts a 64-bit floating-point number to a 16-bit signed integer. The faster engines cause the 64-bit numbers to be larger in the Ariane 5 than in the Ariane 4, triggering an overflow condition that results in the flight computer crashing. First Flight 501’s backup computer crashes, followed 0.05 seconds later by a crash of the primary computer. As a result of these crashed computers, the rocket’s primary processor overpowers the rocket’s engines and causes the rocket to disintegrate 40 seconds after launch.
  10. November 2000 — National Cancer Institute, Panama City. In a series of accidents, therapy planning software created by Multidata Systems International, a U.S. firm, miscalculates the proper dosage of radiation for patients undergoing radiation therapy. Multidata’s software allows a radiation therapist to draw on a computer screen the placement of metal shields called “blocks” designed to protect healthy tissue from the radiation. But the software will only allow technicians to use four shielding blocks, and the Panamanian doctors wish to use five. The doctors discover that they can trick the software by drawing all five blocks as a single large block with a hole in the middle. What the doctors don’t realize is that the Multidata software gives different answers in this configuration depending on how the hole is drawn: draw it in one direction and the correct dose is calculated, draw in another direction and the software recommends twice the necessary exposure.
  11. Death resulted from inadequate testing of the London Ambulance Service software.
  12. Several deaths of cancer patients in 1985-1987 were due to overdoses of radiation resulting from a race condition between concurrent tasks in the Therac-25 software.
  13. An Airbus A320 crashes at an air show.
  14. An Air New Zealand airliner crashed into an Antarctic mountain; its crew had not been told that the input data to its navigational computer, which described its flight plan, had been changed. From “The development of software for ballistic-missile defense,” by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 52.
  15. On July 1-2, 1991, computer-software collapses in telephone switching stations disrupted service in Washington DC, Pittsburgh, Los Angeles and San Francisco. Once again, seemingly minor maintenance problems had crippled the digital System 7. About twelve million people were affected in the crash of July 1, 1991. Said the New York Times Service: “Telephone company executives and federal regulators said they were not ruling out the possibility of sabotage by computer hackers, but most seemed to think the problems stemmed from some unknown defect in the software running the networks.” Within the week, a red-faced software company, DSC Communications Corporation of Plano, Texas, owned up to glitches in the signal transfer point software that DSC had designed for Bell Atlantic and Pacific Bell. The immediate cause of the July 1 crash was a single mistyped character: one tiny typographical flaw in one single line of the software. One mistyped letter, in one single line, had deprived the nation’s capital of phone service. It was not particularly surprising that this tiny flaw had escaped attention: a typical System 7 station requires ten million lines of code. From The Hacker Crackdown, by Bruce Sterling, 1992.
  16. The opening of the Denver airport was delayed by over a year due to software glitches in the automated baggage handling system.
  17. The U.S. Social Security Administration systems could not handle non-Anglo names, affecting $234 billion for 100,000 people, some going back to 1937. From Internet Risks Forum NewsGroup (RISKS), vol. 18, issue 80.
  18. The Korean Airlines KAL 801 accident in Guam killed 225 out of 254 aboard. A design problem was discovered in the barometric altimetry of the Ground Proximity Warning System (GPWS). From ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1, and the NTSB final report.
  19. A software reboot during the Apollo 11 landing forced Armstrong to land the lunar lander manually.
  20. In 1989, a Swedish Gripen prototype crashed due to new software in the fly-by-wire system.
  21. French ticket reservation software took 4 months to get working.
  22. A software error causes patients to be declared dead.
  23. Software is suspected in a 1994 Chinook helicopter crash that killed 29.
  24. For two days during the summer holidays in 2004, the French national railroad company’s reservation system was disorganized due to a faulty patch.
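
The arithmetic in the Pentium item (number 7) is easy to check. The short Python sketch below recomputes the quotient on a correct floating-point unit and measures how far off the quoted flawed result is. The constant and function names are made up for this illustration, and the flawed value is the rounded figure from the text, not the chip's full output.

```python
from fractions import Fraction

# Operands and flawed quotient as quoted in the Pentium item above.
NUMERATOR = 4195835
DENOMINATOR = 3145727
FLAWED_QUOTIENT = 1.33374  # rounded figure from the text, not the chip's exact output


def relative_error_percent(numerator: int, denominator: int, observed: float) -> float:
    """Return how far `observed` is from the exact quotient, in percent."""
    exact = Fraction(numerator, denominator)
    # Fraction(observed) captures the float's exact binary value, so the
    # subtraction itself adds no rounding error.
    return float(abs(exact - Fraction(observed)) / exact * 100)


correct = NUMERATOR / DENOMINATOR
error = relative_error_percent(NUMERATOR, DENOMINATOR, FLAWED_QUOTIENT)
print(f"correct quotient: {correct:.5f}")
print(f"relative error:   {error:.3f} percent")
```

The error comes out at roughly six thousandths of a percent, which matches the figure in the text and shows why most users never noticed the problem.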
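
The Ariane 5 item (number 9) comes down to squeezing a 64-bit floating-point value into a 16-bit signed integer. Here is a minimal Python sketch of that kind of conversion, with one version that refuses out-of-range input and one that clamps it. This is not the flight code; the function names and the sample values are assumptions chosen only to show the idea.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, raising if the value does not fit."""
    truncated = int(value)  # int() truncates toward zero
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return truncated


def to_int16_saturating(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping out-of-range values."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


# Hypothetical readings: the first fits in 16 bits, the second does not.
small_reading = 20000.0
large_reading = 40000.0

print(to_int16_checked(small_reading))
try:
    to_int16_checked(large_reading)
except OverflowError as exc:
    # An unhandled error at this point is the software analogue of the crash.
    print("conversion failed:", exc)
print(to_int16_saturating(large_reading))
```

Whether to raise or to clamp is a design decision that depends on what the caller can safely do with a wrong value. The point is only that the range assumption has to be re-examined when old code meets new inputs.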
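
The Kerberos item (number 5) is about seeding. The Python sketch below is not Kerberos code: it is a toy illustration, with hypothetical function names and a made-up seed range, of why a key derived from a guessable seed can be recovered by trying every candidate seed, and of how the standard `secrets` module avoids the problem by drawing from the operating system's random source.

```python
import random
import secrets
from typing import Iterable, Optional


def weak_session_key(seed: int, length: int = 8) -> bytes:
    """Derive a key from a seeded, non-cryptographic generator (do not do this)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))


def recover_seed(observed_key: bytes, candidate_seeds: Iterable[int]) -> Optional[int]:
    """Brute-force the seed by regenerating the key for each candidate."""
    for seed in candidate_seeds:
        if weak_session_key(seed, len(observed_key)) == observed_key:
            return seed
    return None


def strong_session_key(length: int = 8) -> bytes:
    """Draw key bytes from the operating system's cryptographic random source."""
    return secrets.token_bytes(length)


# Hypothetical scenario: the seed is something guessable, such as a timestamp
# that an attacker can narrow down to a thousand possibilities.
victim_seed = 1_000_123
victim_key = weak_session_key(victim_seed)
recovered = recover_seed(victim_key, range(1_000_000, 1_001_000))
print("recovered seed:", recovered)
print("strong key length:", len(strong_session_key()))
```

With only a thousand candidates the search finishes instantly, which is the sense in which a poorly seeded generator makes a break-in "trivial".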

Sources and more:

http://courses.cs.vt.edu/~cs3604/lib/Therac_25/Therac_1.html
http://www.cs.tau.ac.il/~nachumd/verify/horror.html
http://www.ccnr.org/fatal_dose.html
http://www.wired.com/software/coolapps/news/2005/11/69355
http://www5.in.tum.de/~huckle/bugse.html
http://www.rvs.uni-bielefeld.de/publications/compendium/index.html
