Hi! We've renamed ScraperWiki.
The product is now QuickCode and the company is The Sensible Code Company.


Which plane had the most accidents?

Searching by facets

Last year, ScraperWiki helped migrate lots of specialist datasets to GOV.UK.

This afternoon, we happened to notice that the Air Accidents Investigation Branch (AAIB) reports, which we scraped from the branch's old site, are now live on GOV.UK.


The user interface is called Finder Frontend, and is used by GOV.UK wherever the user needs to search for items by varying criteria. In the jargon, it’s called “faceted search”.

We enabled this type of searching by scraping the “Aircraft category”, “Report type” and “Date” fields. Users can then filter the accident reports by one or more of those criteria at once.
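
As a rough illustration of what filtering by several facets at once means, here is a minimal sketch in Python. It is not how Finder Frontend is implemented. The `Report` class, the `filter_reports` function, the field names and the sample records are all hypothetical, made up purely for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Report:
    """One accident report with the three scraped facet fields (hypothetical shape)."""
    title: str
    aircraft_category: str
    report_type: str
    published: date


def filter_reports(
    reports: Iterable[Report],
    aircraft_category: Optional[str] = None,
    report_type: Optional[str] = None,
    published_from: Optional[date] = None,
    published_to: Optional[date] = None,
) -> List[Report]:
    """Keep reports that match every facet supplied; None means 'any value'."""
    matches = []
    for report in reports:
        if aircraft_category is not None and report.aircraft_category != aircraft_category:
            continue
        if report_type is not None and report.report_type != report_type:
            continue
        if published_from is not None and report.published < published_from:
            continue
        if published_to is not None and report.published > published_to:
            continue
        matches.append(report)
    return matches


# Made-up placeholder records, not real AAIB data.
SAMPLE = [
    Report("Example A", "fixed wing", "bulletin", date(1975, 6, 1)),
    Report("Example B", "fixed wing", "formal report", date(1987, 9, 1)),
    Report("Example C", "rotorcraft", "bulletin", date(1995, 3, 1)),
]

if __name__ == "__main__":
    # Combine two facets: category and a date range.
    for report in filter_reports(
        SAMPLE, aircraft_category="fixed wing", published_from=date(1980, 1, 1)
    ):
        print(report.title)
```

Each facet narrows the result set independently, so leaving a facet unset simply skips that check.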

Most accident prone

Since we scraped the reports, we also happen to have the data in an SQL database on our Data Science Platform. A quick query, counting reports per aircraft, reveals which aircraft has the most accident reports written about it.
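
A query of this kind might look something like the sketch below, which uses Python's built-in `sqlite3` module with an in-memory database. The table name `reports`, its columns, and the placeholder rows are assumptions for illustration only. They are not the real schema or data.

```python
import sqlite3


def most_reported(conn, limit=1):
    """Return (registration, report_count) rows, most-reported aircraft first."""
    query = """
        SELECT registration, COUNT(*) AS report_count
        FROM reports
        GROUP BY registration
        ORDER BY report_count DESC, registration ASC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


def build_demo_db():
    """Create an in-memory database with a hypothetical schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reports (registration TEXT, aircraft_type TEXT, title TEXT)"
    )
    rows = [
        # Placeholder registrations, not real aircraft.
        ("G-AAAA", "Type A", "Report 1"),
        ("G-AAAA", "Type A", "Report 2"),
        ("G-AAAA", "Type A", "Erratum to report 2"),
        ("G-BBBB", "Type B", "Report 3"),
    ]
    conn.executemany("INSERT INTO reports VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    print(most_reported(conn))
```

Note that a plain `COUNT(*)` counts errata as reports too, so the figure is a count of reports rather than of accidents.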


The answer is G-AWNB, a Boeing 747-136. It was made in 1970, and has 10 accident reports (some of those are errata, so it doesn’t mean ten accidents).

Here are three of its accidents, chosen to span the decades:

  1. In 1975 in Scotland, part of a flap detached during a training flight and struck the cabin door.

  2. In 1987, shortly after takeoff, a steward noticed a skin panel had ruptured on the left wing, and the hapless plane had to jettison its fuel and return to Heathrow.

  3. Lest you think it was just a badly made or badly maintained plane, in 1995, also at Heathrow, it suffered plain bad luck: a faulty passenger jetty rose up and damaged the cabin door, and repairs took several days.

Conclusion

ScraperWiki often helps with migration projects like the AAIB data. As another example, we’re working on migrating insurance data between two ERP systems at the moment.

Understanding a (poorly) documented dataset, and producing the best-quality output for re-use, is an important part of data science. We use the same skills in lots of other projects.

Understanding data fully is the first stage of doing useful analysis with data.
