healthcare – ScraperWiki
https://blog.scraperwiki.com
Extract tables from PDFs and scrape the web
Tue, 09 Aug 2016 06:10:13 +0000

Meet Carl Plant – He Wants an Army of Armchair Health Researchers
https://blog.scraperwiki.com/2011/11/meet-carl-plant-he-wants-an-army-of-armchair-health-researchers/
Fri, 18 Nov 2011 17:11:27 +0000

I think more can be done with NHS data. We are seeing health data being published in such places as the NHS Information Centre, the Office for National Statistics, data.gov.uk, health observatories and various institutions such as the Royal College of GPs.

There’s a range of useful NHS data available if you know where to look, though at times it is pretty much hidden by obscurity. That is what led me to build the Healthdata search engine, to dig out the data from NHS sites and various institutions. When you do find the data, you have to be mindful of its quality and reliability; you will regularly see incomplete datasets or poorly formatted documents along your travels. For example, if you look at the A&E dataset you get this warning:

During the period covered by the 2008-09 and 2009-10 A&E HES data, not all NHS trusts have complete data submissions and data quality is poor in some cases. As a result of this, caution should be exercised while using these data.

One of my pet hates is poorly designed graphics; you will regularly hear me moan on Twitter about 3D pie charts and the poorly designed data dashboards used in the NHS. This is an area that needs a good makeover: turning data into meaningful information is an art form as well as a statistical process.

A role I would like to see developed is ‘citizen health researcher’, much like its close relative the ‘citizen journalist’. We need to provide investigative skills, tools and support to develop an army of citizen health researchers; people who take an interest in how NHS services are run while taking a balanced investigative approach to NHS data. This will provide the ‘many eyes’ or the ‘armchair auditors’ approach to checking how the NHS is running.

Now I’m not advocating an army of NHS bashers but rather people who are able to discover new ways to look at, combine and display data. They need to start looking for patterns or trends not seen from within the health informatics community.

We are seeing a range of tools and online communities aimed at lowering the entry level to becoming a (citizen) health data researcher. ScraperWiki, Kasabi, Buzzdata, Junar, and Google Docs are just some of the tools available. We also have ‘citizen investigative channels’ such as Help Me Investigate Health. In fact, many of these data analysis and visualisation tools may actually be better than what clinicians have at their disposal!

If we are to see an increase in people participating in how healthcare is run, we need to have more transparency, better tools to engage and lower entry levels. My challenge to (data tool) developers is to lower the entry levels for citizen health researchers.

Carl Plant is a hybrid of geek, nurse, part-time coder, digital artist, health data blogger and dad. He is also open data manager and community manager for a digital healthcare service in the West Midlands. You can see his scrapers here.

Solving Healthcare Problems with Open Source Software
https://blog.scraperwiki.com/2011/11/solving-healthcare-problems-with-open-source-software/
Fri, 11 Nov 2011 13:05:59 +0000

This year, EHealth Insider brought a new feature to their annual EHI Live exhibition: a healthcare skunkworks that gave visitors the chance to ask questions about how open source software can be used to solve healthcare problems.

ScraperWiki, of course, had to be one of the invited guests exhibiting at the skunkworks. So, as is our way, we ran an agile data mining sprint on the first day of the exhibition. The idea was to convene a small group of developers, give them coffee and an Internet connection, and see if they could create useful healthcare and NHS data sets by the end of the day. Attendees at the ScraperWiki exhibit could watch the scrapers being developed in real time! It was thrilling!

Four developers participated in the sprint, from ScraperWiki and NHS Connecting for Health. By the end of the day, they had written multiple scrapers delivering data about:

* World Health Organisation outbreak alerts and responses

* Communicable and respiratory disease incidence data from the Royal College of GPs

* Health information standards from the NHS Information Standards Board

* Foodborne outbreaks in the US, from the Centers for Disease Control and Prevention

* Suppliers registered with the UK Government Procurement Service

One very lucky developer, Jacob Martin, from NHS Connecting for Health, won the coveted ScraperWiki mug for writing the most scrapers over the course of the day (*applause*).

But it’s not just about the scraping: the ideals of ‘open’ can be enlightening, even in such a short period of time, given the will and the right equipment. As Shaun Hills, from NHS Connecting for Health, commented: “Interoperability and data exchange are important parts of healthcare IT. It was interesting and useful to see how technology like ScraperWiki can be used in this area. It was also good to brush up on my Python coding and still deliver something in a few hours.”

So watch out healthcare – you’re being ScraperWikied!
