Civil Service People Survey – ScraperWiki Extract tables from PDFs and scrape the web Tue, 09 Aug 2016 06:10:13 +0000 en-US hourly 1 58264007 Saving time with GOV.UK design standards Thu, 04 Feb 2016 08:46:36 +0000 While building the Civil Service People Survey (CSPS) site, ScraperWiki had to deal with the complexities of suppressing data to avoid privacy leaks and making technology to process tens of millions of rows in a fraction of a second.

We didn’t also have time to spend on basic web design. Luckily the Government’s Resources for designers, part of the Government Service Design Manual, saved us from having to.

In this blog post I talk through specific things where the standards saved us time and increased quality.

This is useful for managers – it’s important to know some details to avoid getting too distant from how projects really work. If you’re a developer or designer who’s about to make a site for the UK Government, there are lots of practical links!

Header and footer

To style the header and the footer we used the govuk_template, via a mustache version which automatically updates itself from the original. This immediately looks good.

CSPS about page

The look takes advantage of years old GDS work to be responsiveaccessible and have considered typography. Without using the template we’d have made something useful, but ugly and inaccessible without extra budget for design work.

It also reduces maintenance. The templates are constantly updated, and every now and again we quickly update the copy of them that we include. This keeps our design up to date with the standard, fixes bugs, and ensures compatibility with new devices.


The template doesn’t have styling for your content. That’s over in a separate module called the govuk_frontend_template. It has a bunch of useful CSS and Javascript pieces. GOV.UK elements is a good guide to how to use them, with its live examples.

For example, there aren’t many forms and buttons in CSPS, nevertheless they look good.

CSPS login form

The frontend template is full of useful tiny things, such as easily styling external links.

CSPS external link

And making just the right alpha or beta banner.

CSPS alpha banner

The tools aren’t perfect. We would have liked slightly more styling to use inside the pages. Some inside Government are arguing for a comprehensive framework similar to Bootstrap.

All I really want is not to have to define my own <h1>!


Privacy is important to users. Despite the flaws in the EU regulations on telling consumers about browser cookies, the goal of informing people how they are being tracked is really important.

Typically in a web project this would involve a bit of discussion over how and when to do so, and what it should look like. For us, it just magically came with the Government’s template.

CSPS cookies

I say magically, we also had to carefully note down all the cookies we use. A useful thing to do anyway!

Error messages

There are now lots of recently made Government digital services to poach bits of web design from.

The probate service has some interesting tabs and other navigation. The blood donation service dashboard has grey triangles to show data increased/decreased. The new digital marketplace has a complex search form with various styles.

I can’t even remember where we took this error message format from – lots of places use it.

CSPS login error

If you’d like to find more, doing a web search for is a great way to start exploring.

Civil Service People Survey – Faster, Better, Cheaper Tue, 08 Sep 2015 13:46:21 +0000 CSPS1

Civil Service Reporting Platform

The Civil Service is one of the UK’s largest employers.  Every year it asks every civil servant what it thinks of its employer: UK plc.

For Sir Jeremy Heywood the survey matters. In his blog post “Why is the People Survey Important?” he says

“The survey is one of the few ways we can objectively compare, on the basis of concrete data, how things are going across departments and agencies.  …. there are common challenges such as leadership, improving skills, pay and reward, work-life balance, performance management, bullying and so on where we can all share learning.”

The data is collected by a professional survey company called ORC International.  The results of the survey have always been available to survey managers and senior civil servants as PDF reports. There is access to advanced functionality within ORC’s system to allow survey managers more granular analysis.

So here’s the issue.  The Cabinet Office wants to give access to all civil servants and in a fast and reliable way.  It wants to give more choice and speed in how the data is sliced and diced – in real time.  Like all government departments it is also under pressure to cut costs.

ScraperWiki built a new Civil Service People Survey Reporting Platform and it’s been challenging.  It’s a moderately large data set.  There’s close to half a million civil servants – over 250,000 answered the last survey which contains 100 questions.   There are 9000 units across government.  This means 30,000,000 rows of data per annum and we’ve ingested 5 years of data.

The real challenges were around:

  • Data Privacy
  • Real Time Querying
  • Design

Data privacy

The civil servants are answering questions on their attitudes to their work, their managers, the organisations they worked in along with questions on who they are: gender, ethnicity, sexual orientation – demographic information. Their responses are strictly confidential and one of the core challenges of the work is maintaining this confidentiality in a tool available over the internet, with a wide range of data filtering and slicing functionality.

A naïve implementation would reveal an individual’s responses either directly (i.e. if they are the only person in a particular demographic group in a particular unit), or indirectly, by taking the data from two different views and taking a difference to reveal the individual. ScraperWiki researched and implemented a complex set of suppression algorithms to allow publishing of the data without breaking confidentiality.

Real-time queriesimage_thumb.png

Each year the survey generates 30,000,000 data points, one for each answer given by each person. This is multiplied by five years of historical data. To enable a wide range of queries our system processes this data for every user request, rather than rely on pre-computed tables which would limit the range of available queries.

Aside from the moderate size, the People Survey data is rich because of the complexity of the Civil Service organisational structure. There are over 9,000 units in the hierarchy which is in some places up to 9 links deep. The hierarchy is used to determine how the data are aggregated for display.

Standard design

image_thumb.pngAn earlier design decision was to use the design guidelines and libraries developed by the Government Digital Service for the GOV.UK website. This means the Reporting Platform has the look and feel of GOV.UK., and we hope follows their excellent usability guidelines.

Going forward

The People Survey digital reporting platform alpha was put into the hands of survey  managers at the end of last year. We hope to launch the tool to the whole civil service after the 2015 survey which will be held in October. If you aren’t a survey manager, you can get a flavour of the People Survey Digital Reporting Platform in the screenshots in this post.

Do you have statistical data you’d like to publish more widely, and query in lightning fast time? If so, get in touch.