internship – ScraperWiki
Extract tables from PDFs and scrape the web
Tue, 09 Aug 2016 06:10:13 +0000

Internship – Scraperwiki
Sun, 25 May 2014 19:23:32 +0000

Hi, I am Patrick. I am currently in Year 10 at the British International School of Stavanger. As part of our curriculum, we are required to have one week of work experience. For my work experience, I worked at Scraperwiki from May 12 to 16 (2014). I found it a real privilege to be able to work at Scraperwiki and am extremely thankful for all the support and help from all of the team! I have an interest in programming and being with other programmers has been a very informative experience.

The first day, when I arrived in the office, I met all the team and was quickly set to work by Francis programming my first scraper. At first, I had no idea where to start; however, Peter was very helpful at getting me started and pointing me in the right direction. I started coding with Jq, which was a language I had never programmed in before, so it took a while for me to get used to it. I finished coding it on Tuesday, but then Francis asked me to put it in a GitHub repository. It took me half of Wednesday to figure out how to make the repository (with help from Ian).

After I finished what Francis asked me to do, he asked me to code it in Python. At that stage, I was familiar with Python but was far from being able to code a scraper with it. Dragon, seeing I was having trouble coding the scraper, showed me what to do. It took me a while to grasp it, but on Thursday, I had got my results, and had organised them in the way Francis asked me to. Unfortunately, I didn’t have it ready for Show and Tell, so I did not have that much to show.

On Thursday, we had a retrospective, which was basically us putting things on a board and choosing which was the most important. I found it interesting to see the different things that popped up, as I was not sure what everyone was working on. I had to read out all the things that satisfied us from the week, which was probably the biggest list. I found it fun, though, as many of the things sparked discussions. When the retrospective was finished, we got to eat the cupcakes Francis brought in, which were extremely rich, but still delicious.

The team at Scraperwiki were really helpful (and very patient!) as well as very friendly. Almost every day I joined them for lunch, which was pretty fun as I got to listen to them talk about work and explain things to me. On three of the five days I was there, I ate from Shirley Valentines, which made fantastic sandwiches.

I had an amazing time at Scraperwiki, and am glad I chose it for my work experience. Hopefully, Francis will keep me busy, and I will stay in touch with the team.

Thank you guys! 😛


Hi, I’m Sophie
Tue, 25 Feb 2014 11:31:53 +0000

Hi, my name is Sophie Buckley, I’m an AS student studying Computing, Maths, Physics and Chemistry who’s interested in programming, games, art and reading. I’m also the latest in a line of privileged ScraperWiki interns and although my time here has been tragically short, the ScraperWiki team have done an amazing job in making every second of my experience interesting, informative and enjoyable. 🙂

One of the first things I learned about ScraperWiki is their use of the XP (Extreme Programming) methodology. On my first day I arrived just in time for ‘stand up’, where every morning the members of the team stand in a circle and share with everyone else what they did the previous working day, what they intend to do that day and what they’re hoping to achieve. Doing this every morning was a bit nerve-racking because I haven’t got the best memory in the world, but Zarino (who’d been assigned the role of looking after me for the week) was always there to give me a helping hand and go first.

He also showed me how they use cards to track and estimate the logical steps that were needed to complete tasks, and how they investigated possible routes with the use of ‘spike’ cards. The time taken to complete a ‘spike’ isn’t estimated, it’s artificially time-boxed (usually to ½ or 1 day). The purpose of a spike is to explore a possible route and find out whether it’s the best one, without having to commit to it.

Zarino had set aside the week of my internship so that both of us could work on a new feature: an automatic way to get data out of ScraperWiki and into Tableau. We investigated the possibilities on Monday, and concluded that we had two options: either generate “Tableau Data Extract” (TDE) files for users to download, or create an “OData” endpoint that could serve up the live data. Both routes were completely unknown, so we wrote a spike card for each of them, to determine which one was best.

Monday and Tuesday consisted of trying to make a TDE file, and during this time I used Vim, SSH, Git, participated in pair programming, was introduced to CoffeeScript (which I really enjoyed using) and was also shown how to write unit tests for what we had written.

On Wednesday we decided to look further into the OData endpoint, and for the rest of the week I learned more about Atom/XML, wrote a Python ‘Flask’ app, and built a user interface with HTML and JavaScript.

One of the great things about ScraperWiki is the friendly nature of everybody who works here. Other members of the team were willing to help where they could and were more than happy to share with me what they were working on when I was curious. They were genuinely interested in me and my studies, and were kind enough to share with me their experiences, which meant that no tea break (tea being in abundance in the ScraperWiki office!) or lunch was ever awkwardly spent with people I barely knew.

The guys at ScraperWiki like to do stuff outside the office too, and on Tuesday I was invited to see Her at FACT, which was definitely one of the highlights of the week! Other highlights included the awesome burgers at FSK and Ian’s ecstatic reaction when our spike magically piped his data into Tableau.

Overall, I’m so glad that I took those hesitant first steps in trying to become an intern at ScraperWiki by emailing Francis all those months ago; this has truly been an amazing week and I’m so grateful to everyone (especially Zarino!) for teaching me so much and putting up with me!

If you’d like to hear more from me and keep up with what I’m doing then you can check out my Twitter page or you can email me. 🙂


My First Month As an Intern At ScraperWiki
Fri, 09 Aug 2013 16:37:43 +0000

The role of an intern is often a lowly one. Intern duties usually consist of the provision of caffeinated beverages, screeching ‘can I take a message?’ into phones and the occasional promenade to the photocopier and back again.

ScraperWiki is nothing like that. Since starting in late May, I’ve taken on a number of roles within the organization and learned how a modern-day, Silicon Valley-style startup works.

How ScraperWiki Works

It’s not uncommon for computer science students to be taught some project management methodologies at university. For the most part though, they’re horribly antiquated.

ScraperWiki is an XP/Scrum/Agile shop. This is definitely not something that is taught at university!

Each day starts off with a ‘stand up’. Each member of the ScraperWiki team says what they intend to accomplish in the day. It’s also a great opportunity to see if one of your colleagues is working on something on which you’d like to collaborate.

Collaboration is key at ScraperWiki. From the start of my internship, I was pair programming with the many other programmers who are on staff. For those of you who haven’t heard of it before, pair programming is where two people use one computer to work on a project.

This is awesome, because it’s a totally non-passive way of learning. If you’re driving, you’re getting first-hand experience of writing code. If you’re navigating, then you get the chance to mentally structure the code that you’re working on.

In addition to this, every two weeks we have a retrospective where we look at how the previous fortnight went and at the next steps we intend to take as an organization. We write a bunch of sticky-notes where we list what was good and what was bad about the previous week. These are then put into logical groups. We then vote for the group of stickies which best represents where we feel that we should focus our efforts as an organization.

What We Work On

Perhaps the most compelling argument for someone to do an internship at ScraperWiki is that you can never really predict what you’re going to do from one day to the next. You might be working on an interesting data science project with Dragon or Paul, doing front end development with Zarino or making the platform even more robust with Chris. As a fledgling programmer, you really get an opportunity to discover what you enjoy.

During my time working at ScraperWiki, I’ve had the opportunity to learn about some new, up-and-coming web technologies, including CoffeeScript, Express and Backbone.js. These are all pretty fun to work with.

It’s not all work and no play either. Most days we go out to a local restaurant and eat lunch together. Usually it’s some variety of Middle-Eastern, American or Chinese. It’s also usually pretty delicious!


All in all, ScraperWiki is a pretty awesome place to intern. I’ve learned so much in just a few weeks, and I’ll be sad to leave everyone when I go back to my second year of university in October.

Have you interned anywhere before? What was it like? Let me know in the comments below!
