We talked to Pierre Romera, a former developer with a background in computer science, at a five-day data journalism training in Belgrade for a group of young journalists, which he conducted together with Nicolas Kayser-Bril, whom we recently interviewed for Media.ba.
Romera is now chief technical officer (CTO) of Journalism++, a two-year-old company currently comprising six people, which assists newsrooms in the transition towards increased use of data by creating innovative projects around data and around new ways to tell stories.
Romera previously worked at Owni.fr, a news website built around digital culture and new kinds of journalism, where he and Kayser-Bril established a data journalism section and created the first French data journalism applications.
What are your impressions of the training in Belgrade?
There is genuine curiosity among the people. I think things are not really different in France. Yes, some people already do data journalism, but only a few know how to do it the right way. And I think that at the end of the week people here will be able to do it the right way. They should be able to do charts without making mistakes (for instance, putting too many data points on a chart or not taking the margin of error into account, this kind of thing).
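The margin-of-error point can be illustrated with a short sketch (not part of the interview; the data and the 95% z-value are illustrative, using only Python's standard library):

```python
import math
import statistics

# Hypothetical survey responses: 1 = yes, 0 = no (illustrative data)
responses = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

mean = statistics.mean(responses)  # sample proportion saying "yes"
stderr = statistics.stdev(responses) / math.sqrt(len(responses))
margin = 1.96 * stderr             # roughly a 95% confidence interval

print(f"{mean:.0%} +/- {margin:.0%}")
```

A chart that shows only the 65% headline figure, without the roughly twenty-point margin around it, is exactly the kind of mistake the training warns against.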
Our main expertise is building crowdsourcing projects – involving the crowd, citizens, in investigating and sourcing data. We started three years ago at Owni.fr with WikiLeaks projects, where we asked people to dig into documents published by WikiLeaks and help us find interesting ones. We develop these kinds of projects for Journalism++.
Our latest one is about electricity supply in Douala, Cameroon, at www.feowl.com. There we experiment with daily crowdsourcing, asking people every day whether they had power cuts. The project is funded by the International Press Institute (IPI).
For different projects we rely both on partnerships with foundations and on media partnerships – so we operate both with the help of donations and commercially.
Media come to us with a lot of data, or with stories to tell with data, and we apply our expertise to say what the best story is to build with that data and to create services around it. We attempt to create a platform with every dataset.
For example, a project we built with AFP, the biggest press agency in France, is called "The E-Diplomacy Hub". They came to us needing a database of diplomacy actors who are active on Twitter. So they asked us to create a tool to build a database of Twitter accounts, and we created this tool to visualise the discussion around the world. You can see the discussion – tweets exchanged by diplomats – between countries. We built the visualisation on a map and tried to create a service behind it: a way for journalists to export data and embed it into articles (they can export every tweet about a particular country, for instance). They just have to select the data they want to monitor and they get a widget they can embed into an article or website. They can also visualise a particular country's discussion with other countries. For instance, you can see right now that Belgium is talking with the USA.
We have another service, built for the German foundation ABZV in cooperation with other people, named Datawrapper.de. The project is about creating visualisations quickly from a dataset and is available to everyone. Charts created with Datawrapper get a lot of impressions because it is used by the Guardian, Le Monde and other big websites. We have a small community of users, but they are power users.
You spoke about data security for journalists at the training in Belgrade. What is important for a journalist to know regarding the safety of their data?
When you exchange data you have to know how to do it in a safe way. I showed just a few tools – I talked about Pidgin, an instant messaging client that lets you chat with encryption. There are other tools, like Enigmail, which allows you to encrypt your email through the e-mail client Thunderbird. I also talked about Tor, an anonymous browsing tool. But most of security is best practice – it's not only about the tools, but about how to have a good password, how the network works and why it is important to know where your data transits. It's important to encrypt the data and to avoid making it transit too much on the network – avoid using many websites or web services to send data, because when you use a computer and you are on the Internet you leave traces, and you want to leave as few traces as possible. The best way is to give someone a dataset without exposing it, for example on a USB key. It's simple, but it's probably the safest way to transmit data. People don't think of it as safe, but the most anonymous way to send information is to use snail mail.
You are also a university teacher at a journalism school. What do you teach young journalists?
In France I teach at the MA level at L'École de Journalisme de Sciences Po, one of the main journalism schools in France. I teach journalists how to program web pages. But it's more about building a base of knowledge for working with developers. When you start to work with developers, you have to have a good knowledge of the language you need to talk with them. And that's what we try to teach journalists. Our purpose is not to create programmers, but people able to work with programmers. The idea is that they learn the process so they can control it once they are in a position to do so.
There is also a data journalism course at the MA level, and my course is linked to it: if you want, for example, to scrape a website, you have to know how it is structured and which language lets you extract data from it. Journalists learn that in the data journalism course. The course is new and has been running since January 2013. The students learn pretty much what we taught in this Belgrade workshop – data visualisation with Datawrapper, Infogram, Google Fusion, Google Charts... It's what we call data literacy: knowing which story you can build with data.
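The scraping point – you need to understand a page's structure before you can extract data from it – can be sketched with Python's built-in `HTMLParser`. The page snippet and its numbers are hypothetical; a real scraper would typically fetch a live page and use a library such as BeautifulSoup:

```python
from html.parser import HTMLParser

# A hypothetical fragment of the page we want to scrape (illustrative figures)
page = ("<table><tr><td>Belgrade</td><td>1166763</td></tr>"
        "<tr><td>Novi Sad</td><td>341625</td></tr></table>")

class CellExtractor(HTMLParser):
    """Collect the text content of every <td> cell."""
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells.append(data)

parser = CellExtractor()
parser.feed(page)

# Pair each city name with the figure in the next cell
rows = list(zip(parser.cells[::2], parser.cells[1::2]))
print(rows)
```

Knowing that the data lives in `<td>` cells inside table rows is exactly the structural knowledge the course teaches: the parser is trivial once you know what to look for.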
I don't think every journalist should be able to do that – there are already specialists in newsrooms. But I think it's important for every journalist to understand why they need data and why they should process data. It doesn't mean they have to know how to do it themselves; rather, they have to know that there are people able to process the data, which is now bigger than before.
What is the status of data driven journalism in France, where you live? What is the situation in newsrooms?
Not every newsroom is doing data journalism, but they try. There is one data journalist in almost every big newsroom, for instance at Le Monde and at Le Parisien, but they all build data journalism projects with external help, such as in cooperation with Journalism++. They try to do it by themselves, but it tends to stay at a low level. That's why we give a lot of training to newsrooms: they want to be able to build a project alone without a big body of knowledge.
What do you think is the future of data journalism with all these tools mushrooming very quickly?
Every journalist should be able to build a simple data visualisation, such as the ones made with Datawrapper, Infogram or Google Fusion, because learning to make visualisations with these tools is very easy – you can learn it in two hours. And this should free us to build bigger projects with newsrooms: once they can handle small visualisation projects by themselves, we can create bigger projects such as the one we did with AFP.
It seems easy to create a visualisation, but underneath it you need some basic knowledge of mathematics?
Very basic - average, multiplication and that's it.
So there is no danger of journalists making big mistakes?
Sometimes there are big mistakes, and that's why they should be able to speak with programmers who know how to process a lot of data, as well as with statisticians. Statisticians are more and more often found in newsrooms; some newsrooms in France have good ones, for example at the economic newspapers.