Responsible Data Policy
In supporting social justice work, we try as best we can to practise responsible data, and we support partners to do the same.
Practising responsible data means considering the privacy, security and ethical implications of working with data, from collecting it, to managing, using and deleting it.
Since 2013, we’ve been stewards of the Responsible Data community, now a group of more than 1,200 people from diverse backgrounds who come together to discuss and develop best practices for using data in their advocacy and social justice work.
Being part of this space enables us to learn from the challenges that others are facing, and to draw upon a wealth of expertise in figuring out how best to address those challenges.
How We Practise Responsible Data
- We practise data minimisation: collecting only the data we need, and deleting it once it is no longer needed
- We use and support open source technology (more on this below)
- We incorporate responsible data support when working with partners to design and implement projects
- We make our sources of revenue transparent so that others can determine if there are any conflicts they may have in working with us
- When we collect data about people or organisations, we communicate clearly about that collection to allow for informed consent and an opt-in process
- We do not collect data about people, organisations, or activities that we do not have a clear intent to use productively
- We invest core resources to actively participate in and facilitate the responsible data community of practice
- We set high operational security standards for our team, and provide the support needed to meet these standards
To learn more about responsible data, we recommend:
- The Responsible Data Handbook
- Becoming RAD! – a resource to help organisations develop plans for data retention, archiving and disposal
- The Responsible Data Resource List
- Signing up to The Engine Room’s bi-monthly responsible-data-focused newsletter, Mission: Responsible
- Joining the Responsible Data community listserv and exploring Responsibledata.io, which publishes writing and resources from the community.
Services We Use
As a remote organisation, we use a range of services and tools to keep our virtual doors open. Below, we describe some of our tools and choices, and how they might affect the way we collect and store data.
- We use the open-source analytics software Matomo, which we self-host. This is used on The Engine Room’s main site to gain insight about what visitors are looking for, and to improve the site with this in mind. Along with anonymised IP addresses (e.g. 192.168.xxx.xxx), the software tracks country, page visited, duration of visit, returning visits over time, device type and model (e.g. iPhone, Google Pixel), operating system (e.g. Windows, macOS), browser (e.g. Firefox, Chrome), and incoming and outgoing traffic (i.e. where people come from and where they go next). Read more on Matomo’s privacy-respecting configurations here.
- We use WordPress for this site and for the Responsible Data site, because it’s open source and has a lot of functionality that we can use and re-use.
- We use Greenhost’s and Koumbit’s servers. Greenhost offers an ‘ethical and sustainable approach’, valuing privacy, which we really appreciate. We use both to distribute and differentiate our infrastructure, selecting the proper location for different tools and services. We also host our email on Greenhost.
- We use Nextcloud for storing documents that may contain confidential or sensitive information, hosted on servers we control.
- We use Google Drive for collaboration involving non-sensitive information: as a completely remote organisation, real-time collaboration over documents is a key part of how we work, and we’ve not yet found an open source solution that offers even close to the same level of functionality – but we’re always on the lookout. (Note: we don’t use Google’s email services, though. As mentioned above, we run that through Greenhost.)
- When we need to gather information from others, we use Nextcloud Forms on our self-hosted Nextcloud instance.
- We host our code on GitHub, a platform which allows us to publish code publicly, collaborate internally and externally, and track changes to our projects and those of partners.
- We use different video/conference call software as their reliability changes. We’re regular users of the end-to-end encrypted messaging platform Signal. For meetings, we use self-hosted instances of BigBlueButton and Jitsi Meet. However, for community calls we currently use Zoom, because many people are already familiar with the platform, which removes some potential barriers to joining and participating in the calls.
- Internally, we encrypt our team emails using PGP, and we put our public keys up with our staff bios on our Team page. We all use Thunderbird as our email client, an open source software program formerly maintained by Mozilla. Encrypting our emails to each other – and, wherever possible, to partners – has the side benefit that none of us can read our emails on our mobile devices, which encourages a healthy work-life balance.
- We send our newsletter using MailChimp, which tracks links, and stores subscribers’ email addresses. We only send our newsletter to people who actively sign up for it.
- The only social media accounts we use (for now, at least) are Twitter and LinkedIn.
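As an illustration of the anonymised IP format mentioned in the Matomo item above (e.g. 192.168.xxx.xxx), the following is a minimal sketch of masking the trailing octets of an IPv4 address. It is not Matomo's own code or configuration; the function name `anonymise_ipv4` and the `masked_octets` parameter are hypothetical and exist only for this example.

```python
import ipaddress


def anonymise_ipv4(address: str, masked_octets: int = 2) -> str:
    """Replace the last `masked_octets` octets of an IPv4 address with 'xxx'.

    Hypothetical helper for illustration only; not part of Matomo.
    """
    if not 0 <= masked_octets <= 4:
        raise ValueError("masked_octets must be between 0 and 4")
    # IPv4Address raises ValueError if the input is not a valid IPv4 address.
    ip = ipaddress.IPv4Address(address)
    octets = str(ip).split(".")
    kept = octets[: 4 - masked_octets]
    return ".".join(kept + ["xxx"] * masked_octets)


if __name__ == "__main__":
    print(anonymise_ipv4("192.168.12.34"))
```

Masking two octets means an address can no longer be tied to a single device or household, while still allowing coarse, country-level statistics.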
When we’re carrying out research projects, we generally start with a blog post to let others know what we’re working on, in the spirit of transparency and collaboration.
When we’re conducting interviews, we store interviewees’ personal data on Nextcloud (and sometimes temporarily on Google Docs), and we typically use Google Drive to take notes collaboratively, unless the topic is one of particular sensitivity. For coding interviews we use Dedoose, for the most part pseudonymising interview data in the tool using numbers to identify individual interviewees (rather than names or personal details).
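The number-based pseudonymisation described above can be sketched roughly as follows. This is an assumption-laden illustration rather than a description of our actual workflow or of anything built into Dedoose: the class name `Pseudonymiser`, the label format and the example names are all hypothetical.

```python
class Pseudonymiser:
    """Assign each interviewee a stable number in place of their name.

    Hypothetical sketch: the name-to-number key held in `_ids` would need
    to be stored separately from the coded interview data.
    """

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}

    def label_for(self, name: str) -> str:
        # Normalise so that spacing or capitalisation differences do not
        # produce two numbers for the same person.
        key = name.strip().casefold()
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1
        return f"Interviewee {self._ids[key]:02d}"


if __name__ == "__main__":
    p = Pseudonymiser()
    for name in ["Ada Example", "Grace Sample", "ada example "]:
        print(p.label_for(name))
```

Only the labels would appear in the analysis tool; the lookup between names and numbers stays with the person responsible for the project data.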
We take stock of where information should be stored on a project-by-project basis at the start of a project, including how long it should be kept for, and when it should be deleted.
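A project-level record of where data is stored, how long it is kept and when it should be deleted could look something like the sketch below. The structure, field names and retention periods are illustrative assumptions, not our actual plan format.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    """One row of a hypothetical per-project retention plan."""

    dataset: str
    location: str
    collected_on: date
    keep_days: int

    @property
    def delete_on(self) -> date:
        # The date on which the dataset becomes due for deletion.
        return self.collected_on + timedelta(days=self.keep_days)


def due_for_deletion(rules: list[RetentionRule], today: date) -> list[str]:
    """Return the names of datasets whose deletion date has been reached."""
    return [rule.dataset for rule in rules if rule.delete_on <= today]


if __name__ == "__main__":
    plan = [
        RetentionRule("interview notes", "Nextcloud", date(2024, 1, 1), 90),
        RetentionRule("consent forms", "Nextcloud", date(2024, 1, 1), 365),
    ]
    print(due_for_deletion(plan, date(2024, 4, 1)))
```

Writing the plan down at the start of a project makes it straightforward to review periodically which datasets are due for deletion.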
We always make sure that interviewees have the chance to see their quotes in context before they are published. We also check with interviewees as to whether they’d like to be acknowledged by name in anything we say publicly about the report or the project.
When we’re carrying out community calls, we use open source etherpads hosted by Mozilla or Riseup to take notes collaboratively with participants during the call. As a backup (if there are issues with a pad, for example), we occasionally resort to a shared Google Doc.