Sharing research data and participant privacy


(Jdittrich) #1

In the to-be-created OSD manifesto we talk about sharing, transparency and teaching as well as privacy. An interesting field where I find myself caught between the principles is sharing research data.

I wonder how others approach evaluating, anonymizing and sharing research data that is more detailed than a report?


(Jdittrich) #2

I have done the following:

  • Anonymizing reports/notes by editing out names, specific places or products – something like “On [Date] I always go to [a small, local cafe] to work”
  • I sometimes do drawing-based research and ask participants if they want to share the drawings publicly (these are mainly workflow diagrams for software use).
  • If participants agree to it, I send back my notes from the observations/interviews and ask if I can share them and under which conditions (e.g. replacing their name, or actually giving that person credit by naming them, if they want)
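The placeholder-style anonymization in the first point could be partly automated. Below is a minimal Python sketch, assuming the sensitive terms have already been identified by hand; the function name `redact`, the sample sentence and the term list are all hypothetical and only serve as illustration. It is not a substitute for reading the notes again before sharing them.

```python
import re

def redact(text, replacements):
    """Replace each listed term with a bracketed placeholder.

    `replacements` maps a sensitive term to the description that
    should stand in for it (hypothetical helper, for illustration).
    """
    # Handle longer terms first so a full name is replaced before
    # a shorter term that it contains.
    for term in sorted(replacements, key=len, reverse=True):
        placeholder = "[" + replacements[term] + "]"
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        # A function is used as the replacement so the placeholder
        # is inserted literally, without escape processing.
        text = pattern.sub(lambda match, p=placeholder: p, text)
    return text

# Made-up example note, not real participant data.
notes = "On Tuesdays I always go to Cafe Luna to work."
terms = {"Tuesdays": "Date", "Cafe Luna": "a small, local cafe"}
print(redact(notes, terms))
```

A term list like this only catches what was anticipated, so indirect hints (a rare job title, a distinctive routine) still need a manual pass.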

In my experience, people are often very open to sharing the results in some form. I assume, though, that my research is comparatively low-risk: I focus on workflows, and there are surely areas that are far trickier.


(Jan-Christoph Borchardt) #3

For a usability test I ran and published back in 2010, I wrote a bit about the demographics and background of the participant(s): https://jancborchardt.wordpress.com/2010/08/09/shotwell/ – the quotes and other material didn’t include any specifics.

Back when I worked in an agency or at the university, of course, we had release forms and so on. Maybe we could provide some open boilerplate documents for people in the open source community to use?

Also cc @ei8fdb since we chatted about this topic as well. And @bumbleblue I suppose there’s also research done at Simply Secure, how is that handled? Does anyone maybe even have specific guidelines?


(Philip Durbin) #4

We have usability researchers on our team and they used our own open source software, Dataverse, to publish the results: https://doi.org/10.7910/DVN/ND1S3S

Here’s how it looks:

Dataverse defaults to being open and CC0, but the “Request Access” button and red padlock above indicate that access to this dataset has been restricted, probably for the privacy reasons you mentioned. There’s a “Contact” button if you’d like to get in touch with them. By the way, it’s free for any researcher to upload data to Harvard Dataverse. Most of the 33 installations of Dataverse only allow data to be deposited by researchers at the institution that hosts it.

Update: If you click on the “Terms” tab, you can see that access to the dataset has been restricted due to IRB requirements:

So I guess you should check with your IRB (that’s “institutional review board” for the uninitiated).


(Jdittrich) #5

If one works at a place that has one. Many organizations are much too small for that.


(Felix A. Epp) #6

I recently completed the Finnish doctoral research ethics course, and in the lecturer’s opinion open raw research data will become more and more relevant in the next few years, especially because it will greatly increase reproducibility, which is a huge problem in most sciences at the moment.

On the other hand, the new EU GDPR, coming into effect on 25 May, will give participants even more control over their data. Generally, people can withdraw consent at any point in time. Of course, published data cannot be retracted, but it complicates things.

And anonymisation is hard, too. In studies where demographics are very relevant to understanding the data, or in field studies, a combination of identifiers can be used to deanonymise study subjects, especially in smaller samples.
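To make the point about combined identifiers concrete, here is a small Python sketch of a check that is often described as k-anonymity: count how many participants share each combination of indirect identifiers and look at the smallest group. The function name, the field names and the sample records are invented for illustration and are not taken from any real study.

```python
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values for all listed quasi-identifiers.

    Returns 0 for an empty dataset (hypothetical helper).
    """
    if not records:
        return 0
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())

# Made-up participant table, not real data.
participants = [
    {"age_band": "30-39", "role": "designer", "city": "Berlin"},
    {"age_band": "30-39", "role": "designer", "city": "Berlin"},
    {"age_band": "40-49", "role": "developer", "city": "Helsinki"},
]

k = smallest_group(participants, ["age_band", "role", "city"])
print(k)  # 1: at least one participant is unique on these fields
```

A result of 1 means someone can be singled out by these fields alone. Dropping or coarsening a field (for example, leaving out the city) is one way to raise the number, at the cost of detail in the shared data.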

As for my personal practices: I specify how I will use the data, and I acquire optional written consent up front for data I want to publish (pictures, drawings, etc.).

I would love to see opensourcedesign be involved in privacy-aware sharing of open research data. I’ll keep it in mind during my research. I believe there are already initiatives for this cause that we could cooperate with or look to as an example.


(Loic Dachary) #7

@eppfel is there any Nextcloud research data published somewhere? I would be super interested to read it :slight_smile:


(Felix A. Epp) #8

Ah, sorry. I’m not really active at Nextcloud right now. You’d have to ask @jan for that.