Differentiating between privacy-sensitive and anonymous data #49

blootsvoets · 2017-04-10T07:37:02Z

We could consider processing privacy-sensitive data but encrypting it before sending it to the server. The encryption keys could be provided to researchers who are allowed to access those data (for example, using PEP). Non-sensitive data could still be sent in plain text, as we do now, so that the Kafka streams can aggregate it properly. With the PEP mechanism, those data could also be encrypted, with the Kafka streams receiving a key for only those data.

blootsvoets · 2017-04-10T08:11:51Z

I read through the PEP paper; it is based on a new encryption algorithm. It would be a complete infrastructure effort to implement this. The issue remains: everything that does not have to be shown in the dashboard or retrieved using the API could be encrypted, which would especially cover privacy-sensitive data. In all cases, we would have to trust the researchers to handle their private keys with care, though. Another note: Tresorit, for example, has a nice key exchange algorithm as well. It does not encrypt the data itself with those keys; instead, it encrypts the encryption keys for the data. That makes the data encryption less heavy, and no re-encryption of the data is needed if keys change. In any case, we'd have to properly follow their protocol (or another well-documented protocol) to avoid the pitfalls of encryption.
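To make the "encrypt the encryption keys" idea concrete, here is a minimal sketch of key wrapping. It is not Tresorit's or PEP's actual protocol. All names (`seal`, `open_sealed`, `rewrap`, `toy_encrypt`) are hypothetical, and the cipher is a stdlib-only toy built from HMAC-SHA256 so the example runs without dependencies. It also assumes symmetric per-researcher keys for simplicity; a real deployment would use a vetted AEAD and key exchange from an audited library.

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        block = b"enc" + nonce + counter.to_bytes(8, "big")
        stream += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return stream[:length]


def toy_encrypt(key, plaintext):
    """Toy authenticated cipher. ASSUMPTION: stand-in for a real AEAD such as AES-GCM."""
    nonce = secrets.token_bytes(16)
    body = bytes(p ^ s for p, s in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, b"mac" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_decrypt(key, blob):
    """Verify the tag, then decrypt; raises ValueError on a wrong key or tampering."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, b"mac" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    return bytes(c ^ s for c, s in zip(body, _keystream(key, nonce, len(body))))


def seal(record, researcher_keys):
    """Encrypt the record once with a random data key, then wrap that key per researcher."""
    data_key = secrets.token_bytes(32)
    return {
        "ciphertext": toy_encrypt(data_key, record),
        "wrapped_keys": {
            name: toy_encrypt(key, data_key) for name, key in researcher_keys.items()
        },
    }


def open_sealed(envelope, name, key):
    """Unwrap the data key with the researcher's key, then decrypt the record."""
    data_key = toy_decrypt(key, envelope["wrapped_keys"][name])
    return toy_decrypt(data_key, envelope["ciphertext"])


def rewrap(envelope, name, old_key, new_key):
    """Rotate a researcher's key: only the small wrapped key changes, not the data."""
    data_key = toy_decrypt(old_key, envelope["wrapped_keys"][name])
    envelope["wrapped_keys"][name] = toy_encrypt(new_key, data_key)
```

In this sketch a key change for one researcher only rewrites a 32-byte wrapped key, which is the property that makes re-encrypting the stored data unnecessary.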

fnobilia · 2017-04-10T08:21:35Z

If we encrypt data using a key that is unknown to the Platform, we cannot run any analysis on this data. What kind of data would you encrypt with this method?

blootsvoets · 2017-04-10T08:30:19Z

Exactly, that's the point. For example, absolute locations, IP addresses and unprocessed voice data are privacy-sensitive. We could choose to store them in encrypted form: the platform would not be able to read or process them, but would just provide them as-is in the full data extracted from HDFS. Less sensitive data, such as battery levels, we'd send unencrypted so our platform can process it. We could decide on a per-stream basis whether we want the data encrypted or unencrypted. Also, we could choose to always leave the keys unencrypted (anonymous patient ID) and encrypt only the values.
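A rough sketch of the per-stream decision, assuming JSON-serialised records. The stream names, `ENCRYPTED_STREAMS`, `prepare_record` and `placeholder_encrypt` are all hypothetical and not part of the current code base. The cipher is deliberately left to the caller; the placeholder below is not encryption and only marks where a vetted implementation would plug in.

```python
import json

# Hypothetical stream names. Which streams count as sensitive is a policy
# decision that this sketch does not settle.
ENCRYPTED_STREAMS = {"phone_location", "phone_ip_address", "phone_voice_raw"}


def prepare_record(stream, key, value, encrypt_value):
    """Return (key_bytes, value_bytes, encrypted_flag) for one record.

    The key (anonymous patient ID) always stays readable; only the value
    of a stream listed in ENCRYPTED_STREAMS is passed through encrypt_value.
    """
    key_bytes = json.dumps(key, sort_keys=True).encode("utf-8")
    value_bytes = json.dumps(value, sort_keys=True).encode("utf-8")
    if stream in ENCRYPTED_STREAMS:
        return key_bytes, encrypt_value(value_bytes), True
    return key_bytes, value_bytes, False


def placeholder_encrypt(data):
    """Stand-in only: NOT encryption. Replace with a vetted AEAD implementation."""
    return b"ENC:" + data[::-1]


if __name__ == "__main__":
    patient = {"patientId": "anon-001"}
    print(prepare_record("phone_battery_level", patient, {"level": 0.8}, placeholder_encrypt))
    print(prepare_record("phone_location", patient, {"lat": 52.37}, placeholder_encrypt))
```

Because the key stays in plain text, records could still be partitioned and grouped per patient, while the values of the listed streams would be opaque to the platform.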

blootsvoets · 2017-04-10T08:34:35Z

Another alternative is to do the data processing on another "trusted" host, where we would provide the decryption key as well. Right now, I don't think we have the budget + motivation to have this additional infrastructure cost though.

fnobilia · 2017-04-10T08:37:38Z

The vast majority of collected variables are privacy-sensitive (HR, Acc, etc.). We can absolutely design something to provide this functionality as well, but we should bring WP8 into the discussion or wait for a clear need/requirement.

blootsvoets · 2017-04-10T08:54:19Z

As long as HR and Acc are not coupled to a specific person, I'd consider them anonymised data, which would be fine to process if we don't know the identity. However, something like absolute location can be used to find someone's home and then their identity. The same goes for voice recognition and IP addresses.
