The health repos have four levels of access. The most open level is available to anyone, with no prior permission required; as such, it would hold only 100% de-identified data. (This might also be the least useful dataset, but so be it.) All other levels would require identifying the potential user via a login and authentication process. Data within these levels would be a “club good,” with a club membership agreement that would strive to bind club members to a specified set of norms of behavior and reuse.
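To make the tiering concrete, here is a minimal sketch of how a repo might check access. It rests on assumptions of mine: only the open level is described above, so the names `LEVEL_1` through `LEVEL_3`, the `User` fields, and the idea that a higher level includes the lower ones are all hypothetical placeholders, not a specification.

```python
from dataclasses import dataclass
from enum import IntEnum


class AccessLevel(IntEnum):
    # Only OPEN is described in the text; the other three names are
    # hypothetical placeholders for the remaining levels.
    OPEN = 0
    LEVEL_1 = 1
    LEVEL_2 = 2
    LEVEL_3 = 3


@dataclass
class User:
    authenticated: bool = False      # passed login and authentication
    signed_agreement: bool = False   # accepted the club membership agreement
    granted_level: AccessLevel = AccessLevel.OPEN


def can_access(user: User, required: AccessLevel) -> bool:
    """Return True if the user may read data at the required level."""
    # The open level holds only de-identified data and needs no permission.
    if required == AccessLevel.OPEN:
        return True
    # Every other level is a club good: identify the user, bind them to
    # the agreement, and (assumption) treat the levels as ordered.
    return (
        user.authenticated
        and user.signed_agreement
        and user.granted_level >= required
    )
```

An anonymous visitor passes `can_access(User(), AccessLevel.OPEN)` but fails for any other level, while a member who has logged in, signed the agreement, and been granted `LEVEL_2` can also read `LEVEL_1` data under the ordering assumption.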
The citizens would give their “consent to be governed,”1 but the important thing to keep in mind is that for me, the citizen, to give my consent to be governed by someone else, perhaps a collective acting on my behalf, I first have to hold the rights over whatever I am giving away.2 This is a fundamental concept of rights-giving that most folks don’t think about. For me to let you use something, I have to own that something first. In this case, I have to own my data before I can give you, the researcher, my consent to use it. A related concept: just because I am giving you my consent to use my data, I am not giving you ownership of it.
Governance. As noted earlier, we have to establish clear ownership of data, for only then can we figure out who can give permission or consent to the use of that data. In the case of research, if the collected data are from non-human subjects or non-identifiable human subjects, then data ownership rests with the one who collects the data, usually the researcher. Keep in mind that the researcher may have employed others in her team to collect data, and that the researcher herself may be subject to a transfer of data ownership via work-for-hire clauses with her funder or employer. In the case of identifiable human subjects, data ownership resides with the source of the data, that is, the humans from whom the data have been collected, irrespective of who has done the collecting. Once we have established that ownership, certain rights have to be non-exclusively transferred to a governing body in charge of the repo that would act on behalf of the owners of the data contributed to the repo.
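The ownership rule above can be written as a small decision function. This is only a sketch under my own simplifying assumptions: the `Dataset` fields and the returned labels are hypothetical, and real work-for-hire arrangements would need far more nuance than a single optional field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    human_subjects: bool                       # collected from humans?
    identifiable: bool                         # can the humans be identified?
    collector: str                             # usually the researcher
    work_for_hire_party: Optional[str] = None  # funder or employer, if any


def data_owner(ds: Dataset) -> str:
    """Return who holds ownership, following the rule in the text."""
    # Identifiable human subjects own their data, no matter who collected it.
    if ds.human_subjects and ds.identifiable:
        return "data subjects"
    # Otherwise ownership rests with the collector, unless a work-for-hire
    # clause has transferred it to the funder or employer.
    return ds.work_for_hire_party or ds.collector
```

For example, `data_owner(Dataset(True, True, "researcher"))` yields `"data subjects"`, while a de-identified dataset collected under a work-for-hire clause resolves to the funder or employer. Whoever the function names is the party that can then non-exclusively transfer rights to the governing body of the repo.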
Business model. The repos will have to be funded. But they would strive to do two things: one, they would solely represent the welfare of the data contributors, and do so in a non-partisan, non-governmental, non-profit manner; and two, they would make data available via APIs so that anyone who has permission to access and use that data (remember my four levels of use) would have the freedom to add value-added services and products and monetize them. In other words, the data itself shall be freely accessible and usable to the maximum level of freedom by anyone authorized to do so, and any monetization would have to happen through innovative products and services added on top of that data. This monetization would, in part, help sustain the infrastructure services and products of the commons.3