Cases
This document is still a work in progress.
Case Study Research
In case-study research, cases represent discrete instances of a phenomenon that inform the researcher about it. The cases are not the subjects of inquiry, and instead represent unique sets of circumstances that frame or contextualize the phenomenon of interest (Stake 2006: 2).
Cases usually share common reference to the overall research themes, but exhibit variations that enable a researcher to capture different outlooks or perspectives on matters of common concern. Drawing from multiple cases thus enables comprehensive coverage of a broad topic that no single case may cover on its own (Stake 2006: 23). In other words, cases are contexts that ascribe particular local flavours to the activities I trace, and which I must consider to account fully for the range of motivations, circumstances and affordances that back decisions to perform activities and to implement them in specific ways.
Moreover, the power of case study research derives from identifying consistencies that relate cases to each other, while simultaneously highlighting how their unique and distinguishing facets contribute to their representativeness of the underlying phenomenon. Case study research therefore plays on the tensions that challenge relationships among cases and the phenomenon that they are being called upon to represent (Ragin 1999: 1139-1140).
Case study research limits my ability to derive generalized findings across the whole field of epidemiology. My intent is instead to articulate some significant aspects of data harmonization as they are represented in the cases accessible through this study. In other words, I aim to make certain under-appreciated social and collaborative commitments that underlie data-sharing initiatives more visible, and to draw greater attention to certain sensibilities, attitudes, and apprehensions that are relevant to contemporary discourse on the nature of epidemiological data and the ongoing development of information infrastructures designed to support data integration and re-use.
Key Factors
To reiterate, this project investigates the social and collaborative apparatus that scaffolds data-sharing initiatives in epidemiology. By analyzing interviews with relevant stakeholders attached to data-sharing initiatives, the project will ascertain the actions taken and challenges experienced to mediate the varied motivations, needs and values of those involved. In effect, the project aims to articulate the collaborative commitments that govern the constitution and maintenance of epidemiological information commons, and to relate these to technological, administrative and epistemic factors.
Here I outline some key factors that will guide the selection of cases so as to ensure that the project meaningfully addresses its goals.
1. Longevity
Initiatives that have existed for different durations of time will have different capacities to reflect on their practices. Younger projects will not have had as much of a chance to produce research outcomes, but may be valuable sources of insight on expectations. More established projects will be able to reflect on unexpected challenges they may have experienced.
It will be good to have at least one younger project representing an initiative still “in flux”, one or two “legacy” projects (no longer active), and one or two at intermediate stages (extracting data for meaningful analysis, expanding the initiative’s scope, etc.).
2. Community membership
The size and composition of the community, the degree of familiarity among its members, and the mechanisms through which connections are managed constitute additional important factors to consider. Communication and decision-making may take different forms when teams are either smaller and locally-concentrated or larger and dispersed. Decision-making may also be significantly impacted by different governance models and degrees of community participation. It would be interesting to identify how leaders are differentiated from other participants, the norms and expectations for getting involved in leadership positions, and the considerations that are made when making decisions that impact the community.
3. Support structures
Data-sharing may be supported by diverse funding models or tech stacks, which may significantly impact how the work progresses. Comparing sources of support for data-sharing will help me to explore how data-sharing is either integrated into “normal” science or supplemented as a distinct outgrowth of it.
Specifically, it will be interesting to compare the extent to which projects are left to cobble together their own data-sharing infrastructure, and how this impacts attitudes and norms regarding the curation and nature of research data. I wonder whether lack of government support fosters creative, entrepreneurial, experimental or community-led models, how funding is provided to support the development of collaborative research networks, and how these feed back into norms and attitudes regarding the independence of individual research projects and the formation of collectively-maintained information commons.
I expect a tendency for cases to be supported by limited-term, federally-funded grants, though it might be worth exploring how supplementary funding provided by non-government agencies, including private firms (through MITACS, for instance) and philanthropic organizations (such as the Gates Foundation), impacts the work. I would therefore like to include cases funded through these kinds of initiatives in this project.
4. Disciplinary trends
Data-sharing is undoubtedly impacted by attitudes concerning the nature of data and their roles in scientific knowledge production, and it is therefore necessary to account for different perspectives. Although I am still somewhat unfamiliar with the diversity of thought on such matters in epidemiology, I intuit that much of the open science movement is driven by a rather positivist attitude. I would like to include cases that take on alternative approaches to science.
5. Historical or contextual factors
Science is beholden to political trends, which impact the ability to obtain funding and collaborate across borders (e.g. Brexit’s impact on trans-European funding, including initiatives to attract and retain talent). Moreover, certain events, such as the Covid-19 pandemic, trigger responses in the scientific community. It may be interesting to explore how these events effect change in either the short or long term.
6. Kinds of data
The nature of the data will surely impact how they are shared. In epidemiology specifically, there are ethical limitations on sharing precise patient records. This may be especially salient in studies focusing on health in Indigenous populations. Moreover, controls on data collection procedures, including limited or controlled scope or decisions to account for specific factors (such as race, which is prevalent in American datasets but largely ignored elsewhere), may significantly impact what can be done with the data when integrated at scale.
Selecting Cases
Since a significant aspect of this work is to compare different approaches to data-sharing that have not yet been systematically articulated, it will be necessary to loosely define the parameters through which each case will be initially characterized. I will rely on structured consultations with the research community to make sense of the data-sharing landscape and select cases accordingly. By consulting with key stakeholders, I will arrive at a consensus about which cases are worth approaching while documenting the rationale behind these selections.
The consultation process is meant to ensure that case selection adheres to community will and reasoning, while also ensuring that cases are logistically feasible. I will therefore ask for input from leading members of epidemiological data-sharing initiatives who are familiar with the goals of this project, and who are involved with the Maelstrom Project, which establishes logistical boundaries around the scope of the project.
Fixed cases
Maelstrom will serve as a “fixed point” that limits the scope of the cases’ breadth, while also ensuring that participants (and I) have a common frame of reference. Moreover, the practices and values that support Maelstrom’s operations have already been documented to a certain extent by its leaders (cf. Doiron et al. 2017; Fortier et al. 2017; Fortier et al. 2023; Bergeron et al. 2018), by its partners (cf. Doiron et al. 2013; Wey et al. 2021; Bergeron et al. 2021) and by scholars of scientific practice (cf. M. J. Murtagh et al. 2012; Demir and Murtagh 2013; Madeleine J. Murtagh et al. 2016; Tacconelli et al. 2022; Gedeborg et al. 2023). This prior work will serve as a valuable resource supporting this project.
Additionally, the fact that all cases interact with Maelstrom for their technical infrastructure will greatly simplify the interviews by reducing the “overhead” of having to learn or be told about the technical systems, which may distract from the primary themes I seek to address during interviews.
CITF will also serve as a fixed case. This is partly for logistical reasons, since the grant is meant to support the CITF Databank, and this project will align with concurrent research on user experiences pertaining to CITF specifically. At the same time, CITF is relevant to the project’s objectives in its own right, and will contribute meaningful insight in comparison with other cases.
Logistical Constraints and Sources of Bias
After identifying potential cases, I will reach out to project leaders to invite them to participate. I will prepare a document outlining this project’s objectives and the roles that cases will play in the work. I will also set up a meeting before they decide whether to participate, so that I can ascertain whether they understand the project and help determine who may sit for interviews (I expect to hold 12-15 interviews, each lasting 60 to 90 minutes).
I may prioritize local connections, which provide favourable conditions for holding interviews (i.e., people are more willing to show things that cannot be conveyed through a screen, and the pre- and post-interview phases provide meaningful insight). This may introduce bias in that I may obtain more in-depth and nuanced information from local initiatives than from those occurring abroad. This can be mitigated by travelling to conduct interviews in person; however, the costs of travel may introduce their own biases favouring cases that are easier to reach.