An exploratory mixed methods study using factor analysis and latent class analysis
I view our role as epidemiologists as continually discovering and learning methodologies that are applicable to public health (or clinical) problems, and I encourage my students to take methods courses rather than content area courses, to be generalists in their education, emerging from the program as applied methodologists. I've been particularly interested in learning qualitative study design, factor analysis, and latent class analysis for some time. Through my work at a local hospital with clinical fellows, I became involved in a project that allowed me to do all three at once.
The project centers on defining treatment futility in a clinical setting. The fellow recruited about 200 healthcare providers who work in an intensive care unit (ICU) at the hospital and asked them an open-ended question: "How do you define futility?" Based on the responses, this fellow and another physician came up with 13 thematic categories (constructs of futility) that captured the qualitative responses; these categories were based on best guesses and visual inspection of the answers. Both physicians coded all 200 responses so that we could assess kappa for inter-rater agreement, and each respondent's answer fit under one or more of the 13 thematic categories. The figure below is a snapshot of what the data look like. As you can see, it is just a 0/1 coding for each category, indicating whether the respondent defined futility accordingly.
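As a minimal sketch of the agreement check, Cohen's kappa for a single category can be computed in base R from the two raters' 0/1 codes. The data here are simulated, and the names `rater1`, `rater2`, and `cohen_kappa` are hypothetical; in practice the calculation would be repeated for each of the 13 categories.

```r
# Simulated 0/1 codes from two raters for ONE category (not the study data)
set.seed(1)
n      <- 200
rater1 <- rbinom(n, 1, 0.3)
flip   <- rbinom(n, 1, 0.1)                       # rater 2 disagrees about 10% of the time
rater2 <- ifelse(flip == 1, 1 - rater1, rater1)

# Cohen's kappa: agreement beyond what chance alone would produce
cohen_kappa <- function(a, b) {
  lv  <- sort(union(a, b))
  tab <- table(factor(a, levels = lv), factor(b, levels = lv))
  p   <- tab / sum(tab)
  po  <- sum(diag(p))                  # observed agreement
  pe  <- sum(rowSums(p) * colSums(p))  # agreement expected by chance
  (po - pe) / (1 - pe)
}

kappa_hat <- cohen_kappa(rater1, rater2)
print(round(kappa_hat, 2))
```

Kappa is undefined when chance agreement equals 1 (for example, a category that neither rater ever used), so very rare categories need a separate look.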
At this point, the fellow could have published the findings as a qualitative study, drawing in some key quotes from the respondents and visual inspection of the data (for example, maybe futility was defined consistently by provider role: e.g., nurse vs attending vs resident). But there is also an opportunity to use quantitative methods to answer a few important questions, strengthening the research:
There are several ways that qualitative and quantitative studies can inform each other in mixed methods analyses (a.k.a. multimethod analyses). One way is to use qualitative research to identify unknown confounders for a subsequent quantitative study. Another way is to test key assumptions or results of qualitative work by adding a statistical component. This work falls under the latter approach. What we want to avoid is qualitative research that merely duplicates the quantitative results, since that information is already captured in the quantitative analysis. The NIH publishes a set of best practices for mixed methods research in health science.
To address the two questions from earlier, we can turn to quantitative methods as follows:
This analysis can readily be accomplished in R using two functions: factanal (in the "stats" package) to perform the FA and poLCA (in the "poLCA" package) to perform the LCA. I am also just scratching the surface of each of these two techniques; the point is to introduce these concepts and their application, not necessarily to master them.
The factor analysis will essentially tell us which variables hang together in groups. The number of groups (or factors) is informed by statistics as well as interpretability. That is, it is not enough for a solution to make statistical sense; it also needs to be interpretable. We'll focus on the following concepts in the input and output from the factanal function:
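For orientation, here is a minimal sketch of a factanal call on 0/1 category codes. The data are simulated, the column names (`cat1` to `cat8`) and the helper `make_item` are hypothetical, and the choice of two factors with a varimax rotation is an assumption for illustration rather than the settings used in this project. Note that factanal works from ordinary Pearson correlations, which is a simplification for binary variables.

```r
set.seed(42)
n  <- 200
f1 <- rnorm(n)   # first hypothetical latent construct
f2 <- rnorm(n)   # second hypothetical latent construct

# Turn a latent score plus noise into a 0/1 code
make_item <- function(latent) as.integer(0.9 * latent + 0.6 * rnorm(n) > 0)

codes <- data.frame(
  cat1 = make_item(f1), cat2 = make_item(f1), cat3 = make_item(f1),
  cat4 = make_item(f2), cat5 = make_item(f2), cat6 = make_item(f2),
  cat7 = rbinom(n, 1, 0.5), cat8 = rbinom(n, 1, 0.5)  # unrelated to either construct
)

fa <- factanal(codes, factors = 2, rotation = "varimax")

print(fa$loadings, cutoff = 0.3)  # which categories hang together on each factor
print(round(fa$uniquenesses, 2))  # near 1: variance the factors do not explain
print(fa$PVAL)                    # test of whether 2 factors are sufficient
```

In this simulated example, `cat7` and `cat8` should show uniquenesses close to 1 and no large loadings, which is the kind of pattern that would flag a category as non-contributory.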
The factor analysis proceeds in an iterative fashion, adding or removing factors, and possibly including or excluding some of the categories of futility if they are non-contributory in the overall data (i.e., unnecessary). It's an art. Results from my FA suggested a few things:
As mentioned, the LCA will group respondents together into latent classes based on similarity in defining futility. Informed by the factor analysis, I decided to omit the four categories that did not meaningfully contribute to the definition of futility. I also did not use the factors themselves, as this did not yield new information (as assessed by the loadings being very high for a single variable under a given factor); rather, I modeled the nine remaining futility categories. The following input and output can help drive the LCA:
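A hedged sketch of how such a fit might look with poLCA follows. The helper names `relative_entropy` and `compare_lca`, the simulated data, and the settings (`nrep = 5`, up to four classes) are assumptions for illustration, not the project's actual code. poLCA expects manifest variables coded as integers starting at 1, so the 0/1 codes are shifted by one. poLCA reports AIC and BIC directly; relative entropy is computed here from the posterior class probabilities.

```r
# Relative entropy from an N x K matrix of posterior class probabilities:
# 1 = classes perfectly separated, 0 = no separation at all. Base R only.
relative_entropy <- function(post) {
  plogp <- ifelse(post > 0, post * log(post), 0)
  1 + sum(plogp) / (nrow(post) * log(ncol(post)))
}

# Fit LCAs with 2..max_k classes and collect fit statistics.
# `dat` holds the category columns coded 0/1.
compare_lca <- function(dat, max_k = 4, nrep = 5) {
  dat1 <- as.data.frame(lapply(dat, function(x) as.integer(x) + 1L))
  f <- as.formula(paste0("cbind(", paste(names(dat1), collapse = ","), ") ~ 1"))
  fits <- lapply(2:max_k, function(k)
    poLCA::poLCA(f, data = dat1, nclass = k, nrep = nrep, verbose = FALSE))
  data.frame(
    nclass          = 2:max_k,
    aic             = sapply(fits, function(m) m$aic),
    bic             = sapply(fits, function(m) m$bic),
    entropy         = sapply(fits, function(m) relative_entropy(m$posterior)),
    min_class_share = sapply(fits, function(m) min(m$P))  # smallest class size
  )
}

if (requireNamespace("poLCA", quietly = TRUE)) {
  set.seed(7)
  # Simulated 0/1 codes for nine hypothetical categories (not the study data)
  dat <- as.data.frame(matrix(rbinom(200 * 9, 1, 0.3), ncol = 9,
                              dimnames = list(NULL, paste0("cat", 1:9))))
  print(compare_lca(dat))
}
```

For a chosen model, the fitted object's `$probs` element holds the item-response probabilities per class and `$predclass` holds each respondent's most likely class. The simulated data above have no real class structure, so the resulting statistics only demonstrate the mechanics.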
As with the factor analysis, we proceed in an iterative fashion, adding or removing latent classes from the LCA. We want not only statistical evidence for our model selection, but also a theoretical basis, meaning interpretability of the results. A difficult question to answer is, "How many latent classes are appropriate for my data?" Some guidance is offered here. There are a variety of strategies to use, including model fit criteria (AIC/BIC) as well as class membership probabilities. Ultimately we chose a three-class solution, which optimized the BIC and entropy and did not result in excessive 0/1 probabilities. That last point matters because, as classes are added, they eventually become so unique that you lose the original intention of the LCA. The figure below demonstrates the class membership in our solution:
Latent class one (on left) seemed to define futility mainly by constructs 8 and 9; latent class two (middle) seemed to define futility by constructs 3, 5, and 6; and latent class three (on right) seemed to define futility by constructs 2 and 3. One way to interpret this output is that there are three general groups of individuals in the ICU who define futility in similar ways according to their class membership. Returning to the earlier question, "Do respondents define futility similarly by some common characteristic?", we can now assess class membership by sociodemographic and worker characteristics by joining the predicted class membership to the original survey data. Data can be inspected visually, by plotting graphs, or statistically, by assessing frequency tables. See below for the proportional breakdown of latent class membership (y-axis) by provider role in the ICU (x-axis): attending physicians, fellows, residents, nurse practitioners, and nurses.
We see that attending physicians and fellows tended to be least likely to define futility according to the latent class one constructs, while nurses were least likely to define futility according to the latent class two constructs. We can also compare and contrast provider role differences by latent class membership.
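The join and the proportional breakdown by provider role can be sketched in base R as follows. Everything here is simulated: the data frames `survey` and `classes`, their column names, and the random class assignments are hypothetical stand-ins for the survey data and for the predicted classes from the fitted LCA (for example, poLCA's `$predclass`).

```r
set.seed(3)
roles <- c("Attending", "Fellow", "Resident", "NP", "Nurse")

# Hypothetical survey data and predicted latent classes, keyed by respondent id
survey  <- data.frame(id = 1:200,
                      role = factor(sample(roles, 200, replace = TRUE), levels = roles))
classes <- data.frame(id = 1:200,
                      latent_class = sample(1:3, 200, replace = TRUE))

# Join predicted class membership to the survey data
joined <- merge(survey, classes, by = "id")

# Frequency table, then proportions within each provider role
counts <- table(class = joined$latent_class, role = joined$role)
props  <- prop.table(counts, margin = 2)  # each role's column sums to 1
print(round(props, 2))
```

Passing `props` to `barplot()` gives a stacked bar chart with provider role on the x-axis and the proportion in each latent class on the y-axis.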