Our paper, ‘“How Did We Get Here?”: Topic Drift in Online Health Discussions’, has been accepted to the Journal of Medical Internet Research.
Albert Park, Andrea Hartzler, Jina Huh, Gary Hsieh, David McDonald, Wanda Pratt. “How Did We Get Here?”: Topic Drift in Online Health Discussions. J Med Internet Res (forthcoming). doi:10.2196/jmir.6297 http://dx.doi.org/10.2196/jmir.6297
Background: Patients increasingly use online health communities to exchange health information and peer support. During the progression of health discussions a change of topic—topic drift—can occur. Topic drift is a frequently occurring phenomenon that is linked to incoherence and frustration in online communities and other forms of computer-mediated communication. For sensitive topics, such as health, such drift could have life-altering repercussions, yet topic drift has not been studied in these contexts.
Objective: Our goals were to understand topic drift in online health communities, and then to develop and evaluate an automated approach to detect both topic drift and efforts of community members to counteract such drift.
Methods: We manually analyzed 721 posts from 184 threads from seven online health communities within WebMD to understand topic drift, members’ reaction towards topic drift, and their effort to counteract topic drift. Then, we developed an automated approach to detect topic drift and counteraction effort. We detected topic drift by calculating cosine similarity between 229,156 posts from 37,805 threads and measuring change of cosine similarity scores from the threads’ first posts to their sequential posts. Using a similar approach, we detected counteractions to topic drift in threads by focusing on the irregular increase of similarity scores compared to the previous post in threads. Finally, we evaluated the performance of our automated approaches to detect topic drift and counteracting efforts by using a manually-developed gold standard.
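To make the similarity-based detection described in the Methods concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how posts were vectorized or which thresholds were used, so the raw term-frequency vectors, the function names, and the `threshold` and `min_jump` values below are all hypothetical. It also reads "irregular increase of similarity scores compared to the previous post" as a jump in similarity to the first post relative to the preceding post's score, which is one possible interpretation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_to_first(posts):
    """Similarity of every later post to the thread's first post.

    scores[i] belongs to posts[i + 1].
    """
    vectors = [Counter(tokenize(p)) for p in posts]
    return [cosine_similarity(vectors[0], v) for v in vectors[1:]]


def flag_drift(scores, threshold=0.1):
    """Indices (into posts) whose similarity to the first post is below threshold.

    The threshold is an assumed value chosen for illustration only.
    """
    return [i + 1 for i, s in enumerate(scores) if s < threshold]


def flag_counteraction(scores, min_jump=0.3):
    """Indices (into posts) whose score rises by at least min_jump over the previous post.

    The min_jump value is an assumed value chosen for illustration only.
    """
    return [i + 1 for i in range(1, len(scores))
            if scores[i] - scores[i - 1] >= min_jump]


# A made-up four-post thread, not data from the study.
posts = [
    "my migraine medication stopped working",
    "which migraine medication were you taking",
    "we adopted a puppy last week",
    "back to the migraine medication question",
]
scores = similarity_to_first(posts)
print(flag_drift(scores))          # [2]: the puppy post shares no terms with the first post
print(flag_counteraction(scores))  # [3]: the last post returns to the original topic
```

In practice a weighting scheme such as TF-IDF and thresholds tuned against a gold standard would likely replace the raw counts and fixed cutoffs used here.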
Results: Our qualitative analyses revealed that in threads of online health communities, topics change gradually, but usually stay within the global frame of topics for the specific community. Members showed frustration when topic drift occurred in the middle of threads, but reacted positively to off-topic stories shared as separate threads. Although all types of members helped to counteract topic drift, original posters provided the most effort to keep threads on topic. Cosine similarity scores show promise for automatically detecting topical changes in online health discussions. In our manual evaluation, we achieved F1-scores of .71 and .73 for detecting topic drift and counteracting efforts to stay on topic, respectively.
Conclusions: Our analyses expand our understanding of topic drift in a health context and highlight practical implications, such as promoting off-topic discussions as a function of building rapport in online health communities. Furthermore, the quantitative findings suggest that an automated tool could help detect topic drift, support counteraction efforts to bring the conversation back on topic, and improve communication in these important communities. Findings from this study have the potential to reduce topic drift and improve online health community members’ experience of computer-mediated communication.