Title |
Traditional and GenAI Text Analysis of COVID-19 Pandemic Trends in Hospital Community Benefits IRS Documentation |
DOI |
10.6339/24-JDS1144 |
Authors |
Emily Hadley; Laura Marcial; Wes Quattrone; Georgiy Bobashev |
Keywords |
generative artificial intelligence ; hospital administration ; natural language processing ; text mining |
Journal |
Journal of Data Science |
Volume and Issue / Publication Date |
Vol. 22, No. 3 (2024/07/01) |
Pages |
393 - 408 |
Language |
English |
Abstract |
The coronavirus disease 2019 (COVID-19) pandemic presented unique challenges to the U.S. healthcare system, particularly for nonprofit U.S. hospitals that are obligated to provide community benefits in exchange for federal tax exemptions. We sought to examine how hospitals initiated, modified, or disbanded community benefits programming in response to the COVID-19 pandemic. We used the free-response text in Part IV of Internal Revenue Service (IRS) Form 990 Schedule H (F990H) to assess health equity and disparities. We combined traditional key term frequency and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) clustering approaches with a novel Generative Pre-trained Transformer (GPT) 3.5 summarization approach. Our research reveals shifts in community benefits programming. We observed an increase in COVID-related terms starting in the 2019 tax year, indicating a pivot in community focus and efforts toward pandemic-related activities such as telehealth services and COVID-19 testing and prevention. The clustering analysis identified themes related to COVID-19 and community benefits. Generative Artificial Intelligence (GenAI) summarization with GPT3.5 contextualized these changes, revealing examples of healthcare system adaptations and program cancellations. However, GPT3.5 also encountered some accuracy and validation challenges. This multifaceted text analysis underscores the adaptability of hospitals in maintaining community health support during crises and suggests the potential of advanced AI tools in evaluating large-scale qualitative data for policy and public health research. |
Subject Classification |
Basic and Applied Sciences > Information Science; Basic and Applied Sciences > Statistics |