Automatic quantitative stroke severity assessment based on Chinese clinical named entity recognition with domain-adaptive pre-trained large language model

Zhanzhong Gu; Xiangjian He; Ping Yu; Wenjing Jia; Xiguang Yang; Gang Peng; Penghui Hu; Shiyan Chen; Hongjie Chen; Yiguang Lin

doi:10.1016/j.artmed.2024.102822

Automatic quantitative stroke severity assessment based on Chinese clinical named entity recognition with domain-adaptive pre-trained large language model

Zhanzhong Gu, Xiangjian He, Ping Yu, Wenjing Jia, Xiguang Yang, Gang Peng, Penghui Hu, Shiyan Chen, Hongjie Chen, Yiguang Lin

School of Computer Science

Research output: Journal Publication › Article › peer-review

2 Citations (Scopus)

Abstract

Background: Stroke is a prevalent disease with a significant global impact. Effective assessment of stroke severity is vital for an accurate diagnosis, appropriate treatment, and optimal clinical outcomes. The National Institutes of Health Stroke Scale (NIHSS) is a widely used scale for quantitatively assessing stroke severity. However, the current manual scoring of NIHSS is labor-intensive, time-consuming, and sometimes unreliable. Applying artificial intelligence (AI) techniques to automate the quantitative assessment of stroke on vast amounts of electronic health records (EHRs) has attracted much interest. Objective: This study aims to develop an automatic, quantitative stroke severity assessment framework through automating the entire NIHSS scoring process on Chinese clinical EHRs. Methods: Our approach consists of two major parts: Chinese clinical named entity recognition (CNER) with a domain-adaptive pre-trained large language model (LLM) and automated NIHSS scoring. To build a high-performing CNER model, we first construct a stroke-specific, densely annotated dataset “Chinese Stroke Clinical Records” (CSCR) from EHRs provided by our partner hospital, based on a stroke ontology that defines semantically related entities for stroke assessment. We then pre-train a Chinese clinical LLM coined “CliRoberta” through domain-adaptive transfer learning and construct a deep learning-based CNER model that can accurately extract entities directly from Chinese EHRs. Finally, an automated, end-to-end NIHSS scoring pipeline is proposed by mapping the extracted entities to relevant NIHSS items and values, to quantitatively assess the stroke severity. Results: Results obtained on a benchmark dataset CCKS2019 and our newly created CSCR dataset demonstrate the superior performance of our domain-adaptive pre-trained LLM and the CNER model, compared with the existing benchmark LLMs and CNER models. The high F1 score of 0.990 ensures the reliability of our model in accurately extracting the entities for the subsequent automatic NIHSS scoring. Subsequently, our automated, end-to-end NIHSS scoring approach achieved excellent inter-rater agreement (0.823) and intraclass consistency (0.986) with the ground truth and significantly reduced the processing time from minutes to a few seconds. Conclusion: Our proposed automatic and quantitative framework for assessing stroke severity demonstrates exceptional performance and reliability through directly scoring the NIHSS from diagnostic notes in Chinese clinical EHRs. Moreover, this study also contributes a new clinical dataset, a pre-trained clinical LLM, and an effective deep learning-based CNER model. The deployment of these advanced algorithms can improve the accuracy and efficiency of clinical assessment, and help improve the quality, affordability and productivity of healthcare services.

Original language	English
Article number	102822
Journal	Artificial Intelligence in Medicine
Volume	150
DOIs	https://doi.org/10.1016/j.artmed.2024.102822
Publication status	Published - Apr 2024

Keywords

Automatic stroke severity assessment
Chinese electronic health records
Clinical named entity recognition
Domain-adaptive pre-training
Large language model

ASJC Scopus subject areas

Medicine (miscellaneous)
Artificial Intelligence

Access to Document

10.1016/j.artmed.2024.102822

Cite this

@article{5e5bb869c35049ba92b59baf79dd9bb3,

title = "Automatic quantitative stroke severity assessment based on Chinese clinical named entity recognition with domain-adaptive pre-trained large language model",

abstract = "Background: Stroke is a prevalent disease with a significant global impact. Effective assessment of stroke severity is vital for an accurate diagnosis, appropriate treatment, and optimal clinical outcomes. The National Institutes of Health Stroke Scale (NIHSS) is a widely used scale for quantitatively assessing stroke severity. However, the current manual scoring of NIHSS is labor-intensive, time-consuming, and sometimes unreliable. Applying artificial intelligence (AI) techniques to automate the quantitative assessment of stroke on vast amounts of electronic health records (EHRs) has attracted much interest. Objective: This study aims to develop an automatic, quantitative stroke severity assessment framework through automating the entire NIHSS scoring process on Chinese clinical EHRs. Methods: Our approach consists of two major parts: Chinese clinical named entity recognition (CNER) with a domain-adaptive pre-trained large language model (LLM) and automated NIHSS scoring. To build a high-performing CNER model, we first construct a stroke-specific, densely annotated dataset “Chinese Stroke Clinical Records” (CSCR) from EHRs provided by our partner hospital, based on a stroke ontology that defines semantically related entities for stroke assessment. We then pre-train a Chinese clinical LLM coined “CliRoberta” through domain-adaptive transfer learning and construct a deep learning-based CNER model that can accurately extract entities directly from Chinese EHRs. Finally, an automated, end-to-end NIHSS scoring pipeline is proposed by mapping the extracted entities to relevant NIHSS items and values, to quantitatively assess the stroke severity. Results: Results obtained on a benchmark dataset CCKS2019 and our newly created CSCR dataset demonstrate the superior performance of our domain-adaptive pre-trained LLM and the CNER model, compared with the existing benchmark LLMs and CNER models. The high F1 score of 0.990 ensures the reliability of our model in accurately extracting the entities for the subsequent automatic NIHSS scoring. Subsequently, our automated, end-to-end NIHSS scoring approach achieved excellent inter-rater agreement (0.823) and intraclass consistency (0.986) with the ground truth and significantly reduced the processing time from minutes to a few seconds. Conclusion: Our proposed automatic and quantitative framework for assessing stroke severity demonstrates exceptional performance and reliability through directly scoring the NIHSS from diagnostic notes in Chinese clinical EHRs. Moreover, this study also contributes a new clinical dataset, a pre-trained clinical LLM, and an effective deep learning-based CNER model. The deployment of these advanced algorithms can improve the accuracy and efficiency of clinical assessment, and help improve the quality, affordability and productivity of healthcare services.",

keywords = "Automatic stroke severity assessment, Chinese electronic health records, Clinical named entity recognition, Domain-adaptive pre-training, Large language model",

author = "Zhanzhong Gu and Xiangjian He and Ping Yu and Wenjing Jia and Xiguang Yang and Gang Peng and Penghui Hu and Shiyan Chen and Hongjie Chen and Yiguang Lin",

note = "Publisher Copyright: {\textcopyright} 2024 The Author(s)",

year = "2024",

month = apr,

doi = "10.1016/j.artmed.2024.102822",

language = "English",

volume = "150",

journal = "Artificial Intelligence in Medicine",

issn = "0933-3657",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - Automatic quantitative stroke severity assessment based on Chinese clinical named entity recognition with domain-adaptive pre-trained large language model

AU - Gu, Zhanzhong

AU - He, Xiangjian

AU - Yu, Ping

AU - Jia, Wenjing

AU - Yang, Xiguang

AU - Peng, Gang

AU - Hu, Penghui

AU - Chen, Shiyan

AU - Chen, Hongjie

AU - Lin, Yiguang

PY - 2024/4

Y1 - 2024/4

N2 - Background: Stroke is a prevalent disease with a significant global impact. Effective assessment of stroke severity is vital for an accurate diagnosis, appropriate treatment, and optimal clinical outcomes. The National Institutes of Health Stroke Scale (NIHSS) is a widely used scale for quantitatively assessing stroke severity. However, the current manual scoring of NIHSS is labor-intensive, time-consuming, and sometimes unreliable. Applying artificial intelligence (AI) techniques to automate the quantitative assessment of stroke on vast amounts of electronic health records (EHRs) has attracted much interest. Objective: This study aims to develop an automatic, quantitative stroke severity assessment framework through automating the entire NIHSS scoring process on Chinese clinical EHRs. Methods: Our approach consists of two major parts: Chinese clinical named entity recognition (CNER) with a domain-adaptive pre-trained large language model (LLM) and automated NIHSS scoring. To build a high-performing CNER model, we first construct a stroke-specific, densely annotated dataset “Chinese Stroke Clinical Records” (CSCR) from EHRs provided by our partner hospital, based on a stroke ontology that defines semantically related entities for stroke assessment. We then pre-train a Chinese clinical LLM coined “CliRoberta” through domain-adaptive transfer learning and construct a deep learning-based CNER model that can accurately extract entities directly from Chinese EHRs. Finally, an automated, end-to-end NIHSS scoring pipeline is proposed by mapping the extracted entities to relevant NIHSS items and values, to quantitatively assess the stroke severity. Results: Results obtained on a benchmark dataset CCKS2019 and our newly created CSCR dataset demonstrate the superior performance of our domain-adaptive pre-trained LLM and the CNER model, compared with the existing benchmark LLMs and CNER models. The high F1 score of 0.990 ensures the reliability of our model in accurately extracting the entities for the subsequent automatic NIHSS scoring. Subsequently, our automated, end-to-end NIHSS scoring approach achieved excellent inter-rater agreement (0.823) and intraclass consistency (0.986) with the ground truth and significantly reduced the processing time from minutes to a few seconds. Conclusion: Our proposed automatic and quantitative framework for assessing stroke severity demonstrates exceptional performance and reliability through directly scoring the NIHSS from diagnostic notes in Chinese clinical EHRs. Moreover, this study also contributes a new clinical dataset, a pre-trained clinical LLM, and an effective deep learning-based CNER model. The deployment of these advanced algorithms can improve the accuracy and efficiency of clinical assessment, and help improve the quality, affordability and productivity of healthcare services.

AB - Background: Stroke is a prevalent disease with a significant global impact. Effective assessment of stroke severity is vital for an accurate diagnosis, appropriate treatment, and optimal clinical outcomes. The National Institutes of Health Stroke Scale (NIHSS) is a widely used scale for quantitatively assessing stroke severity. However, the current manual scoring of NIHSS is labor-intensive, time-consuming, and sometimes unreliable. Applying artificial intelligence (AI) techniques to automate the quantitative assessment of stroke on vast amounts of electronic health records (EHRs) has attracted much interest. Objective: This study aims to develop an automatic, quantitative stroke severity assessment framework through automating the entire NIHSS scoring process on Chinese clinical EHRs. Methods: Our approach consists of two major parts: Chinese clinical named entity recognition (CNER) with a domain-adaptive pre-trained large language model (LLM) and automated NIHSS scoring. To build a high-performing CNER model, we first construct a stroke-specific, densely annotated dataset “Chinese Stroke Clinical Records” (CSCR) from EHRs provided by our partner hospital, based on a stroke ontology that defines semantically related entities for stroke assessment. We then pre-train a Chinese clinical LLM coined “CliRoberta” through domain-adaptive transfer learning and construct a deep learning-based CNER model that can accurately extract entities directly from Chinese EHRs. Finally, an automated, end-to-end NIHSS scoring pipeline is proposed by mapping the extracted entities to relevant NIHSS items and values, to quantitatively assess the stroke severity. Results: Results obtained on a benchmark dataset CCKS2019 and our newly created CSCR dataset demonstrate the superior performance of our domain-adaptive pre-trained LLM and the CNER model, compared with the existing benchmark LLMs and CNER models. The high F1 score of 0.990 ensures the reliability of our model in accurately extracting the entities for the subsequent automatic NIHSS scoring. Subsequently, our automated, end-to-end NIHSS scoring approach achieved excellent inter-rater agreement (0.823) and intraclass consistency (0.986) with the ground truth and significantly reduced the processing time from minutes to a few seconds. Conclusion: Our proposed automatic and quantitative framework for assessing stroke severity demonstrates exceptional performance and reliability through directly scoring the NIHSS from diagnostic notes in Chinese clinical EHRs. Moreover, this study also contributes a new clinical dataset, a pre-trained clinical LLM, and an effective deep learning-based CNER model. The deployment of these advanced algorithms can improve the accuracy and efficiency of clinical assessment, and help improve the quality, affordability and productivity of healthcare services.

KW - Automatic stroke severity assessment

KW - Chinese electronic health records

KW - Clinical named entity recognition

KW - Domain-adaptive pre-training

KW - Large language model

UR - http://www.scopus.com/inward/record.url?scp=85186860362&partnerID=8YFLogxK

U2 - 10.1016/j.artmed.2024.102822

DO - 10.1016/j.artmed.2024.102822

M3 - Article

C2 - 38553162

AN - SCOPUS:85186860362

SN - 0933-3657

VL - 150

JO - Artificial Intelligence in Medicine

JF - Artificial Intelligence in Medicine

M1 - 102822

ER -

Automatic quantitative stroke severity assessment based on Chinese clinical named entity recognition with domain-adaptive pre-trained large language model

Abstract

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this