School of Life Science & Technology, University of Electronic Science & Technology of China, Chengdu 610051, P.R. China
LI Ke, Email: colinlike@163.com

Objective To automate risk of bias assessment for randomized controlled trial (RCT) literature, using BERT (Bidirectional Encoder Representations from Transformers) for feature representation and text classification. Methods We searched The Cochrane Library to obtain risk of bias assessment data and detailed information on RCTs, and constructed data sets for text classification. We assigned 80% of the data set to the training set, 10% to the test set, and 10% to the validation set. We then used BERT to extract features, build a text classification model, and classify each of the seven types of risk of bias as high or low. The results were compared with those of a traditional machine learning method that combined n-gram and TF-IDF features with a linear SVM classifier. Precision (P), recall (R) and the F1 score were used to evaluate model performance. Results The BERT-based model achieved F1 scores of 78.5% to 95.2% on the seven risk of bias assessment tasks, 14.7% higher than the traditional machine learning method. In the task of extracting bias descriptors, it achieved F1 scores of 85.7% to 92.8% for the six types of bias other than "other sources of bias", 18.2% higher than the traditional machine learning method. Conclusions The BERT-based automatic risk of bias assessment model achieves higher accuracy in risk of bias assessment for RCT literature and improves the efficiency of assessment.
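
The 80/10/10 split and the P, R and F1 measures named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `split_80_10_10` and `precision_recall_f1` are hypothetical, and the random shuffle and the choice of "high" as the positive class are assumptions that the abstract does not state.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into training, test and validation
    sets in an 80% / 10% / 10% ratio (random shuffling is an assumption)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * 0.8)
    n_test = int(len(shuffled) * 0.1)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


def precision_recall_f1(y_true, y_pred, positive="high"):
    """Compute precision (P), recall (R) and F1 for one binary task.
    Treating "high" as the positive class is an assumption."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

For example, `precision_recall_f1(["high", "low"], ["high", "high"])` counts one true positive and one false positive, so precision is 0.5 and recall is 1.0. F1 is the harmonic mean of the two, which is why a single F1 score per bias type summarizes both measures.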

Citation: XIA Yuan, LIU Dongfeng, ZHANG Jinkui, LI Ke. BERT-based automated risk of bias assessment. Chinese Journal of Evidence-Based Medicine, 2021, 21(2): 204-209. doi: 10.7507/1672-2531.202006177

Copyright © the editorial department of Chinese Journal of Evidence-Based Medicine of West China Medical Publisher. All rights reserved
