Objective To evaluate the quality of studies assessing the value of serum hyaluronic acid in the diagnosis of liver fibrosis, to analyze the sources of bias and variation, and to estimate the accuracy of serum hyaluronic acid in diagnosing early liver cirrhosis and liver fibrosis in patients with chronic viral hepatitis. Methods We searched MEDLINE (1966 to June 2006), EMbase (1974 to June 2006), CBMdisc (1978 to April 2005), CNKI (2005 to June 2006) and VIP (2005 to June 2006) for studies assessing the diagnostic value of serum hyaluronic acid for liver fibrosis in patients with chronic viral hepatitis. We also checked the reference lists of the included studies. QUADAS items were used for quality assessment. Meta-disc software was used to pool sensitivity, specificity, positive likelihood ratio (LR+), negative likelihood ratio (LR-) and diagnostic odds ratio, and to test for heterogeneity. DPS2005 software was used to draw SROC curves for analyses without heterogeneity. Results In total 24 studies were included: 12 published in Chinese and 12 published in English. For radioimmunoassay (RIA), the pooled LR+ for diagnosing liver cirrhosis and for differentiating absent/present liver fibrosis were 7.029 and 3.608, and the pooled LR- were 0.198 and 0.319, respectively. For enzyme-linked immunosorbent assay (ELISA), the pooled LR+ for diagnosing liver cirrhosis, differentiating mild/severe liver fibrosis and differentiating absent/present liver fibrosis were 6.093, 9.806 and 4.308, and the pooled LR- were 0.354, 0.347 and 0.563, respectively. Conclusion The bias identified in the 24 studies is mainly reference standard review bias. With both RIA and ELISA, serum hyaluronic acid has good value in diagnosing liver cirrhosis. Its value in differentiating absent/present liver fibrosis is also acceptable.
However, its value in differentiating mild/severe liver fibrosis needs to be further studied.
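The likelihood ratios reported above can be read through two standard identities: the diagnostic odds ratio equals LR+ divided by LR-, and post-test odds equal pre-test odds multiplied by the likelihood ratio. The following is a minimal sketch using the pooled RIA values for liver cirrhosis from the abstract. The 30% pre-test probability is a hypothetical value chosen only for illustration, and a ratio of pooled likelihood ratios need not equal the pooled diagnostic odds ratio that a meta-analysis package would report.

```python
def diagnostic_odds_ratio(lr_pos: float, lr_neg: float) -> float:
    """Diagnostic odds ratio expressed through likelihood ratios: LR+ / LR-."""
    return lr_pos / lr_neg


def post_test_probability(pre_test_prob: float, lr: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


# Pooled RIA values for diagnosing liver cirrhosis, taken from the abstract.
LR_POS_RIA = 7.029
LR_NEG_RIA = 0.198

# ASSUMPTION: hypothetical pre-test probability, not reported in the abstract.
ASSUMED_PRETEST = 0.30

dor = diagnostic_odds_ratio(LR_POS_RIA, LR_NEG_RIA)
p_after_positive = post_test_probability(ASSUMED_PRETEST, LR_POS_RIA)
p_after_negative = post_test_probability(ASSUMED_PRETEST, LR_NEG_RIA)

print(f"DOR from pooled LRs: {dor:.1f}")
print(f"Post-test probability after a positive result: {p_after_positive:.3f}")
print(f"Post-test probability after a negative result: {p_after_negative:.3f}")
```

Under the assumed pre-test probability, a positive result raises the probability of cirrhosis substantially and a negative result lowers it substantially, which is the practical meaning of an LR+ near 7 and an LR- near 0.2.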
Objective This study proposes employing large language models (LLMs) for medical literature quality assessment and explores their potential to establish a standardized and scalable intelligent evaluation framework for off-label drug use (OLDU). Methods The study used two LLM platforms freely available in China, DeepSeek-R1 and Doubao. Following the medical literature quality assessment tools recommended in the evidence-based evaluation specification for OLDU issued by the Guangdong Pharmaceutical Association, we selected the Jadad scale and the MINORS criteria. These tools were used to assess the quality of the two most prevalent types of medical literature in OLDU evidence evaluation: randomized controlled trials (RCTs) and non-randomized controlled trials (non-RCTs). Using chain-of-thought (CoT) prompting techniques, we developed standardized evaluation templates. The quality scores generated by the LLMs were then compared against those reported in systematic reviews or assigned by clinical pharmacists. Results For RCTs, DeepSeek-R1 was consistent with human assessments in quality appraisal. However, the Doubao model disagreed with the manual evaluation results: three repeated evaluations yielded inconsistent outcomes, and "allocation concealment" items were identified inaccurately. For non-RCTs, both models achieved quality assessment outcomes concordant with those of human evaluators, and were also able to detect systematic evaluation inaccuracies attributable to human subjective bias. Conclusion This study demonstrates that prompt engineering-driven LLMs can efficiently conduct quality assessments of medical literature. However, model selection requires rigorous validation against domain-specific benchmarks, alongside mandatory expert validation of scoring outputs.
Our findings further reveal the necessity of refining current quality appraisal criteria through granular operational definitions, thereby facilitating standardized automation. This approach not only enhances the efficiency and transparency of evidence-based decision-making for OLDU but also extends to systematic reviews and rapid health technology assessments. By replacing traditional literature quality evaluation models with automated scoring mechanisms, it enables a paradigm shift in the efficiency of evidence processing.
Objective To evaluate the quality of Chinese clinical practice guidelines published in domestic medical journals from 2012 to 2013 and to compare it with the quality of guidelines published earlier. Methods CNKI, CBM and WanFang Data were searched to collect guidelines published from January 1st, 2012 to December 31st, 2013. Two reviewers independently screened the literature according to the inclusion and exclusion criteria and extracted data. The AGREE Ⅱ instrument was applied to assess the methodological quality of the included guidelines. Results A total of 78 guidelines were identified. Among them, 37 guidelines were published in 2012, and 41 in 2013. The scores for the 6 domains of AGREE Ⅱ were as follows: scope and purpose (24%), stakeholder involvement (11%), rigour of development (7%), clarity of presentation (32%), applicability (7%), and editorial independence (4%). Subgroup analysis indicated that the scores in 5 domains (all except applicability) were higher for guidelines published in CSCD journals than for those in non-CSCD journals; the scores in 4 domains (all except stakeholder involvement and applicability) were higher for funded guidelines than for unfunded guidelines; and the scores in 5 domains (all except editorial independence) were higher for guidelines published in 2013 than for those published in 2012. Conclusion The guidelines published from 2012 to 2013 are of higher quality than those published before 2012, but they still fall well short of the average level of international guidelines. Chinese guideline developers should attach importance to international guideline development methodology and use the AGREE Ⅱ instrument when developing and reporting guidelines.
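Domain percentages such as those above are conventionally obtained with the AGREE Ⅱ scaled domain score: the summed item ratings in a domain (each item rated 1 to 7 by every appraiser) are rescaled between the minimum and maximum possible totals. The sketch below illustrates that calculation. The ratings are hypothetical, and the abstract does not state whether the reported figures are means of per-guideline scaled scores, so this shows the formula only, not the authors' exact procedure.

```python
from typing import Sequence


def agree_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Scaled AGREE II domain score in percent.

    ratings[a][i] is appraiser a's rating (1 to 7) of item i in one domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every appraiser must rate the same number of items")
    if any(not 1 <= value <= 7 for row in ratings for value in row):
        raise ValueError("AGREE II item ratings must lie between 1 and 7")

    obtained = sum(sum(row) for row in ratings)
    min_possible = 1 * n_items * n_appraisers
    max_possible = 7 * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


# HYPOTHETICAL ratings: two appraisers scoring a three-item domain.
example_ratings = [
    [3, 2, 3],  # appraiser 1
    [2, 3, 2],  # appraiser 2
]
score = agree_domain_score(example_ratings)
print(f"Scaled domain score: {score:.0f}%")
```

Because the minimum rating is 1 rather than 0, a domain in which every item receives the lowest rating scores 0%, which is how a value such as 0% for editorial independence can arise.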
Objectives To assess the methodological quality of clinical practice guidelines for cervical cancer in China published from 2014 to 2018. Methods CNKI, WanFang Data, CBM, VIP, Medlive.cn, the National Guideline Clearinghouse, PubMed, The Cochrane Library and EMbase were searched for cervical cancer clinical practice guidelines published in China from January 1st, 2014 to December 31st, 2018. Four reviewers independently searched and selected the literature according to the inclusion and exclusion criteria and assessed the methodological quality of the included guidelines using AGREE Ⅱ. Results A total of 9 guidelines were included. The average score for each domain was: scope and purpose 75.47%, stakeholder involvement 35.09%, rigor of development 43.70%, clarity of presentation 87.74%, applicability 80.76%, and editorial independence 0%. Conclusions The quality of cervical cancer clinical practice guidelines in China requires further improvement.
Objectives To evaluate the quality of health information on diabetes on the Chinese internet, so as to understand the current status of online diabetes health information and provide a reference for improving and enriching the three-level prevention of diabetes. Methods The three most common Chinese search engines, Baidu, Sogou and Haosou Search, were searched with the keywords "diabetes" and "diabetes treatment". The quality of the information was evaluated with the health information evaluation tool DISCERN, and the completeness and accuracy of the content were evaluated with reference to the "Guidelines for the Prevention and Treatment of Type 2 Diabetes in China (2017 Edition)" issued by the Chinese Diabetes Society of the Chinese Medical Association. Results A total of 300 links were accessed, from which 17 websites were included. The DISCERN review showed that the average score exceeded 3 points for only 1 item. By website content score, 11.7% of websites were rated excellent, 35.2% good, 47.1% fair, and 5.8% poor. Erroneous information was found on 50% of websites, and the topic most prone to errors was diagnosis and treatment. The content score correlated positively with the credibility and verbosity components of the DISCERN score (r=0.71 and 0.73, P<0.001). When websites were grouped by type, the quality of diabetes-related information on some general-purpose websites was rated higher than that on diabetes specialist websites. Conclusions The quality of diabetes health information on Chinese websites is insufficient. It is necessary for China to establish a web-based information platform for diabetes. China has not yet developed a unified evaluation standard for online health information that suits its national conditions. The key to solving the problem lies in collaboration between health professionals and website developers.
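Objective Systematic reviews can serve as a tool for translating basic life science research from the laboratory into human research and health care. Whether systematic reviews of animal research are systematic and free of bias is currently unknown. Methods We searched MEDLINE, EMBASE and the reference lists of known systematic reviews (1996 to 2004), and contacted relevant experts, to comprehensively collect citations of reviews of the basic science literature that had searched at least one publicly available database; no language restrictions were applied. From these we included systematic reviews of studies that measured laboratory parameters in animals or administered treatments to live animals to evaluate efficacy, and compared them with systematic reviews of in vitro studies that examined human or animal tissues, cell systems or organ preparations in the laboratory to better understand disease mechanisms. Results Systematic reviews of animal research often lacked methodological features such as specification of a hypothesis to be tested (9/30, 30%); a literature search without language restrictions (8/30, 26.6%); assessment of publication bias (5/30, 16.6%), study validity (15/30, 50%) and heterogeneity (10/30, 33.3%); and use of meta-analysis for quantitative synthesis (12/30, 40%). Compared with systematic reviews of in vitro studies, systematic reviews of animal research more often specified the research question (96.6% vs. 80%, P=0.04), searched multiple databases comprehensively (60% vs. 26.6%, P=0.01), assessed study quality (50% vs. 20%, P=0.01) and explored heterogeneity (33.3% vs. 2.2%, P=0.001), and were therefore less prone to bias. Conclusion There appears to be a gradient in the frequency of methodological quality problems across types of systematic reviews: although systematic reviews of whole-animal research are clearly poorer in quality than systematic reviews of human clinical trials, they are still better than systematic reviews of in vitro studies. Greater rigor is needed when systematically reviewing animal research.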
Objective To explore the status and quality of domestic clinical therapeutic studies on integrated traditional Chinese and western medicine for posthepatitic cirrhosis over the past 30 years. Methods The Jadad scale was used to score 121 articles published from January 1980 to January 2010 and retrieved from authoritative domestic databases, namely CNKI, VIP, WanFang Data, and CBM. A systematic review was conducted of the 39 randomized controlled trials (RCTs) of integrated traditional Chinese and western medicine for posthepatitic cirrhosis that scored two or more points. Results Over the 30 years, the main problems in domestic clinical research on integrated traditional Chinese and western medicine for posthepatitic cirrhosis were as follows: the design of clinical RCTs was not strict enough; blinding was rarely used; standardized and uniform research criteria were lacking; sample sizes were small and no method of sample size estimation was specified; dropouts and cases lost to follow-up were not analyzed; and the reporting of adverse reactions and quality-of-life outcomes was neglected. Conclusion Integrated traditional Chinese and western medicine therapy for posthepatitic cirrhosis is "personalized" and "diversified". Its therapeutic effects are significantly better than those of western medicine alone, and it is worth popularizing in clinical practice. However, the quality of the clinical research methods still needs further improvement.
Objective To assess the quality of randomised controlled trials on traditional Chinese medicine (TCM) for coronary heart disease (CHD) angina published from 1977 to 2002. Method We searched Medline and Embase electronically and hand-searched 83 journals of traditional Chinese medicine (the earliest published in 1977 and the latest in June 2002). We assessed the quality of the studies obtained. Results Four hundred and forty articles met the criteria, of which 33 (7.5%) described the randomization method. None of them mentioned allocation concealment; 94.77% (417 studies) mentioned diagnostic criteria; only one mentioned the basis for the sample size calculation; 84.09% (370 studies) mentioned baseline comparability. Fifty-three studies (12.05%) reported blinding: 28 were single-blind and 25 were double-blind. Drop-outs were described in 7 studies, none of which used intention-to-treat (ITT) analysis; 159 studies applied statistical methods properly, while 4 did not. Ten studies never mentioned statistical methods; 73.18% (322 studies) presented their results in tables. Conclusions To date, the quantity and quality of RCTs of traditional Chinese medicine for coronary heart disease angina have been inadequate. Some well-established scientific methods were not adequately applied.
Objectives To evaluate the methodological quality and reporting quality of published systematic reviews/meta-analyses (SRs/MAs) of pediatric tuina in China and abroad. Methods CBM, VIP, CNKI, WanFang Data, PubMed, EMbase, and The Cochrane Library were electronically searched to collect published pediatric tuina SRs/MAs from inception to December 10th, 2018. AMSTAR 2 and the PRISMA reporting quality tool were used to evaluate the methodological and reporting quality of the included SRs/MAs, and Excel was used for data collation and statistical analysis. Results A total of 18 SRs/MAs of pediatric tuina (14 in Chinese and 4 in English) were finally included. In terms of methodological quality, 6 studies were of low quality and 12 studies were of very low quality. None of the studies explained the reasons for selecting a particular study design, and few reported a pre-specified protocol, a list of excluded studies with reasons for exclusion, or funding. In terms of reporting quality, 7 studies were relatively complete, 10 studies had certain defects and one study had serious defects. The main reporting problems concerned protocol and registration, comprehensive searching, information sources, and financial support. Conclusions SRs/MAs of pediatric tuina have problems of varying degrees in both methodological quality and reporting quality, and further improvement is required.
Objective We constructed a real-world evidence evaluation system to provide a reference for obtaining high-quality evidence in evidence-based medicine. Methods Through investigation and analysis of the key factors influencing real-world research evidence, combined with domestic and foreign literature and evaluation tools, we preliminarily constructed the indicators of the real-world evidence evaluation system, then consulted experts in related fields using the Delphi method and modified and finalized the evaluation indicators. Results The final real-world evidence evaluation system included 40 items. The questionnaire response rates of the two rounds of expert consultation were 88.2% and 100%, and the expert coordination coefficients were 0.174 (P<0.001) and 0.189 (P<0.001). After the second round of consultation, the mean Likert scores ranged from 3.73 to 4.93, and the coefficients of variation ranged from 0.05 to 0.21. Conclusion The real-world evidence evaluation system constructed in this study has a certain degree of reliability and scientific validity, and can provide a basis for transforming real-world research into high-quality evidence.
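The two agreement statistics reported in the Delphi results above can be sketched as follows. In Delphi studies the expert coordination coefficient is commonly Kendall's coefficient of concordance W; the abstract does not name the statistic, so treating it as Kendall's W is an assumption here. The implementation uses mean ranks for tied scores with the standard tie correction, and the coefficient of variation uses the sample standard deviation, which is also an assumption. All ratings below are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev
from typing import List, Sequence


def average_ranks(scores: Sequence[float]) -> List[float]:
    """Rank one expert's scores (1 = lowest); tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def kendalls_w(ratings: Sequence[Sequence[float]]) -> float:
    """Kendall's W with tie correction; ratings[j][i] is expert j's score for item i.

    Undefined (division by zero) if every expert gives all items the same score.
    """
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[i] for row in ranked) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    ties = sum(t**3 - t for row in ratings for t in Counter(row).values())
    return 12 * s / (m**2 * (n**3 - n) - m * ties)


def coefficient_of_variation(scores: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean of one item's scores."""
    return stdev(scores) / mean(scores)


# HYPOTHETICAL data: three experts rating four indicators on a 5-point Likert scale.
ratings = [
    [5, 4, 3, 4],
    [5, 3, 3, 4],
    [4, 4, 2, 5],
]
w = kendalls_w(ratings)
cv_first_item = coefficient_of_variation([row[0] for row in ratings])
print(f"Kendall's W: {w:.3f}")
print(f"CV of indicator 1: {cv_first_item:.3f}")
```

W ranges from 0 (no agreement in how experts rank the items) to 1 (identical rankings), so values near 0.17 to 0.19 indicate statistically detectable but modest concordance, while a small coefficient of variation indicates that experts gave similar scores to a given item.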