A-DISCERN: Developing an Automated Tool for Identifying Better Online Quality Information regarding Treatment Options

Ahmed Allam*, Università della Svizzera italiana (University of Lugano), Lugano, Switzerland
Peter Johannes Schulz, Università della Svizzera italiana (University of Lugano), Lugano, Switzerland
Kent Nakamoto, Virginia Tech University (Marketing Department), Blacksburg, VA, United States

Track: Research
Presentation Topic: Online decision technology
Presentation Type: Poster presentation
Submission Type: Single Presentation

Last modified: 2014-06-07

Abstract

Background: Many studies report that searching the Internet is now one of the primary means of seeking health information. However, the promise of the Internet as a decision support system is seriously marred by the fact that the quality of online health information varies tremendously. Initiatives proposing guidelines, checklists and quality indicators have sought to address the problem. However, applying these guidelines and quality criteria is laborious because each criterion must be evaluated manually, which takes considerable time and effort. DISCERN (http://www.discern.org.uk) is one of the tools developed to evaluate the reliability and quality of information on treatment choices.

Objective: DISCERN can be used without specialist knowledge or reference to publications and advisors, focusing only on the textual content of online publications. This has made it one of the main tools used in scientific studies on health information quality. We therefore seek to develop an algorithm that automates DISCERN rating by exploiting natural language processing (NLP) and the Unified Medical Language System (UMLS) Metathesaurus, following a supervised machine learning approach.

Method: We considered the first 15 DISCERN questions as criteria. A medical text (document, webpage, etc.) needs to be scored for each DISCERN criterion on a 5-point scale. The goal of the automated rating is to maximize the accuracy of classifying each criterion, and consequently to maximize the accuracy of predicting the overall score. Using Google Trends, we explored the medical topics with the highest search volume from 2004 until now. Breast cancer, arthritis and depression were among them and were eventually chosen for the study. By searching Google and Yahoo for each topic, we extracted 271 articles focusing on treatment choices and options from various web domains. Two raters (master's students) were trained over a two-month period in using DISCERN. A Web-based platform was developed on which the raters independently rate the medical texts for each criterion and select the parts of the text that justify their scores. The rating process will finish by March 2014.
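The scoring scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and it assumes that the overall score is the sum of the 15 criterion scores (range 15 to 75), which the abstract does not state. It shows how one document's ratings might be represented and how per-criterion accuracy against the human reference ratings could be computed.

```python
from typing import Dict, List

N_CRITERIA = 15              # the first 15 DISCERN questions, as stated in the text
MIN_SCORE, MAX_SCORE = 1, 5  # 5-point scale

Scores = Dict[int, int]      # criterion number (1..15) -> score (1..5)


def validate(scores: Scores) -> None:
    """Raise ValueError unless there is exactly one valid score per criterion."""
    if set(scores) != set(range(1, N_CRITERIA + 1)):
        raise ValueError("expected one score for each criterion 1..15")
    if any(not MIN_SCORE <= s <= MAX_SCORE for s in scores.values()):
        raise ValueError("scores must lie on the 1..5 scale")


def overall_score(scores: Scores) -> int:
    """Assumption (not from the abstract): overall score = sum of criterion scores."""
    validate(scores)
    return sum(scores.values())


def criterion_accuracy(reference: List[Scores],
                       predicted: List[Scores]) -> Dict[int, float]:
    """Per criterion, the fraction of documents where prediction equals reference."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    for s in reference + predicted:
        validate(s)
    n = len(reference)
    return {
        c: sum(r[c] == p[c] for r, p in zip(reference, predicted)) / n
        for c in range(1, N_CRITERIA + 1)
    }
```

Exact-match accuracy treats a prediction of 4 against a reference of 5 the same as a prediction of 1; a measure that respects the ordering of the scale could be substituted without changing the data layout.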

Results: The development of the algorithm and its classifiers will be described, and their performance will be evaluated using cross-validation to guard against overfitting. Moreover, the choice of features (attributes) extracted from the medical texts and the process used to select them will be explained, along with the possible development of a feature selection algorithm for obtaining the best feature subset to be used later in the model.
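As an illustration of the cross-validation mentioned above, the following is a minimal Python sketch of k-fold evaluation for a single DISCERN criterion. It is a sketch under stated assumptions rather than the study's method: all names are hypothetical, and the majority-class learner is only a placeholder baseline standing in for whatever classifier is eventually trained.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, int]  # (document text, score 1..5 for one criterion)


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_majority(train: Sequence[Example]) -> Callable[[str], int]:
    """Placeholder learner (assumption): always predicts the most frequent score."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda text: most_common


def cross_validate(data: Sequence[Example], k: int,
                   train_fn: Callable[[Sequence[Example]], Callable[[str], int]]
                   = train_majority,
                   seed: int = 0) -> float:
    """Mean held-out accuracy over k folds; each example is tested exactly once."""
    folds = k_fold_indices(len(data), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        # Train only on examples outside the held-out fold.
        train = [data[i] for i in range(len(data)) if i not in held]
        model = train_fn(train)
        correct = sum(model(data[i][0]) == data[i][1] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k
```

Because the model never sees the held-out fold during training, the averaged accuracy estimates performance on unseen documents. Any feature selection step would need to be repeated inside each training fold; selecting features on the full data set first would leak information from the held-out documents.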

Conclusion: An automated DISCERN will have implications for general health information consumers, patients and health information providers. In the next phase, we plan not only to construct an automated metric for quality evaluation but also to develop an educational tool (e.g., a Web-based application or browser extension) that would benefit health information consumers. Similarly, health information providers could use it as an automated checklist before publishing medical articles on the Internet. It would also benefit the scientific community, especially scholars concerned with the quality of health information. Finally, automated evaluation criteria could be integrated into the ranking algorithms of search engines, so that webpages with higher-quality health information content are displayed more prominently in the retrieved results.

This work is licensed under a Creative Commons Attribution 3.0 License.