Proof-of-Concept – Easily Identifying and Extracting Potential Patients in Facebook
Abstract
Background: Online social networks are services, platforms, or sites that facilitate the building of social networks or social relations among people who share interests, activities, backgrounds, or real-life connections. They allow users to communicate and to share ideas, activities, events, and interests within their individual networks or with a broad public audience. With over 900 million users, Facebook is regarded as today’s largest and most successful social network. Several small studies indicate potential benefits for patients. Especially in the context of rare diseases, social networks and their group features can make it easier for patients with a common illness to find each other and exchange experiences. In the process, however, a great deal of health-related personal data is generated and published, raising various security and privacy concerns.

Objective: The objective of our study was to demonstrate, as a proof of concept, that information can easily be extracted in a structured way and that potential patients can be identified.

Methods: We chose several distinct English MeSH terms for rare diseases. Using Facebook developer tools, we implemented a Python script that programmatically used the Facebook Graph API to search Facebook’s content for the chosen diseases and aggregated the data found. We included only pages showing user comments. We had to add a manual step of checking groups and pages for relevance: a group or page was considered relevant when it actually discussed the disease that had been searched for. Once a group or page had a distinct ID, we tagged its results in our script for the final output.

Results: For each tested search string we found at least one group, one page, or an open post. Our results included only publicly available information. We were able to extract UID (user IDs issued by Facebook in a unique and public fashion), name, surname, gender, date of entry, entry, and group name. Closed groups did not reveal UIDs. Hidden groups were not found at all. Several pages were restricted to certain countries. The search results therefore do not guarantee completeness. The only limitation we experienced was a limit of 60 search requests per minute. Nevertheless, we were able to extract thousands of distinct UIDs. Manually browsing some of the openly accessible owner accounts, we could identify parents of patients, health professionals, and patients who were in charge of the group.

Conclusion: Our proof of concept shows that personal health-related data can be extracted disturbingly easily as long as no restrictions are implemented. Settings to protect patient-related data exist but are seldom employed in health-related groups. Patients should question their usage of social networks, know about and employ the existing data protection settings within Facebook, or, even better, use platforms with appropriate default privacy settings.
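To illustrate the mechanics described in the Methods and Results, the following minimal Python sketch shows how a script might stay under a limit of 60 search requests per minute while keeping one record per distinct UID. It is not the script used in the study. The `SlidingWindowLimiter` class, the `search_and_aggregate` function, the `search` callable, and the record field names are all assumptions introduced here for illustration, and no real Graph API request is made.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any window of `period` seconds."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.sleep = sleep
        self.calls = deque()    # timestamps of calls still inside the window

    def wait(self):
        """Block until another call is allowed, then record that call."""
        while True:
            now = self.clock()
            # Forget calls that have already left the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Window is full: sleep until the oldest call expires.
            self.sleep(self.period - (now - self.calls[0]))


def search_and_aggregate(terms, search, limiter):
    """Call `search(term)` for each term and keep one record per UID.

    `search` is a hypothetical callable standing in for whatever request
    returns records for a search term; each record is assumed to be a
    dict with at least a "uid" key.
    """
    by_uid = {}
    for term in terms:
        limiter.wait()  # respect the request limit before every search
        for record in search(term):
            # Keep the first record seen for each distinct UID.
            by_uid.setdefault(record["uid"], record)
    return by_uid


if __name__ == "__main__":
    def fake_search(term):
        # Placeholder data only; these are not real search results.
        return [
            {"uid": "u1", "name": "Example", "group": term},
            {"uid": "u-" + term, "name": "Example", "group": term},
        ]

    found = search_and_aggregate(["term-a", "term-b"], fake_search,
                                 SlidingWindowLimiter())
    print(len(found), "distinct UIDs")
```

Passing the clock and sleep functions as parameters keeps the limiter testable without waiting in real time. A sliding window is only one way to respect such a limit; a fixed delay of one second between requests would also work, at the cost of slower short runs.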

This work is licensed under a Creative Commons Attribution 3.0 License.