A Content-Based Approach to Identify the Target Users for Health Intervention in Social Media – A Case Study on SafetyMD



Haodong Yang, iSchool at Drexel University, Philadelphia, United States
Jiexun Li, iSchool at Drexel University, Philadelphia, United States
Christopher Yang, iSchool at Drexel University, Philadelphia, United States
Venk Kandadai*, Center for Injury Research and Prevention, Children’s Hospital of Philadelphia, Philadelphia, United States
Flaura Winston, Center for Injury Research and Prevention, Children’s Hospital of Philadelphia, Philadelphia, United States


Track: Research
Presentation Topic: Blogs, Microblogs, Twitter
Presentation Type: Oral presentation
Submission Type: Single Presentation

Building: Mermaid
Room: Room 3 - Upper River Room
Date: 2013-09-24 11:30 AM – 01:00 PM
Last modified: 2013-09-25

Abstract


Background and Objectives
Community detection has become a popular research area in social media analysis. Finding groups of users who share common interests has practical value for health intervention, for example in optimizing the dissemination of healthcare information or providing recommendations to health consumers. However, most related studies detect communities by analyzing general network structure and ignore the importance of a specific actor in the network. In this study, we focus on an ego-centered network extracted from Twitter and use content analysis to identify clusters of Twitter users who follow a particular health intervention account, SafetyMD.
Methods
The Twitter handle SafetyMD provides valuable health intervention information on injury prevention and road safety for children and adolescents. Its tweets cover prevention of teen driver crashes, child passenger safety, and secondary prevention of posttraumatic stress disorder after injury.
We randomly chose 40 followers of SafetyMD and extracted the latest 50 tweets of each follower. We then represented each follower as a term vector based on his or her latest tweets. Principal Components Analysis (PCA) was used for dimensionality reduction. PCA explains the variance-covariance structure of a set of variables through a few linear combinations of them. Besides reducing the complexity of the data, each extracted principal component may represent a topic or subject of the tweets and thus reflect the interests of those Twitter users. Based on the components extracted by PCA, hierarchical cluster analysis was performed to segment the Twitter users, with furthest neighbor (complete linkage) as the clustering method and cosine similarity as the similarity metric.
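The term-vector and clustering steps can be illustrated with a minimal sketch. It is not the authors' implementation: the follower names and tweets are invented, tokenization is a plain whitespace split, and the PCA step is omitted, so the raw term counts are clustered directly. The helper names (`term_vector`, `cosine_similarity`, `furthest_neighbor_clusters`) are hypothetical.

```python
from collections import Counter
from math import sqrt


def term_vector(tweets):
    """Count terms over all of one follower's tweets (lowercased, whitespace split)."""
    return Counter(word for tweet in tweets for word in tweet.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def furthest_neighbor_clusters(vectors, n_clusters):
    """Agglomerative clustering with furthest-neighbor (complete) linkage."""
    clusters = [[name] for name in vectors]

    def linkage_similarity(c1, c2):
        # Complete linkage: two clusters are only as similar as their
        # least similar pair of members.
        return min(cosine_similarity(vectors[u], vectors[v]) for u in c1 for v in c2)

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the highest linkage similarity.
        i, j = max(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage_similarity(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Invented toy data standing in for followers and their latest tweets.
followers = {
    "user_a": ["teen driver crash prevention #roadsafety", "safe driving tips #roadsafety"],
    "user_b": ["#roadsafety for teen drivers", "driving safe at night"],
    "user_c": ["sports injury prevention #sportssafety", "concussion in youth sports"],
    "user_d": ["youth sports injury clinic", "#sportssafety helmet fitting"],
}
vectors = {name: term_vector(tweets) for name, tweets in followers.items()}
clusters = furthest_neighbor_clusters(vectors, n_clusters=2)
print(clusters)
```

On this toy data the two road-safety accounts end up in one cluster and the two sports-safety accounts in the other. In the study itself, clustering was run on the PCA components rather than on raw term counts.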
Results
Thirty-nine components explaining 100% of the total variance were extracted from 1,262 unique terms. Each hashtag in Twitter indicates a specific topic, and if one or more hashtags appear in a tweet, the other words in that tweet are probably closely related to those hashtags. Only the first five components contained hashtags with loadings greater than 0.5. These components can be named Disease Research, Road Safety, Driving Issues, Injury Prevention, and Coastal Safety, respectively, and they reflect the interests of followers who have relatively higher loadings on each of them.
Finally, the 40 followers of SafetyMD were grouped into five clusters. Summarizing the biography information and tweet content of the followers in each cluster revealed the following common interests: (1) injury prevention, especially sports injury prevention and sports safety promotion; (2) kids’ safety, youth safety, and driving safety, especially for young people; (3) public health and education, injury prevention education, and recovery from traumatic events, especially driving accidents; (4) Children’s Hospital of Philadelphia and research on children’s diseases; and (5) accident research, road safety, traffic safety, medical transportation, and social media.
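The count of 39 components is what PCA would be expected to give for 40 followers: once each term column is centered, a 40-row matrix has rank at most 39, so at most 39 components carry nonzero variance and together they account for all of it. The sketch below illustrates this with NumPy on random Poisson counts that stand in for the real follower-by-term matrix; the data and the numerical threshold are assumptions, not the study's values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_followers, n_terms = 40, 1262

# Random term counts standing in for the real follower-by-term matrix.
X = rng.poisson(0.3, size=(n_followers, n_terms)).astype(float)

# PCA via SVD of the column-centered matrix.
Xc = X - X.mean(axis=0)
singular_values = np.linalg.svd(Xc, compute_uv=False)
variances = singular_values**2 / (n_followers - 1)
explained = variances / variances.sum()

# Components whose share of variance is above numerical noise.
n_components = int(np.sum(explained > 1e-12))
print(n_components)
print(float(explained[:n_components].sum()))
```

With this setup the first printed value should be 39 (one fewer than the number of followers), and the cumulative explained variance of those components should be 1.0 up to rounding.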
Conclusion
PCA and hierarchical cluster analysis are useful for segmenting Twitter followers. The method could also be used to segment all of SafetyMD’s followers or the followers of another Twitter handle. The results help in understanding users’ interests and can facilitate information dissemination by targeting specific groups of users.




Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.