Determinants of Participation in Internet-Based Epidemiological Studies



Daniela Paolotti*, ISI Foundation, Turin, Italy
Paolo Bajardi, ISI Foundation, Turin, Italy
Lorenzo Richiardi, Unit of Cancer Epidemiology, University of Turin, Turin, Italy
Franco Merletti, Unit of Cancer Epidemiology, University of Turin, Turin, Italy
Emanuele Pivetta, Unit of Cancer Epidemiology, University of Turin, Turin, Italy
Corrado Gioannini, ISI Foundation, Turin, Italy
Vittoria Colizza, INSERM Université Pierre et Marie Curie, Paris, France
Alessandro Vespignani, Northeastern University, Boston, United States


Track: Research
Presentation Topic: Web 2.0 approaches for behaviour change, public health and biosurveillance
Presentation Type: Poster presentation
Submission Type: Single Presentation

Last modified: 2012-09-12

Abstract


Background

Internet-based systems for recruitment and follow-up in epidemiological studies have advantages over traditional approaches: they can potentially recruit and monitor a wider range of subjects at relatively low cost. Voluntary participation and recruitment through a dedicated web site are easy to manage and advertise, and data gathering can be conveniently implemented through electronic databases. However, opportunistic enrollment among the general population relies on individuals' willingness to be monitored and may lead to biased and fluctuating samples. This calls for further investigation of the profile of the individuals under study, to assess which determinants are associated with participation in follow-up. Such information would be important for the design of future internet-based epidemiological studies. Loss to follow-up is a potential problem in any cohort study and is not specific to online platforms. However, the determinants of loss to follow-up have not been studied before in the context of internet-based cohort studies.
Objectives
The aim of this work is to help bridge the gap between what is known about participation and withdrawal in traditional epidemiological studies and in internet-based ones. In particular, we aim to understand whether the source through which participants learned of the existence of an internet-based study is a determinant of completeness of follow-up.

Methods

In this work we study the determinants of participation in follow-up among individuals involved in two projects whose data source is an Internet-based platform. Both platforms operate in Italy:
• Ninfea (www.progettoninfea.it), an Italian birth cohort study developed to investigate chronic diseases that are suspected to have a perinatal origin and might be modified by postnatal exposures;
• Influweb (www.influweb.it), an Italian surveillance platform aimed at monitoring the unfolding of seasonal and pandemic influenza.

The differences in the target populations (pregnant women vs. whole population), in the studied outcomes (chronic vs. infectious diseases) and in the periods of follow-up (months vs. days) are an asset for our analysis: they allow a broader understanding of the research question by revealing similarities and discrepancies between the two studies. Data collected from the baseline questionnaires are treated as determinants that may lead to lower or higher response at follow-up, and their effect is assessed by estimating crude odds ratios. Furthermore, a multivariable logistic regression is used to evaluate the impact on retention of the means of information through which participants were recruited.
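As an illustration of the crude odds ratio used here, the following minimal Python sketch computes an odds ratio and a 95% confidence interval from a 2×2 table of counts. The abstract does not state how the intervals were obtained; the Wald (log-scale) interval below is an assumption, and the function name `crude_odds_ratio` is hypothetical. The example counts are those reported for the cohabiting-partner variable of the Ninfea cohort (Table 1).

```python
import math


def crude_odds_ratio(part_exposed, lost_exposed, part_ref, lost_ref, z=1.96):
    """Crude odds ratio of participation (exposed stratum vs. reference stratum).

    Returns (odds_ratio, lower, upper), where the bounds form a Wald
    confidence interval computed on the log scale. This interval method
    is an assumption; the abstract does not specify one.
    """
    odds_ratio = (part_exposed / lost_exposed) / (part_ref / lost_ref)
    # Standard error of log(OR) for a 2x2 table
    se = math.sqrt(1 / part_exposed + 1 / lost_exposed + 1 / part_ref + 1 / lost_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Ninfea, cohabiting partner: "No" (56 participants, 11 lost)
# against the reference "Yes" (2101 participants, 162 lost).
or_, lo, hi = crude_odds_ratio(56, 11, 2101, 162)
print(f"OR = {or_:.2f}, 95% c.i. {lo:.2f} - {hi:.2f}")
```

Under this assumption the sketch gives values consistent with the corresponding row of Table 1 (0.39, 0.20 – 0.76).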

Results

For both datasets, we evaluated the association between participation in the follow-up questionnaire and some socio-demographic features of the participants by means of crude odds ratios. Some variables are common to both studies, such as age, smoking habits and household composition, while others are available for only one dataset, such as gender (Influweb only) or working status (Ninfea only). Finally, we assess the impact of different means of advertising, considering two broad classes: on-line and off-line media. For both the Ninfea and Influweb datasets we perform a multivariable logistic regression in which participation in follow-up is the binary outcome and the medium that informed the individual of the existence of the study is the independent variable, adjusted for age, smoking habits and socio-economic status.
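The abstract does not say which software was used for the logistic regression. As a rough sketch of the technique only, the pure-Python code below fits a logistic model to grouped counts by Newton-Raphson. The function names `solve` and `fit_logistic` are hypothetical. Because individual-level data are not available here, the example uses a single covariate (gender, with the counts reported for the Influweb cohort in Table 4), in which case the fitted odds ratio coincides with the crude one; an adjusted analysis would add further columns (age, smoking status and so on) to each covariate vector.

```python
import math


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def fit_logistic(rows, n_iter=25):
    """Fit a logistic regression to grouped data by Newton-Raphson.

    rows: list of (covariates, participants, group_size); the first
    covariate should be 1.0 for the intercept.
    Returns the list of fitted coefficients.
    """
    k = len(rows[0][0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # X' W X (negative Hessian)
        for x, y, n in rows:
            eta = sum(b * xi for b, xi in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += x[i] * (y - n * p)
                for j in range(k):
                    info[i][j] += n * p * (1 - p) * x[i] * x[j]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Influweb, gender: covariates are [intercept, male indicator].
rows = [
    ([1.0, 1.0], 377, 377 + 188),  # male: participants, group size
    ([1.0, 0.0], 221, 221 + 151),  # female (reference)
]
beta = fit_logistic(rows)
print(f"OR male vs. female = {math.exp(beta[1]):.2f}")
```

With one binary covariate the model is saturated, so the exponentiated coefficient matches the crude odds ratio for gender in Table 4 (1.37).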
Ninfea cohort study
When comparing responders and non-responders in the Ninfea study, the largest variations are observed across smoking behaviors and educational levels (see Table 1). In general, for indicators related to socio-economic status such as education level, we observed higher retention among mothers belonging to the higher strata; conversely, unemployed mothers are more prone to withdraw from the study.
To study how the source of information about the existence of the study influenced the mothers' participation, we first assess the effect of exposure to each source taken one at a time (see Table 2), and then perform a multivariable logistic analysis that accounts for the different sources simultaneously and adjusts for potential confounders.
The enrollment question presented several options, which have been grouped into four categories:
• face-to-face advertisement, which includes the pre-delivery course and conversations with a gynaecologist or with personnel at a family planning clinic;
• posters;
• leaflets;
• websites.
Table 1: Factors associated with participation in the Ninfea cohort. Participation rates, crude odds ratios (OR) and 95% confidence intervals (c.i.) are shown for different strata.

| Variable (target population) | Participants (%) | Lost to follow-up (%) | OR (crude) | 95% c.i. |
|---|---|---|---|---|
| Age (2707) | | | | |
| <30 (23%) | 553 (89) | 64 (11) | 0.71 | 0.51 – 0.99 |
| [30 ; 34] (45%) | 1120 (92) | 94 (8) | 1.00 | |
| [35 ; 40] (27%) | 685 (92) | 59 (8) | 0.97 | 0.69 – 1.37 |
| >40 (5%) | 121 (92) | 10 (8) | 1.01 | 0.51 – 2.00 |
| Smoking status (2647) | | | | |
| Non-smoker (68%) | 1684 (93) | 124 (7) | 1.00 | |
| Ex-smoker (13%) | 315 (92) | 26 (8) | 0.89 | 0.57 – 1.38 |
| Smoker (19%) | 442 (89) | 56 (11) | 0.58 | 0.42 – 0.81 |
| Education (2664) | | | | |
| None/primary/secondary school (6%) | 128 (80) | 31 (20) | 0.27 | 0.17 – 0.41 |
| High school (37%) | 883 (91) | 90 (9) | 0.63 | 0.47 – 0.86 |
| University + (57%) | 1439 (94) | 93 (6) | 1.00 | |
| Working status (2659) | | | | |
| Permanent (63%) | 1557 (93) | 117 (7) | 1.00 | |
| Temporary (12%) | 287 (92) | 26 (8) | 0.83 | 0.53 – 1.29 |
| Freelance/Self-employment (13%) | 329 (94) | 22 (6) | 1.12 | 0.70 – 1.8 |
| Unemployed/Housewife/Student/Other (12%) | 273 (85) | 48 (15) | 0.43 | 0.3 – 0.6 |
| Cohabiting partner (2330) | | | | |
| Yes (97%) | 2101 (93) | 162 (7) | 1.00 | |
| No (3%) | 56 (84) | 11 (16) | 0.39 | 0.20 – 0.76 |
| Common chronic disease/condition¹ (2707) | | | | |
| Yes (50%) | 1277 (94) | 81 (6) | 1.93 | 1.45 – 2.55 |
| No (50%) | 1202 (89) | 147 (11) | 1.00 | |

¹ Common chronic diseases/conditions include mothers with diagnoses of asthma, allergic rhinitis, hyperthyroidism, anaemia, ovarian cyst, candida, urinary tract infection, hemicrania, cephalalgia or anxiety. The selection is based on a prevalence threshold: diseases/anomalies with a prevalence higher than 5% in the Ninfea cohort are included in this category.


Table 2: Participation rates, crude odds ratios (OR) and 95% confidence intervals (c.i.) for the different sources of information of individuals enrolled in the Ninfea study.

| Source of information (2555) | Participants (%) | Lost to follow-up (%) | OR (crude) | 95% c.i. |
|---|---|---|---|---|
| Face-to-face (29%) | 719 (95) | 39 (5) | 1.51 | 1.05 – 2.18 |
| Poster (10%) | 251 (95) | 14 (5) | 1.35 | 0.77 – 2.38 |
| Leaflets (42%) | 1000 (93) | 72 (7) | 1.04 | 0.76 – 1.42 |
| Websites (18%) | 426 (90) | 46 (10) | 0.61 | 0.43 – 0.87 |

Since the question allowed multiple answers, each participant could check more than one option, and more than 10% of the participants did so. For this reason, in Table 2 a participant who checked more than one category is counted once in each of those categories.
For the multivariable analysis we redefine the categories as mutually exclusive, so that each participant is counted in only one category. The categories are:
• face-to-face advertisement, defined as before;
• off-line methods, which include posters, leaflets and other, excluding participants exposed to at least one face-to-face advertisement;
• on-line methods, which include participants who learned of the study only through on-line media, excluding those also exposed to face-to-face or off-line advertisement.
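The recoding into mutually exclusive categories amounts to a priority rule: face-to-face first, then off-line, then on-line. A minimal Python sketch of that rule follows. The option labels and the function name `exclusive_category` are hypothetical, chosen for illustration; the actual wording of the questionnaire options is not given in the abstract.

```python
# Hypothetical option labels grouped by broad class.
FACE_TO_FACE = {"pre-delivery course", "gynaecologist", "family planning clinic"}
OFF_LINE = {"poster", "leaflet", "other"}
ON_LINE = {"website"}


def exclusive_category(checked):
    """Map the set of options a participant checked to a single category.

    Priority: any face-to-face option wins; otherwise any off-line option;
    otherwise on-line. Returns None if nothing recognised was checked.
    """
    checked = set(checked)
    if checked & FACE_TO_FACE:
        return "face-to-face"
    if checked & OFF_LINE:
        return "off-line"
    if checked & ON_LINE:
        return "on-line"
    return None


# A participant who checked both a leaflet and a website counts as off-line.
print(exclusive_category({"leaflet", "website"}))
```

With this rule the on-line category contains only participants who reported on-line media and nothing else, as in the definition above.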
The results of the multivariable logistic analysis are reported in Table 3.
In this case, recruitment by means of face-to-face conversations is used as the baseline. Relative to it, both other categories have OR < 1, suggesting that participants recruited by means of posters, leaflets, etc. or by following links from other websites are more prone to drop out of the study than participants enrolled face to face; the confidence interval for the off-line category (0.49 – 1.09), however, includes 1.
Because the platform underwent a migration, the question about the source of information was shown at the end of the questionnaire in the first version of the platform and at the beginning in the second version. The logistic regression analysis was therefore also performed separately on the data from the two versions (not shown). The outcome does not seem to be affected by the repositioning of the question.

Table 3: Multivariable logistic regression of participation in the Ninfea cohort.

| Variable | OR (adjusted) | 95% c.i. |
|---|---|---|
| Source of information | | |
| Face-to-face | 1.0 | |
| Off-line | 0.73 | 0.49 – 1.09 |
| On-line | 0.49 | 0.31 – 0.78 |
| Age | | |
| <30 | 1.0 | |
| [30 ; 34] | 1.13 | 0.75 – 1.71 |
| [35 ; 40] | 0.99 | 0.64 – 1.54 |
| >40 | 1.14 | 0.49 – 2.63 |
| Smoking status | | |
| Non-smoker | 1.00 | |
| Ex-smoker | 0.99 | 0.60 – 1.66 |
| Smoker | 0.58 | 0.4 – 0.84 |
| Education | | |
| None/primary/secondary school | 0.34 | 0.2 – 0.57 |
| High school | 0.72 | 0.51 – 1.02 |
| Bachelor/Master/Other | 1.00 | |


Influweb cohort
As reported in Table 4, the largest variations in retention are observed across gender, smoking status and whether the person lives alone. Interestingly, neither the estimated number of sickness episodes per year nor the diagnosis of chronic disorders (respiratory or non-respiratory) seems to affect the volunteers' response (see Table 4).
Table 4: Factors associated with participation in the Influweb cohort. Participation rates, crude odds ratios (OR) and 95% confidence intervals (c.i.) are shown for different strata.

| Variable (target population) | Participants (%) | Lost to follow-up (%) | OR (crude) | 95% c.i. |
|---|---|---|---|---|
| Gender (937) | | | | |
| Female (40%) | 221 (59) | 151 (41) | 1.00 | |
| Male (60%) | 377 (67) | 188 (33) | 1.37 | 1.04 – 1.80 |
| Age (943) | | | | |
| <30 (15%) | 78 (55) | 63 (45) | 1.00 | |
| [30 ; 39] (27%) | 155 (61) | 98 (39) | 1.28 | 0.84 – 1.94 |
| [40 ; 49] (25%) | 131 (56) | 101 (44) | 1.05 | 0.69 – 1.60 |
| [50 ; 59] (20%) | 134 (71) | 55 (29) | 1.97 | 1.25 – 3.11 |
| >60 (13%) | 103 (80) | 25 (20) | 3.33 | 1.92 – 5.76 |
| Sickness episodes (920) | | | | |
| <2 (72%) | 428 (65) | 235 (35) | 1.00 | |
| 2-5 (26.5%) | 149 (61) | 95 (39) | 0.86 | 0.64 – 1.16 |
| 5+ (1.5%) | 7 (54) | 6 (46) | 0.64 | 0.21 – 1.9 |
| Chronic disorders (905) | | | | |
| No (86%) | 489 (63) | 290 (37) | 1.00 | |
| Chr. respiratory (8%) | 49 (66) | 25 (34) | 1.12 | 0.62 – 2.02 |
| Chr. non-respiratory (6%) | 34 (65) | 18 (35) | 1.16 | 0.7 – 1.92 |
| Smoking status (914) | | | | |
| Non-smoker (77%) | 464 (66) | 239 (34) | 1.00 | |
| Occasional smoker (8%) | 39 (56) | 31 (44) | 0.65 | 0.40 – 1.06 |
| Smoker (15%) | 76 (54) | 65 (46) | 0.60 | 0.42 – 0.86 |
| Cohabitation (911) | | | | |
| Yes (87%) | 491 (62) | 304 (38) | 1.00 | |
| No (13%) | 110 (74) | 38 (26) | 1.86 | 1.19 – 2.89 |


Similarly to what was observed for the pregnant women of the Ninfea cohort, retention is higher among non-smoking participants. Furthermore, participants living alone or in a household composed only of adults are less prone to leave the study.
As for the Ninfea study, we investigated how the source of information about the existence of the study influenced the participation of the Influweb volunteers (see Table 5).
The question about the recruitment method presented the following options: television, radio, newspapers, word of mouth (from friends or from members of the Influweb team), conferences and websites. We define a “face-to-face” category that includes conferences and word of mouth.

Table 5: Participation rates, crude odds ratios (OR) and 95% confidence intervals (c.i.) for the different sources of information of individuals enrolled in the Influweb study.

| Source of information (558) | Participants (%) | Lost to follow-up (%) | OR (crude) | 95% c.i. |
|---|---|---|---|---|
| Television (20%) | 82 (74) | 29 (26) | 2.64 | 1.66 – 4.2 |
| Radio (4%) | 16 (76) | 5 (24) | 2.6 | 0.93 – 7.16 |
| Newspaper (13%) | 50 (67) | 25 (33) | 1.67 | 1.00 – 2.79 |
| Face-to-face (15%) | 44 (55) | 36 (45) | 0.95 | 0.59 – 1.53 |
| Websites (46%) | 112 (44) | 145 (56) | 0.38 | 0.27 – 0.54 |

The sample size of some categories is rather small, leading to wide confidence intervals. Furthermore, although the question allowed multiple answers, in this case more than 99% of the participants checked just one option.
As for the Ninfea dataset, we perform a multivariable analysis. We redefine the categories as mutually exclusive, so that each participant is counted in only one category. The categories are:
• face-to-face, defined as above;
• off-line media, which include television, radio, newspapers and others;
• on-line methods.
The results of the multivariable logistic analysis are reported in Table 6.

Table 6: Multivariable logistic regression of participation in the Influweb cohort.

| Variable | OR (adjusted) | 95% c.i. |
|---|---|---|
| Source of information | | |
| Face-to-face | 0.63 | 0.34 – 1.14 |
| Off-line | 1.00 | |
| On-line | 0.38 | 0.25 – 0.59 |
| Age | | |
| <30 | 1.00 | |
| [30 ; 39] | 1.38 | 0.76 – 2.52 |
| [40 ; 49] | 1.23 | 0.67 – 2.27 |
| [50 ; 59] | 2.21 | 1.13 – 4.3 |
| >60 | 3.34 | 1.52 – 7.25 |
| Gender | | |
| Male | 1.11 | 0.75 – 1.64 |
| Female | 1.00 | |
| Deprivation index | | |
| < -2 | 0.39 | 0.10 – 1.4 |
| [-2 ; 2] | 1.00 | |
| [2 ; 5] | 0.72 | 0.44 – 1.16 |
| > 5 | 0.59 | 0.27 – 1.28 |
| Smoking status | | |
| Non-smoker | 1.0 | |
| Occasional smoker | 0.85 | 0.41 – 1.76 |
| Smoker | 0.51 | 0.30 – 0.87 |

For Influweb, recruitment via off-line media (TV, radio, newspapers, etc.) is used as the baseline. Relative to it, both other categories have OR < 1. This suggests that participants recruited by means of conferences, word of mouth, etc., as well as those recruited via on-line advertisements, are more prone to drop out of the study than participants enrolled through conventional off-line media; the confidence interval for the face-to-face category (0.34 – 1.14), however, includes 1.
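Tables 3 and 6 use different baselines (face-to-face for Ninfea, off-line media for Influweb), so their point estimates cannot be compared directly. As a rough illustration, and not part of the authors' analysis, the Influweb estimates can be re-expressed relative to face-to-face recruitment by dividing each adjusted OR by the face-to-face OR. The function name `rebase` is hypothetical. Confidence intervals cannot be converted this way without the covariance of the estimates, which the abstract does not report.

```python
def rebase(odds_ratios, new_reference):
    """Re-express point-estimate odds ratios relative to a different category."""
    ref = odds_ratios[new_reference]
    return {category: value / ref for category, value in odds_ratios.items()}


# Adjusted ORs from Table 6 (Influweb), baseline = off-line media.
influweb = {"face-to-face": 0.63, "off-line": 1.00, "on-line": 0.38}

# The same estimates with face-to-face as baseline, as in Table 3 (Ninfea).
for category, value in rebase(influweb, "face-to-face").items():
    print(f"{category}: {value:.2f}")
```

On this scale the Influweb on-line estimate is about 0.60 relative to face-to-face, against 0.49 for Ninfea in Table 3, whereas the off-line estimate is about 1.59 for Influweb against 0.73 for Ninfea. Only the on-line category points in the same direction in both studies.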


The two studies examined in this work, Influweb and Ninfea, differ in several respects. One targets the general population, including all age brackets, while the other targets only pregnant women. The outcome under study is a seasonal airborne illness in one case and chronic diseases in the other. The period of follow-up is days in the case of Influweb and months in the case of Ninfea.
The means of recruitment used to enroll participants in the two studies are quite different as well (TV, radio and newspapers vs. leaflets, posters and conversations with a medical doctor). They do, however, share a common trait: the duality between off-line and on-line media. In both the Influweb and Ninfea studies, individuals recruited by means of on-line advertisement are more prone to drop out of the study than individuals enrolled through classical off-line methods.
In both cases the on-line advertisement appeared mainly on generic websites, such as newspapers, magazines and portals, and only secondarily on more health-oriented web pages. The first kind of web link generates a high rate of access from the general public, but this does not convert into effective enrollment. In future studies, advertisement on websites dedicated to health care could be considered, and its effectiveness for enrollment compared with that of generic web advertisement and of off-line media.


Conclusions
As far as age, education and smoking behavior are concerned as factors associated with lower or higher retention in follow-up studies, the results for internet-based studies are similar to those reported in the literature for traditional, non-web-based epidemiological studies. Moreover, the potential bias of Internet-based studies towards the selection of computer-literate participants does not seem to translate into higher retention at follow-up: participants enrolled by means of Internet-related media appear to be less prone to continue participating at follow-up than participants enrolled by means of more traditional off-line media.
We conclude that, for internet-based epidemiological studies, an internet-based enrollment campaign seems to be less effective than off-line advertising at enrolling volunteers who remain in the study. This suggests that, even for internet-based epidemiological studies, an on-line enrollment campaign cannot be the only means of communication.
Further analyses in different countries may enrich our understanding and support future web-based epidemiological research.




This work is licensed under a Creative Commons Attribution 3.0 License.