Study data
We conducted a secondary analysis of the WASH Benefits Bangladesh cluster randomized controlled trial. The study design and rationale41 and the primary outcomes6 of this trial have been previously published. There was no data collection software used in this pre-specified secondary analysis.
Children aged less than 3 years at enrollment living in the compound were eligible for the caregiver-reported diarrhea outcome. The analysis focused on index children in the birth cohort and other children living within the same compound who were younger than 3 years at the time of study enrollment. Children with missing outcome data were excluded. Our analysis focused only on survey rounds 1 and 2 (2014 and 2015, respectively).
The trial was conducted in rural communities in Gazipur, Kishoreganj, Mymensingh and Tangail districts. These districts are in central rural Bangladesh, where the main source of livelihood of the population is agriculture. Between May 31, 2012, and July 7, 2013, the trial enrolled pregnant women who were identified during community-based surveys and were expected to deliver in the 6 months following enrollment. Written informed consent was obtained from participants prior to enrollment.
Compounds were enrolled within 720 geographically matched clusters with eight clusters per matched block. Within a matched block, the trial randomly allocated the eight clusters to receive: improved water (W), improved sanitation (S), improved handwashing (H), improved nutrition (N), combined WSH, combined WSH + N, and a double-sized control arm. This study focused on children enrolled in control clusters and clusters that received the combined WSH or combined WSH + N intervention. Within each geographically matched block, the present analysis included four clusters: 2 controls, 1 WSH, and 1 WSH + N. The combined WSH and WSH + N clusters were compared with the control clusters.
Participants and the data collectors were not masked to intervention assignment due to the nature of the interventions. However, the data collection and intervention teams were different individuals. The results were unmasked after the primary outcome analyses were completed.
Intervention group definition
The WASH Benefits Bangladesh trial comprised seven arms: water (W), sanitation (S), handwashing (H), nutrition (N), combined WSH, combined WSH + N, and double-sized control. The water intervention encompassed the use of an insulated storage container for drinking water, along with the utilization of Aquatabs. Sanitation efforts involved the implementation of a sani-scoop, potty, and an improved double-pit pour flush latrine. Handwashing interventions included the provision of a handwashing station, a storage bottle for soapy water, and laundry detergent sachets for the preparation of soapy water. In addition, the nutrition arm entailed the use of lipid nutrient supplements (LNS) between 6 and 24 months, along with a storage container for LNS, exclusive breastfeeding until 180 days, and the introduction of complementary food at 6 months.
Here, we focused on the combined WSH, combined WSH + N and double-sized control arms to ensure statistical power. The intervention arm included the combined WSH and combined WSH + N arms. We excluded single arms to ensure a consistent WASH package in the intervention group40. We found no evidence for any added benefit of combined WSH + N with respect to combined WSH in diarrhea6.
Outcome
The outcome of interest was caregiver-reported diarrhea in the past 7 days among children enrolled in the trial. We defined diarrhea as three or more loose or watery stools in 24 h, or at least one stool with blood, based on caregiver-reported symptoms in the past 7 days. This variable was binary: 0 for no event and 1 for having the event.
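The case definition above can be written as a small coding rule. This is a minimal sketch; the function name and argument names are hypothetical and do not correspond to variables in the trial dataset.

```python
def has_diarrhea(max_loose_stools_in_24h: int, any_blood_in_stool: bool) -> int:
    """Code the 7-day caregiver-reported outcome as 1 (event) or 0 (no event).

    Hypothetical inputs: the largest number of loose or watery stools
    reported in any 24 h period of the recall window, and whether any
    stool with blood was reported in that window.
    """
    return int(max_loose_stools_in_24h >= 3 or any_blood_in_stool)
```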
Defining the effect modifiers
Socioeconomic position
The wealth index was the main indicator of socioeconomic position. It is an asset-based composite measure of wealth based on a set of household assets and characteristics. In constructing the wealth index, we used a principal component analysis of asset-based variables measured for all participants at enrollment (Supplementary Table 1)13. We excluded all WASH-related variables as recommended by UNICEF14. We excluded asset-based variables with near-to-zero variation (i.e., owning a radio and improved roof) and high levels of missingness (at least 10%, i.e., owning a clock)42,43. Missing values for continuous variables were replaced by the mean13, and for missing factor levels, another level was created. We used continuous wealth index scores in the GAMs and three quantiles (i.e., tertiles) in the effect modification analyses, where averaging over more observations improves statistical power. We pre-specified tertiles rather than quintiles or quartiles after a review of the number of children that would be analyzed in each subgroup, without consideration of the outcome or effect. Tertiles ensured adequate statistical power for the subgroup analyses while still allowing us to assess a pattern from lower to higher socioeconomic position. For the continuous wealth index scores, we specifically used the relative wealth rank of the participants in the cumulative distribution of the wealth index score, which is a continuous variable ranging from 0 to 1. Based on the factor loadings, the asset-based variables that contribute the most to the wealth variation include having one or more ‘khat’ or bed, electricity in the household, owning a TV and having one or more chairs. Meanwhile, the presence of a ‘chouki,’ a specific type of stool, has a negative factor loading, suggesting that wealthier households tend to have Western-style chairs, whereas the use of ‘choukis’ is linked to manual labor involving squatting. Another possible explanation is that wealthier households simply prefer chairs over ‘choukis.’
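To illustrate the relative wealth rank and the tertile grouping, the sketch below converts wealth index scores into their position in the empirical cumulative distribution and then into tertiles. The tie-handling and cut-point conventions are assumptions made for illustration, and the scores in the usage note are invented.

```python
from bisect import bisect_right


def relative_wealth_rank(scores):
    """Proportion of households with a score at or below each score.

    Assumed convention: empirical cumulative distribution, so ranks
    fall in the interval (0, 1].
    """
    ordered = sorted(scores)
    n = len(ordered)
    return [bisect_right(ordered, s) / n for s in scores]


def wealth_tertile(rank):
    """Map a relative rank to tertile 1 (poorest) through 3 (wealthiest)."""
    if rank <= 1 / 3:
        return 1
    if rank <= 2 / 3:
        return 2
    return 3
```

For example, with the invented scores `[-1.2, 0.4, 2.0, -0.3, 0.9, 1.5]`, the household with score `-1.2` receives rank 1/6 and falls in tertile 1, while the household with score `2.0` receives rank 1 and falls in tertile 3.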
Monsoon season
We defined monsoon season dates using weekly precipitation data from the Multi-Source Weighted-Ensemble Precipitation from GloH2O44 matched to the study cohort7,38. Monsoon season was marked by the weeks with elevated precipitation and persistent rainfall, where the rolling 5-day average was above 10 millimeters (May 27–September 27 in 2014 and April 1–September 26 in 2015), based on previous analyses of the trial7,38. The dry season comprised all other weeks. This was a binary variable, coded 0 for the dry season and 1 for the monsoon season. We calculated the monthly mean of key climate variables during the trial to characterize the monsoon versus dry seasons in the study.
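The rolling-average rule can be sketched as follows. This assumes a daily precipitation series and a trailing 5-day window; the window alignment, and how flagged days were consolidated into the season dates reported above, are not specified here, so this is an illustration only.

```python
def rolling_mean(values, window=5):
    """Trailing rolling mean; positions without a full window are None."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def monsoon_flags(daily_precip_mm, threshold_mm=10.0, window=5):
    """Return 1 where the rolling mean exceeds the threshold, else 0."""
    means = rolling_mean(daily_precip_mm, window)
    return [int(m is not None and m > threshold_mm) for m in means]
```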
Geospatial layers
WorldPop provides a high-resolution map of the total number of people per grid-cell at a 1-km resolution in Bangladesh in 201418. We estimated and mapped the total number of children under 3 years by first obtaining the proportion of those between 0 and 14 years (0.31 based on the World Bank)45 and then multiplying it by 0.21 (or \(\frac{3}{14}\)). We assumed a uniform age distribution within the 0-14 range, thus employing the proportion of 0.21.
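As a worked example of this scaling, the sketch below applies the two proportions to a grid-cell population count. The constant and function names are hypothetical, and the population value in the usage note is invented.

```python
PROP_AGE_0_14 = 0.31      # share of the population aged 0-14 years (from the text)
SHARE_UNDER_3 = 3 / 14    # uniform-age assumption within 0-14 years (about 0.21)


def children_under_3(total_population):
    """Estimated number of children under 3 years in a grid-cell."""
    return total_population * PROP_AGE_0_14 * SHARE_UNDER_3
```

A grid-cell with 1000 people would thus be assigned roughly 66 children under 3 years (1000 × 0.31 × 3/14).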
The wealth index layer was also obtained from WorldPop. Specifically, we obtained the 2011 estimates of the mean DHS wealth index score per grid square. These estimates are based on Bayesian model-based geostatistics in combination with high-resolution gridded spatial covariates and aggregated mobile phone data applied to GPS-located household survey data on poverty from the DHS Program17. We then estimated and mapped the relative wealth rank per grid square based on the WorldPop wealth layer.
The urban and rural layer was obtained from the Global Human Settlement46. We used this layer to mask urban areas from the analysis. The shapefiles were obtained from GADM47.
Statistical analyses
The analysis was by intention-to-treat. First, we conducted descriptive statistics of the baseline characteristics of the children’s mothers and of household assets. We then conducted descriptive statistics of the effect modifiers used in the analyses: wealth index scores (at baseline) and season (surveys 1 and 2).
Second, we measured socioeconomic inequalities in child diarrhea by calculating the relative index of inequality (RII) and the slope index of inequality (SII) to measure inequalities on the relative and absolute scales, respectively16. These are regression-based indicators commonly used to measure inequalities.
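One common way to obtain such regression-based indices is to regress group-level prevalence on the midpoint of each group's cumulative rank. The sketch below uses a weighted linear fit and derives both indices from it. This is an assumed formulation for illustration and may differ from the exact specification in the cited reference; the counts in the usage note are invented.

```python
def sii_rii(groups):
    """Slope and relative index of inequality from grouped data.

    groups: list of (n_children, n_cases), ordered poorest to wealthiest.
    Assumed convention: SII is predicted prevalence at rank 1 minus rank 0,
    and RII is the ratio of those two predictions.
    """
    total = sum(n for n, _ in groups)
    xs, ys, ws = [], [], []
    cumulative = 0
    for n, cases in groups:
        xs.append((cumulative + n / 2) / total)  # midpoint rank of the group
        ys.append(cases / n)                     # group prevalence
        ws.append(n)
        cumulative += n
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, (intercept + slope) / intercept
```

With three equal-sized groups and prevalences of 12%, 9% and 6% from poorest to wealthiest, the fitted slope is about -0.09 and the ratio of predicted prevalence at the top versus the bottom of the rank is about 0.33.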
Third, we estimated the effects of WASH interventions by socioeconomic position, by season, and jointly by socioeconomic position and season. To estimate the effects of combined WASH on diarrhea, with wealth tertiles and season as effect modifiers, we used a generalized linear model (GLM) with a binomial family and an identity link for the absolute effects and a log link for the relative effects. We used robust standard errors to account for clustering at the block level. We estimated the 95% confidence intervals of the effect estimates using a linear combination of regression coefficients.
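When a GLM includes the intervention indicator fully interacted with a categorical modifier, its stratum-specific point estimates equal the observed prevalence difference (identity link) and prevalence ratio (log link) within each stratum. The sketch below computes those point estimates directly from counts. It deliberately omits the cluster-robust standard errors and confidence intervals described above, and the data layout is hypothetical.

```python
def prevalence(cases, n):
    """Proportion of children with the outcome."""
    return cases / n


def effect_by_stratum(cells):
    """Prevalence difference and ratio (WASH versus control) per stratum.

    cells: {stratum: {"control": (cases, n), "wash": (cases, n)}}
    """
    out = {}
    for stratum, arms in cells.items():
        p_control = prevalence(*arms["control"])
        p_wash = prevalence(*arms["wash"])
        out[stratum] = {
            "pd": p_wash - p_control,  # absolute scale
            "pr": p_wash / p_control,  # relative scale
        }
    return out
```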
In assessing the effect of WASH on diarrhea by the continuous wealth score, we fitted a generalized additive model (GAM) to capture any non-linear relationships48 between continuous wealth index and child diarrhea. We used a non-parametric smoother, specifically a penalized cubic regression spline to avoid overfitting49, and fitted the model using restricted maximum likelihood50. Block-level random effects were included in the model to account for clustering of observations at the block level (which was the level of matched randomization). This model allowed the relationship between socioeconomic position and diarrhea to differ between the control group and the combined WASH group (reference group). We fitted the GAM51 using a Gaussian family with identity link to estimate the prevalence difference, and using a binomial family with log link to estimate the prevalence ratio, together with their 95% confidence intervals, using the tidymv package52.
Fourth, we assessed the effect modification of WASH by socioeconomic position, season and jointly by socioeconomic position and season. We primarily assessed effect modification on the additive scale, which is a measure that is more relevant in public health53. We additionally assessed effect modification on the multiplicative scale. We assessed effect modification by comparing the models with and without the interaction term through a Wald-type F test to test for statistical significance (Supplementary Text 1).
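On the additive scale, effect modification between two strata corresponds to a difference between stratum-specific prevalence differences; on the multiplicative scale, it corresponds to a ratio of prevalence ratios. The sketch below computes both contrasts from counts. It does not implement the Wald-type F test, and the input layout and numbers in the usage note are invented.

```python
def pd_and_pr(control, wash):
    """Prevalence difference and ratio (WASH versus control).

    control, wash: (cases, n) tuples for one stratum.
    """
    p_control = control[0] / control[1]
    p_wash = wash[0] / wash[1]
    return p_wash - p_control, p_wash / p_control


def interaction_contrasts(stratum_a, stratum_b):
    """Additive and multiplicative contrasts between two strata.

    Each stratum is a dict with "control" and "wash" (cases, n) tuples.
    """
    pd_a, pr_a = pd_and_pr(stratum_a["control"], stratum_a["wash"])
    pd_b, pr_b = pd_and_pr(stratum_b["control"], stratum_b["wash"])
    return {"additive": pd_a - pd_b, "multiplicative": pr_a / pr_b}
```

For instance, if prevalence falls from 8% to 4% in one stratum and from 4% to 3% in another, the additive contrast is -0.03 and the multiplicative contrast is about 0.67.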
Projecting the impact of an efficacious WASH intervention across Bangladesh to estimate preventable burden
Lastly, we projected diarrhea cases for children under 3 years prevented by the WASH intervention conditional on socioeconomic position, monsoon, age (<3 years), geographic setting (excluded urban areas) and population density (grid-cells with less than or equal to 2 children below 3 years were excluded) throughout rural Bangladesh by combining intervention trial effects from splines estimated across a wealth gradient during the monsoon season with national surfaces of wealth and population. First, we used the WorldPop-projected wealth index layer based on the 2011 DHS wealth index. Second, we used the WorldPop population layer to estimate the number of children under 3 years in each grid-cell at a 1-kilometer resolution. Third, we restricted the projections to rural areas of Bangladesh by masking urban areas defined using the Global Human Settlement. Fourth, we used GAM prevalence difference estimates between the control and WASH arms based on wealth index (specifically wealth rank) to project diarrhea cases by integrating steps 1–3. We assumed that the control group reflects the baseline WASH conditions for the projection. We applied the GAM estimates of the diarrhea prevalence difference between control and WASH arms by wealth rank to estimate the number of prevented cases by multiplying them by the number of children under 3 years indexed by wealth rank during the monsoon season. We projected effect estimates by continuous wealth rank and age, and limited our inference to rural areas and populations that were similar to the WASH Benefits study population. We projected diarrhea cases prevented per month per grid-cell at a 1-kilometer resolution in rural Bangladesh using this formula:
$$\text{Diarrhea prevented by WASH} = \text{children under 3 y} \times PD(\text{control} - \text{WASH})_{i,\,\text{monsoon},\,\text{rural}} \times 4\ \text{weeks},$$
where PD is the diarrhea prevalence difference between control and WASH and i is the wealth rank based on the continuous wealth index scores. We multiplied the estimate by 4 weeks to calculate the diarrhea cases prevented per month, since diarrhea was measured over the past 7-day period. The estimates assume that 7-day prevalence reflects incident episodes of diarrhea, which accords with a recent, high-resolution longitudinal cohort in Bangladesh that found that 1350 of 1526 (88.5%) diarrhea episodes from birth to 24 months lasted less than seven days54.
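The per-grid-cell calculation can be sketched as below, combining the projection formula with the stated exclusion of grid-cells with 2 or fewer children under 3 years. The function and argument names are hypothetical, and the prevalence difference would in practice come from the GAM evaluated at the grid-cell's wealth rank.

```python
def cases_prevented_per_month(children_u3, pd_control_minus_wash,
                              weeks=4, min_children=2):
    """Projected diarrhea cases prevented per month in one rural grid-cell.

    children_u3: estimated children under 3 years in the grid-cell.
    pd_control_minus_wash: 7-day prevalence difference (control minus WASH)
        at the grid-cell's wealth rank during the monsoon season.
    Returns None for grid-cells excluded for low child population.
    """
    if children_u3 <= min_children:
        return None
    return children_u3 * pd_control_minus_wash * weeks
```

As an invented example, a grid-cell with 50 children under 3 years and a prevalence difference of 0.02 would yield about 4 cases prevented per month.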
We then aggregated the grid-cell estimates at the district level to estimate the diarrhea cases prevented per 1000 children under 3 years per month in each district (i.e., zilas) whilst excluding estimates from urban areas. We defined urban areas based on the Global Human Settlement46. We aggregated grid-cell level standard errors estimated from the GAM within each administrative district to calculate the district-level 95% CI.
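A minimal sketch of the district-level aggregation of point estimates follows. The tuple layout is hypothetical, and the aggregation of standard errors into district-level confidence intervals is not shown because its exact form is not described here.

```python
from collections import defaultdict


def district_rate_per_1000(cells):
    """Cases prevented per 1000 children under 3 years per month, by district.

    cells: iterable of (district, children_u3, cases_prevented, is_urban).
    Urban grid-cells are skipped before summing.
    """
    children = defaultdict(float)
    prevented = defaultdict(float)
    for district, n_children, n_cases, is_urban in cells:
        if is_urban:
            continue
        children[district] += n_children
        prevented[district] += n_cases
    return {d: 1000 * prevented[d] / children[d]
            for d in children if children[d] > 0}
```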
Sensitivity analysis
We conducted a sensitivity analysis using maternal education instead of wealth index as a measure of socioeconomic position to estimate diarrhea prevalence, prevalence ratio, and prevalence difference. We also used precipitation data from the Multi-Source Weighted-Ensemble Precipitation from GloH2O44. We created a binary rainfall variable (no heavy rain versus heavy rain) with a 1-week lag, which indicates whether there was at least one day in the prior week where total precipitation was above the 80th percentile of all daily totals of precipitation. We considered this indicator based on a previous analysis of the trial that demonstrated strong associations between this indicator and the effect of the WASH intervention on child diarrhea7. We assessed the joint effect modification between socioeconomic position and rainfall by comparing the models with and without the interaction term through a Wald-type F test. We also projected the diarrhea cases prevented for children under 3 years throughout rural Bangladesh by wealth averaged over the entire year by combining trial estimates from the GAM that captured non-linear patterns.
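The heavy-rain indicator can be illustrated as follows. The sketch assumes a nearest-rank percentile and takes the "prior week" to be the 7 days immediately before a reference day; both choices, and the function names, are assumptions made for illustration rather than details taken from the analysis.

```python
def percentile_nearest_rank(values, pct=80):
    """Nearest-rank percentile using integer arithmetic for the rank."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # ceiling of pct * n / 100
    return ordered[max(rank, 1) - 1]


def heavy_rain_prior_week(daily_precip_mm, day_index, threshold_mm):
    """Return 1 if any of the 7 days before day_index exceeded the threshold."""
    start = max(day_index - 7, 0)
    window = daily_precip_mm[start:day_index]
    return int(any(p > threshold_mm for p in window))
```

In use, the threshold would be computed once from the full series of daily totals and then applied to the week preceding each measurement.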
Inclusion and ethics
The protocol of the original study (Clinical Trial Registration NCT01590095) was approved by the Ethical Review Committee at the International Centre for Diarrhoeal Disease Research, Bangladesh (PR-11063), the Committee for the Protection of Human Subjects at the University of California, Berkeley (2011-09-3652), the Institutional Review Board at Stanford University (25863) and at the University of California, San Francisco (22-36722).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.