SPATIAL PATTERNS OF ADVERSE BIRTH OUTCOMES AMONG BLACK AND WHITE WOMEN IN MASSACHUSETTS – THE ROLE OF POPULATION-LEVEL AND INDIVIDUAL-LEVEL FACTORS

This study explores the spatial distribution of adverse birth outcomes (ABO), defined as low birth weight (<=2500 g) and preterm deliveries (gestational age <37 weeks), in black and white mothers in the state of Massachusetts, USA. It uses 817,877 individual birth records from 2000-2014 aggregated to census tracts (census enumeration units with a population of approximately 4500 people). To account for small numbers of births in some tracts, an Empirical Bayes smoother algorithm is used to calculate ABO rates. The study applies ordinary least squares (OLS) and spatial regression to examine the relationship between ABO rates, seven individual-level factors from birth certificates and nine population-level factors (income level, education level, race) from census data. The explanatory power of these factors varies between the two races. In models based only on individual-level factors, all seven factors were significant (p<0.05) in the black mothers’ model while only three were significant in the white mothers’ model. Models based only on population-level variables produced better results for white mothers than for black mothers. Models that included both individual and population-level variables explained 40% and 29% of ABO variance for black and white women respectively. The findings from this study give health-care providers and health-care policy-makers important information regarding ABO rates and the contributing factors at a local level, thus enabling them to isolate specific areas with the highest need for targeted interventions.


INTRODUCTION
Adverse birth outcomes (ABO), which include preterm deliveries (gestational age <37 weeks) and low birth weight deliveries (birth weight <=2,500 g), are a complex public health issue worldwide. According to the World Health Organization, about 15 million babies are born preterm each year, and this number is on the rise, including in some high-income countries (World Health Organization 2018). In the U.S., the rate of preterm deliveries (PTD) in 2017 was 9.93% (up from 9.57% in 2014). The rate of low birth weight (LBW) deliveries has also increased since 2014, rising from 8.00% to 8.28% in 2017 (Martin et al. 2018).
While the underlying cause of this increased prevalence of ABO is uncertain, potential hypotheses include a cultural transition to older age of women conceiving, increased use of assisted conception methods, and elevated prevalence of Cesarean sections and induced labor methods (American College of Obstetricians and Gynecologists 2016; Institute of Medicine 2007).
In the United States, the rates of PTD and LBW are higher in non-Hispanic black women than in non-Hispanic white women; in 2017, the rates of PTD and LBW were 13.93% and 13.89% in black women and 9.05% and 7.00% in white women, respectively (Martin et al. 2018). Despite many studies, the underlying causes of this disparity are not well understood (Burris and Hacker 2017; Kent et al. 2013; Lu and Halfon 2003; Manuck 2017).
Health risks associated with an ABO are severe and include an increased likelihood of respiratory problems, brain hemorrhage, heart complications, cerebral palsy, learning disabilities, and delayed motor and social skills (Centers for Disease Control and Prevention 2016; Clark et al. 2009; March of Dimes 2013; Rosenthal and Lobel 2011). Both PTD and LBW are also associated with increased infant mortality. According to the Centers for Disease Control and Prevention, PTD and LBW accounted for about 17% of infant deaths in 2015 (https://www.cdc.gov/reproductivehealth/maternalinfanthealth/pretermbirth.htm).
Previous studies have investigated the role that an individual woman's health status may have on birth outcomes. Their results showed that diabetes, hypertension, substance use, previous ABO, and lower socioeconomic status increase the risk of an ABO (American College of Obstetricians and Gynecologists 2016; Berghella 2007; Goldenberg and Culhane 2007; Honein et al. 2009). It is important to consider the individual mother's health circumstances, though modeling ABO through these variables alone often does not capture the full risk present during pregnancy.
Several studies used various regression techniques to analyze the relationship between ABO rates, mother-level health characteristics and population-level characteristics (income, poverty, education, racial composition, population density, and environmental exposures). The geographical scope and unit of analysis varied from counties for the entire U.S. (Carmichael et al. 2014; DeFranco et al. 2008) to metropolitan areas only (Kramer and Hogue 2008), to zip (postal) codes and regional units within a particular state or province (Insaf and Talbot 2016; Kent et al. 2013; Meng, Thompson et al. 2013). Researchers found that among mother-level characteristics, previous PTD, chronic hypertension, low pre-pregnancy weight, diabetes, maternal smoking during pregnancy, and elevated maternal age at delivery were associated with the likelihood of an ABO. Among population-level characteristics, percent population in poverty, percent with low education level, racial composition, and racial segregation were found to be significantly correlated with PTD and LBW (DeFranco et al. 2008; Insaf and Talbot 2016; Kent et al. 2013; Kramer and Hogue 2008).
The mechanisms, or pathways, through which population-level factors translate into individual risk factors are complex and not fully understood. Research suggests that psycho-social stressors associated with low socio-economic status (stressful work and living environment, reduced levels of social and financial support, deprivation, low access to health care facilities, exposure to physical hazards, etc.) affect individual well-being and lead to depression and to unhealthy behaviors, such as smoking, drinking, substance abuse, delayed prenatal care and poor diet. These stressors also cause changes in neuroendocrine and immunological processes, increasing the risk of adverse birth outcomes (Meng, Thompson et al. 2013).
Although previous studies have investigated potential correlations between both socioeconomic and health related variables and birth outcomes, none evaluated correlations using more than a decade of individual birth data for an entire state at a detailed spatial scale (census tract). Census tract is the smallest geographical unit for which detailed socio-economic information is available from the Census Bureau.
To address these gaps in previous research, this study aims (1) to analyze the geographic variability of ABO among black and white women in the state of Massachusetts, and (2) to examine the relationship between individual, area-level socio-demographic, and health-related factors and ABO rates at the census tract level. Massachusetts ranges from densely populated metropolitan areas (Boston) and their suburbs to sparsely populated rural areas in the west and presents a wide variety of environmental and socio-demographic conditions. The state is divided into 14 counties, consisting of 39 cities and 312 towns. In Massachusetts, the distinction between a city and a town is based on the form of government chosen by the residents (https://www.sec.state.ma.us/cis/cislevelsofgov/ciscitytown.htm).

MATERIALS AND METHODS
We obtained individual birth data for 2000-2014 from the Massachusetts Department of Public Health and selected only singleton live births to non-Hispanic white and non-Hispanic black mothers for the analysis. Birth data were geocoded by the Department of Public Health to the census block level (the smallest enumeration unit in the U.S. Census). Six percent of births lacked census block information and were excluded from the analysis. Our final dataset included 725,582 births to white mothers, and 92,295 births to black mothers.
To facilitate the analysis of associations with socio-economic and demographic data, individual birth data were aggregated to census tracts. Census tract boundaries were obtained from the Office of Geographic Information, Commonwealth of Massachusetts (www.mass.gov/mgis/massgis.htm). There are 1472 census tracts with an average population of 4500 people in each census tract. The tracts that did not have any singleton live births to non-Hispanic white or non-Hispanic black mothers during the entire 15 years of study were excluded from the analysis, leaving 1467 census tracts for the analysis of births to white mothers, and 1449 tracts for the analysis of births to black mothers.
Each birth was assigned to a category based on birth weight and gestational age as follows: low birth weight (weight <= 2500 g) or normal birth weight (weight > 2500 g), and a full-term birth (gestational age >= 37 weeks) or preterm birth (gestational age < 37 weeks). A birth that was either preterm or low birth weight was considered an ABO. Total number of births and the number of ABOs for each census tract were calculated for the entire period (all 15 years combined).
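The classification and tract-level counting described above can be sketched in a few lines of Python. This is an illustrative sketch only: the record layout (tract identifier, birth weight in grams, gestational age in weeks) and the function names are assumptions, not the structure of the actual birth records.

```python
from collections import defaultdict

LBW_MAX_GRAMS = 2500   # low birth weight: weight <= 2500 g
PRETERM_WEEKS = 37     # preterm: gestational age < 37 weeks


def is_abo(weight_g, gest_weeks):
    """A birth is an ABO if it is low birth weight or preterm (or both)."""
    return weight_g <= LBW_MAX_GRAMS or gest_weeks < PRETERM_WEEKS


def tract_counts(births):
    """Return {tract_id: (total_births, abo_births)} for all years combined.

    `births` is an iterable of (tract_id, weight_g, gest_weeks) tuples;
    this layout is hypothetical.
    """
    counts = defaultdict(lambda: [0, 0])
    for tract_id, weight_g, gest_weeks in births:
        counts[tract_id][0] += 1
        if is_abo(weight_g, gest_weeks):
            counts[tract_id][1] += 1
    return {tract: tuple(c) for tract, c in counts.items()}


# Made-up records for illustration only.
sample = [("A", 2500, 39), ("A", 3200, 40), ("B", 3000, 36), ("B", 3100, 38)]
counts = tract_counts(sample)
```

In practice the same counting would be run separately for births to black and white mothers.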
Each birth record also contained mother-level data, such as mother's age, smoking during pregnancy, presence of gestational diabetes, gestational hypertension, chronic hypertension, and previous preterm delivery. Using this data, we calculated for each census tract the percentage of mothers who had these health conditions and percent of teenage (younger than 20 years) and older (older than 35 years) mothers, separately for non-Hispanic black and non-Hispanic white mothers. This data is summarized in Table 1.
To explore geographic variation of ABO outcomes in more detail, we obtained boundaries of urban, suburban, town, and rural locales from the National Center for Education Statistics (https://nces.ed.gov/programs/edge/Geographic/LocaleBoundaries) and overlaid them with census tract boundaries. Urban locales correspond to principal cities with a population over 100 thousand people; suburban locales have a population between 50 and 100 thousand people and are located within the urbanized area adjacent to principal cities; towns are locales with a population between 2.5 and 50 thousand people, located outside an urbanized area; and rural locales are all remaining territories. For the purposes of this research, we included towns in the rural category because in Massachusetts both are similar in population density and types of land use. Thus, we designated each census tract as either primarily urban, suburban, or rural.
Population-level socioeconomic and demographic factors relevant to this study (education level, income, race, population density) were obtained from the 2006-2010 American Community Survey and the 2010 Census at the census tract level (https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml). Table 1 provides detailed information about each variable. These variables were selected based on the findings of previous studies of ABO (Carmichael et al. 2014; DeFranco et al. 2008; Insaf and Talbot 2016; Kent et al. 2013; Kramer and Hogue 2008).
Raw ABO rates were calculated for each census tract (dividing the number of ABO births by the total number of births), separately for each race. This approach produced potentially unreliable rates in areas with a small number of births. For example, if there were only two births in a census tract, and one of them was low birth weight, then the resulting ABO rate was 50%. This is often referred to as the "small numbers problem" or variance instability. One common approach to addressing this problem is to calculate adjusted, smoothed rates using Bayesian statistics. Using this approach, an estimate is obtained by combining the raw rates with "prior" information, such as the overall mean for the entire study area, i.e. the entire state (Anselin et al. 2006a). This smoothing method adjusts rates toward the overall mean, reduces variance instability, and produces more robust and reliable rate estimates even for small samples (Kang et al. 2016; Mollalo et al. 2017). Adjusted ABO rates were calculated for each census tract with the Empirical Bayes smoother algorithm in the GeoDa software (Anselin et al. 2006b), and these adjusted ABO rates were used in our analyses (Figure 1).
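To make the shrinkage idea concrete, the following Python sketch implements a global Empirical Bayes rate smoother using a method-of-moments estimate of the prior variance. The exact formula applied by GeoDa is not given in the text, so this formulation is an assumption, and the function name and input values are hypothetical.

```python
def empirical_bayes_rates(events, births):
    """Shrink each tract's raw rate toward the overall (statewide) rate.

    `events` and `births` are parallel lists of ABO counts and total births
    per tract; every tract is assumed to have at least one birth.
    """
    total_births = sum(births)
    theta = sum(events) / total_births          # prior mean: overall rate
    mean_births = total_births / len(births)
    raw = [e / b for e, b in zip(events, births)]

    # Method-of-moments estimate of the prior variance, floored at zero.
    spread = sum(b * (r - theta) ** 2 for r, b in zip(raw, births)) / total_births
    phi = max(spread - theta / mean_births, 0.0)

    smoothed = []
    for r, b in zip(raw, births):
        denom = phi + theta / b
        w = phi / denom if denom > 0 else 0.0   # weight on the tract's own rate
        smoothed.append(w * r + (1 - w) * theta)
    return smoothed


# Made-up counts: the first tract has 2 births and 1 ABO (raw rate 50%).
rates = empirical_bayes_rates([1, 70, 35, 120], [2, 1000, 500, 1000])
```

With these made-up counts, the two-birth tract is pulled almost entirely toward the overall rate, while tracts with many births stay close to their raw rates.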
To contextualize ABO rates further, we used spatial selection tools in GIS and calculated ABO rates separately for urban, suburban and rural environments. A similar study found that urban areas in the state of Alabama had higher ABO rates (Kent et al. 2013), and we wanted to see if the same is true in Massachusetts. To characterize the spatial pattern of ABO rates, a Global Moran's Index was calculated for both races. This index classifies the spatial pattern of a measured value (i.e. ABO rate) as random, clustered or dispersed, based on the index value and corresponding z-score. If the pattern is clustered or dispersed, it indicates that the observed pattern is unlikely to be due to random chance and that an underlying spatial process leads to a particular spatial pattern (Mitchell 2005).
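As an illustration of the statistic, the Python sketch below computes the Global Moran's Index for binary contiguity weights. The neighbor structure and values are made up, and the z-score used in the study (which requires the expected value and variance of the index) is not computed here.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary weights.

    `values` is a list of rates indexed by tract position; `neighbors` maps
    each index to the indices of its neighbors (assumed symmetric).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                    # deviations from the mean
    s0 = sum(len(nb) for nb in neighbors.values())    # sum of all weights
    cross = sum(z[i] * z[j] for i, nb in neighbors.items() for j in nb)
    return (n / s0) * cross / sum(d * d for d in z)


# Four hypothetical tracts in a row: 0-1-2-3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
clustered = morans_i([1, 1, 0, 0], chain)   # similar values adjacent: positive
dispersed = morans_i([1, 0, 1, 0], chain)   # alternating values: negative
```

Positive values indicate that similar rates tend to be adjacent (clustering), and negative values indicate a dispersed pattern.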
To determine the strength and the nature of the relationship between ABO rates and population-level and individual-level factors, we used regression techniques. We applied multivariate ordinary least squares (OLS) regressions with the average rate of ABO for all 15 years as the dependent variable. Three separate regressions were run for each race: regressions with only individual-level variables (health-related data from birth certificates), with only population-level variables (Census data), and with all variables together. The first run of each regression identified statistically significant variables (at the 95% confidence level), and then only these variables were included in the final run of each regression. Independent variables included in the final OLS models are shown in Figure 2.
After each regression run, a Moran's I was calculated to test for spatial autocorrelation of the residuals. Z-scores were significant in all six regressions, indicating that residuals were not randomly distributed and suggesting a model misspecification. To address this problem, the two best-fit OLS regressions were selected (one for black and one for white births) and the same variables were used as the input into a spatial regression model. We followed Anselin's (2005) process for selecting the appropriate spatial regression model. This process compares multiple test statistics calculated in the GeoDa software and indicates which model, spatial lag or spatial error, would be the best choice. Both models include an additional term that explicitly captures spatial relationships in the data. The spatial lag model includes a spatially lagged dependent variable as an additional independent variable in the analysis, and the spatial error model includes a spatial autoregressive error term (Anselin 2005).
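The three model forms can be written compactly as follows. The notation is the conventional one for these models and is an assumption here, since the text gives no equations: y is the vector of tract ABO rates, X the matrix of explanatory variables, W the spatial weights matrix, and ε an independent error term.

```latex
% OLS baseline
y = X\beta + \varepsilon

% Spatial lag model: the spatially lagged dependent variable Wy enters as a regressor
y = \rho W y + X\beta + \varepsilon

% Spatial error model: the error term follows a spatial autoregressive process
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
```

Under this notation, ρ and λ are the spatial term coefficients: ρ measures the influence of neighboring tracts' ABO rates, and λ measures spatial dependence in the unexplained part of the model.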

RESULTS
Similar to previous studies, we found a racial disparity in ABO rates between white and black women; the statewide raw 15-year average ABO rate was 7% for white mothers and 12% for black mothers. When analyzed at the census tract level, maximum ABO rates were also very different between the two races, with the maximum rate for white women (18%) being much lower than for black women (32%). When stratified by urban-suburban-rural locations, the tract-level ABO rates for both white and black mothers showed the highest values in areas designated as urban and the lowest values in rural environments (Table 2).

Table 2. Mean ABO rate (standard deviation) per census tract for black and white mothers in urban, suburban and rural locations
Results of global Moran's I analysis indicate statistically significant clustering of ABO rates for both races (white Moran's I z-score = 4.86; black Moran's I z-score = 17.33). Maps of ABO rates confirm this finding; rates do not appear to be randomly distributed and there are several clusters of high values for both races located in different parts of the state (Figure 1).
Regression models based on individual-level variables explained 39% of the variability in ABO rates for black women and a considerably lower amount (23%) for white women (Table 3). Models that contained only population-level variables explained similar percentages of the variance for black (21%) and white women (24%). Mixed models, containing both individual-level and population-level variables, explained 40% and 29% of the variance for black and white women, respectively.
Of the seven variables related to mothers' health and age, all were statistically significant at the 95% confidence level in the individual-level model for black mothers, and three in the model for white mothers. Percent teenage mothers had the largest effect on the ABO rates in black mothers (standardized coefficient 0.235), followed by the percent with chronic hypertension (standardized coefficient 0.182). For white mothers, the percent of mothers smoking and percent teenage mothers were the two strongest predictors (standardized coefficients of 0.361 and 0.143, respectively).
Of the nine population-level variables, education and income-related variables were statistically significant for both races at the 95% confidence level, as was the percent of the corresponding race in the census tract. Unemployment was not a significant variable in any model.
When individual-level and population-level variables were combined in the mixed model, some of the variables remained statistically significant while others did not. For example, percent race and percent with chronic hypertension remained significant for both white and black mothers. For black mothers, all individual-level variables, except for one (percent smoking) remained significant. On the other hand, only two individual-level variables remained significant for white mothers in this mixed model. For population-level variables, an opposite pattern is present in the mixed model; while four variables remained significant for the white mothers' model (two education variables, median household income and percent white), only one variable (percent black) remained significant for black mothers' model.
The signs of the regression coefficients were mostly in agreement with what we expected (e.g., higher percent college-educated and higher median household income were associated with lower ABO rates for both black and white mothers). Per capita income was statistically significant for white mothers, but its coefficient was the opposite of what was expected (i.e., higher per capita income was associated with higher ABO rate). Percent race was a significant variable in all models in which it was included, but had different signs: a positive sign for black mothers, and a negative sign for white mothers.

Table 3. Standardized coefficients in the OLS models (individual-level, population-level, and mixed) for black and white mothers

                                              Individual-level    Population-level    Mixed
Variable                                      Black     White     Black     White     Black     White
Population-level variables
Percent with a Bachelor's Degree or Higher    n/a       n/a       -0.075    -0.301    n/a       -0.099
Median Household Income                       n/a       n/a       -0.108    -0.176    n/a       -0.152
Percent Race                                  n/a       n/a       0.241     -0.173    0.120     -0.230
Percent with Less than High School Diploma    n/a       n/a       n/a       -0.054*   n/a       -0.080
Percent Below Poverty                         n/a       n/a       0.143     0.108*    n/a       n/a
Per Capita Income                             n/a       n/a       n/a       0.116     n/a       n/a
Median Earnings, Fulltime Female Employees    n/a       n/a       n/a       n/a       n/a       0.107
Population Density                            n/a       n/a       0.062*    n/a       n/a       n/a
Individual-level variables
Diabetes                                      0.113     n/a       n/a       n/a       0.115     n/a
Pregnancy-Related Hypertension                0.152     n/a       n/a       n/a       0.120     n/a
Chronic Hypertension                          0.182     0.138     n/a       n/a       0.150     0.080
Previous Preterm Infant                       0.091     -0.044*   n/a       n/a       0.110     n/a
Cigarette Use                                 0.064     0.361     n/a       n/a       n/a       0.338
Older Mom                                     0.155     n/a       n/a       n/a       0.137     n/a
Teen Mom                                      0.235     0.143     n/a       n/a       0.232     n/a

We selected the two models with the highest R2 (the mixed models) and used the GeoDa software to calculate diagnostic statistics for spatial dependence. These tests showed that residuals were spatially autocorrelated in both models (the z-score for Moran's I was 2.46 for the black model and 2.48 for the white model). Following the model selection decision rule outlined by Anselin (2005), a spatial lag model was developed for black mothers, and a spatial error model for white mothers.
To create the spatial term in the equations, we experimented with different weights configurations and selected the weights that produced the best-fitting regressions. We applied queen first-order contiguity weights to the black model, and queen second-order contiguity weights to the white model. In first-order queen contiguity, census tracts that share common edges or corners are considered neighbors.
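The difference between first-order and second-order queen contiguity can be illustrated on a regular grid, where each cell stands in for a census tract. This Python sketch is a simplification: real tracts are irregular polygons, the function names are hypothetical, and whether second-order weights also include first-order neighbors is a software option that the text does not specify, so it is exposed as a parameter.

```python
def queen_neighbors(rows, cols):
    """First-order queen contiguity on a grid: cells sharing an edge or corner."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            nbrs[(r, c)] = {
                (r + dr, c + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
            }
    return nbrs


def second_order(nbrs, include_lower=False):
    """Second-order contiguity: neighbors of neighbors.

    With include_lower=False only the 'pure' second-order ring is returned;
    with include_lower=True first-order neighbors are kept as well.
    """
    out = {}
    for cell, first in nbrs.items():
        reach = set()
        for n in first:
            reach |= nbrs[n]
        reach.discard(cell)                  # a cell is not its own neighbor
        out[cell] = reach | first if include_lower else reach - first
    return out


first = queen_neighbors(5, 5)
second = second_order(first)
```

Second-order weights extend each tract's neighborhood outward, so the spatial term averages over a wider area than first-order weights do.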
In both spatial regression models, all input variables remained statistically significant, and their coefficient signs were the same as in the OLS models. Spatial terms in both regressions (spatial lag term in the black model and spatial autoregressive error term in white model) had statistically significant coefficients with positive signs, representing spatial influence of the neighboring census tracts on ABO rates (Table 4).
Spatial regression models produce a pseudo-R2, which is not directly comparable to the R2 from OLS models (Anselin 2005), so we used Log-Likelihood and AIC as measures of fit to compare these models to OLS models. In both spatial regressions, the AIC value decreased and the Log-Likelihood value increased, suggesting an improvement of fit for the spatial models (Table 4).
After running the spatial models, statistically significant spatial autocorrelation was no longer present in the residuals (the residuals' Moran's I z-score was -0.1566 for the black model and 1.6653 for the white model). This indicates that including the spatially lagged dependent variable term in the black model and the spatially autoregressive error term in the white model accounted for the spatial autocorrelation detected in the OLS residuals.
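For reference, the standard definition of the Akaike Information Criterion links the two measures of fit. The symbols are assumptions introduced here: k is the number of estimated parameters and ln L is the maximized log-likelihood.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

A lower AIC indicates a better trade-off between fit and complexity, so a spatial model improves on its OLS counterpart only if the gain in log-likelihood outweighs the penalty for its additional spatial parameter.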

DISCUSSION
In this study, we found that the ABO rates in Massachusetts varied considerably across census tracts and their distributions were very distinct for white and black mothers. Urban locations had higher ABO rates than suburban and rural locations. ABO rates with similar values were clustered in both races, but stronger clustering was observed in black mothers, as evidenced by their much higher z-score for Moran's I (17.33 vs. 4.86).
Most previous studies conducted the analysis at the scale of counties, metropolitan statistical areas or zip codes. Our study used census tract as the unit of analysis, because census tracts are small enough to allow for modeling of local variation in ABO rates. This spatial scale also facilitates the linking of the individual-level factors with census data and provides enough spatial detail to design a meaningful intervention or develop policy at regional or city/town level (Insaf and Talbot 2016).
When taken together, the selected socio-economic, demographic and health-related factors explained close to 30% of variability in ABO rates in white mothers, and 40% in black mothers. When analyzed separately, individual-level and population-level factors explained nearly the same amount of variability in ABO rates for white mothers (23% and 24%, respectively). For black mothers, individual-level factors explained almost twice the amount of ABO variability explained by the population-level factors (39% vs. 21%).
Among individual-level factors, smoking was the strongest predictor for white mothers and percent teenage mothers for black mothers (both in individual-level model, and in mixed model). To illustrate how these important findings can be useful to the health care providers and policy makers, the top 10% of tracts with the highest smoking rates for white mothers were selected, and 34 towns and cities that contain these census tracts were identified. The same process was repeated, and 31 towns and cities with the highest percent of teenage mothers for black mothers were identified. Twenty towns and cities were included in both lists, meaning that these locations have the highest smoking rates among white mothers during pregnancy and the highest rates of teen births to black mothers (Table 5). Health care providers and policy-makers in these towns and cities, armed with the findings from this study, could design targeted public outreach programs aimed at reducing smoking, especially among women, and delaying pregnancy among teenage women.
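The selection procedure described above can be sketched in Python as follows. The sketch assumes that "top 10%" means the highest-ranked tenth of tracts (rounded up) and that a tract-to-town lookup is available; both the function names and the data layout are hypothetical.

```python
import math


def top_fraction(values_by_tract, fraction=0.10):
    """Return the set of tracts with the highest values (top `fraction`)."""
    k = max(1, math.ceil(len(values_by_tract) * fraction))
    ranked = sorted(values_by_tract, key=values_by_tract.get, reverse=True)
    return set(ranked[:k])


def towns_for(tracts, town_of):
    """Map a set of tracts to the set of towns and cities containing them."""
    return {town_of[t] for t in tracts}


# Made-up tract values and town lookup for illustration only.
smoking_white = {f"t{i}": i / 100 for i in range(10)}
teen_black = {f"t{i}": (9 - i) / 100 for i in range(10)}
town_of = {f"t{i}": "Town A" if i in (0, 9) else f"Town {i}" for i in range(10)}

smoking_towns = towns_for(top_fraction(smoking_white), town_of)
teen_towns = towns_for(top_fraction(teen_black), town_of)
both_lists = smoking_towns & teen_towns   # towns appearing in both lists
```

The set intersection in the last line corresponds to the towns and cities that appear in both lists.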
Our study had several limitations related to data sources and methodology. While birth data from the Massachusetts Department of Public Health are very detailed, we could not verify their quality and reliability. These data did not include information about the father's health. The definition of some variables changed in the middle of our study period, rendering them unusable in the analysis (e.g., marital status and mother's education). We included population-level variables measuring education level, but using mother's education level data would have been more relevant. Another limitation of the study is its inability to include other factors that could have influenced ABO rates, such as other health conditions of the mother, psychosocial factors and potential environmental exposures. While many studies focus on either PTD or LBW, we combined them in our analysis, recognizing that each may have a different, albeit overlapping, set of individual and population-level factors. We combined them in order to increase the number of ABO outcomes in each census tract, thus alleviating the "small numbers problem" and increasing the stability of our Empirical Bayes rate estimates.
These limitations notwithstanding, our study makes important contributions to the growing body of literature. It is the first to analyze ABO at the census tract level for an entire state, using 15 years of individual birth records. Additionally, this study is unique as it examines correlations of mother's health factors and socioeconomic factors separately as well as through a mixed model, which considers potential influences of both sets of characteristics.

Table 4. OLS and spatial regressions: Measures of fit and spatial term coefficients
Spatial term coefficient (z-value): 0.1900 (5.64) in the black mothers' spatial lag model; 0.3614 (6.06) in the white mothers' spatial error model
The findings from our study provide Massachusetts health care providers and health-care policy makers with information regarding ABO rates and the contributing factors at a local level, giving them the ability to isolate specific areas with the highest need for targeted interventions. Examples of community-oriented public health interventions include improving access to healthy food and to primary prenatal care in low socio-economic areas, improving the quality of health care accessible to expectant mothers, and increasing social support in local communities (Lorch and Enlow 2016).
These interventions would help alleviate the impacts of some psycho-social stressors and reduce the risk of adverse birth outcomes. In support of these interventions, an analysis of annual ABO rates through time at the census-tract level would also be very useful.