NIH to Fund Our Study of Children’s Obesity in LA Using a Systems Science Approach

Just heard good news!  NIH is funding our study with Professors May Wang and Mike Prelip at UCLA to apply Systems Science modeling approaches to data from the Women, Infants, and Children (WIC) program in Los Angeles.  This study will evaluate the efficacy of community-based strategies to reduce obesity in this large cohort.  This new study will build upon causal inference modeling approaches my group has been applying with the help of Biostats Professor Alan Hubbard at Berkeley.  I look forward to working with the many collaborators from UCLA, UCB, Center for Weight and Health, PHFE WIC, LA County Public Health, and Samuels and Associates.

Agent-Based Model of Obesity

Dunk the Junk Anti-Soda Mural in Richmond, CA

With funding from NIH NIDDK, we have developed an agent-based model of obesity that considers changes in individuals’ diet and physical activity over time.  Using data from the National Heart, Lung, and Blood Institute Growth and Health Study (NGHS) cohort, we have applied both novel causal inference modeling approaches and agent-based modeling.  Along the way, this study has quantified and mapped food environment exposures for the girls in the study, and has assessed the associations between changes in the food environment and diet and obesity risk.
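To make the idea of an agent-based model concrete, here is a minimal, illustrative sketch in Python (not R, and not the study's model). Each agent carries a BMI, a daily energy intake, and a daily activity expenditure; diet and activity drift each year, and the resulting energy surplus nudges BMI. Every name and number here (Agent, step, simulate, BASELINE_NEED_KCAL, KCAL_PER_BMI_UNIT, the starting distributions) is a hypothetical placeholder chosen only so the example runs. None of them come from the NGHS data or from our fitted equations.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    bmi: float            # body mass index
    intake_kcal: float    # daily energy intake
    activity_kcal: float  # daily energy spent on physical activity


# Hypothetical constants, chosen only to make the sketch run.
BASELINE_NEED_KCAL = 1800.0   # assumed daily energy need before activity
KCAL_PER_BMI_UNIT = 20000.0   # assumed surplus needed to gain one BMI unit


def step(agent, rng):
    """Advance one agent by one simulated year."""
    # Diet and activity follow a small random walk from year to year.
    agent.intake_kcal += rng.gauss(0.0, 25.0)
    agent.activity_kcal = max(0.0, agent.activity_kcal + rng.gauss(0.0, 10.0))
    # Convert the year's energy surplus (or deficit) into a BMI change.
    daily_surplus = agent.intake_kcal - BASELINE_NEED_KCAL - agent.activity_kcal
    agent.bmi += daily_surplus * 365 / KCAL_PER_BMI_UNIT


def simulate(n_agents=100, years=10, seed=0, intake_shift=0.0):
    """Return the mean BMI of the population at baseline and after each year.

    intake_shift adds a fixed number of kcal/day to every agent's intake,
    which stands in for a scenario such as a change in food environment.
    """
    rng = random.Random(seed)
    agents = [
        Agent(
            bmi=rng.gauss(18.0, 2.0),
            intake_kcal=rng.gauss(2000.0, 150.0) + intake_shift,
            activity_kcal=rng.gauss(200.0, 50.0),
        )
        for _ in range(n_agents)
    ]
    trajectory = [sum(a.bmi for a in agents) / n_agents]
    for _ in range(years):
        for agent in agents:
            step(agent, rng)
        trajectory.append(sum(a.bmi for a in agents) / n_agents)
    return trajectory
```

Calling simulate(seed=1) and simulate(seed=1, intake_shift=50.0) gives two trajectories that differ only because of the scenario, since both runs use the same random draws. This is the basic pattern behind comparing scenarios in a simulation study.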

The National Heart, Lung, and Blood Institute Growth and Health Study (NGHS)

The NGHS was a 10-year prospective cohort study of African American and White girls, initially aged 9 and 10 years, recruited from three main sites in the United States beginning in 1987. The study aimed to collect longitudinal data to better understand both physiological and behavioral factors that may contribute to obesity development in Black females.  Our current study concerns the cohort of 887 girls from one of the sites, University of California, Berkeley, who were recruited from public and private schools in the Richmond Unified School District.

Food Environment Assessment

West Contra Costa food environment from NETS

An important component of our study was to evaluate different options for characterizing food environment exposures for the NGHS cohort.  We collected, cleaned, and compared various databases, including Yellow Pages, Internet listings on Yelp, parcel data, local restaurant inspection data, and Dun and Bradstreet data.  Comparisons of these databases in relation to neighborhood sociodemographic characteristics were presented at the American Public Health Association (Hua, et al., 2011 presentation).  We ultimately licensed and used the National Establishment Time-Series (NETS) Dun and Bradstreet data because it could be used to reconstruct changes in food environment exposures by year for each NGHS participant.  Because NETS includes all business establishments, our research also developed SIC code definitions specific to food environments.
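As a rough illustration of how year-by-year exposures can be reconstructed from establishment records, the Python sketch below counts food outlets within a fixed distance of a home for each study year. It is a simplified stand-in, not our processing code. The record fields (sic, lat, lon, first_year, last_year), the 1 km radius, and the two-entry SIC mapping are assumptions made for the example. The mapping uses only the standard 4-digit codes 5411 (grocery stores) and 5812 (eating places), whereas the definitions developed in the study are more detailed.

```python
import math

# Illustrative mapping from 4-digit SIC code to a food-environment category.
# This is NOT the study's definition set, only two well-known codes.
SIC_CATEGORIES = {
    "5411": "grocery_store",
    "5812": "eating_place",
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def exposure_by_year(home, establishments, years, radius_km=1.0):
    """Count food outlets near a home for each year.

    home: (lat, lon) of the participant's residence.
    establishments: dicts with hypothetical keys
        'sic', 'lat', 'lon', 'first_year', 'last_year'.
    Returns {year: {category: count}}.
    """
    exposure = {}
    for year in years:
        tally = {category: 0 for category in SIC_CATEGORIES.values()}
        for est in establishments:
            category = SIC_CATEGORIES.get(est["sic"][:4])
            if category is None:
                continue  # not a food-environment business
            if not (est["first_year"] <= year <= est["last_year"]):
                continue  # not operating in this year
            distance = haversine_km(home[0], home[1], est["lat"], est["lon"])
            if distance <= radius_km:
                tally[category] += 1
        exposure[year] = tally
    return exposure
```

Because each record carries the years in which the business operated, the same function yields a different count for each year, which is what makes a time-series database useful for a longitudinal cohort.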

Causal Inference Framework

Our study’s hypothesis is that there are upstream determinants of diet and physical activity that affect growth trends in girls.  Using longitudinal data from the NGHS cohort allowed us to explore this conceptual framework by applying both structured equation model and G-computation causal inference methods, as well as agent-based modeling.

To explore this framework, we have applied longitudinal G-computation to a set of structured equations, and have used both generalized linear modeling and machine learning approaches to assess the impact of the built environment on mediators of BMI change.  The machine learning approach is based on ensemble prediction using a method called Super Learner, available as the SuperLearner package in R.
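For readers new to G-computation, the Python sketch below shows the core idea in its simplest form: a single exposure, a single discrete confounder, and outcome means estimated within each stratum. It is deliberately much simpler than the longitudinal version applied to the NGHS data, and the function name and the (w, a, y) record layout are assumptions made for illustration. The estimate is the stratum-specific mean outcome under the chosen exposure level, averaged over the distribution of the confounder in the whole sample.

```python
from collections import Counter


def g_computation(records, treatment_value):
    """Estimate the mean outcome if everyone received `treatment_value`.

    records: iterable of (w, a, y) tuples, where w is a discrete confounder,
    a is the exposure, and y is the outcome (hypothetical layout).
    """
    records = list(records)
    n = len(records)
    outcome_sum = Counter()    # sum of y within each (w, a) cell
    cell_count = Counter()     # number of records in each (w, a) cell
    stratum_count = Counter()  # number of records with each value of w
    for w, a, y in records:
        outcome_sum[(w, a)] += y
        cell_count[(w, a)] += 1
        stratum_count[w] += 1

    expected = 0.0
    for w, n_w in stratum_count.items():
        cell = (w, treatment_value)
        if cell_count[cell] == 0:
            # Positivity violation: nobody in this stratum had this exposure.
            raise ValueError(f"no records with w={w!r}, a={treatment_value!r}")
        stratum_mean = outcome_sum[cell] / cell_count[cell]
        # Weight the stratum mean by the share of the sample in that stratum.
        expected += stratum_mean * (n_w / n)
    return expected


def average_effect(records, exposed=1, unexposed=0):
    """Difference in standardized mean outcomes between two exposure levels."""
    records = list(records)
    return g_computation(records, exposed) - g_computation(records, unexposed)
```

In practice the stratum means are replaced by predictions from a fitted outcome model, which is where generalized linear models or an ensemble learner come in.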

For those interested in how we use SuperLearner for modeling, a basic example from our modeling scripts illustrates how a set of longitudinal structured equations is estimated using the method in R.  Details can be found in our forthcoming papers. The vignette that comes with the SuperLearner R package provides more basic usage examples.
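The idea behind this kind of ensemble prediction can be sketched in a few lines of Python. This is not our modeling script and does not use the R package: it is a simplified stand-in with two hand-written candidate learners, and it finds the combination weight by a grid search over cross-validated squared error, which is an assumption made to keep the example dependency-free. All function names are hypothetical.

```python
def fit_mean(xs, ys):
    """Candidate learner 1: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Candidate learner 2: simple least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cv_predictions(xs, ys, fit, n_folds):
    """Out-of-fold predictions: each point is predicted by a model
    that was fitted without it."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, n, n_folds):
            preds[i] = model(xs[i])
    return preds


def ensemble(xs, ys, fit_a, fit_b, n_folds=5, grid_steps=20):
    """Combine two learners with the convex weight that minimizes
    cross-validated squared error. Returns (predict, weight_on_a)."""
    preds_a = cv_predictions(xs, ys, fit_a, n_folds)
    preds_b = cv_predictions(xs, ys, fit_b, n_folds)

    def cv_risk(w):
        return sum((w * a + (1 - w) * b - y) ** 2
                   for a, b, y in zip(preds_a, preds_b, ys)) / len(ys)

    candidates = [k / grid_steps for k in range(grid_steps + 1)]
    weight_a = min(candidates, key=cv_risk)
    # Refit both learners on all of the data, then blend their predictions.
    model_a, model_b = fit_a(xs, ys), fit_b(xs, ys)
    return (lambda x: weight_a * model_a(x) + (1 - weight_a) * model_b(x)), weight_a
```

The key design point is that the weights are chosen from out-of-fold predictions, so a flexible learner that overfits the training data is not rewarded for doing so.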

Study findings

Manuscripts documenting our study’s findings are currently in review.  Until peer-reviewed, these are preliminary findings:
  • Controlling for race, stress, and other covariates, individual diet and physical activity had small but significant effects on BMI trends.
  • Individual BMI trends were sensitive to the timing of changes in diet and physical activity.
  • Increased exposure to fast food increases caloric intake, but not BMI.  Those more exposed to fast food had:
    • 3 kcals/day more intake at baseline than those less exposed;
    • 31 kcals/day more intake at year 5; and
    • 63 kcals/day more intake at year 10.
  • Agent-based modeled trajectories of the impact of fast food exposure by race (shown below) illustrate the disparate effects of fast food exposure on White versus Black girls.  Black NGHS girls had higher BMI z-scores throughout the study period than White NGHS girls. Exposure to fast food had opposing effects on Black girls versus White girls.  These effects are strongly modified by income and, to a lesser extent, by self-reported perceived stress.

Efficient Simulation of the Agent-Based Model

As is common in ABM simulation studies, our study required many repeated simulations to observe differences between scenarios. We ran separate scenarios considering hypothetical changes to the community food environment and neighborhood income levels, and comparing White versus Black girls.  Running this many scenarios required us to explore parallel computing. Because our ABM was implemented in R, instead of developing our own parallel computing algorithms from scratch, we leveraged existing libraries for both multi-CPU and GPU processing.
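The multi-CPU pattern is straightforward because each scenario replicate is independent of the others. The Python sketch below illustrates it with the standard library's ProcessPoolExecutor. It is not our R code, and run_replicate is a hypothetical stand-in for one ABM run, with a made-up bmi_shift parameter in place of a real scenario definition.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def run_replicate(job):
    """Stand-in for one ABM run: returns (scenario name, simulated mean).

    job is a (scenario, seed) pair; scenario is a dict with hypothetical
    keys 'name' and 'bmi_shift'. A real run would simulate agents here.
    """
    scenario, seed = job
    rng = random.Random(seed)  # one seed per replicate keeps runs reproducible
    draws = [rng.gauss(scenario["bmi_shift"], 1.0) for _ in range(1000)]
    return scenario["name"], sum(draws) / len(draws)


def make_jobs(scenarios, n_replicates):
    """One job per (scenario, replicate) combination."""
    return [(s, seed) for s in scenarios for seed in range(n_replicates)]


def summarize(results):
    """Average the replicate results within each scenario."""
    by_name = {}
    for name, value in results:
        by_name.setdefault(name, []).append(value)
    return {name: sum(values) / len(values) for name, values in by_name.items()}


def run_all(scenarios, n_replicates, max_workers=None):
    """Run every replicate of every scenario across worker processes."""
    jobs = make_jobs(scenarios, n_replicates)
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # map preserves job order, so results line up with the job list.
        results = list(pool.map(run_replicate, jobs))
    return summarize(results)
```

When run as a script, run_all should be called from inside an `if __name__ == "__main__":` block so that worker processes can import the module safely. Because each replicate is seeded explicitly, the parallel run gives the same summary as running the jobs one at a time.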

Ultimately, both our statistical modeling using SuperLearner and our ABM used the multi-CPU approach, which scaled quite well with additional CPUs and allowed us to use computer clusters on the Berkeley campus.

Next Steps

Our group continues research into the food environment and its relationship to diet and metabolic syndrome risk.  These studies and activities are related to our agent-based modeling study:

  • APHA 2012 presentation validating the use of 3D street view data to map the food environment.
  • Our website allows researchers to map and download food environment data from around the world.
  • Our CalFit smartphone system can be used to assess individual-level exposures to food and other built environment factors, physical activity, diet, and emotional state.


Interested in our study? Please contact May Wang and Edmund Seto.


NIH/NIDDK RC1DK086038  (P.I. May Wang, UCLA;  subaward P.I. Edmund Seto, UCB)

Other key personnel on this grant included:

  • Kate Crespi, UCLA
  • Rob Mare, UCLA
  • Gilbert Gee, UCLA
  • Alan Hubbard, UCB
  • Pat Crawford, UCB

Additionally, this project provided research experience to students at both UCLA and UCB.