The California Wildfires Project focuses on Data Readiness, Methods Readiness, and Translational Readiness for predicting and responding to dynamic health-system demands during California wildfires and public safety power shutoff (PSPS) events, with a particular focus on the most vulnerable.


We seek a data scientist with formal training in data science, engineering, biostatistics, epidemiology, or ecological modeling, with strong quantitative and statistical skills, strong programming abilities (Python or R preferred), and experience or interest in working with US healthcare datasets.

The project will involve the design and implementation of a decision support system to help policy makers allocate resources and relocate patients in the midst of a crisis. This will require developing optimization strategies capable of taking multiple constraints into account to provide likely outcome scenarios.

Preference will be given to candidates with experience in geospatial modeling and public health research. Both graduates and postdoctoral candidates will be considered for either a part-time or full-time appointment commensurate with their experience.

The candidate will be supervised by an interdisciplinary team comprising professors Caroline Buckee, Satchit Balsari, and Mauricio Santillana at Harvard, Andrew Schroeder at Direct Relief, and Mathew Kiang at Stanford. The candidate is expected to work in a fast-paced environment with multiple stakeholders, including Harvard-based investigators and collaborators from government and academia in the US and overseas, such as the Idaho National Laboratory and the California Dept. of Public Health.


The annual wildfire season in California results in population displacements of varying durations. Among the evacuees are patients on durable medical equipment, i.e., medical devices that are electricity-dependent. Loss of power (from the wildfires as well as from the preemptive public safety power shutoffs) threatens the wellbeing of these medically vulnerable populations – in some cases, imminently. Critical care patients and long-term nursing home patients on ventilators need to be safely transferred to receiving facilities without any interruption in care. Those on dialysis have to be re-matched with dialysis centers in the host communities receiving the evacuated populations, and may clinically tolerate two or three days of delay. Patients on home nebulizers may not suffer acutely from the loss of power or access to care, but may experience a longer span of morbidity from prolonged sub-optimal therapy.

The data required to map where these vulnerable populations are located exist, though in different silos across healthcare systems. Population mobility data required to observe post-event evacuation patterns are more easily accessible in the post-COVID world. Information on EMS transport and healthcare bed capacity, though not publicly available, can be accessed by public health agencies. Timely access to these various data streams requires significant pre-negotiation of data use agreements and of the safe, secure use of these data. Producing analytic outputs in near-real-time during a disaster will require prior simulations and modeling. Finally, response agencies need to be familiar with these analytic outputs and find them actionable for them to be of use.
