Yang Liu PhD – Emory Environmental Remote Sensing Group

Dr. Liu has 20 years of experience studying the spatial and temporal characteristics of air pollution using remote sensing data from various satellite instruments. His research interests also include satellite aerosol retrieval and product design, the potential impacts of global climate change on public health, GIS, and spatial statistics. Dr. Liu has been funded by NASA, CDC, NIH, EPA, HEI, and WHO to apply satellite data in air quality modeling and to study the impact of climate change on air quality and human health using remote sensing and model simulations. Specifically, he has been working with the Oak Ridge National Lab to generate statistically and dynamically downscaled high-resolution regional climate projections under various Representative Concentration Pathways (RCPs). Using these simulation results, Dr. Liu has estimated spatially resolved population risks associated with extreme heat, wildfires, ambient air pollutants such as ozone and PM2.5, and infectious diseases that can be attributed to future climate change. He has published 200+ peer-reviewed journal articles and contributed to two books in these areas. He is a member of the NASA MAIA and Terra MISR science teams and the NASA Aura science team, and a PI member of the NASA Air Quality Applied Science Team (AQAST) and its successor, the Health and Air Quality Applied Science Team (HAQAST). He is also a member of the Scientific Steering Committee of the WHO Platform on Air Quality and Health, and a GBD expert on ambient air pollution.

Gupta et al.: Boosting for regression transfer via importance sampling

Gupta S, Bi J, Liu Y, Wildani A. 2023. Boosting for regression transfer via importance sampling. Int J Data Sci Anal. https://doi.org/10.1007/s41060-023-00414-8.

Instance transfer learning methodologies are extremely efficient for continuous-valued regression datasets. However, these methodologies can suffer from negative transfer due to distribution shifts between the training and test data, as well as skewness in training caused by the much larger source dataset. To mitigate this, we introduce S-TrAdaBoost.R2, a boosting-based instance transfer learning methodology that uses importance sampling to reduce the skewness in training and a balanced weighting approach to handle the distribution shift. We tested the performance of our approach on 8 standard regression datasets of varying complexity and found that S-TrAdaBoost.R2 performs better than competing transfer learning methodologies 63% of the time. Moreover, it displayed consistent performance, as opposed to the sporadic results observed for other transfer learning methodologies.

Article link

Bi et al.: Combining Machine Learning and Numerical Simulation for High-Resolution PM2.5 Concentration Forecast

Forecasting ambient PM2.5 concentrations with full spatiotemporal coverage is key to alerting decision makers to pollution episodes and preventing detrimental public exposure. In this study, we developed a PM2.5 forecast framework by combining the robust Random Forest algorithm with a publicly accessible global CTM forecast product, NASA’s Goddard Earth Observing System “Composition Forecasting” (GEOS-CF), providing spatiotemporally continuous PM2.5 concentration forecasts for the next 5 days at a 1 km spatial resolution. Our forecast experiment was conducted for a region in Central China that includes the populous and polluted Fenwei Plain. Our model showed satisfactory forecast performance and substantially reduced the large biases in GEOS-CF. Our proposed framework requires minimal computational resources compared to running CTMs at urban scales, enabling near-real-time PM2.5 forecasting in resource-restricted environments.

Read online: https://doi.org/10.1021/acs.est.1c05578

Stowell et al.: Asthma exacerbation due to climate change-induced wildfire smoke in the Western US

Climate change and human activities have drastically altered the natural wildfire balance in the Western US and increased population health risks due to exposure to pollutants from fire smoke. Using dynamically downscaled climate model projections, we estimated additional asthma emergency room visits and hospitalizations due to exposure to smoke fine particulate matter (PM2.5) in the Western US in the 2050s. Under the ICLUS A2 population scenario, we estimated that smoke-related asthma events could increase at a rate of 15.1 visits per 10 000 persons in the Western US, with the highest rates of increased asthma (25.7–41.9 per 10 000) in Idaho, Montana, Oregon, and Washington. We estimated the healthcare costs of smoke-induced asthma exacerbation to be over $1.5 billion during a single future fire season. Open access article link here.

Emory contributes to Lancet Countdown global report and US country brief

RSPH reported our group’s contribution to the 2020 Lancet Countdown global report in estimating direct population exposure to wildfires and population risk from wildfires. Since 2019, we have served as a reviewer of the Lancet Countdown US Country Brief, and Emory has been an official sponsor of the Lancet Countdown US launch.

Liang et al.: The 17-y spatiotemporal trend of PM2.5 and its mortality burden in China

Estimation of the chronic health effects of PM2.5 exposure has been hindered by the lack of long-term PM2.5 data in China. To fill this gap, high-performance machine-learning models were developed to estimate PM2.5 concentrations at 1-km resolution in China from 2000 to 2016, based on satellite data, meteorological conditions, land cover information, road networks, and air pollution emission indicators. By adopting imputation techniques, relatively unbiased, spatiotemporally continuous exposure estimates were generated. Annual mortality burdens attributable to long-term PM2.5 exposure were estimated at the provincial scale, and the national total of adult premature deaths was estimated at 30.8 million over the 17-y period in China.

This study was published in PNAS (link).

Geng et al.: Random forest models for PM2.5 speciation using MISR data

Random forest models were developed to predict ground-level daily PM2.5 speciation concentrations in California from MISR fractional AODs and other supporting data such as ground measurements, chemical transport model simulations, land use variables, and meteorological fields. Sensitivity tests were also conducted to explore the influence of variable selection on model performance. Results show that fractional AODs and total AOD have similar predictive power in estimating PM2.5 species if there are sufficient ground measurements and predictor data to support the most sophisticated model structure. Otherwise, models using fractional AODs outperform those using total AOD. PM2.5 speciation concentrations are more sensitive to land use variables than to other supporting data such as CTM simulations and meteorological information.

This article was published in Environmental Research Letters in March 2020.

She et al.: Hourly PM2.5 levels during heavy winter episodes in the Yangtze River Delta

In this publication, we quantitatively investigated the feasibility of using aerosol optical depth (AOD) data retrieved by the Geostationary Ocean Color Imager (GOCI) to estimate hourly PM2.5 concentrations during winter haze episodes in the Yangtze River Delta (YRD). We developed a three-stage spatial statistical model, using GOCI AOD and fine mode fraction, as well as the corresponding monitored PM2.5 concentrations, meteorological data, and land use data, on a 6-km modeling grid with complete coverage in time and space. The 10-fold cross-validation R² was 0.72, with a regression slope of 1.01 between observed and predicted hourly PM2.5 concentrations. We further analyzed two representative large regional episodes, i.e., a “multi-process diffusion episode” during December 21–26, 2015 and a “Chinese New Year episode” during February 7–8, 2016. We concluded that AOD retrieved by geostationary satellites could serve as a valuable new data source for analyzing heavy air pollution episodes.
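As a rough illustration of the two cross-validation statistics quoted above, the sketch below computes an R² and a regression slope from paired observed and predicted values. It assumes that R² is the squared Pearson correlation and that the slope comes from regressing observed on predicted values; the summary does not state either convention. The function name and the sample numbers are hypothetical.

```python
from math import fsum

def cv_r2_and_slope(observed, predicted):
    """Return (R^2, slope) for a least-squares fit of observed on predicted.

    Assumption: R^2 is taken as the squared Pearson correlation.
    """
    n = len(observed)
    mean_obs = fsum(observed) / n
    mean_pred = fsum(predicted) / n
    # Sums of squares and cross-products around the means.
    sxy = fsum((p - mean_pred) * (o - mean_obs)
               for o, p in zip(observed, predicted))
    sxx = fsum((p - mean_pred) ** 2 for p in predicted)
    syy = fsum((o - mean_obs) ** 2 for o in observed)
    slope = sxy / sxx
    r2 = sxy * sxy / (sxx * syy)
    return r2, slope

# Hypothetical hourly PM2.5 values (ug/m3), pooled over the held-out folds.
obs = [35.0, 80.0, 120.0, 150.0, 60.0]
pred = [40.0, 75.0, 110.0, 155.0, 65.0]
r2, slope = cv_r2_and_slope(obs, pred)
print(f"CV R2 = {r2:.2f}, slope = {slope:.2f}")
```

In a 10-fold setting, each prediction would come from a model fitted without that observation's fold, and the statistics would be computed on the pooled held-out predictions.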

Publication link

Murray et al.: New method for PM2.5 Bayesian ensemble models

We develop a method to combine PM2.5 estimated from satellite-retrieved aerosol optical depth (AOD) and chemical transport model (CTM) simulations using statistical models. While most previous methods utilize AOD or CTM separately, we aim to leverage advantages offered by both data sources in terms of resolution and coverage using Bayesian ensemble averaging. Our approach differs from previous ensemble approaches in its ability to not only incorporate uncertainties in PM2.5 estimates from individual models but also to provide uncertainties for the resulting ensemble estimates. In an application of estimating daily PM2.5 in the Southeastern US, the ensemble approach outperforms previously developed spatial-temporal statistical models that use either AOD or bias-corrected CTM simulations in cross-validation (CV) analyses.
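To make the idea of an ensemble estimate that carries its own uncertainty concrete, here is a minimal sketch using the standard mixture formulas from Bayesian model averaging: the ensemble mean is the weighted mean of the model means, and the ensemble variance adds the weighted within-model variances to the between-model spread. This is a simplified stand-in, not the method from the paper; the fixed weights, the function name, and all numbers are hypothetical.

```python
from math import fsum, sqrt

def ensemble_mean_sd(means, sds, weights):
    """Combine per-model estimates into an ensemble mean and standard deviation.

    means, sds: each model's PM2.5 estimate and its standard deviation.
    weights: non-negative model weights (normalized here to sum to 1).
    """
    total = fsum(weights)
    w = [x / total for x in weights]
    mean = fsum(wi * m for wi, m in zip(w, means))
    # Within-model uncertainty: weighted average of each model's variance.
    within = fsum(wi * s ** 2 for wi, s in zip(w, sds))
    # Between-model uncertainty: weighted spread of model means.
    between = fsum(wi * (m - mean) ** 2 for wi, m in zip(w, means))
    return mean, sqrt(within + between)

# Hypothetical daily PM2.5 estimates (ug/m3) at one grid cell:
# an AOD-based model and a bias-corrected CTM-based model.
mean, sd = ensemble_mean_sd(means=[10.0, 14.0], sds=[2.0, 3.0],
                            weights=[0.75, 0.25])
print(f"ensemble estimate: {mean:.1f} +/- {sd:.2f}")
```

The between-model term is what lets disagreement among the inputs widen the ensemble uncertainty, rather than only averaging the individual uncertainties.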

Publication Link

Our work on satellite-based UV exposure in the US was reported by NASA Observatory

This project was funded by the NASA Applied Science Program to generate secondary data products from NASA satellite data for societal benefit. Led by Emory, the research team includes researchers from CDC and the University of Iowa. The full publication, led by CDC researchers, can be found here.

Zhou et al.: Compilation and spatio-temporal analysis of publicly available total solar and UV irradiance data in the United States

Skin cancer is the most common type of cancer in the United States, and the majority of cases are caused by overexposure to ultraviolet (UV) irradiance, one component of sunlight. Our group worked with the University of Iowa and the National Environmental Public Health Tracking Program at CDC to develop and disseminate county-level daily UV irradiance (2005–2015) and total solar irradiance (1991–2012) data for the contiguous United States (see the intro on the CDC Tracking website). These datasets are freely available at the CDC Tracking Portal. Trend analysis also showed that national annual average daily solar and UV irradiances increased significantly over the years, by about 0.3% and 0.5% per year, respectively. These datasets can help us understand the spatial distributions and temporal trends of solar and UV irradiances, and allow for improved characterization of UV and sunlight exposure in future studies.
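A percent-per-year trend of the kind quoted above can be sketched as an ordinary least-squares slope of annual averages against year, expressed relative to the period mean. The summary does not say which trend method or baseline the study used, so both choices are assumptions here, and the function name and irradiance values are hypothetical.

```python
from math import fsum

def percent_trend_per_year(years, values):
    """Least-squares slope of values vs. year, as a percent of the period mean.

    Assumption: the trend is normalized by the mean over the whole period.
    """
    n = len(years)
    mean_year = fsum(years) / n
    mean_val = fsum(values) / n
    slope = (fsum((y - mean_year) * (v - mean_val)
                  for y, v in zip(years, values))
             / fsum((y - mean_year) ** 2 for y in years))
    return 100.0 * slope / mean_val

# Hypothetical national annual average daily UV irradiance (arbitrary units).
years = [2005, 2006, 2007, 2008, 2009]
uv = [100.0, 100.5, 101.0, 101.5, 102.0]
trend = percent_trend_per_year(years, uv)
print(f"trend = {trend:.2f}% per year")
```

Judging whether such a trend is statistically significant would additionally require a test on the slope, which this sketch omits.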

The full publication link is here.

Emory Environmental Remote Sensing Group

Gangarosa Department of Environmental Health, Rollins School of Public Health, Emory University

Author: Yang Liu PhD