Machine Learning Model Predicts Indoor Ozone Exposure Using Accessible Environmental and Behavioral Data
TL;DR
Researchers developed a machine learning model that predicts indoor ozone exposure, giving public health officials an advantage in targeting interventions for vulnerable populations.
The model uses random forest algorithms with outdoor ozone, meteorological data, and window-opening behavior to predict hourly indoor concentrations across 18 Chinese cities.
This research helps create healthier indoor environments by accurately assessing ozone exposure, potentially reducing health risks for people who spend most of their time inside.
The study found that daytime indoor ozone levels are 40% lower than outdoor levels and that window-opening behavior significantly affects exposure.

A new study has developed the first large-scale machine learning model capable of predicting hourly indoor ozone concentrations using easily accessible predictors, including outdoor ozone levels, meteorological conditions, and window-opening behavior. This advancement addresses a critical gap in air pollution exposure assessment, as people typically spend 70% to 90% of their time indoors, where actual ozone levels differ from outdoor measurements.
Ozone is a key air pollutant formed by chemical reactions between nitrogen oxides and volatile organic compounds under sunlight. In 2021, long-term ozone exposure contributed to nearly 490,000 deaths worldwide. Traditional exposure assessments have relied heavily on outdoor data, but ventilation, indoor sources, and building materials significantly affect indoor ozone levels. Previous modeling approaches have been limited in large-scale applications: mechanistic models require detailed indoor parameters, and linear regression models struggle with nonlinear relationships.
Researchers from Fudan University and the Chinese Academy of Sciences built a machine learning model to predict hourly indoor ozone levels across 18 Chinese cities. The study, published in Eco-Environment & Health on July 9, 2025, used random forest algorithms trained on low-cost sensor measurements combined with meteorological and ventilation data. The research is available at https://doi.org/10.1016/j.eehl.2025.100170.
The team collected over 8,200 hours of indoor ozone data using portable electrochemical sensors in 23 households. Predictor variables fell into three groups: outdoor ozone levels from high-resolution datasets; meteorological parameters such as temperature, humidity, wind, solar radiation, boundary-layer height, and surface pressure; and window-opening status recorded manually by volunteers. By comparing two models, one excluding and one including window status, the researchers demonstrated that incorporating ventilation behavior significantly improved prediction accuracy.
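For readers curious about what such a two-model comparison might look like in practice, the following is a minimal sketch and not the authors' code. It assumes NumPy and scikit-learn are installed, and it runs on entirely synthetic data with an invented indoor/outdoor relationship. The function names, the reduced feature set, the forest size, and the five-fold split are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict


def make_synthetic_hours(n_hours=1000, seed=0):
    """Generate fake hourly records; every value here is invented."""
    rng = np.random.default_rng(seed)
    outdoor_o3 = rng.uniform(5, 90, n_hours)      # ppb
    temperature = rng.uniform(-5, 35, n_hours)    # degrees Celsius
    humidity = rng.uniform(20, 95, n_hours)       # percent
    window_open = rng.integers(0, 2, n_hours)     # 0 = closed, 1 = open
    # Assumed toy relationship: an open window lets more outdoor ozone in.
    indoor_ratio = 0.2 + 0.4 * window_open
    indoor_o3 = indoor_ratio * outdoor_o3 + rng.normal(0, 2, n_hours)
    features = np.column_stack([outdoor_o3, temperature, humidity, window_open])
    return features, indoor_o3


def cross_validated_scores(features, target, seed=0):
    """Return (R^2, RMSE) from out-of-fold random forest predictions."""
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    folds = KFold(n_splits=5, shuffle=True, random_state=seed)
    predicted = cross_val_predict(model, features, target, cv=folds)
    rmse = float(np.sqrt(mean_squared_error(target, predicted)))
    return float(r2_score(target, predicted)), rmse


features, indoor_o3 = make_synthetic_hours()

# Model 1 drops the last column (window status); model 2 keeps it.
r2_without, rmse_without = cross_validated_scores(features[:, :3], indoor_o3)
r2_with, rmse_with = cross_validated_scores(features, indoor_o3)

print(f"without window status: R2={r2_without:.2f}, RMSE={rmse_without:.2f} ppb")
print(f"with window status:    R2={r2_with:.2f}, RMSE={rmse_with:.2f} ppb")
```

Because the toy data makes window status a strong driver of indoor ozone, the gap between the two models here is far larger than the one reported in the study. The sketch only illustrates the comparison procedure.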
Including window behavior raised cross-validation R² from 0.80 to 0.83 and lowered root mean square error from 7.89 to 7.21 parts per billion. The model accurately captured hourly ozone fluctuations and regional differences, performing better in southern than northern China and in cold rather than warm seasons. Predictor-importance analysis identified surface pressure, temperature, and ambient ozone as dominant factors, with ventilation emerging as a crucial behavioral determinant. Diurnal comparisons revealed indoor ozone concentrations were 40% lower than outdoor levels during daytime hours.
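The two accuracy metrics quoted above, R² and root mean square error, along with the indoor-versus-outdoor percentage difference, can be computed in a few lines. The sketch below uses only the Python standard library. The helper names and the sample numbers are hypothetical and are not taken from the study's data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. ppb)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_observed = sum(observed) / len(observed)
    residual_ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total_ss = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - residual_ss / total_ss


def percent_lower(indoor, outdoor):
    """How much lower the mean indoor level is than the mean outdoor level."""
    mean_indoor = sum(indoor) / len(indoor)
    mean_outdoor = sum(outdoor) / len(outdoor)
    return 100.0 * (mean_outdoor - mean_indoor) / mean_outdoor


# Hypothetical hourly indoor ozone values (ppb) and model predictions.
observed_ppb = [10.0, 20.0, 30.0, 40.0]
predicted_ppb = [12.0, 18.0, 33.0, 37.0]
print("RMSE:", rmse(observed_ppb, predicted_ppb))
print("R2:", r_squared(observed_ppb, predicted_ppb))

# Hypothetical daytime readings chosen so the indoor mean is 40% lower.
daytime_outdoor_ppb = [40.0, 50.0, 60.0]
daytime_indoor_ppb = [24.0, 30.0, 36.0]
print("Percent lower indoors:", percent_lower(daytime_indoor_ppb, daytime_outdoor_ppb))
```

A lower RMSE means predictions sit closer to the sensor readings on average, while a higher R² means the model explains more of the hour-to-hour variation.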
"Most exposure studies still rely on outdoor ozone data, but that's only half the story," said Prof. Xia Meng, senior author of the study. "Our findings show that ventilation behavior—something as simple as whether a window is open or closed—can change exposure dramatically. By integrating such behavioral data with meteorological information through machine learning, we can finally estimate indoor ozone more precisely at large scales."
This research introduces a practical, low-cost strategy for predicting indoor ozone exposure in real time across large geographic areas. The model can be integrated into health-risk assessments, smart-home monitoring systems, and public-health surveillance platforms, enabling policymakers and scientists to better understand indoor-outdoor exposure differences. Future work could extend the framework to other pollutants such as fine particulate matter or nitrogen dioxide, incorporate smart sensors for automated window tracking, and expand monitoring to diverse climatic zones. This machine-learning approach bridges environmental modeling with daily life, promoting healthier indoor environments in rapidly urbanizing regions.
Curated from 24-7 Press Release

