High PM2.5 levels in Bangkok during winter have recently caught public attention. Yet the Thai government’s response seems slow and fails to provide any long-term solution. Moreover, as recently as 2019, government officials still disagreed on what the major sources of air pollution in Bangkok are. There are four possible sources of PM2.5: the temperature inversion effect, industrial activities, traffic, and agricultural burning. Identifying the major sources of the air pollution is key to finding solutions. However, modeling PM2.5 levels is a multi-variable problem with a time lag, and finding an analytical solution is almost impossible given the limited historical data. In this talk, I will present my machine learning approach to this problem. I will go through the data-science steps: gathering data, exploratory data analysis, imputation, feature engineering, and modeling. I hope this talk will be useful for aspiring data scientists, and all suggestions are welcome!