The Computer Society
Addressing Data Biases in Machine Learning Models
Pages
18
Time to read
51 mins
Publication
Language
English
This technical report examines data biases in machine learning (ML) models and their implications for accuracy and fairness. It discusses the limitations of traditional methods, which often fail to address the root causes of these biases and so yield only superficial solutions. The report advocates integrating causal modeling into data cleaning, preparation, and quality management processes. Drawing on existing research, it shows how causal reasoning can identify and rectify data biases, thereby improving the integrity of ML models. It emphasizes the importance of addressing several types of data bias, including selection bias, measurement error, and confounding variables, which can distort data distributions and lead to unfair outcomes. The report also highlights the need for ongoing research into causal approaches to foster more reliable and unbiased data-driven technologies.