- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 58306 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Beigy, Hamid
- Abstract:
- Recent advances in deep learning have led to significant progress in language models. However, training these models on vast amounts of real-world and internet data has introduced gender bias. Given the growing range of applications for these models, identifying and mitigating this bias is particularly important. Previous efforts to address the problem often required extensive datasets, long training times, and heavy hardware resources, and frequently caused the model to forget its prior knowledge. Furthermore, existing evaluation metrics assessed bias only over an entire dataset and did not consider different topics separately, so the dependence of these metrics on different topics and datasets remains unclear. As a result, a model may appear unbiased according to a metric yet perform poorly on specific tasks. The key challenges in this area are therefore to provide appropriate methods for evaluating language-model bias across different topics within datasets, and to propose methods that reduce bias and re-evaluate it with less time and fewer hardware resources, without compromising model accuracy. To address these challenges, we first quantitatively evaluated the gender bias of language models using specific metrics and investigated the impact of different topics on this bias. We also examined in detail the dependence of these metrics on the datasets used. In addition, by modifying existing metrics, we report not only overall model results but also results weighted by the importance and influence of each topic on the level of bias, which brings the metric output closer to reality. Finally, we proposed a method that, by using Adapters and a new loss function for reducing gender bias, achieved a nearly 20% reduction in BERT's bias with a 99% reduction in trained parameters. Despite using fewer parameters than previous methods, it reduced bias by 10% more than prior approaches.
- Keywords:
- Language Bias ; Bias in Artificial Intelligence ; Language Model ; Natural Language Processing ; Deep Learning ; Adapter
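The abstract describes training lightweight Adapter modules inside BERT with a debiasing loss, rather than fine-tuning all model weights. As a minimal sketch of the general idea (not the thesis's exact architecture or loss), the snippet below implements a standard bottleneck adapter in PyTorch and a hypothetical counterfactual-pair loss that penalizes differences between model outputs for gendered sentence pairs; the names `Adapter` and `debias_loss` and the bottleneck size are illustrative assumptions, and the parameter comparison uses the approximate BERT-base size.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter sketch: down-project, nonlinearity,
    up-project, plus a residual connection. Only these small
    layers would be trained; the host model stays frozen."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

def debias_loss(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Hypothetical debiasing objective: penalize the gap between
    output distributions for gender-swapped counterpart inputs
    (e.g. 'he is a doctor' vs 'she is a doctor')."""
    return (logits_a.softmax(-1) - logits_b.softmax(-1)).abs().mean()

# Rough parameter comparison: an adapter trains a tiny fraction
# of the weights of the full model (BERT-base is ~110M parameters).
adapter = Adapter(hidden_dim=768, bottleneck_dim=64)
adapter_params = sum(p.numel() for p in adapter.parameters())
bert_base_params = 110_000_000  # approximate
print(f"adapter params: {adapter_params} "
      f"(~{adapter_params / bert_base_params:.3%} of BERT-base)")
```

With a 768-dimensional hidden state and a 64-dimensional bottleneck, one adapter has under 100k parameters, which is consistent with the abstract's claim of a ~99% reduction in trained parameters relative to full fine-tuning.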
