The Effects of Content-Based Features on Improving Code Review Automation

Sadri, Marzieh; Fazli, Mohammad Amin

Sadri, Marzieh | 2024

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 57087 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Fazli, Mohammad Amin
Abstract:
In software development, code review is one of the most important processes for ensuring code quality and security. The textual content of code review comments plays a significant role in assessing quality and guiding the review process. This research examines the role of content-based features in identifying anti-social comments and improving code review. We first challenge the concept of toxicity in code review comments, which earlier work in the field had accepted, and then focus on automating and improving code review by detecting anti-social comments accurately and reliably from relevant features. Several methods were employed. First, hypothesis tests were used to challenge the comprehensiveness of the toxicity concept. Next, statistical tests such as analysis of variance (ANOVA) were used to investigate relationships between anti-social features, and the relationships found were also examined from a psychological perspective. Finally, classical machine learning models, ensemble learning, and neural networks were trained to detect anti-social comments, and their accuracy was evaluated. The hypothesis tests showed that more than 20% of comments previously labeled as non-toxic are in fact anti-social, confirming that the toxicity concept is not comprehensive for code review. The models developed in this research correctly identified approximately 83.4% of anti-social comments. This research thus takes a significant step toward rejecting the concept of toxicity and providing accurate models for detecting anti-social comments. Its findings can be used to enhance automated code review tools and methods, improve the efficiency and effectiveness of software development teams, and, more fundamentally, improve culture and communication in software development environments.
Keywords:
Machine Learning ; Code Review ; Toxic Comments Detection ; Anti-Social Comments
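
The abstract reports a hypothesis test on the share of comments labeled non-toxic that are nevertheless anti-social, but it does not say which test was used. The following is a minimal sketch of one common choice, a one-sided one-sample proportion z-test with a normal approximation. The function name, its parameters, and the choice of test are assumptions made for illustration and are not taken from the thesis.

```python
import math


def one_sided_proportion_z_test(successes: int, n: int, p0: float) -> tuple[float, float]:
    """Test H0: p <= p0 against H1: p > p0 using a normal approximation.

    successes: number of sampled comments judged anti-social (hypothetical input)
    n:         number of sampled comments previously labeled non-toxic
    p0:        threshold proportion under the null hypothesis (e.g. 0.2)
    Returns the z statistic and the upper-tail p-value.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")

    p_hat = successes / n
    # Standard error is computed under the null hypothesis p = p0.
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Upper-tail probability of the standard normal: 1 - Phi(z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value
```

A small p-value would support the claim that the true share exceeds `p0`. The normal approximation is only reasonable for fairly large samples; an exact binomial test would be the safer option for small ones.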

Table of Contents

Introduction
- Problem definition
- Importance of the topic
- Structure of the thesis
Preliminaries
- Code review
- Content-based features in code review
- The concept of anti-social behavior
  - Personal attacks
  - Threats or intimidation
  - Ridicule
  - Lack of clarity
  - Discouragement without offering a solution
  - Disregard for others' time or boundaries
  - Unconscious bias
  - Exclusionary attitude
  - Excessive control
- Machine learning models
  - Classical machine learning models
  - Ensemble learning models
  - Neural network-based models
Related Work
- Introduction
- Code review automation
- Content analyses in code review
  - Sentiment analysis
  - Conflict analysis
  - Confusion analysis
  - Usefulness analysis
- Identifying and analyzing anti-social comments
  - Identifying and analyzing toxic comments
  - Identifying and analyzing non-constructive criticism
  - Identifying and analyzing incivility
- Conclusion
Dataset
- Introduction
- Collecting and labeling the dataset
  - Introducing the initial dataset
  - Collecting and labeling the dataset using a web application
- Preprocessing the dataset
  - Lowercasing
  - Removing URLs
  - Expanding contractions
  - Removing stop words
  - Removing symbols
  - Removing repetitions
  - Identifying adversarial patterns
  - Splitting identifiers
  - Removing programming keywords
- Introducing and examining the dataset
  - Statistical description of the dataset
  - Distribution of words and phrases across classes
  - Two-dimensional visualization of the dataset using dimensionality reduction
- Conclusion
Methodology
- Introduction
- Examining the comprehensiveness of the toxicity concept and the relationships among anti-social behavioral features in code review comments
  - Examining the comprehensiveness of the toxicity concept
  - Description of the test
- Statistical analysis of anti-social features
  - Analysis of variance (ANOVA) test
  - HSD test
  - Graphical representation
- Psychological analysis of the relationships among behavioral features
  - A psychological approach to the features
  - Introducing the term SID
- Training models to predict anti-social comments
  - Converting code review comments to vectors
  - Training a model to predict each behavioral feature
  - Sampling
- Conclusion
Evaluation
- Introduction
- Evaluation metrics
- Model evaluation
  - Threats or intimidation
  - Ridicule
  - Lack of clarity
  - Discouragement without offering a solution
  - SID
- Conclusion
  - Evaluating the overall model based on false negatives
  - Evaluating the overall model based on false positives
Conclusion and Future Work
- Introduction
- Conclusion
- Future work
References
Glossary
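
The outline lists a sequence of text preprocessing steps for code review comments (lowercasing, URL removal, contraction expansion, stop-word removal, symbol removal, repetition removal, identifier splitting, and programming keyword removal). The sketch below illustrates how several of these steps might be chained. It is not the thesis implementation: the function name, the tiny lookup tables, the regular expressions, and the interpretation of "removing repetitions" as collapsing runs of a repeated letter are all assumptions. Identifier splitting is applied before lowercasing here so that camelCase boundaries are still visible, and adversarial pattern identification is omitted because the record does not describe it.

```python
import re

# Hypothetical, deliberately tiny lookup tables; a real pipeline would use fuller lists.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}
STOP_WORDS = {"a", "an", "the", "is", "it", "to", "of"}
CODE_KEYWORDS = {"int", "void", "return", "null", "class"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
CAMEL_RE = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")  # boundary inside camelCase
SYMBOL_RE = re.compile(r"[^a-z0-9\s]")            # anything but lowercase letters, digits, whitespace
REPEAT_RE = re.compile(r"([a-z])\1{2,}")          # three or more of the same letter


def preprocess(comment: str) -> list[str]:
    """Return a list of cleaned tokens for one code review comment."""
    text = URL_RE.sub(" ", comment)                    # remove URLs
    text = CAMEL_RE.sub(" ", text).replace("_", " ")   # split camelCase and snake_case identifiers
    text = text.lower()                                # lowercase
    for short, full in CONTRACTIONS.items():           # expand contractions
        text = text.replace(short, full)
    text = SYMBOL_RE.sub(" ", text)                    # remove symbols and punctuation
    text = REPEAT_RE.sub(r"\1\1", text)                # collapse long letter runs to two letters
    # Drop stop words and programming keywords.
    return [t for t in text.split() if t not in STOP_WORDS and t not in CODE_KEYWORDS]
```

The order of the steps matters: expanding contractions has to happen before symbol removal strips the apostrophes, and URL removal has to happen before punctuation is removed so that the URL pattern still matches.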
