Learning from evolving data streams: online triage of bug reports

Grzegorz Chrupala
Spoken Language Systems
Saarland University
gchrupala@lsv.uni-saarland.de

Abstract

Open issue trackers are a type of social media that has received relatively little attention from the text-mining community. We investigate the problems inherent in learning to triage bug reports from time-varying data. We demonstrate that concept drift is an important consideration. We show the effectiveness of online learning algorithms by evaluating them on several bug report datasets collected from open issue trackers associated with large open-source projects. We make this collection of data publicly available.

1 Introduction

There has been relatively little research to date on applying machine learning and Natural Language Processing techniques to automate software project workflows. In this paper we address the problem of bug report triage.

1.1 Issue tracking

Large software projects typically track defect reports, feature requests and other issue reports using an issue tracker system. Open source projects tend to use trackers which are open to both developers and users. If the product has many users, its tracker can receive an overwhelming number of issue reports: Mozilla was receiving almost 300 reports per day in 2006 (Anvik et al., 2006).
Someone has to monitor those reports and triage them, that is, decide which component they affect and which developer or team of developers should be responsible for analyzing them and fixing the reported defects. An automated agent assisting the staff responsible for such triage has the potential to substantially reduce the time and cost of this task.

1.2 Issue trackers as social media

In a large software project with a loose, not strictly hierarchical organization, standards and practices are not exclusively imposed top-down but also tend to arise spontaneously in a bottom-up fashion, arrived at through interaction of individual developers, testers and users. The .