Organizers
Sung Kim (chair)
MIT, USA
Ahmed E. Hassan
Queen's University, Canada
Michele Lanza
University of Lugano, Switzerland
Michael W. Godfrey
University of Waterloo, Canada
Jury
Thomas Zimmermann (U. of Calgary, Canada)
Marco D'Ambros (U. of Lugano, Switzerland)
Peter Weißgerber (U. of Trier, Germany)
Christian Bird (University of California, Davis)
Shivkumar Shivaji (Yahoo)
Miryung Kim (University of Washington)
Location
Co-located with ICSE 2008,
Leipzig, Germany
Sponsor
We are grateful to our generous sponsor, "Static Analysis, Software
Quality for C, C++, and Java" - Coverity Inc.
Submissions for the Challenge are open!
Overview
This year's Working Conference on Mining Software Repositories
(MSR 2008) will host a mining challenge. The MSR Mining Challenge
brings together researchers and practitioners who are interested in
applying, comparing, and challenging their mining tools and approaches
on the software repositories of an open source project:
Eclipse.
There will be two challenge tracks: #1: general and
#2: prediction.
The winner of
each track will be given the Coverity MSR 2008 Challenge Award and
will also receive an iPod Nano and an iPod Touch, respectively.
Challenge #1: General
In this category you can demonstrate the usefulness of your mining
tools. The main task will be to find interesting insights by analyzing
the software repositories of Eclipse. Eclipse
is large in size, several years mature, and provides lots of input for
mining tools.
The idea of this track is that everyone works on the same data set, so the data set serves as a
benchmark and researchers can compare their results with each other.
Submissions are limited to 4 pages and will be included in the MSR proceedings.
Participation is straightforward:
- Select your mining area (one of bug analysis, change analysis, architecture and design, process analysis, team structure).
- Get project data of Eclipse.
- Formulate your mining questions.
- Use your mining tool(s) to answer them.
- Write up and submit your 4-page challenge report.
The challenge report should describe the results of your work
and cover the following aspects: questions addressed, input data,
approach and tools used, derived results and interpretation of them,
and conclusions. Keep in mind that the report will be evaluated by
the jury. Reports must be at most 4 pages long and in the ICSE format.
Data
Feel free to use any data source for the Mining
Challenge. For your convenience, we provide mirrors for some of the
repositories of Eclipse.
- Eclipse CVS repository:
- Eclipse Bugzilla export (in XML):
Challenge #2: Predict
This year, the MSR Mining Challenge will have a special task:
For Eclipse, predict the number of bug reports per module that will be filed between 2008/2/7 and 2008/5/7 (both days included).
For example, suppose we are interested in some JDT modules and know the numbers of bug reports filed for them between 2001/10/10 and 2007/12/14.
Your job is to predict the future numbers of bug reports (per module) using any available resources, such as previous bug report numbers, change numbers, or your intuition.
(The numbers of bug reports are counted using a Java program at http://bugminer.googlecode.com/svn/trunk/. Feel free to check it out and use it for your project.)
Components | Bug reports from 2001/10/10 to 2007/12/14 | Bug reports from 2008/2/7 to 2008/5/7 |
---|---|---|
JDT.APT | 246 | ? |
JDT.Core | 10880 | ? |
JDT.Debug | 6657 | ? |
... | ... | ? |
Participation is as follows:
- Pick a team name, e.g., SCHNITZL.
- Come up with predictions for bug reports based on some criteria or prediction model. A very simple model is, for instance, the number of past changes or bugs.
- Annotate the corresponding files with your predictions.
- Predict the numbers of bug reports for the modules listed in components.html.
- Write a paragraph (max 200 words) that describes how you computed your predictions.
- Submit everything before Feb 7 (Apia time) by email to hunkim@csail.mit.edu.
Obviously, the team with the best predictions will win. However, to
increase the competition, we will organize a set of "benchmark"
predictions.
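The "very simple model" mentioned in the participation steps can be sketched as follows. This is only an illustrative baseline, not the organizers' benchmark: it assumes that bug reports arrive at a constant rate, so the historical count of each component is scaled to the length of the prediction window. The window dates and the three counts come from the table above; the function names are made up for this example.

```python
from datetime import date

# Window boundaries taken from the challenge text (both ends included).
PAST_START, PAST_END = date(2001, 10, 10), date(2007, 12, 14)
FUTURE_START, FUTURE_END = date(2008, 2, 7), date(2008, 5, 7)


def days_inclusive(start, end):
    """Number of days in a window, counting both the first and the last day."""
    return (end - start).days + 1


def naive_predictions(past_counts):
    """Scale each historical count to the prediction window (constant-rate assumption)."""
    scale = days_inclusive(FUTURE_START, FUTURE_END) / days_inclusive(PAST_START, PAST_END)
    return {component: round(count * scale) for component, count in past_counts.items()}


# Historical counts for the three components shown in the table above.
past_counts = {"JDT.APT": 246, "JDT.Core": 10880, "JDT.Debug": 6657}
predictions = naive_predictions(past_counts)

for component, predicted in predictions.items():
    print(f"{component}\t{predicted}")
```

A constant rate over six years is a crude assumption. Weighting recent months more heavily, or adding change counts, are obvious refinements.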
Bug Prediction
The predictions for bugs should be on the component level. A
component is specified directly in the bug reports. For instance bug report
42233 was reported for the component "UI" of the product
"JDT". For the challenge, we will consider the core products of
Eclipse: Equinox, JDT, PDE, Platform. A complete list of relevant
products and components is in the file components.html. Note that we will not
remove duplicates from the final counts.
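Counting on the component level, as described above, might look like the sketch below. It is not the Java program referenced earlier. It assumes that the export follows the usual Bugzilla XML layout, with one `bug` element per report containing `product`, `component`, and `creation_ts` children; the mirrored file may differ. The sample reports are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import date

# Core products named in the challenge text.
CORE_PRODUCTS = {"Equinox", "JDT", "PDE", "Platform"}


def count_reports(xml_text, start, end):
    """Count reports per "Product.Component" filed between start and end (inclusive).

    Duplicates are deliberately not filtered out, matching the challenge rules.
    """
    counts = Counter()
    root = ET.fromstring(xml_text)
    for bug in root.iter("bug"):
        product = bug.findtext("product", default="").strip()
        component = bug.findtext("component", default="").strip()
        created = bug.findtext("creation_ts", default="").strip()
        if product not in CORE_PRODUCTS or len(created) < 10:
            continue
        # Assumed timestamp layout: "YYYY-MM-DD hh:mm ..."; only the date part is used.
        reported = date.fromisoformat(created[:10])
        if start <= reported <= end:
            counts[f"{product}.{component}"] += 1
    return counts


# Invented sample data, not real Eclipse bug reports.
SAMPLE = """<bugzilla>
  <bug><bug_id>1</bug_id><creation_ts>2002-03-01 10:15</creation_ts><product>JDT</product><component>UI</component></bug>
  <bug><bug_id>2</bug_id><creation_ts>2003-07-19 08:00</creation_ts><product>JDT</product><component>UI</component></bug>
  <bug><bug_id>3</bug_id><creation_ts>2008-01-02 12:30</creation_ts><product>Platform</product><component>SWT</component></bug>
  <bug><bug_id>4</bug_id><creation_ts>2004-11-11 09:45</creation_ts><product>CDT</product><component>cdt-core</component></bug>
  <bug><bug_id>5</bug_id><creation_ts>2005-05-05 17:20</creation_ts><product>JDT</product><component>Core</component></bug>
</bugzilla>"""

counts = count_reports(SAMPLE, date(2001, 10, 10), date(2007, 12, 14))
for name, total in sorted(counts.items()):
    print(f"{name}\t{total}")
```

For a full export, `ET.iterparse` would avoid loading the whole file into memory; the string-based version is kept here for brevity.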
Frequently Asked Questions
-
Do I need to give a presentation at the MSR conference?
For challenge #1, the jury will select finalists that are expected to
give a short presentation at the conference. Then the audience will
select a winner. For challenge #2, there is no presentation at the
conference. The winners will be determined with statistical methods
(correlation analysis) and announced at the conference.
-
Does the challenge report have to be four pages?
No, of course you can submit fewer than four pages. The page limit was
set to ease the presentation of space-intensive results such as
visualizations.
-
Wow, Eclipse data is soooo big! My tool won't finish in
time. What can I do? — Just run your tool on a subset
of the projects. For instance, you could use JDT for Eclipse.
Especially when you are doing
visualizations, it is almost impossible to show everything.
-
Predicting bugs? But, I have no clue how to build prediction models. — That's the fun thing about this
category: you don't need to build sophisticated models. Of course, some
people will, but others will just build simple predictors. In the end,
we will see (a) whether we can predict future development events and
(b) who does it best.
-
My cat is a visionary... can I submit its predictions, or is
challenge #2 only for tools?
Of course, go ahead and submit its predictions as a benchmark. However,
your cat will take part out of competition: only predictions generated
by tools or by humans in a systematic way are eligible to win challenge #2.
-
For challenge #2 (prediction), is it acceptable if our team
submits more than one prediction file?
No. Only one submission per team (or person) is allowed.
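The first answer above states that the winners of challenge #2 are determined by correlation analysis. The exact statistic is not specified here. As one plausible reading, the sketch below computes the Spearman rank correlation between predicted and actual counts per component. The choice of Spearman, the function names, and the "actual" numbers are assumptions made for illustration only.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def spearman(predicted, actual):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(actual))


# Hypothetical numbers for four components; the "actual" values are invented.
predicted = [10, 439, 268, 50]
actual = [12, 300, 310, 40]
score = spearman(predicted, actual)
print(f"Spearman correlation: {score:.2f}")
```

A rank-based measure rewards getting the ordering of components right even when the absolute numbers are off, which is one reason it is a common choice for comparing defect predictions.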
Important Dates
- Submission of reports and predictions: 7th February 2008 (Apia time)
- Acceptance notification: 14th February 2008
- Camera-ready: 21st February 2008
- Conference date: 10-11 May 2008