Bug Detection and Repair Dataset with Exploratory Data Analysis examples and demo applications for Bug Detection and Repair algorithms.
Machine learning tasks on source code have grown in importance as the demand for software production has increased. The main goal of this project is to create a dataset for bug detection and repair, and to provide examples of code repair algorithms.
This repository contains the scripts used for dataset generation, along with exploratory data analysis notebooks for the generated datasets. The bugnet folder contains the script used to generate the BugNet dataset. The repair-pipeline folder contains the demo applications for the models trained only on the Python code from BugNet. The aoc-dataset folder contains the source code used to generate the AoC dataset. The hint folder contains the source code used to generate natural-language hints for the bugs. The repair folder contains the source code used to evaluate different models on the data that we collected.
To install the dependencies for development create a venv:
python -m venv .venv
source .venv/bin/activate
make install
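Before running `make install`, it can help to confirm that the virtual environment is actually active, so the dependencies do not end up in the system interpreter. The following is a minimal sketch and not part of this repository; the function name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Running this with the interpreter from `.venv` after `source .venv/bin/activate` should report that the environment is active.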
- To run the repair pipeline, see repair-pipeline
- To generate the BugNet dataset, see bugnet
- To visualize the AoC dataset, see aoc-dataset
- To visualize the results of the hint generation on the BugNet (or AoC) dataset, see hint
- To visualize the results of the repair generation on the BugNet (or AoC) dataset, see repair