Run SWE-agent on Commit0 #843

Draft: wants to merge 13 commits into base: main
114 changes: 114 additions & 0 deletions commit0_prepare/default_from_url_commit0.yaml
@@ -0,0 +1,114 @@
system_template: |-
SETTING: You are an autonomous programmer, and you're working directly in the command line with a special interface.

The special interface consists of a file editor that shows you {WINDOW} lines of a file at a time.
In addition to typical bash commands, you can also use the following commands to help you navigate and edit files.

COMMANDS:
{command_docs}

Please note that THE EDIT COMMAND REQUIRES PROPER INDENTATION.
If you'd like to add the line '        print(x)' you must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.

RESPONSE FORMAT:
Your shell prompt is formatted as follows:
(Open file: <path>) <cwd> $

You need to format your output using two fields; discussion and command.
Your output should always include _one_ discussion and _one_ command field EXACTLY as in the following example:
DISCUSSION
First I'll start by using ls to see what files are in the current directory. Then maybe we can look at some relevant files to see what they look like.
```
ls -a
```

You should only include a *SINGLE* command in the command section and then wait for a response from the shell before continuing with more discussion and commands. Everything you include in the DISCUSSION section will be saved for future reference.
If you'd like to issue two commands at once, PLEASE DO NOT DO THAT! Please instead first submit just the first command, and then after receiving a response you'll be able to issue the second command.
You're free to use any other bash commands you want (e.g. find, grep, cat, ls, cd) in addition to the special commands listed above.
However, the environment does NOT support interactive session commands (e.g. python, vim), so please do not invoke them.
instance_template: |-
Here is your task:

You need to complete the implementations for all functions (i.e., those with pass
statements) and pass the unit tests.

Do not change the names of existing functions or classes, as they may be referenced
from other code like unit tests, etc.

When you generate code, you must maintain the original formatting of the function
stubs (such as whitespace), otherwise we will not be able to search/replace blocks
for code modifications, and therefore you will receive a score of 0 for your generated
code.

Do not install any packages; you are in a Docker container that already has all the dependencies installed.

The source dir you need to edit is in: /testbed/{src_dir}

Here is the test you can use: {test_cmd} /testbed/{test_dir}

Before you do anything else, go to /testbed and run: {git_reset}

IMPORTANT TIPS:
1. If you run a command and it doesn't work, try running a different command. A command that did not work once will not work the second time unless you modify it!

2. If you open a file and need to get to an area around a specific line that is not in the first 100 lines, say line 583, don't just use the scroll_down command multiple times. Instead, use the goto 583 command. It's much quicker.

3. Always make sure to look at the currently open file and the current working directory (which appears right after the currently open file). The currently open file might be in a different directory than the working directory! Note that some commands, such as 'create', open files, so they might change the current open file.

4. When editing files, it is easy to accidentally specify a wrong line number or to write code with incorrect indentation. Always check the code after you issue an edit to make sure that it reflects what you wanted to accomplish. If it didn't, issue another command to fix it.

5. You should keep editing files until you pass the tests. Do not discard the changes you made to the code immediately or easily. It normally takes multiple iterations to get the code right.

6. Even if you ultimately do not pass the tests, partial progress is still better than no progress. So do not discard partial progress easily.


(Open file: {open_file})
(Current directory: {working_dir})
bash-$
next_step_template: |-
{observation}
(Open file: {open_file})
(Current directory: {working_dir})
bash-$
next_step_no_output_template: |-
Your command ran successfully and did not produce any output.
(Open file: {open_file})
(Current directory: {working_dir})
bash-$
demonstration_template: |
Here is a demonstration of how to correctly accomplish this task.
It is included to show you how to correctly use the interface.
You do not need to follow exactly what is done in the demonstration.
--- DEMONSTRATION ---
{demonstration}
--- END OF DEMONSTRATION ---
state_command:
name: state
code: |
state() {
local working_dir="/testbed";
if [ -z "$CURRENT_FILE" ]; then
echo '{"open_file": "n/a", "working_dir": "'$working_dir'"}';
else
echo '{"open_file": "'$(realpath "$CURRENT_FILE")'", "working_dir": "'$working_dir'"}';
fi
};
parse_function: ThoughtActionParser
env_variables:
WINDOW: 100
OVERLAP: 2
CURRENT_LINE: 0
CURRENT_FILE: ''
SEARCH_RESULTS: ()
SEARCH_FILES: ()
SEARCH_INDEX: 0
command_files:
- config/commands/defaults.sh
- config/commands/search.sh
- config/commands/edit_linting.sh
- config/commands/_split_string.py
- config/commands/commit0/{submit.sh}
parse_command: ParseCommandDetailed
history_processor: Last5Observations
demonstrations:
- trajectories/demonstrations/replay__marshmallow-code__marshmallow-1867__default__t-0.20__p-0.95__c-2.00__install-1___install_from_source/marshmallow-code__marshmallow-1867.traj
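
Note for reviewers: the response format that `system_template` asks for (one DISCUSSION field followed by exactly one fenced command) can be illustrated with a small sketch. This is only an illustration of the format. `split_thought_action` is a hypothetical helper and is not SWE-agent's `ThoughtActionParser` implementation.

```python
from __future__ import annotations

import re

# Hypothetical helper for illustration only; not SWE-agent's actual parser.
# Matches a fenced block: opening fence line, body, closing fence.
FENCE = re.compile(r"```[^\n]*\n(.*?)\n```", re.DOTALL)


def split_thought_action(reply: str) -> tuple[str, str]:
    """Return (discussion, command) from a reply containing one fenced command."""
    matches = list(FENCE.finditer(reply))
    if len(matches) != 1:
        raise ValueError(f"expected exactly one fenced command, found {len(matches)}")
    match = matches[0]
    # Everything before the fence is the discussion; drop the field label if present.
    discussion = reply[: match.start()].strip()
    if discussion.startswith("DISCUSSION"):
        discussion = discussion[len("DISCUSSION"):].strip()
    return discussion, match.group(1).strip()


reply = "DISCUSSION\nList the files first.\n```\nls -a\n```\n"
discussion, command = split_thought_action(reply)
print(discussion)
print(command)
```

A reply with zero or several fenced blocks is rejected here, which mirrors the template's "one command at a time" rule.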
1 change: 1 addition & 0 deletions commit0_prepare/empty_repo
Submodule empty_repo added at 2ea339
155 changes: 155 additions & 0 deletions commit0_prepare/generate_prompt_for_each_repo.py
@@ -0,0 +1,155 @@
from __future__ import annotations

import os
import shutil

from datasets import load_dataset


def copy_to_subdirs(source_file: str, target_dir: str) -> None:
"""
Copy a file to all subdirectories in a given directory.

Args:
source_file: Path to the file to copy
target_dir: Directory containing subdirectories to copy to
"""
# Verify source file exists
if not os.path.isfile(source_file):
raise FileNotFoundError(f"Source file {source_file} does not exist")

os.makedirs(target_dir, exist_ok=True)
# Verify target directory exists
if not os.path.isdir(target_dir):
raise NotADirectoryError(f"Target directory {target_dir} does not exist")

# Get source filename
filename = os.path.basename(source_file)

# Hardcoded list of Commit0 repository names (one subdirectory per repo);
# keep in sync with the list in run_commit0.sh
subdirs = [
"simpy",
"wcwidth",
"parsel",
"chardet",
"minitorch",
"tinydb",
"deprecated",
"voluptuous",
"cachetools",
"imapclient",
"marshmallow",
"jinja",
"cookiecutter",
"portalocker",
"pyjwt",
"babel",
"statsmodels",
"python-progressbar",
"xarray",
"imbalanced-learn",
"web3.py",
"scrapy",
"seaborn",
"pypdf",
"pexpect",
"pytest",
"pylint",
"joblib",
"dulwich",
"virtualenv",
"networkx",
"requests",
"sphinx",
"jedi",
"moviepy",
"loguru",
"paramiko",
"geopandas",
"bitstring",
"fastapi",
"tornado",
"python-prompt-toolkit",
"attrs",
"PyBoy",
"pydantic",
"filesystem_spec",
"tlslite-ng",
"graphene",
"mimesis",
"dnspython",
"python-rsa",
"more-itertools",
"click",
"fabric",
"flask",
"sqlparse",
]

# Copy file to each subdirectory
for subdir in subdirs:
target_path = os.path.join(target_dir, subdir, filename.replace(".md", f"_{subdir}.md"))
os.makedirs(os.path.dirname(target_path), exist_ok=True)
try:
shutil.copy2(source_file, target_path)
print(f"Copied {filename} to {subdir}")
except OSError as e:
print(f"Failed to copy to {subdir}: {e}")


commit0_dataset = load_dataset("wentingzhao/commit0_combined", split="test")

file = "commit0_prepare/my_issue.md"
target_dir = "commit0_prepare/repos/"

copy_to_subdirs(file, target_dir)

# Create directory if it doesn't exist
os.makedirs("config/commands/commit0/", exist_ok=True)

for i in commit0_dataset:
repo_name = i["repo"].split("/")[1]
test_cmd = i["test"]["test_cmd"]
test_dir = i["test"]["test_dir"]
src_dir = i["src_dir"]
base_commit = i["base_commit"]
reset_cmd = f"git reset --hard {base_commit}"
# submit_cmd = f"`git diff {base_commit} -- . ':(exclude)spec.pdf.bz2' > /patch.diff`"

submit_sh = f"{repo_name}_submit.sh"
# Read the default yaml file
with open("commit0_prepare/default_from_url_commit0.yaml") as f:
yaml_content = f.read()

yaml_content = yaml_content.replace("{src_dir}", f"{src_dir}")
# Replace {test_cmd} with actual test command and directory
yaml_content = yaml_content.replace("{test_cmd}", f"{test_cmd}")
yaml_content = yaml_content.replace("{test_dir}", f"{test_dir}")
yaml_content = yaml_content.replace("{git_reset}", f"{reset_cmd}")
# yaml_content = yaml_content.replace('{submit_cmd}', f'{submit_cmd}')
yaml_content = yaml_content.replace("{submit.sh}", f"{submit_sh}")

submit_path = f"config/commands/commit0/{repo_name}_submit.sh"

# Create submit.sh content for this repo
submit_content = f"""# @yaml
# signature: submit
# docstring: submits your current code and terminates the session
submit() {{
cd /testbed

git add -A
git diff {base_commit} -- . ':(exclude)spec.pdf.bz2' > model.patch
echo "<<SUBMISSION||"
cat model.patch
echo "||SUBMISSION>>"
}}
"""
# Write submit.sh file
with open(submit_path, "w") as f:
f.write(submit_content)
# Create new yaml file for this repo
output_path = f"config/commit0/prompt/{repo_name}.yaml"
os.makedirs(os.path.dirname(output_path), exist_ok=True)
with open(output_path, "w") as f:
f.write(yaml_content)
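
Note for reviewers: the per-repo YAML is produced with chained `str.replace` calls rather than `str.format`. That choice matters because the template also contains placeholders such as `{WINDOW}`, `{command_docs}` and `{observation}`, which are presumably filled in later by SWE-agent and must survive this step. A minimal sketch of the same substitution as a function follows. `render_config` and the sample values are hypothetical and not part of this PR.

```python
from __future__ import annotations


def render_config(template: str, values: dict[str, str]) -> str:
    """Replace only the given {name} placeholders; leave all other braces untouched."""
    for name, value in values.items():
        template = template.replace("{" + name + "}", value)
    return template


template = (
    "The editor shows you {WINDOW} lines of a file at a time.\n"
    "The source dir you need to edit is in: /testbed/{src_dir}\n"
    "Here is the test you can use: {test_cmd} /testbed/{test_dir}\n"
)
# Sample values are made up for illustration.
rendered = render_config(
    template,
    {"src_dir": "src/example", "test_cmd": "pytest", "test_dir": "tests/"},
)
print(rendered)
```

Calling `template.format(...)` on the full file would instead raise `KeyError` for the placeholders that are not supplied at this stage.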
12 changes: 12 additions & 0 deletions commit0_prepare/my_issue.md
@@ -0,0 +1,12 @@
Here is your task:

You need to complete the implementations for all functions (i.e., those with pass
statements) and pass the unit tests.

Do not change the names of existing functions or classes, as they may be referenced
from other code like unit tests, etc.

When you generate code, you must maintain the original formatting of the function
stubs (such as whitespace), otherwise we will not be able to search/replace blocks
for code modifications, and therefore you will receive a score of 0 for your generated
code.
78 changes: 78 additions & 0 deletions commit0_prepare/run_commit0.sh
@@ -0,0 +1,78 @@
#!/bin/bash

export ANTHROPIC_API_KEY=XXX
repos=(
"simpy"
"wcwidth"
"parsel"
"chardet"
"minitorch"
"tinydb"
"deprecated"
"voluptuous"
"cachetools"
"imapclient"
"marshmallow"
"jinja"
"cookiecutter"
"portalocker"
"pyjwt"
"babel"
"statsmodels"
"python-progressbar"
"xarray"
"imbalanced-learn"
"web3.py"
"scrapy"
"seaborn"
"pypdf"
"pexpect"
"pytest"
"pylint"
"joblib"
"dulwich"
"virtualenv"
"networkx"
"requests"
"sphinx"
"jedi"
"moviepy"
"loguru" #####
"paramiko"
"geopandas"
"bitstring"
"fastapi"
"tornado"
"python-prompt-toolkit"
"attrs"
"PyBoy"
"pydantic"
"filesystem_spec"
"tlslite-ng"
"graphene"
"mimesis"
"dnspython"
"python-rsa"
"more-itertools"
"click"
"fabric"
"flask"
"sqlparse"
)

# Ensure the log directory exists before redirecting output into it
mkdir -p log/commit0

for repo in "${repos[@]}"; do
echo "Processing $repo..."

python run.py \
--model_name claude-3-5-sonnet-20240620 \
--data_path "commit0_prepare/repos/$repo/my_issue_$repo.md" \
--repo_path "commit0_prepare/empty_repo" \
--config_file "config/commit0/prompt/$repo.yaml" \
--image_name wentingzhao/$repo:v0 \
--per_instance_cost_limit 1.00 \
--apply_patch_locally > "log/commit0/$repo.log" 2>&1

echo "Completed $repo"
done

echo "All repos processed"
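
Note for reviewers: each iteration of the loop depends on two files generated earlier, the issue file under `commit0_prepare/repos/` and the prompt config under `config/commit0/prompt/`. A small preflight check along these lines could surface missing inputs before any model calls are made. This is a sketch only. `missing_inputs` is a hypothetical helper, and the paths mirror the ones passed to `run.py` above.

```python
from __future__ import annotations

from pathlib import Path


def missing_inputs(repos: list[str], root: Path = Path(".")) -> list[Path]:
    """Return the per-repo input files the run loop expects but that do not exist."""
    missing: list[Path] = []
    for repo in repos:
        expected = [
            root / "commit0_prepare" / "repos" / repo / f"my_issue_{repo}.md",
            root / "config" / "commit0" / "prompt" / f"{repo}.yaml",
        ]
        missing.extend(path for path in expected if not path.is_file())
    return missing


# Example: report anything missing for two of the repos in the list.
for path in missing_inputs(["simpy", "wcwidth"]):
    print(f"missing: {path}")
```

Running this from the repository root after `generate_prompt_for_each_repo.py` should print nothing if the two scripts agree on repository names.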