Run SWE-agent on Commit0 #843

Draft: 13 commits to be merged into main

@nanjiangwill nanjiangwill commented Nov 27, 2024

Reference Issues/PRs

Commit0 is a from-scratch AI coding challenge.

The benchmark consists of 57 core Python libraries. The challenge is to rebuild these libraries and pass their unit tests. All libraries have:

This PR adds the functions needed to run Commit0 with SWE-agent, but it currently runs into the bugs described under Issues.

Issues

  1. Without actually invoking any tool or the command line, and with nothing in the prompt/context that hints at it, SWE-agent writes a command "call" in its response and then answers it itself with wrong output. See the example in commit0_prepare/traj_example/97af1e.traj
  2. Sometimes it even generates the test results itself instead of actually running pytest
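
A minimal sketch of how such steps could be flagged in a trajectory file. It assumes the .traj file is JSON with a top-level "trajectory" list whose steps carry a "response" field (the model's text); that layout and the helper names find_self_answered_steps and load_traj are assumptions for illustration, not part of this PR. The regex is only a heuristic for pytest-style summary lines.

```python
import json
import re

# Heuristic: a pytest summary line such as "===== 3 passed in 0.12s ====="
PYTEST_SUMMARY = re.compile(r"=+ .*\b\d+ (passed|failed|error)", re.IGNORECASE)


def load_traj(path: str) -> dict:
    """Load a trajectory file, assuming it is plain JSON."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_self_answered_steps(traj: dict) -> list[int]:
    """Return indices of steps whose model response contains pytest-like output.

    Real test output should come back from the environment, not be written
    by the model, so a summary line inside the response is suspicious.
    """
    flagged = []
    for i, step in enumerate(traj.get("trajectory", [])):
        response = step.get("response") or ""
        if PYTEST_SUMMARY.search(response):
            flagged.append(i)
    return flagged
```

Running find_self_answered_steps(load_traj("commit0_prepare/traj_example/97af1e.traj")) would list candidate steps to inspect by hand; false positives are possible, for example when the model quotes an earlier real result.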

To Reproduce

  1. Run python commit0_prepare/generate_prompt_for_each_repo.py
  2. Run commit0_prepare/run_commit0.sh with your API key set
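
The two steps above can be sketched as a small driver, run from the repository root. How run_commit0.sh expects the API key is not stated here, so passing it through an environment variable, and the variable name OPENAI_API_KEY, are assumptions; build_commands and reproduce are hypothetical helper names.

```python
import os
import subprocess

PROMPT_SCRIPT = "commit0_prepare/generate_prompt_for_each_repo.py"
RUN_SCRIPT = "commit0_prepare/run_commit0.sh"


def build_commands() -> list[list[str]]:
    """Return the two reproduction commands in the order they should run."""
    return [["python", PROMPT_SCRIPT], ["bash", RUN_SCRIPT]]


def reproduce(api_key: str, key_var: str = "OPENAI_API_KEY") -> None:
    """Run both steps, exposing the key via an (assumed) environment variable."""
    env = {**os.environ, key_var: api_key}
    for cmd in build_commands():
        # check=True stops at the first failing step
        subprocess.run(cmd, check=True, env=env)
```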

What is changed in source code

  1. We use the existing Commit0 docker image and run SWE-agent on it. The files are already inside docker, so no real repository path is needed; that is why we pass a placeholder, --repo_path "commit0_prepare/empty_repo"
  2. We are using uv in docker, so the environment setup code is changed accordingly
  3. The Commit0 docker image needs to run with the platform flag specified
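
As a rough illustration of points 1 and 3, the sketch below builds the argument lists involved. The docker run --platform flag exists in the Docker CLI, but the platform value linux/amd64 and the image name used in the usage note are assumptions, not taken from this PR, and the helper names are hypothetical.

```python
EMPTY_REPO = "commit0_prepare/empty_repo"


def docker_run_args(image: str, platform: str = "linux/amd64") -> list[str]:
    """Build a `docker run` argument list with an explicit platform.

    The default platform is an assumed value for illustration.
    """
    return ["docker", "run", "--rm", "--platform", platform, image]


def repo_path_args() -> list[str]:
    """Placeholder repo path: the real files already live inside the image."""
    return ["--repo_path", EMPTY_REPO]
```

For example, docker_run_args("example/commit0-image") yields a command that pins the platform, where example/commit0-image is a made-up image name.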

@nanjiangwill nanjiangwill marked this pull request as draft November 27, 2024 21:20