
Automate Model Solution Evaluation Against Official Solutions #12

Open
lgabs opened this issue Nov 12, 2024 · 0 comments
Labels: help wanted (Extra attention is needed)

Comments

@lgabs (Owner) commented Nov 12, 2024

Automate the evaluation of model-generated solutions by comparing them against the official solutions. This would streamline validation and accuracy assessment. A final human review would still be necessary to catch mistakes, but it would become a much faster phase for reviewers.

A simple and efficient process would be:

  • store the official answers in a subfolder of the exam_path to be solved
  • during the solving process, at the end of each question, prompt gpt-4o (cheaper than o1) to compare the model's answer with the official answer and produce a concise result, much like langchain's correctness prompts; save this comparison (see the sketch after this list)
  • when compiling the PDF, place the comparison after the model's answer for easy assessment
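
A minimal sketch of the per-question comparison step, assuming the OpenAI Python SDK and that the model's answer and the official answer are available as plain strings; the function name, prompt wording, and storage of the result are illustrative and not taken from the repo:

```python
# Hypothetical sketch: grade one answer with gpt-4o against the official solution.
from openai import OpenAI

client = OpenAI()

# Assumed grader prompt, loosely inspired by langchain's correctness criteria.
GRADER_PROMPT = """You are grading an exam answer.

Official solution:
{official}

Model's answer:
{model}

Compare the model's answer against the official solution. State whether it is
CORRECT or INCORRECT and give a one-paragraph justification."""


def compare_answers(model_answer: str, official_answer: str) -> str:
    """Ask gpt-4o for a concise correctness assessment of the model's answer."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": GRADER_PROMPT.format(
                    official=official_answer, model=model_answer
                ),
            }
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```

The returned text could then be saved next to the question so that, when compiling the PDF, it can be inserted right after the model's answer.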
@lgabs added the help wanted label Nov 12, 2024
Projects: None yet
Development: No branches or pull requests
1 participant