[Question]: How can I reproduce the FullAttention results on the Ruler dataset #87

Open
LfieLike opened this issue Nov 25, 2024 · 1 comment
Labels
question Further information is requested


Describe the issue

Thank you for such a solid piece of work. I followed the instructions in the README and ran the following command:
bash run.sh gradientai/Llama-3-8B-Instruct-262k hf ./result

However, I couldn't reproduce the 4K context length results mentioned in the paper.
The results are as shown in the attached image.
Am I doing something wrong?
Task              Score   Nulls
niah_single_1     100.0   0/25
niah_single_2     100.0   0/25
niah_single_3     100.0   0/25
niah_multikey_1   100.0   0/25
niah_multikey_2   100.0   0/25
niah_multikey_3   100.0   0/25
niah_multivalue    83.0   0/25
niah_multiquery   100.0   0/25
vt                100.0   0/25
cwe                94.4   0/25
fwe                93.33  0/25
qa_1               64.0   0/25
qa_2               44.0   0/25
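For a quick side-by-side with the paper, the per-task scores reported above can be reduced to one average and a list of the tasks that fall short of 100. The snippet below is only a rough sketch: the numbers are copied from this issue, the unweighted mean is an assumed aggregation (check how the paper averages before comparing), and the 25-samples-per-task figure is read off the "0/25" entries in the Nulls row.

```python
# Scores copied from the 4K run reported in this issue (task -> score).
scores = {
    "niah_single_1": 100.0,
    "niah_single_2": 100.0,
    "niah_single_3": 100.0,
    "niah_multikey_1": 100.0,
    "niah_multikey_2": 100.0,
    "niah_multikey_3": 100.0,
    "niah_multivalue": 83.0,
    "niah_multiquery": 100.0,
    "vt": 100.0,
    "cwe": 94.4,
    "fwe": 93.33,
    "qa_1": 64.0,
    "qa_2": 44.0,
}

# Assumption: each task was evaluated on 25 samples ("0/25" in the Nulls row).
SAMPLES_PER_TASK = 25


def summarize(task_scores):
    """Return the unweighted mean and the tasks scoring below 100."""
    mean = sum(task_scores.values()) / len(task_scores)
    below = {t: s for t, s in task_scores.items() if s < 100.0}
    return mean, below


mean, below = summarize(scores)
print(f"tasks: {len(scores)}, unweighted mean: {mean:.2f}")

# List the tasks that lost points, lowest score first.
for task, score in sorted(below.items(), key=lambda kv: kv[1]):
    print(f"{task:>16}: {score:6.2f}")

# If a task is scored as a per-sample hit or miss, one sample is worth
# this many points, so small runs can swing by several points.
print(f"points per sample: {100 / SAMPLES_PER_TASK:.1f}")
```

With only 25 samples per task, a gap of a few points on qa_1 or qa_2 may come down to one or two examples, so a larger sample count could be worth trying before concluding that the setup is wrong.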

LfieLike added the question (Further information is requested) label Nov 25, 2024
iofu728 self-assigned this Nov 26, 2024
Contributor

iofu728 commented Nov 26, 2024

Hi @LfieLike, thanks for your interest in MInference.

Indeed, we used this script to produce our results, and they align quite closely with those in the RULER repo. In your run, QA1 and QA2 seem to show a noticeable drop in score.

Could you try running the following command?

bash run.sh gradientai/Llama-3-8B-Instruct-262k minference ./result
