[Question]: How can I reproduce the FullAttention results on the Ruler dataset #87

LfieLike · 2024-11-25T06:11:55Z

Describe the issue

Thank you for such a solid piece of work. I followed the instructions in the README and ran the following command:
bash run.sh gradientai/Llama-3-8B-Instruct-262k hf ./result

However, I couldn't reproduce the 4K context length results mentioned in the paper.
The results are as shown in the attached image.
Am I doing something wrong?

Task             Score   Nulls
niah_single_1    100.0   0/25
niah_single_2    100.0   0/25
niah_single_3    100.0   0/25
niah_multikey_1  100.0   0/25
niah_multikey_2  100.0   0/25
niah_multikey_3  100.0   0/25
niah_multivalue   83.0   0/25
niah_multiquery  100.0   0/25
vt               100.0   0/25
cwe               94.4   0/25
fwe               93.33  0/25
qa_1              64.0   0/25
qa_2              44.0   0/25
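
When comparing a run like this against a single averaged number, it can help to reduce the per-task scores above to one figure and to see which tasks pull it down. The following is a minimal sketch, assuming an unweighted mean over the 13 tasks is the quantity being compared (the thread does not say how the paper aggregates its numbers); the variable names are illustrative only.

```python
# Per-task scores copied from the summary reported in this issue.
scores = {
    "niah_single_1": 100.0,
    "niah_single_2": 100.0,
    "niah_single_3": 100.0,
    "niah_multikey_1": 100.0,
    "niah_multikey_2": 100.0,
    "niah_multikey_3": 100.0,
    "niah_multivalue": 83.0,
    "niah_multiquery": 100.0,
    "vt": 100.0,
    "cwe": 94.4,
    "fwe": 93.33,
    "qa_1": 64.0,
    "qa_2": 44.0,
}

# Assumption: an unweighted mean across tasks is the figure of interest.
mean_score = sum(scores.values()) / len(scores)

# The two lowest-scoring tasks, lowest first.
lowest = sorted(scores, key=scores.get)[:2]

print(f"tasks: {len(scores)}, mean: {mean_score:.2f}")
print("lowest:", lowest)
```

Listing the lowest-scoring tasks next to the mean shows whether a gap comes from a few tasks or is spread across all of them.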

iofu728 · 2024-11-26T09:19:48Z

Hi @LfieLike, thanks for your interest in MInference.

Indeed, we used this script to produce our results, and they align quite closely with those in the RULER repo. It seems that the QA1 and QA2 scores dropped noticeably.

Could you try running the following command?

bash run.sh gradientai/Llama-3-8B-Instruct-262k minference ./result

LfieLike added the question Further information is requested label Nov 25, 2024

iofu728 self-assigned this Nov 26, 2024
