Hi,
thanks for providing the dataset as a download. I downloaded the dataset from the location mentioned in #12 (comment).
But it appears that the format of the dataset differs from the files you get if you download the data yourself.
See this gist: the first file, `12092740.data`, I downloaded myself from archive.org, while the second file was part of the downloaded dataset.
As you can see, the file I downloaded myself contains the attributes `[XSUM]URL[XSUM]`, `[XSUM]INTRODUCTION[XSUM]` and `[XSUM]RESTBODY[XSUM]`. But the file from the dataset has `[SN]URL[SN]`, `[SN]TITLE[SN]`, `[SN]FIRST-SENTENCE[SN]` and `[SN]RESTBODY[SN]`.
My problem is that if I follow the tutorial at https://github.com/EdinburghNLP/XSum/tree/master/XSum-Dataset the scripts don't work with the unmodified files.
Which changes do I need to make to the scripts?
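To make the difference concrete, here is a minimal sketch of how both layouts could be read with one parser. It assumes that every field starts with a marker of the form `[TAG]NAME[TAG]` and that the field's text runs until the next marker, which is only what the two files in the gist suggest. `MARKER` and `parse_fields` are hypothetical names of my own, not part of the XSum scripts.

```python
import re

# Assumption: a field is introduced by a marker such as [XSUM]URL[XSUM] or
# [SN]URL[SN]; the same tag opens and closes the field name.
MARKER = re.compile(r"\[(XSUM|SN)\]([A-Z-]+)\[\1\]")


def parse_fields(text):
    """Return {field_name: content} for either marker style.

    The content of a field is everything between its marker and the
    next marker (or the end of the text), stripped of whitespace.
    """
    fields = {}
    matches = list(MARKER.finditer(text))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        fields[match.group(2)] = text[match.end():end].strip()
    return fields
```

With something like this, a file in either layout yields a dictionary keyed by `URL`, `RESTBODY` and so on, which shows that the two formats differ in the field names as well as in the tag: `INTRODUCTION` on one side, `TITLE` and `FIRST-SENTENCE` on the other.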
Best,
Pyfisch