```python
parser.add_argument(
    "--mmc4_shards",
    type=str,
    default=mmc4_data_path,
    help="path to c4 shards, this should be a glob pattern such as /path/to/shards/shard-{0000..0999}.tar",
)
```
In the `get_dataset_size` function, we want to get the directory path of the data files.
But `shards` in the args is a string, as the `--mmc4_shards` definition above shows.
So `os.path.dirname(shards[0])` cannot return the path (e.g. `/path/to/shards`): `shards[0]` is only the first character of that string.
Should it be changed to `shard_list[0]`?
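
To illustrate the difference, here is a minimal sketch. It is not the repository's actual code: the pattern is the example from the help text, and `shard_list` is a hand-written stand-in (an assumption) for whatever list the function builds by expanding the pattern.

```python
import os

# Example pattern in the same form as the --mmc4_shards help text.
shards = "/path/to/shards/shard-{0000..0999}.tar"

# Indexing a string yields a single character, so dirname only sees "/".
dir_from_string = os.path.dirname(shards[0])

# Assumed stand-in for the expanded shard list; the real function would
# derive it from the pattern rather than hard-code it.
shard_list = [
    "/path/to/shards/shard-0000.tar",
    "/path/to/shards/shard-0001.tar",
]

# Indexing the list yields a full file path, so dirname returns its directory.
dir_from_list = os.path.dirname(shard_list[0])

print(repr(dir_from_string))
print(repr(dir_from_list))
```

With the string, only the first character reaches `os.path.dirname`, while the list element gives the expected directory.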