I'm trying to split a dataset into training and test sets in an 80:20 ratio using a predicate, and I got the following error:
/home/xzk/.local/lib/python3.7/site-packages/petastorm/hdfs/namenode.py:270: FutureWarning: pyarrow.hdfs.connect is deprecated as of 2.0.0, please use pyarrow.fs.HadoopFileSystem instead.
  return pyarrow.hdfs.connect(hostname, url.port or 8020, **kwargs)
Worker 3 terminated: unexpected exception:
Traceback (most recent call last):
  File "/home/xzk/.local/lib/python3.7/site-packages/petastorm/workers_pool/thread_pool.py", line 62, in run
    self._worker_impl.process(*args, **kargs)
  File "/home/xzk/.local/lib/python3.7/site-packages/petastorm/arrow_reader_worker.py", line 150, in process
    all_cols = self._load_rows_with_predicate(parquet_file, piece, worker_predicate, shuffle_row_drop_partition)
  File "/home/xzk/.local/lib/python3.7/site-packages/petastorm/arrow_reader_worker.py", line 258, in _load_rows_with_predicate
    erase_mask = match_predicate_mask.map(operator.not_)
AttributeError: 'bool' object has no attribute 'map'
Iteration on Petastorm DataLoader raise error: AttributeError("'bool' object has no attribute 'map'")
Is this a bug, or am I using the predicate the wrong way? Please help, thank you!
My code:
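# Imports assumed from the names used below.
from petastorm import make_batch_reader
from petastorm.predicates import in_pseudorandom_split
from petastorm.pytorch import DataLoader

# dataset_url and reader_epochs are defined elsewhere in my script.
def train_model(num_epochs=100, batch_size=1000):
    for epoch in range(num_epochs):
        with DataLoader(
                make_batch_reader(dataset_url, num_epochs=reader_epochs, schema_fields=None,
                                  transform_spec=None, seed=1, shuffle_rows=False, shuffle_row_groups=False,
                                  predicate=in_pseudorandom_split([0.8, 0.2], 0, "some_column_name")),
                batch_size=150) as dataloader:
            # Print only the first batch to check that reading works.
            for raw in dataloader:
                print(raw)
                break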
I also noticed that do_include(...) seems to return only a bool, which would explain why the .map call in the traceback fails.
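To make the mismatch concrete, here is a minimal sketch of a split that returns one boolean per row instead of a single bool. It uses only the standard library and is not petastorm's implementation: the helper name split_mask, the md5-based hashing, and the sample keys are my own assumptions for illustration. My guess, from the traceback only, is that the batch reader wants a per-row result like this (there, something with a .map method such as a pandas Series).

```python
import hashlib
import operator


def split_mask(values, fractions, subset_index):
    """Return one bool per value: True if the value hashes into the chosen subset.

    Hypothetical helper for illustration; not part of petastorm.
    """
    # Cumulative boundaries, e.g. [0.8, 0.2] -> [0.0, 0.8, 1.0]
    bounds = [0.0]
    for fraction in fractions:
        bounds.append(bounds[-1] + fraction)
    low, high = bounds[subset_index], bounds[subset_index + 1]

    mask = []
    for value in values:
        digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
        # Map the first 32 bits of the hash to a position in [0, 1).
        position = int(digest[:8], 16) / 2 ** 32
        mask.append(low <= position < high)
    return mask


# Made-up keys standing in for the values of "some_column_name".
keys = ["row-%d" % i for i in range(1000)]

train = split_mask(keys, [0.8, 0.2], 0)  # per-row mask for the 80% subset
test = split_mask(keys, [0.8, 0.2], 1)   # per-row mask for the 20% subset

# A per-row mask can be negated element by element, which is what the
# failing line in the traceback tries to do with operator.not_.
erase = list(map(operator.not_, train))

# A plain bool has no .map, which matches the AttributeError above.
print(hasattr(True, "map"))
```

Because the subset depends only on the hash of each key, the same row lands in the same subset on every epoch, and the train and test masks never overlap.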