'generator raised StopIteration' error when running 'randomstats' with multiple processes #377

Open
tparket opened this issue Nov 20, 2022 · 4 comments

Comments

tparket commented Nov 20, 2022

Hi,

First of all - thank you for your amazing work. pybedtools has been super useful for my research so far and I am very grateful.

I'm trying to run 'randomstats' with the following args:

results_dict = a.randomstats(b, iterations=1000, new=True, genome_fn=chromsizes_fn, processes=4, shuffle_kwargs={"chrom": True}, intersect_kwargs={"f": 1})

`---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
~/.local/lib/python3.7/site-packages/pybedtools/bedtool.py in parallel_apply(self, iterations, func, func_args, func_kwargs, processes, _orig_pool)
2932 for it in range(iterations):
-> 2933 yield func(*func_args, **func_kwargs)
2934 raise StopIteration

~/.local/lib/python3.7/site-packages/pybedtools/stats.py in random_intersection(x, y, genome_fn, shuffle_kwargs, intersect_kwargs)
16 result = len(zz)
---> 17 helpers.close_or_delete(z, zz)
18 return result

~/.local/lib/python3.7/site-packages/pybedtools/helpers.py in close_or_delete(*args)
547 if hasattr(x.fn, "throw"):
--> 548 x.fn.throw(StopIteration)
549

StopIteration:

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
in

~/.local/lib/python3.7/site-packages/pybedtools/bedtool.py in randomstats(self, other, iterations, new, genome_fn, include_distribution, **kwargs)
2846 )
2847 distribution = self._randomintersection(
-> 2848 other, iterations=iterations, genome_fn=genome_fn, **kwargs
2849 )
2850

~/.local/lib/python3.7/site-packages/pybedtools/bedtool.py in _randomintersection(self, other, iterations, genome_fn, intersect_kwargs, _orig_pool, shuffle_kwargs, processes)
3038 ),
3039 processes=processes,
-> 3040 _orig_pool=_orig_pool,
3041 )
3042 )

RuntimeError: generator raised StopIteration`

The thing is that when I remove the 'processes' argument, 'randomstats' works just fine, but every time I try to run it with 'processes' (even with a value of 1), I get the aforementioned error.

Other relevant data:

  • 'a' and 'b' are both BedTool objects generated from a df. A regular a.intersect(b, f=1) works perfectly.
  • 'chromsizes_fn' is the name of a genome file generated from a dict with:
    chromsizes_fn = pybedtools.chromsizes_to_file(chromsizes_dic, fn=temp_genome.name)
    I tried using both fn=False and fn=temp_genome.name
  • I tried to run it with both new=True and without it. It crashed on both tries.

I would really appreciate your help. I'm planning to run 'randomstats' on a large number of files, with at least 1000 iterations each, and being able to use multiprocessing would make that feasible.

Owner

daler commented Nov 20, 2022

Great to hear you find pybedtools useful.

Can you provide an example of the files you're using for a and b so I can test locally?

Author

tparket commented Nov 21, 2022

Thanks for getting back to me so soon.
Please find the files* attached.

Archive.zip

*these are not the original files, but randomly generated intervals. Nevertheless, I’m getting the same errors.

igoronzy commented Aug 8, 2023

I'm getting the same error. Have there been any updates to fix this issue?

bentyeh commented Aug 9, 2023

Bumping this. Might be a Python versioning issue.

Prior to Python 3.7, a StopIteration raised inside the generator (parallel_apply()) would simply have signaled the end of the iteration. Starting in Python 3.7 (PEP 479), a StopIteration raised inside a generator is converted into a RuntimeError: see https://docs.python.org/3/library/exceptions.html#StopIteration
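
As a minimal, self-contained illustration of that behavior change (this is not pybedtools code; `cleanup_sketch` and `apply_sketch` are hypothetical stand-ins for close_or_delete() and parallel_apply()), a StopIteration that escapes a generator body on Python 3.7+ should surface as a RuntimeError whose cause is the original StopIteration, which matches the chained traceback in the report:

```python
def cleanup_sketch():
    # Hypothetical stand-in for a helper that raises StopIteration
    # while being called from inside a generator body.
    raise StopIteration


def apply_sketch(iterations):
    # Hypothetical stand-in for a generator that calls the helper
    # before yielding each result.
    for i in range(iterations):
        cleanup_sketch()
        yield i


def consume(gen):
    # Collect the results; return the error instead of crashing so it
    # can be inspected.
    try:
        return list(gen), None
    except RuntimeError as exc:
        return None, exc


results, error = consume(apply_sketch(3))
print(type(error).__name__, "-", error)
print("caused by:", type(error.__cause__).__name__)
```

On Python 3.6 and earlier the same loop would just have ended silently with no results, which is why the problem only appears on newer interpreters.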

A workaround that seems to work for now is to:

  1. Comment out these 2 lines in the close_or_delete() function from helpers.py

    if hasattr(x.fn, "throw"):
        x.fn.throw(StopIteration)
    
  2. Replace the 2 instances of raise StopIteration in BedTool.parallel_apply() with a simple return.
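
A rough sketch of what step 2 amounts to, assuming a simplified generator shaped like the loop shown in the traceback (`apply_sketch` and `count_overlaps_sketch` are hypothetical names, not the pybedtools source): ending the generator with a plain return lets iteration finish normally on Python 3.7+.

```python
def count_overlaps_sketch(x, y):
    # Hypothetical stand-in for the per-iteration work: counts shared items.
    return len(set(x) & set(y))


def apply_sketch(iterations, func, func_args=(), func_kwargs=None):
    # Simplified generator: call func once per iteration and yield the result.
    func_kwargs = func_kwargs or {}
    for _ in range(iterations):
        yield func(*func_args, **func_kwargs)
    return  # used here in place of `raise StopIteration`


counts = list(apply_sketch(3, count_overlaps_sketch, func_args=([1, 2, 3], [2, 3, 4])))
print(counts)
```

The trailing return is optional, since a generator function that simply runs off its end finishes the same way.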

I'm happy to submit a pull request, but this may be part of a larger issue of dealing with Python versions in pybedtools.
