from_ms should do something useful with replicates #31

Open
grahamgower opened this issue Jul 10, 2021 · 3 comments
Labels
enhancement New feature or request

Comments

@grahamgower
Member

grahamgower commented Jul 10, 2021

Currently, from_ms() just converts the first replicate. It might be nice to instead return an iterator over the tree sequences, like when one calls msprime.simulate() with num_replicates>1.

$ mspms 3 2 -T -r 5 6 -seeds 1 2 3
/home/grg/.local/bin/mspms 3 2 -T -r 5 6 -seeds 1 2 3
1 2 3

//
[1](3:0.620,(1:0.150,2:0.150):0.470);
[1](3:2.573,(1:0.150,2:0.150):2.423);
[1](3:0.910,(1:0.150,2:0.150):0.760);
[3](3:0.702,(1:0.150,2:0.150):0.552);

//
[1](1:1.195,(2:0.220,3:0.220):0.975);
[1](1:1.791,(2:0.220,3:0.220):1.572);
[1](1:1.205,(2:0.220,3:0.220):0.986);
[1](2:1.205,(1:0.555,3:0.555):0.651);
[1](2:0.621,(1:0.555,3:0.555):0.066);
[1](1:0.555,(2:0.518,3:0.518):0.036);
$ mspms 3 2 -T -r 5 6 -seeds 1 2 3 | python -c "import tsconvert, sys; ts = tsconvert.from_ms(sys.stdin.read()); print(ts.num_trees)"
4
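
For reference, the replicate structure in the output above can be recovered lazily with a few lines of plain Python. This is only a sketch of the proposed iterator behaviour, not how tsconvert works: `iter_ms_replicates` is a hypothetical name, and it assumes every replicate starts with a line that is exactly `//`, as in the output above.

```python
import io


def iter_ms_replicates(lines):
    """Lazily yield one list of non-blank lines per ms replicate.

    Hypothetical helper: assumes each replicate begins with a line that is
    exactly "//" and that everything before the first "//" is header.
    """
    block = None  # None while we are still inside the header
    for raw in lines:
        line = raw.strip()
        if line == "//":
            if block is not None:
                yield block  # the previous replicate is complete
            block = []
        elif block is not None and line:
            block.append(line)
    if block is not None:
        yield block  # the last replicate has no trailing "//"


# The mspms output shown above (echoed command line shortened).
ms_output = """\
mspms 3 2 -T -r 5 6 -seeds 1 2 3
1 2 3

//
[1](3:0.620,(1:0.150,2:0.150):0.470);
[1](3:2.573,(1:0.150,2:0.150):2.423);
[1](3:0.910,(1:0.150,2:0.150):0.760);
[3](3:0.702,(1:0.150,2:0.150):0.552);

//
[1](1:1.195,(2:0.220,3:0.220):0.975);
[1](1:1.791,(2:0.220,3:0.220):1.572);
[1](1:1.205,(2:0.220,3:0.220):0.986);
[1](2:1.205,(1:0.555,3:0.555):0.651);
[1](2:0.621,(1:0.555,3:0.555):0.066);
[1](1:0.555,(2:0.518,3:0.518):0.036);
"""

trees_per_replicate = [
    len(block) for block in iter_ms_replicates(io.StringIO(ms_output))
]
print(trees_per_replicate)  # [4, 6]
```

The first replicate has 4 tree lines, matching the `4` printed above; the 6 tree lines of the second replicate are the part that is not converted.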
@jeromekelleher
Member

Actually, looking through the current implementation of from_ms, it's not really fit for purpose at the moment. It needs a full rewrite to use the newick library rather than dendropy, and hardening against various inputs (like this one). It's probably best to avoid using it for now, and maybe reuse some of msprime's scrm handling code in verification.py for the comparisons in Demes.
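
As a rough illustration of what hardened parsing of the tree lines might involve, the sketch below splits each `[span]newick;` line into its span and Newick string and rejects anything else. It uses only the standard library; `parse_tree_line` is a hypothetical name, and the sketch assumes the `[span]` prefix is always present, as in the recombination output in this issue. The Newick string would then go to whichever Newick parser the rewrite settles on, a step that is left out here.

```python
import re

# An ms tree line with recombination looks like "[span]newick;".
TREE_LINE = re.compile(r"^\[(\d+)\](.+;)$")


def parse_tree_line(line):
    """Split one ms tree line into (span, newick_string).

    Raises ValueError on anything that does not match, rather than guessing.
    """
    match = TREE_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not an ms tree line: {line!r}")
    return int(match.group(1)), match.group(2)


# Tree lines of the first replicate from the output in this issue.
first_replicate = [
    "[1](3:0.620,(1:0.150,2:0.150):0.470);",
    "[1](3:2.573,(1:0.150,2:0.150):2.423);",
    "[1](3:0.910,(1:0.150,2:0.150):0.760);",
    "[3](3:0.702,(1:0.150,2:0.150):0.552);",
]

left = 0
intervals = []  # (left, right, newick_string) for each tree
for line in first_replicate:
    span, newick_string = parse_tree_line(line)
    intervals.append((left, left + span, newick_string))
    left += span

print(left)  # 6
```

The spans sum to 6, which agrees with the sequence length given by `-r 5 6` in the command.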

@grahamgower
Member Author

I ended up splitting the ms output at // and passing the separate strings into from_ms. A bit hacky, but no big deal. Having the tree sequence is quite an advantage for, e.g., getting the SFS, compared with parsing out stats manually as the scrm verification code does.
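
A minimal sketch of that kind of workaround, for anyone who needs it in the meantime. It is not the exact code used: `convert_replicates` is a hypothetical helper, the converter is passed in as a parameter so the example runs without tsconvert installed, and `count_trees` is only a stand-in for it. Splitting on lines that consist solely of `//` avoids tripping over a `//` elsewhere, for example in the echoed command path.

```python
import re


def convert_replicates(ms_text, convert):
    """Split ms output at "//" lines and convert each replicate separately.

    `convert` is any callable that takes one replicate's text. Using
    tsconvert.from_ms here is the intent, but that is an assumption of
    this sketch and is not exercised below.
    """
    # The first piece is the header (command line and seeds); discard it.
    _header, *chunks = re.split(r"^//[ \t]*$", ms_text, flags=re.MULTILINE)
    return [convert(chunk.strip()) for chunk in chunks]


def count_trees(chunk):
    """Stand-in converter: count the tree lines in one replicate."""
    return len(chunk.splitlines())


# The mspms output from the top of this issue (echoed command line shortened).
ms_text = """\
mspms 3 2 -T -r 5 6 -seeds 1 2 3
1 2 3

//
[1](3:0.620,(1:0.150,2:0.150):0.470);
[1](3:2.573,(1:0.150,2:0.150):2.423);
[1](3:0.910,(1:0.150,2:0.150):0.760);
[3](3:0.702,(1:0.150,2:0.150):0.552);

//
[1](1:1.195,(2:0.220,3:0.220):0.975);
[1](1:1.791,(2:0.220,3:0.220):1.572);
[1](1:1.205,(2:0.220,3:0.220):0.986);
[1](2:1.205,(1:0.555,3:0.555):0.651);
[1](2:0.621,(1:0.555,3:0.555):0.066);
[1](1:0.555,(2:0.518,3:0.518):0.036);
"""

result = convert_replicates(ms_text, count_trees)
print(result)  # [4, 6]
```

Passing `tsconvert.from_ms` as `convert` should then give one tree sequence per replicate, assuming `from_ms` accepts a replicate without the header lines, which this thread does not spell out.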

@benjeffery
Member

I agree that the from_ms code needs a rewrite to use the common, modular newick parser. Adding this to the milestone.

@benjeffery benjeffery added the enhancement New feature or request label Jul 18, 2021
@benjeffery benjeffery added this to the Initial 0.1 release milestone Jul 18, 2021