Experimental yaml input format #1842
Codecov Report
@@ Coverage Diff @@
## main #1842 +/- ##
==========================================
- Coverage 90.46% 90.28% -0.19%
==========================================
Files 20 21 +1
Lines 10682 10733 +51
Branches 2167 2174 +7
==========================================
+ Hits 9664 9690 +26
- Misses 572 597 +25
Partials 446 446
Two thoughts:
# TODO nasty going back to JSON here - can we make a demes.fromdict()
# function to do this directly?
demes_model = demes.loads(json.dumps(demes_dict), format="json")
Aha! Thanks @grahamgower.
I agree with @petrelharp that it maybe needs to have separate
Thanks, great points @petrelharp and @grahamgower! I think a combined WRT to the CLI, I've already created an
Update: I've added the proposed mutations/ancestry sections and the config looks like this now:

    ancestry:
      sequence_length: 100000
      recombination_rate: 1e-8
      samples: {A: 100, B: 100}
      ploidy: 1
      model: hudson
    demography:
      time_units: generations
      demes:
        - name: X
          epochs: [{end_time: 1000, start_size: 2000}]
        - name: A
          ancestors: [X]
          epochs: [{start_size: 2000}]
        - name: B
          ancestors: [X]
          epochs: [{start_size: 2000}]
    mutations:
      rate: 1e-8
      model: blosum62

To make this fully general we'd need to
This looks really nice to me. I agree that ancestry/mutations/output blocks make a lot of sense, and those updates look clean. If I'm reading the changes correctly, you can place any valid argument to sim_ancestry and sim_mutations into this yaml? So you could specify seeds, or more complicated models (e.g. dtwf then switch to hudson), etc. For an "output" block, it might be nice to be able to specify "trees" vs "vcf", plus all the bells and whistles that go with those. Not sure how general you intend this input approach to be. Overall, I think this would be a nice middle ground that avoids both Python scripting and the CLI (which can be confusing for some). Looking forward to discussing more today in a bit.
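To make the "output" block idea concrete, here is a rough sketch of how a "trees" vs "vcf" choice could be dispatched. The `output` block, its `format` and `path` keys, and the `write_output` helper are all hypothetical and not part of this PR; the only real API assumed is tskit's `TreeSequence.dump` and `TreeSequence.write_vcf`.

```python
def write_output(ts, output_cfg):
    """Write a tree sequence according to a hypothetical 'output' config block.

    `ts` is expected to behave like a tskit.TreeSequence; `output_cfg` is a
    dict such as {"format": "vcf", "path": "out.vcf"} (names are assumptions).
    """
    path = output_cfg["path"]
    fmt = output_cfg.get("format", "trees")  # default to the native format
    if fmt == "trees":
        ts.dump(path)
    elif fmt == "vcf":
        with open(path, "w") as f:
            ts.write_vcf(f)
    else:
        raise ValueError(f"Unknown output format: {fmt!r}")
```

Any extra "bells and whistles" (for example VCF-specific options) would presumably be passed through as keyword arguments; that part is left out here.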
I like the approach overall. I think embedding the
This is an experiment to see what a yaml/json input format (building on demes) would look like. It mostly works I think, except for the basic confusion about the direction of time. We can easily imagine adding to this to allow for things like recombination maps.
Here's an example input file:
The idea is that we embed the Demes yaml description within the larger simulation configuration context. When we're parsing the input yaml, we just hand off the parsing of the demography object to demes-python, which will do all the hard work for us.

I'm not suggesting this as a general specification for popgen simulations, I just want to illustrate the power that we get from keeping Demes simple and self-contained. To me, the ability to make a simple configuration file for a specific simulator like this is a powerful argument for not over-specifying the standard. The more bells and whistles we add to the spec, the less likely it is that it'll be compatible across different simulators.
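The hand-off could look roughly like the sketch below. This is not the code in this PR: the `split_config` and `load_demography` helpers and the `KNOWN_SECTIONS` set are hypothetical names, the section names are taken from the example config in this thread, and the input is given as JSON only so that the sketch runs with the standard library. The one real API assumed is `demes.Graph.fromdict`, which builds a graph from an already-parsed dict.

```python
import json

# Hypothetical top-level sections, based on the example config in this thread.
KNOWN_SECTIONS = {"ancestry", "demography", "mutations"}


def split_config(config):
    """Split a parsed config dict into (ancestry, demography, mutations)."""
    unknown = set(config) - KNOWN_SECTIONS
    if unknown:
        raise ValueError(f"Unknown config sections: {sorted(unknown)}")
    ancestry = dict(config.get("ancestry", {}))
    mutations = dict(config.get("mutations", {}))
    # The demography block is passed through untouched: its contents are
    # the business of demes-python, not of the simulator's config parser.
    return ancestry, config.get("demography"), mutations


def load_demography(demography_dict):
    """Hand the embedded Demes description to demes-python."""
    import demes  # imported lazily so split_config has no dependencies

    return demes.Graph.fromdict(demography_dict)


example = json.loads("""
{
  "ancestry": {"sequence_length": 100000, "samples": {"A": 100, "B": 100},
               "ploidy": 1, "model": "hudson"},
  "demography": {
    "time_units": "generations",
    "demes": [
      {"name": "X", "epochs": [{"end_time": 1000, "start_size": 2000}]},
      {"name": "A", "ancestors": ["X"], "epochs": [{"start_size": 2000}]},
      {"name": "B", "ancestors": ["X"], "epochs": [{"start_size": 2000}]}
    ]
  },
  "mutations": {"rate": 1e-8, "model": "blosum62"}
}
""")
ancestry, demography, mutations = split_config(example)
```

The point of the design is that the simulator only needs to know its own sections; everything under `demography` is validated by demes-python when `load_demography(demography)` is called.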
Any thoughts @molpopgen @grahamgower @apragsdale? I've been talking about simulation configurations being able to "refer" to elements of the Demes model for a while, and this is an attempt to make things concrete. (I guess we shouldn't get into detailed discussions about Demes itself here though: if someone wants to follow up, maybe create an issue on the spec repo to discuss?)