-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add mus mus domesticus demographic model #1485
add mus mus domesticus demographic model #1485
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry it took me so much time to get to this. I reviewed the Domesticus demographic model and left some comments. Only one has to do with actual content. The other two are more about format and documentation. Let me know if you have any questions.
|
||
return stdpopsim.DemographicModel( | ||
id="M_musculus_domesticus_Europe", | ||
description="M. musculus domesticus piecewise constant size", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have certain conventions for the demographic model ids. You can find them described here. According to these conventions, I would suggest something like DomesticusEurope_1F22
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I also added similar names for the other demographic models.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
This model comes from MSMC using four randomly sampled | ||
individuals (DEU01,DEU03,DEU04,DEU06) from a German population. | ||
The model is estimated with 57 time periods. | ||
""", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also explicitly mention Fig 3 of Fujiwara et al 2022 and say that the population sizes and time changes were supplied by the authors (maybe even mention a specific author?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added these in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great
This model comes from MSMC using four randomly sampled | ||
individuals (KOR01,KOR02,KOR03,KOR05) from a Korean population. | ||
The model is estimated with 57 time periods. | ||
""", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments above about the model id and long description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
2040, | ||
3844, | ||
90428, | ||
145603, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check that there is no offset here. From examination of fig 3, it looks like between 300-400 generations ago the Ne was ~150K. The way the arrays are set up here, I think that Ne=145,603 is associated with the time range 420-570.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say offset here do you mean something being added at the stage of plotting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There appears to be an inconsistency between the plot in fig 3 of Fujiwara et al 2022 and your implementation. I don't know if the problem is in fig 3 or your implementation. If I understand the implementation correctly, then each Ne in your table is being associated with the wrong time range (shifted one range back in time). The 5th Ne (145603) is associated in your implementation with the 5th time interval (420 - 570 generations ago). However, in Fig. 3, it appears to be associated the time range somewhere between 300 to 400 generations ago, which fits your 4th time interval. If you remove the first element in the sizes
array, this should fix it IMO.
Thanks a lot! Also note from the automatic checks that we're missing a table of parameters:
|
Codecov ReportAll modified lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1485 +/- ##
==========================================
- Coverage 99.85% 99.85% -0.01%
==========================================
Files 125 122 -3
Lines 4217 4210 -7
Branches 588 591 +3
==========================================
- Hits 4211 4204 -7
Misses 3 3
Partials 3 3
☔ View full report in Codecov by Sentry. |
I have added these parameter tables in and the automatics checks seem to be passing now. |
|
||
return stdpopsim.DemographicModel( | ||
id="M_musculus_domesticus_Europe", | ||
description="M. musculus domesticus piecewise constant size", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Time (yrs.),291,Begining of 54th time interval | ||
Time (yrs.),180,Begining of 55th time interval | ||
Time (yrs.),83,Begining of 56th time interval | ||
Time (yrs.),0,Begining of 57th time interval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameter table looks okay. I think that the standard is to have one more entry in the list of Nes than in the list of times. This should correspond to the Ne before the first time interval (the ancestral Ne). Other Nes should map to time intervals. This also relates to my comment on the "shift" in the demographic model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. See my comment about the number of Ne entries in the Domesticus model
@igronau I removed the first time interval from the demographic history file and tables, please let me know if any additional changes are needed. |
@peterdfields , Sorry for the delayed response. I just came back from a long vacation. I think that your changes didn't fix the problem. You just removed he last entry in the |
@igronau No worries. I'm happy to discuss over zoom. What sort of windows of time work best for you? |
I'm now in UCLA (Pacific time) and quite flexible in my schedule until next Wednesday. What time works best for you? |
I messaged you over slack. Let me know if you would prefer to continue here. |
ping @igronau @peterdfields - what's going on with this? |
@petrelharp @igronau I made some small changes to the configuration and tables files but it's not obvious to me how these changes caused the (runner?) errors. I just saw the update on Slack about getting the demography for mouse figured out and I remembered I needed to implement @igronau suggestions. |
Those errors are not your fault - looks like random CI failure. I'd say (a) implement the changes you need to make (and push); (b) run tests locally to identify errors; then (c) we'll get CI sorted if there's still an issue. |
@petrelharp the updates pushed yesterday seem to be working fine on the functional side. @igronau do the implemented changes fix the problems you identified before? |
I verified the model for Domesticus and it seems fine. I created a figure below to validate that the pop size between |
I now also validated the demographic model for Mus Mus (red line in fig 3). In particular, I checked the first peak in Ne, which according to the plot occurred around 900-750 generations ago. The first local max in |
I now also validated the demographic model for Mus Cas (green line in fig 3). I checked the first four intervals and they seem consistent with the figure (although the Ne of 938111 between 1886 and 3011 generations ago is not clear in the figure because it is out of bounds). Also here, the parameter table in CastaneusIndia_1F22.csv requires tweaking. Again, size[1] (64853) is associated with the 55th time interval and not the 56th interval. This can be fixed by adding another 344802 value to the table. |
@igronau I've updated the table files with (I hope correct!) the additional values/intervals you described. Please let me know if I should add any other modifications. |
Looks good. I think that this PR can finally be merged, and then I can formally QC it |
@petrelharp - can you merge this? |
yay! On to the DFE...I think this one will probably be the most tricky. |
I've run the tests locally, and they've passed. We'll deal with the CI failures elsewhere (e.g., maybe bumping the cache number like Ive just done in #1526 will do it). So, I'll merge this. Can one of you open the QC issue, please? |
Hi @igronau It's taken me a bit of time to get back to this but you mentioned I should ping you when I submit things for the demographic models and DFE for mouse. Here is a draft demographic model for Mus musculus domesticus from Fujiwara et al. 2022. I have models for the other two subspecies of mouse as well from the same publication, I figured I'd just start with this one to make sure the format was okay. I haven't heard back anything about the genetic map, should I try pinging anyone about that? Anyway, if this looks okay I can add the two other models and then start on the DFE.