
add inference #79

Merged: 59 commits, Oct 16, 2024

Conversation

@ejd99 (Collaborator) commented Feb 24, 2023

No description provided.

@richardreeve richardreeve self-requested a review February 24, 2023 14:19
@coveralls commented Feb 24, 2023

Pull Request Test Coverage Report for Build 6700074205

  • 2 of 224 (0.89%) changed or added relevant lines in 3 files are covered.
  • 46 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-5.6%) to 54.527%

Changes missing coverage:

| File | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/Interface.jl | 0 | 1 | 0.0% |
| src/inference.jl | 0 | 221 | 0.0% |

Files with coverage reduction:

| File | New Missed Lines | % |
|---|---|---|
| src/Interface.jl | 1 | 55.56% |
| src/TreeSet.jl | 1 | 42.42% |
| src/trim.jl | 2 | 95.0% |
| src/rcall.jl | 42 | 0.0% |

Totals (change from base Build 5476180660: -5.6%):
  • Covered Lines: 1361
  • Relevant Lines: 2496

💛 - Coveralls

@mkborregaard (Member) commented:

Nice! What does this PR do?

@richardreeve (Member) commented Mar 1, 2023

Hi @mkborregaard - Emily is my PhD student, and she is writing code to do trait inference and calculations of phylogenetic signal on trees. She's initially reproducing some R phylolm code to do an ML approach, and then she'll be developing a Bayesian algorithm.

@mkborregaard (Member) commented:

Sounds amazing! I guessed as much :-) Would it be best if I interact and comment on the PRs, or should I just keep hands off?

@richardreeve richardreeve marked this pull request as draft March 1, 2023 14:58
@richardreeve (Member) commented:

It would be great to have your input in a bit, though if you have any obvious thoughts feel free to chip in. It's very much WIP at the moment though (I've formally converted it to a draft), so feedback once it's a bit more advanced would be very helpful.

@richardreeve (Member) commented Apr 3, 2024

@ejd99 - I've updated the code to use a union of a Dual and a Float64 by default, but you can select a Number (2-3x as slow!) or Float64 (5 times faster, but doesn't work with lambda). The code still scales linearly with the number of tips, and on my M1 Max, it looks like it would run a 200,000-tip tree with just β and σ in about 3 hours, and with β, σ and λ in about 20 hours. Obviously, if I can work out how to select the data type better so it's completely type stable, it'll hopefully speed up more, but this is a good start.

| Model | Tips | Float64 | Union | Number |
|---|---|---|---|---|
| covariance, β and σ | 100 | 13.66s = 0.137s/tip | N/A | N/A |
| covariance, β and σ | 200 | 95.17s = 0.476s/tip | N/A | N/A |
| threepoint, β and σ | 100 | 5.63s = 0.056s/tip | 29.36s | 69.11s |
| threepoint, β and σ | 200 | 11.04s = 0.055s/tip | 66.5s | |
| threepoint, β, σ and λ | 100 | N/A | 25.14s = 0.251s/tip | 78.5s |
| threepoint, β, σ and λ | 200 | N/A | 51.01s = 0.255s/tip | 159.14s |
| threepoint, β, σ and λ | 50k | N/A | 19093s = 0.382s/tip | |
| threepoint, β, σ and λ | 100k | N/A | 35275s = 0.353s/tip | |
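For anyone following along, the element-type trade-off behind the Float64/Union/Number columns can be sketched roughly like this. This is a minimal illustration, not the PR's actual code: `Dual2` is a toy stand-in for a ForwardDiff-style dual number, and all names here are hypothetical.

```julia
# Toy stand-in for a forward-mode AD dual number (value + derivative),
# needed when optimising λ via automatic differentiation.
struct Dual2 <: Real
    val::Float64
    der::Float64
end
Base.:+(a::Dual2, b::Dual2) = Dual2(a.val + b.val, a.der + b.der)

# The compromise type: a small Union the compiler can "union-split",
# keeping near-Float64 speed while still admitting dual numbers.
const DualOrFloat = Union{Float64, Dual2}

xs_fast = DualOrFloat[1.0, 2.0, 3.0]  # small-Union storage: union-split, fast
xs_slow = Number[1.0, 2.0, 3.0]       # abstract storage: boxed, dynamic dispatch
```

Julia union-splits small concrete `Union`s, so the `DualOrFloat` container avoids most of the boxing and dynamic-dispatch cost that a `Vector{Number}` incurs, while still letting dual numbers flow through when λ must be differentiated.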

@richardreeve (Member) commented Apr 3, 2024

I've identified an error in how you were using lambda in logpdf(::MyDist3, ...), which I think caused the issue that lambda wasn't converging correctly, and I've improved the speed of the function (I've updated the table above). As a result, somewhat surprisingly, the lambda version runs slightly faster than the version without it when both use the same node data type. That might suggest I've introduced an error, of course, so you should check! However, the simpler model is ultimately much faster because it can use just Float64s.

Edit: Looking at the speeds, I suspect that the time scales as $tip^1$ for the threepoint model and $tip^2$ for the covariance matrix model, but we see an additional slowdown in both cases because, as the number of tips increases, the data can no longer be held in a fast cache and has to be fetched from main memory. Not super important, but in case you were wondering what was happening.
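For context on where λ enters the covariance-matrix model: Pagel's λ can be applied as a simple rescaling of the phylogenetic covariance matrix, multiplying the off-diagonal entries (shared evolutionary history) by λ while leaving the diagonal untouched. A minimal sketch; `lambda_transform` is an illustrative name, not a function from this PR:

```julia
# Pagel's λ transform of a phylogenetic covariance matrix V:
# scale off-diagonal (shared-path) entries by λ, keep diagonals as-is.
function lambda_transform(V::AbstractMatrix{<:Real}, λ::Real)
    Vλ = λ .* V                # scale everything by λ...
    for i in axes(V, 1)
        Vλ[i, i] = V[i, i]     # ...then restore the diagonal
    end
    return Vλ
end

V = [1.0 0.6; 0.6 1.0]
lambda_transform(V, 0.5)       # returns [1.0 0.3; 0.3 1.0]
```

λ = 0 makes the tips independent and λ = 1 recovers plain Brownian motion, which is why only the shared (off-diagonal) terms are scaled; getting this scaling wrong in the likelihood is exactly the kind of bug that shows up as λ failing to converge.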

@richardreeve richardreeve marked this pull request as ready for review October 16, 2024 08:04
@richardreeve richardreeve merged commit 9c8eadc into dev Oct 16, 2024
6 of 10 checks passed
@richardreeve richardreeve deleted the ejd/inference branch October 23, 2024 21:02