Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add plotting methods #295

Merged
merged 22 commits into from
Nov 23, 2024

Conversation

brandynlucca
Copy link
Collaborator

This PR includes:

  • A new plot method for Survey-class objects
  • Several plotting options for population estimates
    • Bivariate age-length distribution
    • Along-transect distributions
    • Kriged (mesh) population estimates
  • Additional data exploration/manipulation options specifically for the kriged mesh estimates
    • Additional plot_type options with the default being plot_type="hexbin". This enables different views into the dataset such as being able to display the standard deviations of binned data, etc.
  • Associated Pydantic validators for echopop-specific parameters
  • numpydocs-formatted docstring with additional documentation and details
  • An accompanying jupyter-book page that provides an example notebook with different options/functionalities
    • This has been added to the _toc.yml table of contents configuration and successfully renders locally

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@brandynlucca brandynlucca self-assigned this Nov 4, 2024
@brandynlucca brandynlucca added documentation Improvements or additions to documentation enhancement New feature or request labels Nov 4, 2024
@brandynlucca brandynlucca added this to the v0.4.2 (reports, plots) milestone Nov 4, 2024
echopop/survey.py Outdated Show resolved Hide resolved
Copy link
Member

@leewujung leewujung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brandynlucca : Thanks for the PR! I made some inline comments above.

Overall I think this PR is ready to go. I have 2 main comments:


I think it would be good to actually expose functions like plot_mesh, plot_transect, plot_age_length_distribution in the Survey object, in a way similar to how you can access the plot functions in pandas. This is something that I think should be tabled for now so that we can get other parts of the codebased built. I will create an issue for it as a reminder -- I think one can use accessor functions to streamline the process.


I feel the way the allowed variables are restricted for each plot type is a little limiting. For example, why isn't kriged biomass density in the list of the mesh format? but that is "kriged_mean", right? and you also hard-coded kriged_variable = "biomass" in the plot_mesh function, so there's a potential confusion on the user side of what is being plotted.

My view of the plotting functions is that they are like "containers" where the variables can be swapped out as long as the dimension of the variables fit what is needed for the plot to happen, like how matplotlib imshow or scatter works. Plotting any variables with corresponding lat/lon values in the mesh format should be the same as plotting one of the variations in the defined set of allowed variables. I can see the convenience of providing defaults for the ticks and labels, which is great. I just think that leaving the door open for other variables would enhance reusability of the functions.

Using the kriged biomass density as an example, I feel it would be useful if the users can provide a 3-column dataframe (value, lat, lon) and specify that it is kriged biomass density, and can get the automatic ticks and labels. If they do not specify the type, they don't get any automatic labels.

Let me know what you think.
Edits: Thinking about this again, I think we should table the thoughts above (plotting container with 3-column dataframe as input) as well, so that we can move forward to address other things needed for this project. I was thinking that the plots, at least the mesh and track, can be implemented as accessors to dataframes in Echoshader in the long run. The length-age relationship uses imshow so is pretty straightforward itself, and since it's not acoustic data, we don't need to tie it there.

Requested changes:

  • The main changes for this PR I recommend here then would be change "kriged_mean" to "kriged biomass density mean" and similarly for other quantities from kriging. This way it is clear to the user what they will be plotting.
  • One thing I think we should consider is to change the column name of the dataframe as well -- the more explicit the better. "kriged_mean" does not say anything about what is being kriged. And given that in Matlab EchoPro there were options to krige other quantities like NASC, I think we should change it.

@brandynlucca
Copy link
Collaborator Author

brandynlucca commented Nov 21, 2024

Adding tracking checkboxes:

  • Remove redundancy in geo-configuration plotting parameters check
  • Add tracking and user warning for plotting parameters that go unused
  • Reference API for default parameter values
  • Give kriged mesh variable plotting options clearer names
  • Update example workbook
  • Update documentation

brandynlucca and others added 2 commits November 20, 2024 18:26
* Update to notebook

* Changes to kriged plot variable options

* Plotting function changes

* Pre-commit adjustments

* Change to self-referencing pydantic model

* Streamline geospatial configuration parameter updating in `Survey.plot`

* Test new pydantic classes

* Pydantic validator overhauls
@brandynlucca brandynlucca merged commit 7a5ec86 into OSOceanAcoustics:main Nov 23, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants