Tools for managing, formatting, coalescing and exporting data from our products, including creating subsets!
An RStudio tutorial is beyond the scope of this readme, but if you need to get started with installing R and RStudio, see: https://www.earthdatascience.org/courses/earth-analytics/document-your-science/setup-r-rstudio/
- Create an account: https://github.com
- Work through chapters 6-12 here if you need to install git, and connect it all with RStudio: https://happygitwithr.com/install-git.html
- Follow the instructions (at least through 5) here under "How to do this using RStudio and GitHub?" https://r-bio.github.io/intro-git-rstudio/
a. You don't need to enter the backticks in the shell.
b. This example is a bit misleading because it doesn't include the .git; copy the link to the clipboard like before.
c. RESTART RSTUDIO BEFORE MOVING ON TO STEP 6 IN THIS TUTORIAL
- If you want to pull updates from here to your copy, see chapter 31: https://happygitwithr.com/upstream-changes.html#pull-changes-from-upstream
Double-click the .Rproj file in your local repo folder to open RStudio in the right working directory and on your working git branch. From there, open whatever files you want to modify.
- "example.R" shows example implementations of the data management and node health functions. Read the comments as well; functions that produce files are commented out.
- "locate_example.R" is a template script for running the location functions
I suggest making your own copy of these scripts, renaming them, and modifying them with your file path inputs.
There is a subfolder within this repo named "functions" which is full of, well, scripts that contain functions! You'll notice they're often called (via source()) at the top of the example scripts. This loads in the custom functions that I have written to handle CTT data. Ultimately, these will be rolled into an R package.
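As a rough illustration of what those source() calls do, the sketch below loads every R script in a folder in one go. The helper name `source_functions` and the throwaway demo folder are hypothetical and not part of this repo; in practice you would point it at the "functions" subfolder.

```r
# Hypothetical helper (not part of this repo): source every .R file in a folder.
source_functions <- function(dir = "functions") {
  files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  for (f in files) source(f)  # source() evaluates each script in the global environment
  invisible(files)
}

# Demo on a throwaway folder so the sketch runs anywhere.
demo_dir <- file.path(tempdir(), "functions_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("add_one <- function(x) x + 1", file.path(demo_dir, "add_one.R"))

loaded <- source_functions(demo_dir)
add_one(1)  # the function defined in the sourced script is now available
```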
The input folder can contain any melange of raw downloaded files from the sensor station (beep data, node health, GPS), all in the same folder or in subfolders. Zipped folders need to be unzipped, but compressed files do not (e.g. csv.gz files are just fine as they are). The load_data() function returns a list of 3 dataframes, in this order, built from the files in the folder you give it:
- beep data
- node health
- GPS
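To show why subfolders and csv.gz files need no special handling, here is a minimal base R sketch. It is not the actual load_data() implementation; the file names and column names (`Time`, `TagRSSI`, `NodeId`, `Battery`) are made up for the demo.

```r
# Build a throwaway input folder with one plain csv and one csv.gz in a subfolder.
demo_dir <- file.path(tempdir(), "station_demo")
dir.create(file.path(demo_dir, "nested"), recursive = TRUE, showWarnings = FALSE)

beeps <- data.frame(Time = c("2021-06-01 00:00:05", "2021-06-01 00:00:09"),
                    TagRSSI = c(-71, -84))
write.csv(beeps, file.path(demo_dir, "beeps.csv"), row.names = FALSE)

con <- gzfile(file.path(demo_dir, "nested", "health.csv.gz"), "w")
write.csv(data.frame(NodeId = "A1", Battery = 3.9), con, row.names = FALSE)
close(con)

# recursive = TRUE walks subfolders; the pattern matches both .csv and .csv.gz.
files <- list.files(demo_dir, pattern = "\\.csv(\\.gz)?$",
                    recursive = TRUE, full.names = TRUE)

# read.csv() reads gzip-compressed files directly, so no unzipping step is needed.
tables <- lapply(files, read.csv)
names(tables) <- basename(files)
str(tables)
```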
This function is the "engine" behind the export function. You can run it standalone with the following parameters, but you don't have to.
health: the 2nd dataframe output by the load_data() function
freq: the time interval for which you want variables to be summarized
The output is a nested list for each combination of channel and node, with the following plots for each:
- battery
- RSSI
- number of check-ins
- scaled number of check-ins as line plot over scaled RSSI
- box plot of node RSSI
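The freq parameter controls how finely the health data are binned in time before summarizing. The sketch below shows one way such binning can be done in base R; it is an illustration under assumptions, not the repo's code. The column names and the "1 hour" string format for freq are assumed here.

```r
# Made-up node health records (column names are assumptions for this demo).
health <- data.frame(
  Time = as.POSIXct(c("2021-06-01 00:05:00", "2021-06-01 00:40:00",
                      "2021-06-01 01:10:00", "2021-06-01 01:50:00"), tz = "UTC"),
  NodeId = "A1",
  Battery = c(4.0, 3.8, 3.7, 3.5)
)

freq <- "1 hour"  # assumed format: an interval string understood by cut()

# Assign each record to a time bin, then average battery per node and bin.
health$bin <- cut(health$Time, breaks = freq)
hourly <- aggregate(Battery ~ NodeId + bin, data = health, FUN = mean)
hourly
```

A coarser interval such as "1 day" gives fewer, smoother points per plot; a finer one shows more detail but noisier summaries.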
health: the 2nd dataframe output by the load_data() function
freq: the time interval for which you want variables to be summarized
The output is a nested list for each combination of channel and node, with the following plots for each:
- latitude
- longitude
- RSSI
- dispersion
NOTE: THIS ONLY WORKS FOR V2
health: the 2nd dataframe output by the load_data() function
nodes: list of nodes
freq: the time interval for which you want variables to be summarized
The output is a nested list for each node, with the following plots for each:
- RSSI
- number of check-ins
- battery
- time mismatches
- small time mismatches
gps: the 3rd data frame from the load_data() function
freq: the time interval of summary
- altitude
- number of fixes
health_data: the 2nd dataframe output by the load_data() function
freq: the time interval for which you want variables to be summarized
out_path: where you want your plots to go
x: the plot for the 1st panel
y: the plot for the 2nd panel
z: the plot for the 3rd panel
To assign x, y and z, look at the description for node_channel_plots() and select those plot indices in the order you want them on the page.
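As a sketch of how x, y and z pick panels, the example below uses a plain named list as a stand-in for the plots of one node and channel. It assumes the indices follow the order of the list in the node_channel_plots() description (battery first, box plot last); check the actual output before relying on that order.

```r
# Stand-in for the plots of a single node/channel combination.
# Strings are used instead of real plot objects so the sketch runs without extra packages.
plots <- list(
  battery          = "battery plot",
  rssi             = "RSSI plot",
  check_ins        = "number of check-ins plot",
  scaled_check_ins = "scaled check-ins over scaled RSSI plot",
  rssi_box         = "box plot of node RSSI"
)

# Put RSSI in the 1st panel, check-ins in the 2nd and battery in the 3rd.
x <- 2
y <- 3
z <- 1

page <- plots[c(x, y, z)]
names(page)
```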
NOTE: THIS ONLY WORKS FOR V2
Same parameters as above; the plot indices can be chosen from the list under the node_plots() description.
Ideas for R package functions:
load_data()
export_data()
summarize_health_data()
node_plots()
export_node_plots()
node_channel_plots()
export_node_channel_plots()