-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update packages, drop unused code, ingest stage running w/o error #61
Conversation
…arto-doc-with-performance-results-on-each-model-run
renv/profiles/reporting/renv.lock
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything in here is just a dependency update (+ R version bump).
renv/activate.R
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of the changes here are automated changes from renv and can be safely ignored.
@@ -5,3 +5,4 @@ local/ | |||
lock/ | |||
python/ | |||
staging/ | |||
profile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to commit the profile setting file OR include it in the Docker build, else the container will use the reporting profile.
renv.lock
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All version bumps + an R version bump.
py/flagging.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entirety of this script has been moved to model-sales-val.
@@ -10,7 +10,9 @@ suppressPackageStartupMessages({ | |||
library(arrow) | |||
library(aws.s3) | |||
library(aws.ec2metadata) | |||
library(ccao) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was missing this as a dependency for get_town_triad()
later in the script.
pipeline/02-assess.R
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only changes for compatibility with the latest version of lintr
.
@@ -267,71 +267,15 @@ assessment_data_w_hie <- assessment_data %>% | |||
|
|||
|
|||
#- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |||
# 5. Validate Sales ------------------------------------------------------------ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything here is moved to the dedicated sales val repo.
library(glue) | ||
library(here) | ||
library(magrittr) | ||
suppressPackageStartupMessages({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this just to suppress annoying output in the logs.
Pipfile
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both Pipfile things were removed since they existed mostly to support sales validation. DVC is installed directly in the Dockerfile without a lock file.
renv/library/ | ||
renv/sandbox/ | ||
renv/staging/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignoring this when using COPY
in Docker since it's manually replaced with the installed libraries.
# Set the working directory to setup. Uses a dedicated directory instead of | ||
# root since otherwise renv will try to scan every subdirectory | ||
WORKDIR /setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slight re-configuration of the Dockerfile here. renv
was stupidly scanning the entire filesystem when using the root directory and thus timing out. I created a dedicated working directory just for setup stuff, sort of following the multi-stage build pattern.
|
||
# Use PPM for binary installs | ||
ENV RENV_CONFIG_REPOS_OVERRIDE "https://packagemanager.posit.co/cran/__linux__/jammy/latest" | ||
ENV RENV_CONFIG_SANDBOX_ENABLED FALSE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not clear what this does but starting R inside the container recommends settings it ~for speed~.
ENV RENV_PATHS_LIBRARY renv/library | ||
ENV RENV_PATHS_CACHE /setup/cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the actual location in the container the packages get installed to. They're then symlinked in the actual working dir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Praise] Thanks for the clear explanations of the renv choices here!
# Install pipenv for Python dependencies | ||
RUN pip install pipenv | ||
|
||
# Copy pipenv files into the image. The reason this is a separate step from | ||
# the later step that adds files from the working directory is because we want | ||
# to avoid having to reinstall dependencies every time a file in the directory | ||
# changes, as Docker will bust the cache of every layer following a layer that | ||
# needs to change | ||
COPY Pipfile . | ||
COPY Pipfile.lock . | ||
|
||
# Install Python dependencies | ||
RUN pipenv install --system --deploy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need any of this now. We only need DVC.
I've reviewed the R scripts and params.yaml and nothing stuck out to me. I'll leave docker and workflow review to @jeancochrane |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent cleanup!
.renvignore
Outdated
pipeline/05-finalize.R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Thought, non-blocking] Instead of ignoring the finalize
script completely, I wonder if there's a way for us to specifically ignore the library(quarto)
call, which is the only reporting
dependency in that script? That seems to me like it would be the cleanest way to isolate the finalize
step writ large from its report generation substep. But this is a speculative ask, so I don't think it needs to block merging! Just something for us to consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately it seems like having renv
look at finalize
also means it looks into the Quarto doc itself i.e. there's no way to ignore finalize while also ignoring the doc. I think it's fine since all the finalize dependencies (except git2r
are covered by prior steps and are unlikely to change.
But let me play around with it some more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 2cf6ceb!
pipeline/08-api.R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Praise] Oops, I missed this during #62, thanks for catching!
ENV RENV_PATHS_LIBRARY renv/library | ||
ENV RENV_PATHS_CACHE /setup/cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Praise] Thanks for the clear explanations of the renv choices here!
pipeline/06-upload.R
Outdated
library(dplyr) | ||
library(git2r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Question, non-blocking] Where is git2r
being used in the upload
stage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's being used in finalize
, see above comment.
This PR:
lintr
rules