Merge pull request #1088 from reichlab/reichlab/aws-onboard
Onboard FluSight Forecast Hub to AWS Cloud
smathis14 authored May 21, 2024
2 parents a302476 + 7a5c7eb commit eae860a
Showing 3 changed files with 89 additions and 3 deletions.
78 changes: 78 additions & 0 deletions .github/workflows/hubverse-aws-upload.yaml
@@ -0,0 +1,78 @@
+name: Upload hub data to a hubverse-hosted AWS S3 bucket
+
+on:
+  push:
+    branches:
+      - main
+
+env:
+  # Hubverse AWS account number
+  AWS_ACCOUNT: 767397675902
+
+permissions:
+  contents: read
+  # id-token write required for AWS auth
+  id-token: write
+
+jobs:
+  upload:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Get hub cloud config
+        # save cloud-related fields from admin config as environment variables
+        # (jq json parser is installed on Github-hosted runners)
+        run: |
+          cloud_enabled=$(cat ./hub-config/admin.json | jq -r '.cloud.enabled') \
+            && echo "CLOUD_ENABLED=$cloud_enabled"
+          cloud_storage_location=$(cat ./hub-config/admin.json | jq -r '.cloud.host.storage_location') \
+            && echo "CLOUD_STORAGE_LOCATION=$cloud_storage_location"
+          echo "CLOUD_ENABLED=$cloud_enabled" >> $GITHUB_ENV
+          echo "CLOUD_STORAGE_LOCATION=$cloud_storage_location" >> $GITHUB_ENV
+
+      - name: Configure AWS credentials
+        # request credentials to assume the hub's AWS role via OpenID Connect
+        if: env.CLOUD_ENABLED == 'true'
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: arn:aws:iam::${{ env.AWS_ACCOUNT }}:role/${{ env.CLOUD_STORAGE_LOCATION }}
+          aws-region: us-east-1
+
+      - name: Install rclone
+        if: env.CLOUD_ENABLED == 'true'
+        run: |
+          curl https://rclone.org/install.sh | sudo bash
+          rclone version
+
+      - name: Sync files to cloud storage
+        # sync specified hub directories to S3
+        # (to exclude a directory, remove it from the hub_directories list below)
+        if: env.CLOUD_ENABLED == 'true'
+        run: |
+          hub_directories=(
+            'auxiliary-data'
+            'hub-config'
+            'model-abstracts'
+            'model-metadata'
+            'target-data'
+          )
+          for DIRECTORY in "${hub_directories[@]}"
+          do
+            if [ -d "./$DIRECTORY" ]
+            then
+              rclone sync \
+              "./$DIRECTORY/" \
+              ":s3,provider=AWS,env_auth:$BUCKET_NAME/$DIRECTORY" \
+              --checksum --verbose --stats-one-line --config=/dev/null
+            fi
+          done
+          # unlike other data, model-outputs are synced to a "raw" location
+          # so we can transform it before presenting to users
+          rclone sync ./model-output/ ":s3,provider=AWS,env_auth:$BUCKET_NAME/raw/model-output" \
+            --checksum --verbose --stats-one-line --config=/dev/null
+        shell: bash
+        env:
+          BUCKET_NAME: ${{ env.CLOUD_STORAGE_LOCATION }}
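
The sync step in this workflow maps each existing hub directory to a same-named prefix in the bucket, and sends model-output to a raw/ prefix without first checking that the directory exists. A minimal Python sketch of that mapping follows. The helper name `sync_plan` and the temporary directory layout are hypothetical and exist only for illustration; the sketch computes paths and does not call rclone or AWS.

```python
import tempfile
from pathlib import Path

# Directories the workflow mirrors to the bucket root (copied from the sync step).
HUB_DIRECTORIES = [
    "auxiliary-data",
    "hub-config",
    "model-abstracts",
    "model-metadata",
    "target-data",
]


def sync_plan(hub_root: Path, bucket: str) -> list[tuple[str, str]]:
    """Return (source, destination) pairs mirroring the workflow's rclone calls.

    Hypothetical helper for illustration only; it does not run rclone.
    """
    plan = []
    for name in HUB_DIRECTORIES:
        # The workflow skips listed directories that are absent from the repository.
        if (hub_root / name).is_dir():
            plan.append((f"./{name}/", f"{bucket}/{name}"))
    # model-output is synced without an existence check, under a "raw/" prefix.
    plan.append(("./model-output/", f"{bucket}/raw/model-output"))
    return plan


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "hub-config").mkdir()
        (root / "target-data").mkdir()
        for src, dst in sync_plan(root, "cdcepi-flusight-forecast-hub"):
            print(f"{src} -> {dst}")
```

Because `rclone sync` makes the destination match the source, dropping a directory from the list stops future updates to that prefix but does not by itself delete what was already uploaded.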
12 changes: 10 additions & 2 deletions hub-config/admin.json
@@ -1,5 +1,5 @@
 {
-  "schema_version": "https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/main/v2.0.0/admin-schema.json",
+  "schema_version": "https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/main/v2.0.1/admin-schema.json",
   "name": "US CDC FluSight",
   "maintainer": "US CDC",
   "contact": {
@@ -10,5 +10,13 @@
   "repository_url": "https://github.com/cdcepi/FluSight-forecast-hub",
   "file_format": ["csv"],
   "timezone": "US/Eastern",
-  "model_output_dir": "model-output"
+  "model_output_dir": "model-output",
+  "cloud": {
+    "enabled": true,
+    "host": {
+      "name": "aws",
+      "storage_service": "s3",
+      "storage_location": "cdcepi-flusight-forecast-hub"
+    }
+  }
 }
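
The workflow reads two fields from the cloud block of admin.json and uses the storage location as both the bucket name and the IAM role name. The Python sketch below illustrates that derivation, assuming only the fields shown in this diff. The function name `cloud_settings` and the inline `ADMIN_JSON` sample are hypothetical; the actual workflow does this with jq and GitHub Actions expressions.

```python
import json

# Hubverse AWS account number, as set in the workflow's env block.
AWS_ACCOUNT = "767397675902"

# Sample containing only the cloud block added in this commit.
ADMIN_JSON = """
{
  "cloud": {
    "enabled": true,
    "host": {
      "name": "aws",
      "storage_service": "s3",
      "storage_location": "cdcepi-flusight-forecast-hub"
    }
  }
}
"""


def cloud_settings(admin_text: str) -> dict:
    """Extract the fields the workflow reads, plus the derived role ARN.

    Hypothetical helper for illustration; not part of the hub tooling.
    """
    cloud = json.loads(admin_text).get("cloud") or {}
    # A missing or non-true "enabled" value disables the AWS steps, as in the workflow.
    enabled = cloud.get("enabled") is True
    location = (cloud.get("host") or {}).get("storage_location")
    return {
        "enabled": enabled,
        "storage_location": location,
        # The storage location doubles as the IAM role name in the workflow.
        "role_arn": f"arn:aws:iam::{AWS_ACCOUNT}:role/{location}" if enabled else None,
    }


if __name__ == "__main__":
    print(cloud_settings(ADMIN_JSON))
```

Hubs that omit the cloud block, or set enabled to false, skip the credential, install, and sync steps, because each of those steps is guarded by the CLOUD_ENABLED check.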
2 changes: 1 addition & 1 deletion hub-config/tasks.json
@@ -1,5 +1,5 @@
 {
-  "schema_version": "https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/main/v2.0.0/tasks-schema.json",
+  "schema_version": "https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/main/v2.0.1/tasks-schema.json",
   "rounds": [
     {
       "round_id_from_variable": true,