1. app.py : get basic stats of S3 buckets
- AWS CLI lets you find the number of S3 objects and their size.
- What if you want to know hourly/daily/monthly stats within a given range of specific dates?
- the `profile` in your `credentials` must have access to S3 buckets
$ cat ~/.aws/credentials
>>>
[higee]
aws_access_key_id = something
aws_secret_access_key = something
$ cd {somewhere}
$ git clone https://github.com/higee/s3-buddy
$ cd s3-buddy
$ pip install -r requirements.txt
$ python app.py --help
$ python app.py --start-date 2019-02-01 --end-date 2019-02-01 \
--bucket higee-bucket --path incoming --interval hour \
--profile higee
$ python app.py --start-date 2019-02-01 --end-date 2019-02-03 \
--bucket higee-bucket --path incoming --interval day \
--profile higee
$ python app.py --start-date 2019-01-01 --end-date 2019-03-01 \
--bucket higee-bucket --path incoming --interval month \
--profile higee
$ python app.py --start-date 2019-02-01 --end-date 2020-01-01 \
--bucket higee-bucket --path incoming --interval year \
--profile higee
$ python app.py --start-date 2019-02-01 --end-date 2019-02-01 \
--bucket higee-bucket --path incoming --interval hour \
--delta 6 --profile higee
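A minimal sketch of how stats per interval could be computed, not necessarily how `app.py` does it. It assumes each object is given as a `(last_modified, size)` pair (in practice these could come from the `LastModified` and `Size` fields of an S3 listing) and that the end of the range is exclusive. The names `summarize` and `INTERVAL_FORMATS` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# strftime patterns that truncate a timestamp to each interval (assumed mapping)
INTERVAL_FORMATS = {
    "hour": "%Y-%m-%d %H:00",
    "day": "%Y-%m-%d",
    "month": "%Y-%m",
    "year": "%Y",
}


def summarize(objects, start, end, interval="day"):
    """Return {period: (object_count, total_bytes)} for objects modified in [start, end)."""
    fmt = INTERVAL_FORMATS[interval]
    stats = defaultdict(lambda: [0, 0])
    for last_modified, size in objects:
        if start <= last_modified < end:
            period = last_modified.strftime(fmt)
            stats[period][0] += 1      # number of objects in this period
            stats[period][1] += size   # total size in bytes in this period
    return {period: tuple(values) for period, values in sorted(stats.items())}


# Hypothetical listing: (last_modified, size_in_bytes) pairs
objects = [
    (datetime(2019, 2, 1, 0, 15), 100),
    (datetime(2019, 2, 1, 0, 45), 250),
    (datetime(2019, 2, 1, 6, 5), 50),
    (datetime(2019, 2, 2, 9, 0), 10),
]
hourly = summarize(objects, datetime(2019, 2, 1), datetime(2019, 2, 2), "hour")
print(hourly)
```

Truncating each timestamp with a format string keeps the grouping logic identical for every interval; only the pattern changes.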
2. app2.py : dynamically rename S3 objects within/between buckets
- AWS CLI lets you copy or move objects.
- What if you need to dynamically rename `keys` or `folders` as well?
- The terms `key` and `folder` are used here slightly differently from their original meanings.
  - example : `s3://higee-bucket/incoming/2019/01/01/00/higee.log`
    - `bucket` : `higee-bucket`
    - `folder` : `incoming/2019/01/01/00/`
    - `key` : `higee.log`
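The bucket / folder / key split used in this README can be sketched as below. This is only an illustration of the terminology; `split_s3_uri` is a hypothetical helper, not a function from the repository.

```python
def split_s3_uri(uri):
    """Split an S3 URI into (bucket, folder, key) using this README's terminology."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri}")
    # The bucket is everything between "s3://" and the first "/".
    bucket, _, path = uri[len("s3://"):].partition("/")
    # Everything up to and including the last "/" is the folder; the rest is the key.
    folder, slash, key = path.rpartition("/")
    return bucket, folder + slash, key


parts = split_s3_uri("s3://higee-bucket/incoming/2019/01/01/00/higee.log")
print(parts)
```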
- This app renames objects by creating copies with the new name, so the source objects are kept by default.
- Set `delete` to True if you want to remove the source objects.
- the `profile` in your `credentials` must have access to S3 buckets
$ cat ~/.aws/credentials
>>>
[higee]
aws_access_key_id = something
aws_secret_access_key = something
$ cd {somewhere}
$ git clone https://github.com/higee/s3-buddy
$ cd s3-buddy
$ pip install -r requirements.txt
argument | description | required | default |
---|---|---|---|
profile | needs access to source-bucket and dest-bucket | ✓ | |
source-bucket | S3 bucket with objects to be renamed | ✓ | |
dest-bucket | S3 bucket where renamed objects will be moved to | | source-bucket |
dryrun | whether you want to test it first | | True |
delete | whether you want to remove source objects | | False |
source-folder-key-prefix | prefix of S3 objects you want to rename | ✓ | |
source-key-str | sub-string of key to be replaced by dest-key-str | | |
dest-key-str | sub-string that will replace source-key-str | | |
source-folder-str | original folder name | | |
dest-folder-str | new folder name that will replace source-folder-str | | |
$ python app2.py --help
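How the four `*-str` arguments might combine can be sketched as follows. This is an assumption about the behaviour, not the code in `app2.py`: it replaces every occurrence of the sub-string, and the function name `rename` and the sample file names are hypothetical.

```python
def rename(path, source_key_str=None, dest_key_str=None,
           source_folder_str=None, dest_folder_str=None):
    """Return the destination path for an object path of the form folder/key."""
    # Split into folder and key in this README's sense (key = last path segment).
    folder, slash, key = path.rpartition("/")
    if source_key_str is not None and dest_key_str is not None:
        key = key.replace(source_key_str, dest_key_str)
    if source_folder_str is not None and dest_folder_str is not None:
        folder = folder.replace(source_folder_str, dest_folder_str)
    return folder + slash + key


renamed_key = rename("incoming/2019/01/01/01/higee-staging-0001.log",
                     source_key_str="staging", dest_key_str="prod")
renamed_folder = rename("incoming/2019/02/01/a.log",
                        source_folder_str="incoming", dest_folder_str="processed")
print(renamed_key)
print(renamed_folder)
```

Keeping the key and folder replacements separate means a sub-string such as `staging` in a folder name is left alone when only the key is being renamed.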
- rename keys and copy objects within a bucket
  - source : `s3://higee-bucket/incoming/2019/01/01/01/higee-staging*`
  - dest : `s3://higee-bucket/incoming/2019/01/01/01/higee-prod*`
  - dryrun : True
  - delete source objects : False
$ python app2.py --source-folder-key-prefix incoming/2019/01/01/01/higee-staging \
--source-bucket higee-bucket --source-key-str staging \
--dest-key-str prod --profile higee
- rename keys and copy objects to a different bucket
  - source : `s3://higee-bucket/incoming/2019/01/01/02/higee-staging*`
  - dest : `s3://higee-bucket-2/incoming/2019/01/01/02/higee-dev*`
  - dryrun : True
  - delete source objects : False
$ python app2.py --source-bucket higee-bucket --dest-bucket higee-bucket-2 \
--source-folder-key-prefix incoming/2019/01/01/02/higee-staging \
--source-key-str staging --dest-key-str dev --profile higee
- rename folder and copy objects within a bucket
  - source : `s3://higee-bucket/incoming/2019/02/*`
  - dest : `s3://higee-bucket/processed/2019/02/*`
  - dryrun : True
  - delete source objects : False
$ python app2.py --source-folder-key-prefix incoming/2019/02 \
--source-bucket higee-bucket --source-folder-str incoming \
--dest-folder-str processed --profile higee
- rename folder and copy objects to a different bucket
  - source : `s3://higee-bucket/incoming/2019/03/*`
  - dest : `s3://higee-bucket-2/processed/2019/03/*`
  - dryrun : True
  - delete source objects : False
$ python app2.py --source-bucket higee-bucket --dest-bucket higee-bucket-2 \
--source-folder-key-prefix incoming/2019/03 \
--source-folder-str incoming --dest-folder-str processed \
--profile higee
- 💣 execute the command without dryrun 💣
  - source : `s3://higee-bucket/incoming/2019/01/01/01/higee*`
  - dest : `s3://higee-bucket/incoming/2019/01/01/01/higee-staging*`
  - dryrun : False
  - delete source objects : False
$ python app2.py --source-folder-key-prefix incoming/2019/01/01/01 \
--source-bucket higee-bucket --source-key-str higee \
--dest-key-str higee-staging --dryrun False --profile higee
- 💣 remove source S3 objects after creating copies 💣
  - source : `s3://higee-bucket/incoming/2019/02/01/01/*`
  - dest : `s3://higee-bucket/incoming/2019/02/01/01/*`
  - dryrun : False
  - delete source objects : True
$ python app2.py --source-folder-key-prefix incoming/2019/01 \
--source-bucket higee-bucket --source-key-str higee \
--dest-key-str higee-staging --dryrun False --delete True \
--profile higee
$ python app2.py --source-bucket higee-bucket --dest-bucket higee-bucket-2 \
--source-folder-key-prefix incoming/2019/01/01/01/02/higee-staging \
--source-key-str staging --dest-key-str dev --profile higee
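The interplay of `dryrun` and `delete` in the examples above could look roughly like the sketch below. It is an assumption about the flow, not the repository's code: `move_objects` and `RecordingClient` are hypothetical names, and the `client` argument is anything exposing `copy_object` and `delete_object` with the keyword arguments a boto3 S3 client accepts. A recording stand-in is used so the sketch runs without AWS access.

```python
def move_objects(client, source_bucket, dest_bucket, renames, dryrun=True, delete=False):
    """Copy each source key to its new key; optionally delete the source afterwards.

    client  : object exposing copy_object/delete_object like a boto3 S3 client
    renames : iterable of (source_key, dest_key) pairs
    """
    planned = []
    for source_key, dest_key in renames:
        planned.append((f"s3://{source_bucket}/{source_key}",
                        f"s3://{dest_bucket}/{dest_key}"))
        if dryrun:
            continue  # only report what would happen
        client.copy_object(
            Bucket=dest_bucket,
            Key=dest_key,
            CopySource={"Bucket": source_bucket, "Key": source_key},
        )
        # Never delete an object that was "renamed" onto itself.
        if delete and (source_bucket, source_key) != (dest_bucket, dest_key):
            client.delete_object(Bucket=source_bucket, Key=source_key)
    return planned


class RecordingClient:
    """Stand-in for an S3 client that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def copy_object(self, **kwargs):
        self.calls.append(("copy", kwargs["Key"]))

    def delete_object(self, **kwargs):
        self.calls.append(("delete", kwargs["Key"]))


client = RecordingClient()
pairs = [("incoming/higee.log", "incoming/higee-staging.log")]
plan = move_objects(client, "higee-bucket", "higee-bucket", pairs)  # dry run
dry_calls = list(client.calls)
move_objects(client, "higee-bucket", "higee-bucket", pairs, dryrun=False, delete=True)
print(plan, dry_calls, client.calls)
```

Deleting only after the copy call has returned means a failed copy raises before the source object is touched.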