1.0.3 (#121)
* New branch

* bug on containers pagination from rancher

* error codes seadata

* better pagination for rancher containers list

* irods problems fix

* Saving errors in order of data (seadata)

* bug fix on edmo code

* more improvements on seadata

* metadata fix

* bug fix

* request for a fix

* errors one step higher (seadata)

* doc fix

* bug fixes and optimizations

* bug fix

* Adding a patch test

* Draft celery async task in seadata

* bug fix

* small updates to async

* small mods

* debugging

* bug fix

* bug fix

* base for irods in worker

* splitting b2handle from pid generation
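
  A rough, hypothetical sketch of what the PID-registration side of this split can look like with the b2handle client; the helper name, credentials file and prefix are illustrative, not the project's actual code:

```python
# Hypothetical sketch: handle registration isolated from the rest of the pipeline.
from b2handle.clientcredentials import PIDClientCredentials
from b2handle.handleclient import EUDATHandleClient

def register_pid(credentials_json, prefix, irods_location):
    """Mint a handle under the given prefix and bind it to an iRODS location."""
    credentials = PIDClientCredentials.load_from_JSON(credentials_json)
    client = EUDATHandleClient.instantiate_with_credentials(credentials)
    pid = client.generate_PID_name(prefix)          # e.g. "<prefix>/<uuid>"
    return client.register_handle(pid, location=irods_location)
```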

* Move to production async

* Adding api call when celery finishes

* bug fix

* bug fix

* counting the workers totals

* return the id

* removing extra projects and moving projectrc

* worker path fix

* Back door to skip the APIs call

* try more workers :)

* worker fix

* dir fix

* unzip operation with shutil in celery task
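
  A minimal sketch of the idea, with made-up task and path names:

```python
# Illustrative only: unpack an uploaded zip inside a Celery task using the
# standard library (shutil) instead of shelling out to an unzip binary.
import shutil
from celery import Celery

celery_app = Celery('worker', broker='amqp://guest:guest@rabbit:5672//')

@celery_app.task
def unzip_batch(zip_path, destination_dir):
    # unpack_archive picks the right extractor from the file extension
    shutil.unpack_archive(zip_path, extract_dir=destination_dir)
    return destination_dir
```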

* try to fix the broken pip10 update

* Allow celery to put the zip file only in backdoor cases

* A new task for async orders

* bug fix

* bug fix

* b2handle bugfix

* debugging b2handle on celery

* bug fix

* dir zip fix

* adding extra configuration for seadata

* bug fix

* more fixes

* fix for merret

* fix task for order to call CDI

* bug fix

* double fix

* Adding a count of files added

* remove logging

* redis test

* fix

* fix

* bug fix to avoid redis library in backend

* fix redis again

* bug fix

* fix

* fix redis import

* swagger fix

* don't set for now the cache on workers

* bug fix

* Adding redis parameters

* scan fix

* more fixes

* extra check on pid resolution

* prefetching disabled on celery
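
  Disabling prefetching is a standard Celery setting for long-running tasks; a sketch of the relevant configuration (broker URL and app name are placeholders):

```python
# Illustrative Celery settings for long-running ingestion tasks: each worker
# process reserves one message at a time and acknowledges it only when done.
from celery import Celery

celery_app = Celery('worker', broker='amqp://guest:guest@rabbit:5672//')
celery_app.conf.update(
    worker_prefetch_multiplier=1,  # do not prefetch batches of tasks
    task_acks_late=True,           # ack after completion, not on delivery
)
```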

* Push fix for ticket and slashes

* Try catch problems in workers

* Adding mongo and fixing celery conf

* adding elastic

* add verification counts

* adding redis cache

* use the cache
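
  The caching pattern, sketched with redis-py and made-up key names:

```python
# Generic read-through cache with redis-py; key layout and TTL are illustrative.
import json
import redis

cache = redis.StrictRedis(host='redis', port=6379, db=0)

def resolve_pid_cached(pid, resolver, ttl=3600):
    """Return cached PID metadata if present, otherwise resolve and store it."""
    key = 'pid:%s' % pid
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    value = resolver(pid)
    cache.setex(key, ttl, json.dumps(value))
    return value
```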

* various bug fixes

* bug fix

* bug fix

* better output on seadata

* few mods

* previous

* towards leaving all ideas up to others

* Cleanup compose configurations to reflect new defaults

* Switching to backend+irods images

* Adding --development flag to fix tests execution

* First stub of restricted orders

* Added submodules/do to .gitignore

* Added celeryui

* Added rapydo version to project configuration

* Check on order path existence

* Fix error message

* Creation of order collection

* Merge of restricted users list

* Added json.loads to convert string into json

* Stub for zip merge

* Debug on irods stream operation

* Checks on upload operation

* Bug fix to allow flower credentials

* Switched put and post methods

* Test to enable celery in debug mode

* Removed celery stuff in debug mode

* Removed project_configuration from gitignore

* Shifted the ingestion phase for restricted orders [not yet in add mode]
Removed binary mode

* Working on zip merge

* Checking if partial zip exists

* Checking if partial zip exists

* Copying partial zip on local fs

* Fix local path, added file name

* Unzipping partial zip

* Check unzip result

* Check if final zip exists

* Copy partial zip as final zip if still does not exist

* Passing extra check parameters as input [checksum, size, file count]

* add verify of file size and file count

* Fix file count check

* Added checksum to checks
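
  A generic sketch of the verification described in the items above, assuming an MD5 checksum is what gets passed in; function name and assumptions are not from the project:

```python
# Illustrative check of a received zip against the declared checksum, size
# and file count.
import hashlib
import os
import zipfile

def verify_zip(path, expected_checksum, expected_size, expected_file_count):
    if os.path.getsize(path) != expected_size:
        return False
    md5 = hashlib.md5()
    with open(path, 'rb') as fh:
        for chunk in iter(lambda: fh.read(8192), b''):
            md5.update(chunk)
    if md5.hexdigest() != expected_checksum:
        return False
    with zipfile.ZipFile(path, 'r') as zf:
        return len(zf.namelist()) == expected_file_count
```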

* Check to avoid overwrite of partial zip

* Use the stresstest backdoor to upload two zip files

* Working on zip merge

* Completing filename with zip extension if missing

* Adding files to the zip file

* Fix adding files to the zip file

* Fix zip file open mode from r to a
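
  Mode `r` opens the archive read-only, while mode `a` allows appending new entries; a small sketch of the merge step (names are illustrative):

```python
# Append every entry of a partial zip to the final zip ('a' creates the final
# archive if it does not exist yet). Hypothetical helper, not the project code.
import zipfile

def merge_into_final(final_zip, partial_zip):
    with zipfile.ZipFile(partial_zip, 'r') as src, \
            zipfile.ZipFile(final_zip, 'a', zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            dst.writestr(name, src.read(name))
```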

* Uploading update final zip

* Bug fix: was using open instead of put

* Multi-file download

* Bug fix download

* Added temporary backward-compatibility fix

* Fix backward compatibility

* Removed variables [should be inherited from backend conf]

* Added new variables to backend conf

* Removed debug code

* Open port for adminer interface

* Extended the restricted upload endpoint to specify the file name

* Configurable qc label for rancher

* changed var name.

* Added more helpful log message in error case.

* Logging to RabbitMQ updated (much code moved to rapydo repo).

* Added docstring, changed comment.

* Enabled logging to RabbitMQ in production mode.

* In debug mode, rapydo handles logging to file instead of to RabbitMQ.

* Renamed variable.

* Allow customizing app_name.

* Added log messages.

* Added TODOs.

* Cosmetics (logging start+end of enable+upload).

* Log end of upload in case file existed.

* Added log message for error if batch was not enabled.

* Added two log messages.

* Configure if we wait for rancher-tasks (#120)

* Can now configure whether rancher should wait for container to be running or stopped.

* Make sure it waits until stopped if told so.
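
  The waiting behaviour boils down to polling the container state until it matches the requested one; a generic illustration, not the project's Rancher client:

```python
# Generic polling helper: wait until a container reaches the desired state.
import time

def wait_for_state(get_state, desired='running', timeout=300, interval=5):
    """get_state is any callable returning the current container state."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if get_state() == desired:
            return True
        time.sleep(interval)
    return False
```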
mdantonio authored and pdonorio committed Aug 29, 2018
1 parent d455a78 commit cd17a55
Showing 78 changed files with 2,107 additions and 638 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -2,6 +2,7 @@
########################
## Your custom data ##
########################
.projectrc

# data
uploads
@@ -14,6 +15,7 @@ data/certs/*cineca*
data/secrets/secret_*
data/b2handle/*.pem
data/files/*
data/*/*.json

#################
## Submodules ##
@@ -26,6 +28,7 @@ submodules/http-api
submodules/frontend
submodules/prc
submodules/rapydo-confs
submodules/do

############################################
## Dynamically created file and folders ##
@@ -34,7 +37,6 @@ submodules/rapydo-confs
#.projectrc
frontend/b2stage/libs/bower_components
.env
projects/*/backend/confs/project_configuration.yaml

###################################
## Temporary and miscellaneous ##
6 changes: 3 additions & 3 deletions .travis.yml
@@ -35,9 +35,9 @@ script:
# - rm projects/b2stage/backend/swagger/publish/SKIP

# startup and launch tests
- rapydo init && rapydo start
- rapydo shell backend --command 'restapi tests --wait'
- rapydo --development init && rapydo --development start
- rapydo --development shell backend --command 'restapi tests --wait'

# coverage within a docker container
after_success:
- rapydo --log-level VERBOSE coverall
- rapydo --development --log-level VERBOSE coverall
12 changes: 10 additions & 2 deletions CHANGELOG.md
@@ -1,11 +1,19 @@

# v1.0.3 (*near future*)
# v1.0.4 (*near future*)

## features

- to be defined

# v1.0.2 (*current*)
# v1.0.3 (*current*)
**due date**: June 30th, 2018

## features

- better support to SeaData.net project
- bug fixes

# v1.0.2 (*stable*)
**due date**: April 30th, 2018

## features
23 changes: 16 additions & 7 deletions data/scripts/prerequisites.sh
@@ -1,6 +1,7 @@
#!/bin/bash

pip_binary="pip3"
# pip_binary="pip3"
pip_binary="pip3 --disable-pip-version-check"
ve_name="b2stage"
myuser=$(whoami)

@@ -16,13 +17,21 @@ if [ $myuser != "root" ]; then
source $ve_name/bin/activate
fi

# update or die
$pip_binary install --upgrade pip
if [ "$?" != 0 ]; then
echo "Failed to use $pip_binary. Did you install Python and Pip?"
fi
#############################
## PIP 10 BROKEN UPDATE
## see: https://github.com/pypa/pip/issues/5221

# # update or die
# $pip_binary install --upgrade pip
# if [ "$?" != 0 ]; then
# echo "Failed to use $pip_binary. Did you install Python and Pip?"
# fi

## if broken, try:
## $ python3 -m pip install --force-reinstall pip
#############################

for existing in `$pip_binary list --format columns | grep rapydo | awk '{print $1}'`;
for existing in `$pip_binary list | grep rapydo | awk '{print $1}'`;
do
echo "removing: $existing"
$pip_binary uninstall -y $existing
25 changes: 25 additions & 0 deletions data/scripts/seadata/getpids.sh
@@ -0,0 +1,25 @@
#!/bin/bash

vault_path="/mnt/data1/irods/Vault/cloud"
irods_path="/sdcCineca/cloud"
import_prefix="import30may_rabbithole_500000_"
pid_prefix="21.T12995"
out_file="./pids.txt"

rm -f $out_file

for dir in $(ls -1d ${vault_path}/${import_prefix}*);
do
    for element in $(ls -1 $dir/*);
    do
        # echo $element
        fname=$(basename $element)
        dpath=$(dirname $element)
        dname=$(basename $dpath)
        pid=$(imeta ls -d $irods_path/$dname/$fname | grep $pid_prefix)
        echo -e "$dname\t$fname\t$pid" >> $out_file
        # break
    done
    # break
done

77 changes: 77 additions & 0 deletions data/scripts/seadata/sea.py
@@ -0,0 +1,77 @@

###############################################
# count current batch import

import os
import json
# from datetime import datetime
from glob import glob as find
from plumbum.cmd import imeta, grep

pid_prefix = '21.T12995'
irods_path = "/sdcCineca/cloud"
main_path = '/mnt/data1/irods/Vault/cloud'
prefix_batches = 'import01june_rabbithole_500000_'
os.chdir(main_path)

files = find("./%s*/**.txt" % prefix_batches)
# print(datetime.now(), len(files))

###############
# Obtain pids (requires 'plumbum')
counter = 0
data = {}
for file in files:
    counter += 1
    pieces = file.split('/')
    filename = pieces.pop()
    batch = pieces.pop()
    ipath = os.path.join(irods_path, batch, filename)
    chain = imeta['ls', '-d', ipath] | grep[pid_prefix]
    try:
        out = chain()
    except Exception as e:
        print('failed: %s [%s]' % (filename, batch))
        continue
    pid = out.split(' ')[1].rstrip()
    # print(filename, pid.encode('ascii'))
    data[filename] = pid.encode('ascii')

    if counter % 100 == 0:
        print("Found: %s" % counter)
        # break
    if counter % 10000 == 0:
        print("saving what we have so far")
        with open('/tmp/test.json', 'w') as fh:
            json.dump(data, fh)

with open('/tmp/test.json', 'w') as fh:
    json.dump(data, fh)

"""
(datetime.datetime(2018, 6, 1, 23, 36, 56, 304232), 37738)
(datetime.datetime(2018, 6, 2, 6, 57, 35, 644743), 82849)
(datetime.datetime(2018, 6, 2, 10, 12, 52, 812827), 103112)
(datetime.datetime(2018, 6, 3, 10, 33, 53, 584727), 254153)
"""

# ###############################################
# # Graceful shutdown of celery worker(s)

# """
# http://docs.celeryproject.org/en/latest/userguide/workers.html#stopping-the-worker
# """

# from restapi.flask_ext import get_debug_instance
# from restapi.flask_ext.flask_celery import CeleryExt
# obj = get_debug_instance(CeleryExt)
# workers = obj.control.inspect()
# workers.active().keys()
# w = list(workers.active().keys())

# # https://stackoverflow.com/a/41885106
# # celery.control.broadcast('shutdown', destination=[<celery_worker_name>])

# for element in w:
# print(element) # celery@worker-25d6b959aa91 ?

66 changes: 66 additions & 0 deletions data/seadatacloud/stress.py
@@ -0,0 +1,66 @@
# -*- coding: utf-8 -*-

#################################
import sys
import random
from datetime import datetime
import better_exceptions as be
# from utilities import helpers
from utilities import apiclient

#################################
REMOTE_DOMAIN = 'seadata.cineca.it'
URI = 'https://%s' % REMOTE_DOMAIN
USERNAME = 'stresstest'
PASSWORD = 'somepassword'
# LOG_LEVEL = 'info' # or 'debug', 'verbose', 'very_verbose'
LOG_LEVEL = 'verbose'

#################################
log = apiclient.setup_logger(__name__, level_name=LOG_LEVEL)
log.very_verbose('init log: %s\nURI [%s]', be, URI)
# apiclient.call(URI)

#################################
# log.pp(sys.argv)
order_size = 100
if len(sys.argv) > 1:
    order_size = int(sys.argv[1])
log.debug("Order size: %s", order_size)

#################################
# read json
with open('init.json') as f:
    import json
    myjson = json.load(f)
log.info("Total PIDs: %s", len(myjson))

#################################
pids = random.sample(list(myjson.values()), order_size)
# log.pp(pids)

#################################
# login to HTTP API with B2SAFE credentials
token, _ = apiclient.login(URI, USERNAME, PASSWORD)
log.info("Logged in with token: %s...", token[:20])

# #################################
# pass
order_id = 'pythonpaulie_00%s' % datetime.today().strftime("%H%M%S")
now = datetime.today().strftime("%Y%m%dT%H:%M:%S")
endpoint = '/api/orders'
params = {
    "request_id": order_id, "edmo_code": 12345, "datetime": now,
    "version": "1", "api_function": "order_create_zipfile",
    "test_mode": "true", "parameters": {
        "login_code": "unknown", "restricted": "false",
        "file_name": "order_%s_unrestricted" % order_id,
        "order_number": order_id, "pids": pids, "file_count": len(pids),
    }
}
# log.pp(params)
apiclient.call(
    URI, method='post', endpoint=endpoint,
    token=token, payload=params
)
# log.pp(res)
2 changes: 1 addition & 1 deletion docs/deploy/authentication.md
@@ -25,7 +25,7 @@ Your application must be registered as a client for the B2ACCESS OAUTH protocol.

Once you start the B2STAGE server with the two variables `B2ACCESS_ACCOUNT` and `B2ACCESS_SECRET` set, the related endpoints will be activated (you may double-check this inside your `/api/specs` JSON content).

Please read also how the authentication works for a user [here](/docs/user/authentication.md#authentication-via-the-b2access-service).
Please read also how the authentication works for a user [here](/docs/user/authentication.md)

### Current issues

4 changes: 2 additions & 2 deletions docs/deploy/preq.md
@@ -34,7 +34,7 @@ sudo yum -y install python36u python36u-pip
sudo ln -s /usr/bin/pip3.6 /usr/bin/pip3
```

##GIT
## GIT

Most of UNIX distributions have the `git` command line client already installed. If that is not that case then refer to the [official documentation](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git).

@@ -50,7 +50,7 @@

### docker engine

Please *NOTE*: if you are using `Red Hat` (RHEL) as Operatin System, Docker is supported [only in the enterprise edition](https://docs.docker.com/install/linux/docker-ee/rhel/#prerequisites).
Please *NOTE*: if you are using `Red Hat` (RHEL) as Operating System, Docker is supported [only in the enterprise edition](https://docs.docker.com/install/linux/docker-ee/rhel/#prerequisites).

---

6 changes: 2 additions & 4 deletions docs/deploy/startup.md
@@ -7,7 +7,7 @@
To clone the working code:

```bash
$ VERSION=1.0.2 \
$ VERSION=1.0.3 \
&& git clone https://github.com/EUDAT-B2STAGE/http-api.git \
&& cd http-api \
&& git checkout $VERSION
@@ -51,12 +51,10 @@ Your current project needs to be initialized. This step is needed only the first
$ rapydo init
```

NOTE: with `RC1` there is no working `upgrade` process in place to make life easier if you already have this project cloned from a previous release. This is something important already in progress [here](https://github.com/EUDAT-B2STAGE/http-api/issues/87).

If you wish to __**manually upgrade**__:

```bash
VERSION="0.6.1"
VERSION="0.6.2"
git checkout $VERSION

# supposely the rapydo framework has been updated, so you need to check:
2 changes: 1 addition & 1 deletion docs/quick_start.md
@@ -23,7 +23,7 @@ Here's step-by-step tutorial to work with the HTTP API project:
# get the code
git clone https://github.com/EUDAT-B2STAGE/http-api.git latest
cd latest
git checkout 1.0.2
git checkout 1.0.3

################
# install the corrensponding rapydo framework version
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -17,8 +17,7 @@ services:
# http://docs.celeryproject.org/en/latest/getting-started/brokers/rabbitmq.html
# default username and password of guest / guest
rabbit:
# image: rabbitmq:3.6.14-alpine
image: rabbitmq:3.6.14-management-alpine
image: rabbitmq:3.7.5-management-alpine
hostname: rabbit
volumes:
- queuedata:/var/lib/rabbitmq
Expand Down
