diff --git a/docs/Continuous_Integration.md b/docs/Continuous_Integration.md new file mode 100644 index 0000000..15b987e --- /dev/null +++ b/docs/Continuous_Integration.md @@ -0,0 +1,15 @@ +# Continuous Integration + +Continuous integration is the practice of frequently pushing new changes to the repository and having each change verified as working by an automated build and test tool. GitHub provides hooks for automated build services to watch a repo and trigger a new build/test. + +There are several CI services available. This article provides a nice summary of three of them and of the CI process itself: https://hackernoon.com/continuous-integration-circleci-vs-travis-ci-vs-jenkins-41a1c2bd95f5 + + + +# Travis CI + +For Linux and macOS + +# AppVeyor + +For Windows \ No newline at end of file diff --git a/docs/conf.py b/docs/conf.py index 1b16bd7..2fdb0b3 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -22,7 +22,7 @@ author = 'Resen Team' # The full version, including alpha/beta/rc tags -release = 'v2019.1.0rc2' +release = 'v2019.1.0' master_doc = 'index' diff --git a/docs/installation/installation.general.rst b/docs/installation/installation.general.rst index 66f31d0..494d7eb 100644 --- a/docs/installation/installation.general.rst +++ b/docs/installation/installation.general.rst @@ -31,14 +31,7 @@ Docker CE is the recomended version of Docker to use with Resen. `Installation Resen ===== -Install Resen by first cloning the resen GitHub repo (https://github.com/EarthCubeInGeo/resen):: +Install Resen from a Python 3 environment using ``pip``:: - git clone https://github.com/EarthCubeInGeo/resen.git + pip install git+https://github.com/EarthCubeInGeo/resen.git@v2019.1.0 -Change into the ``resen`` directory:: - - cd resen - -In a python 3 environment, use pip to install Resen:: - - pip install . diff --git a/docs/installation/installation.windows.rst b/docs/installation/installation.windows.rst index d5ae66c..9e1ff89 100644 --- a/docs/installation/installation.windows.rst +++ b/docs/installation/installation.windows.rst @@ -10,17 +10,9 @@ Install Anaconda and Resen We recommend downloading and installing the Python 3 Anaconda Distribution (https://www.anaconda.com/distribution/). This simplifies the installation and usage of several common software tools needed to install and run Resen. **Resen**: -Using the start menu search, open the "Anaconda Powershell Prompt" and navigate to a directory where you wish to host the Resen source code. Next, install Resen by first cloning the resen GitHub repo (https://github.com/EarthCubeInGeo/resen):: +Using the start menu search, open the "Anaconda Powershell Prompt" and install Resen using ``pip``:: - git clone https://github.com/EarthCubeInGeo/resen.git - -Change into the ``resen`` directory:: - - cd resen - -Finally, install Resen:: - - pip install . + pip install git+https://github.com/EarthCubeInGeo/resen.git@v2019.1.0 Once complete, this will provide the command line command ``resen``. Next, we need to install Docker.
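Once both Resen and Docker (below) are installed, the setup can be sanity-checked from the same Python 3 environment. This is an illustrative sketch, not part of the official instructions; it assumes only the ``resen`` package installed above and the ``docker`` Python SDK that resen depends on::

    import resen                # the package installed with pip above
    import docker               # SDK resen uses to manage buckets

    client = docker.from_env()  # connect to the local Docker daemon
    print(client.ping())        # True if the Docker daemon is reachable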
diff --git a/docs/usage.rst b/docs/usage.rst index 6d6a1d0..32695b1 100644 --- a/docs/usage.rst +++ b/docs/usage.rst @@ -7,14 +7,14 @@ To use resen, simply enter ``resen`` at the command line:: This will open the resen tool:: - ___ ___ ___ ___ _ _ + ___ ___ ___ ___ _ _ | _ \ __/ __| __| \| | | / _|\__ \ _|| .` | |_|_\___|___/___|_|\_| - - Resen 2019.1.0rc2 -- Reproducible Software Environment - - [resen] >>> + + Resen 2019.1.0 -- Reproducible Software Environment + + [resen] >>> Type ``help`` to see available commands:: @@ -24,24 +24,28 @@ This will produce a list of resen commands you will use to manage your resen buc Documented commands (type help ): ======================================== - EOF exit quit start_jupyter stop_jupyter - create_bucket help remove_bucket status + EOF exit help list remove status + create export import quit start stop To get more information about a specific command, enter ``help ``. Resen Workflow ============== -Use Resen to create and remove buckets. Buckets are portable, system independent environments where code can be developed and run. Buckets can be shared between Windows, Linux, and macos systems and all analysis within the bucket will be run exactly the same. Resen buckets come preinstalled with a variety of common geospace software that can be used immediately in analysis. +To create, import, export, and remove buckets, we use Resen. Buckets are portable, system independent environments where code can be developed and run. Buckets can be shared between Windows, Linux, and macos systems and all analysis within the bucket will be run exactly the same. Resen buckets come preinstalled with a variety of common geospace software that can be used immediately in analysis. + +The interface to a resen bucket is a jupyter lab server and access to the bucket is provided through a web browser. The user home directory is ``/home/jovyan``. Any mounted storage directories are available in ``mount``. + +Below are instructions on how to use Resen to work with buckets. A typical workflow will involve: creation of a bucket, performing scientific data analysis inside the bucket, exporting the bucket and sharing it with colleagues. Collaborators can then import the bucket, perform additional analysis, and then export the bucket for publication in an open access citable repository, such as `Zenodo `_. Setup a New Bucket ------------------ 1. Creating a new bucket is performed with the command:: - [resen] >>> create_bucket + [resen] >>> create - The ``create_bucket`` command queries the user for several pieces of information required to create a bucket. First it asks for the bucket name. Creating a bucket named ``amber``:: + The ``create`` command queries the user for several pieces of information required to create a bucket. First it asks for the bucket name. Creating a bucket named ``amber``:: Please enter a name for your bucket. Valid names may not contain spaces and must start with a letter and be less than 20 characters long.`` @@ -50,10 +54,10 @@ Setup a New Bucket Next, the user is asked to specify the version of resen-core to use:: Please choose a version of resen-core. 
- Available versions: 2019.1.0rc2 - >>> Select a version: 2019.1.0.rc2 + Available versions: 2019.1.0 + >>> Select a version: 2019.1.0 - Optionally, one may then specify a local directory to mount into the bucket at ``/home/jovyan/work``:: + Optionally, one may then specify a local directory to mount into the bucket at ``/home/jovyan/mount``:: Local directories can be mounted to either /home/jovyan/work or /home/jovyan/mount/ in a bucket. The /home/jovyan/work location is a workspace and /home/jovyan/mount/ is intended @@ -61,16 +65,8 @@ Setup a New Bucket specify permissions as either r or rw for directories in mount. Code and data created in a bucket can ONLY be accessed outside the bucket or after the bucket has been deleted if it is saved in a mounted local directory. - >>> Mount storage to /home/jovyan/work? (y/n): y - >>> Enter local path: /some/local/path - - Followed by additional local directories that can be mounted under ``/home/jovyan/mount``:: - >>> Mount storage to /home/jovyan/mount? (y/n): y - >>> Enter local path: /some/other/local/path - >>> Enter bucket path: /home/jovyan/mount/data001 - >>> Enter permissions (r/rw): r - >>> Mount additional storage to /home/jovyan/mount? (y/n): n + >>> Enter local path: /some/local/path Finally, the user is asked if they want jupyterlab to be started:: @@ -79,60 +75,148 @@ Setup a New Bucket after which resen will begin creating the bucket. Example output for a new bucket named ``amber`` with jupyterlab started is:: ...adding core... + ...adding ports... ...adding mounts... Bucket created successfully! ...starting jupyterlab... - Jupyter lab can be accessed in a browser at: http://localhost:9000/?token=61469c2ccef5dd27dbf9a8ba7c296f40e04278a89e6cf76a + Jupyter lab can be accessed in a browser at: http://localhost:9002/?token=e7a11fc1ea42a445807b4e24146b9908e1abff82bacbf6f2 2. Check the status of the bucket:: - [resen] >>> status amber - {'bucket': {'name': 'amber'}, 'docker': {'image': '2019.1.0rc2', 'container': 'a6501d441a9f025dc7dd913bf6d531b6b452d0a3bd6d5bad0eedca791e1d92ca', 'port': [[9000, 9000, True]], 'storage': [['/some/local/path', '/home/jovyan/work', 'rw'], ['/some/other/local/path', '/home/jovyan/mount/data001', 'ro']], 'status': 'running', 'jupyter': {'token': '61469c2ccef5dd27dbf9a8ba7c296f40e04278a89e6cf76a', 'port': 9000}, 'image_id': 'sha256:3ba43e401c1b1a8eca8969aec8426a22d99bca349fd837270fa06dbcaefaeb47', 'pull_image': 'earthcubeingeo/resen-core@sha256:c3783e3b7f05ec17f9381a01009b794666107780d964e8087c62f7baaa00049d'}} - -At this point, the bucket should have a name, an image, at least one port, and optionally one or more storage location. Status should be ``running`` if the user decided to have jupyterlab started, otherwise the status will be ``None``. + [resen] >>> status amber + + amber + ===== + + Resen-core Version: 2019.1.0 + Status: running + Jupyter Token: e7a11fc1ea42a445807b4e24146b9908e1abff82bacbf6f2 + Jupyter Port: 9002 + Jupyter lab URL: http://localhost:9002/?token=e7a11fc1ea42a445807b4e24146b9908e1abff82bacbf6f2 + + Storage: + Local Bucket Permissions + /some/local/path /home/jovyan/mount/path rw + + Ports: + Local Bucket + 9002 9002 + +At this point, the bucket should have a name, an image, at least one port, and optionally one or more storage locations. Status should be ``running`` if the user decided to have jupyterlab started, otherwise the status will be ``None``. Work with a Bucket ------------------ -1. Check what buckets are available with ``status``:: +1. 
Check what buckets are available with ``list``:: - [resen] >>> status + [resen] >>> list Bucket Name Docker Image Status - amber 2019.1.0rc2 running + amber 2019.1.0 running If a bucket is running, it will consume system resources accordingly. -2. Stop jupyter lab from a bucket:: +2. Stop the bucket ``amber``:: - [resen] >>> stop_jupyter amber + [resen] >>> stop amber The status of ``amber`` should now be ``exited``:: - [resen] >>> status + [resen] >>> list Bucket Name Docker Image Status - amber 2019.1.0rc2 exited + amber 2019.1.0 exited The bucket will still exist and can be restarted at any time, even after quitting and restarting resen. -3. Start a jupyter lab in bucket ``amber`` that has been stopped:: +3. Start the bucket ``amber`` that was just stopped:: - [resen] >>> start_jupyter amber + [resen] >>> start amber The status of ``amber`` should now be ``running``:: [resen] >>> status Bucket Name Docker Image Status - amber 2019.1.0rc2 running + amber 2019.1.0 running + +4. Export bucket ``amber``:: + + [resen] >>> export amber + + The ``export`` command will ask a series of questions. First, provide a name for the output *.tar file:: + + >>> Enter name for output tar file: /path/for/output/amber.tar + + If desired, change the default name and tag for the exported image:: + + By default, the output image will be named "amber" and tagged "latest". + >>> Would you like to change the name and tag? (y/n): y + >>> Image name: custom_name + >>> Image tag: custom_tag + + Specify if you want all mounted directories to be included in the exported bucket. Answering ``n`` to this query will allow you to see how large each mount is and specify which you would like to include. Consider excluding any mounts that are not necessary for the analysis to reduce the size of the output file:: + The following local directories are mounted to the bucket (total 2212 MB): + /home/usr/mount1 + /home/usr/mount2 + /home/usr/mount3 + >>> Would you like to include all of these in the exported bucket? (y/n): n + >>> Include /home/usr/mount1 [154.68095 MB]? (y/n): y + >>> Include /home/usr/mount2 [2005.28493 MB]? (y/n): y + >>> Include /home/usr/mount3 [53.59823 MB]? (y/n): y - The jupyter lab server starts in the ``/home/jovyan`` directory, which should include the persistent storage directories ``work`` and ``mount``. - The user can alternate between the jupyter lab and the classic notebook view by changing the url in the browser from ``http://localhost:8000/lab`` to ``http://localhost:8000/tree``. Alternatively one can switch from the lab to the notebook through Menu -> Help -> Launch Classic Notebook. + Confirm that you want to continue with the export. The values shown should be considered a "high-side" approximation and may not be the actual final size:: + + This export could require up to 13337 MB of disk space to complete and will produce an output file up to 4600 MB. + >>> Are you sure you would like to continue? (y/n): y + Exporting bucket amber. This will take several minutes. + +5. Import a new bucket, ``amber2``, from a tar file ``amber.tar``:: + + [resen] >>> import + + This command will also ask a series of questions. First, provide a name for the imported bucket:: + + Please enter a name for your bucket. + Valid names may not contain spaces and must start with a letter and be less than 20 characters long. + >>> Enter bucket name: amber2 + + Specify the *.tar file to import the bucket from:: + + >>> Enter name for input tar file: /path/to/file/amber.tar + + If desired, enter a custom image name and tag.
If not provided, the name and tag the image was saved with on export will be used:: + + >>> Would you like to keep the default name and tag for the imported image? (y/n): n + >>> Image name: amber2 + >>> Image tag: new_tag + + When a bucket that had mounts is imported, the mounted directories must be extracted and saved on the local machine. Resen will do this automatically, but you have the option to specify where these files should be saved instead of the default location:: + + The default directory to extract the bucket metadata and mounts to is /default/save/path/resen_amber2. + >>> Would you like to specify an alternate directory? (y/n): y + >>> Enter path to directory: /new_save_path + + Aside from the existing mounts, you can add new mounts to an imported bucket. This is useful if you would like to repeat the analysis with a different dataset:: + + >>> Mount additional storage to the imported bucket? (y/n): y + >>> Enter local path: /new/local/path/new_mount + >>> Enter bucket path: /home/jovyan/mount/new_mount + >>> Enter permissions (r/rw): r + >>> Mount additional storage to /home/jovyan/mount? (y/n): n + + Similar to ``create``, you have the option to start jupyter lab immediately after the bucket is imported:: + + >>> Start bucket and jupyterlab? (y/n): y + ...starting jupyterlab... + Jupyter lab can be accessed in a browser at: http://localhost:9003/?token=70532767bab0ddc4febe2790efaaf974961e961e78e6025a Remove a Bucket --------------- +**WARNING**: This will permanently delete the bucket. Any work that was not saved in a mounted storage directory or downloaded from the bucket will be **permanently lost**. + The user can delete a bucket with the following command:: - [resen] >>> remove_bucket amber + [resen] >>> remove amber + +A bucket that is running needs to be stopped before being removed. + -A bucket that is running needs to be stopped before removed. -WARNING: This will permanently delete the bucket. Any work that was not saved in a mounted storage directory will be lost. diff --git a/resen/DockerHelper.py b/resen/DockerHelper.py new file mode 100644 index 0000000..c41b7d1 --- /dev/null +++ b/resen/DockerHelper.py @@ -0,0 +1,261 @@ +#!/usr/bin/env python + +import gzip +import time +import docker +import requests + +# all the docker commands wrapped up nicely + +# how do we call docker commands? subprocess? os.call? +# TODO: Use the docker SDK (https://docker-py.readthedocs.io/en/stable/) +class DockerHelper(): + def __init__(self): + # TODO: define these in a dictionary or json file for each version of resen-core + # need to get information for each resen-core from somewhere. + # Info like, what internal port needs to be exposed? Where do we get the image from? etc. + # mounting directory in the container? + # What does container.reload() do? Do we need it? Where? + self.container_prefix = 'resen_' + + self.docker = docker.from_env() + + # def create_container(self,**input_kwargs): + def create_container(self,bucket): + ''' + Create a docker container with the image, mounts, and ports set in this bucket. If the image + does not exist locally, pull it.
+ ''' + + # set up basic keyword argument dict + kwargs = dict() + kwargs['name'] = self.container_prefix + bucket['name'] + kwargs['command'] = 'bash' + kwargs['tty'] = True + kwargs['ports'] = dict() + + # if bucket has ports, add these to kwargs + for host, container, tcp in bucket['port']: + if tcp: + key = '%s/tcp' % (container) + else: + key = '%s/udp' % (container) + kwargs['ports'][key] = host + + # if bucket has mounts, add these to kwargs + kwargs['volumes'] = dict() + for host, container, permissions in bucket['storage']: + temp = {'bind': container, 'mode': permissions} + kwargs['volumes'][host] = temp + + # check if we have image, if not, pull it + local_image_ids = [x.id for x in self.docker.images.list()] + if bucket['image']['image_id'] not in local_image_ids: + # print("Pulling image: %s" % bucket['image']['repo']) + # print(" This may take some time...") + # status = self.stream_pull_image(bucket['pull_image']) + self.stream_pull_image(bucket['image']) + # image = self.docker.images.get(bucket['pull_image']) + # repo,digest = pull_image.split('@') + # # When pulling from repodigest sha256 no tag is assigned. So: + # image.tag(repo, tag=bucket['image']) + # print("Done!") + + # start the container + container = self.docker.containers.create(bucket['image']['image_id'],**kwargs) + + return container.id, container.status + + + def remove_container(self,bucket, remove_image=False): + ''' + Remove the container associated with the provided bucket. + ''' + container = self.docker.containers.get(bucket['container']) + container.remove() + + if remove_image: + self.docker.images.remove(bucket['image']['image_id']) + return + + + def start_container(self, bucket): + ''' + Start a container. + ''' + # need to check if bucket config has changed since last run + container = self.docker.containers.get(bucket['container']) + container.start() # this does nothing if already started + container.reload() + # print(container.status) + # time.sleep(0.1) + # print(container.status) + return container.status + + + def stop_container(self,bucket): + ''' + Stop a container. + ''' + container = self.docker.containers.get(bucket['container']) + container.stop() # this does nothing if already stopped + container.reload() + # time.sleep(0.1) + return container.status + + + def execute_command(self,bucket,command,user='jovyan',detach=True): + ''' + Execute a command in a container. Returns the exit code and output + ''' + container = self.docker.containers.get(bucket['container']) + result = container.exec_run(command,user=user,detach=detach) + return result.exit_code, result.output + + + # def stream_pull_image(self,pull_image): + def stream_pull_image(self,image): + ''' + Pull image from dockerhub. 
+ ''' + import datetime + # time formatting + def truncate_secs(delta_time, fmt=":%.2d"): + delta_str = str(delta_time).split(':') + return ":".join(delta_str[:-1]) + fmt%(float(delta_str[-1])) + # progress bar + def update_bar(sum_total,accumulated,t0,current_time, scale=0.5): + percentage = accumulated/sum_total*100 + nchars = int(percentage*scale) + bar = "\r["+nchars*"="+">"+(int(100*scale)-nchars)*" "+"]" + time_info = "Elapsed time: %s"%truncate_secs(current_time - t0) + print(bar+" %6.2f %%, %5.3f/%4.2fGB %s"%(percentage, + accumulated/1024**3,sum_total/1024**3,time_info),end="") + + print('Pulling image: {}:{}'.format(image['repo'],image['version'])) + print(' This may take some time...') + + id_list = [] + id_current = [] + id_total = 0 + t0 = prev_time = datetime.datetime.now() + # define pull_image sha256 + pull_image = '{}/{}@{}'.format(image['org'],image['repo'],image['repodigest']) + try: + # Use a lower level pull call to stream the pull + for line in self.docker.api.pull(pull_image,stream=True, decode=True): + if 'progress' not in line: + continue + line_current = line['progressDetail']['current'] + if line['id'] not in id_list: + id_list.append(line['id']) + id_current.append(line_current) + id_total += line['progressDetail']['total'] + else: + id_current[id_list.index(line['id'])] = line_current + current_time = datetime.datetime.now() + if (current_time-prev_time).total_seconds()<1: + # To limit print statements to no more than 1 per second. + continue + prev_time = current_time + update_bar(id_total,sum(id_current),t0,current_time) + # Last update of the progress bar: + update_bar(id_total,sum(id_current),t0,current_time) + except Exception as e: + raise RuntimeError("\nException encountered while pulling image {}\nException: {}".format(pull_image,str(e))) + + print() # to avoid erasing the progress bar at the end + + # repo,digest = pull_image.split('@') + # When pulling from repodigest sha256 no tag is assigned. So: + docker_image = self.docker.images.get(pull_image) + docker_image.tag('{}/{}'.format(image['org'],image['repo']), tag=image['version']) + print("Done!") + + return + + def export_container(self,bucket,filename,repo,tag): + ''' + Export existing container to a tared image file. After tar file has been created, image of container is removed. + ''' + + # TODO: + # Add checks that image was sucessfully saved before removing it? + + container = self.docker.containers.get(bucket['container']) + + # set a long timeout for this - image save takes a while + default_timeout = self.docker.api.timeout + self.docker.api.timeout = 60.*60.*24. + + # image_name = '{}:{}'.format(repo,tag) + full_repo = '{}/{}'.format(bucket['image']['org'],repo) + image_name = '{}:{}'.format(full_repo,tag) + + try: + # create new image from container + container.commit(repository=full_repo,tag=tag) + + # save image as *.tar file + image = self.docker.images.get(image_name) + out = image.save() + # with open(str(filename), 'wb') as f: + with gzip.open(str(filename),'wb',compresslevel=1) as f: + for chunk in out: + f.write(chunk) + + except requests.exceptions.ReadTimeout: + raise RuntimeError('Timeout while exporting bucket!') + + finally: + # remove image after it has been saved or if a timeout occurs + self.docker.images.remove(image_name) + + # reset default timeout + self.docker.api.timeout = default_timeout + + return + + def import_image(self,filename,repo,tag): + ''' + Import an image from a tar file. Return the image ID. 
+ ''' + + with open(str(filename), 'rb') as f: + image = self.docker.images.load(f)[0] + + # add tag + image.tag(repo, tag) + + return image.id + + def get_container_size(self, bucket): + # determine the size of the container (disk space) + # docker container inspect (https://docs.docker.com/engine/reference/commandline/container_inspect/) should be able to be used + # for this purpose, but it looks like the docker SDK equivilent (APIClient.inspect_container()) does not include fuctionality + # for the --size flag (https://docker-py.readthedocs.io/en/stable/api.html#module-docker.api.container), so the dict returned + # does not have size information + + # with docker.APIClient() as apiclient: + # info = apiclient.containers(all=True, size=True, filters={'id':bucket['container']})[0] + info = self.docker.api.containers(all=True, size=True, filters={'id':bucket['container']})[0] + + return info['SizeRw']+info['SizeRootFs'] + + def get_container_status(self, bucket): + ''' + Get the status of a particular container. + ''' + container = self.docker.containers.get(bucket['container']) + container.reload() # maybe redundant + + return container.status + + # # get a container object given a container id + # def get_container(self,container_id): + # try: + # container = self.docker.containers.get(container_id) + # return container + # except docker.errors.NotFound: + # print("ERROR: No such container: %s" % container_id) + # return None diff --git a/resen/Resen.py b/resen/Resen.py index faa63e9..b701d74 100644 --- a/resen/Resen.py +++ b/resen/Resen.py @@ -3,7 +3,7 @@ # # Title: resen # -# Author: asreimer +# Author: resen developer team # Description: The resen tool for working with resen-core locally # which allows for listing available core docker # images, creating resen buckets, starting buckets, @@ -14,300 +14,234 @@ # TODO # 1) list available resen-core version from dockerhub -# 2) create a bucket manifest from existing bucket -# 3) load a bucket from manifest file (supports moving from cloud to local, or from one computer to another) -# 4) keep track of whether a jupyter server is running or not already and provide shutdown_jupyter and open_jupyter commands -# 5) freeze a bucket -# 6) check for python 3, else throw error -# 7) when starting a bucket again, need to recreate the container if ports and/or storage locations changed. Can do so with: https://stackoverflow.com/a/33956387 +# 2) check for python 3, else throw error +# 3) when starting a bucket again, need to recreate the container if ports and/or storage locations changed. Can do so with: https://stackoverflow.com/a/33956387 # until this happens, we cannot modify storage nor ports after a bucket has been started -# 8) check that a local port being added isn't already used by another bucket. -# 9) check that a local storage location being added isn't already used by another bucket. +# 4) check that a local port being added isn't already used by another bucket. +# 5) check that a local storage location being added isn't already used by another bucket. + # - add a location for home directory persistent storage + # - how many cpu/ram resources are allowed to be used? + # - json file contains all config info about buckets + # - used to share and freeze buckets + # - track information about buckets (1st time using, which are running?) + +# The fuctions remove_storage and remove_port will probably be used MINMALLY. Is it worth keeping them? 
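# NOTE (illustrative sketch, not part of this changeset): each entry stored in
# buckets.json follows the structure assembled by create_bucket() and set_image()
# below; the field values shown here are hypothetical examples only.
#
#     {
#         "name": "amber",                 # bucket name chosen by the user
#         "image": None,                   # set_image() stores a valid_cores entry here
#         "container": None,               # docker container id, filled in once created
#         "port": [[9000, 9000, True]],    # [local port, container port, tcp]
#         "storage": [["/local/path", "/home/jovyan/mount/data", "ro"]],
#         "status": None,                  # None, 'created', 'running', or 'exited'
#         "jupyter": {"token": None, "port": None}
#     }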
import os import cmd # for command line interface import json # used to store bucket manifests locally and for export import time # used for waiting (time.sleep()) +import socket # find available port +import shutil import random # used to generate tokens for jupyter server +import tarfile +import tempfile import tempfile # use this to get unique name for docker container import webbrowser # use this to open web browser from pathlib import Path # used to check whitelist paths from subprocess import Popen, PIPE # used for selinux detection -import docker +from .DockerHelper import DockerHelper class Resen(): def __init__(self): - self.base_config_dir = self._get_config_dir() - self.__locked = False - self.__lock() - - self.bucket_manager = BucketManager(self.base_config_dir) - - def create_bucket(self,bucket_name): - return self.bucket_manager.create_bucket(bucket_name) - - def list_buckets(self,names_only=False,bucket_name=None): - return self.bucket_manager.list_buckets(names_only=names_only,bucket_name=bucket_name) - - def remove_bucket(self,bucket_name): - return self.bucket_manager.remove_bucket(bucket_name) - - def add_storage(self,bucket_name,local,container,permissions): - return self.bucket_manager.add_storage(bucket_name,local,container,permissions) - - def remove_storage(self,bucket_name,local): - return self.bucket_manager.remove_storage(bucket_name,local) - - def add_port(self,bucket_name,local,container,tcp=True): - return self.bucket_manager.add_port(bucket_name,local,container,tcp=tcp) - - def remove_port(self,bucket_name,local): - return self.bucket_manager.remove_port(bucket_name,local) - - def add_image(self,bucket_name,docker_image): - return self.bucket_manager.add_image(bucket_name,docker_image) - - def start_bucket(self,bucket_name): - return self.bucket_manager.start_bucket(bucket_name) - - def stop_bucket(self,bucket_name): - return self.bucket_manager.stop_bucket(bucket_name) - - def start_jupyter(self,bucket_name,local,container): - return self.bucket_manager.start_jupyter(bucket_name,local,container) - def stop_jupyter(self,bucket_name): - return self.bucket_manager.stop_jupyter(bucket_name) + # get configuration info + self.resen_root_dir = self._get_config_dir() + self.resen_home_dir = self._get_home_dir() - def _get_config_dir(self): - appname = 'resen' - - if 'APPDATA' in os.environ: - confighome = os.environ['APPDATA'] - elif 'XDG_CONFIG_HOME' in os.environ: - confighome = os.environ['XDG_CONFIG_HOME'] - else: - confighome = os.path.join(os.environ['HOME'],'.config') - configpath = os.path.join(confighome, appname) - - # TODO: add error checking - if not os.path.exists(configpath): - os.makedirs(configpath) - - return configpath - - def __lock(self): - self.__lockfile = os.path.join(self.base_config_dir,'lock') - if os.path.exists(self.__lockfile): - raise RuntimeError('Another instance of Resen is already running!') - - with open(self.__lockfile,'w') as f: - f.write('locked') - self.__locked = True - - def __unlock(self): - if not self.__locked: - return - - try: - os.remove(self.__lockfile) - self.__locked = False - except FileNotFoundError: - pass - except Exception as e: - print("WARNING: Unable to remove lockfile: %s" % str(e)) - - def __del__(self): - self.__unlock() - - -# All the bucket stuff -# TODO: check status of bucket before updating it in case the bucket has changed status since last operation -class BucketManager(): -# - use a bucket -# - how many are allowed to run simultaneously? -# - use the bucket how? 
only through jupyter notebook/lab is Ashton's vote. Terminal provided there - - def __init__(self,resen_root_dir): + # set lock + self.__locked = False + self.__lock() - self.resen_root_dir = resen_root_dir + # initialize docker helper self.dockerhelper = DockerHelper() - # load + + # load configuration self.load_config() self.valid_cores = self.__get_valid_cores() self.selinux = self.__detect_selinux() - self.storage_whitelist = ['/home/jovyan/work','/home/jovyan/mount'] + ### NOTE - Does this still need to include '/home/jovyan/work' for server compatability? + ### If so, can we move the white list to resencmd.py? The server shouldn't every try to + ### mount to an illegal location but the user might. + self.storage_whitelist = ['/home/jovyan/mount'] - def __get_valid_cores(self): - # TODO: download json file from resen-core github repo - # and if that fails, fallback to hardcoded list - return [{"version":"2019.1.0rc2","repo":"resen-core","org":"earthcubeingeo", - "image_id":'sha256:8b4750aa5186bdcf69a50fa10b0fd24a7c2293ef6135a9fdc594e0362443c99c', - "repodigest":'sha256:2fe3436297c23a0d5393c8dae8661c40fc73140e602bd196af3be87a5e215bc2'},] def load_config(self): + ''' + Load config file that contains information on existing buckets. + ''' + # define config file name bucket_config = os.path.join(self.resen_root_dir,'buckets.json') - # check if buckets.json exists, if not, initialize empty dictionary - if not os.path.exists(bucket_config): - params = list() - else: - # if it does exist, load it and return # TODO: handle exceptions due to file reading problems (incorrect file permissions) - with open(bucket_config,'r') as f: + # TODO: update status of buckets to double check that status is the same as in bucket.json + try: + # check if buckets.json exists, if not, initialize empty dictionary + with open(bucket_config,'r') as f: params = json.load(f) + except FileNotFoundError: + # if config file doesn't exist, initialize and empty list + params = list() self.buckets = params - self.bucket_names = [x['bucket']['name'] for x in self.buckets] + self.bucket_names = [x['name'] for x in self.buckets] - # TODO: update status of buckets to double check that status is the same as in bucket.json def save_config(self): + ''' + Save config file with information on existing buckets + ''' + # define config file name bucket_config = os.path.join(self.resen_root_dir,'buckets.json') # TODO: handle exceptions due to file writing problems (no free disk space, incorrect file permissions) - with open(bucket_config,'w') as f: + with open(bucket_config,'w') as f: json.dump(self.buckets,f) + + def get_bucket(self,bucket_name): + ''' + Retrieve a bucket object by its name. Raise an error if the bucket does not exist. + ''' + try: + ind = self.bucket_names.index(bucket_name) + except ValueError: + raise ValueError('Bucket with name: %s does not exist!' % bucket_name) + + bucket = self.buckets[ind] + return bucket + + def create_bucket(self,bucket_name): - # - add a location for home directory persistent storage - # - how many cpu/ram resources are allowed to be used? - # - json file contains all config info about buckets - # - used to share and freeze buckets - # - track information about buckets (1st time using, which are running?) + ''' + Create "empty" bucket. Only name assigned. + ''' + # raise error if bucket_name already in uses if bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s already exists!" % (bucket_name)) - return False + raise ValueError("Bucket with name: %s already exists!" 
% (bucket_name)) params = dict() - params['bucket'] = dict() - params['docker'] = dict() - params['bucket']['name'] = bucket_name - params['docker']['image'] = None - params['docker']['container'] = None - params['docker']['port'] = list() - params['docker']['storage'] = list() - params['docker']['status'] = None - params['docker']['jupyter'] = dict() - params['docker']['jupyter']['token'] = None - params['docker']['jupyter']['port'] = None + params['name'] = bucket_name + params['image'] = None + params['container'] = None + params['port'] = list() + params['storage'] = list() + params['status'] = None + params['jupyter'] = dict() + params['jupyter']['token'] = None + params['jupyter']['port'] = None # now add the new bucket to the self.buckets config and then update the config file self.buckets.append(params) - self.bucket_names = [x['bucket']['name'] for x in self.buckets] + self.bucket_names = [x['name'] for x in self.buckets] self.save_config() - return True + return - def list_buckets(self,names_only=False,bucket_name=None): - if bucket_name is None: - if names_only: - print("{:<0}".format("Bucket Name")) - for name in self.bucket_names: - print("{:<0}".format(str(name))) + + def remove_bucket(self,bucket_name): + ''' + Remove a bucket, including the corresponding container. + ''' + + self.update_bucket_statuses() + bucket = self.get_bucket(bucket_name) + + # cannot remove bucket if currently running - raise error + if bucket['status'] == 'running': + raise RuntimeError('ERROR: Bucket %s is running, cannot remove.' % (bucket['name'])) + + # are other buckets using the same image? + # if so, we shouldn't try to remove the image! + rm_image_id = bucket['image']['image_id'] + buckets_with_same_id = list() + for x in self.buckets: + other_id = x['image']['image_id'] + other_name = x['name'] + if other_id == rm_image_id and other_name != bucket_name: + buckets_with_same_id.append(other_name) + + remove_image = len(buckets_with_same_id) == 0 + + # if docker container created, remove it first and update status + if bucket['status'] in ['created','exited'] and bucket['container'] is not None: + # if bucket imported, clean up by removing image and import directory + if 'import_dir' in bucket: + self.dockerhelper.remove_container(bucket, remove_image=remove_image) + # also remove temporary import directory + shutil.rmtree(bucket['import_dir']) else: + self.dockerhelper.remove_container(bucket) - print("{:<20}{:<25}{:<25}".format("Bucket Name","Docker Image","Status")) - for bucket in self.buckets: - name = self.__trim(str(bucket['bucket']['name']),18) - image = self.__trim(str(bucket['docker']['image']),23) - status = self.__trim(str(bucket['docker']['status']),23) - print("{:<20}{:<25}{:<25}".format(name, image, status)) + bucket['status'] = None + bucket['container'] = None + self.save_config() - else: # TODO, print all bucket info for bucket_name - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + # identify bucket index and remove it from both buckets and bucket_names + ind = self.bucket_names.index(bucket_name) + self.buckets.pop(ind) + self.bucket_names.pop(ind) + self.save_config() - ind = self.bucket_names.index(bucket_name) - # TODO: make this print a nice table - print(self.buckets[ind]) + return - return True - def __trim(self,string,length): - if len(string) > length: - return string[:length-3]+'...' 
- else: - return string + def set_image(self,bucket_name,docker_image): + ''' + Set the image to use in a bucket + ''' + # It should be fine to overwrite an existing image if the container hasn't been started yet + # would be helpful to save image org and repo as well for export purposes + # should we check if the image ID is available locally and if not pull it HERE insead of in the container creation? - def remove_bucket(self,bucket_name): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + # get bucket + bucket = self.get_bucket(bucket_name) - self.update_bucket_statuses() - ind = self.bucket_names.index(bucket_name) - bucket = self.buckets[ind] + # if container has been created, cannot change the image + if bucket['status'] is not None: + raise RuntimeError("Bucket has already been started, cannot set new image.") - if bucket['docker']['status'] == 'running': - #container is running and we should throw an error - print('ERROR: Bucket %s is running, cannot remove.' % (bucket['bucket']['name'])) - return False + # check that input is a valid image + valid_versions = [x['version'] for x in self.valid_cores] + if not docker_image in valid_versions: + raise ValueError("Invalid resen-core version %s. Valid versions: %s" % (docker_image,', '.join(valid_versions))) - if bucket['docker']['status'] in ['created','exited'] and bucket['docker']['container'] is not None: - # then we can remove container and update status - success = self.dockerhelper.remove_container(bucket['docker']['container']) - if success: - self.buckets[ind]['docker']['status'] = None - self.buckets[ind]['docker']['container'] = None - self.save_config() + ind = valid_versions.index(docker_image) + image = self.valid_cores[ind] + bucket['image'] = image - ind = self.bucket_names.index(bucket_name) - bucket = self.buckets[ind] - if bucket['docker']['container'] is None: - self.buckets.pop(ind) - self.bucket_names = [x['bucket']['name'] for x in self.buckets] - self.save_config() - return True - else: - print('ERROR: Failed to remove bucket %s' % (bucket['bucket']['name'])) - return False + self.save_config() + + return - def load(self): - # - import a bucket - # - docker container export? (https://docs.docker.com/engine/reference/commandline/container_export/) - # - check iodide, how do they share - pass - - def export(self): - # export a bucket - # - pass - - def freeze_bucket(self): - # - bucket freeze (create docker image) - # - make a Dockerfile, build it, save it to tar.gz - # - docker save (saves an image): https://docs.docker.com/engine/reference/commandline/save/ - # or docker container commit: https://docs.docker.com/engine/reference/commandline/container_commit/ - # - docker image load (opposite of docker save): https://docs.docker.com/engine/reference/commandline/image_load/ - pass def add_storage(self,bucket_name,local,container,permissions='r'): + ''' + Add a host machine storage location to the bucket. + ''' # TODO: investiage difference between mounting a directory and fileblock # See: https://docs.docker.com/storage/ - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" 
% bucket_name) - return False + # get bucket + bucket = self.get_bucket(bucket_name) - ind = self.bucket_names.index(bucket_name) - # check if bucket is running - if self.buckets[ind]['docker']['status'] is not None: - print("ERROR: Bucket has already been started, cannot add storage: %s" % (local)) - return False + # if container has been created, cannot add storage + if bucket['status'] is not None: + raise RuntimeError("Bucket has already been started, cannot add storage: %s" % (local)) - # check if storage already exists in list of storage - existing_local = [x[0] for x in self.buckets[ind]['docker']['storage']] + # check if input locations already exist in bucket list of storage + existing_local = [x[0] for x in bucket['storage']] if local in existing_local: - print("ERROR: Local storage location already in use in bucket!") - return False - existing_container = [x[1] for x in self.buckets[ind]['docker']['storage']] + raise FileExistsError('Local storage location already in use in bucket!') + existing_container = [x[1] for x in bucket['storage']] if container in existing_container: - print("ERROR: Container storage location already in use in bucket!") - return False + raise FileExistsError('Container storage location already in use in bucket!') + + # check that local file path exists + if not Path(local).is_dir(): + raise FileNotFoundError('Cannot find local storage location!') # check that user is mounting in a whitelisted location valid = False @@ -317,12 +251,11 @@ def add_storage(self,bucket_name,local,container,permissions='r'): if p == child or p in child.parents: valid = True if not valid: - print("ERROR: Invalid mount location. Can only mount storage into: %s." % ', '.join(self.storage_whitelist)) - return False + raise ValueError("Invalid mount location. Can only mount storage into: %s." % ', '.join(self.storage_whitelist)) + # check and adjust permissions if not permissions in ['r','ro','rw']: - print("ERROR: Invalid permissions. Valid options are 'r' and 'rw'.") - return False + raise ValueError("Invalid permissions. Valid options are 'r' and 'rw'.") if permissions in ['r','ro']: permissions = 'ro' @@ -330,302 +263,311 @@ def add_storage(self,bucket_name,local,container,permissions='r'): if self.selinux: permissions += ',Z' - # TODO: check if storage location exists on host - self.buckets[ind]['docker']['storage'].append([local,container,permissions]) + # Add storage location + bucket['storage'].append([local,container,permissions]) self.save_config() - - return True + + return + def remove_storage(self,bucket_name,local): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + ''' + Remove a storage location from the bucket. 
+ ''' - ind = self.bucket_names.index(bucket_name) - # check if bucket is running - if self.buckets[ind]['docker']['status'] is not None: - print("ERROR: Bucket has already been started, cannot remove storage: %s" % (local)) - return False - - # check if storage already exists in list of storage - existing_storage = [x[0] for x in self.buckets[ind]['docker']['storage']] + # get bucket + bucket = self.get_bucket(bucket_name) + + # if container created, cannot remove storage + if bucket['status'] is not None: + raise RuntimeError("Bucket has already been started, cannot remove storage: %s" % (local)) + + # find index of storage + existing_storage = [x[0] for x in bucket['storage']] try: - ind2 = existing_storage.index(local) - self.buckets[ind]['docker']['storage'].pop(ind2) - self.save_config() + ind = existing_storage.index(local) + # raise exception if input location does not exist except ValueError: - print("ERROR: Storage location %s not associated with bucket %s" % (local,bucket_name)) - return False - - return True + raise FileNotFoundError("Storage location %s not associated with bucket %s" % (local,bucket_name)) - def add_port(self,bucket_name,local,container,tcp=True): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + bucket['storage'].pop(ind) + self.save_config() - ind = self.bucket_names.index(bucket_name) - # check if bucket is running - if self.buckets[ind]['docker']['status'] is not None: - print("ERROR: Bucket has already been started, cannot add port: %s" % (local)) - return False + return - # check if local port already exists in list of ports - existing_local = [x[0] for x in self.buckets[ind]['docker']['port']] - if local in existing_local: - print("ERROR: Local port location already in use in bucket!") - return False - # TODO: check if port location exists on host - self.buckets[ind]['docker']['port'].append([local,container,tcp]) + def add_port(self,bucket_name,local=None,container=None,tcp=True): + ''' + Add a port to the bucket + ''' + # get bucket + bucket = self.get_bucket(bucket_name) + + # if container has been created, cannot add port + if bucket['status'] is not None: + raise RuntimeError("Bucket has already been started, cannot add port: %s" % (local)) + + if not local and not container: + # this is not atomic, so it is possible that another process might snatch up the port + local = self.get_port() + container = local + + else: + # check if local/container port already exists in list of ports + existing_local = [x[0] for x in bucket['port']] + if local in existing_local: + raise ValueError('Local port location already in use in bucket!') + existing_container = [x[1] for x in bucket['port']] + if container in existing_container: + raise ValueError('Container port location already in use in bucket!') + + bucket['port'].append([local,container,tcp]) self.save_config() - - return True - def remove_port(self,bucket_name,local): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" 
% bucket_name) - return False + return - ind = self.bucket_names.index(bucket_name) - # check if bucket is running - if self.buckets[ind]['docker']['status'] is not None: - print("ERROR: Bucket has already been started, cannot remove port: %s" % (local)) - return False - - # check if port already exists in list of port - existing_port = [x[0] for x in self.buckets[ind]['docker']['port']] + + def remove_port(self,bucket_name,local): + ''' + Remove a port from the bucket + ''' + # get bucket + bucket = self.get_bucket(bucket_name) + + # if container has been created, cannot remove port + if bucket['status'] is not None: + raise RuntimeError("Bucket has already been started, cannot remove port: %s" % (local)) + + # find port and remove it + existing_port = [x[0] for x in bucket['port']] try: - ind2 = existing_port.index(local) - self.buckets[ind]['docker']['port'].pop(ind2) - self.save_config() + ind = existing_port.index(local) + # raise exception if port is not assigned to bucket except ValueError: - print("ERROR: port location %s not associated with bucket %s" % (local,bucket_name)) - return False - - return True + raise ValueError("Port location %s not associated with bucket %s" % (local,bucket_name)) - def add_image(self,bucket_name,docker_image): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + bucket['port'].pop(ind) + self.save_config() - # TODO: check if "docker_image" is a valid resen-core image + return - # check if image is already added exists in list of storage - ind = self.bucket_names.index(bucket_name) - existing_image = self.buckets[ind]['docker']['image'] - if not existing_image is None: - print("ERROR: Image %s was already added to bucket %s" % (existing_image,bucket_name)) - return False - valid_versions = [x['version'] for x in self.valid_cores] - if not docker_image in valid_versions: - print("ERROR: Invalid resen-core version %s. Valid version: %s" % (docker_image,', '.join(valid_versions))) - return False - - for x in self.valid_cores: - if docker_image == x['version']: - image = x['version'] - image_id = x['image_id'] - pull_image = '%s/%s@%s' % (x['org'],x['repo'],x['repodigest']) - break + def get_port(self): + # this is not atomic, so it is possible that another process might snatch up the port + # TODO: check if port location exists on host - maybe not? If usuer manually assigns port, ok to trust they know what they're doing? 
+ # check if port avaiable on host (from https://stackoverflow.com/questions/2470971/fast-way-to-test-if-a-port-is-in-use-using-python) + port = 9000 + assigned_ports = [y[0] for x in self.buckets for y in x['port']] + + while True: + with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: + assigned = s.connect_ex(('localhost', port)) == 0 + if not assigned and not port in assigned_ports: + return port + else: + port += 1 + + + def create_container(self, bucket_name, give_sudo=True): + + # get bucket + bucket = self.get_bucket(bucket_name) + + # Make sure we have an image assigned to the bucket + if bucket['image'] is None: + raise RuntimeError('Bucket does not have an image assigned to it.') - self.buckets[ind]['docker']['image'] = image - self.buckets[ind]['docker']['image_id'] = image_id - self.buckets[ind]['docker']['pull_image'] = pull_image + container_id, status = self.dockerhelper.create_container(bucket) + bucket['container'] = container_id + bucket['status'] = status self.save_config() - - return True - # TODO: def change_image(self,bucket_name,new_docker_image) - # but only if container=None and status=None, in other words, only if the bucket has never been started. + if give_sudo: + # start bucket and execute any commands needed for proper set-up + self.start_bucket(bucket_name) + # run commands to set up sudo for jovyan + self.set_sudo(bucket_name) + self.stop_bucket(bucket_name) def start_bucket(self,bucket_name): - # check if container has been previously started, create one if needed, start bucket if not running - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + ''' + Start the bucket + ''' + # get bucket + bucket = self.get_bucket(bucket_name) + + # if bucket is already running, do nothing + if bucket['status'] in ['running']: + return - ind = self.bucket_names.index(bucket_name) - bucket = self.buckets[ind] - - # Make sure we have an image assigned to the bucket - existing_image = bucket['docker']['image'] - if existing_image is None: - print("ERROR: Bucket does not have an image assigned to it.") - return False + # If a container hasn't been created yet, raise error + if bucket['container'] is None: + raise RuntimeError('Container for this bucket has not been created yet. 
Cannot start bucket.') - if bucket['docker']['container'] is None: - # no container yet created, so create one - kwargs = dict() - kwargs['ports'] = bucket['docker']['port'] - kwargs['storage'] = bucket['docker']['storage'] - kwargs['bucket_name'] = bucket['bucket']['name'] - kwargs['image_name'] = bucket['docker']['image'] - kwargs['image_id'] = bucket['docker']['image_id'] - kwargs['pull_image'] = bucket['docker']['pull_image'] - container_id = self.dockerhelper.create_container(**kwargs) - - if container_id is None: - print("ERROR: Failed to create container") - return False + # start the container and update status + status = self.dockerhelper.start_container(bucket) + bucket['status'] = status + self.save_config() - self.buckets[ind]['docker']['container'] = container_id - self.save_config() + # raise error if bucket did not start sucessfully + if status != 'running': + raise RuntimeError('Failed to start bucket %s' % (bucket['name'])) + + return - self.update_bucket_statuses() - ind = self.bucket_names.index(bucket_name) - bucket = self.buckets[ind] - if bucket['docker']['status'] in ['created', 'exited']: - # then we can start the container and update status - success = self.dockerhelper.start_container(bucket['docker']['container']) - if success: - self.buckets[ind]['docker']['status'] = 'running' - self.save_config() - return True - else: - print('ERROR: Failed to start bucket %s' % (bucket['bucket']['name'])) - return False - else: - #contained is already running and we should throw an error - print('ERROR: Bucket %s is already running!' % (bucket['bucket']['name'])) - return False - def stop_bucket(self,bucket_name): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + ''' + Stop bucket + ''' self.update_bucket_statuses() - ind = self.bucket_names.index(bucket_name) - bucket = self.buckets[ind] + # get bucket + bucket = self.get_bucket(bucket_name) - if bucket['docker']['status'] in ['running']: - # then we can start the container and update status - success = self.dockerhelper.stop_container(bucket['docker']['container']) - if success: - self.buckets[ind]['docker']['status'] = 'exited' - self.save_config() - return True - else: - print('ERROR: Failed to stop bucket %s' % (bucket['bucket']['name'])) - return False - else: - #contained is already running and we should throw an error - print('ERROR: Bucket %s is not running!' % (bucket['bucket']['name'])) - return False + # if bucket is already stopped, do nothing + if bucket['status'] in ['created', 'exited']: + return - def execute_command(self,bucket_name,command,detach=True): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + # stop the container and update status + status = self.dockerhelper.stop_container(bucket) + bucket['status'] = status + self.save_config() + + if status != 'exited': + raise RuntimeError('Failed to stop bucket %s' % (bucket['name'])) + + return + + def execute_command(self,bucket_name,command,user='jovyan',detach=True): + ''' + Execute a command in the bucket. Returns the exit code and output form the command, if applicable (if not detached?). 
+ ''' self.update_bucket_statuses() - ind = self.bucket_names.index(bucket_name) - bucket = self.buckets[ind] + # get bucket + bucket = self.get_bucket(bucket_name) - if bucket['docker']['status'] in ['running']: - # then we can start the container and update status - result = self.dockerhelper.execute_command(bucket['docker']['container'],command,detach=detach) - status, output = result - if (detach and status is None) or (not detach and status==0): - return True - else: - print('ERROR: Failed to execute command %s' % (command)) - return False - else: - #contained is already running and we should throw an error - print('ERROR: Bucket %s is not running!' % (bucket['bucket']['name'])) - return False + # raise error if bucket not running + if bucket['status'] not in ['running']: + raise RuntimeError('Bucket %s is not running!' % (bucket['name'])) - def start_jupyter(self,bucket_name,local_port,container_port): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + # execute command + result = self.dockerhelper.execute_command(bucket,command,user=user,detach=detach) + code, output = result + if (detach and code is not None) or (not detach and code!=0): + raise RuntimeError('Failed to execute command %s' % (command)) - ind = self.bucket_names.index(bucket_name) - bucket = self.buckets[ind] - pid = self.get_jupyter_pid(bucket['docker']['container']) + return result + + + def set_sudo(self, bucket_name, password='ganimede'): + ''' + Add jovyan user to sudoers + ''' + cmd = "bash -cl 'echo \"jovyan:{}\" | chpasswd && usermod -aG sudo jovyan && sed --in-place \"s/^#\s*\(%sudo\s\+ALL=(ALL:ALL)\s\+ALL\)/\\1/\" /etc/sudoers'".format(password) + self.execute_command(bucket_name, cmd, user='root') + return + + + def start_jupyter(self,bucket_name,local_port=None,container_port=None): + ''' + Start a jupyter server in the bucket and open a web browser window to a jupyter lab session. Server will + use the specified local and container ports (ports must be a matched pair!) + ''' + # TODO: + # Identify port ONLY with local port? + # Select port automatically if none provided? + # Allow multiple jupyter servers to run simultaniously? Would this ever be useful? 
+ + # get bucket + bucket = self.get_bucket(bucket_name) + + # check if jupyter server already running - if so, proint the url to the screen + pid = self.get_jupyter_pid(bucket_name) if not pid is None: - port = bucket['docker']['jupyter']['port'] - token = bucket['docker']['jupyter']['token'] + port = bucket['jupyter']['port'] + token = bucket['jupyter']['token'] url = 'http://localhost:%s/?token=%s' % (port,token) print("Jupyter lab is already running and can be accessed in a browser at: %s" % (url)) - return True + return + # if ports are not specified, use the first port set from the bucket + if not local_port and not container_port: + local_port = bucket['port'][0][0] + container_port = bucket['port'][0][1] + # set a random token and form token = '%048x' % random.randrange(16**48) - command = "bash -cl 'source activate py36 && jupyter lab --no-browser --ip 0.0.0.0 --port %s --NotebookApp.token=%s --KernelSpecManager.ensure_native_kernel=False'" + command = "bash -cl 'source /home/jovyan/envs/py36/bin/activate py36 && jupyter lab --no-browser --ip 0.0.0.0 --port %s --NotebookApp.token=%s --KernelSpecManager.ensure_native_kernel=False'" command = command % (container_port, token) - status = self.execute_command(bucket_name,command,detach=True) - if status == False: - return False + # exectute command to start jupyter server + self.execute_command(bucket_name,command,detach=True) time.sleep(0.1) # now check that jupyter is running self.update_bucket_statuses() - pid = self.get_jupyter_pid(bucket['docker']['container']) + pid = self.get_jupyter_pid(bucket_name) - if pid is not None: - self.buckets[ind]['docker']['jupyter']['token'] = token - self.buckets[ind]['docker']['jupyter']['port'] = local_port - self.save_config() - url = 'http://localhost:%s/?token=%s' % (local_port,token) - print("Jupyter lab can be accessed in a browser at: %s" % (url)) - time.sleep(3) - webbrowser.open(url) - return True - else: - print("ERROR: Failed to start jupyter server!") - return False + if pid is None: + raise RuntimeError("Failed to start jupyter server!") - def stop_jupyter(self,bucket_name): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" 
% bucket_name) - return False + # set jupyter token and port + bucket['jupyter']['token'] = token + bucket['jupyter']['port'] = local_port + self.save_config() - ind = self.bucket_names.index(bucket_name) - bucket = self.buckets[ind] - if not bucket['docker']['status'] in ['running']: - return True + # print url to access jupyter lab to screen and automatically open in web browser + url = 'http://localhost:%s/?token=%s' % (local_port,token) + print("Jupyter lab can be accessed in a browser at: %s" % (url)) + time.sleep(3) + webbrowser.open(url) - pid = self.get_jupyter_pid(bucket['docker']['container']) + return + + + def stop_jupyter(self,bucket_name): + ''' + Stop jupyter server + ''' + # get bucket + bucket = self.get_bucket(bucket_name) + + # if jupyter server not running, do nothing + pid = self.get_jupyter_pid(bucket_name) if pid is None: return True - port = bucket['docker']['jupyter']['port'] + # form python command to stop jupyter and execute it + port = bucket['jupyter']['port'] python_cmd = 'from notebook.notebookapp import shutdown_server, list_running_servers; ' python_cmd += 'svrs = [x for x in list_running_servers() if x[\\\"port\\\"] == %s]; ' % (port) python_cmd += 'sts = True if len(svrs) == 0 else shutdown_server(svrs[0]); print(sts)' command = "bash -cl '/home/jovyan/envs/py36/bin/python -c \"%s \"'" % (python_cmd) status = self.execute_command(bucket_name,command,detach=False) - self.update_bucket_statuses() - # now verify it is dead - pid = self.get_jupyter_pid(bucket['docker']['container']) + pid = self.get_jupyter_pid(bucket_name) if not pid is None: - print("ERROR: Failed to stop jupyter lab.") - return False + raise RuntimeError("Failed to stop jupyter lab.") - self.buckets[ind]['docker']['jupyter']['token'] = None - self.buckets[ind]['docker']['jupyter']['port'] = None + # Update jupyter token and port to None + bucket['jupyter']['token'] = None + bucket['jupyter']['port'] = None self.save_config() - return True - - def get_jupyter_pid(self,container): + return - result = self.dockerhelper.execute_command(container,'ps -ef',detach=False) - if result == False: - return None - output = result[1].decode('utf-8').split('\n') + def get_jupyter_pid(self,bucket_name): + ''' + Get PID for the jupyter server running in a particular bucket + ''' + code, output = self.execute_command(bucket_name, 'ps -ef', detach=False) + output = output.decode('utf-8').split('\n') pid = None for line in output: @@ -636,232 +578,323 @@ def get_jupyter_pid(self,container): return pid - def update_bucket_statuses(self): - for i,bucket in enumerate(self.buckets): - container_id = bucket['docker']['container'] - if container_id is None: - continue - status = self.dockerhelper.get_container_status(container_id) - if status: - self.buckets[i]['docker']['status'] = status - self.save_config() + def export_bucket(self, bucket_name, outfile, exclude_mounts=[], img_repo=None, img_tag=None): + ''' + Export a bucket + ''' + # TODO: some kind of status bar would be useful - this takes a while + # Should we include "human readable" metadata? + + # make sure the output filename has the .tar extension on it + name, ext = os.path.splitext(outfile) + if not ext == '.tar': + outfile = name + '.tar' + + # get bucket + bucket = self.get_bucket(bucket_name) + + # create temporary directory that will become the final bucket tar file + print('Exporting bucket: %s...'
% str(bucket_name)) + with tempfile.TemporaryDirectory() as bucket_dir: + + bucket_dir_path = Path(bucket_dir) + + # try: + + # initialize manifest + manifest = dict() + + # # find container size and determine if there's enough disk space for the export + # container_size = self.bucket_diskspace(bucket_name) + # disk_space = + # if disk_space < container_size*3: + # raise RuntimeError("Not enough disk space for image export!") + + if not img_repo: + img_repo = bucket['name'].lower() + if not img_tag: + img_tag = 'latest' + + # export container to image *.tar file + image_file_name = '{}_image.tgz'.format(bucket_name) + print('...exporting image...') + status = self.dockerhelper.export_container(bucket, bucket_dir_path.joinpath(image_file_name), img_repo, img_tag) + print('...done') + manifest['image'] = image_file_name + manifest['image_repo'] = img_repo + manifest['image_tag'] = img_tag + + # save all mounts individually as *.tgz files + manifest['mounts'] = list() + for mount in bucket['storage']: + # skip mount if it is listed in exclude_mounts + if mount[0] in exclude_mounts: + continue - def get_container(self,bucket_name): - if not bucket_name in self.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" % bucket_name) - return False + source_dir = Path(mount[0]) + mount_file_name = '{}_mount.tgz'.format(source_dir.name) + print('...exporting mount: %s' % str(source_dir)) + with tarfile.open(str(bucket_dir_path.joinpath(mount_file_name)), "w:gz", compresslevel=1) as tar: + tar.add(str(source_dir), arcname=source_dir.name) - ind = self.bucket_names.index(bucket_name) - return self.buckets[ind]['docker']['container'] + manifest['mounts'].append([mount_file_name, mount[1], mount[2]]) - def __detect_selinux(self): - try: - p = Popen(['/usr/sbin/getenforce'], stdin=PIPE, stdout=PIPE, stderr=PIPE) - output, err = p.communicate() - output = output.decode('utf-8').strip('\n') - rc = p.returncode + # save manifest file + print('...saving manifest') + with open(str(bucket_dir_path.joinpath('manifest.json')),'w') as f: + json.dump(manifest, f) - if rc == 0 and output == 'Enforcing': - return True - else: - return False - except FileNotFoundError: - return False + # save entire bucket as tar file + with tarfile.open(outfile, 'w') as tar: + for f in os.listdir(str(bucket_dir_path)): + tar.add(str(bucket_dir_path.joinpath(f)), arcname=f) + print('...Bucket export complete!') - # TODO: def reset_bucket(self,bucket_name): - # used to reset a bucket to initial state (stop existing container, delete it, create new container) + # except (RuntimeError,tarfile.TarError) as e: + # raise RuntimeError('Bucket Export Failed: {}'.format(str(e))) + # finally: + # # remove temporary directory + # # shutil.rmtree(bucket_dir) + # bucket_dir.cleanup() -# all the docker commands wrapped up nicely + return -# how do we call docker commands? subprocess? os.call? -# TODO: Use the docker SDK (https://docker-py.readthedocs.io/en/stable/) -class DockerHelper(): - def __init__(self): - # TODO: define these in a dictionary or json file for each version of resen-core - # need to get information for each resen-core from somewhere. - # Info like, what internal port needs to be exposed? Where do we get the image from? etc. - # mounting directory in the container? - self.container_prefix = 'resen_' - - self.docker = docker.from_env() - - def create_container(self,**input_kwargs): - # TODO: Does container already exist? Does image exist (if not, pull it)? 
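As a rough illustration of what the ``export_bucket`` method above produces (and what ``import_bucket`` below consumes): the output archive is a plain tar file holding the container image archive, one ``*_mount.tgz`` per exported mount, and a ``manifest.json`` describing them. A minimal sketch for inspecting that manifest with only the standard library, assuming an archive produced by ``export_bucket`` (the file name ``amber.tar`` is hypothetical)::

    import json
    import tarfile

    def read_bucket_manifest(bucket_tar):
        # return the manifest dict stored at the top level of an exported bucket archive
        with tarfile.open(bucket_tar) as tar:
            member = tar.extractfile('manifest.json')
            return json.loads(member.read().decode('utf-8'))

    manifest = read_bucket_manifest('amber.tar')
    print(manifest['image'], manifest['image_repo'], manifest['image_tag'])
    print(manifest['mounts'])  # [[mount_file, container_path, permissions], ...]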
- ports = input_kwargs.get('ports',None) - storage = input_kwargs.get('storage',None) - bucket_name = input_kwargs.get('bucket_name',None) - image_name = input_kwargs.get('image_name',None) - image_id = input_kwargs.get('image_id',None) - pull_image = input_kwargs.get('pull_image',None) - - - # TODO: jupyterlab or jupyter notebook, pass ports, mount volumes, generate token for lab/notebook server - create_kwargs = dict() - create_kwargs['name'] = self.container_prefix + bucket_name - create_kwargs['command'] = 'bash' - create_kwargs['tty'] = True - create_kwargs['ports'] = dict() - - for host, container, tcp in ports: - if tcp: - key = '%s/tcp' % (container) - else: - key = '%s/udp' % (container) - create_kwargs['ports'][key] = host - - create_kwargs['volumes'] = dict() - for host, container, permissions in storage: - # TODO: if SELinux, modify permissions to include ",Z" - key = host - temp = {'bind': container, 'mode': permissions} - create_kwargs['volumes'][key] = temp - - # check if we have image, if not, pull it - local_image_ids = [x.id for x in self.docker.images.list()] - if image_id not in local_image_ids: - print("Pulling image: %s" % image_name) - print(" This may take some time...") - status = self.stream_pull_image(pull_image) - if not status: - print("ERROR: Failed to pull image") - return None - image = self.docker.images.get(pull_image) - repo,digest = pull_image.split('@') - # When pulling from repodigest sha256 no tag is assigned. So: - image.tag(repo, tag=image_name) - print("Done!") - - container_id = self.docker.containers.create(image_id,**create_kwargs) - - return container_id.id - - def stream_pull_image(self,pull_image): - import datetime - # time formatting - def truncate_secs(delta_time, fmt=":%.2d"): - delta_str = str(delta_time).split(':') - return ":".join(delta_str[:-1]) + fmt%(float(delta_str[-1])) - # progress bar - def update_bar(sum_total,accumulated,t0,current_time, scale=0.5): - percentage = accumulated/sum_total*100 - nchars = int(percentage*scale) - bar = "\r["+nchars*"="+">"+(int(100*scale)-nchars)*" "+"]" - time_info = "Elapsed time: %s"%truncate_secs(current_time - t0) - print(bar+" %6.2f %%, %5.3f/%4.2fGB %s"%(percentage, - accumulated/1024**3,sum_total/1024**3,time_info),end="") - - id_list = [] - id_current = [] - id_total = 0 - t0 = prev_time = datetime.datetime.now() - try: - # Use a lower level pull call to stream the pull - for line in self.docker.api.pull(pull_image,stream=True, decode=True): - if 'progress' not in line: - continue - line_current = line['progressDetail']['current'] - if line['id'] not in id_list: - id_list.append(line['id']) - id_current.append(line_current) - id_total += line['progressDetail']['total'] - else: - id_current[id_list.index(line['id'])] = line_current - current_time = datetime.datetime.now() - if (current_time-prev_time).total_seconds()<1: - # To limit print statements to no more than 1 per second. - continue - prev_time = current_time - update_bar(id_total,sum(id_current),t0,current_time) - # Last update of the progress bar: - update_bar(id_total,sum(id_current),t0,current_time) - except Exception as e: - print("\nException encountered while pulling image %s" % pull_image) - print("Exception: %s" % str(e)) - return False - print() # to avoid erasing the progress bar at the end + def import_bucket(self,bucket_name,filename,extract_dir=None,img_repo=None,img_tag=None,remove_image_file=False): + ''' + Import bucket from tgz file. Extract image and mounts. Set up new bucket with image and mounts. 
+ This does NOT add ports (these should be selected based on new local computer) and container is NOT created/started. + ''' - return True + if not extract_dir: + extract_dir = Path(filename).resolve().with_name('resen_{}'.format(bucket_name)) + else: + extract_dir = Path(extract_dir) - def start_container(self, container_id): - # need to check if bucket config has changed since last run - # need to check if bucket is already running - container = self.get_container(container_id) - if container is None: - return False + # untar bucket file + with tarfile.open(filename) as tar: + tar.extractall(path=str(extract_dir)) - container.start() # this does nothing if already started - container.reload() - time.sleep(0.1) + # read manifest + with open(str(extract_dir.joinpath('manifest.json')),'r') as f: + manifest = json.load(f) + + # create new bucket + self.create_bucket(bucket_name) + bucket = self.get_bucket(bucket_name) + + if not img_repo: + img_repo = manifest['image_repo'] + full_repo = 'earthcubeingeo/{}'.format(img_repo) + + if not img_tag: + img_tag = manifest['image_tag'] + + # load image + image_file = str(extract_dir.joinpath(manifest['image'])) + img_id = self.dockerhelper.import_image(image_file,full_repo,img_tag) + + # add image to bucket + bucket['image'] = {"version":img_tag,"repo":img_repo,"org":"earthcubeingeo","image_id":img_id,"repodigest":''} + + # add mounts to bucket + for mount in manifest['mounts']: + # extract mount from tar file + with tarfile.open(str(extract_dir.joinpath(mount[0]))) as tar: + tar.extractall(path=str(extract_dir)) + local = extract_dir.joinpath(tar.getnames()[0]) + # add mount to bucket with original container path + self.add_storage(bucket_name,local.as_posix(),mount[1],permissions=mount[2]) + + bucket['import_dir'] = str(extract_dir) + self.save_config() + + # clean up image file + if remove_image_file: + os.remove(image_file) + + return + + + def bucket_diskspace(self, bucket_name): + ''' + Determine the amount of disk space used by a bucket + ''' + # get bucket + bucket = self.get_bucket(bucket_name) + + report = dict() + report['container'] = self.dockerhelper.get_container_size(bucket)/1.e6 + report['storage'] = list() + + total_size = 0.0 + for mount in bucket['storage']: + mount_size = self.dir_size(mount[0])/1.e6 + report['storage'].append({'local':mount[0],'size':mount_size}) + total_size += mount_size + + report['total_storage'] = total_size + + return report + + + def dir_size(self, directory): + ''' + Determine total size of directory in bytes, doesn't follow symlinks. 
+ ''' + total_size = 0 + for dirpath, dirnames, filenames in os.walk(directory): + for f in filenames: + fp = os.path.join(dirpath, f) + # skip if it is symbolic link + if not os.path.islink(fp): + total_size += os.path.getsize(fp) + return total_size + + + def list_buckets(self,names_only=False,bucket_name=None): + ''' + Generate a nicely formated string listing all the buckets and their statuses + ''' + if bucket_name is None: + if names_only: + print("{:<0}".format("Bucket Name")) + for name in self.bucket_names: + print("{:<0}".format(str(name))) + else: + + print("{:<20}{:<25}{:<25}".format("Bucket Name","Version","Status")) + for bucket in self.buckets: + name = self.__trim(str(bucket['name']),18) + image = self.__trim(str(bucket['image']['version']),23) + status = self.__trim(str(bucket['status']),23) + print("{:<20}{:<25}{:<25}".format(name, image, status)) - if container.status == 'running': - return True else: - return False + bucket = self.get_bucket(bucket_name) + print("%s\n%s\n" % (bucket['name'],'='*len(bucket['name']))) + print('Resen-core Version: ', bucket['image']['version']) + print('Status: ', bucket['status']) + print('Jupyter Token: ', bucket['jupyter']['token']) + print('Jupyter Port: ', bucket['jupyter']['port']) + if bucket['jupyter']['token']: + print("Jupyter lab URL: http://localhost:%s/?token=%s" % (bucket['jupyter']['port'],bucket['jupyter']['token'])) - def execute_command(self,container_id,command,detach=True): - container = self.get_container(container_id) - if container is None: - return False + print('\nStorage:') + print("{:<40}{:<40}{:<40}".format("Local","Bucket","Permissions")) + for mount in bucket['storage']: + print("{:<40}{:<40}{:<40}".format(mount[0], mount[1], mount[2])) - result = container.exec_run(command,detach=detach) - return result.exit_code, result.output + print('\nPorts:') + print("{:<15}{:<15}".format("Local","Bucket")) + for port in bucket['port']: + print("{:<15}{:<15}".format(port[0], port[1])) + return - def stop_container(self,container_id): - container = self.get_container(container_id) - if container is None: - return False + def update_bucket_statuses(self): + ''' + Update container status for all buckets + ''' + for bucket in self.buckets: + if bucket['container'] is None: + continue - container.stop() # this does nothing if already stopped - container.reload() - time.sleep(0.1) + status = self.dockerhelper.get_container_status(bucket) + bucket['status'] = status + self.save_config() - if container.status == 'exited': - return True + + def __get_valid_cores(self): + # TODO: download json file from resen-core github repo + # and if that fails, fallback to hardcoded list + return [{"version":"2019.1.0","repo":"resen-core","org":"earthcubeingeo", + "image_id":'sha256:5300c6652851f35d2fabf866491143f471a7e121998fba27a8dff6b3c064af35', + "repodigest":'sha256:a8ff4a65aa6fee6b63f52290c661501f6de5bf4c1f05202ac8823583eaad4296'},] + + + def _get_config_dir(self): + appname = 'resen' + + if 'APPDATA' in os.environ: + confighome = os.environ['APPDATA'] + elif 'XDG_CONFIG_HOME' in os.environ: + confighome = os.environ['XDG_CONFIG_HOME'] else: - return False + confighome = os.path.join(os.environ['HOME'],'.config') + configpath = os.path.join(confighome, appname) - def remove_container(self,container_id): - container = self.get_container(container_id) - if container is None: - return False + # TODO: add error checking + if not os.path.exists(configpath): + os.makedirs(configpath) - if not container.status == 'exited': - print("ERROR: 
Container is still running!") - return False + return configpath + + + def _get_home_dir(self): + appname = 'resen' + homedir = os.path.expanduser('~') - container.remove() + return os.path.join(homedir,appname) - return True + def __lock(self): + self.__lockfile = os.path.join(self.resen_root_dir,'lock') + if os.path.exists(self.__lockfile): + raise RuntimeError('Another instance of Resen is already running!') + with open(self.__lockfile,'w') as f: + f.write('locked') + self.__locked = True - # helper functions - def get_container_status(self, container_id): - container = self.get_container(container_id) - if container is None: - return False - container.reload() # maybe redundant + def __unlock(self): + if not self.__locked: + return + + try: + os.remove(self.__lockfile) + self.__locked = False + except FileNotFoundError: + pass + except Exception as e: + print("WARNING: Unable to remove lockfile: %s" % str(e)) - return container.status - # get a container object given a container id - def get_container(self,container_id): + def __detect_selinux(self): try: - container = self.docker.containers.get(container_id) - return container - except docker.errors.NotFound: - print("ERROR: No such container: %s" % container_id) - return None + p = Popen(['/usr/sbin/getenforce'], stdin=PIPE, stdout=PIPE, stderr=PIPE) + output, err = p.communicate() + output = output.decode('utf-8').strip('\n') + rc = p.returncode + + if rc == 0 and output == 'Enforcing': + return True + else: + return False + except FileNotFoundError: + return False + + + def __trim(self,string,length): + if len(string) > length: + return string[:length-3]+'...' + else: + return string + def __del__(self): + self.__unlock() + + + # TODO: def reset_bucket(self,bucket_name): + # used to reset a bucket to initial state (stop existing container, delete it, create new container) + # def list_cores(): # # list available docker images # # - list/pull docker image from docker hub @@ -870,19 +903,9 @@ def get_container(self,container_id): -# Configuration information: -# - store it in .json file somewhere -# - read the .json file and store config in config classes - - -# handle all of the bucket configuration info including reading -# and writing bucket config - def main(): - pass if __name__ == '__main__': - main() diff --git a/resen/__init__.py b/resen/__init__.py index a244ad8..c695fca 100644 --- a/resen/__init__.py +++ b/resen/__init__.py @@ -1,3 +1,3 @@ -__version__ = '2019.1.0rc2' +__version__ = '2019.1.0' from .Resen import Resen diff --git a/resen/resencmd.py b/resen/resencmd.py index bbb0426..d932bb2 100644 --- a/resen/resencmd.py +++ b/resen/resencmd.py @@ -18,6 +18,7 @@ import socket import pathlib import os +from pathlib import Path version = resen.__version__ @@ -28,49 +29,27 @@ def __init__(self,resen): cmd.Cmd.__init__(self) self.prompt = '[resen] >>> ' self.program = resen - # get current state of buckets - # --------------- resen stuff -------------------- - # try to create a bucket by guiding user - # if - def do_create_bucket(self,args): + + def do_create(self,args): """Usage: -create_bucket : Create a new bucket by responding to the prompts provided.""" +create : Create a new bucket by responding to the prompts provided.""" # First, ask user for bucket name print('Please enter a name for your bucket.') bucket_name = self.get_valid_name('>>> Enter bucket name: ') # First, ask user about the bucket they want to create - # resen-core version? 
- valid_versions = sorted([x['version'] for x in self.program.bucket_manager.valid_cores]) + valid_versions = sorted([x['version'] for x in self.program.valid_cores]) print('Please choose a version of resen-core.') docker_image = self.get_valid_version('>>> Select a version: ',valid_versions) - - # Figure out a port to use - local_port = self.get_port() - container_port = local_port - # Mounting persistent storage - msg = 'Local directories can be mounted to either /home/jovyan/work or ' - msg += '/home/jovyan/mount/ in a bucket. The /home/jovyan/work location is ' - msg += 'a workspace and /home/jovyan/mount/ is intended for mounting in data. ' - msg += 'You will have rw privileges to everything mounted in work, but can ' - msg += 'specified permissions as either r or rw for directories in mount. Code ' - msg += 'and data created in a bucket can ONLY be accessed outside the bucket or ' - msg += 'after the bucket has been deleted if it is saved in a mounted local directory.' + msg = 'Local directories can be mounted to /home/jovyan/mount in a bucket. ' + msg += 'You can specify either r or rw privileges for each directory mounted. ' print(msg) mounts = list() - # query for mount to work - answer = self.get_yn('>>> Mount storage to /home/jovyan/work? (y/n): ') - if answer == 'y': - local_path = self.get_valid_local_path('>>> Enter local path: ') - container_path = '/home/jovyan/work' - permissions = 'rw' - mounts.append([local_path,container_path,permissions]) - # query for mounts to mount answer = self.get_yn('>>> Mount storage to /home/jovyan/mount? (y/n): ') while answer == 'y': @@ -84,77 +63,52 @@ def do_create_bucket(self,args): msg = '>>> Start bucket and jupyterlab? (y/n): ' start = self.get_yn(msg) == 'y' - success = True - print("...adding core...") - status = self.program.add_image(bucket_name,docker_image) - success = success and status - if status: - status = self.program.add_port(bucket_name,local_port,container_port,tcp=True) - success = success and status - if status: - print("...adding mounts...") - for mount in mounts: - status = self.program.add_storage(bucket_name,mount[0],mount[1],mount[2]) - success = success and status - if not status: - print(" Failed to mount storage!") - - if success: + try: + self.program.create_bucket(bucket_name) + print("...adding core...") + self.program.set_image(bucket_name,docker_image) + print("...adding ports...") + self.program.add_port(bucket_name) + print("...adding mounts...") + for mount in mounts: + self.program.add_storage(bucket_name,mount[0],mount[1],mount[2]) + self.program.create_container(bucket_name) print("Bucket created successfully!") - if start: - # start bucket - status = self.program.start_bucket(bucket_name) - if not status: - return - # start jupyterlab - print("...starting jupyterlab...") - status = self.program.start_jupyter(bucket_name,local_port,container_port) - else: - print("Failed to create bucket!") - status = self.program.remove_bucket(bucket_name) - -# def do_start_bucket(self,args): -# """Usage: -# start_bucket bucket_name : Start bucket named bucket_name.""" -# inputs,num_inputs = self.parse_args(args) -# if num_inputs != 1: -# print("Syntax Error. Usage: start_bucket bucket_name") -# return - -# bucket_name = inputs[0] -# status = self.program.start_bucket(bucket_name) - -# def do_stop_bucket(self,args): -# """Usage: -# stop_bucket bucket_name : Stop bucket named bucket_name.""" -# inputs,num_inputs = self.parse_args(args) -# if num_inputs != 1: -# print("Syntax Error. 
Usage: stop_bucket bucket_name") -# return - -# bucket_name = inputs[0] -# status = self.program.stop_bucket(bucket_name) - - def do_remove_bucket(self,args): + except Exception as e: + print("Bucket creation failed!") + print(e) + return + + if start: + # start bucket + self.program.start_bucket(bucket_name) + print("...starting jupyterlab...") + self.program.start_jupyter(bucket_name) + + + def do_remove(self,args): """Usage: -remove_bucket bucket_name : Remove bucket named bucket_name.""" +remove bucket_name : Remove bucket named bucket_name.""" inputs,num_inputs = self.parse_args(args) if num_inputs != 1: print("Syntax Error. Usage: remove_bucket bucket_name") return bucket_name = inputs[0] - status = self.program.remove_bucket(bucket_name) + try: + self.program.remove_bucket(bucket_name) + except (ValueError, RuntimeError) as e: + print(e) + return - def do_status(self,args): + + def do_list(self,args): """Usage: ->>> status \t\t: Print the status of all resen buckets. ->>> status --names \t: Print only bucket names. ->>> status bucket_name \t: Print status of bucket with name "bucket_name" +>>> list \t\t: List all resen buckets. +>>> list --names \t: Print only bucket names. """ inputs,num_inputs = self.parse_args(args) names_only = False - bucket_name = None if num_inputs == 0: pass elif num_inputs == 1: @@ -164,17 +118,32 @@ def do_status(self,args): else: print("Syntax Error. See 'help status'.") return - else: - bucket_name = inputs[0] else: print("Syntax Error. See 'help status'.") return - + + status = self.program.list_buckets(names_only=names_only) + + + def do_status(self,args): + """Usage: +>>> status bucket_name \t: Print status of bucket with name "bucket_name" + """ + inputs,num_inputs = self.parse_args(args) + names_only = False + bucket_name = None + if num_inputs == 1: + bucket_name = inputs[0] + else: + print("Syntax Error. See 'help status'.") + return + status = self.program.list_buckets(names_only=names_only,bucket_name=bucket_name) - def do_start_jupyter(self,args): + + def do_start(self,args): """Usage: ->>> start_jupyter bucket_name : Start jupyter on bucket bucket_name +>>> start bucket_name : Start jupyter on bucket bucket_name """ inputs,num_inputs = self.parse_args(args) @@ -184,134 +153,180 @@ def do_start_jupyter(self,args): print("Syntax Error. See 'help start_jupyter'.") return - # get bucket name from input bucket_name = inputs[0] + try: + self.program.start_bucket(bucket_name) # does nothing if bucket already started + self.program.start_jupyter(bucket_name) + except (ValueError, RuntimeError) as e: + print(e) + return - if not bucket_name in self.program.bucket_manager.bucket_names: - print("ERROR: Bucket with name: %s does not exist!" 
% bucket_name) - return False - - # get bucket infomrmation (ports and status) - # This stuff may be better suited to exist in some kind of "status query" inside of Resen.py - ind = self.program.bucket_manager.bucket_names.index(bucket_name) - bucket = self.program.bucket_manager.buckets[ind] - # This automatically selects the first port in the list of ports - # TODO: Manage multiple ports assigned to one bucket - ports = bucket['docker']['port'][0] - running_status = bucket['docker']['status'] - - - # if bucket is not running, first start bucket - if running_status != 'running': - status = self.program.start_bucket(bucket_name) - # check if jupyter server running + def do_stop(self,args): + """Usage: +stop bucket_name : Stop jupyter on bucket bucket_name.""" + inputs,num_inputs = self.parse_args(args) + if num_inputs != 1: + print("Syntax Error. Usage: stop bucket_name") + return - # then start jupyter - status = self.program.start_jupyter(bucket_name,ports[0],ports[1]) + bucket_name = inputs[0] + try: + self.program.stop_jupyter(bucket_name) + self.program.stop_bucket(bucket_name) + except (ValueError, RuntimeError) as e: + print(e) + return - def do_stop_jupyter(self,args): + def do_export(self,args): """Usage: -stop_jupyter bucket_name : Stop jupyter on bucket bucket_name.""" +export bucket_name: Export bucket to a sharable *.tar file.""" inputs,num_inputs = self.parse_args(args) if num_inputs != 1: - print("Syntax Error. Usage: stop_bucket bucket_name") + print("Syntax Error. Usage: export bucket_name") + return bucket_name = inputs[0] - status = self.program.stop_jupyter(bucket_name) - status = self.program.stop_bucket(bucket_name) - - -# def do_add_storage(self,args): -# """Usage: -# >>> add_storage bucket_name local_path container_path permissions : Add a local_path storage location available at container_path. -# use "" for paths with spaces in them -# - permissions should be 'r' or 'rw' -# """ -# inputs,num_inputs = self.parse_args(args) -# if num_inputs != 4: -# print("Syntax Error. Usage: add_storage bucket_name local_path container_path permissions") -# return -# bucket_name = inputs[0] -# local_path = inputs[1] -# container_path = inputs[2] -# permissions = inputs[3] - -# status = self.program.add_storage(bucket_name,local_path,container_path,permissions) - -# def do_remove_storage(self,args): -# """Usage: -# >>> remove_storage bucket_name local_path : Remove the local_path storage location in bucket bucket_name. -# use "" for paths with spaces in them -# """ -# inputs,num_inputs = self.parse_args(args) -# if num_inputs != 2: -# print("Syntax Error. Usage: remove_storage bucket_name local_path") -# return -# bucket_name = inputs[0] -# local_path = inputs[1] - -# status = self.program.remove_storage(bucket_name,local_path) - -# def do_add_port(self,args): -# """Usage: -# >>> add_port bucket_name local_port container_port\t: Map container_port available at local_port. -# >>> add_port bucket_name local_port container_port --udp\t: Map container_port available at local_port. -# """ -# inputs,num_inputs = self.parse_args(args) - -# tcp = True -# if num_inputs == 3: -# pass -# elif num_inputs == 4: -# if inputs[3][0] == '-': -# if inputs[3] == '--udp': -# tcp = False -# else: -# print("Syntax Error. See 'help add_port'") -# return -# else: -# print("Syntax Error. 
See 'help add_port'") -# return - -# bucket_name = inputs[0] -# local_port = int(inputs[1]) -# container_port = int(inputs[2]) - -# status = self.program.add_port(bucket_name,local_port,container_port,tcp=tcp) - -# def do_remove_port(self,args): -# """Usage: -# >>> remove_port bucket_name local_port : Remove the local_port mapping from bucket bucket_name. -# """ -# inputs,num_inputs = self.parse_args(args) -# if num_inputs != 2: -# print("Syntax Error. Usage: remove_port bucket_name local_port") -# return -# bucket_name = inputs[0] -# local_port = int(inputs[1]) - -# status = self.program.remove_port(bucket_name,local_port) - - # def do_import(self): - # """import : Print the status of all resen buckets.""" - # pass - - # def do_export(self): - # """export : Print the status of all resen buckets.""" - # pass - - # def do_freeze(self): - # """freeze : Print the status of all resen buckets.""" - # pass + + file_name = self.get_valid_local_path('>>> Enter name for output tgz file: ', is_file=True) + + print('By default, the output image will be named "{}" and tagged "latest".'.format(bucket_name.lower())) + rsp = self.get_yn(">>> Would you like to change the name and tag? (y/n): ") + if rsp=='y': + img_name = self.get_valid_tag(">>> Image name: ") + img_tag = self.get_valid_tag(">>> Image tag: ") + else: + img_name = None + img_tag = None + + report = self.program.bucket_diskspace(bucket_name) + + # identify storage locations to exclude + exclude_list = [] + total_size = 0. + if len(report['storage']) > 0: + print("The following local directories are mounted to the bucket (total %s MB):" % int(report['total_storage'])) + for mount in report['storage']: + print(mount['local']) + msg = '>>> Would you like to include all of these in the exported bucket? (y/n): ' + rsp = self.get_yn(msg) + if rsp == 'n': + for mount in report['storage']: + rsp = self.get_yn(">>> Include %s [%s MB]? (y/n): " % (mount['local'], mount['size'])) + if rsp == 'n': + exclude_list.append(mount['local']) + else: + total_size += mount['size'] + else: + total_size = report['total_storage'] + + # Find the maximum output file size and required disk space for bucket export + output = report['container'] + total_size + required = max(report['container']*3., output*2.) + + print('This export could require up to %s MB of disk space to complete and will produce an output file up to %s MB.' % (int(required), int(output))) + # msg = '>>> Are you sure you would like to continue? (y/n): ' + rsp = self.get_yn('>>> Are you sure you would like to continue? (y/n): ') + + + try: + print('Exporting bucket %s. This will take several mintues.' % bucket_name) + self.program.export_bucket(bucket_name, file_name, exclude_mounts=exclude_list, img_repo=img_name, img_tag=img_tag) + except (ValueError, RuntimeError) as e: + print(e) + return + + + def do_import(self,args): + """Usage: +import : Import a bucket from a .tgz file by providing input.""" + + print('Please enter a name for your bucket.') + bucket_name = self.get_valid_name('>>> Enter bucket name: ') + + file_name = self.get_valid_local_path('>>> Enter name for input tar file: ', is_file=True) + + rsp = self.get_yn(">>> Would you like to keep the default name and tag for the imported image? 
(y/n): ") + if rsp=='n': + img_name = self.get_valid_tag(">>> Image name: ") + img_tag = self.get_valid_tag(">>> Image tag: ") + else: + img_name = None + img_tag = None + + resen_home_dir = self.program.resen_home_dir + default_import = os.path.join(resen_home_dir,bucket_name) + print("The default directory to extract the bucket metadata and mounts to is {}.".format(default_import)) + rsp = self.get_yn(">>> Would you like to specify and alternate directory? (y/n): ") + if rsp=='y': + while True: + extract_dir = input('>>> Enter path to directory: ') + if not os.path.exists(extract_dir): + rsp = self.get_yn(">>> Directory does not exist. Create it? (y/n): ") + if rsp=='y': + try: + os.makedirs(extract_dir) + break + except: + print('Invalid: Directory cannot be created!') + else: + dir_contents = os.listdir(extract_dir) + if len(dir_contents) == 0: + break + print("Invalid: Directory must be empty!") + else: + extract_dir = default_import + + # query for aditional mounts + mounts = list() + answer = self.get_yn('>>> Mount additional storage to the imported bucket? (y/n): ') + while answer == 'y': + local_path = self.get_valid_local_path('>>> Enter local path: ') + container_path = self.get_valid_container_path('>>> Enter bucket path: ','/home/jovyan/mount') + permissions = self.get_permissions('>>> Enter permissions (r/rw): ') + mounts.append([local_path,container_path,permissions]) + answer = self.get_yn('>>> Mount additional storage to /home/jovyan/mount? (y/n): ') + + # should we start jupyterlab when done creating bucket? + msg = '>>> Start bucket and jupyterlab? (y/n): ' + start = self.get_yn(msg) == 'y' + + # should we clean up the bucket archive? + msg = '>>> Remove %s after successful import? (y/n): ' % str(file_name) + remove_archive = self.get_yn(msg) == 'y' + + try: + self.program.import_bucket(bucket_name,file_name,extract_dir=extract_dir, + img_repo=img_name,img_tag=img_tag,remove_image_file=True) + self.program.add_port(bucket_name) + for mount in mounts: + self.program.add_storage(bucket_name,mount[0],mount[1],mount[2]) + self.program.create_container(bucket_name, give_sudo=False) + except (ValueError, RuntimeError) as e: + print('Bucket import failed!') + print(e) + return + + if start: + # start bucket + try: + self.program.start_bucket(bucket_name) + print("...starting jupyterlab...") + self.program.start_jupyter(bucket_name) + except Exception as e: + print(e) + return + + if remove_archive: + print("Deleting %s as requested." % str(file_name)) + os.remove(file_name) def do_quit(self,arg): """quit : Terminates the application.""" - # turn off currently running buckets or leave them running? leave running but + # turn off currently running buckets or leave them running? leave running but print("Exiting") return True @@ -354,14 +369,11 @@ def get_valid_name(self,msg): print("Bucket names must be less than 20 characters.") elif not name[0].isalpha(): print("Bucket names must start with an alphabetic character.") + elif name in self.program.bucket_names: + print("Cannot use the same name as an existing bucket!") else: - # check if bucket with that name already exists - # Is the only reason create_bucket fails if the name is already take? 
May need a more rigerous check - status = self.program.create_bucket(name) - if status: - return name - else: - print("Cannot use the same name as an existing bucket!") + return name + def get_valid_version(self,msg,valid_versions): print('Available versions: {}'.format(", ".join(valid_versions))) @@ -373,27 +385,18 @@ def get_valid_version(self,msg,valid_versions): print("Invalid version. Available versions: {}".format(", ".join(valid_versions))) - def get_port(self): - # this is not atomic, so it is possible that another process might snatch up the port - port = 9000 - assigned_ports = [x['docker']['port'][0] for x in self.program.bucket_manager.buckets if len(x['docker']['port'])] - while True: - with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: - assigned = s.connect_ex(('localhost', port)) == 0 - if not assigned and not port in assigned_ports: - return port - else: - port += 1 - - def get_valid_local_path(self,msg): + def get_valid_local_path(self,msg,is_file=False): while True: path = input(msg) path = pathlib.PurePosixPath(path) if os.path.isdir(str(path)): return str(path) + elif is_file and os.path.isdir(str(path.parent)): + return str(path) else: print('Cannot find local path entered.') + def get_valid_container_path(self,msg,base): while True: path = input(msg) @@ -403,6 +406,7 @@ def get_valid_container_path(self,msg,base): else: print("Invalid path. Must start with: {}".format(base)) + def get_permissions(self,msg): valid_inputs = ['r', 'rw'] while True: @@ -413,6 +417,23 @@ def get_permissions(self,msg): print("Invalid input. Valid input are {} or {}.".format(valid_inputs[0],valid_inputs[1])) + def get_valid_tag(self,msg): + while True: + tag = input(msg) + + # check if bucket_name has spaces in it and is greater than 20 characters + # also bucket name must start with a letter + if ' ' in tag: + print("Tags may not contain spaces.") + elif len(tag) > 128: + print("Tags must be less than 128 characters.") + elif not tag[0].isalpha(): + print("Tags must start with an alphabetic character.") + elif not tag.islower(): + print("Tags may only contain lower case letters.") + else: + return tag + @@ -451,7 +472,4 @@ def main(): if __name__ == '__main__': - main() - - diff --git a/setup.py b/setup.py index 64000d6..0e8ddc4 100644 --- a/setup.py +++ b/setup.py @@ -23,7 +23,7 @@ # Versions should comply with PEP440. For a discussion on single-sourcing # the version across setup.py and the project code, see # https://packaging.python.org/en/latest/single_source_version.html - version='2019.1.0rc2', #TODO: parse this from resen/__init__.py + version='2019.1.0', #TODO: parse this from resen/__init__.py description='A python package for watching, copying, and transporting files around.', long_description=long_description,