Setup tools #2

Open · wants to merge 11 commits into master
13 changes: 13 additions & 0 deletions neuroscience/.gitignore
@@ -0,0 +1,13 @@
.idea
.DS_Store
*.pyc
hosts
hosts.*
!hosts.template
aws_credentials
ansible_setup.retry
venv
*.pem
*_bvals
*_bvecs
*.nii.gz
134 changes: 117 additions & 17 deletions neuroscience/Readme.md
@@ -1,7 +1,61 @@
# Installation
## Virtual environment
```
virtualenv -p python venv  # run once to create the Python virtual environment
source venv/bin/activate
```
More information about virtual environments is available [here](https://virtualenv.pypa.io/en/stable/).

## Dependencies
```
pip install -r requirements.txt
```
Note: if you need to add any dependencies to the project, install them with `pip install ...` and remember to
update the requirements file with `pip freeze > requirements.txt`. More information about the `pip` tool can be found at
[https://pip.pypa.io/en/stable/user_guide/](https://pip.pypa.io/en/stable/user_guide/).

## Ansible
A great way of deploying the app is by using [Ansible](http://docs.ansible.com/ansible/latest/index.html).

First, install Ansible locally; [this tutorial](http://docs.ansible.com/ansible/latest/intro_installation.html) walks through the installation.
Second, prepare two files, `hosts` and `aws_credentials`. The `hosts` file tells Ansible which servers to deploy the app to.
An example `hosts` file for AWS EC2 servers:
```
[masters]
18.196.69.101 ansible_user=ubuntu ansible_ssh_private_key_file=tests.pem

[slaves]
18.196.69.102 ansible_user=ubuntu ansible_ssh_private_key_file=tests.pem
18.196.69.103 ansible_user=ubuntu ansible_ssh_private_key_file=tests.pem
18.196.69.104 ansible_user=ubuntu ansible_ssh_private_key_file=tests.pem
```
The IP address at the beginning of each line identifies the server; `tests.pem` in the example is the local path to the PEM key assigned to that instance.

For an Azure cloud cluster, this file may look as follows:
```
[masters]
master ansible_host=test.westus2.cloudapp.azure.com ansible_port=50001 ansible_user=ubuntu ansible_ssh_pass=PASS

[slaves]
slave1 ansible_host=test.westus2.cloudapp.azure.com ansible_port=50002 ansible_user=ubuntu ansible_ssh_pass=PASS
slave2 ansible_host=test.westus2.cloudapp.azure.com ansible_port=50003 ansible_user=ubuntu ansible_ssh_pass=PASS
slave3 ansible_host=test.westus2.cloudapp.azure.com ansible_port=50004 ansible_user=ubuntu ansible_ssh_pass=PASS
slave4 ansible_host=test.westus2.cloudapp.azure.com ansible_port=50005 ansible_user=ubuntu ansible_ssh_pass=PASS
slave5 ansible_host=test.westus2.cloudapp.azure.com ansible_port=50006 ansible_user=ubuntu ansible_ssh_pass=PASS
```
Note: inlining passwords in the `hosts` file may not be safe, and password authentication requires `sshpass` on the local machine.

The `aws_credentials` file should contain valid HCP S3 credentials (see `aws_credentials.template`). It will be copied to `~/.aws/credentials` on each
deployed server.

After installing Ansible and preparing `hosts` and `aws_credentials`, you can deploy the app on all servers with:
```
ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i ./hosts ansible_setup.yml
```
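
Before deploying, you can optionally verify that Ansible can reach every host. This uses Ansible's standard `ping` module; it is not part of this repo's playbooks:
```
ANSIBLE_HOST_KEY_CHECKING=False ansible all -i ./hosts -m ping
```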

# Data
Data for the neuroscience use case is available at
[HCP](https://wiki.humanconnectome.org/display/PublicData/How+To+Connect+to+Connectome+Data+via+AWS)

## Get the data from AWS S3
We assume that you have a file '.aws/credentials',
@@ -12,19 +66,65 @@ AWS_ACCESS_KEY_ID=XXXXXXXXXXXXXXXX
AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXX
```

The following code excerpt shows how to access the data from the `hcp-openaccess` S3 bucket:
```python
import os.path as op

import boto3

# Use the [hcp] profile from ~/.aws/credentials.
boto3.setup_default_session(profile_name='hcp')
s3 = boto3.resource('s3')
bucket = s3.Bucket('hcp-openaccess')

# Local path -> key in the bucket, for one example subject.
data_files = {'./bvals': 'HCP/994273/T1w/Diffusion/bvals',
              './bvecs': 'HCP/994273/T1w/Diffusion/bvecs',
              './data.nii.gz': 'HCP/994273/T1w/Diffusion/data.nii.gz'}

for local_path, key in data_files.items():
    if not op.exists(local_path):
        bucket.download_file(key, local_path)
```
Test data consists of three files: `bvals`, `bvecs` and `data.nii.gz`. The following command lists the available cases:
```
aws s3 ls s3://hcp-openaccess/HCP/ --profile hcp
```


The utility `common/download.py` can be used to download them to the project directory. Note: the files for a single case total about 1.3 GB.
```
python common/download.py 100307
```
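
The same utility can also be used programmatically; a minimal sketch, assuming it is run from the `neuroscience` directory so that `common` is importable:
```python
# Minimal sketch: programmatic use of common/download.py.
# Assumes a valid [hcp] profile in ~/.aws/credentials.
from common.download import download, remove

download('100307')  # fetches ./100307_bvals, ./100307_bvecs, ./100307_data.nii.gz
# ... run the analysis ...
remove('100307')    # deletes the ~1.3 GB of local files again
```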

# Running
## Local
```
python ref/main.py 100307
```

## Dask
Start the Dask scheduler on one machine with:
```
dask-scheduler
```

On any number of other machines, start a worker process with the following command (replace `SCHEDULER_IP` with the IP address of the scheduler machine):
```
dask-worker SCHEDULER_IP:8786
```

The Ansible playbooks `ansible_dask_start.yml` and `ansible_dask_stop.yml` automate starting and stopping the Dask cluster described in the `hosts` file.
Starting Dask on all machines of the cluster:
```
ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i ./hosts ansible_dask_start.yml
```

Stopping Dask:
```
ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i ./hosts ansible_dask_stop.yml
```

To run the benchmark, run the following command on the scheduler machine:
```
python dask/main.py SUBJECT_ID [SUBJECT_ID ...]
```

Note: if you forward port 8786 from the scheduler machine to your own machine, you can schedule tasks from your PC.
Port forwarding can be done with `ssh`: `ssh -L 8786:127.0.0.1:8786 ...`.
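
A minimal sketch of scheduling work from your own machine, assuming the `dask.distributed` client package is installed locally and the `ssh` tunnel above is in place:
```python
# Minimal sketch, assuming dask.distributed is installed locally and
# port 8786 is forwarded from the scheduler (see the ssh command above).
from dask.distributed import Client

client = Client('127.0.0.1:8786')  # address of the (forwarded) scheduler

# Submit a trivial task to confirm workers pick up work.
future = client.submit(lambda x: x + 1, 41)
print(future.result())                          # prints 42
print(len(client.scheduler_info()['workers']))  # number of connected workers
```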

## Spark
Start the Spark master process on one machine:
```
~/spark/sbin/start-master.sh
```

Start Spark slaves on any number of machines:
```
~/spark/sbin/start-slave.sh spark://MASTER_IP:7077
```

Start the computation by running the following on the master machine:
```
python spark/main.py SUBJECT_ID [SUBJECT_ID ...]
```
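
For comparison, a minimal PySpark sketch of connecting to the standalone master (assuming `pyspark` is installed; `MASTER_IP` is a placeholder as above):
```python
# Minimal sketch, assuming pyspark is installed and a standalone master
# is running at spark://MASTER_IP:7077 (placeholder address).
from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster('spark://MASTER_IP:7077').setAppName('smoke-test')
sc = SparkContext(conf=conf)

# Distribute a small computation to confirm the workers respond.
rdd = sc.parallelize(range(100))
print(rdd.sum())  # 4950
sc.stop()
```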
33 changes: 33 additions & 0 deletions neuroscience/ansible_dask_start.yml
@@ -0,0 +1,33 @@
- hosts: masters
  gather_facts: false
  tasks:
    - name: Check IP
      command: bash -c "ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | cut -f2 -d':' | cut -f1 -d' '"
      register: ifconfig
    - name: Save masters ip to variable
      set_fact: ip="{{ ifconfig.stdout }}"
    - debug: msg="{{ ip }}"
    - name: Create dask directory
      file:
        state: directory
        path: "/home/{{ ansible_user }}/dask"
        mode: 0755
    # start-stop-daemon daemonizes dask-scheduler, records its pid for the
    # stop playbook, and redirects output to a log file.
    - name: Start dask scheduler
      shell: 'start-stop-daemon --start --quiet --make-pidfile --pidfile ~/dask/scheduler.pid --background --startas /bin/bash -- -c "exec dask-scheduler >~/dask/scheduler.log 2>&1"'
      register: verification
      changed_when: verification.rc == 0
      failed_when: verification.rc not in [0,1]
- hosts: all
  strategy: free
  gather_facts: false
  tasks:
    - name: Create dask directory
      file:
        state: directory
        path: "/home/{{ ansible_user }}/dask"
        mode: 0755
    - name: Start dask worker
      shell: "start-stop-daemon --start --quiet --make-pidfile --pidfile ~/dask/worker.pid --background --startas /bin/bash -- -c 'cd ~/dask && exec dask-worker {{ hostvars['master']['ip'] }}:8786 --local-directory ~/dask/worker-space >~/dask/worker.log 2>&1'"
      register: verification
      changed_when: verification.rc == 0
      failed_when: verification.rc not in [0,1]
25 changes: 25 additions & 0 deletions neuroscience/ansible_dask_stop.yml
@@ -0,0 +1,25 @@
- hosts: all
  strategy: free
  gather_facts: false
  tasks:
    - name: Stop dask worker
      shell: "start-stop-daemon --stop --quiet --pidfile ~/dask/worker.pid"
      register: verification
      changed_when: verification.rc == 0
      failed_when: verification.rc not in [0,1]
- hosts: masters
  gather_facts: false
  tasks:
    - name: Stop dask scheduler
      shell: 'start-stop-daemon --stop --quiet --pidfile ~/dask/scheduler.pid'
      register: verification
      changed_when: verification.rc == 0
      failed_when: verification.rc not in [0,1]
- hosts: all
  strategy: free
  gather_facts: false
  tasks:
    - name: Clean up working directory
      file:
        state: absent
        path: "/home/{{ ansible_user }}/dask"
60 changes: 60 additions & 0 deletions neuroscience/ansible_setup.yml
@@ -0,0 +1,60 @@
- hosts: all
  strategy: free
  name: Clone github repo and install dependencies.
  vars:
    homedir: '/home/{{ ansible_user }}'
    rootdir: '{{ homedir }}/image_analytics'
    workdir: '{{ rootdir }}/neuroscience'
  gather_facts: false
  pre_tasks:
    - name: install python
      become: true
      raw: bash -c "test -e /usr/bin/python || (apt -y update && apt install -y python-minimal)"
      register: output
      changed_when: output.stdout != ""
  tasks:
    - name: check to see if pip is already installed
      command: "pip --version"
      ignore_errors: true
      register: pip_is_installed
      changed_when: false
    - block:
        - name: download get-pip.py
          get_url: url=https://bootstrap.pypa.io/get-pip.py dest=/tmp
        - name: install pip
          become: true
          command: "python /tmp/get-pip.py"
        - name: delete get-pip.py
          file: state=absent path=/tmp/get-pip.py
      when: pip_is_installed.rc != 0
    - name: Install apt packages
      become: true
      apt: name={{item}} state=present
      with_items:
        - git
        - python-dev
        - gcc
        - openjdk-8-jdk-headless
    - name: Clone repository from github
      git:
        repo: 'https://github.com/tpawlowski/image_analytics.git'
        dest: '{{ rootdir }}'
        update: yes
        force: yes
    - name: Install requirements
      become: true
      pip:
        requirements: '{{ workdir }}/requirements.txt'
    - name: Create aws configuration directory
      file:
        path: '{{ homedir }}/.aws'
        state: directory
        mode: 0755
    - name: Copy aws_credentials
      copy:
        src: aws_credentials
        dest: '{{ homedir }}/.aws/credentials'
        mode: 0600
    - name: Install spark
      import_role:
        name: spark
29 changes: 29 additions & 0 deletions neuroscience/ansible_spark_start.yml
@@ -0,0 +1,29 @@
- hosts: masters
  gather_facts: false
  tasks:
    - name: Check IP
      command: bash -c "ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | cut -f2 -d':' | cut -f1 -d' '"
      register: ifconfig
    - name: Save masters ip to variable
      set_fact: ip="{{ ifconfig.stdout }}"
    - debug: msg="{{ ip }}"
    - name: Check spark master
      shell: 'jps | grep Master'
      register: jps_master
      changed_when: jps_master.stdout == ''
      failed_when: jps_master.rc not in [0,1]
    - name: Start spark master
      shell: '~/spark/sbin/start-master.sh'
      when: jps_master.stdout == ''
- hosts: all
  strategy: free
  gather_facts: false
  tasks:
    - name: Check spark worker
      shell: 'jps | grep Worker'
      register: jps_worker
      changed_when: jps_worker.stdout == ''
      failed_when: jps_worker.rc not in [0,1]
    - name: Start spark slave
      shell: "~/spark/sbin/start-slave.sh spark://{{ hostvars['master']['ip'] }}:7077"
      when: jps_worker.stdout == ''
23 changes: 23 additions & 0 deletions neuroscience/ansible_spark_stop.yml
@@ -0,0 +1,23 @@
- hosts: all
  strategy: free
  gather_facts: false
  tasks:
    - name: Check spark worker
      shell: 'jps | grep Worker'
      register: jps_worker
      changed_when: jps_worker.stdout != ''
      failed_when: jps_worker.rc not in [0,1]
    - name: Stop spark slave
      shell: "~/spark/sbin/stop-slave.sh"
      when: jps_worker.stdout != ''
- hosts: masters
  gather_facts: false
  tasks:
    - name: Check spark master
      shell: 'jps | grep Master'
      register: jps_master
      changed_when: jps_master.stdout != ''
      failed_when: jps_master.rc not in [0,1]
    - name: Stop spark master
      shell: '~/spark/sbin/stop-master.sh'
      when: jps_master.stdout != ''
3 changes: 3 additions & 0 deletions neuroscience/aws_credentials.template
@@ -0,0 +1,3 @@
[hcp]
aws_access_key_id = XXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXX
Empty file added neuroscience/common/__init__.py
32 changes: 32 additions & 0 deletions neuroscience/common/download.py
@@ -0,0 +1,32 @@
import os.path as op
import os
import boto3
import sys


def download(subject_id, output_directory='.'):
    """Download bvals, bvecs and data.nii.gz for one HCP subject."""
    boto3.setup_default_session(profile_name='hcp')
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('hcp-openaccess')

    for file_name in ['bvals', 'bvecs', 'data.nii.gz']:
        output_path = '{0}/{1}_{2}'.format(output_directory, subject_id, file_name)
        if op.exists(output_path):
            print("Using previously downloaded {0}".format(output_path))
        else:
            source_path = 'HCP/{0}/T1w/Diffusion/{1}'.format(subject_id, file_name)
            print("Downloading {0} to {1}".format(source_path, output_path))
            bucket.download_file(source_path, output_path)
            print("Done")


def remove(subject_id, output_directory='.'):
    """Delete previously downloaded files for one subject."""
    for file_name in ['bvals', 'bvecs', 'data.nii.gz']:
        output_path = '{0}/{1}_{2}'.format(output_directory, subject_id, file_name)
        if op.exists(output_path):
            os.remove(output_path)


if __name__ == "__main__":
    for case_id in sys.argv[1:]:
        download(case_id)
5 changes: 5 additions & 0 deletions neuroscience/common/subjects.py
@@ -0,0 +1,5 @@
test_subjects = ["100307", "100408", "101006", "101107", "101309", "101410",
                 "101915", "102311", "102816", "103111", "103515", "105014",
                 "105115", "105216", "106016", "106319", "106521", "107321",
                 "108121", "108323", "108525", "108828", "109123", "110411",
                 "111312"]