Merge pull request #30 from bento-platform/rename-refact
refact: remove federation parts + rename to aggregation
davidlougheed authored Dec 19, 2022
2 parents 17b8d61 + 06e336e commit eea181d
Showing 44 changed files with 1,580 additions and 921 deletions.
2 changes: 1 addition & 1 deletion .coveragerc
@@ -1,3 +1,3 @@
[run]
include =
bento_federation_service/*
bento_aggregation_service/*
4 changes: 4 additions & 0 deletions .dockerignore
@@ -0,0 +1,4 @@
.coverage
.tox

env
32 changes: 32 additions & 0 deletions .github/workflows/build.yml
@@ -0,0 +1,32 @@
name: Build and push bento_aggregation_service
on:
  release:
    types: [ published ]
  pull_request:
    branches:
      - master
  push:
    branches:
      - master

jobs:
  build-push:
    runs-on: ubuntu-latest

    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Run Bento build action
        uses: bento-platform/bento_build_action@v0.9.3
        with:
          registry: ghcr.io
          registry-username: ${{ github.actor }}
          registry-password: ${{ secrets.GITHUB_TOKEN }}
          image-name: ghcr.io/bento-platform/bento_aggregation_service
          development-dockerfile: dev.Dockerfile
          dockerfile: Dockerfile
33 changes: 33 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,33 @@
name: Test

on:
  push:
    branches:
      - master
  pull_request:
    branches: [ master ]

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.8", "3.10"]

    steps:

      - uses: actions/checkout@v2
        with:
          submodules: true

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: python -m pip install -r requirements-dev.txt

      - name: Lint
        run: flake8 ./bento_aggregation_service ./tests
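
Version entries such as `3.10` in a `python-version` matrix are YAML scalars, and an unquoted `3.10` is resolved as a floating-point number, which makes it indistinguishable from `3.1`. The sketch below illustrates the effect in plain Python. It is only an analogy, not how the GitHub Actions runner is implemented, and `as_yaml_scalar` is a hypothetical helper that mimics YAML's float resolution for this one case.

```python
def as_yaml_scalar(text: str):
    """Roughly mimic how a YAML loader resolves a plain or quoted scalar.

    Hypothetical helper for illustration only: quoted scalars stay strings,
    while unquoted scalars that look like numbers are converted to float.
    """
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        return text[1:-1]  # quoted: keep as a string, quotes stripped
    try:
        return float(text)  # unquoted and numeric: becomes a float
    except ValueError:
        return text  # anything else stays a string


unquoted = as_yaml_scalar("3.10")
quoted = as_yaml_scalar('"3.10"')

print(unquoted)  # 3.1 (the trailing zero is lost)
print(quoted)    # 3.10
```

Quoting every matrix entry keeps the values as strings, so `"3.10"` reaches `actions/setup-python` unchanged.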
36 changes: 36 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,36 @@
name: Test

on:
  push:
    branches:
      - master
  pull_request:
    branches: [ master ]

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.8", "3.10"]

    steps:

      - uses: actions/checkout@v2
        with:
          submodules: true

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: python -m pip install -r requirements-dev.txt

      - name: Test
        run: coverage run -m unittest -v

      - name: Codecov
        run: codecov
2 changes: 1 addition & 1 deletion .idea/bento_federation_service.iml

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion .idea/misc.xml

Some generated files are not rendered by default.

28 changes: 28 additions & 0 deletions Dockerfile
@@ -0,0 +1,28 @@
FROM ghcr.io/bento-platform/bento_base_image:python-debian-latest

# Use uvicorn (instead of hypercorn) in production since I've found
# multiple benchmarks showing it to be faster - David L
RUN pip install --no-cache-dir poetry==1.2.2 "uvicorn[standard]==0.20.0"

WORKDIR /aggregation

COPY pyproject.toml pyproject.toml
COPY poetry.toml poetry.toml
COPY poetry.lock poetry.lock

# Install production dependencies
# Without --no-root, we get errors related to the code not being copied in yet.
# But we don't want the code here, otherwise Docker cache doesn't work well.
RUN poetry install --without dev --no-root

# Manually copy only what's relevant
# (Don't use .dockerignore, which allows us to have development containers too)
COPY bento_aggregation_service bento_aggregation_service
COPY LICENSE LICENSE
COPY README.md README.md
COPY run.py run.py

# Install the module itself, locally (similar to `pip install -e .`)
RUN poetry install --without dev

ENTRYPOINT ["python3", "run.py"]
2 changes: 0 additions & 2 deletions MANIFEST.in

This file was deleted.

24 changes: 3 additions & 21 deletions README.md
@@ -1,39 +1,21 @@
# Bento Federation Service
# Bento Aggregation Service

A service for federated search between Bento platform nodes.
A service for aggregating search results across Bento data services.

## Environment Variables

`DATABASE`: Defaults to `data/federation.db`

Keeps track of a list of peers to pass federated queries to.

`CHORD_DEBUG`: `true` (insecure) or `false`; default is `false`

`BENTO_FEDERATION_MODE`: `true` or `false`; default is `true`

If set to false, the peer-contacting process will be skipped entirely.

`CHORD_URL`: ex. `http://127.0.0.1:5000/`

By convention, this *should* have a trailing slash; however as of v0.9.1 this
is optional.

`CHORD_REGISTRY_URL`: ex. `http://127.0.0.1:5000/`

By convention, this *should* have a trailing slash; however as of v0.9.1 this
is optional.

`OIDC_DISCOVERY_URI`:
ex. `https://keycloak.example.og/auth/realms/master/.well-known/openid-configuration`

By convention *should not* have a trailing slash.

`PORT`: Specified when running via `./run.py`; defaults to `5000`

`SERVICE_URL_BASE_PATH`: Base URL fragment (e.g. `/test/`) for endpoints

Should usually be blank; set to non-blank to locally emulate a proxy prefix
like `/api/federation`.
like `/api/aggregation`.

`SOCKET`: Specifies Unix socket location for production deployment
6 changes: 6 additions & 0 deletions bento_aggregation_service/__init__.py
@@ -0,0 +1,6 @@
from importlib import metadata

__all__ = ["name", "__version__"]

name = __package__
__version__ = metadata.version(__package__)
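
`metadata.version(...)` raises `importlib.metadata.PackageNotFoundError` when the distribution is not installed, for example when the code is imported from a source checkout before `poetry install` has run. A minimal sketch of a defensive variant follows. The helper name `safe_version` and the fallback string are assumptions for illustration and are not part of this commit.

```python
from importlib import metadata


def safe_version(package: str, default: str = "0.0.0+unknown") -> str:
    """Return the installed version of `package`, or `default` if it is not installed.

    Hypothetical helper: the commit itself calls metadata.version() directly.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return default


# A distribution name that is assumed not to exist falls back to the default.
print(safe_version("this-package-should-not-exist-xyz123"))  # 0.0.0+unknown
```

Whether a fallback is desirable is a design choice: failing loudly at import time, as the committed code does, surfaces a broken installation earlier.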
70 changes: 70 additions & 0 deletions bento_aggregation_service/app.py
@@ -0,0 +1,70 @@
from __future__ import annotations

import bento_aggregation_service
import tornado.gen
import tornado.ioloop
import tornado.web

from datetime import datetime
from tornado.web import RequestHandler, url

from .constants import (
    SERVICE_ID,
    SERVICE_TYPE,
    SERVICE_NAME,
    PORT,
    BASE_PATH,
    CHORD_DEBUG,
    CHORD_URL_SET,
    DEBUGGER_PORT,
)
from .search.handlers.datasets import DatasetsSearchHandler
from .search.handlers.private_dataset import PrivateDatasetSearchHandler


# noinspection PyAbstractClass,PyAttributeOutsideInit
class ServiceInfoHandler(RequestHandler):
    async def get(self):
        # Spec: https://github.com/ga4gh-discovery/ga4gh-service-info
        self.write({
            "id": SERVICE_ID,
            "name": SERVICE_NAME,  # TODO: Should be globally unique?
            "type": SERVICE_TYPE,
            "description": "Aggregation service for a Bento platform node.",
            "organization": {
                "name": "C3G",
                "url": "https://www.computationalgenomics.ca"
            },
            "contactUrl": "mailto:david.lougheed@mail.mcgill.ca",
            "version": bento_aggregation_service.__version__
        })


class Application(tornado.web.Application):
    def __init__(self, base_path: str):
        super().__init__([
            url(f"{base_path}/service-info", ServiceInfoHandler),
            url(f"{base_path}/dataset-search", DatasetsSearchHandler),
            url(f"{base_path}/private/dataset-search/([a-zA-Z0-9\\-_]+)", PrivateDatasetSearchHandler),
        ])


application = Application(BASE_PATH)


def run():  # pragma: no cover
    if not CHORD_URL_SET:
        print(f"[{SERVICE_NAME} {datetime.utcnow()}] CHORD_URL is not set, terminating...")
        exit(1)

    if CHORD_DEBUG:
        try:
            # noinspection PyPackageRequirements,PyUnresolvedReferences
            import debugpy
            debugpy.listen(("0.0.0.0", DEBUGGER_PORT))
            print("debugger attached")
        except ImportError:
            print("debugpy not found")

    application.listen(PORT)
    tornado.ioloop.IOLoop.current().start()
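
The private dataset search route captures a dataset ID with the group `([a-zA-Z0-9\-_]+)`. The sketch below checks that pattern with the standard `re` module only. It assumes an example base path of `/api/aggregation` and uses `fullmatch` to approximate whole-path route matching. The helper `dataset_id_from_path` is hypothetical and does not exist in the service.

```python
import re
from typing import Optional

BASE_PATH = "/api/aggregation"  # assumed example value of SERVICE_URL_BASE_PATH

# Same pattern string as in Application.__init__, built with the example base path.
PRIVATE_SEARCH = re.compile(f"{BASE_PATH}/private/dataset-search/([a-zA-Z0-9\\-_]+)")


def dataset_id_from_path(path: str) -> Optional[str]:
    """Return the captured dataset ID, or None if the path does not match."""
    match = PRIVATE_SEARCH.fullmatch(path)
    return match.group(1) if match else None


print(dataset_id_from_path("/api/aggregation/private/dataset-search/abc-123_X"))  # abc-123_X
print(dataset_id_from_path("/api/aggregation/private/dataset-search/a/b"))        # None
print(dataset_id_from_path("/api/aggregation/private/dataset-search/"))           # None
```

Only letters, digits, hyphens, and underscores are accepted, so an ID containing a slash or an empty ID does not reach the handler.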
Original file line number Diff line number Diff line change
@@ -7,10 +7,8 @@
__all__ = [
    "BASE_PATH",
    "CHORD_DEBUG",
    "BENTO_FEDERATION_MODE",
    "CHORD_URL",
    "CHORD_HOST",
    "CHORD_REGISTRY_URL",

    "OIDC_DISCOVERY_URI",

@@ -24,11 +22,12 @@
    "SERVICE_ID",
    "SERVICE_NAME",

    "SERVICE_SOCKET",
    "PORT",
    "DEBUGGER_PORT",

    "INITIALIZE_IMMEDIATELY",
    "CHORD_URL_SET",

    "CHORD_URLS_SET",
    "USE_GOHAN",

    "TIMEOUT",
    "WORKERS",
@@ -55,12 +54,10 @@ def _env_url_trailing_slash(var: str) -> str:
BASE_PATH = os.environ.get("SERVICE_URL_BASE_PATH", "")

CHORD_DEBUG = _env_to_bool("CHORD_DEBUG")
BENTO_FEDERATION_MODE = _env_to_bool("BENTO_FEDERATION_MODE", default=True)

# Set CHORD_URL and CHORD_REGISTRY_URL to environment values, or blank if not
# available.
CHORD_URL = _env_url_trailing_slash("CHORD_URL")
CHORD_REGISTRY_URL = _env_url_trailing_slash("CHORD_REGISTRY_URL")

CHORD_HOST = urllib.parse.urlparse(CHORD_URL or "").netloc or ""
OIDC_DISCOVERY_URI = os.environ.get("OIDC_DISCOVERY_URI")
@@ -69,16 +66,18 @@ def _env_url_trailing_slash(var: str) -> str:

SERVICE_ORGANIZATION = "ca.c3g.bento"
SERVICE_ARTIFACT = "federation"
SERVICE_TYPE_NO_VERSION = f"{SERVICE_ORGANIZATION}:{SERVICE_ARTIFACT}"
SERVICE_TYPE = f"{SERVICE_TYPE_NO_VERSION}:{__version__}"
SERVICE_ID = os.environ.get("SERVICE_ID", SERVICE_TYPE_NO_VERSION)
SERVICE_NAME = "Bento Federation Service"

SERVICE_SOCKET = os.environ.get("SERVICE_SOCKET", "/tmp/federation.sock")

INITIALIZE_IMMEDIATELY = _env_to_bool("INITIALIZE_IMMEDIATELY", default=True)

CHORD_URLS_SET = CHORD_URL != "" and CHORD_REGISTRY_URL != ""
SERVICE_TYPE = {
    "group": "ca.c3g.bento",
    "artifact": SERVICE_ARTIFACT,
    "version": __version__,
}
SERVICE_ID = os.environ.get("SERVICE_ID", ":".join(list(SERVICE_TYPE.values())[:2]))
SERVICE_NAME = "Bento Aggregation Service"

PORT = int(os.environ.get("PORT", "5000"))
DEBUGGER_PORT = int(os.environ.get("DEBUGGER_PORT", "5879"))

CHORD_URL_SET = CHORD_URL != ""

USE_GOHAN = _env_to_bool("USE_GOHAN")
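
The hunks above call `_env_to_bool` and `_env_url_trailing_slash`, whose bodies are not shown in this diff. The following is a minimal sketch of how such helpers might behave, based only on the README's description (`true` or `false` values, an optional trailing slash on URLs). The function names without the leading underscore and all bodies are assumptions, not the service's actual code.

```python
import os


def env_to_bool(var: str, default: bool = False) -> bool:
    """Read a `true`/`false` environment variable (hypothetical stand-in for _env_to_bool)."""
    raw = os.environ.get(var, "").strip().lower()
    if raw == "":
        return default  # unset or blank: fall back to the default
    return raw == "true"


def env_url_trailing_slash(var: str) -> str:
    """Read a URL and append a trailing slash if missing; a blank value stays blank."""
    url = os.environ.get(var, "").strip()
    if url and not url.endswith("/"):
        url += "/"
    return url


# Example values, set here only for demonstration.
os.environ["CHORD_URL"] = "http://127.0.0.1:5000"
os.environ["CHORD_DEBUG"] = "true"
os.environ.pop("USE_GOHAN", None)

chord_url = env_url_trailing_slash("CHORD_URL")
print(chord_url)                   # http://127.0.0.1:5000/
print(chord_url != "")             # True, mirrors the CHORD_URL_SET check
print(env_to_bool("CHORD_DEBUG"))  # True
print(env_to_bool("USE_GOHAN"))    # False (unset, so the default applies)
```

Normalizing the URL once at import time means later code can concatenate paths without checking for a slash each time.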

File renamed without changes.
7 changes: 7 additions & 0 deletions bento_aggregation_service/search/constants.py
@@ -0,0 +1,7 @@
from __future__ import annotations

from bento_aggregation_service.constants import CHORD_HOST

__all__ = ["DATASET_SEARCH_HEADERS"]

DATASET_SEARCH_HEADERS = {"Host": CHORD_HOST}
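
`DATASET_SEARCH_HEADERS` sets the `Host` header from `CHORD_HOST`, which the constants module derives as the `netloc` of `CHORD_URL`. A small sketch of that derivation with example URLs follows. The helper `host_header_for` and the `bento.example.org` address are hypothetical and used only for illustration.

```python
from urllib.parse import urlparse


def host_header_for(chord_url: str) -> dict:
    """Build a Host header dict from a base URL, mirroring the CHORD_HOST expression."""
    return {"Host": urlparse(chord_url or "").netloc or ""}


print(host_header_for("http://127.0.0.1:5000/"))      # {'Host': '127.0.0.1:5000'}
print(host_header_for("https://bento.example.org/"))  # {'Host': 'bento.example.org'}
print(host_header_for(""))                            # {'Host': ''}
```

Note that the port is part of `netloc`, so it is carried into the header when the URL includes one.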