2.1. Software Configuration
Assuming that HADatAc is installed at "[HADatAc]", configuration files are located at [HADatAc]/conf. The content of [HADatAc]/conf is used differently in development mode and in production mode.
In production, a software upgrade DOES NOT replace the content of hadatac/conf in the distribution, because the upgrade should preserve the current configuration. To change a configuration file, change it both in the current distribution and in "/data/conf", which should be located outside "[HADatAc]/conf".
In development mode, you may just need to update a few files, including hadatac.conf, and that change can be made directly in your local copy of the master branch. If you contribute code back to the master branch, please do not add and commit changes to [HADatAc]/conf.
hadatac.conf is the main configuration file. It tells the system how the webapp connects to the SOLR and Blazegraph repositories, and what the URL of the webapp will be once it is deployed.
If you are a developer using a local copy of HADatAc on your machine, and nothing prevents you from calling 'http://localhost:9000' to invoke HADatAc, you may not need to change this part of the configuration.
If you are deploying HADatAc on a server and you expect users to access HADatAc over the web, you will need to set the parameters below according to your domain name and to any firewall/port restrictions.
- the application's base host URL
  default value: host="http://localhost:9000"
- the URL at which the application is deployed
  default value: host_deploy="http://localhost:9000"
- the base URL that the application uses to send and receive email. This is the value prefixed to the HADatAc services used to communicate with users during email authentication or password reset. Emails may still be sent out if this value is not set properly, but confirming an action embedded in an email message may then have no effect on HADatAc.
  default value: base_url="127.0.0.1:9000"
- the kb's base host URL: usually the application's base host URL without any port information. This value is used to build absolute URLs from internal relative URLs, and to interconnect HADatAc functionalities, including the URLs of internal proxies used by JavaScript code.
  default value: kb="http://localhost"
- HOME: the path in the file system where the SOLR instances are located
  default: home=/../hadatac/solr
- URL for data collections
  default: data="http://127.0.0.1:8983/solr"
- URL used to retrieve content from a Blazegraph repository
  - for Blazegraph running locally
    default: triplestore="http://127.0.0.1:9999/blazegraph/namespace/store"
  - for Blazegraph in the VM
    default: triplestore="http://127.0.0.1:8080/bigdata/namespace"
- URL for user management collection
  default: users="http://127.0.0.1:8983/solr"
- URL for user permission management collection
  - for Blazegraph running locally
    default: permissions="http://127.0.0.1:9999/blazegraph/namespace/store_users"
  - for Blazegraph in the VM
    default: permissions="http://127.0.0.1:8080/bigdata/namespace"
- activity flags are used to verify whether the HADatAc knowledge base contains the concepts essential for supported scientific activities
  - use true for empirical activities involving the use of sensors
    empirical=true
  - use true for computational activities involving computational simulations
    computational=false
- properties about the community using the current HADatAc installation; these properties are used for project customization of HADatAc installations
  default: fullname="Child Health Exposure Analysis Repository"
  default: shortname="CHEAR"
  default: description=""
  default: ont_prefix="chear"
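As a rough illustration of how the four URL parameters of this file (host, host_deploy, base_url and kb) relate to each other, the sketch below derives them from a single public URL. The helper name `console_urls` is hypothetical and not part of HADatAc. It assumes that host and host_deploy share the same value (as they do in the defaults), that base_url keeps the scheme-less host:port shape of its default, and that kb is the host URL without the port. Check these assumptions against your own deployment before copying any value.

```python
from urllib.parse import urlsplit


def console_urls(public_url: str) -> dict:
    """Derive the four URL settings from one public URL (hypothetical helper)."""
    parts = urlsplit(public_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"expected an http(s) URL, got {public_url!r}")
    origin = f"{parts.scheme}://{parts.netloc}"
    return {
        # Assumption: host and host_deploy are identical, as in the defaults.
        "host": origin,
        "host_deploy": origin,
        # Assumption: base_url keeps the default's shape (host:port, no scheme).
        "base_url": parts.netloc,
        # kb is the base host URL without any port information.
        "kb": f"{parts.scheme}://{parts.hostname}",
    }


# Print the values in the same key="value" form used in the parameter list.
for key, value in console_urls("http://localhost:9000").items():
    print(f'{key}="{value}"')
```

Passing your own domain and port in place of the local URL gives a starting point for the values to enter in the configuration file.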
You may not need to set up email configuration if you are using HADatAc for development purposes. This configuration is essential if you are planning to create users with authenticated access to the system. In that case, the email configuration enables users to verify their email addresses and to request password resets.
Authentication is done through email verification, which requires HADatAc to communicate with users by email. The configuration file is smtp.conf under /conf/play_authenticate. Instructions for filling in the configuration file are given inside the file itself. The email account to be used should be one created for your system (not the Gmail account shown in the example below).
play.mailer {
# TODO: Disable this in production
mock=false
# SMTP server
# (mandatory)
# defaults to gmail
host=smtp.gmail.com
# SMTP port
# defaults to 25
port=465
# Use SSL
# for GMail, this should be set to true
ssl=true
# authentication user
# Optional, comment this line if no auth
# defaults to no auth
user="hadatac1234@gmail.com"
# authentication password
# Optional, comment this line to leave password blank
# defaults to no password
password="password"
}
This is the main configuration file of the Play Framework behind HADatAc. From this configuration file, the Play Framework locates the other configuration files. Only the session configuration parameters and the Java configuration parameters may be changed.
- Deadbolt (do not change)
  required value: "play-authenticate/deadbolt.conf"
- SMTP (do not change)
  required value: "play-authenticate/smtp.conf"
- Play authenticate (do not change)
  required value: "play-authenticate/mine.conf"
- HADatAc (do not change)
  required value: "hadatac.conf"
- Session conf: specifies how long a session can last without any activity from the user.
  session.maxAge=1h
  play.http.session.maxAge=12h
- Java config: specifies the amount of memory allocated to a HADatAc instance.
  jvm.memory=-Xmx2048M -Xms2048M
autoccsc.config specifies the folders where files are stored when they are initially uploaded into HADatAc, and where they are stored after HADatAc has processed them (i.e., after HADatAc has "ingested" the contents of these files). Data files and metadata files that are uploaded into HADatAc are managed as part of the overall content of the app.
IMPORTANT: the folder paths identified in this configuration file MUST BE OUTSIDE of the HADatAc distribution folder so that existing files are not affected during HADatAc's software upgrade.
- The path of processed files. The path can be absolute (when starting with "/") or relative to HADatAc's deployment location.
  path_proc=processed_csv/
- The path of unprocessed files. The path can be absolute (when starting with "/") or relative to HADatAc's deployment location.
  path_unproc=unprocessed_csv/
- Identifies whether the autoannotator is on or off by default.
  auto=on
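The path rules above can be sketched as follows. This is only an illustration of the "absolute or relative to the deployment location" rule and of the requirement that the folders sit outside the distribution folder. The function names and the `/opt/hadatac` and `/data` paths are hypothetical, and the example assumes that the deployment location and the distribution folder are the same directory. The sketch does not normalize `..` segments.

```python
from pathlib import PurePosixPath


def resolve_folder(setting: str, deploy_dir: str) -> PurePosixPath:
    """Absolute settings are used as-is; relative ones hang off deploy_dir."""
    path = PurePosixPath(setting)
    return path if path.is_absolute() else PurePosixPath(deploy_dir) / path


def is_outside(folder: PurePosixPath, distribution_dir: str) -> bool:
    """True when folder is neither the distribution folder nor below it."""
    dist = PurePosixPath(distribution_dir)
    return folder != dist and dist not in folder.parents


# Hypothetical locations, used for illustration only.
DEPLOY_DIR = "/opt/hadatac"
for setting in ("processed_csv/", "/data/processed_csv/"):
    folder = resolve_folder(setting, DEPLOY_DIR)
    print(setting, "->", folder, "| outside distribution:", is_outside(folder, DEPLOY_DIR))
```

Under these assumptions a relative value resolves to a folder inside the distribution, which is why an absolute path is the simpler way to satisfy the requirement above.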
This configuration file has a list of all the namespaces used in HADatAc's knowledge base. This list is cached inside of HADatAc when it is running, and HADatAc needs to be restarted for any change to this configuration file to take effect.
This list has two roles:
- to indicate which ontologies should be loaded into HADatAc's knowledge graph
- to indicate which namespaces may be used as a prefix during the execution of any SPARQL query against the repository
Every loaded ontology should be included in the list of SPARQL prefixes, but the content inside the knowledge graph may refer to terms from namespaces that are not necessarily loaded into the knowledge graph.
Each namespace has four attributes:
- definition (mandatory): It is of the form [abbreviation]=[reference_URI] like xsd=http://www.w3.org/2001/XMLSchema# where [abbreviation] is xsd and [reference_URI] is http://www.w3.org/2001/XMLSchema#
- mime type (only for loaded ontologies): this attribute identifies the encoding format of the ontology to be loaded. If the ontology is encoded in RDF/XML, the value for this attribute should be application/rdf+xml. If the ontology is encoded in Turtle, the value for this attribute should be text/turtle.
- loading URL (only for loaded ontologies): this is a URL that should be resolvable on the web (it is suggested that the URL be tested in a browser to verify that it resolves).
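To make the definition attribute and its second role (SPARQL prefixes) concrete, here is a minimal sketch that splits a definition of the form [abbreviation]=[reference_URI] and renders the matching SPARQL PREFIX declaration. The function names are hypothetical and this is not how HADatAc itself reads the file. Only the definition attribute is handled, because the layout of the other attributes within an entry is not described here.

```python
from typing import Tuple

# The two mime types named in the attribute list above.
MIME_TYPES = {"RDF/XML": "application/rdf+xml", "Turtle": "text/turtle"}


def parse_definition(definition: str) -> Tuple[str, str]:
    """Split '[abbreviation]=[reference_URI]' at the first '='."""
    abbreviation, sep, uri = definition.strip().partition("=")
    if not sep or not abbreviation or not uri:
        raise ValueError(f"not of the form abbreviation=URI: {definition!r}")
    return abbreviation, uri


def sparql_prefix(definition: str) -> str:
    """Render the definition as a SPARQL PREFIX declaration."""
    abbreviation, uri = parse_definition(definition)
    return f"PREFIX {abbreviation}: <{uri}>"


print(sparql_prefix("xsd=http://www.w3.org/2001/XMLSchema#"))
```

Splitting at the first "=" only keeps any later "=" characters as part of the URI.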
HADatAc uses LabKey to store some content of its knowledge graph. Content stored in LabKey is used to debug some contents of the knowledge graph. This configuration file uses the site parameter to identify the main URL of the LabKey to be used for HADatAc. The folder parameter is used to identify the project inside of LabKey where HADatAc's knowledge graph may be stored.
- Configure the LabKey server URL
  site=[URL OF YOUR LABKEY APP]
  folder=[NAME OF YOUR LABKEY PROJECT]
- Configure the key for encryption on LabKey authority
  encryption_key=yourkey
This file is used to define the names of the column headers of all the template files used in HADatAc: STD, SSD, PID, SID, MAP, ACQ, OAS, SDD. Each one of these template files is composed of a fixed, ordered list of properties, and every column header in a template must be a column header defined in the vocabulary of this configuration file.
Each HADatAc installation has one Master user. This user can only be created once, and the password used to create this user needs to be carefully noted.
Why the Master User is Unique
The Master user is unique because of the following:
- every functional HADatAc installation has one Master User
- it can only be created once
- it is already created as a verified user, without the need for email configuration (and without the need to set up an emailer to work with HADatAc)
- it is already created with admin permission
In summary, the Master User is everything needed for ONE USER to start using all the functionalities in HADatAc. That is a very important feature for rapid installation of development copies of HADatAc.
How to Create a Master User
The user is created by selecting the "Sign Up" option on HADatAc's main page. The process is the same as the sign-up of any other user. The only difference is that once the Master user is created, it is immediately available to log in. Once a Master User has been created, there is no way to create a new Master User, and the Sign Up button can only be used by users who have already been pre-registered (see Section 5.2).
How to Reset the Password for the Master User
Resetting the password consists of erasing the following:
- In Blazegraph: the content of store_users. One easy way of erasing the content of the store_users namespace in Blazegraph is to delete the namespace itself and create it again.
- In Solr: the content of the following tables under solr/solr-home: users, linked_account, and token_action. Each table in Solr is composed of two folders and one file: conf, data and core.properties. The content of the table is inside the data folder, so it is possible to erase the data folder, which is recreated the next time you start Solr. Another option is to replace the entire solr/solr_home folder with its content from GitHub.
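The Solr part of the reset can be sketched as a small script that removes only the data folder of each of the three tables, leaving conf and core.properties in place. The function name `clear_user_tables` is hypothetical, the Solr home folder is passed in explicitly rather than guessed, and the sketch assumes Solr is stopped while the folders are deleted. It does not touch the store_users namespace in Blazegraph, which still has to be erased separately.

```python
import shutil
from pathlib import Path

# The three user-related tables named in the list above.
USER_TABLES = ("users", "linked_account", "token_action")


def clear_user_tables(solr_home: Path) -> list:
    """Delete the data folder of each user-related table; return what was removed."""
    removed = []
    for table in USER_TABLES:
        data_dir = solr_home / table / "data"
        if data_dir.is_dir():
            shutil.rmtree(data_dir)  # Solr recreates this folder on its next start
            removed.append(data_dir)
    return removed
```

A call such as `clear_user_tables(Path("/path/to/your/solr/home"))` returns the folders it removed, so an empty list is a hint that the path does not point at the folder holding the three tables.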
Copyright (c) 2019, HADatAc.org