Adaptive Coder transforms digital data into ATCG sequences for DNA storage at high logical density, while ensuring the output sequences comply with arbitrary user-defined constraints.
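To make "logical density" and "constraints" concrete, here is a minimal sketch of the naive baseline that maps every 2 bits to one nucleotide. It is not the Adaptive Coder model: the bit-to-base table, the function names, and the two example constraints (GC content and homopolymer length) are assumptions chosen for illustration only.

```python
# Illustrative baseline only; this is NOT how Adaptive Coder encodes data.
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Map every 2 bits to one nucleotide (4 nucleotides per byte)."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; the length must be a multiple of 4."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases (example constraint, assumed)."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base (example constraint, assumed)."""
    best = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best
```

This mapping reaches 2 bits per nucleotide, but it gives no control over the output: a zero byte becomes a run of four identical bases, which is exactly the kind of sequence a user-defined constraint would typically forbid.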
The following steps are required in order to run Adaptive Coder:
- Install Docker.
- Install NVIDIA Container Toolkit for GPU support.
- Set up Docker to run as a non-root user.
- Check that GPUs are available by running:
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
The output of this command should show a list of your GPUs.
The simplest way to run Adaptive Coder is using the provided Docker script. This was tested with 20 vCPUs, 64 GB of RAM, and a 3090 GPU.
- Launch the NVIDIA-maintained container by running:
docker run --gpus all -it --rm nvcr.io/nvidia/tensorflow:xx.xx-tf1-py3
where xx.xx is the container version, for example 21.12.
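- Install the bert4keras dependencies in the running container, then commit it as a new image for later use. Run docker commit from a separate terminal on the host while the container is still running, because --rm removes the container when it exits.
pip install bert4keras
docker commit <CONTAINER ID> adaptive-coder:1.0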
- Clone this repository to your machine and cd into it:
git clone https://github.com/chill868686/adaptive-coder.git
cd adaptive-coder
- Install the run_docker.py dependencies. Note: you can create a new environment with Conda or Virtualenv to prevent conflicts with your system's Python environment.
pip3 install -r docker/requirements.txt
- Run run_docker.py, pointing it to a file containing the digital data or DNA sequences you wish to transform. You can optionally provide parameters to control the coding:
python docker/run_docker.py --file_path=(file_path) [OPTIONS]
OPTIONS (defaults):
--log=running.log \
--model=best_model.weights \
--docker_image_name=adaptive-coder:1.0 \
--coding_type=en_decoding|encoding|decoding|training
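If you want to call the script from your own Python code, the option handling above can be sketched as a small command builder. The helper names build_command and run are hypothetical and not part of this repository. The flag names and default values are copied from the OPTIONS list, and the sketch leaves --coding_type out unless you set it so that the script's own default applies.

```python
import subprocess
import sys

# Defaults copied from the OPTIONS list above.
DEFAULTS = {
    "log": "running.log",
    "model": "best_model.weights",
    "docker_image_name": "adaptive-coder:1.0",
}
CODING_TYPES = ("en_decoding", "encoding", "decoding", "training")


def build_command(file_path, coding_type=None, **overrides):
    """Return the argument list for docker/run_docker.py (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    if coding_type is not None and coding_type not in CODING_TYPES:
        raise ValueError(f"coding_type must be one of {CODING_TYPES}")
    options = {**DEFAULTS, **overrides}
    cmd = [sys.executable, "docker/run_docker.py", f"--file_path={file_path}"]
    cmd += [f"--{name}={value}" for name, value in options.items()]
    if coding_type is not None:
        # Omitted when None so that run_docker.py falls back to its own default.
        cmd.append(f"--coding_type={coding_type}")
    return cmd


def run(file_path, coding_type=None, **overrides):
    """Run the command from the repository root; raises if the script fails."""
    subprocess.run(build_command(file_path, coding_type, **overrides), check=True)
```

For example, run("mutimedias/poetry.txt", coding_type="encoding", log="encode.log") would start an encoding job that logs to encode.log, assuming the Docker image from the earlier steps exists.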
We provide the following usage patterns:
- DNA encoding and decoding:
python docker/run_docker.py --file_path=mutimedias/poetry.txt
- DNA encoding:
python docker/run_docker.py --file_path=mutimedias/poetry.txt --coding_type=encoding
- DNA decoding:
python docker/run_docker.py --file_path=results/encodes/poetry.txt.dna --coding_type=decoding
- Model training:
python docker/run_docker.py --file_path=datasets/seq_good_256_m.txt --coding_type=training
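The separate encoding and decoding patterns above can be chained into a round trip. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the rule that encoded files land in results/encodes/ with a .dna suffix is inferred from the single poetry.txt example rather than confirmed by the code.

```python
import subprocess
from pathlib import PurePosixPath


def encoded_path(source):
    """Assumed output location: results/encodes/<file name>.dna (inferred from the example)."""
    name = PurePosixPath(source).name + ".dna"
    return str(PurePosixPath("results/encodes") / name)


def round_trip_commands(source):
    """Return the encoding command followed by the matching decoding command."""
    script = ["python", "docker/run_docker.py"]
    return [
        script + [f"--file_path={source}", "--coding_type=encoding"],
        script + [f"--file_path={encoded_path(source)}", "--coding_type=decoding"],
    ]


def run_round_trip(source):
    """Run both steps from the repository root, stopping at the first failure."""
    for cmd in round_trip_commands(source):
        subprocess.run(cmd, check=True)
```

Calling run_round_trip("mutimedias/poetry.txt") would issue the same two commands as the encoding and decoding patterns listed above, one after the other.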