GitHub - dinosaure/art: Adaptive Radix Tree in OCaml

Adaptive Radix Tree (ART) in OCaml

This is an implementation in OCaml of ART. Adaptive Radix Tree is like a simple Hashtbl with order:

# let tree = Art.make () ;;
# Art.insert tree (Art.key "foo") 42 ;;
# Art.insert tree (Art.key "bar") 21 ;;
# Art.find tree (Art.key "foo")
- : int = 42

Operation like minimum or maximum are available (which don't exist for a simple Hashtbl.t):

# let tree = Art.make () ;;
# Art.insert tree (Art.key "0") 0 ;;
# Art.insert tree (Art.key "1") 1 ;;
# Art.insert tree (Art.key "2") 2 ;;
# Art.minimum tree
- : int = 0
# Art.maximum tree
- : int = 2

If you want the order and the speed of Hashtbl.t, Art is your library:

Benchmark on find
Benchmark on insert

The function prefix_iter is also available if you want to get a subset of your tree:

# let t = Art.make () ;;
# Art.insert t (Art.key
# Art.insert t (Art.key "Dalton Joe") 0 ;;
# Art.insert t (Art.key "Dalton Jack") 1 ;;
# Art.insert t (Art.key "Dalton William") 2 ;;
# Art.insert t (Art.key "Dalton Averell") 3 ;;
# Art.insert t (Art.key "Rantanplan") 4 ;;
# let dalton = Art.prefix_iter ~prefix:(Art.key "Dalton")
  (fun k _ a -> (k :> string) :: a) [] t ;;
- : string list = [ "Dalton Joe"
                  ; "Dalton Jack"
		  ; "Dalton William"
		  ; "Dalton Averell" ]

Read Optimised Write Exclusion (ROWEX) in OCaml

ROWEX is a second implementation of ART with atomic operations. It's a functor which expects an implementation of atomic operations such as load or store.

Parallelism, atomic operation & OCaml

The current version of OCaml has a global lock for the GC. By this way, it's not possible for us to execute ROWEX operations (find/insert) with true parallelism if we use the same OCaml runtime. Even if you use LWT or ASYNC, you execute jobs concurrently.

However, ROWEX wants to provide an implementation where find/insert can be executed in parallel without any problems (race condition or ABA problem). So ROWEX provides an implementation, persistent, which implements atomic operations on a memory area. Then, we are able, as parmap, to simulate true parallelism as long as each operations are executed into their own fork().

The goal of this library is provide:

the most easy way to switch the implementation to ocaml-multicore
a baby step to be able to manipulate a file by several processes (consumers/find, producers/insert) in parallel

ROWEX follows two main papers:

The initial implementation of ROWEX
A derivation of it to be persistent: PART

Tools

The distribution comes with some tools to manipulate an index:

$ opam pin add -y https://github.com/dinosaure/art
$ opam install rowex
$ part.make index.idx
$ ls -lh
-rw-r--r-- 1 user user 8,0M ----- -- --:-- index.idx
prw------- 1 user user    0 ----- -- --:-- index.idx.socket
prw------- 1 user user    0 ----- -- --:-- index.idx-truncate.socket
$ part.insert index.idx foo 1
$ part.find index.idx foo
1

On the OCaml side, a Part module exists which implements these functions:

type 'a t constraint 'a = [< `Rd | `Wr ]

val create : ?len:int -> string -> unit
val insert : [> `Rd | `Wr ] t -> string -> int -> unit
val lookup : [> `Rd ] t -> string -> int

part is Unix dependent (and it need an Unix named pipe). It ensures with explained internal mechanisms to use multiple readers and one writer:

The writer can take the exclusive ownership on the index file and its named pipe
readers don't need to take the ownership but they must send a signal into the named pipe (to the writer) that they start to introspect the index

For readers, some functions exist to signal their existence to the write:

val append_reader : Ipc.t -> unit
val delete_reader : Ipc.t -> unit

val ipc : _ t -> Ipc.t

Status: experimental

This part of the distribution is experimental - even if the distribution comes with several tests to ensure that the implementation works, ROWEX is fragile! It still need a synchronization mechanism fsync() which is added pervasively in some parts of the code according to outcomes of errors.

Name		Name	Last commit message	Last commit date
Latest commit History 201 Commits
atomic		atomic
bench		bench
bin		bin
conf		conf
fuzz		fuzz
ipc		ipc
lib		lib
test		test
.gitignore		.gitignore
.ocamlformat		.ocamlformat
CHANGES.md		CHANGES.md
CHANGES.rowex.md		CHANGES.rowex.md
LICENSE.md		LICENSE.md
README.md		README.md
art.opam		art.opam
dune-project		dune-project
rowex.opam		rowex.opam

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Adaptive Radix Tree (ART) in OCaml

Read Optimised Write Exclusion (ROWEX) in OCaml

Parallelism, atomic operation & OCaml

Tools

Status: experimental

About

Releases 3

Packages

Contributors 2

Languages

License

dinosaure/art

Folders and files

Latest commit

History

Repository files navigation

Adaptive Radix Tree (ART) in OCaml

Read Optimised Write Exclusion (ROWEX) in OCaml

Parallelism, atomic operation & OCaml

Tools

Status: experimental

About

Resources

License

Stars

Watchers

Forks

Releases 3

Packages 0

Contributors 2

Languages

Packages