HorsePower

HorsePower is a framework for optimizing database queries on modern hardware. At its core is HorseIR, an array-based intermediate representation (IR) for database queries. HorseIR allows sophisticated compiler optimizations to be applied to database operations, and its array-programming style exposes fine-grained parallelism that can improve performance.

Project Overview

Figure 1. The workflow of the HorsePower framework.

We started this project from scratch in summer 2017. Figure 1 shows the workflow of the HorsePower framework. One candidate source language is our Horse language, an extension of standard SQL designed for data analytics. At the current stage, we accept execution plans from standard SQL database queries as well as MATLAB code. A front end parses the source code and transforms it to HorseIR. Static analyses and code optimizations are then performed before target code is generated, and multiple back-ends are supported. Alternatively, an interpreter allows programs to be run directly.

In HorsePower, we focus on the following parts.

- Design and implementation of array-based intermediate representation (IR)
- Static analysis for an array-based IR (i.e. HorseIR)
- Query optimizations with compiler optimizations
- Fine-grained primitive functions and highly tuned libraries

Installation

Download the repository

git clone git@github.com:Sable/HorsePower.git

Setup environment variables

cd HorsePower && source ./setup_env.sh
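The later steps refer to `HORSE_BASE`, `HORSE_LIB_FOLDER`, and `HORSE_SRC_CODE`. The following is a minimal sketch for confirming that they are set in the current shell. It assumes these variables are exported by `setup_env.sh`; the helper name `check_horse_env` is hypothetical and not part of the repository.

```shell
# Hypothetical helper: report which variables used in this README are unset.
# Assumption: setup_env.sh exports them; this only inspects the current shell.
check_horse_env() {
    missing=0
    for name in HORSE_BASE HORSE_LIB_FOLDER HORSE_SRC_CODE; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset: $name"
            missing=1
        else
            echo "ok: $name=$value"
        fi
    done
    return $missing
}
```

Calling `check_horse_env` after sourcing the setup script prints one line per variable and returns a non-zero status if any of them is missing.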

Setup Library

Install the libraries with the following command (about 13 minutes):

(cd ${HORSE_LIB_FOLDER} && sh deploy_linux.sh)

After installation, the following new folders are created:

- include
- lib
- pcre2

Note: gcc 8.1.0 or higher is recommended, and the additional library uuid-dev may be required during installation.
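A quick way to compare the installed gcc against the recommended version is sketched below. It assumes GNU `sort` with the `-V` (version sort) option is available, as is typical on Linux; the helper name `version_at_least` is hypothetical.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Assumption: GNU sort -V (version sort) is available.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v gcc >/dev/null 2>&1; then
    # -dumpfullversion exists in gcc 7 and later; fall back to -dumpversion.
    gcc_version=$(gcc -dumpfullversion 2>/dev/null || gcc -dumpversion)
    if version_at_least "$gcc_version" 8.1.0; then
        echo "gcc $gcc_version meets the recommendation"
    else
        echo "gcc $gcc_version is older than the recommended 8.1.0"
    fi
else
    echo "gcc not found on PATH"
fi
```

The check is advisory only: it prints a message and does not stop the installation.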

Setup Data

Default data path for TPC-H

${HORSE_BASE}/data/tpch

To generate a dataset at a given scale factor, run:

cd data/tpch
./run.sh deploy       ## Read instructions and update Makefile
./run.sh gendb 1      ## Generate database and save to data/tpch/db1

For a given scale factor, for example 1, the dataset path is

${HORSE_BASE}/data/tpch/db1

It contains one .tbl file per table:

${HORSE_BASE}/data/tpch/db1/*.tbl
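To sanity-check a generated dataset, the row count of each .tbl file can be listed. This is a sketch only: it assumes the usual TPC-H text layout of one row per line, and the helper name `count_tbl_rows` is hypothetical.

```shell
# Hypothetical helper: print "<file>\t<rows>" for each .tbl file in a directory.
# Assumption: each line of a .tbl file holds exactly one table row.
count_tbl_rows() {
    dir=$1
    for f in "$dir"/*.tbl; do
        # If the glob matched nothing, $f is the literal pattern and does not exist.
        [ -e "$f" ] || { echo "no .tbl files in $dir"; return 1; }
        printf '%s\t%s\n' "$(basename "$f")" "$(wc -l < "$f" | tr -d ' ')"
    done
}

# Example call, once the scale factor 1 dataset has been generated:
# count_tbl_rows "${HORSE_BASE}/data/tpch/db1"
```

An empty or missing directory yields a non-zero return status, which makes the function usable as a guard before the build and run step.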

Build and Run

We recommend using the latest version, as this project is still under active development.

To learn how to run, type

(cd ${HORSE_SRC_CODE} && ./run.sh)      # show usage

A Brief Summary

Name	Notes
Platform	Cross-platform
Tools	C/C++, Flex & Bison
Parallelism	OpenMP/Pthread/CUDA/OpenCL
Conventions	docs/conventions

Quick Entries

IR design

Database TPC-H

Implementation

Publications

Hanfeng Chen, Joseph Vinish D’silva, Hongji Chen, Bettina Kemme, and Laurie Hendren, HorseIR: Bringing Array Programming Languages together with Database Query Processing, Proceedings of the 14th Symposium on Dynamic Languages (DLS '18), pp. 37-49, November 2018.
- BibTeX record on dblp
- DLS18 artifact on GitHub

Copyright and License

PCRE2: PCRE2 Licence
