Incomplete k-skyband, constrained skyline, and group-by skyline queries on incomplete data in postgresql
- Please see Dr. Gao's paper: [Processing k-skyband, constrained skyline, and group-by skyline queries on incomplete data] (http://www.armourcy.com/skyband.pdf)
- Operation System: Mac OSX / Ubuntu 14.04
- PostgreSQL version: PostgreSQL 9.4.4
for ubuntu:
sudo apt-get install postgresql-9.4
sudo apt-get install postgresql-server-dev-9.4
sudo apt-get install postgresql-contrib-9.4
for mac:
brew install postgresql
git clone git@github.com:Armour/K-SkyBand-Query-in-PostgreSQL.git
cd K-SkyBand-Query-in-PostgreSQL
cd skyband
make
sudo make install
In terminal you will see a path in the output like this: "/usr/local/Cellar/postgresql/9.4.4/share/postgresql/contrib"
- Log in the postgres database where the skyline function will be added
(if you want to add the function to all databases, you should log in template1 database. Google it for more details.)
- Use postgresql command to import function:
(remember to replace content in '[]' with the path that you got in step 2.)
\i [/usr/local/Cellar/postgresql/9.4.4/share/postgresql/contrib/]skyband.sql
- You will see two "CREATE FUNCTION" if everything works
If we have such a table:
postgres=> select * from hotel;
id | name | price | distance | noisy | star
----+--------------+-------+----------+-------+------
1 | cs hotel | 200 | 30.62 | 5 | 3
2 | c hotel | 400 | 20 | 6 | 3
3 | c++ hotel | 300 | 20 | | 3
4 | java hotel | 330 | 17.5 | | 3
5 | python hotel | 278 | | 2 | 4
6 | ruby hotel | 399 | | | 4
7 | obj-c hotel | | | 4 | 4
8 | swift hotel | | 25.99 | 9 |
35 | pascal hotel | 243 | | 7 | 3
36 | lol hotel | 1 | 1 | 1 | 5
(10 rows)
The following command can be performed using skyband function:
-- skyline query (k = 1)
postgres=> select * from skyband('select name, price, distance, noisy from hotel', 1) as (name text, price int, distance real, noisy int);
name | price | distance | noisy
-----------+-------+----------+-------
lol hotel | 1 | 1 | 1
(1 row)
-- k-skyband query (k > 1)
postgres=> select * from skyband('select name, price, distance, noisy from hotel', 3) as (name text, price int, distance real, noisy int);
name | price | distance | noisy
--------------+-------+----------+-------
cs hotel | 200 | 30.62 | 5
lol hotel | 1 | 1 | 1
python hotel | 278 | | 2
obj-c hotel | | | 4
(5 rows)
-- constrained skyband query
postgres=> select * from skyband('select name, price, distance, noisy from hotel', 3) as (name text, price int, distance real, noisy int) where price between 100 and 280;
name | price | distance | noisy
--------------+-------+----------+-------
cs hotel | 200 | 30.62 | 5
python hotel | 278 | | 2
(2 rows)
Note that skyband function needs two parameters:
- The selection clause, can not be NULL (text)
- Integer k
(ps: you should also provide the output format using as
clause, otherwise it won't work).
There is another function call skyand_ext
which is more powerfull and can solve group-by skyband query.
-- skyline query (k = 1)
postgres=> select * from skyband_ext('name, price, distance, noisy', 'hotel', NULL, 1, 'name text, price int, distance real, noisy int') as (name text, price int, distance real, noisy int);
name | price | distance | noisy
-----------+-------+----------+-------
lol hotel | 1 | 1 | 1
(1 row)
-- k-skyband query (k > 1)
postgres=> select * from skyband_ext('name, price, distance, noisy', 'hotel', NULL, 3, 'name text, price int, distance real, noisy int') as (name text, price int, distance real, noisy int);
name | price | distance | noisy
--------------+-------+----------+-------
cs hotel | 200 | 30.62 | 5
lol hotel | 1 | 1 | 1
python hotel | 278 | | 2
obj-c hotel | | | 4
(4 rows)
-- constrained skyband query
postgres=> select * from skyband_ext('name, price, distance, noisy', 'hotel', NULL, 3, 'name text, price int, distance real, noisy int') as (name text, price int, distance real, noisy int) where price between 100 and 280;
name | price | distance | noisy
--------------+-------+----------+-------
cs hotel | 200 | 30.62 | 5
python hotel | 278 | | 2
(2 rows)
-- groupby skyband query (grouped by star attribute)
postgres=> select * from skyband_ext('name, price, distance, noisy, star', 'hotel', 'star', 1, 'name text, price int, distance real, noisy int, star int') as (name text, price int, distance real, noisy int, star int);
name | price | distance | noisy | star
--------------+-------+----------+-------+------
swift hotel | | 25.99 | 9 |
cs hotel | 200 | 30.62 | 5 | 3
python hotel | 278 | | 2 | 4
lol hotel | 1 | 1 | 1 | 5
(4 rows)
Note that skyband_ext
functin takes 5 parameters:
- The fields can not be NULL (text)
- The name of the table can not be NULL (text)
- The attribute, used as group-by standard, can be NULL (text)
- Integer k
- Inner output format can not be NULL and must match the number of fields in 1. (text)
(ps: you should also provide the outer output format using as clause, otherwise it won't work).
(pps: here the inner output format is actully redundent, but I still didn't find a good way to hide it. In most cases, you can fill both inner and outer formats with the original names and types in your table)
- If you have any question about this paper, you can contact Dr. Gao: gaoyj@zju.edu.cn
- The projet is implemented by Armour Guo. Feel free to ask any questions: armourcy@gmail.com