This is an Analytics Engineering project with the sole aim of utilizing dbt to construct an ELT data pipeline to simulate Airbnb datasets containing informations about the hosts, listings and reviews.
DBT
: For data cleaning and transformationAmazon s3
: Cloud storageSnowflakes
: Cloud data warehousePreset
: BI tool for developing dashboard
-
Create a project folder and cd into this folder
-
Run
make setup
on your terminal to setup dbt in your current directory -
Run
make init
to inialize dbt structure for your project -
Setup the data warehouse, grant the neccessary permissions and connect it to dbt in the profiles.yml in your directory. This can be retrieved using the following command
cat ~/.dbt/profiles.yml
setting up data warehouse
-- Use an admin role USE ROLE ACCOUNTADMIN; -- Create the `transform` role CREATE ROLE IF NOT EXISTS transform; GRANT ROLE TRANSFORM TO ROLE ACCOUNTADMIN; -- Create the default warehouse if necessary CREATE WAREHOUSE IF NOT EXISTS COMPUTE_WH; GRANT OPERATE ON WAREHOUSE COMPUTE_WH TO ROLE TRANSFORM; -- Create the `dbt` user and assign to role CREATE USER IF NOT EXISTS dbt PASSWORD='' LOGIN_NAME='dbt' MUST_CHANGE_PASSWORD=FALSE DEFAULT_WAREHOUSE='COMPUTE_WH' DEFAULT_ROLE='transform' DEFAULT_NAMESPACE='AIRBNB.RAW' COMMENT='DBT user used for data transformation'; GRANT ROLE transform to USER dbt; -- Create our database and schemas CREATE DATABASE IF NOT EXISTS AIRBNB; CREATE SCHEMA IF NOT EXISTS AIRBNB.RAW; -- Set up permissions to role `transform` GRANT ALL ON WAREHOUSE COMPUTE_WH TO ROLE transform; GRANT ALL ON DATABASE AIRBNB to ROLE transform; GRANT ALL ON ALL SCHEMAS IN DATABASE AIRBNB to ROLE transform; GRANT ALL ON FUTURE SCHEMAS IN DATABASE AIRBNB to ROLE transform; GRANT ALL ON ALL TABLES IN SCHEMA AIRBNB.RAW to ROLE transform; GRANT ALL ON FUTURE TABLES IN SCHEMA AIRBNB.RAW to ROLE transform;
-
Run
make test
to run all the implemented tests including the singular tests, macros custom tests, generic tests and validation using dbt-expectation -
Run
make docs
to generate the project documentation andmake serve
to deploy this documentation to a webserver accessed at port 8080
Setting up a BI dashboard in Snowflake would require creating a role for this dashboard and granting the neccessary permission to this role, in our case, Preset role.
You can use the below command to set up this access
-- Use an admin role
USE ROLE ACCOUNTADMIN;
CREATE ROLE IF NOT EXISTS REPORTER;
CREATE USER IF NOT EXISTS PRESET
PASSWORD=''
LOGIN_NAME='preset'
MUST_CHANGE_PASSWORD=FALSE
DEFAULT_WAREHOUSE='COMPUTE_WH'
DEFAULT_ROLE='REPORTER'
DEFAULT_NAMESPACE='AIRBNB.DEV'
COMMENT='Preset user for creating reports';
GRANT ROLE REPORTER TO USER PRESET;
GRANT ROLE REPORTER TO ROLE ACCOUNTADMIN;
GRANT ALL ON WAREHOUSE COMPUTE_WH TO ROLE REPORTER;
GRANT USAGE ON DATABASE AIRBNB TO ROLE REPORTER;
GRANT USAGE ON SCHEMA AIRBNB.DEV TO ROLE REPORTER;
GRANT SELECT ON ALL TABLES IN SCHEMA AIRBNB.DEV TO ROLE REPORTER;
GRANT SELECT ON ALL VIEWS IN SCHEMA AIRBNB.DEV TO ROLE REPORTER;
GRANT SELECT ON FUTURE TABLES IN SCHEMA AIRBNB.DEV TO ROLE REPORTER;
GRANT SELECT ON FUTURE VIEWS IN SCHEMA AIRBNB.DEV TO ROLE REPORTER;
- Learn more about dbt in the docs
- Check out Discourse for commonly asked questions and answers
- Join the chat on Slack for live discussions and support
- Find dbt events near you
- Check out the blog for the latest news on dbt's development and best practices