This final project builds a Dockerized ETL pipeline, using ETL tools and Airflow, that extracts public API data from PIKOBAR, loads it into MySQL (the staging area), and finally aggregates the data and saves it into PostgreSQL.
-
Create Docker containers for MySQL, Airflow, and PostgreSQL on your local computer.
-
Create the databases in the MySQL and PostgreSQL Docker containers.
-
Create a connection in Airflow first, so that data can be extracted from the API endpoint.
-
Create the DDL in MySQL.
-
Extract data from the API endpoint, then load and stage it in MySQL.
-
Create the DDL in PostgreSQL for the fact table and the dimension table.
-
Create a script that loads data into the dimension table.
-
Create a script that aggregates the Province Daily data and saves it to the Province Daily table.
-
Create a script that aggregates the District Daily data and saves it to the District Daily table.
-
Create a DAG scheduled to run daily, with the following tasks:
a. get_data_from_API.
b. Creating_NewColumns_and_Inserting_Values_for_dim_status_table_only.
c. generate_dim_table.
d. generate_district_daily_table.
e. generate_province_daily_table.
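
The Province Daily and District Daily aggregation steps above can be illustrated with a minimal Python sketch. It is not the project's actual script: the field names (`date`, `province`, `district`, `status`, `total`), the helper `aggregate_daily`, and the sample rows are all hypothetical, and the real PIKOBAR columns may differ. It only shows the idea of summing staged rows per day at two levels of granularity.

```python
from collections import defaultdict

# Made-up sample rows standing in for the MySQL staging table.
# Field names are assumptions, not the real PIKOBAR schema.
staged_rows = [
    {"date": "2021-01-01", "province": "Jawa Barat", "district": "Bandung", "status": "confirmed", "total": 5},
    {"date": "2021-01-01", "province": "Jawa Barat", "district": "Bekasi", "status": "confirmed", "total": 3},
    {"date": "2021-01-01", "province": "Jawa Barat", "district": "Bandung", "status": "recovered", "total": 2},
    {"date": "2021-01-02", "province": "Jawa Barat", "district": "Bandung", "status": "confirmed", "total": 4},
]


def aggregate_daily(rows, level):
    """Sum `total` per (date, level, status); `level` is 'province' or 'district'."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["date"], row[level], row["status"])] += row["total"]
    # Sort the keys so the output order is deterministic.
    return [
        {"date": date, level: name, "status": status, "total": total}
        for (date, name, status), total in sorted(totals.items())
    ]


# One result set per target table (hypothetical shapes).
province_daily = aggregate_daily(staged_rows, "province")
district_daily = aggregate_daily(staged_rows, "district")

for row in province_daily:
    print(row)
```

In the pipeline itself, the same grouping would more likely be expressed as a SQL `GROUP BY` query run against the staged data, with the result inserted into the PostgreSQL tables by the `generate_province_daily_table` and `generate_district_daily_table` tasks.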