📊 📑 This project provides a step-by-step big data analytics workflow applied to the retail industry, using a variety of big data technologies such as HDFS, Hive, and Spark.

Heisenberghj7/Retail-Store-BigData

📉 🧑‍💻 Retail Store BigData 📊📦

Project Architecture 🖊️

Part I: Data Migration & Data Analysis

Importing a Table from MySQL to HDFS:

  1. Create the database and the tables in MySQL.
  2. Use Sqoop to import the tables of the retail store database and save them in HDFS under "/user".
  3. Import the tables in Parquet format rather than the default file format (text file).
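
The Sqoop step above can be sketched as a small Python helper that assembles the import command for one table. This is a minimal sketch, not the project's actual script: the function name `build_sqoop_import`, the JDBC URL, the database name `retail_db`, the user `retail_user`, and the table name `orders` are all assumptions chosen for illustration. Only standard Sqoop 1 options are used.

```python
import shlex


def build_sqoop_import(table, jdbc_url, username, target_root="/user"):
    """Return the argument list for a Sqoop import of one MySQL table as Parquet.

    Hypothetical helper: the project does not ship this function.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,                      # JDBC URL of the MySQL database
        "--username", username,
        "-P",                                       # prompt for the password interactively
        "--table", table,
        "--target-dir", f"{target_root}/{table}",   # HDFS directory under /user
        "--as-parquetfile",                         # Parquet instead of the default text files
    ]


# Assumed connection details, for illustration only.
cmd = build_sqoop_import(
    "orders",
    "jdbc:mysql://localhost:3306/retail_db",
    "retail_user",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell on a machine where Sqoop is installed, or the list can be passed to `subprocess.run(cmd, check=True)`. Repeating the call for each table name imports the whole database.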

Data Analysis: First, we import the data from HDFS into Hive. HiveQL is Hive's query language, a dialect of SQL for big data. Using HiveQL, we determine:

  • How many orders were placed
  • Average revenue per order
  • Average revenue per day per product
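
The three metrics above can be illustrated with plain SQL aggregates. The sketch below runs them against an in-memory SQLite database instead of Hive, so that it can be executed without a cluster; the queries use only `COUNT`, `SUM`, `AVG`, `JOIN`, and `GROUP BY`, which HiveQL also supports. The table and column names (`orders`, `order_items`, `order_item_subtotal`, and so on) and the sample rows are assumptions, not the project's actual schema, and "average revenue per day per product" is read here as the mean line-item subtotal for each (date, product) pair, which is only one possible interpretation.

```python
import sqlite3

# In-memory stand-in for the Hive tables; schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, order_date TEXT);
CREATE TABLE order_items (
    order_item_order_id INTEGER,
    order_item_product_id INTEGER,
    order_item_subtotal REAL
);
INSERT INTO orders VALUES (1, '2014-01-01'), (2, '2014-01-01'), (3, '2014-01-02');
INSERT INTO order_items VALUES
    (1, 10, 100.0), (1, 20, 50.0), (2, 10, 200.0), (3, 20, 150.0);
""")

# 1. How many orders were placed.
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# 2. Average revenue per order: total revenue divided by the number of distinct orders.
avg_revenue_per_order = conn.execute("""
    SELECT SUM(order_item_subtotal) / COUNT(DISTINCT order_item_order_id)
    FROM order_items
""").fetchone()[0]

# 3. Average revenue per day per product (mean subtotal per date and product).
avg_per_day_product = conn.execute("""
    SELECT o.order_date, oi.order_item_product_id, AVG(oi.order_item_subtotal)
    FROM orders o
    JOIN order_items oi ON o.order_id = oi.order_item_order_id
    GROUP BY o.order_date, oi.order_item_product_id
    ORDER BY o.order_date, oi.order_item_product_id
""").fetchall()

print(order_count)
print(avg_revenue_per_order)
for row in avg_per_day_product:
    print(row)
```

In Hive, the same `SELECT` statements would be run against the tables created over the imported Parquet files, with names adjusted to the real schema.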

Part II: Power BI

  • (In Progress)

Part III: Spark SQL and PySpark

  • (In Progress)

  • 📫 Feel free to contact me if anything is wrong or needs to be changed 😎! medhajjari9@gmail.com
