Machine Learning With Python and Spark

This repository implements regression, classification, and clustering techniques in PySpark using the ML(LIB) library. It also shows how to create RDDs and apply transformations and actions to them.

Python Version :- 3.6

Spark version :- 2.3.3

Notebooks :-

Regression :-> Spark-Regression-ML(LIB).ipynb
Classification :-> Spark-Classification-ML(LIB).ipynb
Clustering :-> Spark-Clustering-ML(LIB) .ipynb
Spark_RDD :-> Airport_problem and Word Count
PySpark_Cheat_sheet :-> PySpark_SQL_Cheat_Sheet_Python.pdf
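
The Word Count example listed under Spark_RDD is a common way to see RDD transformations and actions together. The following is a minimal sketch and is not taken from the notebook itself: the function names `tokenize` and `word_count` are hypothetical, and `sc` is assumed to be an existing SparkContext passed in by the caller.

```python
from operator import add


def tokenize(line):
    """Split a line into lowercase words on whitespace."""
    return line.lower().split()


def word_count(sc, lines):
    """Count words using RDD transformations followed by one action.

    `sc` is assumed to be an existing SparkContext; `lines` is a list of strings.
    """
    rdd = sc.parallelize(lines)           # create an RDD from a local list
    counts = (
        rdd.flatMap(tokenize)             # transformation: one record per word
           .map(lambda word: (word, 1))   # transformation: (word, 1) pairs
           .reduceByKey(add)              # transformation: sum the 1s per word
    )
    return dict(counts.collect())         # action: triggers the computation
```

In a notebook where `sc` is already defined, `word_count(sc, ["to be or not to be"])` would return a dictionary mapping each word to its count. The transformations are lazy, so nothing runs on the cluster until `collect()` is called.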

Note :-

All of these notebooks were created on the IBM Watson cloud platform, which provides a Python + Spark environment in a Python notebook.

The Spark context is created for you and provided as the (sc) variable.

To check the Spark version, type :- sc.version
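
Because the notebooks were written against Spark 2.3.3, it may help to compare the running version with that one before executing them. The sketch below is only an illustration: `version_tuple` and `matches_expected` are hypothetical helper names, and comparing only the major and minor numbers is an assumption about what counts as compatible.

```python
def version_tuple(version):
    """Turn a version string such as '2.3.3' into a tuple of ints (2, 3, 3).

    Parts that are not purely numeric (for example '0-SNAPSHOT') are skipped.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def matches_expected(version, expected="2.3.3"):
    """Return True when major and minor numbers agree with the expected version."""
    return version_tuple(version)[:2] == version_tuple(expected)[:2]
```

In a notebook where `sc` is defined, `matches_expected(sc.version)` gives a quick check that the environment is on the same 2.3 line as these notebooks.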

IBM Watson Link

kishanpython/Machine-Learning-Using-Python-Spark
