GitHub - imalik8088/cassandra-spark: playground for cassandra spark connector

Cassandra setup

Install Cassandra as per the DataStax guide and make sure you can connect to the database with cqlsh. Then create the test keyspaces and tables and load the sample data:

    cqlsh < data/userdb/schema.cql
    cqlsh < data/musicdb/schema.cql
    cqlsh < data/musicdb/data.cql
    cqlsh < data/spark_output/schema.cql

Running the Applications

Building and running

You can run the applications from sbt-spark-submit using:

    sbt run

CassandraSparkApp

A simple app that shows how to use Cassandra (C*) as a data source in a Spark job. It reads data from C*, limits the result to a small subset, and writes it back to the keyspace spark_output.
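The read, limit, and write-back flow can be sketched with the Spark Cassandra Connector roughly as follows. This is a minimal sketch and not the repository's actual code: the source keyspace `musicdb`, the source table, the target table name, the row limit of 10, and the host `127.0.0.1` are all assumptions made for illustration, and the target table is assumed to have the same columns as the source.

```scala
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

object CassandraReadWriteSketch {
  // Builds a local Spark configuration that points the connector at one Cassandra node.
  def connectorConf(host: String): SparkConf =
    new SparkConf()
      .setAppName("cassandra-read-write-sketch")
      .setMaster("local[*]")
      .set("spark.cassandra.connection.host", host)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(connectorConf("127.0.0.1"))
    try {
      // ASSUMPTION: keyspace and table names are guesses based on the data/ folder layout.
      val albums = sc.cassandraTable("musicdb", "albums_by_genre")

      // limit adds a LIMIT clause to the generated CQL query. The connector applies it
      // per Spark partition, so the total row count can exceed 10.
      val limited = albums.limit(10)

      // ASSUMPTION: a hypothetical target table with the same columns exists in spark_output.
      limited.saveToCassandra("spark_output", "albums_by_genre")
    } finally {
      sc.stop()
    }
  }
}
```

Running it needs a reachable Cassandra node and the connector on the classpath. If an exact row count matters, collecting with `take(n)` on the driver and writing that back is an alternative to `limit`.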

SparkStreamingApp

Start the Spark Streaming application via sbt run. This example counts the words in a book downloaded from Project Gutenberg, which offers whole books as plain-text files. To stream the data into Spark's text stream, you have to serve the file on a socket with nc (netcat). The following loop does that repeatedly:

    while :; do cat data/book/2600-0.txt | nc -l 9999 ; sleep 1; done;

The loop serves the book on port 9999 again and again, pausing one second between runs, and Spark processes each copy as it arrives.
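On the Spark side, a word count over a socket text stream typically looks like the sketch below. It is not taken from the repository: the object name, the `tokenize` helper, the one-second batch interval, and the host `localhost` are assumptions. Only port 9999 comes from the nc command above.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SocketWordCountSketch {
  // Splits a line into lowercase words, dropping punctuation and empty tokens.
  // Uses the regex class \W, so only ASCII letters, digits and '_' count as word characters.
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    // local[2]: one thread for the socket receiver, at least one for processing.
    val conf = new SparkConf().setAppName("socket-word-count-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1)) // ASSUMPTION: 1-second batches

    // Spark connects as a client to the socket that nc is listening on.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Count the words within each batch.
    val counts = lines
      .flatMap(line => tokenize(line))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.print() // prints the first few (word, count) pairs of every batch

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The counts here are per batch. Keeping a running total across batches would need a stateful operation such as `updateStateByKey` together with a checkpoint directory.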

CassandraStreamingApp

This example shows the integration of Spark Streaming and Cassandra: the input received from nc is used to load the matching rows from the Cassandra table albums_by_genre. The keyword on which Spark joins is the release year of the album, so you can, for example, type the year 1999 into netcat:

    nc -lk 9999
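One way to join the typed years against the table is sketched below. This is an illustration under assumptions, not the repository's implementation: the keyspace `musicdb`, the column name `year`, its integer type, and the `parseYear` helper are hypothetical. The sketch loads the table once, keys it by year, and joins every batch of input against it.

```scala
import com.datastax.spark.connector._
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

import scala.util.Try

object YearLookupSketch {
  // Parses one input line into a year; returns None for anything that is not an integer.
  def parseYear(line: String): Option[Int] = Try(line.trim.toInt).toOption

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("year-lookup-sketch")
      .setMaster("local[2]")
      .set("spark.cassandra.connection.host", "127.0.0.1") // ASSUMPTION: local node
    val ssc = new StreamingContext(conf, Seconds(1))

    // ASSUMPTION: keyspace "musicdb" and an int column named "year".
    val albumsByYear = ssc.sparkContext
      .cassandraTable("musicdb", "albums_by_genre")
      .keyBy(row => row.getInt("year"))
      .cache() // the table is reused for every batch

    // Each line typed into nc becomes a (year, ()) pair; invalid lines are dropped.
    val years = ssc.socketTextStream("localhost", 9999)
      .flatMap(line => parseYear(line).toList)
      .map(year => (year, ()))

    // Join every batch of distinct years against the keyed table.
    val matches = years.transform(batch => batch.distinct().join(albumsByYear))

    matches.map { case (year, (_, row)) => s"$year: $row" }.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Caching the whole table is only reasonable for small sample data. For larger tables, the connector's `joinWithCassandraTable` is the usual alternative, but it joins on the partition key, so it would need a table partitioned by year.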
