Spark Twitter Streaming Example

Let’s try to write a very simple Spark Streaming program that prints a sample of the tweets it receives from Twitter every second.

1) Twitter Credentials Setup

First, follow this guide to obtain the Twitter credential keys that give you access to the Twitter stream.

2) Dependencies

Make sure your POM file contains the following dependencies:

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.bahir</groupId>
        <artifactId>spark-streaming-twitter_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
</dependencies>
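
If you build with sbt instead of Maven, the same three artifacts could be declared roughly as follows. This is only a sketch: the exact Scala patch version (2.11.8 here) is an assumption, and any 2.11.x release should match the `_2.11` artifacts listed above.

```scala
// build.sbt (sketch): the same three artifacts as in the POM above.
// The Scala patch version is an assumption; any 2.11.x release should work.
scalaVersion := "2.11.8"

// %% appends the Scala binary version (_2.11) to each artifact name.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"              % "2.1.0",
  "org.apache.spark" %% "spark-streaming"         % "2.1.0",
  "org.apache.bahir" %% "spark-streaming-twitter" % "2.1.0"
)
```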

3) Create a Twitter Stream

import org.apache.spark.streaming.twitter.TwitterUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}

object SparkTwitterStreaming {
  def main(args: Array[String]): Unit = {

    val conf = new SparkConf()
    conf.setAppName("spark-streaming")
    conf.setMaster("local[2]")
    val sc = new SparkContext(conf)

    val ssc = new StreamingContext(sc, Seconds(1))

    // Configure your Twitter credentials
    val apiKey = ""
    val apiSecret = ""
    val accessToken = ""
    val accessTokenSecret = ""

    System.setProperty("twitter4j.oauth.consumerKey", apiKey)
    System.setProperty("twitter4j.oauth.consumerSecret", apiSecret)
    System.setProperty("twitter4j.oauth.accessToken", accessToken)
    System.setProperty("twitter4j.oauth.accessTokenSecret", accessTokenSecret)

    // Create Twitter Stream
    val stream = TwitterUtils.createStream(ssc, None)
    val tweets = stream.map(t => t.getText)

    tweets.print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Run the program and you will soon see a sample of the received tweets printed on the screen (it can take 10 seconds or so before they start appearing).
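
The program above hardcodes the four credentials as string literals. One possible alternative is to read them from environment variables, so the keys never end up in source control. The sketch below is not part of the original example: the object name `TwitterCredentials` and the environment variable names (`TWITTER_API_KEY` and so on) are hypothetical and can be changed freely. Only the `twitter4j.oauth.*` property names are taken from the code above.

```scala
object TwitterCredentials {

  // twitter4j system property -> environment variable holding its value.
  // The environment variable names are hypothetical; rename them as needed.
  val propertyToEnvVar: Map[String, String] = Map(
    "twitter4j.oauth.consumerKey"       -> "TWITTER_API_KEY",
    "twitter4j.oauth.consumerSecret"    -> "TWITTER_API_SECRET",
    "twitter4j.oauth.accessToken"       -> "TWITTER_ACCESS_TOKEN",
    "twitter4j.oauth.accessTokenSecret" -> "TWITTER_ACCESS_TOKEN_SECRET"
  )

  // Returns Right(property -> value) when every variable is set and non-empty,
  // otherwise Left with the sorted names of the missing variables.
  def resolve(env: Map[String, String]): Either[Seq[String], Map[String, String]] = {
    val missing = propertyToEnvVar.values
      .filterNot(name => env.get(name).exists(_.nonEmpty))
      .toSeq
      .sorted
    if (missing.nonEmpty) Left(missing)
    else Right(propertyToEnvVar.map { case (prop, name) => prop -> env(name) })
  }

  // Sets the twitter4j system properties, or fails fast with a clear message.
  def install(env: Map[String, String] = sys.env): Unit =
    resolve(env) match {
      case Left(missing) =>
        throw new IllegalStateException(
          "Missing environment variables: " + missing.mkString(", "))
      case Right(props) =>
        props.foreach { case (prop, value) => System.setProperty(prop, value) }
    }
}
```

With this in place, a single call to `TwitterCredentials.install()` before `TwitterUtils.createStream(ssc, None)` could replace the four `val` declarations and the four `System.setProperty` lines. Failing fast on a missing variable is a design choice: it turns an empty credential into an immediate, readable error rather than an authentication failure after the stream has started.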
