Tuesday, April 4, 2017

Fetch TWITTER data using R



  1. twitteR Package:
    1. One of the packages available in R for fetching Twitter data. The package can be obtained from CRAN (cran.r-project.org).
    2. This package allows us to make REST API calls to Twitter using the ConsumerKey and ConsumerSecret codes. The code below illustrates
      how to extract Twitter data.
    3. This package offers the following functionality:
      1. Authenticate with the Twitter API
      2. Fetch a user's timeline
      3. User followers
      4. User mentions
      5. Search Twitter
      6. User information
      7. Trends
      8. Convert JSON objects to data frames
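A few of the items above can be sketched as small wrapper functions. This is only an illustration: the wrapper names (`fetch_timeline_df`, `search_tweets_df`, `fetch_followers_df`) are made up for this post, and the calls assume that twitteR is installed and that the session has already been authenticated.

```r
# Sketch only: wrapper names are hypothetical; the twitteR functions they call
# need an authenticated session before they return anything.

# Fetch a user's timeline and convert it to a data frame
fetch_timeline_df <- function(user, n = 50) {
  statuses <- twitteR::userTimeline(user, n = n)
  twitteR::twListToDF(statuses)
}

# Search Twitter for a keyword or hashtag and convert the hits to a data frame
search_tweets_df <- function(query, n = 100) {
  tweets <- twitteR::searchTwitter(query, n = n)
  twitteR::twListToDF(tweets)
}

# Fetch the followers of a user and convert them to a data frame
fetch_followers_df <- function(user, n = 100) {
  account <- twitteR::getUser(user)
  twitteR::twListToDF(account$getFollowers(n = n))
}

# Example calls (need credentials and network access, so left commented out):
# timeline.df <- fetch_timeline_df("some_user")
# bigdata.df  <- search_tweets_df("#bigdata", n = 200)
```

Keeping the calls inside functions means nothing touches the network until you decide to call them.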
  2. REST API CALLS using R - twitteR package:
    1. Register your application with Twitter.
    2. After registration, you will get a ConsumerKey and a ConsumerSecret code, which are needed for calling the Twitter API.
    3. Load the twitteR library in the R environment.
    4. Create an OAuth credential object with the OAuthFactory$new() method, passing the ConsumerKey and ConsumerSecret codes as input params.
    5. Calling handshake() on that object prints an authorization link, which needs to be copied and pasted into an internet browser.
    6. You will be redirected to the Twitter application authentication page, where you need to authenticate yourself by providing your Twitter credentials.
    7. After authenticating, you will be given an authorization code, which needs to be pasted into the R console.
    8. Call registerTwitterOAuth() to register the credentials for the session.
    9. Once registered, the session can be used to fetch friends information.
    10. Location-based searches can be run the same way.
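  3. Source Code:
    library(twitteR)
    library(ROAuth)  # provides OAuthFactory
    # OAuth endpoints and the keys issued when the application was registered
    requestURL     <- "https://api.twitter.com/oauth/request_token"
    accessURL      <- "https://api.twitter.com/oauth/access_token"
    authURL        <- "https://api.twitter.com/oauth/authorize"
    consumerKey    <- "XXXXXXXXXXXX"
    consumerSecret <- "XXXXXXXXXXXXXXXX"
    # Build the credential object
    twitCred <- OAuthFactory$new(consumerKey=consumerKey,
                                 consumerSecret=consumerSecret,
                                 requestURL=requestURL,
                                 accessURL=accessURL,
                                 authURL=authURL)
    # Download the CA certificate bundle used for the SSL handshake
    download.file(url="http://curl.haxx.se/ca/cacert.pem",
                  destfile="cacert.pem")
    # Prints the authorization link and waits for the code from the browser
    twitCred$handshake(cainfo="cacert.pem")
    # Save the credentials so that later sessions can simply reload them
    save(list="twitCred", file="twitteR_credentials")
    load("twitteR_credentials")
    registerTwitterOAuth(twitCred)  # register the credentials with the twitteR session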
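Newer releases of twitteR replaced the ROAuth handshake with `setup_twitter_oauth()`, which takes the consumer key and secret plus an access token and access secret. The sketch below shows how that call might be wrapped. The helper names and the environment variable names are assumptions made for this example, and the access token and secret are assumed to have been generated for your registered application.

```r
# Hypothetical helper: reads the four keys from environment variables.
# The variable names are made up for this sketch; use whatever you prefer.
read_twitter_keys <- function() {
  vars <- c(consumer_key    = "TWITTER_CONSUMER_KEY",
            consumer_secret = "TWITTER_CONSUMER_SECRET",
            access_token    = "TWITTER_ACCESS_TOKEN",
            access_secret   = "TWITTER_ACCESS_SECRET")
  keys <- Sys.getenv(vars, unset = NA)
  names(keys) <- names(vars)
  missing <- names(keys)[is.na(keys)]
  if (length(missing) > 0) {
    stop("Missing environment variables for: ", paste(missing, collapse = ", "))
  }
  as.list(keys)
}

# Hypothetical wrapper around twitteR::setup_twitter_oauth().
# Needs twitteR installed and network access, so it is only defined here.
authenticate_twitter <- function(keys = read_twitter_keys()) {
  twitteR::setup_twitter_oauth(consumer_key    = keys$consumer_key,
                               consumer_secret = keys$consumer_secret,
                               access_token    = keys$access_token,
                               access_secret   = keys$access_secret)
}

# Usage, once the environment variables are set:
# authenticate_twitter()
```

Reading the keys from the environment keeps them out of the script, which helps if the code is shared or published.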
    
    
  4. streamR Package:
    1. This package allows users to fetch Twitter data in real time by connecting to the Twitter Streaming API.
    2. The package can be obtained from CRAN.
    3. It gives R users access to Twitter's filter, sample, and user streams, and parses the output into data frames.
    4. filterStream() - opens a connection to Twitter’s Streaming API that returns public statuses matching one or more filter predicates, such as search keywords.
    5. Tweets can be filtered by keywords, users, language, and location.
    6. The output can be saved as an object in memory or written to a text file.
    7. parseTweets() - parses tweets downloaded using filterStream, sampleStream or userStream and returns a data frame.
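The location filter is passed to filterStream() as a numeric vector of longitude/latitude pairs, with the south-west corner of the bounding box first. A small helper can make that ordering harder to get wrong. `make_bbox` is a made-up name for this sketch and is not part of streamR.

```r
# Hypothetical helper: builds a bounding box in the order used by the
# locations argument (south-west lon, lat, then north-east lon, lat).
make_bbox <- function(sw_lon, sw_lat, ne_lon, ne_lat) {
  # Coordinates must be valid longitudes and latitudes
  stopifnot(sw_lon >= -180, ne_lon <= 180, sw_lat >= -90, ne_lat <= 90)
  # The south-west corner must lie below and to the left of the north-east one
  stopifnot(sw_lon < ne_lon, sw_lat < ne_lat)
  c(sw_lon, sw_lat, ne_lon, ne_lat)
}

# Rough box around New York
nyc <- make_bbox(-74, 40, -73, 41)

# The vector would then be passed on like this (needs credentials and network;
# my_oauth stands for your own credential object):
# filterStream(file.name = "nyc.json", locations = nyc, timeout = 60, oauth = my_oauth)
```

Swapping the corners by mistake stops with an error instead of silently sending a bad box to the API.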
  5. The code example below shows how to fetch data in real time using streamR:
    library(streamR)
    library(twitteR)
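    load("twitteR_credentials")  # reuse the credentials saved by the previous code
    registerTwitterOAuth(twitCred)
    filterStream(file.name = "tweets.json", track = "#bigdata", timeout = 0,
                 locations = c(-74, 40, -73, 41), oauth = twitCred)
    Executing the above captures tweets that contain "#bigdata" or that come from inside the bounding box around New York. The Streaming API combines the track and locations filters with OR, so if you need both conditions, filter the parsed data afterwards. The locations vector lists the south-west corner (longitude, latitude) followed by the north-east corner. With timeout = 0 the connection stays open and fetches continuously; to fetch records for a fixed period, use for example timeout = 300 (fetches data for 300 seconds).
    To parse the fetched tweets, use the code below:
    tweets.df <- parseTweets("tweets.json")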
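Once the tweets are in a data frame, ordinary R subsetting applies. The sketch below uses a small hand-built data frame as a stand-in for the parseTweets() output, so it runs without a Twitter connection. The rows are invented, and the only columns assumed are `text`, `screen_name` and `lang`.

```r
# Stand-in for the result of parseTweets("tweets.json"); the rows are invented.
tweets.df <- data.frame(
  text        = c("Learning #bigdata with R", "Coffee time", "#BigData meetup tonight"),
  screen_name = c("alice", "bob", "alice"),
  lang        = c("en", "en", "en"),
  stringsAsFactors = FALSE
)

# Keep only the tweets whose text mentions the hashtag (case-insensitive)
has_tag    <- grepl("#bigdata", tweets.df$text, ignore.case = TRUE)
bigdata.df <- tweets.df[has_tag, ]

# Count the matching tweets per user, most active first
tweets_per_user <- sort(table(bigdata.df$screen_name), decreasing = TRUE)
print(tweets_per_user)
```

The same pattern works for the `lang` column, for instance `tweets.df[tweets.df$lang == "en", ]`.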