Fetch Twitter data using R
- twitteR Package:
- One of the packages available in R for fetching Twitter data. The package can be obtained from CRAN (cran.r-project.org).
- This package lets us make REST API calls to Twitter using the ConsumerKey and ConsumerSecret codes. The code below illustrates how to extract Twitter data.
- This package offers the following functionality:
  - Authenticate with the Twitter API
  - Fetch a user's timeline
  - User followers
  - User mentions
  - Search Twitter
  - User information
  - User trends
  - Convert JSON objects to data frames
- REST API calls using the R twitteR package:
  - Register your application with Twitter.
  - After registration, you will receive a ConsumerKey and ConsumerSecret, which are needed to call the Twitter API.
  - Load the twitteR library in the R environment.
  - Create the credentials object with the OAuthFactory$new() method, passing the ConsumerKey and ConsumerSecret as input parameters.
  - Calling handshake() on that object returns an authorization link, which needs to be copied and pasted into a web browser.
  - You will be redirected to the Twitter application authentication page, where you authenticate yourself with your Twitter credentials.
  - After authenticating, you will be given an authorization code, which needs to be pasted into the R console.
  - Call registerTwitterOAuth().
- In addition to the functionality listed earlier, the package can fetch friends information and run location-based queries.
- Source Code:
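    library(twitteR)
    library(ROAuth)  # provides OAuthFactory

    # Twitter OAuth endpoints
    requestURL <- "https://api.twitter.com/oauth/request_token"
    accessURL <- "https://api.twitter.com/oauth/access_token"
    authURL <- "https://api.twitter.com/oauth/authorize"

    # Keys issued when the application was registered with Twitter
    consumerKey <- "XXXXXXXXXXXX"
    consumerSecret <- "XXXXXXXXXXXXXXXX"

    twitCred <- OAuthFactory$new(consumerKey = consumerKey,
                                 consumerSecret = consumerSecret,
                                 requestURL = requestURL,
                                 accessURL = accessURL,
                                 authURL = authURL)

    # Download a CA certificate bundle and run the OAuth handshake;
    # open the printed link in a browser and paste the code into the console
    download.file(url = "http://curl.haxx.se/ca/cacert.pem", destfile = "cacert.pem")
    twitCred$handshake(cainfo = "cacert.pem")

    # Save the credentials so that later sessions can reload them
    save(list = "twitCred", file = "twitteR_credentials")
    load("twitteR_credentials")

    # Register your app with Twitter.
    registerTwitterOAuth(twitCred)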
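Once the session is authenticated, the functionality listed earlier can be exercised with a few calls. The following is a minimal sketch, not a definitive implementation: it assumes the twitteR package is installed and that registerTwitterOAuth(twitCred) has already succeeded. The wrapper names search_to_df and timeline_to_df and the placeholder screen name are hypothetical and introduced only for illustration; searchTwitter(), userTimeline() and twListToDF() are functions exported by twitteR.

```r
# Sketch only: assumes twitteR is installed and the session is already
# authenticated with registerTwitterOAuth(twitCred).

# Search Twitter for a query string and return the results as a data frame
# (search_to_df is a hypothetical helper name)
search_to_df <- function(query, n = 50) {
  tweets <- twitteR::searchTwitter(query, n = n)
  twitteR::twListToDF(tweets)
}

# Fetch a user's recent timeline and return it as a data frame
# (timeline_to_df is a hypothetical helper name)
timeline_to_df <- function(screen_name, n = 20) {
  tweets <- twitteR::userTimeline(screen_name, n = n)
  twitteR::twListToDF(tweets)
}

# Example calls (these need network access and valid credentials):
# bigdata.df  <- search_to_df("#bigdata", n = 100)
# timeline.df <- timeline_to_df("some_user")  # placeholder screen name
```

Defining the wrappers does not contact Twitter. Only the commented example calls do, so the block can be sourced safely before authentication.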
- streamR Package:
- This package allows users to fetch Twitter data in real time by connecting to the Twitter Streaming API.
- The package can be obtained from CRAN.
- It lets R users access Twitter's search and user streams and parse the output into data frames. Two important functions are:
  - filterStream() - opens a connection to Twitter's Streaming API that returns public statuses matching one or more filter predicates, such as search keywords.
    - Tweets can be filtered by keywords, users, language, and location.
    - The output can be saved as an object in memory or written to a text file.
  - parseTweets() - parses tweets downloaded using filterStream, sampleStream or userStream and returns a data frame.
- The code example below shows how to fetch data in real time using streamR:

    library(streamR)
    library(twitteR)

    # Load the credentials saved by the previous code
    load("twitteR_credentials")
    registerTwitterOAuth(twitCred)

    filterStream(file.name = "tweets.json", track = "#bigdata", timeout = 0,
                 locations = c(-74, 40, -73, 41), oauth = twitCred)

Executing the above captures tweets on "#bigdata" and tweets from the New York area. Note that Twitter's Streaming API treats track and locations as alternatives (OR), so tweets matching either the keyword or the bounding box are returned. Setting timeout = 0 keeps the connection open and fetches continuously. To fetch records for a fixed period instead, set a positive value: timeout = 300 fetches data for 300 seconds.

To parse the fetched tweets, use the code below:

    tweets.df <- parseTweets("tweets.json")
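The locations argument is a bounding box given as south-west longitude, south-west latitude, north-east longitude, north-east latitude. Because the Streaming API combines track and locations with OR, a post-filter is one way to keep only tweets that match both the hashtag and the box. The sketch below uses base R only. The helper in_bounding_box and the toy data frame are hypothetical stand-ins for the output of parseTweets(), assumed here to expose text, lon and lat columns.

```r
# Bounding box in the order used by filterStream():
# south-west longitude, south-west latitude, north-east longitude, north-east latitude
nyc_box <- c(-74, 40, -73, 41)

# Returns TRUE for points inside the box; missing coordinates yield FALSE
# (in_bounding_box is a hypothetical helper, not part of streamR)
in_bounding_box <- function(lon, lat, box) {
  inside <- lon >= box[1] & lat >= box[2] & lon <= box[3] & lat <= box[4]
  !is.na(inside) & inside
}

# Hypothetical stand-in for the data frame returned by parseTweets()
tweets.df <- data.frame(
  text = c("Learning #bigdata tools", "Lunch in Manhattan", "#BigData meetup tonight"),
  lon  = c(-73.98, -73.97, NA),
  lat  = c(40.75, 40.78, NA),
  stringsAsFactors = FALSE
)

# Keep only tweets that mention the hashtag AND fall inside the box
has_tag <- grepl("#bigdata", tweets.df$text, ignore.case = TRUE)
in_nyc  <- in_bounding_box(tweets.df$lon, tweets.df$lat, nyc_box)
both    <- tweets.df[has_tag & in_nyc, ]

print(both$text)
```

In this toy data only the first row satisfies both conditions: the second lacks the hashtag and the third has no coordinates.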