Web Scraping Tutorial 3: Scrolling Down and Expanding Reviews by Pressing More
2024-05-23
Introduction
In this tutorial, we will look at how we can use RSelenium to:
- Scroll down the review page
- Expand reviews by pressing More
We will also extract the review text along with the timestamp and the rating given.
Step 0: Installing Libraries
package.name <- c("tidyverse", "RSelenium")
for (i in package.name) {
  # Install the package only if it cannot be loaded
  if (!require(i, character.only = TRUE)) {
    install.packages(i)
  }
  library(i, character.only = TRUE)
}
Loading required package: tidyverse
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.0 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Loading required package: RSelenium
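If you would rather see which packages are missing before anything gets installed, the check can be separated from the installation. The following is a minimal sketch, not part of the original tutorial code: `missing_packages()` is a hypothetical helper name, and it relies only on base R's `requireNamespace()`.

```r
# Hypothetical helper: return the entries of `pkgs` that are not installed.
# requireNamespace() returns FALSE (instead of erroring) when a package
# cannot be found, so it is safe to use as a test.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("tidyverse", "RSelenium"))
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to actually install them
}
```

This keeps the decision to install in your hands, which can be useful on shared machines where installing packages needs extra care.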
Step 1: Start a headless Firefox browser
The code for starting a headless Firefox browser is shown below:
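driver <- rsDriver(
  browser = "firefox",
  chromever = NULL,   # skip the Chrome driver, which Firefox does not need
  verbose = FALSE,
  extraCapabilities = list("firefoxOptions" = list(args = list("--headless")))
)
# The client object is what we use to control the browser
web_driver <- driver[["client"]]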
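Depending on the Selenium and geckodriver versions in use, the headless argument may need to be passed under the vendor-prefixed key `moz:firefoxOptions`, which is the capability name geckodriver reads. The sketch below shows one way to build that list. `firefox_caps()` is a hypothetical helper, not an RSelenium function, and the browser-starting lines are commented out because they need a working Selenium setup.

```r
# Hypothetical helper (not part of RSelenium): build the extraCapabilities
# list for Firefox. Assumption: geckodriver takes browser arguments from the
# "moz:firefoxOptions" capability.
firefox_caps <- function(headless = TRUE) {
  args <- if (headless) list("--headless") else list()
  list("moz:firefoxOptions" = list(args = args))
}

caps <- firefox_caps(headless = TRUE)
str(caps)  # inspect the nested list before passing it on

# Not run here, because it needs Selenium and geckodriver to be available:
# driver <- RSelenium::rsDriver(browser = "firefox", chromever = NULL,
#                               verbose = FALSE, extraCapabilities = caps)
# web_driver <- driver[["client"]]
# ... scraping steps ...
# web_driver$close()     # close the browser session
# driver$server$stop()   # stop the Selenium server process
```

Closing the client and stopping the server at the end of a session frees the port, so a later `rsDriver()` call does not fail because the port is still in use.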
Once I execute the rsDriver() call above, a Firefox session starts in the
background, as shown below.