Tuesday, March 31, 2020

Blog 19: Select Everything



Introduction

There are certain situations where we want some of the columns to appear together at the front of a data frame, with the rest of the columns pushed to the end. This is normally encountered in the following cases:

  • Creating new column based on existing column
  • Renaming columns in the data set

The first case is the most common one: new columns created using mutate are placed at the end of the data frame by default. Let's look at some examples where we can use the everything function inside select to arrange the columns appropriately.
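Before moving to the real data, here is a minimal sketch of the behaviour described above. The data frame toy and its columns (id, price, qty, total) are made up for illustration and are not part of the Cigar data.

```r
library(dplyr)

# Hypothetical toy data, used only to illustrate the column order
toy <- data.frame(id = 1:3, price = c(10, 20, 30), qty = c(2, 5, 1))

# mutate() appends the new column after the existing ones
with_total <- toy %>%
  mutate(total = price * qty)
names(with_total)

# Naming the columns we want first and ending with everything()
# moves them to the front and keeps all remaining columns
reordered <- with_total %>%
  select(id, total, everything())
names(reordered)
```

The first names() call should list total last, while the second lists it right after id.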


Installing the libraries: dplyr, tidyr and Ecdat

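# require() loads the package if it is installed and returns FALSE otherwise,
# so we only need to install and then load when it is missing
if(!require("dplyr")){
  install.packages("dplyr")
  library(dplyr)
}

if(!require("tidyr")){
  install.packages("tidyr")
  library(tidyr)
}

# Ecdat provides the Cigar (cigarette consumption) data set
if(!require("Ecdat")){
  install.packages("Ecdat")
  library(Ecdat)
}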

data(Cigar)
df<-Cigar
dim(df)
[1] 1380    9


First few records

head(df)
  state year price  pop  pop16  cpi      ndi sales pimin
1     1   63  28.6 3383 2236.5 30.6 1558.305  93.9  26.1
2     1   64  29.8 3431 2276.7 31.0 1684.073  95.4  27.5
3     1   65  29.8 3486 2327.5 31.5 1809.842  98.5  28.9
4     1   66  31.5 3524 2369.7 32.4 1915.160  96.4  29.5
5     1   67  31.6 3533 2393.7 33.4 2023.546  95.5  29.6
6     1   68  35.6 3522 2405.2 34.8 2202.486  88.4  32.0

Let's say we want state, year, price and sales together at the front, with the rest of the columns following them.


Using the everything function: Example 1

interim.df<-df%>%
  select(state,year,price,sales,everything())
 
head(interim.df,2)
  state year price sales  pop  pop16  cpi      ndi pimin
1     1   63  28.6  93.9 3383 2236.5 30.6 1558.305  26.1
2     1   64  29.8  95.4 3431 2276.7 31.0 1684.073  27.5
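A quick way to convince ourselves that this pattern only reorders columns, and neither drops nor duplicates any, is to compare the column names before and after. The sketch below uses a small made-up data frame (toy, with columns a to d) so that it runs without the Cigar data.

```r
library(dplyr)

# Hypothetical toy data with four columns
toy <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)

# Bring c and a to the front; everything() supplies the remaining columns
front <- toy %>%
  select(c, a, everything())

names(front)

# Same set of columns and same number of columns as before
same_columns <- setequal(names(toy), names(front))
same_width <- ncol(front) == ncol(toy)
```

Columns already named in select are not repeated by everything(), so the width of the data frame stays the same.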


Using the everything function: Example 2

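Let's say we create a year_new variable by prefixing 19 to the year column. Because paste0 returns text, year_new is a character column. We place it near the front in the same pipeline, and the original year column is picked up by everything() along with the remaining columns.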

finaldf<-df%>%
  mutate(year_new=paste0("19",year))%>%
  select(state,year_new,price,sales,everything())

head(finaldf,2)
  state year_new price sales year  pop  pop16  cpi      ndi pimin
1     1     1963  28.6  93.9   63 3383 2236.5 30.6 1558.305  26.1
2     1     1964  29.8  95.4   64 3431 2276.7 31.0 1684.073  27.5
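The introduction also mentions renaming columns. One possible way to handle that case, sketched here on a small made-up data frame rather than on the Cigar data, is to rename inside select using new_name = old_name and let everything() bring along the rest. The name yr is an arbitrary choice for illustration.

```r
library(dplyr)

# Hypothetical toy rows, used only for illustration
toy <- data.frame(state = c(1, 1), price = c(28.6, 29.8), year = c(63, 64))

# Rename year to yr and move it next to state in a single step
renamed <- toy %>%
  select(state, yr = year, everything())

names(renamed)
```

The renamed column appears under its new name only; the old name year is not kept.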


Final Comments

Ending a select call with everything() is a handy way to move chosen columns to the front without having to list all the remaining columns by name.

