Machine Learning Made Easy: March 2020

Select Everything

Introduction

There are certain situations where we want some of the columns to appear together at the front of a data frame, with the rest of the columns pushed to the end of the sequence. This situation is normally encountered in the following cases:

Creating a new column based on an existing column
Renaming columns in the data set

This happens most often when we create new columns. Normally, new columns created using mutate are placed at the end of the data frame. Let's look at some examples where we can use the everything function to arrange the columns appropriately.
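To make the default behaviour concrete, here is a minimal sketch on a small made-up data frame (the names toy, value and double_value are purely illustrative and are not part of the Cigar data). It shows mutate adding the new column last, and select with everything bringing it forward.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(id = 1:3, value = c(10, 20, 30))

# mutate() appends the new column after the existing ones
with_new <- toy %>%
  mutate(double_value = value * 2)
names(with_new)
# [1] "id"           "value"        "double_value"

# everything() stands for all columns not already named in select()
reordered <- with_new %>%
  select(id, double_value, everything())
names(reordered)
# [1] "id"           "double_value" "value"
```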

Installing the libraries: dplyr, tidyr and Ecdat

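# require() loads the package if it is installed and returns FALSE otherwise,
# so we install and then load the package only when it is missing
if(!require("dplyr")){
  install.packages("dplyr")
  library(dplyr)
}

if(!require("tidyr")){
  install.packages("tidyr")
  library(tidyr)
}

# Ecdat provides the Cigar (cigarette consumption) data set
if(!require("Ecdat")){
  install.packages("Ecdat")
  library(Ecdat)
}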

data(Cigar)
df<-Cigar
dim(df)

[1] 1380    9

First few records

head(df)

  state year price  pop  pop16  cpi      ndi sales pimin
1     1   63  28.6 3383 2236.5 30.6 1558.305  93.9  26.1
2     1   64  29.8 3431 2276.7 31.0 1684.073  95.4  27.5
3     1   65  29.8 3486 2327.5 31.5 1809.842  98.5  28.9
4     1   66  31.5 3524 2369.7 32.4 1915.160  96.4  29.5
5     1   67  31.6 3533 2393.7 33.4 2023.546  95.5  29.6
6     1   68  35.6 3522 2405.2 34.8 2202.486  88.4  32.0

Let's say we want state, year, price and sales together at the front, with the rest of the columns after them.

Using the everything function: Example 1

interim.df<-df%>%
  select(state,year,price,sales,everything())
 
head(interim.df,2)

  state year price sales  pop  pop16  cpi      ndi pimin
1     1   63  28.6  93.9 3383 2236.5 30.6 1558.305  26.1
2     1   64  29.8  95.4 3431 2276.7 31.0 1684.073  27.5
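As a side note, newer versions of dplyr (1.0.0 and later, which is an assumption about the installed version) also provide relocate, which moves the named columns to the front by default. The sketch below rebuilds the first two rows of the head(df) output by hand, so that it runs without Ecdat, and checks that both approaches give the same column order. The name cigar_head is made up for this illustration.

```r
library(dplyr)

# First two rows of the Cigar data, copied from the head(df) output above
cigar_head <- data.frame(
  state = c(1, 1), year = c(63, 64), price = c(28.6, 29.8),
  pop = c(3383, 3431), pop16 = c(2236.5, 2276.7), cpi = c(30.6, 31.0),
  ndi = c(1558.305, 1684.073), sales = c(93.9, 95.4), pimin = c(26.1, 27.5)
)

# select() + everything(): name the leading columns, then keep the rest
selected <- cigar_head %>%
  select(state, year, price, sales, everything())

# relocate() (dplyr >= 1.0.0): moves the named columns to the front
relocated <- cigar_head %>%
  relocate(state, year, price, sales)

identical(names(selected), names(relocated))
# [1] TRUE
```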

Using the everything function: Example 2

Let's say we create a year_new variable by prefixing 19 to the year attribute. Since mutate places year_new at the end of the data frame, let's see how we can bring it forward with everything.

finaldf<-df%>%
  mutate(year_new=paste0("19",year))%>%
  select(state,year_new,price,sales,everything())

head(finaldf,2)

  state year_new price sales year  pop  pop16  cpi      ndi pimin
1     1     1963  28.6  93.9   63 3383 2236.5 30.6 1558.305  26.1
2     1     1964  29.8  95.4   64 3431 2276.7 31.0 1684.073  27.5
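One thing to keep in mind is that paste0 returns text, so year_new in the example above is a character column. If a numeric year is needed, for instance for sorting or arithmetic, a possible variation is to add 1900 instead. The sketch below uses a hand-built subset of the first two rows (cigar_head is a made-up name) and assumes all years fall in the 1900s, as they do in the rows shown.

```r
library(dplyr)

# A few columns from the first two rows of the Cigar data shown above
cigar_head <- data.frame(
  state = c(1, 1), year = c(63, 64),
  price = c(28.6, 29.8), sales = c(93.9, 95.4)
)

# paste0() builds a character column
char_year <- cigar_head %>%
  mutate(year_new = paste0("19", year)) %>%
  select(state, year_new, everything())
class(char_year$year_new)
# [1] "character"

# Adding 1900 keeps the column numeric (assumes two-digit years from the 1900s)
num_year <- cigar_head %>%
  mutate(year_new = 1900 + year) %>%
  select(state, year_new, everything())
class(num_year$year_new)
# [1] "numeric"
```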