Market Mix Modelling
Parag Verma
16th October, 2022
Introduction
Once a company manufactures its product, it needs a sound channel to push it for sales. This includes making customers aware of the product. This in turn is done through two broad types of Marketing Channels:
- Direct Marketing: Includes promotional channels such as mail, newsletters, ads, websites, etc.
- Indirect Marketing: Includes social media, referrals, live events, etc.
A company uses both these strategies to market its products. All these marketing programs incur costs, so it is very important to understand how effective each channel is at increasing customer sales. Based on how much each marketing tactic influences sales, the spend can then be optimized.
Some commonly used tactics for Direct and Indirect Marketing are shown below.
Step 0: Importing the libraries
In this blog, we will take a sample marketing dataset from the datarium package and leverage adstock modelling to attribute product sales to different marketing activities.
If you are unable to run the below piece of code, you can import the dataset from my GitHub repository.
package.name<-c("dplyr","tidyr","datarium","stats",
                "ggplot2","plotly","corrplot","ggcorrplot","RColorBrewer")
# Install any missing package, then load it
for(i in package.name){
  if(!require(i,character.only = TRUE)){
    install.packages(i)
  }
  library(i,character.only = TRUE)
}
Step 1: Importing the dataset
data("marketing")
df<-marketing%>%
select(sales,everything())
head(df)
sales youtube facebook newspaper
1 26.52 276.12 45.36 83.04
2 12.48 53.40 47.16 54.12
3 11.16 20.64 55.08 83.16
4 22.20 181.80 49.56 70.20
5 15.48 216.96 12.96 70.08
6 8.64 10.44 58.68 90.00
The attributes are as follows:
- youtube :Advertising budget in thousand dollars
- facebook :Advertising budget in thousand dollars
- newspaper :Advertising budget in thousand dollars
- sales :Sales figures in thousand dollars
Let's assume that these are the data points for the last 200 weeks for a small market.
Step 2: Correlation Between Variables
Since we are trying to find the impact of various marketing channels on sales, let's start by looking at the correlation between the variables.
cor_matrix <-round(cor(df),2)
cor_matrix
sales youtube facebook newspaper
sales 1.00 0.78 0.58 0.23
youtube 0.78 1.00 0.05 0.06
facebook 0.58 0.05 1.00 0.35
newspaper 0.23 0.06 0.35 1.00
ggcorrplot(cor_matrix, hc.order = TRUE, type = "lower",
lab = TRUE,insig = "blank")
We can see that
- Sales has a high correlation with youtube spend
- Sales has a medium correlation with facebook spend
- Sales has a low correlation with newspaper spend
Since the correlations among facebook, youtube and newspaper spend are low (the highest is 0.35, between facebook and newspaper), multicollinearity is unlikely to be a serious issue.
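As a rough cross-check on the multicollinearity point, variance inflation factors (VIFs) can be read off the diagonal of the inverse of the predictors' correlation matrix. The sketch below rebuilds that matrix by hand from the rounded values printed above, so the object names `channels` and `pred_cor` are assumptions made for the illustration and not part of the original analysis.

```r
# Correlations among the three spend variables, copied from the rounded
# cor_matrix output above (pred_cor is an illustrative name)
channels <- c("youtube", "facebook", "newspaper")
pred_cor <- matrix(c(1.00, 0.05, 0.06,
                     0.05, 1.00, 0.35,
                     0.06, 0.35, 1.00),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(channels, channels))

# The VIF of each predictor is the matching diagonal element of the
# inverse correlation matrix
vif <- diag(solve(pred_cor))
round(vif, 2)
```

Values close to 1 mean a predictor is barely explained by the other predictors. With the full dataset loaded, `cor(df[, channels])` could be used in place of the hand-typed matrix.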
Now let's convert the marketing expenses into adstock variables.
Step 3: Defining Adstock Function
Any marketing activity has an impact on the end customer, and this effect decays with time. For some channels, such as TV, the impact is higher and decays gradually, whereas for other channels, such as YouTube ads, it decays very rapidly.
Our data has three marketing channels:
- facebook: ads on facebook are very limited and hence we will assume it to have a rapid decline rate
- youtube: ads on youtube are in the form of colorful video and hence we will assume it to have a moderate decline rate
- newspaper: ads on newspaper are in the form of front page displays and hence have a relatively higher retention rate
We will now create the decay rate for each of these three channels
Step 3A: Defining Adstock Rate for facebook ads
In modelling adstock, we will also assume that the decaying effect of ad exposure takes the form of a weighted moving average of current and past spend. Hence it is important to define how many past periods we want to include in the moving average term.
Let's say we take a decay rate of 0.1 for facebook and the spend for three consecutive periods is as follows:
- Period 1: 10
- Period 2: 15
- Period 3: 20
Then Adstock for Period 3 will be calculated as: 10* 0.1^2 + 15* 0.1 + 20, which will be equal to 21.6
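The arithmetic above can be reproduced in a couple of lines. This is only a sketch; `spend`, `decay_rate` and `lag_weights` are names made up for the illustration.

```r
spend <- c(10, 15, 20)   # spend in Periods 1 to 3
decay_rate <- 0.1

# The oldest period gets the highest power of the decay rate
lag_weights <- decay_rate ^ c(2, 1, 0)

# Adstock for Period 3: 10 * 0.1^2 + 15 * 0.1 + 20
adstock_period_3 <- sum(spend * lag_weights)
adstock_period_3
```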
decay_rate_fb <- 0.1
past_memory <- 2
get_adstock_fb <- rep(decay_rate_fb, past_memory+1) ^ c(0:past_memory)
get_adstock_fb
[1] 1.00 0.10 0.01
In short, the effect decays to 10% in the subsequent period and then becomes 1% in the period after that.
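One way to sanity-check the choice of `past_memory` is to ask how much of an untruncated geometric decay the chosen window keeps. The sketch below rests on an assumption that is not in the original analysis, namely that the carryover would otherwise follow an infinite geometric series; `captured_share` is a hypothetical helper written for this illustration.

```r
# Share of the total (infinite) geometric carryover weight retained by a
# window covering the current period plus `past_memory` lags.
# Truncated sum: (1 - r^(m + 1)) / (1 - r); infinite sum: 1 / (1 - r).
captured_share <- function(decay_rate, past_memory) {
  stopifnot(decay_rate >= 0, decay_rate < 1, past_memory >= 0)
  1 - decay_rate ^ (past_memory + 1)
}

captured_share(decay_rate = 0.1, past_memory = 2)
captured_share(decay_rate = 0.5, past_memory = 2)
```

With a decay rate of 0.1, two lags already cover 99.9% of the weight, whereas a hypothetical slowly decaying channel with a rate of 0.5 would keep only 87.5%, which would argue for a longer window.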
Let's look at the first few records of the facebook column and try to come up with the transformed variable.
df[["facebook"]][1:10]
[1] 45.36 47.16 55.08 49.56 12.96 58.68 39.36 23.52 2.52 3.12
Let's compute the third term by hand:
45.36* 0.01 + 47.16*0.1 + 55.08 which gives 60.2496
We will check and see if we get the same through adstock transformation
ads_fb <- stats::filter(c(rep(0, past_memory), df[["facebook"]]),
                        filter = get_adstock_fb,
                        method = "convolution",
                        sides = 1) # weights apply to the current and past periods only
ads_fb <- ads_fb[!is.na(ads_fb)] # Removing leading NA
ads_fb[1:5]
[1] 45.3600 51.6960 60.2496 55.5396 18.4668
We have padded the series with two zeroes (rep(0, past_memory)) so that we get a valid term for the first facebook expense, which from our data is 45.36. With the two zeroes added, the moving average term for the first period is 0.01 * 0 + 0.1 * 0 + 45.36. Without zero padding, the moving average term for the first period would be NA, as there is no past record for the first instance of facebook spend.
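To see the effect of the padding, the sketch below runs the same kind of filter with and without the two leading zeroes on the first five facebook values shown earlier. It passes `sides = 1` so that the weights apply to the current and past periods only; `fb_sample`, `lag_weights`, `no_pad` and `with_pad` are illustrative names.

```r
fb_sample   <- c(45.36, 47.16, 55.08, 49.56, 12.96)
lag_weights <- c(1, 0.1, 0.01)   # decay rate 0.1, two periods of memory

# Without padding: the first two periods lack a full history, so they are NA
no_pad <- stats::filter(fb_sample, filter = lag_weights,
                        method = "convolution", sides = 1)
as.numeric(no_pad)

# With padding: missing history counts as zero spend, so every period
# gets a value once the two padded positions are dropped
with_pad <- stats::filter(c(0, 0, fb_sample), filter = lag_weights,
                          method = "convolution", sides = 1)
as.numeric(with_pad)[-(1:2)]
```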
Now we can check the third term and it matches our calculation which is 60.2496
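Since the same steps apply to every channel, they could be wrapped in a small helper. The function below is a hypothetical convenience wrapper (the name `adstock_transform` and its arguments are not from the original code); it assumes the same geometric weights and zero padding described above.

```r
# Hypothetical helper: geometric adstock with a finite lookback window
adstock_transform <- function(spend, decay_rate, past_memory = 2) {
  lag_weights <- decay_rate ^ (0:past_memory)   # 1, r, r^2, ...
  padded <- c(rep(0, past_memory), spend)       # zero spend before week 1
  out <- stats::filter(padded, filter = lag_weights,
                       method = "convolution", sides = 1)
  # Keep only the positions that correspond to the original periods
  as.numeric(out)[past_memory + seq_along(spend)]
}

# First five facebook values from the data shown earlier
adstock_transform(c(45.36, 47.16, 55.08, 49.56, 12.96), decay_rate = 0.1)
```

The third value returned here should agree with the hand calculation of 60.2496.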
Let's plot the facebook adstock.
fb_df<-data.frame(Week=1:nrow(df),
Fb_Spend=df[["facebook"]],
Fb_Adstock=ads_fb)
head(fb_df)
Week Fb_Spend Fb_Adstock
1 1 45.36 45.3600
2 2 47.16 51.6960
3 3 55.08 60.2496
4 4 49.56 55.5396
5 5 12.96 18.4668
6 6 58.68 60.4716
p1<-ggplot(data = fb_df, aes(x=Week, y=Fb_Spend)) +
  # Original weekly spend as blue segments
  geom_segment(aes(xend=Week, yend=0), color="blue") +
  # Adstock as a red line; colour is set outside aes() so it is a literal colour
  geom_line(aes(y = Fb_Adstock), colour = "red",
            size = 1) +
  xlab("Week") + ylab("Facebook Adstock")+
  theme(text = element_text(size=15),
        axis.text.x=element_text(size=15),
        axis.text.y=element_text(size=15))
p1
The blue segments represent the original spend, whereas the red line represents the adstock-transformed spend.
We will repeat the above transformation for youtube and newspaper ads.
Step 3B: Defining Adstock Rate for YouTube ads
We will assume that the effect of YouTube ads fades more slowly than that of facebook ads, and set its decay rate to 0.15, so that 15% of the effect carries over to the next period.
decay_rate_yt <- 0.15
past_memory <- 2
get_adstock_yt <- rep(decay_rate_yt, past_memory+1) ^ c(0:past_memory)
get_adstock_yt
[1] 1.0000 0.1500 0.0225
ads_yt <- stats::filter(c(rep(0, past_memory), df[["youtube"]]),
                        filter = get_adstock_yt,
                        method = "convolution",
                        sides = 1) # weights apply to the current and past periods only
ads_yt <- ads_yt[!is.na(ads_yt)] # Removing leading NA
ads_yt[1:5]
[1] 276.1200 94.8180 34.8627 186.0975 244.6944
Let's plot the YouTube adstock.
yt_df<-data.frame(Week=1:nrow(df),
Yt_Spend=df[["youtube"]],
Yt_Adstock=ads_yt)
head(yt_df)
Week Yt_Spend Yt_Adstock
1 1 276.12 276.1200
2 2 53.40 94.8180
3 3 20.64 34.8627
4 4 181.80 186.0975
5 5 216.96 244.6944
6 6 10.44 47.0745
p2<-ggplot(data = yt_df, aes(x=Week, y=Yt_Spend)) +
  # Original weekly spend as blue segments
  geom_segment(aes(xend=Week, yend=0), color="blue") +
  # Adstock as a red line; colour is set outside aes() so it is a literal colour
  geom_line(aes(y = Yt_Adstock), colour = "red",
            size = 1) +
  xlab("Week") + ylab("Youtube Adstock")+
  theme(text = element_text(size=15),
        axis.text.x=element_text(size=15),
        axis.text.y=element_text(size=15))
p2
The blue segments represent the original spend, whereas the red line represents the adstock-transformed spend.