### There are instances when a lot of columns are present in our data frame and we have to lets say subtract or add 2 columns in groups.In such cases we can use string match to identify set of columns that are to be subtracted/added and repeat the operation across all the columns¶

In [ ]:

```
import numpy as np
import pandas as pd
```

### Lets look at the PE ratios of some of the banks in India.Again this is a dummy data and just for illustration purpose.The main goal is to use regular expression to perform columnar operation¶

In [130]:

```
Week_Num=['Week1','Week2','Week3','Week4','Week5']
ICICI_CurrentWeek = pd.Series(np.random.rand(5))
ICICI_LastWeek = pd.Series(np.random.rand(5))
HDFC_CurrentWeek = pd.Series(np.random.rand(5))
HDFC_LastWeek = pd.Series(np.random.rand(5))
AXIS_CurrentWeek = pd.Series(np.random.rand(5))
AXIS_LastWeek = pd.Series(np.random.rand(5))
SBI_CurrentWeek = pd.Series(np.random.rand(5))
SBI_LastWeek = pd.Series(np.random.rand(5))
df=pd.DataFrame([Week_Num,
ICICI_CurrentWeek,ICICI_LastWeek,
HDFC_CurrentWeek,HDFC_LastWeek,
AXIS_CurrentWeek,AXIS_LastWeek,
SBI_CurrentWeek,SBI_LastWeek]).T
df.columns=['Week_Num','ICICI_CurrentWeek','ICICI_LastWeek',
'HDFC_CurrentWeek','HDFC_LastWeek',
'AXIS_CurrentWeek','AXIS_LastWeek',
'SBI_CurrentWeek','SBI_LastWeek']
df
```

Out[130]:

Week_Num | ICICI_CurrentWeek | ICICI_LastWeek | HDFC_CurrentWeek | HDFC_LastWeek | AXIS_CurrentWeek | AXIS_LastWeek | SBI_CurrentWeek | SBI_LastWeek | |
---|---|---|---|---|---|---|---|---|---|

0 | Week1 | 0.537536 | 0.621686 | 0.025014 | 0.81958 | 0.741462 | 0.172415 | 0.949251 | 0.600982 |

1 | Week2 | 0.217225 | 0.805213 | 0.918463 | 0.621928 | 0.148297 | 0.775873 | 0.586337 | 0.319201 |

2 | Week3 | 0.24441 | 0.695069 | 0.789786 | 0.521547 | 0.739275 | 0.767115 | 0.550886 | 0.346773 |

3 | Week4 | 0.184179 | 0.371545 | 0.91638 | 0.346759 | 0.743964 | 0.568355 | 0.078098 | 0.35065 |

4 | Week5 | 0.616801 | 0.214047 | 0.092583 | 0.441851 | 0.561522 | 0.259982 | 0.364596 | 0.390632 |

### The goal here is to subtract ICICI current week numebrs from last week.This has to be repeated for other banks as well.Lets set up Identifying patterns for column name¶

In [14]:

```
col_nm_pattern=['ICICI','HDFC','AXIS','SBI']
```

#### For each value in the above pattern, we need to identify Current week and last week columns and need to subtract them¶

In [131]:

```
l1=[]
lower_pos='LastWeek'
upper_pos='CurrentWeek'
for i in col_nm_pattern:
pos=[j for j in df.columns if i in j] # Identifies column names specific to only one Bank
v2 = df[[k for k in pos if upper_pos in k][0]] # Identifies current week values for the bank
v1 = df[[k for k in pos if lower_pos in k][0]] # Identifies last week values for the bank
diff_val=v2 - v1 # SUbtract the two columns
interim_df=pd.DataFrame([df['Week_Num'],diff_val]).T
interim_df.columns = ['Week_Num',i + ' Numbers wrt ' + lower_pos]
l1.append(interim_df)
```

#### Consolidating all the values into a single data frame¶

In [132]:

```
final_df = pd.concat(l1,axis=1)
final_df
```

Out[132]:

Week_Num | ICICI Numbers wrt LastWeek | Week_Num | HDFC Numbers wrt LastWeek | Week_Num | AXIS Numbers wrt LastWeek | Week_Num | SBI Numbers wrt LastWeek | |
---|---|---|---|---|---|---|---|---|

0 | Week1 | -0.08415 | Week1 | -0.794566 | Week1 | 0.569047 | Week1 | 0.348269 |

1 | Week2 | -0.587988 | Week2 | 0.296535 | Week2 | -0.627576 | Week2 | 0.267136 |

2 | Week3 | -0.450659 | Week3 | 0.268239 | Week3 | -0.027841 | Week3 | 0.204113 |

3 | Week4 | -0.187366 | Week4 | 0.569621 | Week4 | 0.175609 | Week4 | -0.272551 |

4 | Week5 | 0.402753 | Week5 | -0.349268 | Week5 | 0.30154 | Week5 | -0.026036 |

#### Removing multiple instances of Week_Num column¶

In [133]:

```
final_df=final_df.loc[:,~final_df.columns.duplicated()]
final_df
```

Out[133]:

Week_Num | ICICI Numbers wrt LastWeek | HDFC Numbers wrt LastWeek | AXIS Numbers wrt LastWeek | SBI Numbers wrt LastWeek | |
---|---|---|---|---|---|

0 | Week1 | -0.08415 | -0.794566 | 0.569047 | 0.348269 |

1 | Week2 | -0.587988 | 0.296535 | -0.627576 | 0.267136 |

2 | Week3 | -0.450659 | 0.268239 | -0.027841 | 0.204113 |

3 | Week4 | -0.187366 | 0.569621 | 0.175609 | -0.272551 |

4 | Week5 | 0.402753 | -0.349268 | 0.30154 | -0.026036 |

### Merging all the data into a single data frame¶

In [134]:

```
final_consolidated=df.merge(final_df,on="Week_Num")
final_consolidated
```

Out[134]:

Week_Num | ICICI_CurrentWeek | ICICI_LastWeek | HDFC_CurrentWeek | HDFC_LastWeek | AXIS_CurrentWeek | AXIS_LastWeek | SBI_CurrentWeek | SBI_LastWeek | ICICI Numbers wrt LastWeek | HDFC Numbers wrt LastWeek | AXIS Numbers wrt LastWeek | SBI Numbers wrt LastWeek | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

0 | Week1 | 0.537536 | 0.621686 | 0.025014 | 0.81958 | 0.741462 | 0.172415 | 0.949251 | 0.600982 | -0.08415 | -0.794566 | 0.569047 | 0.348269 |

1 | Week2 | 0.217225 | 0.805213 | 0.918463 | 0.621928 | 0.148297 | 0.775873 | 0.586337 | 0.319201 | -0.587988 | 0.296535 | -0.627576 | 0.267136 |

2 | Week3 | 0.24441 | 0.695069 | 0.789786 | 0.521547 | 0.739275 | 0.767115 | 0.550886 | 0.346773 | -0.450659 | 0.268239 | -0.027841 | 0.204113 |

3 | Week4 | 0.184179 | 0.371545 | 0.91638 | 0.346759 | 0.743964 | 0.568355 | 0.078098 | 0.35065 | -0.187366 | 0.569621 | 0.175609 | -0.272551 |

4 | Week5 | 0.616801 | 0.214047 | 0.092583 | 0.441851 | 0.561522 | 0.259982 | 0.364596 | 0.390632 | 0.402753 | -0.349268 | 0.30154 | -0.026036 |

In [135]:

```
final_consolidated = final_consolidated.reindex(sorted(final_consolidated.columns), axis=1)
final_consolidated
```

Out[135]:

AXIS Numbers wrt LastWeek | AXIS_CurrentWeek | AXIS_LastWeek | HDFC Numbers wrt LastWeek | HDFC_CurrentWeek | HDFC_LastWeek | ICICI Numbers wrt LastWeek | ICICI_CurrentWeek | ICICI_LastWeek | SBI Numbers wrt LastWeek | SBI_CurrentWeek | SBI_LastWeek | Week_Num | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

0 | 0.569047 | 0.741462 | 0.172415 | -0.794566 | 0.025014 | 0.81958 | -0.08415 | 0.537536 | 0.621686 | 0.348269 | 0.949251 | 0.600982 | Week1 |

1 | -0.627576 | 0.148297 | 0.775873 | 0.296535 | 0.918463 | 0.621928 | -0.587988 | 0.217225 | 0.805213 | 0.267136 | 0.586337 | 0.319201 | Week2 |

2 | -0.027841 | 0.739275 | 0.767115 | 0.268239 | 0.789786 | 0.521547 | -0.450659 | 0.24441 | 0.695069 | 0.204113 | 0.550886 | 0.346773 | Week3 |

3 | 0.175609 | 0.743964 | 0.568355 | 0.569621 | 0.91638 | 0.346759 | -0.187366 | 0.184179 | 0.371545 | -0.272551 | 0.078098 | 0.35065 | Week4 |

4 | 0.30154 | 0.561522 | 0.259982 | -0.349268 | 0.092583 | 0.441851 | 0.402753 | 0.616801 | 0.214047 | -0.026036 | 0.364596 | 0.390632 | Week5 |

#### Shifting Week_Num to the left¶

In [157]:

```
final_consolidated=final_consolidated[['Week_Num'] + final_consolidated.columns.tolist()]
final_consolidated
```

Out[157]:

Week_Num | AXIS Numbers wrt LastWeek | AXIS_CurrentWeek | AXIS_LastWeek | HDFC Numbers wrt LastWeek | HDFC_CurrentWeek | HDFC_LastWeek | ICICI Numbers wrt LastWeek | ICICI_CurrentWeek | ICICI_LastWeek | SBI Numbers wrt LastWeek | SBI_CurrentWeek | SBI_LastWeek | Week_Num | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

0 | Week1 | 0.569047 | 0.741462 | 0.172415 | -0.794566 | 0.025014 | 0.81958 | -0.08415 | 0.537536 | 0.621686 | 0.348269 | 0.949251 | 0.600982 | Week1 |

1 | Week2 | -0.627576 | 0.148297 | 0.775873 | 0.296535 | 0.918463 | 0.621928 | -0.587988 | 0.217225 | 0.805213 | 0.267136 | 0.586337 | 0.319201 | Week2 |

2 | Week3 | -0.027841 | 0.739275 | 0.767115 | 0.268239 | 0.789786 | 0.521547 | -0.450659 | 0.24441 | 0.695069 | 0.204113 | 0.550886 | 0.346773 | Week3 |

3 | Week4 | 0.175609 | 0.743964 | 0.568355 | 0.569621 | 0.91638 | 0.346759 | -0.187366 | 0.184179 | 0.371545 | -0.272551 | 0.078098 | 0.35065 | Week4 |

4 | Week5 | 0.30154 | 0.561522 | 0.259982 | -0.349268 | 0.092583 | 0.441851 | 0.402753 | 0.616801 | 0.214047 | -0.026036 | 0.364596 | 0.390632 | Week5 |