Tuesday, September 7, 2021

Blog 7: Get Highest Salary for Each Department using Pandas Groupby

Highest Salary within a Dept using Pandas
In [17]:
#Adjusting the cell width and margin
from IPython.display import display, HTML

display(HTML(data="""
<style>
    div#notebook-container    { width: 100%; }
    div#notebook-container    { margin-left: -2.8%; }
    div#menubar-container     { width: 65%; }
    div#maintoolbar-container { width: 99%; }
</style>
"""))
In [ ]:
# Importing the libraries
import pandas as pd
import numpy as np
In [18]:
# Creating a sample data frame
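df = pd.DataFrame(np.array(
    [['Production', 'Production', 'Production',
      'Sales', 'Sales', 'Sales'],
     [10, 20, 30, 15, 14, 50]])).T
df.columns = ['Department', 'Salary']
# np.array stores the mixed values as strings,
# so convert Salary back to integers before sorting
df['Salary'] = df['Salary'].astype(int)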
df
Out[18]:
Department Salary
0 Production 10
1 Production 20
2 Production 30
3 Sales 15
4 Sales 14
5 Sales 50
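Salary needs a numeric dtype for the sort in the next step to order rows by value rather than as text. The sketch below uses a hypothetical frame named `raw` (not the post's data) whose salaries are stored as strings, and shows how the top row changes once the column is converted with `pd.to_numeric`:

```python
import pandas as pd

# Hypothetical frame (illustration only) where salaries were stored as text
raw = pd.DataFrame({"Department": ["Sales", "Sales", "Sales"],
                    "Salary": ["9", "15", "100"]})

# Text is compared character by character, so "9" sorts above "15" and "100"
text_top = raw.sort_values("Salary", ascending=False).iloc[0]["Salary"]

# Convert to numbers, then sort by value
raw["Salary"] = pd.to_numeric(raw["Salary"])
numeric_top = raw.sort_values("Salary", ascending=False).iloc[0]["Salary"]

print(text_top, numeric_top)
```

Checking `df.dtypes` before sorting is a quick way to catch this.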
In [13]:
# Find the highest salary in each department
# Step 1: Sort the data frame by Department
# and Salary in descending order
# Step 2: Group by Department and take
# the first record of each group
In [15]:
df2 = df.sort_values(by=['Department', 'Salary'], ascending=False)
df2
Out[15]:
Department Salary
5 Sales 50
3 Sales 15
4 Sales 14
2 Production 30
1 Production 20
0 Production 10
In [16]:
df2.groupby('Department').head(1)
Out[16]:
Department Salary
5 Sales 50
2 Production 30
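The sort-then-`head(1)` approach above is one way to do this. As an alternative sketch (not the method used in this post), `idxmax` can pick the top row per department without sorting first. The frame is rebuilt here with a dict so the block runs on its own, and the names `top_idx`, `top_rows` and `top_salary` are illustrative:

```python
import pandas as pd

# Same sample data as above, built with numeric salaries
df = pd.DataFrame({
    "Department": ["Production", "Production", "Production",
                   "Sales", "Sales", "Sales"],
    "Salary": [10, 20, 30, 15, 14, 50],
})

# For each department, idxmax gives the index label of the highest-salary row
top_idx = df.groupby("Department")["Salary"].idxmax()

# Pull those full rows out of the original frame
top_rows = df.loc[top_idx]
print(top_rows)

# If only the amounts are needed, max() is enough
top_salary = df.groupby("Department")["Salary"].max()
print(top_salary)
```

When two rows tie for the highest salary in a department, `idxmax` keeps the first one it finds, just as `head(1)` keeps the first row of the sorted group.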
