Selective Building Of New Dataframe With Existing Dataframes In Addition To Calculation

Fill in the Pandas code below to create a new DataFrame, customer_spend, that contains the following columns in this order: customer_id, name, and total_spend. total_spend is a new

Solution 1:

First aggregate orders by customer_id, then merge the resulting per-customer totals (one row per customer_id) onto the desired columns of customers:

import pandas as pd

# Select order_total before summing so that only this column is aggregated.
cust2spend = orders.groupby('customer_id')[['order_total']].sum().reset_index()
cust2spend
customer_id  order_total
        100      1211.84
        101       602.42
        102       786.80
        103       961.96
        104       651.22

# Before merging, rename the order_total column to total_spend.
# Note that axis=1 could also be axis='columns'.
cust2spend.rename({'order_total': 'total_spend'}, axis=1, inplace=True)

pd.merge(customers[['customer_id', 'name']], cust2spend, on='customer_id')
   customer_id                   name  total_spend
0          100      Prometheus Barwis      1211.84
1          101         Alain Hennesey       602.42
2          102            Chao Peachy       786.80
3          103  Somtochukwu Mouritsen       961.96
4          104        Elisabeth Berry       651.22
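
As a self-contained illustration of the aggregate-then-merge approach, here is a minimal sketch. The post does not show the original tables, so the rows below are made up: the city and order_id columns and every order value are assumptions. It also uses how='left', which is not part of the answer above, so that a customer with no orders is kept with a total of 0 instead of being dropped by the default inner join.

```python
import pandas as pd

# Hypothetical sample data (not the tables from the question).
customers = pd.DataFrame({
    "customer_id": [100, 101, 102],
    "name": ["Prometheus Barwis", "Alain Hennesey", "Chao Peachy"],
    "city": ["A", "B", "C"],  # extra column that should not reach the result
})
orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_id": [100, 100, 101],  # customer 102 has no orders
    "order_total": [10.0, 5.5, 20.0],
})

# Sum order_total per customer and name the result total_spend.
cust2spend = (
    orders.groupby("customer_id")["order_total"]
    .sum()
    .rename("total_spend")
    .reset_index()
)

# A left merge keeps every customer; missing totals become NaN, then 0.
customer_spend = customers[["customer_id", "name"]].merge(
    cust2spend, on="customer_id", how="left"
)
customer_spend["total_spend"] = customer_spend["total_spend"].fillna(0.0)

print(customer_spend)
```

Whether a customer without orders should appear with 0, with NaN, or not at all depends on the exercise; drop how='left' and the fillna call to get the inner-join behaviour of the original answer.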

Solution 2:

# Sort by customer_id so that the rows line up with the grouped totals
customers = customers.sort_values(by='customer_id', ascending=True)

# Create an empty DataFrame with the target columns in the required order
customer_spend = pd.DataFrame(columns=['customer_id', 'name', 'total_spend'])

# Fill in the columns
customer_spend[['customer_id', 'name']] = customers[['customer_id', 'name']]
# groupby sorts its keys by default, so the totals come out in customer_id order
customer_spend['total_spend'] = orders.groupby('customer_id')['order_total'].sum().tolist()

If using merge is not mandatory, try this.
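
The snippet above assigns total_spend from a plain list, so totals are matched to customers by position; that only works when both sides are sorted the same way and every customer has at least one order. A possible merge-free variant that matches on customer_id instead is sketched below with Series.map. This is a swapped-in technique, not the answerer's code, and the sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: customers are unsorted and 101 has no orders.
customers = pd.DataFrame({
    "customer_id": [102, 100, 101],
    "name": ["Chao Peachy", "Prometheus Barwis", "Alain Hennesey"],
})
orders = pd.DataFrame({
    "customer_id": [100, 102, 100],
    "order_total": [10.0, 7.25, 5.5],
})

# Series of totals indexed by customer_id.
totals = orders.groupby("customer_id")["order_total"].sum()

# Copy the wanted columns, then look each customer_id up in the totals.
customer_spend = customers[["customer_id", "name"]].copy()
customer_spend["total_spend"] = (
    customer_spend["customer_id"].map(totals).fillna(0.0)
)

print(customer_spend)
```

Because the lookup is keyed on customer_id, no sorting is needed and a customer without orders does not cause a length mismatch.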
