Pandas Tricks — Pass Multiple Columns To Lambda

Pandas is one of the most powerful tool for analyzing and manipulating data. In this article, I will be sharing with you the solutions for a very common issues you might have been facing with pandas when dealing with your data — how to pass multiple columns to lambda or self-defined functions.

You will have to install pandas on your working environment:

pip install pandas

When dealing with data, you will always have the scenario that you want to calculate something based on the value of a few columns, and you may need to use lambda or self-defined function to write the calculation logic, but how to pass multiple columns to lambda function as parameters?

Let me use some real world example, so that easier for you to understand the issue that I am talking about. Below table shows partial of the e-com delivery charges offered by some company, so the delivery charges are determined by package size (H+L+W), package weight and the delivery mode you are choosing.

And assuming we have the below order data, and we want to simulate the delivery charges. Let’s create the data in a pandas dataframe.

import pandas as pddf = pd.DataFrame({ "Order#" : ["1", "2", "3", "4"], "Weight" : [5.0, 2.1, 8.1, 7.5], "Package Size" : [80, 45, 110, 90], "Delivery Mode": ["Same Day", "Next Day", "Express", "Next Day"]})

If you view dataframe from Jupyter Notebook (you can sign up here to use it for free), you shall be able to see the data as per below.

Let’s also implement a calculate_rate function where we need to pass in the weight, package size, and delivery mode in order to calculate the delivery charges:

def calculate_rate(weight, package_size, delivery_mode): 
#set the charges as $20 since we do not have the complete rate card
charges = 20
if weight <=1 and package_size <60:
if delivery_mode == "Express":
charges = 13
elif delivery_mode == "Next Day":
charges = 8
else:
charges = 10
elif weight <=5 and package_size <80:
if delivery_mode == "Express":
charges = 15
elif delivery_mode == "Next Day":
charges = 9
else:
charges = 11
elif weight <=8 and package_size <100:
if delivery_mode == "Express":
charges = 17
elif delivery_mode == "Next Day":
charges = 11
else:
charges = 13
return charges

Here comes to the most important part. You probably already know data frame has the apply function where you can apply the lambda function to the selected dataframe. We will also use the apply function, and we have a few ways to pass the columns to our calculate_rate function.

We can select the columns that involved in our calculation as a subset of the original data frame, and use the apply function to it.

And in the apply function, we have the parameter axis=1 to indicate that the x in the lambda represents a row, so we can unpack the x with *x and pass it to calculate_rate function.

df["Delivery Charges"] = df[["Weight", "Package Size", "Delivery Mode"]].apply(lambda x : calculate_rate(*x), axis=1)

If we check the df again in Jupyter Notebook, you should see the new column “Delivery Charges” with the figures calculated based on the logic we defined in calculate_rate function.

If you do not want to get a subset of the data frame and then apply the lambda, you can also directly use the apply function to the original data frame. In this case, you will need to select the columns before passing to the calculate_rate function. Same as above, we will need to specify the axis=1 to indicate it’s applying to each row.

df["Delivery Charges"] = df.apply(lambda x : calculate_rate(x["Weight"], x["Package Size"], x["Delivery Mode"]), axis=1)

This will produce the same result as option 1. And you can also use format like x.Weight instead of x[“Weight”] when passing in the parameter.

The two options we discussed to pass multiple columns to lambda are basically the same, and it’s either applying to the subset or the original data frame. I have not yet tested with a large set of data, so there might be some differences in terms of the performance, you may need to take a note if you are dealing with a lot of data.

You may also interested to read some other articles related to pandas.

Originally published at https://www.codeforests.com on July 18, 2020.

Resources and tutorials for python, data science and automation solutions

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store