Today, I’ll be breaking down a very popular Stack Overflow post at https://stackoverflow.com/questions/17071871/select-rows-from-a-dataframe-based-on-values-in-a-column-in-pandas.
df.loc[df['column_name'] == some_value]
Above, df is the name of the data frame. It appears twice, so replace both occurrences with the name of your own data frame.
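.loc is an indexer that pandas attaches to every data frame (it is not a Python keyword). It selects rows by label or, as here, by a True/False condition.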
Next, replace column_name with the name of the column that contains the values you want to filter.
Finally, replace some_value with the desired value.
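To see why this works, it helps to split the one-liner into two steps. Here is a minimal sketch; the data frame contents and the "other" column are made up by me for illustration and are not from the Stack Overflow post.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration.
df = pd.DataFrame({"column_name": ["a", "b", "a"], "other": [1, 2, 3]})
some_value = "a"

# Step 1: the comparison produces a boolean Series, one True/False per row.
mask = df["column_name"] == some_value

# Step 2: .loc keeps only the rows where the mask is True.
result = df.loc[mask]

print(mask.tolist())
print(result)
```

The part inside the square brackets is just a column of True/False values, and .loc uses it to decide which rows to keep.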
For example, if I have a data frame named “my_shoe_collection” and I want to select only the rows where the value of “color” is “blue” then:
my_shoe_collection.loc[my_shoe_collection['color'] == 'blue']
Also, if I have the same data frame and I want to select only the rows where the value of “price” is less than $50, then:
my_shoe_collection.loc[my_shoe_collection['price'] < 50]
Notice how I got rid of the single quotation marks since I’m dealing with an actual number?
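If you want to try both filters yourself, here is a small runnable sketch. The shoe data (including the "model" column and every price) is invented by me as an assumption, so swap in your own data frame.

```python
import pandas as pd

# Hypothetical shoe collection; all values are made up for illustration.
my_shoe_collection = pd.DataFrame({
    "model": ["runner", "loafer", "sandal", "boot"],
    "color": ["blue", "brown", "blue", "black"],
    "price": [45, 80, 50, 120],
})

# Text values go in quotes.
blue_shoes = my_shoe_collection.loc[my_shoe_collection["color"] == "blue"]

# Numbers do not. "<" means strictly less than 50 ...
under_50 = my_shoe_collection.loc[my_shoe_collection["price"] < 50]

# ... while "<=" also keeps shoes that cost exactly 50.
at_most_50 = my_shoe_collection.loc[my_shoe_collection["price"] <= 50]

print(blue_shoes)
print(under_50)
print(at_most_50)
```

In this made-up data the sandal costs exactly 50, so it shows up with <= but not with <. Pick whichever operator matches what you actually mean.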