I’m continuing to break down a very popular Stack Overflow post at https://stackoverflow.com/questions/17071871/select-rows-from-a-dataframe-based-on-values-in-a-column-in-pandas.
df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]
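Here, each df in the line of code above is the name of the data frame that we are working with. .loc is the data frame's label-based indexer (an attribute of the DataFrame, not a Python keyword), and it also accepts a boolean mask. The outermost square brackets [...] contain the condition that you want to use as a filter. In this case, there are two conditions inside, separated by &, with each condition enclosed in parentheses (...). The parentheses are required because & binds more tightly than comparison operators such as >= and <=.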
This is condition #1:
(df['column_name'] >= A)
This is condition #2:
(df['column_name'] <= B)
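To make the mechanics concrete, here is a minimal sketch. The data and the values of A and B are made up for illustration. It shows that each condition evaluates to a column of True/False values, that & combines them row by row, and that .loc keeps only the rows where the combined result is True.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame({"column_name": [1, 4, 7, 10]})
A, B = 3, 8  # assumed lower and upper bounds

cond1 = df["column_name"] >= A  # condition #1, one True/False per row
cond2 = df["column_name"] <= B  # condition #2, one True/False per row
mask = cond1 & cond2            # True only where both conditions hold

# Same as df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]
result = df.loc[mask]
print(result)
```

Naming the conditions first is optional; it just makes it easier to inspect each one before combining them.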
The column_name in condition #1 can be the same as or different from the one in condition #2; it all depends on how you want to filter the data frame. The same goes for the >= and <= comparison operators, and for A and B.
Suppose I have a data frame called my_tie_collection. If I only want ties that are both blue and made from silk, then I would type:
my_tie_collection.loc[(my_tie_collection['color'] == 'blue') & (my_tie_collection['material'] == 'silk')]
Take note: I used == to test for equality, and I put single quotes around blue and silk because they are string values.
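For anyone who wants to try this out, here is a small runnable sketch. The contents of my_tie_collection are invented for the example; only the column names color and material come from the line of code above.

```python
import pandas as pd

# Hypothetical tie collection, invented for this example.
my_tie_collection = pd.DataFrame({
    "color": ["blue", "red", "blue", "green"],
    "material": ["silk", "silk", "polyester", "wool"],
})

# Keep only the rows where both conditions are True.
blue_silk = my_tie_collection.loc[
    (my_tie_collection["color"] == "blue")
    & (my_tie_collection["material"] == "silk")
]
print(blue_silk)
```

With this sample data, only the first tie is both blue and silk, so it is the only row returned.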