When I started working with pandas, I often used the square brackets [ ] to access the rows and columns of a data frame. As things became more complicated, I had to drop this habit, because the pandas documentation recommends the dedicated accessors .loc, .iloc, .at, and .iat instead (older versions also had .ix). So let us discuss all of them in detail.
There are primarily two ways in which we can access a row or column in a data frame:
- By labels
- By position
Before we start, let me make the task easier: .ix is ambiguous, was deprecated in pandas 0.20, and was removed entirely in pandas 1.0, so it should never be used. Let’s discuss the others one by one.
Consider the following data frame (df) for our understanding –
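Here is a minimal sketch of such a data frame. Only the student "Yash" and the subject "Maths" come from this post; the other names, the other subjects, and all of the marks are made-up values for illustration.

```python
import pandas as pd

# Hypothetical marks table. Only "Yash" and "Maths" come from this post;
# the other names, subjects, and all marks are made up for illustration.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],                  # row labels
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],  # column labels
)

print(df)
print(df.shape)  # (5, 5)
```

The student names act as row labels and the subjects as column labels, which is what the label-based methods rely on.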
.loc accesses a group of rows and columns by label(s) or a boolean array.
1. A single label (returns a Series)
2. A list or array of labels
When selecting multiple rows and columns with .loc, it is best to pass the row labels and the column labels as two separate lists, separated by a comma.
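A short sketch of both cases, single labels and lists of labels, might look like this. The data frame is a hypothetical stand-in (only "Yash" and "Maths" come from this post), rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

row = df.loc["Yash"]       # single row label -> Series
col = df.loc[:, "Maths"]   # single column label -> Series

# Two separate lists: row labels first, then column labels -> DataFrame
subset = df.loc[["Yash", "Ravi"], ["Maths", "English"]]

print(type(row).__name__)  # Series
print(subset)
```

Note that a single label gives back a Series, while lists of labels give back a DataFrame, even if a list has only one entry.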
3. A slice of labels
Remember that when slicing a data frame by label with .loc, both the start and the end are inclusive. Also, use only single square brackets.
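To see the inclusive end in action, here is a small sketch on a hypothetical marks table (only "Yash" and "Maths" come from this post; everything else is made up).

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

# Label slices: "Meera" and "English" (the end labels) are included.
sliced = df.loc["Asha":"Meera", "Physics":"English"]

print(list(sliced.index))    # ['Asha', 'Ravi', 'Meera']
print(list(sliced.columns))  # ['Physics', 'Chemistry', 'English']
```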
4. Using Boolean values
Pass a Boolean array of the same length as the axis being selected. For our 5×5 data frame, that means five values for the rows and five values for the columns. Because of this condition, we generally avoid writing the array out by hand. Put True at each row/column position you want to select and False at each position you don’t.
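As a rough sketch, the hand-written version looks like the first part below. The second part shows a common alternative, letting a comparison build the Boolean array for you. The data frame is a hypothetical stand-in (only "Yash" and "Maths" come from this post).

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

# Hand-written masks: one Boolean per row and one per column.
row_mask = [True, False, True, False, False]   # keep Yash and Ravi
col_mask = [True, False, False, False, True]   # keep Maths and Biology
picked = df.loc[row_mask, col_mask]

# Mask generated by a condition: rows where the Maths marks exceed 80.
high_maths = df.loc[df["Maths"] > 80]

print(picked)
print(list(high_maths.index))  # ['Yash', 'Ravi', 'Kabir']
```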
The operation of .iloc is the same as .loc, except that we use the integer positions of the rows and columns instead of their labels. The letter i stands for integer.
1. Using indexes
Similar to what we have seen before, we can pass a single integer position or a list of integer positions for the rows/columns that we want to select.
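A minimal sketch of position-based selection, again on a hypothetical marks table (only "Yash" and "Maths" come from this post):

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

first_row = df.iloc[0]             # single position -> Series (Yash's row)
picked = df.iloc[[0, 2], [0, 3]]   # rows 0 and 2, columns 0 and 3 -> DataFrame

print(first_row.name)  # Yash
print(picked)
```

Positions start at 0, so row 0 is the first row and column 3 is the fourth column.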
Note that slicing with .iloc goes back to the usual Python convention, where the start is inclusive and the end is exclusive.
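The difference between the two slicing conventions is easy to check side by side. This sketch uses a hypothetical marks table (only "Yash" and "Maths" come from this post).

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

# Positions 1, 2, 3 on each axis; position 4 (the end) is excluded.
by_position = df.iloc[1:4, 1:4]

# The same block by label; here the end labels are included.
by_label = df.loc["Asha":"Meera", "Physics":"English"]

print(by_position)
print(by_position.equals(by_label))  # True
```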
The operation of .at is similar to .loc, but it selects only a single cell (value). Say we want to know Yash’s marks in Maths:
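A minimal sketch of that lookup follows. The marks themselves are made up, since only the names "Yash" and "Maths" come from this post.

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

# .at takes exactly one row label and one column label.
maths_marks = df.at["Yash", "Maths"]

print(maths_marks)  # 88
```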
The operation of .iat is similar to .iloc, but, just like .at, it selects only a single cell (value).
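For example, the same cell can be reached by position. This sketch assumes a hypothetical marks table in which Yash is the first row and Maths is the first column.

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

# .iat takes exactly one row position and one column position.
first_cell = df.iat[0, 0]

print(first_cell)  # 88
```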
Indexing Operator [ ]
Now that we have covered all the data frame selection methods, let us talk about how we can use only [ ] for selecting the values of a data frame. However, this is not recommended, and I would rather suggest using any of the above methods.
The indexing operator [ ] can select rows or columns, but not both at the same time. This is the major difference between the indexing operator and the other methods.
1. Selecting columns
Use double square brackets [[ ]] to select columns of a data frame. List the column labels that you want to select inside the inner brackets.
If you put a single column name in single brackets [ ], you get back a Series containing that column and all the row values associated with it. Also, slicing columns is not possible with the indexing operator. You have to list the column labels to be selected explicitly.
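Both forms are sketched below on a hypothetical marks table (only "Yash" and "Maths" come from this post; the rest is made up).

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

one_column = df["Maths"]                 # single brackets -> Series
two_columns = df[["Maths", "English"]]   # double brackets -> DataFrame

print(type(one_column).__name__)   # Series
print(type(two_columns).__name__)  # DataFrame
```

Double brackets with a single label, such as `df[["Maths"]]`, also return a DataFrame rather than a Series.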
2. Slicing / Selecting rows
The only way to select rows by label or by position with [ ] is a slice in single square brackets. You can slice either by row labels or by integer positions.
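Here is a small sketch of both kinds of row slice, using a hypothetical marks table (only "Yash" and "Maths" come from this post). As with .iloc and .loc, the position slice excludes its end while the label slice includes it.

```python
import pandas as pd

# Hypothetical marks table; names, subjects, and marks are illustrative.
df = pd.DataFrame(
    [[88, 79, 92, 65, 70],
     [72, 85, 68, 80, 75],
     [95, 90, 87, 73, 89],
     [60, 66, 70, 91, 58],
     [81, 74, 77, 84, 93]],
    index=["Yash", "Asha", "Ravi", "Meera", "Kabir"],
    columns=["Maths", "Physics", "Chemistry", "English", "Biology"],
)

rows_by_position = df[1:3]           # positions 1 and 2; position 3 is excluded
rows_by_label = df["Asha":"Ravi"]    # labels "Asha" through "Ravi", end included

print(list(rows_by_position.index))  # ['Asha', 'Ravi']
print(list(rows_by_label.index))     # ['Asha', 'Ravi']
```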
Here, we come to the end of this basic comparison of how to select rows and columns of a data frame using various methods. If you would like to add value to this post, let me know in the comments.