Pandas indexing
👋 Hey, I'm Ajay Kumar Joshi – a Python & JavaScript enthusiast passionate about breaking down complex programming concepts into simple, digestible lessons.
I created PythonJS to help developers, students, and professionals learn Python and JavaScript the easy way—through byte-sized lessons, real-world examples, and a structured approach to coding.
Pandas is a Python library for data manipulation and analysis. A fundamental concept in Pandas is indexing, which is how you select specific rows, columns, or individual values from a DataFrame or Series.
Indexing Methods
There are several ways to index a DataFrame in Pandas:
.loc: This is a label-based indexing method, allowing you to select rows or columns by their labels. It can be used with DataFrame and Series and supports boolean conditions.
.iloc: This is an integer-based indexing method, enabling selection by integer position. It is applicable to both DataFrame and Series.
[] operator: A shorthand whose behavior depends on what you pass. A column label (or a list of labels) selects columns, an integer slice such as df[0:3] selects rows by position, and a boolean Series filters rows.
.at: Label-based access to a single scalar value. It is faster than .loc when you only need one value.
.iat: Similar to .at, but uses integer indexing instead of labels.
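The difference between label-based and position-based access is easiest to see when the row labels are not the default 0, 1, 2, and so on. Here is a minimal sketch, assuming a small made-up DataFrame called scores with string row labels (the names and numbers are purely illustrative):

```python
import pandas as pd

# Hypothetical data for illustration: row labels are strings, not 0..n-1
scores = pd.DataFrame(
    {"math": [90, 75, 60], "art": [70, 85, 95]},
    index=["ann", "ben", "cat"],
)

by_label = scores.loc["ben", "art"]       # row and column given by label
by_position = scores.iloc[1, 1]           # second row, second column
fast_label = scores.at["ben", "art"]      # scalar access by label
fast_position = scores.iat[1, 1]          # scalar access by position

math_column = scores["math"]              # [] with a column label returns a Series
good_at_math = scores[scores["math"] > 70]  # [] with a boolean Series filters rows

print(by_label, by_position, fast_label, fast_position)
print(good_at_math)
```

All four scalar lookups point at the same cell, so they should print the same value. Only the way the cell is addressed differs.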
Examples
Selecting a single column by label:
df.loc[:, 'column_name']
Selecting multiple columns by label:
df.loc[:, ['column_1', 'column_2']]
Selecting a single row by integer position:
df.iloc[0]
Selecting multiple rows by integer position:
df.iloc[0:5]
Selecting a single element by label:
df.loc['row_label', 'column_label']
Selecting a single element by integer position:
df.iloc[0, 0]
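Integer positions in Pandas are zero-based, so .iloc and .iat refer to the first row and the first column as position 0. Labels used with .loc and .at can be any values. They only start at 0 when the DataFrame has the default index.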
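Slices behave differently in the two styles: a position-based slice leaves out the end position, just like a Python list, while a label-based slice includes the end label. Here is a small sketch, assuming a made-up DataFrame called letters with the row labels 'a' to 'd':

```python
import pandas as pd

# Hypothetical frame for illustration, with string row labels
letters = pd.DataFrame({"value": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])

# Position slice: end is excluded, so this returns positions 0 and 1
first_two_by_position = letters.iloc[0:2]

# Label slice: end is included, so this returns rows 'a', 'b' and 'c'
a_to_c_by_label = letters.loc["a":"c"]

print(len(first_two_by_position), len(a_to_c_by_label))
```

This is a common source of off-by-one surprises when switching between .iloc and .loc.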
Test DataFrame
Here is an example of a small DataFrame to test these concepts:
import pandas as pd

data = {
    'name': ['John', 'Mike', 'Sara', 'Kate', 'Bob'],
    'age': [35, 28, 31, 22, 45],
    'city': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}
df = pd.DataFrame(data, columns=['name', 'age', 'city'])
This DataFrame has three columns: 'name', 'age', and 'city', and five rows. You can test the different indexing methods using the following commands:
Selecting a single column by label:
print(df.loc[:, 'name'])
Selecting multiple columns by label:
print(df.loc[:, ['name', 'age']])
Selecting a single row by integer position:
print(df.iloc[0])
Selecting multiple rows by integer position:
print(df.iloc[0:3])
Selecting a single element by label:
print(df.at[0, 'name'])
Selecting a single element by integer position:
print(df.iat[0, 0])
Experiment with different indices to see how the results change.
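One experiment worth trying is asking for a row that does not exist. The sketch below rebuilds the same test DataFrame so it can run on its own. The variable names iloc_error, loc_error, and tail are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['John', 'Mike', 'Sara', 'Kate', 'Bob'],
    'age': [35, 28, 31, 22, 45],
    'city': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

iloc_error = None
loc_error = None

# Position 10 does not exist in a five-row DataFrame
try:
    df.iloc[10]
except IndexError as err:
    iloc_error = type(err).__name__

# Label 10 is not in the default index (0 to 4)
try:
    df.loc[10]
except KeyError as err:
    loc_error = type(err).__name__

# Position slices are forgiving: positions past the end are simply ignored
tail = df.iloc[3:10]

print(iloc_error, loc_error, len(tail))
```

A single out-of-range position or missing label raises an error, while an out-of-range slice quietly returns whatever rows exist.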
Advanced Indexing Techniques
In addition to these basic methods, Pandas offers advanced indexing techniques:
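Boolean indexing: Select rows that meet a condition, e.g.,
df[df['age'] > 30]
.query() method: Filter a DataFrame with an expression written as a string, similar to a SQL WHERE clause. It is convenient when combining multiple conditions.
.reindex() method: Conform the rows or columns of a DataFrame to a new set of labels. Existing labels are reordered, and labels that were not present are added with missing values (NaN).
.set_index() method: Make one or more existing columns the index of a DataFrame. By default it returns a new DataFrame rather than modifying the original.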
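The four techniques above can be tried together on the test DataFrame. This is a minimal sketch that rebuilds the DataFrame so it runs on its own. The label 'Zoe' is a made-up name that is deliberately absent from the data, to show what .reindex() does with missing labels:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['John', 'Mike', 'Sara', 'Kate', 'Bob'],
    'age': [35, 28, 31, 22, 45],
    'city': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
})

# Boolean indexing with one condition
over_30 = df[df['age'] > 30]

# Multiple conditions: wrap each one in parentheses and combine with & or |
over_30_not_ny = df[(df['age'] > 30) & (df['city'] != 'New York')]

# The same filter written as a query string
same_with_query = df.query("age > 30 and city != 'New York'")

# set_index returns a new DataFrame whose row labels are the names
by_name = df.set_index('name')

# reindex reorders existing labels and fills unknown ones ('Zoe') with NaN
reordered = by_name.reindex(['Sara', 'John', 'Zoe'])

print(over_30)
print(same_with_query)
print(reordered)
```

Once the names are the index, label-based lookups such as by_name.loc['Kate', 'age'] read more naturally than positional ones.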
These examples cover the basics, but Pandas provides many more advanced indexing options to explore.
