Pandas indexing


👋 Hey, I'm Ajay Kumar Joshi – a Python & JavaScript enthusiast passionate about breaking down complex programming concepts into simple, digestible lessons.

I created PythonJS to help developers, students, and professionals learn Python and JavaScript the easy way—through byte-sized lessons, real-world examples, and a structured approach to coding.

🚀 Follow along as I simplify tough topics, share coding insights, and build a community of lifelong learners.

🔗 Explore more at pythonjs.org

Pandas Indexing 🐼

Pandas is a powerful library in Python that facilitates data manipulation and analysis. A fundamental concept in Pandas is indexing, which is used to select specific rows or columns from a DataFrame.

Indexing Methods

There are several ways to index a DataFrame in Pandas:

  • .loc: This is a label-based indexing method, allowing you to select rows or columns by their labels. It can be used with DataFrame and Series and supports boolean conditions.

  • .iloc: This is an integer-based indexing method, enabling selection by integer position. It is applicable to both DataFrame and Series.

  • [] operator: A shorthand whose behavior depends on what you pass. A column label (or list of labels) selects columns, a slice such as df[0:3] selects rows, and a boolean mask filters rows.

  • .at: Label-based access to a single scalar value. It is faster than .loc when you only need one value.

  • .iat: Similar to .at, but uses integer indexing instead of labels.
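The difference between the accessors is easiest to see when the row labels are not integers. The following is a minimal sketch; the DataFrame, its string row labels, and the variable names are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Hypothetical sample data: row labels are strings, not the default 0..n-1.
df = pd.DataFrame(
    {"age": [35, 28, 31], "city": ["New York", "Los Angeles", "Chicago"]},
    index=["john", "mike", "sara"],
)

by_label = df.loc["mike", "age"]      # label-based lookup
by_position = df.iloc[1, 0]           # position-based lookup (row 1, column 0)
scalar_label = df.at["mike", "age"]   # fast label-based scalar access
scalar_position = df.iat[1, 0]        # fast position-based scalar access

column = df["age"]                    # [] with a column label returns a Series
first_two_rows = df[0:2]              # [] with a slice selects rows
over_30 = df[df["age"] > 30]          # [] with a boolean mask filters rows

print(by_label, by_position, scalar_label, scalar_position)  # 28 28 28 28
print(list(first_two_rows.index))  # ['john', 'mike']
print(list(over_30.index))         # ['john', 'sara']
```

All four scalar lookups return the same cell; they differ only in whether the cell is addressed by label or by position.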

Examples

  • Selecting a single column by label: df.loc[:, 'column_name']

  • Selecting multiple columns by label: df.loc[:, ['column_1', 'column_2']]

  • Selecting a single row by integer position: df.iloc[0]

  • Selecting multiple rows by integer position: df.iloc[0:5]

  • Selecting a single element by label: df.loc['row_label', 'column_label']

  • Selecting a single element by integer position: df.iloc[0, 0]

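Positions in Pandas are zero-based, so with .iloc and .iat the first row and the first column are at position 0. Labels used by .loc and .at are whatever the index contains; they only match the positions when the DataFrame has the default integer index. Slices also differ: .iloc[0:5] excludes the stop position, while a .loc slice includes the stop label.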
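Zero-based counting applies to positions, not to labels. As a small sketch of that distinction, the Series below uses hypothetical labels that start at 10, which also makes the different slice behavior of .iloc and .loc visible.

```python
import pandas as pd

# Hypothetical Series whose labels start at 10 rather than 0.
s = pd.Series(["a", "b", "c", "d"], index=[10, 11, 12, 13])

first_by_position = s.iloc[0]   # position 0 is the first element
first_by_label = s.loc[10]      # label 10 refers to the same element

# .iloc excludes the stop position; .loc includes the stop label.
position_slice = s.iloc[0:2]    # positions 0 and 1
label_slice = s.loc[10:12]      # labels 10, 11 and 12

print(first_by_position, first_by_label)  # a a
print(list(position_slice))               # ['a', 'b']
print(list(label_slice))                  # ['a', 'b', 'c']
```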

Test DataFrame

Here is an example of a small DataFrame to test these concepts:

import pandas as pd

data = {
    'name': ['John', 'Mike', 'Sara', 'Kate', 'Bob'],
    'age': [35, 28, 31, 22, 45],
    'city': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}

df = pd.DataFrame(data, columns=['name', 'age', 'city'])

This DataFrame has three columns: 'name', 'age', and 'city', and five rows. You can test the different indexing methods using the following commands:

  • Selecting a single column by label: print(df.loc[:, 'name'])

  • Selecting multiple columns by label: print(df.loc[:, ['name', 'age']])

  • Selecting a single row by integer position: print(df.iloc[0])

  • Selecting multiple rows by integer position: print(df.iloc[0:3])

  • Selecting a single element by label: print(df.at[0, 'name']) (here 0 is the row label, because the DataFrame uses the default integer index)

  • Selecting a single element by integer position: print(df.iat[0, 0])

Experiment with different indices to see how the results change.
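For reference, here is a sketch that runs the commands above against the same test DataFrame and notes what each one returns. The variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["John", "Mike", "Sara", "Kate", "Bob"],
    "age": [35, 28, 31, 22, 45],
    "city": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

names = df.loc[:, "name"]               # Series with five values
name_age = df.loc[:, ["name", "age"]]   # DataFrame with two columns
first_row = df.iloc[0]                  # Series indexed by column name
first_three = df.iloc[0:3]              # rows at positions 0, 1 and 2

print(name_age.shape)                   # (5, 2)
print(first_row["city"])                # New York
print(list(first_three["name"]))        # ['John', 'Mike', 'Sara']
print(df.at[0, "name"], df.iat[0, 0])   # John John
```

Note that selecting one column or one row returns a Series, while selecting a list of columns or a slice of rows returns a DataFrame.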

Advanced Indexing Techniques

In addition to these basic methods, Pandas offers advanced indexing techniques:

  • Boolean indexing: Select rows that meet certain conditions, e.g., df[df['age'] > 30].

  • .query() method: Filter rows using an expression string, similar to a SQL WHERE clause, e.g., df.query("age > 30 and city != 'Chicago'"). It is especially readable when combining multiple conditions.

  • .reindex() method: Conform the rows or columns of a DataFrame to a new set of labels. Existing labels are reordered to match, and labels that were not present are filled with NaN.

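  • .set_index() method: Set the index of a DataFrame to a column of your choice. It returns a new DataFrame by default; .reset_index() does the reverse and moves the index back into a column.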
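The four techniques above can be sketched on the same test DataFrame. This is only an illustration: the filter conditions and the label 'Zoe' (which is deliberately absent from the data) are assumptions chosen to show the behavior.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["John", "Mike", "Sara", "Kate", "Bob"],
    "age": [35, 28, 31, 22, 45],
    "city": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

# Boolean indexing: keep rows where the mask is True.
older = df[df["age"] > 30]

# Combine conditions with & or |, wrapping each condition in parentheses.
older_not_ny = df[(df["age"] > 30) & (df["city"] != "New York")]

# The same filter written as a query string.
queried = df.query("age > 30 and city != 'New York'")

# set_index returns a new DataFrame; df itself is unchanged.
by_name = df.set_index("name")
kate_age = by_name.loc["Kate", "age"]

# reindex reorders existing labels; the missing label 'Zoe' gets NaN values.
reordered = by_name.reindex(["Bob", "Kate", "Zoe"])

print(list(older["name"]))    # ['John', 'Sara', 'Bob']
print(list(queried["name"]))  # ['Sara', 'Bob']
print(kate_age)               # 22
print(reordered)
```

Once 'name' is the index, rows can be looked up by name with .loc, which is often more readable than filtering on a column.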

These examples cover the basics, but Pandas provides many more advanced indexing options to explore.