A (very) Brief Overview of Pandas

Defining Data Frames

Pandas is a Python library that unlocks “data frames” – a row/column style arrangement similar to spreadsheets – directly in Python. Much like Google Sheets or Microsoft Excel, a data frame has data cells, named columns, and numbered rows.

DataFrames can be constructed as follows:

import numpy as np
import pandas as pd

# Each inner list is one row: [height, weight]
demo_data = np.array([[181, 81], [170, 72], [174, 93]])

demo_columns = ['height', 'weight']

demo_dataframe = pd.DataFrame(data=demo_data, columns=demo_columns)
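
As a quick check on the spreadsheet analogy, the sketch below builds the same demo frame and inspects its shape, column names, and row labels. It assumes the rows are passed to NumPy as a single nested list, which is what np.array needs to build a 2-D array.

```python
import numpy as np
import pandas as pd

# One inner list per row, values in column order.
demo_data = np.array([[181, 81], [170, 72], [174, 93]])
demo_dataframe = pd.DataFrame(data=demo_data, columns=["height", "weight"])

print(demo_dataframe.shape)             # (3, 2): three rows, two columns
print(demo_dataframe.columns.tolist())  # ['height', 'weight']
print(demo_dataframe.index.tolist())    # [0, 1, 2]: default row labels start at 0
```

Note that the default row numbering starts at 0, not 1 as in most spreadsheets.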

You can add extra columns easily. For example:

demo_dataframe["weight_in_lbs"] = demo_dataframe["weight"] * 2.2
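
A minimal, self-contained sketch of the same idea follows. It rebuilds the demo frame and adds the derived column. The 2.2 factor is the approximate conversion used above, which suggests the weights are in kilograms, though the source does not state the units.

```python
import numpy as np
import pandas as pd

# Rebuild the demo frame: one inner list per row, in column order.
demo_data = np.array([[181, 81], [170, 72], [174, 93]])
demo_dataframe = pd.DataFrame(data=demo_data, columns=["height", "weight"])

# Arithmetic on a column is applied element-wise, so the new column
# gets one converted value per existing row.
demo_dataframe["weight_in_lbs"] = demo_dataframe["weight"] * 2.2

print(demo_dataframe.columns.tolist())  # ['height', 'weight', 'weight_in_lbs']
print(demo_dataframe.shape)             # (3, 3)
```

Assigning to a column name that does not exist yet creates the column; assigning to an existing name overwrites it.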

Retrieving Data from Data Frames

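You can display all of the data by printing the data frame itself (e.g. print(demo_dataframe)), or retrieve subsets of it as follows. The row ranges and the column name 'some_column' are generic examples; they assume a data frame with more rows than the three-row demo above.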

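demo_dataframe.head(2) # Get the first 2 rows (positions 0 and 1)
demo_dataframe[3:17] # Get rows at positions 3 through 16 (the end of the slice is excluded)
demo_dataframe['some_column'] # Get all rows for the column 'some_column'
demo_dataframe.iloc[9] # Get the row at position 9 (the tenth row, since counting starts at 0)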
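
To see these lookups return real rows, the sketch below runs them on a hypothetical 20-row frame. The names example_dataframe and other_column and the values are made up for illustration; some_column is borrowed from the snippet above.

```python
import pandas as pd

# Hypothetical 20-row frame so that every lookup below has rows to return.
example_dataframe = pd.DataFrame({
    "some_column": range(20),
    "other_column": range(100, 120),
})

first_two = example_dataframe.head(2)           # rows at positions 0 and 1
middle = example_dataframe[3:17]                # positions 3 through 16; 17 is excluded
one_column = example_dataframe["some_column"]   # a Series with all 20 values
tenth_row = example_dataframe.iloc[9]           # a Series holding the row at position 9

print(len(first_two))             # 2
print(len(middle))                # 14
print(len(one_column))            # 20
print(tenth_row["other_column"])  # 109
```

On a frame with fewer rows, a slice such as [3:17] returns whatever rows exist in that range (possibly none), while .iloc[9] raises an IndexError if there is no row at position 9.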
