Defining Data Frames
Pandas is a Python library that provides “data frames”, a row-and-column arrangement similar to a spreadsheet, directly in Python. Much like a sheet in Google Sheets or Microsoft Excel, a data frame has data cells, named columns, and numbered rows (by default, rows are numbered starting from 0).
A data frame can be constructed from a NumPy array of values and a list of column names as follows:
import numpy as np
import pandas as pd
# Each inner list is one row: [height, weight]. The outer brackets are required.
demo_data = np.array([[181, 81], [170, 72], [174, 93]])
demo_columns = ['height', 'weight']
demo_dataframe = pd.DataFrame(data=demo_data, columns=demo_columns)
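To check what the constructor produced, it can help to inspect the shape, column names, and row labels. The following is a minimal sketch that rebuilds the same three-row frame; the variable names n_rows, n_cols, column_names, and row_labels are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

# Same values as the example above: one inner list per row.
demo_data = np.array([[181, 81], [170, 72], [174, 93]])
demo_columns = ['height', 'weight']
demo_dataframe = pd.DataFrame(data=demo_data, columns=demo_columns)

# shape is a (number of rows, number of columns) tuple
n_rows, n_cols = demo_dataframe.shape
# columns holds the column labels in order
column_names = list(demo_dataframe.columns)
# index holds the row labels; with no index given they start at 0
row_labels = list(demo_dataframe.index)

print(n_rows, n_cols)
print(column_names)
print(row_labels)
```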
You can add extra columns easily. For example:
demo_dataframe["weight_in_lbs"] = demo_dataframe["weight"] * 2.2
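Arithmetic on a column is applied to every row at once, and the result can be assigned to a new column name. The sketch below repeats the conversion on the three-row demo frame and adds a second derived column. It assumes that weight is in kilograms and height is in centimetres, which the example values suggest but the text does not state; the column height_in_m is hypothetical and added only for illustration.

```python
import numpy as np
import pandas as pd

demo_dataframe = pd.DataFrame(
    data=np.array([[181, 81], [170, 72], [174, 93]]),
    columns=['height', 'weight'],
)

# Multiply every value in 'weight' by 2.2 (a rounded kg-to-lb factor, assuming kg)
demo_dataframe["weight_in_lbs"] = demo_dataframe["weight"] * 2.2

# Hypothetical second derived column, assuming 'height' is in centimetres
demo_dataframe["height_in_m"] = demo_dataframe["height"] / 100

# The new columns are appended to the right of the existing ones
print(demo_dataframe)
```

Assigning to a column name that already exists overwrites that column instead of adding a new one.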
Retrieving Data from Data Frames
You can display all of the data by printing the data frame itself (e.g., print(demo_dataframe)), or retrieve subsets of the data frame as follows:
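demo_dataframe.head(2) # Get the first 2 rows (positions 0 and 1)
demo_dataframe[3:17] # Get the rows at positions 3 through 16 (the end of the slice is excluded)
demo_dataframe['some_column'] # Get all rows for the column 'some_column'
demo_dataframe.iloc[9] # Get the row at position 9 (the tenth row)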
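The row numbers used above are larger than the three-row demo frame, so here is a small sketch of the same four retrieval styles with indices that fit it. The variable names first_two, middle, heights, and last_row are illustrative only.

```python
import numpy as np
import pandas as pd

demo_dataframe = pd.DataFrame(
    data=np.array([[181, 81], [170, 72], [174, 93]]),
    columns=['height', 'weight'],
)

# head(n) returns the first n rows: here, positions 0 and 1
first_two = demo_dataframe.head(2)

# A slice selects rows by position; the end position (3) is excluded
middle = demo_dataframe[1:3]

# A column name selects that column for all rows, returned as a Series
heights = demo_dataframe['height']

# iloc[i] selects the single row at position i, returned as a Series
last_row = demo_dataframe.iloc[2]

print(first_two)
print(middle)
print(heights)
print(last_row)
```

Note that a row slice running past the end of the frame returns only the rows that exist, whereas iloc with a single out-of-range position raises an IndexError.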