Data Analytics in Python – How to use .loc, .iloc, .ix in Pandas

Nov 17, 2022

In data science and machine learning, the first step after getting access to data is to analyze it. Data analysis is an essential part of extracting valuable information from the data.

Before applying any machine learning model or technique, it is necessary to understand the data's attributes and dimensions in order to treat it accordingly. In this tutorial, we take a hands-on approach and analyze an actual dataset that is used for machine learning. We use Python and Pandas for this purpose, in particular the .loc and .iloc indexers. (The older .ix indexer was deprecated in pandas 0.20 and removed in pandas 1.0, so on current versions its job is done by .loc and .iloc.) We start by loading the data and naming its columns according to the data description in the Machine Learning Data Repository.

import pandas as pd
 
df = pd.read_csv(
    filepath_or_buffer='https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data',
    header=None,
    sep=',')
 
df.columns=['sepal_len', 'sepal_wid', 'petal_len', 'petal_wid', 'class']
 
df.dropna(how="all", inplace=True) # drops the empty line at file-end
 
df.head()
df.tail()
 
df = df.set_index('class')
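
Before selecting anything, it helps to confirm the dimensions and column types of the frame. The following is a minimal sketch that assumes a small hand-typed sample in the same layout as the Iris file, so that it runs without a network connection; the name `sample` and its six rows are illustrative only and are not the loaded `df`.

```python
import pandas as pd

# Hypothetical six-row sample in the Iris layout (typed by hand for illustration).
sample = pd.DataFrame(
    {
        "sepal_len": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
        "sepal_wid": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
        "petal_len": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
        "petal_wid": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "class": [
            "Iris-setosa", "Iris-setosa",
            "Iris-versicolor", "Iris-versicolor",
            "Iris-virginica", "Iris-virginica",
        ],
    }
).set_index("class")

# shape is (rows, columns); the index ('class') is not counted as a column
n_rows, n_cols = sample.shape

# Number of rows per class label in the index
class_counts = sample.index.value_counts()

print(n_rows, n_cols)
print(sample.dtypes)
print(class_counts)
```

The same three checks (`shape`, `dtypes`, `index.value_counts()`) can be run on the full `df` once it has been loaded.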

SELECTING A COLUMN IN PANDAS:

df['petal_len']

SELECTING MULTIPLE COLUMNS IN PANDAS:

df[['petal_len', 'petal_wid']]
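
The two forms above return different types: a single column name gives a Series, while a list of names gives a DataFrame, even if the list has only one entry. A small sketch, assuming a three-row illustrative frame called `sample` (not the loaded `df`):

```python
import pandas as pd

# Hypothetical three-row frame used only for illustration.
sample = pd.DataFrame({"petal_len": [1.4, 4.7, 6.0], "petal_wid": [0.2, 1.4, 2.5]})

one_col = sample["petal_len"]          # single label: a Series
one_col_frame = sample[["petal_len"]]  # list of labels: a DataFrame with one column

print(type(one_col).__name__, type(one_col_frame).__name__)
```

This matters when a later step expects two-dimensional input, in which case the double-bracket form is the safer choice.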

SELECTING ALL ROWS BY INDEX LABEL:

# Select all rows with class 'Iris-virginica'
df.loc['Iris-virginica']
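
.loc also accepts a column selection as a second argument, and a boolean mask in place of a label. The sketch below assumes a four-row illustrative frame named `sample` with 'class' as its index; the values are hand-typed and are not the loaded `df`.

```python
import pandas as pd

# Hypothetical four-row frame indexed by class label (illustration only).
sample = pd.DataFrame(
    {"petal_len": [1.4, 4.7, 6.0, 5.1], "petal_wid": [0.2, 1.4, 2.5, 1.9]},
    index=pd.Index(
        ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-virginica"],
        name="class",
    ),
)

# The label occurs twice in the index, so every matching row is returned
virginica = sample.loc["Iris-virginica"]

# Row label plus column label: one column for the matching rows
virginica_len = sample.loc["Iris-virginica", "petal_len"]

# Boolean mask: rows whose petal length exceeds 4.5
long_petals = sample.loc[sample["petal_len"] > 4.5]

print(virginica)
print(virginica_len)
print(long_petals)
```

Note that when a label appears only once in the index, `.loc[label]` returns a Series for that single row rather than a DataFrame.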

SELECTING ROWS IN PANDAS:

# Select the first five rows (positions 0 through 4)
df.iloc[:5]
 
# Select the fourth and fifth rows (the end of a slice is exclusive)
df.iloc[3:5]
 
# Select every row after the fifth row
df.iloc[5:]
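
.iloc slices are zero-based and half-open: `start:stop` includes position `start` and excludes position `stop`. A quick way to check a slice is to try it on a frame whose values make the positions obvious. The sketch below assumes a ten-row illustrative frame named `sample`.

```python
import pandas as pd

# Hypothetical frame with ten rows at positions 0..9 holding the values 10..19.
sample = pd.DataFrame({"value": list(range(10, 20))})

first_five = sample.iloc[:5]         # positions 0, 1, 2, 3, 4
fourth_and_fifth = sample.iloc[3:5]  # positions 3 and 4
after_fifth = sample.iloc[5:]        # positions 5 through 9

print(first_five["value"].tolist())
print(fourth_and_fifth["value"].tolist())
print(after_fifth["value"].tolist())
```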

SELECTING COLUMNS IN PANDAS:

# Select the first 2 columns
df.iloc[:,:2]
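
REPLACING .ix WITH .loc AND .iloc:

The .ix indexer allowed labels and positions to be mixed in one call, but it was removed in pandas 1.0. One common way to get the same effect is to convert one side so that both sides are labels (then use .loc) or both are positions (then use .iloc). The following is a sketch under that assumption, using a three-row illustrative frame named `sample`; it is one possible pattern rather than the only replacement.

```python
import pandas as pd

# Hypothetical three-row frame indexed by class label (illustration only).
sample = pd.DataFrame(
    {
        "sepal_len": [5.1, 7.0, 6.3],
        "sepal_wid": [3.5, 3.2, 3.3],
        "petal_len": [1.4, 4.7, 6.0],
    },
    index=pd.Index(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], name="class"),
)

# Row by label, columns by position: turn the positions into labels, then use .loc
by_label = sample.loc["Iris-virginica", sample.columns[:2]]

# Rows by position, column by label: turn the label into a position, then use .iloc
col_pos = sample.columns.get_indexer(["petal_len"])
by_position = sample.iloc[:2, col_pos]

print(by_label)
print(by_position)
```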