There are several ways to select data from a Pandas DataFrame, and .iloc is one of them. .iloc is integer-location based indexing: it selects rows and columns by position rather than by label.
According to the documentation:
.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
Allowed inputs are:
- An integer, e.g. 5.
- A list or array of integers, e.g. [4, 3, 0].
- A slice object with ints, e.g. 1:7.
- A boolean array.
- A callable function with one argument (the calling Series, DataFrame or Panel) and that returns valid output for indexing (one of the above). This is useful in method chains, when you don’t have a reference to the calling object, but would like to base your selection on some value.
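The boolean-array and callable inputs from the list above are the least obvious ones, so here is a minimal sketch of both. The names df, mask, high_earners and every_other are chosen for illustration only, and the data mirrors the example frame used in this article. The sketch assumes that the boolean input is passed as a plain NumPy array, since .iloc rejects a boolean Series as a mask.

```python
import pandas as pd

# Illustrative data, mirroring the df_students frame used in this article
df = pd.DataFrame(
    {'Name': ['John', 'Sally', 'Joe', 'Anthony', 'Jim', 'Alexander', 'Anna'],
     'Salary': [100000, 108000, 100000, 378000, 110000, 80000, 118000]})

# Boolean array: one True/False per row; True keeps the row.
# Convert the boolean Series to a NumPy array before passing it to .iloc.
mask = (df['Salary'] > 105000).to_numpy()
high_earners = df.iloc[mask]

# Callable: receives the calling DataFrame and must return valid .iloc input,
# here the integer positions of every second row.
every_other = df.iloc[lambda frame: list(range(0, len(frame), 2))]

print(high_earners)
print(every_other)
```

The callable form is mostly useful at the end of a method chain, where the intermediate DataFrame has no variable name to refer to.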
The following Python 3 code, with comments and the output of each command, shows how .iloc behaves:
Generating a Pandas DataFrame in Python
# Import pandas
import pandas as pd
# Generate a Pandas DataFrame
df_students = pd.DataFrame(
    {'Name': ['John', 'Sally', 'Joe', 'Anthony', 'Jim', 'Alexander', 'Anna'],
     'Salary': [100000, 108000, 100000, 378000, 110000, 80000, 118000]})
# Get first 5 rows of your Pandas DataFrame
df_students.head()
      Name  Salary
0     John  100000
1    Sally  108000
2      Joe  100000
3  Anthony  378000
4      Jim  110000
Using .iloc in different ways to extract rows from a Pandas DataFrame
# This will return a Series
print(df_students.iloc[0])
Name        John
Salary    100000
Name: 0, dtype: object
type(df_students.iloc[0])
pandas.core.series.Series
# This will return a DataFrame
print(df_students.iloc[[0]])
   Name  Salary
0  John  100000
type(df_students.iloc[[0]])
pandas.core.frame.DataFrame
print(df_students.iloc[:2])
    Name  Salary
0   John  100000
1  Sally  108000
type(df_students.iloc[:2])
pandas.core.frame.DataFrame
print(df_students.iloc[1:2])
    Name  Salary
1  Sally  108000
print(df_students.iloc[0:1])
   Name  Salary
0  John  100000
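The slices above return one row each because the stop position of a slice is excluded. As a hedged sketch of that rule, together with the list-of-integers input mentioned in the documentation excerpt, the block below rebuilds the same data under the illustrative names df, picked and window.

```python
import pandas as pd

# Illustrative data, mirroring the df_students frame used in this article
df = pd.DataFrame(
    {'Name': ['John', 'Sally', 'Joe', 'Anthony', 'Jim', 'Alexander', 'Anna'],
     'Salary': [100000, 108000, 100000, 378000, 110000, 80000, 118000]})

# List of integers: rows come back in the order the positions are listed
picked = df.iloc[[4, 3, 0]]

# Slice: the stop position is excluded, so 1:3 returns positions 1 and 2 only
window = df.iloc[1:3]

print(picked)
print(window)
```

Because a list always yields a DataFrame, even a single-element list such as [0] keeps the tabular shape, whereas a bare integer yields a Series.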

