DataFrame Methods
Ali Atiyab Husain
In this chapter we will learn some crucial DataFrame methods required for data analysis and cleaning.
To clean the data in a pandas DataFrame we use a variety of functions, some of which are:
1. pd.DataFrame.isnull()
- This function detects null or NaN values in a DataFrame, which helps us replace or remove null/NaN (empty) values.
- It returns a pandas DataFrame of boolean values: True wherever there is a null value and False wherever there is not.
- We can chain the isnull function with the sum function to get the number of null/NaN values in each column of our DataFrame.
Code to create a pandas DataFrame containing null values
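A minimal sketch of such a DataFrame, assuming four rows with hypothetical names, ages and jobs. The values are illustrative only, chosen so that Age has one missing entry and Job has two:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing Age and two missing Job values
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dina"],
    "Age": [25, np.nan, 31, 42],
    "Job": ["Engineer", np.nan, np.nan, "Teacher"],
})

# True marks a missing value, False marks a present one
null_mask = df.isnull()
print(null_mask)
```

The printed result has the same shape as `df`, with each cell replaced by True or False.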
OUTPUT
Here the Age and Job columns each contain null/NaN values, marked by True in the output.
We can infer that the Age and Job columns have 1 and 2 null values respectively, giving our DataFrame a total of 3 null values.
- We can get the number of null values in each column directly by calling the sum function right after the isnull function.
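Chaining the two calls might look like the sketch below. The DataFrame is rebuilt here with hypothetical values so the block runs on its own; only the column names and the null counts (1 in Age, 2 in Job) come from the text.

```python
import numpy as np
import pandas as pd

# Hypothetical data matching the null counts described in the text
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dina"],
    "Age": [25, np.nan, 31, 42],
    "Job": ["Engineer", np.nan, np.nan, "Teacher"],
})

# isnull() gives a boolean DataFrame; sum() counts the True values per column
null_counts = df.isnull().sum()
print(null_counts)

# Summing the per-column counts gives the total for the whole DataFrame
total_nulls = int(null_counts.sum())
print(total_nulls)
```

This works because True is treated as 1 and False as 0 when summed.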
2. pd.DataFrame.info()
- This function reports the data type of each column as well as the number of non-null values in each column.
- It also tells us about the total number of rows and columns of the DataFrame.
- Here we reuse the earlier DataFrame containing the Name, Age and Job columns.
- We can see the number of non-null values in Job and Age (2 and 3 respectively).
- We can also see the data type of each column.
- This function prints its summary by itself and returns None rather than a DataFrame, so there is no need to wrap it in print().
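A short sketch of calling info() on the same kind of DataFrame. The row values are hypothetical; the column names and non-null counts follow the text. The exact layout of the printed summary depends on the pandas version, so no output is shown.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Age has 3 non-null values, Job has 2
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dina"],
    "Age": [25, np.nan, 31, 42],
    "Job": ["Engineer", np.nan, np.nan, "Teacher"],
})

# info() prints the row count, column names, non-null counts and dtypes
result = df.info()

# The summary is printed, not returned: result is None
print(result)
```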
3. pd.DataFrame.describe()
- This function returns a pandas DataFrame showing the count, mean, std, min, 25%, 50% (the median), 75% and max of each numeric column. By default, non-numeric columns are left out.
- This function helps us in analysing the pattern and statistics of each column.
EXAMPLE
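As a minimal sketch, describe() can be applied to a DataFrame with the Name, Age and Job columns used earlier. The row values are hypothetical. Only Age is numeric, so with the default settings it is the only column in the result.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Age is the only numeric column
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dina"],
    "Age": [25, np.nan, 31, 42],
    "Job": ["Engineer", np.nan, np.nan, "Teacher"],
})

# describe() skips NaN values when computing the statistics
summary = df.describe()
print(summary)

# Individual statistics can be read by row label and column name
median_age = summary.loc["50%", "Age"]
print(median_age)
```

Note that the count row reports non-null values only, so it is 3 for Age rather than 4.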