Data Frame Analysis
pandas has built-in methods for analyzing the data in a DataFrame and drawing conclusions from it.
Data set
We start by importing our dataset, which contains bestseller books.
df.min()
Returns the minimum value of every column in the dataset.
df.max()
Returns the maximum value of every column in the dataset.
In both cases (max and min), the result is returned as a pandas Series:
type(df.max())
pandas.core.series.Series
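As a minimal sketch of the min/max behaviour described above, the snippet below builds a tiny stand-in DataFrame. The name `books` and its columns are assumptions for illustration only, not the actual bestseller dataset.

```python
import pandas as pd

# Hypothetical stand-in for the bestseller dataset; column names are assumptions.
books = pd.DataFrame({
    "Name": ["Book A", "Book B", "Book C"],
    "Rating": [4.7, 4.1, 4.9],
    "Price": [8, 22, 15],
})

col_min = books.min()  # smallest value of each column, indexed by column name
col_max = books.max()  # largest value of each column

print(type(col_max))   # the result is a Series, not a DataFrame
print(col_min["Price"], col_max["Price"])
```

Because the result is a Series indexed by column name, a single column's value can be looked up with `col_min["Price"]`.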
df.sum()
Returns the sum of the values in every column.
To sum only the numeric columns:
df.sum(numeric_only=True)
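A short sketch of what `numeric_only=True` does, using a hypothetical three-column frame (the column names are assumptions). The text column is simply left out of the result.

```python
import pandas as pd

# Hypothetical sample data; not the real dataset.
books = pd.DataFrame({
    "Author": ["Ann", "Bob"],
    "Reviews": [100, 250],
    "Price": [8, 22],
})

# Only the numeric columns (Reviews, Price) appear in the result.
numeric_sums = books.sum(numeric_only=True)
print(numeric_sums)
```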
df.count()
Counts the number of non-null values in every column.
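To illustrate that missing values are not counted, here is a small sketch with one deliberately missing rating. The data and column names are made up for the example.

```python
import pandas as pd

# Hypothetical sample data with one missing rating.
books = pd.DataFrame({
    "Name": ["Book A", "Book B", "Book C"],
    "Rating": [4.7, None, 4.9],
})

counts = books.count()  # non-null entries per column
print(counts)           # Name has 3 values, Rating has 2
```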
df.mean()
Returns the mean of the values in every column.
To get the mean of only the first 50 rows, select those rows first (for example with df.head(50)) and then call mean() on the result.
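One possible way to compute the mean of the first 50 rows is sketched below. It assumes a hypothetical frame with 100 rows, and it passes `numeric_only=True` so that the text column is skipped rather than averaged.

```python
import pandas as pd

# Hypothetical data: 100 rows, with Reviews running from 1 to 100.
books = pd.DataFrame({
    "Title": [f"Book {i}" for i in range(1, 101)],
    "Reviews": list(range(1, 101)),
})

# head(50) keeps the first 50 rows; mean() is then computed on that slice only.
first_50_mean = books.head(50).mean(numeric_only=True)
print(first_50_mean["Reviews"])  # mean of 1..50
```

An equivalent slice would be `books.iloc[:50]`. Both select rows by position.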
df.median()
Returns the median, or middle value, of every column.
df.mode(numeric_only=True)
Returns the mode, or most frequently occurring value, of every column.
Passing numeric_only=True restricts the result to the numeric columns.
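Unlike the other methods above, `mode()` returns a DataFrame, because a column can have more than one most-frequent value. The sketch below uses hypothetical data in which `Price` has a tie.

```python
import pandas as pd

# Hypothetical sample data: Year has one mode, Price has two (a tie).
books = pd.DataFrame({
    "Genre": ["Fiction", "Fiction", "Non Fiction", "Non Fiction"],
    "Year": [2018, 2019, 2019, 2020],
    "Price": [8, 8, 15, 15],
})

modes = books.mode(numeric_only=True)
print(modes)
# Each row holds one mode per column; columns with fewer modes
# than the longest one are padded with NaN.
```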
df.describe()
Gives a quick statistical summary (count, mean, standard deviation, min, quartiles and max) of the numeric columns.
To summarize the non-numeric columns instead:
df.describe(include=["object"])
or
df.describe(include=["O"])
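The difference between the two summaries can be sketched as follows. The frame and its column names are assumptions, and the text columns are given `dtype=object` explicitly so that they are object-typed whatever the installed pandas version uses as its default string type.

```python
import pandas as pd

# Hypothetical sample data; text columns are forced to object dtype (an assumption
# made so that include=["object"] selects them).
books = pd.DataFrame({
    "Author": pd.Series(["Ann", "Bob", "Ann"], dtype=object),
    "Genre": pd.Series(["Fiction", "Fiction", "Non Fiction"], dtype=object),
    "Price": [8, 22, 15],
})

numeric_summary = books.describe()                 # numeric columns only
text_summary = books.describe(include=["object"])  # object columns only

print(numeric_summary)
print(text_summary)  # rows: count, unique, top, freq
```

For object columns the summary reports the number of values, the number of distinct values, the most common value (`top`) and how often it occurs (`freq`).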