
Pandas df.info(): The Number Of Entries And The Index Range Are Not The Same

In a Jupyter notebook I printed df.info(), and the result is: print(df.info()) Int64Index: 20620 entries, 0 to 24867 Data columns (total 3 c

Solution 1:

It means that not every possible index value has been used. For example,

In [13]: df = pd.DataFrame([10,20], index=[0,100])

In [14]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 100
Data columns (total 1 columns):
0    2 non-null int64
dtypes: int64(1)
memory usage: 32.0 bytes

df has 2 entries, but the Int64Index "ranges" from 0 to 100.

DataFrames can easily end up like this if rows have been deleted, or if df is a sub-DataFrame of another DataFrame.
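As a minimal sketch of how such gaps arise (the column name `value` and the data are made up for illustration), both filtering and dropping rows keep the original labels of the rows that survive:

```python
import pandas as pd

# Hypothetical data, for illustration only; default index is 0..4.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

# Boolean filtering keeps the original labels of the surviving rows.
sub = df[df["value"] > 25]
print(len(sub))          # number of entries in the sub-DataFrame
print(list(sub.index))   # labels carried over from df

# Dropping rows by label likewise leaves a gap in the index.
dropped = df.drop(index=[1, 2, 3])
print(list(dropped.index))
```

Here sub has 3 entries with labels 2 to 4, and dropped has 2 entries with labels 0 and 4, so in both cases the entry count no longer matches the label range.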

If you reset the index, the index labels will be renumbered in order, starting from 0:

In [17]: df.reset_index(drop=True)
Out[17]: 
    0
0  10
1  20

In [18]: df.reset_index(drop=True).info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 1 columns):
0    2 non-null int64
dtypes: int64(1)
memory usage: 96.0 bytes

To be more precise, as Chris points out, the line

Int64Index: 2 entries, 0 to 100

is merely reporting the first and last value in the Int64Index. It's not reporting min or max values. There can be higher or lower integers in the index:

In [32]: pd.DataFrame([10,20,30], index=[50,0,50]).info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 50 to 50  # notice index value 0 is not mentioned
Data columns (total 1 columns):
0    3 non-null int64
dtypes: int64(1)
memory usage: 48.0 bytes
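If you want the actual range of the labels rather than the first and last ones, you can ask the index directly. A small sketch using the same DataFrame as above (the variable names are just for illustration):

```python
import pandas as pd

df = pd.DataFrame([10, 20, 30], index=[50, 0, 50])

# The first and last labels, which is what the info() summary line reports.
first, last = df.index[0], df.index[-1]

# The smallest and largest labels, which may differ from first/last.
lo, hi = df.index.min(), df.index.max()

print(first, last)
print(lo, hi)

# An unsorted or duplicated index is a hint that "first to last" is not a range.
print(df.index.is_monotonic_increasing)
print(df.index.is_unique)
```

For this DataFrame the first and last labels are both 50, while the minimum is 0 and the maximum is 50; the index is neither sorted nor unique.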
