I Am Unable To Set The Xticks Of My Lineplot In Seaborn To The Values Of The Corresponding Hour
Solution 1:
Since I do not have access to your data, I created some fake data to work with. You can just use your own df instead.
Check this code:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
N = 1440
time = pd.date_range('2020-01-01', periods = N, freq = 'min')
globalpower = np.random.randn(N)
df = pd.DataFrame({'time': time,
                   'globalpower': globalpower})
# Pass the column names as keywords; recent seaborn versions reject positional x and y
graph = sns.lineplot(x = 'time', y = 'globalpower', data = df)
graph.xaxis.set_major_locator(mdates.HourLocator(interval = 1))
graph.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
plt.xticks(rotation = 90)
plt.show()
which gives me this plot:
You can adjust the x axis ticks and labels with:
graph.xaxis.set_major_locator(mdates.HourLocator(interval = 1))
to set a tick every hour
graph.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
to set the format of the x axis labels to "hours:minutes"
plt.xticks(rotation = 90)
to rotate the x axis labels by 90 degrees so they are easier to read
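If one tick per hour is too dense, the same calls can be tuned. The following is a minimal sketch, assuming the same kind of fake minute-level data as above. It uses plain matplotlib to stay small (the Axes returned by sns.lineplot accepts the same xaxis calls), and the 3-hour interval is only an illustrative choice:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Fake data (assumption): one random value per minute for a single day
time = pd.date_range("2020-01-01", periods=1440, freq="min")
df = pd.DataFrame({"time": time, "globalpower": np.random.randn(1440)})

fig, ax = plt.subplots()
ax.plot(df["time"], df["globalpower"])

# One major tick every 3 hours instead of every hour
ax.xaxis.set_major_locator(mdates.HourLocator(interval=3))
# Label each tick as hours:minutes
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
# Rotate the tick labels so they do not overlap
ax.tick_params(axis="x", labelrotation=90)
```

Because the x values are real datetimes, the locator and formatter decide where ticks go and how they read; no manual list of labels is needed.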
Solution 2:
Only a little to add onto Andrea's answer, just to explain what I think was going on in your original code. Here's toy data with minute-precision time strings and random values:
In[0]:
import pandas as pd
import numpy as np
import seaborn as sns
times = []
for h in range(24):
    for m in range(60):
        times.append(f'{h:02}:{m:02}:00')
values = np.random.rand(1440*3) # 1440 minutes in a day, each repeated 3 times
df = pd.DataFrame({'time': times*3,
                   'globalpower': values})
df
Out[0]:
          time  globalpower
0     00:00:00     0.564812
1     00:01:00     0.429477
2     00:02:00     0.827994
3     00:03:00     0.525569
4     00:04:00     0.113478
...        ...          ...
7195  23:55:00     0.624546
7196  23:56:00     0.981141
7197  23:57:00     0.096928
7198  23:58:00     0.170131
7199  23:59:00     0.398853
[7200 rows x 2 columns]
Note that I repeat each time 3x so that sns.lineplot has something to average for each unique time. Graphing this data with your code reproduces the problem you described:
graph = sns.lineplot(x='time', y='globalpower', data=df)
graph.set_xticks(range(0,24))
graph.set_xticklabels(['01:00','02:00','03:00','04:00','05:00','06:00','07:00','08:00','09:00','10:00','11:00','12:00','13:00','14:00','15:00','16:00','17:00','18:00','19:00','20:00','21:00','22:00','23:00','24:00'])
The basic discrepancy is that neither your plotting function nor your x-axis arguments are aware that there is any time information. When you call sns.lineplot with x=df.time and y=df.globalpower, seaborn basically does a groupby operation on the time column for each unique entry and averages the global power values. But it only sees unique strings in the time column. These strings are sorted when plotted, and that order happens to match the order of times in a day only because of how they are written alphanumerically.
To see this, note that using an array of non-time-formatted strings instead (e.g. '0000', '0001', '0002', etc.) results in the same graph:
names = []
for h in range(24):
    for m in range(60):
        names.append(f'{h:02}{m:02}')
#names = ['0000','0001','0002',...]
df2 = pd.DataFrame({'name': names*3,
                    'globalpower': values})
graph2 = sns.lineplot(x='name', y='globalpower', data=df2)
graph2.set_xticks(range(0,24))
graph2.set_xticklabels(['01:00','02:00','03:00','04:00','05:00','06:00','07:00','08:00','09:00','10:00','11:00','12:00','13:00','14:00','15:00','16:00','17:00','18:00','19:00','20:00','21:00','22:00','23:00','24:00'])
So when you get to your tick arguments, saying set_xticks(range(0,24)) and set_xticklabels(['01:00','02:00','03:00'...]) means basically "set ticks at positions 0 through 23 with these 24 labels", though the plot is graphing (in this case) 1440 unique x-values, so 0-23 only spans a sliver of the values.
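To make the position arithmetic concrete: on a string (categorical) axis, the n-th unique value sits at position n, so the label for hour h belongs at position h * 60. Here is a minimal sketch of that, assuming you wanted to keep the strings as they are. It uses plain matplotlib with made-up values, and the names positions and labels are illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# 1440 unique minute strings, one random value each
times = [f"{h:02}:{m:02}:00" for h in range(24) for m in range(60)]
values = np.random.rand(len(times))

fig, ax = plt.subplots()
ax.plot(times, values)  # strings give a categorical axis with positions 0..1439

# Hour h starts at position h * 60, so tick every 60th position
positions = list(range(0, len(times), 60))
labels = [times[p][:5] for p in positions]  # 'HH:MM' taken from the strings themselves
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=90)
```

This works, but the axis still has no notion of time, which is why converting to datetime is the more robust route.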
The fix is basically what Andrea answered: get your time information into a datetime format, and then use matplotlib.dates to format the ticks. For your strings of times (without dates), you can simply do:
df['time'] = pd.to_datetime(df['time'])
And then follow their answer. This gives every time a full timestamp by attaching a placeholder date (1900-01-01 if you pass format='%H:%M:%S' explicitly; without a format, the date pandas fills in may differ); but the odd date doesn't matter if you only care about plotting a 24-hour period averaged for each recurring time.
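Putting the pieces together, here is a hedged end-to-end sketch on the toy data. It assumes the strings are in HH:MM:SS form, passes an explicit format so the placeholder date is predictable, and uses plain matplotlib with a manual groupby in place of sns.lineplot to keep the example small:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy data: minute-precision time strings, each repeated 3 times
times = [f"{h:02}:{m:02}:00" for h in range(24) for m in range(60)]
df = pd.DataFrame({"time": times * 3,
                   "globalpower": np.random.rand(1440 * 3)})

# With an explicit format, every row gets the placeholder date 1900-01-01
df["time"] = pd.to_datetime(df["time"], format="%H:%M:%S")

# Average the repeated readings for each time of day
means = df.groupby("time")["globalpower"].mean()

fig, ax = plt.subplots()
ax.plot(means.index, means.to_numpy())

# Real datetimes on the x axis, so the date locator and formatter apply
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
ax.tick_params(axis="x", labelrotation=90)
```

Since the formatter prints only hours and minutes, the placeholder date never shows up on the plot.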