Python Pandas: add rows based on missing sequential values in a timeseries
I'm new to Python and struggling to manipulate data with the pandas library. I have a pandas DataFrame like this:
       year  value
    0    91      1
    1    93      4
    2    94      7
    3    95     10
    4    98     13
and I want to fill in the missing years by creating rows with empty values, like this:
       year  value
    0    91      1
    1    92      0
    2    93      4
    3    94      7
    4    95     10
    5    96      0
    6    97      0
    7    98     13
How do I do that in Python? (I want to do this so I can plot the values without skipping years.)
I would create a new dataframe that has year as the index and includes the entire date range you need to cover. Then you can set the values across the two dataframes, and the index will make sure the correct rows are matched (I've had to use fillna to set the missing years to zero; by default they are set to NaN):
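    import pandas as pd

    # Original data, indexed by year
    df = pd.DataFrame({'year': [91, 93, 94, 95, 98], 'value': [1, 4, 7, 10, 13]})
    df.index = df.year

    # Second dataframe covering every year in the range, also indexed by year
    df2 = pd.DataFrame({'year': range(91, 99), 'value': 0})
    df2.index = df2.year

    # Assignment aligns on the index; years missing from df become NaN
    df2.value = df.value
    df2 = df2.fillna(0)
    df2

          value  year
    year
    91        1    91
    92        0    92
    93        4    93
    94        7    94
    95       10    95
    96        0    96
    97        0    97
    98       13    98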
Finally, you can use reset_index if you don't want year as your index:
    # Drop the duplicate 'year' column, then move the index back into a column
    df2.drop('year', axis=1).reset_index()

       year  value
    0    91      1
    1    92      0
    2    93      4
    3    94      7
    4    95     10
    5    96      0
    6    97      0
    7    98     13
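As an alternative to building a second dataframe by hand, the same result can probably be reached with `reindex`. This is only a sketch, not part of the answer above: it assumes the years are integers and that the range to cover runs from the smallest to the largest year present. The names `full_years` and `filled` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'year': [91, 93, 94, 95, 98], 'value': [1, 4, 7, 10, 13]})

# Every year from the first to the last observed one (assumed range).
full_years = pd.Index(
    range(int(df['year'].min()), int(df['year'].max()) + 1),
    name='year',
)

# Index by year, reindex onto the full range so missing years are
# created with value 0, then turn the index back into a column.
filled = (
    df.set_index('year')
      .reindex(full_years, fill_value=0)
      .reset_index()
)

print(filled)
```

Passing `fill_value=0` avoids the intermediate NaN step, so the `value` column should stay an integer column instead of being converted to float.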