Python Pandas add rows based on missing sequential values in a timeseries

March 15, 2015

I'm new to Python and struggling to manipulate data with the pandas library. I have a pandas DataFrame like this:

       year  value
    0    91      1
    1    93      4
    2    94      7
    3    95     10
    4    98     13

and I want to fill in the missing years by creating rows with empty values, like this:

       year  value
    0    91      1
    1    92      0
    2    93      4
    3    94      7
    4    95     10
    5    96      0
    6    97      0
    7    98     13

How do I do that in Python? (I want to be able to plot the values without skipping years.)

I would create a new DataFrame that has year as its index and covers the entire range of years you need. You can then assign the values from one DataFrame to the other, and the index makes sure the correct rows are matched. (I had to use fillna to set the missing years to zero, since by default they are set to NaN.)

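    import pandas as pd

    df = pd.DataFrame({'year': [91, 93, 94, 95, 98], 'value': [1, 4, 7, 10, 13]})
    df.index = df.year

    # Frame covering every year in the range, with a placeholder value of 0
    df2 = pd.DataFrame({'value': 0, 'year': range(91, 99)})
    df2.index = df2.year

    # Assignment aligns on the index; years missing from df become NaN
    df2['value'] = df['value']
    # Replace NaN with 0 and cast back to int (NaN forces the column to float)
    df2 = df2.fillna(0).astype(int)
    df2

          value  year
    year
    91        1    91
    92        0    92
    93        4    93
    94        7    94
    95       10    95
    96        0    96
    97        0    97
    98       13    98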
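To see why fillna is needed, here is a minimal sketch of the index alignment that happens during the assignment above. The names `known`, `full`, and `filled` and the three-year range are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical miniature of the data: values exist only for years 91 and 93.
known = pd.Series([1, 4], index=[91, 93])

# Target frame indexed by every year we want to cover.
full = pd.DataFrame({"value": 0}, index=[91, 92, 93])

# Column assignment matches rows by index label, not by position.
# Year 92 has no label in `known`, so it receives NaN.
full["value"] = known

# Fill the gap with zero and cast back to integers.
filled = full.fillna(0).astype(int)
print(filled)
```

Because NaN is a floating-point value, the column is stored as floats until it is cast back, which is why the sketch ends with `astype(int)`.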

Finally, you can use reset_index if you don't want year as the index:

    # Drop the duplicate year column, then turn the index back into a column
    df2.drop(columns='year').reset_index()

       year  value
    0    91      1
    1    92      0
    2    93      4
    3    94      7
    4    95     10
    5    96      0
    6    97      0
    7    98     13
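As an alternative to building a second DataFrame by hand, the same result can probably be reached more compactly with `reindex`. This is a sketch of a different technique, not the method used in the answer above; it assumes the years are integers and that the range to cover runs from the smallest to the largest year present. The names `all_years` and `filled` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"year": [91, 93, 94, 95, 98], "value": [1, 4, 7, 10, 13]})

# Every year from the first to the last observed one, inclusive.
all_years = range(df["year"].min(), df["year"].max() + 1)

filled = (
    df.set_index("year")                   # use year as the row label
      .reindex(all_years, fill_value=0)    # add missing years, filled with 0
      .rename_axis("year")                 # make sure the index keeps its name
      .reset_index()                       # turn year back into a column
)
print(filled)
```

Passing `fill_value=0` avoids the intermediate NaN values, so no separate fillna step is needed. The resulting frame can then be plotted with, for example, `filled.plot(x="year", y="value")`.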
