python - resampling non-time-series data -

August 15, 2012

I have data that I'm handling as pandas DataFrames. Each contains 10,000 rows and 6 columns.

The problem is that I have done several trials, and the different datasets have different index values. (It's "force - length" testing of several materials, and of course the measurement points are not aligned perfectly.)

Now my idea was to "resample" the data using the index, which contains the length values. It seems the resampling function in pandas is only available for datetime datatypes.

I tried to convert the index via to_datetime and succeeded. But after resampling I need to get back to the original scale, so I would need some kind of from_datetime function.

Is there a way?

Or am I on the wrong track and should I rather use functions like groupby?

Thank you already!

Edit:

Sorry for not explaining enough. I'm an inexperienced Python user and new to this forum.

The data looks like the table below. The length is used as the index. I have a few of these DataFrames, so it would be nice to align them to the same "framerate" and cut them so that I can, e.g., compare the different datasets.

The idea I already tried is this one:

    df_1_dt = df_1  # generate table for conversion
    df_1_dt.index = pd.to_datetime(df_1_dt.index, unit='s')  # convert, simulating seconds (good idea?!)
    df_1_dt_rs = df_1_dt  # generate df for resampling
    df_1_dt_rs = df_1_dt_rs.resample(rule='s')  # resample on the generated time

Data:

    +---------------------------------------------------+
    ¦  index (length)   ¦    force1     ¦    force2     ¦
    ¦-------------------+---------------+---------------¦
    ¦ 8.04662074828e-06 ¦ 4.74251270294 ¦ 4.72051584721 ¦
    ¦ 8.0898882798e-06  ¦ 4.72051584721 ¦ 4.72161570191 ¦
    ¦ 1.61797765596e-05 ¦ 4.69851899147 ¦ 4.72271555662 ¦
    ¦ 1.65476570973e-05 ¦ 4.65452528    ¦ 4.72491526604 ¦
    ¦ 2.41398605024e-05 ¦ 4.67945501539 ¦ 4.72589291467 ¦
    ¦ 2.42696630876e-05 ¦ 4.70438475079 ¦ 4.7268705633  ¦
    ¦ 9.60953101751e-05 ¦ 4.72931448619 ¦ 4.72784821192 ¦
    ¦ 0.00507703541206  ¦ 4.80410369237 ¦ 4.73078115781 ¦
    ¦ 0.00513927175509  ¦ 4.87889289856 ¦ 4.7337141037  ¦
    ¦ 0.00868965311878  ¦ 4.9349848032  ¦ 4.74251282215 ¦
    ¦ 0.00902026197556  ¦ 4.99107670784 ¦ 4.7513115406  ¦
    ¦ 0.00929150878827  ¦ 5.10326051712 ¦ 4.76890897751 ¦
    ¦ 0.0291729332784   ¦ 5.14945375919 ¦ 4.78650641441 ¦
    ¦ 0.0296332588857   ¦ 5.17255038023 ¦ 4.79530513287 ¦
    ¦ 0.0297080942518   ¦ 5.19564700127 ¦ 4.80410385132 ¦
    ¦ 0.0362595526707   ¦ 5.2187436223  ¦ 4.80850321054 ¦
    ¦ 0.0370305483177   ¦ 5.24184024334 ¦ 4.81290256977 ¦
    ¦ 0.0381506204153   ¦ 5.28803348541 ¦ 4.82170128822 ¦
    ¦ 0.0444440795306   ¦ 5.30783069134 ¦ 4.83050000668 ¦
    ¦ 0.0450121369102   ¦ 5.3177292943  ¦ 4.8348993659  ¦
    ¦ 0.0453465140473   ¦ 5.32762789726 ¦ 4.83929872513 ¦
    ¦ 0.0515533437013   ¦ 5.33752650023 ¦ 4.85359662771 ¦
    ¦ 0.05262489708     ¦ 5.34742510319 ¦ 4.8678945303  ¦
    ¦ 0.0541273847206   ¦ 5.36722230911 ¦ 4.89649033546 ¦
    ¦ 0.0600755845953   ¦ 5.37822067738 ¦ 4.92508614063 ¦
    ¦ 0.0607712385295   ¦ 5.38371986151 ¦ 4.93938404322 ¦
    ¦ 0.0612954159368   ¦ 5.38921904564 ¦ 4.9536819458  ¦
    ¦ 0.0670288249293   ¦ 5.39471822977 ¦ 4.97457891703 ¦
    ¦ 0.0683640870058   ¦ 5.4002174139  ¦ 4.99547588825 ¦
    ¦ 0.0703192637772   ¦ 5.41121578217 ¦ 5.0372698307  ¦
    ¦ 0.0757871634772   ¦ 5.43981158733 ¦ 5.07906377316 ¦
    ¦ 0.0766597757545   ¦ 5.45410948992 ¦ 5.09996074438 ¦
    ¦ 0.077317850103    ¦ 5.4684073925  ¦ 5.12085771561 ¦
    ¦ 0.0825991083545   ¦ 5.48270529509 ¦ 5.13295596838 ¦
    ¦ 0.0841354654428   ¦ 5.49700319767 ¦ 5.14505422115 ¦
    ¦ 0.0865525182528   ¦ 5.52559900284 ¦ 5.1692507267  ¦
    +---------------------------------------------------+

It sounds like you want to round the length figures to a lower precision.

If that is the case, you can use the built-in rounding function:

(dummy data)

    >>> import pandas as pd
    >>> df = pd.DataFrame([[1.0000005, 4], [1.232463632, 5], [5.234652, 9], [5.675322, 10]], columns=['length', 'force'])
    >>> df
         length  force
    0  1.000001      4
    1  1.232464      5
    2  5.234652      9
    3  5.675322     10
    >>> df['rounded_length'] = df.length.apply(round, ndigits=0)
    >>> df
         length  force  rounded_length
    0  1.000001      4             1.0
    1  1.232464      5             1.0
    2  5.234652      9             5.0
    3  5.675322     10             6.0

Then you can replicate the resample() workflow using groupby:

    >>> df.groupby('rounded_length').mean().force
    rounded_length
    1.0     4.5
    5.0     9.0
    6.0    10.0
    Name: force, dtype: float64
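For lengths on the scale shown in the question (roughly 0 to 0.09), rounding to whole numbers would put every row into one group. A minimal sketch of the same round-then-groupby idea with an arbitrary bin width follows. The helper name `bin_by_length`, the `step` parameter, and the sample values are assumptions made up for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd


def bin_by_length(df, step):
    """Average all rows whose index falls into the same bin of width `step`.

    Hypothetical helper: assumes the DataFrame index holds the length values.
    """
    # Snap every length to the nearest multiple of `step`.
    binned = np.round(df.index.to_numpy(dtype=float) / step) * step
    # Group rows by their snapped length and average each group.
    out = df.groupby(binned).mean()
    out.index.name = df.index.name
    return out


# Made-up sample data, only shaped like the table in the question.
df_1 = pd.DataFrame(
    {"force1": [4.80, 4.88, 4.93, 5.15], "force2": [4.73, 4.73, 4.74, 4.79]},
    index=pd.Index([0.0051, 0.0052, 0.0087, 0.0292], name="length"),
)

resampled = bin_by_length(df_1, step=0.01)
print(resampled)
```

Applying the same `step` to every trial gives them a shared set of index values, although bins that contain no measurement in a given trial are simply absent from that trial's result.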

Generally, resample is meant for dates. If you're using it for anything other than dates, there's probably a more elegant solution!
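If the goal is to compare trials point by point rather than to average within bins, one possible alternative (not suggested in the answer above) is to linearly interpolate each trial onto a common length grid with `np.interp`. This is only a sketch: the helper name `interpolate_to_grid`, the grid size, and the sample trials are assumptions, and it presumes each index is sorted in ascending order.

```python
import numpy as np
import pandas as pd


def interpolate_to_grid(df, grid):
    """Linearly interpolate every column of `df` onto the points in `grid`.

    Hypothetical helper: assumes a numeric index sorted in ascending order.
    """
    x = df.index.to_numpy(dtype=float)
    data = {
        col: np.interp(grid, x, df[col].to_numpy(dtype=float))
        for col in df.columns
    }
    return pd.DataFrame(data, index=pd.Index(grid, name=df.index.name))


# Two made-up trials measured at different lengths.
trial_a = pd.DataFrame(
    {"force": [1.0, 3.0, 5.0]},
    index=pd.Index([0.0, 0.02, 0.04], name="length"),
)
trial_b = pd.DataFrame(
    {"force": [2.5, 4.5]},
    index=pd.Index([0.01, 0.03], name="length"),
)

# Restrict the grid to the range covered by both trials, so that
# np.interp never has to extrapolate beyond the measured data.
start = max(trial_a.index.min(), trial_b.index.min())
stop = min(trial_a.index.max(), trial_b.index.max())
grid = np.linspace(start, stop, 5)

a_on_grid = interpolate_to_grid(trial_a, grid)
b_on_grid = interpolate_to_grid(trial_b, grid)

# Both frames now share one index, so they can be compared directly.
difference = a_on_grid["force"] - b_on_grid["force"]
print(difference)
```

Interpolation keeps the original length scale, so no conversion back from datetime is needed. Whether linear interpolation is appropriate depends on how densely each trial was sampled.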
