python - Classifying Data in a New Column
I have the following df:
    column 1
    1
    2435
    3345
    104
    505
    6005
    10000
    80000
    100000
    4000000
    4440
    520
    ...
This structure is not the best for plotting a histogram, which is the main purpose. Bins don't really solve the problem either, at least from what I've tested so far. That's why I'd like to create my own bins in a new column:
I want to assign every value within a range in column 1 to a bucket in column2, like this:
    column 1    column2
    1           < 10000
    2435        < 10000
    3345        < 10000
    104         < 10000
    505         < 10000
    6005        < 10000
    10000       < 50000
    80000       < 150000
    100000      < 150000
    4000000     < 250000
    4440        < 10000
    520         < 10000
    ...
Once that column is there, creating the plot is easier.
Thanks!
There is a pandas equivalent for this: cut. There is a section describing it in the pandas documentation. cut returns the open/closed interval that each value falls into:
    In [29]: df['bin'] = pd.cut(df['column 1'], bins=[0, 10000, 50000, 150000, 25000000])
             df
    Out[29]:
        column 1                 bin
    0          1          (0, 10000]
    1       2435          (0, 10000]
    2       3345          (0, 10000]
    3        104          (0, 10000]
    4        505          (0, 10000]
    5       6005          (0, 10000]
    6      10000          (0, 10000]
    7      80000     (50000, 150000]
    8     100000     (50000, 150000]
    9    4000000  (150000, 25000000]
    10      4440          (0, 10000]
    11       520          (0, 10000]
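Note that 10000 lands in (0, 10000] here, while the question puts it in the "< 50000" bucket. A minimal sketch of one way to get the question's labels is to pass right=False (so each bin excludes its upper edge) together with labels. The label strings and the column name column2 follow the question; the top edge of 25000000 is taken from the bins above and is an assumption about what the last bucket should be.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"column 1": [1, 2435, 3345, 104, 505, 6005, 10000,
                                80000, 100000, 4000000, 4440, 520]})

edges = [0, 10000, 50000, 150000, 25000000]
# One label per bin; the last label is an assumption based on the top edge.
labels = ["< 10000", "< 50000", "< 150000", "< 25000000"]

# right=False makes the bins [0, 10000), [10000, 50000), ...
# so a value of exactly 10000 falls into "< 50000".
df["column2"] = pd.cut(df["column 1"], bins=edges, labels=labels, right=False)
print(df)
```

Values at or above the last edge (or below the first) would become NaN, so the edges need to cover the full range of the data.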
The dtype of the bin column is category, and it can be used for filtering, counting, plotting etc.
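As a rough illustration of the counting and filtering mentioned above, the sketch below rebuilds the same bin column and counts the rows per bin. The variable names counts and small are made up for this example.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"column 1": [1, 2435, 3345, 104, 505, 6005, 10000,
                                80000, 100000, 4000000, 4440, 520]})
df["bin"] = pd.cut(df["column 1"], bins=[0, 10000, 50000, 150000, 25000000])

# Count rows per bin; sort_index() puts the bins in their natural order.
# Empty bins are kept with a count of 0 because the column is categorical.
counts = df["bin"].value_counts().sort_index()
print(counts)

# Filter to the rows in the first bin via the category codes.
small = df[df["bin"].cat.codes == 0]
print(small)
```

With matplotlib installed, counts.plot(kind="bar") should then give the histogram-like bar chart the question is after.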