Making this C array processing code more Python (and even NumPy)
I'm trying to get my head around the amazing list-processing abilities of Python (and NumPy). I'm converting some C code I wrote to Python.
I have a text data file where the first row is a header, and then every odd row is my input data and every even row is my output data. All data is space separated. I'm quite chuffed that I managed to read all the data into lists using nested list comprehensions. Amazing stuff.
    with open('data.txt', 'r') as f:
        # lines is a list of strings
        lines = list(f)

    # convert the header row to a list of ints and get info
    # (list() is needed on Python 3, where map() returns an iterator)
    header = list(map(int, lines[0].split()))
    num_samples = header[0]
    input_dim = header[1]
    output_dim = header[2]
    del header

    # bad ass list comprehensions
    inputs = [[float(x) for x in l.split()] for l in lines[1::2]]
    outputs = [[float(x) for x in l.split()] for l in lines[2::2]]
    # on Python 3 the comprehension variables x and l do not leak, so only lines is deleted
    del lines
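For illustration, the same parsing can be sketched against an in-memory sample so it runs without `data.txt`. The sample contents and the function name `parse_samples` are made up for this example; only the layout (a header row, then alternating input and output rows) comes from the description above.

```python
import io

def parse_samples(f):
    """Parse a header row followed by alternating input/output rows."""
    lines = [line for line in f if line.strip()]  # ignore blank lines
    # Header: number of samples, input dimension, output dimension
    num_samples, input_dim, output_dim = map(int, lines[0].split())
    inputs = [[float(x) for x in l.split()] for l in lines[1::2]]
    outputs = [[float(x) for x in l.split()] for l in lines[2::2]]
    return num_samples, input_dim, output_dim, inputs, outputs

# Hypothetical file contents: 2 samples, 3 inputs and 2 outputs per sample
sample = io.StringIO("2 3 2\n1.0 2.0 3.0\n10.0 20.0\n4.0 5.0 6.0\n30.0 40.0\n")
num_samples, input_dim, output_dim, inputs, outputs = parse_samples(sample)
print(inputs)
print(outputs)
```

Unpacking the header in one assignment avoids the temporary `header` list, and any file-like object can be passed in place of the `StringIO`.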
Then I want to produce a new list where each element is a function of the corresponding input-output pair. I couldn't figure out how to do this with any Python-specific optimizations. Here it is in C-style Python:
    import math

    # calculate position
    pos_list = []
    pos_y = 0
    for i in range(num_samples):
        pantilt = outputs[i]
        target = inputs[i]
        # fold pan angles outside [-90, 90] back into range and flip the tilt
        if pantilt[0] > 90:
            pantilt[0] -= 180
            pantilt[1] *= -1
        elif pantilt[0] < -90:
            pantilt[0] += 180
            pantilt[1] *= -1
        tan_pan = math.tan(math.radians(pantilt[0]))
        tan_tilt = math.tan(math.radians(pantilt[1]))
        pos = [0, pos_y, 0]
        pos[2] = tan_tilt * (target[1] - pos[1]) / math.sqrt(tan_pan * tan_pan + 1)
        pos[0] = pos[2] * tan_pan
        pos[0] += target[0]
        pos[2] += target[2]
        pos_list.append(pos)
    del pantilt, target, tan_pan, tan_tilt, pos, pos_y
I tried to do it with a comprehension, or map, but couldn't figure out how to:
- draw from two different lists (both input and output) for each element of the pos_list array
- put the body of the algorithm in a comprehension. Would it have to be a separate function, or is there a funky way of using lambdas for this?
- Would it even be possible to do it with no loops at all, just stick it in NumPy and vectorize the whole thing?
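On the first two points, one possible sketch pairs the two lists with `zip` and moves the loop body into an ordinary function, which tends to be more readable than a multi-statement lambda. The name `position` and the sample data are assumptions made for illustration. Unlike the loop above, this version does not modify `outputs` in place.

```python
import math

def position(target, pantilt, pos_y=0.0):
    """Compute one position from one (input, output) pair."""
    pan, tilt = pantilt[0], pantilt[1]
    # Fold pan angles outside [-90, 90] back into range and flip the tilt
    if pan > 90:
        pan, tilt = pan - 180, -tilt
    elif pan < -90:
        pan, tilt = pan + 180, -tilt
    tan_pan = math.tan(math.radians(pan))
    tan_tilt = math.tan(math.radians(tilt))
    z = tan_tilt * (target[1] - pos_y) / math.sqrt(tan_pan * tan_pan + 1)
    return [z * tan_pan + target[0], pos_y, z + target[2]]

# Made-up sample data: each input row pairs with the output row at the same index
inputs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
outputs = [[45.0, 45.0], [135.0, 45.0]]

# zip() walks both lists in lockstep, so no index variable is needed
pos_list = [position(target, pantilt) for target, pantilt in zip(inputs, outputs)]
print(pos_list)
```

The comprehension still loops in Python, so this is mainly a readability change rather than a speed-up.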
Here is one vectorized approach using boolean indexing (masks):
    import numpy as np

    def mask_vectorized(inputs, outputs, pos_y):
        # Create a copy of the outputs array for editing purposes
        pantilt_2d = outputs[:, :2].copy()

        # Masks corresponding to the IF conditional statements in the original code
        mask_col0_lt = pantilt_2d[:, 0] < -90
        mask_col0_gt = pantilt_2d[:, 0] > 90

        # Edit the first column as per the statements in the original code
        pantilt_2d[:, 0][mask_col0_gt] -= 180
        pantilt_2d[:, 0][mask_col0_lt] += 180

        # Edit the second column as per the statements in the original code
        pantilt_2d[mask_col0_lt | mask_col0_gt, 1] *= -1

        # Get vectorized tan_pan and tan_tilt
        tan_pan_tilt = np.tan(np.radians(pantilt_2d))

        # Vectorized calculation for: "tan_tilt * (target[1] .." from the original code
        v = (tan_pan_tilt[:, 1] * (inputs[:, 1] - pos_y)) / np.sqrt((tan_pan_tilt[:, 0] ** 2) + 1)

        # Set up the output NumPy array
        # (row count taken from inputs rather than a global num_samples)
        pos_array_vectorized = np.empty((inputs.shape[0], 3))

        # Put the values into the columns of the output array
        pos_array_vectorized[:, 0] = inputs[:, 0] + tan_pan_tilt[:, 0] * v
        pos_array_vectorized[:, 1] = pos_y
        pos_array_vectorized[:, 2] = inputs[:, 2] + v

        # Convert to a list, if that is the desired final output
        # (keeping it as a NumPy array would boost performance further)
        return pos_array_vectorized.tolist()
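As a cross-check of the idea, here is a compact variant that uses `np.where` instead of in-place mask edits and compares the result against a plain loop on random data. The names `positions_where` and `positions_loop` are hypothetical, and this sketch converts its arguments with `np.asarray` so that plain lists of lists are accepted too.

```python
import math
import numpy as np

def positions_where(inputs, outputs, pos_y):
    inputs = np.asarray(inputs, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    pan, tilt = outputs[:, 0], outputs[:, 1]
    flip = (pan > 90) | (pan < -90)
    # np.where builds new arrays, so the caller's data is not modified
    pan = np.where(pan > 90, pan - 180, np.where(pan < -90, pan + 180, pan))
    tilt = np.where(flip, -tilt, tilt)
    tan_pan = np.tan(np.radians(pan))
    tan_tilt = np.tan(np.radians(tilt))
    v = tan_tilt * (inputs[:, 1] - pos_y) / np.sqrt(tan_pan ** 2 + 1)
    return np.column_stack((inputs[:, 0] + tan_pan * v,
                            np.full(len(v), pos_y),
                            inputs[:, 2] + v))

def positions_loop(inputs, outputs, pos_y):
    # Reference implementation: same arithmetic, one row at a time
    result = []
    for target, pantilt in zip(inputs, outputs):
        pan, tilt = pantilt[0], pantilt[1]
        if pan > 90:
            pan, tilt = pan - 180, -tilt
        elif pan < -90:
            pan, tilt = pan + 180, -tilt
        tan_pan = math.tan(math.radians(pan))
        tan_tilt = math.tan(math.radians(tilt))
        z = tan_tilt * (target[1] - pos_y) / math.sqrt(tan_pan ** 2 + 1)
        result.append([z * tan_pan + target[0], pos_y, z + target[2]])
    return result

# Random test data (made up): pan in (-180, 180), tilt in (-80, 80) degrees
rng = np.random.default_rng(0)
outputs = np.column_stack((rng.uniform(-180, 180, 100), rng.uniform(-80, 80, 100)))
inputs = rng.random((100, 3))

vec = positions_where(inputs, outputs, 3.4)
ref = np.array(positions_loop(inputs, outputs, 3.4))
print(np.allclose(vec, ref))
```

Whether `np.where` or in-place masking reads better is a matter of taste; both avoid the Python-level loop.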
Runtime tests

    In [415]: # Parameters and setup input arrays
         ...: num_samples = 1000
         ...: outputs = np.random.randint(-180,180,(num_samples,5))
         ...: inputs = np.random.rand(num_samples,6)
         ...: pos_y = 3.4
         ...:

    In [416]: %timeit original(inputs,outputs,pos_y)
    100 loops, best of 3: 2.44 ms per loop

    In [417]: %timeit mask_vectorized(inputs,outputs,pos_y)
    10000 loops, best of 3: 181 µs per loop