amazon web services - No module named simplejson in Python UDF on EMR


I'm running an Amazon Elastic MapReduce (EMR) job using Pig. I'm having trouble importing the json or simplejson modules in a Python user-defined function (UDF).

Here is the code:

#!/usr/bin/env python
import simplejson as json

@outputSchema('m:map[]')
def flattenjson(text):
    j = json.loads(text)
    ...

When I try to register the function in Pig, I get an error saying "No module named simplejson":

grunt> register 's3://chopperui-emr/code/flattendict.py' using jython as flatten;
2015-05-31 16:53:43,041 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1121: Python Error. Traceback (most recent call last):
  File "/tmp/pig6071834754384533869tmp/flattendict.py", line 32, in <module>
    import simplejson as json
ImportError: No module named simplejson

However, the Amazon AMI includes Python 2.6, which ships json as a standard package (using import json doesn't work either). Also, if I try to install simplejson using pip, it says it's already installed (on both the master and core nodes):

[hadoop@ip-172-31-46-71 ~]$ pip install simplejson
Requirement already satisfied (use --upgrade to upgrade): simplejson in /usr/local/lib64/python2.6/site-packages

Also, it works fine if I run Python interactively from the command line on the master node:

[hadoop@ip-172-31-46-71 ~]$ python
Python 2.6.9 (unknown, Apr  1 2015, 18:16:00)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>>

There must be something different about how EMR or Pig sets up the Python environment, but what?

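A Pig UDF registered with "using jython" runs on Jython, not on the system Python, and that does not work with simplejson. Jython is a separate Python implementation that runs on the JVM, so the packages pip installed under /usr/local/lib64/python2.6/site-packages are not on its module search path. The failure of import json is probably for a related reason: json only entered the standard library in Python 2.6, and the Jython bundled with Pig is likely an older 2.5 release.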
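To check which interpreter is actually executing a script, a small diagnostic like the one below can help. This is only a sketch: describe_interpreter is a hypothetical helper name, not part of Pig or Jython. It relies on the fact that sys.platform starts with "java" under Jython.

```python
import sys


def describe_interpreter():
    """Return basic facts about the interpreter running this code.

    describe_interpreter is a hypothetical helper for illustration only.
    """
    return {
        # Jython reports a platform string beginning with 'java'
        'is_jython': sys.platform.startswith('java'),
        'version': sys.version,
        # Directories searched on import; pip's site-packages should
        # appear here only when running under the system Python
        'path': list(sys.path),
    }


if __name__ == '__main__':
    info = describe_interpreter()
    print(info['is_jython'])
    print(info['version'])
    for entry in info['path']:
        print(entry)
```

Run from the shell on the master node, this should report the system Python 2.6 and its site-packages directory. Called from inside a script registered with "using jython", the same function would be expected to report a Jython interpreter with a different search path.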

You can try jyson instead, a JSON parser for Jython.
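A minimal sketch of how a UDF script might fall back to jyson when the json module is missing. It assumes the jyson jar has already been made available on the classpath of Pig's Jython, and that the codec is importable as com.xhaus.jyson.JysonCodec with a loads function, as in the jyson documentation. parse_json_to_map is a hypothetical name used for illustration, not the flattening function from the question.

```python
try:
    # Available on CPython 2.6+ and on newer Jython releases
    import json
except ImportError:
    # Assumption: the jyson jar is on the classpath of Pig's Jython
    from com.xhaus.jyson import JysonCodec as json

try:
    # Pig makes outputSchema available to scripts registered with jython
    outputSchema
except NameError:
    # Outside Pig (for example, local testing) use a no-op stand-in
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


@outputSchema('m:map[]')
def parse_json_to_map(text):
    """Parse a JSON object string and return it as a map (hypothetical UDF)."""
    return json.loads(text)
```

Because the stand-in decorator only kicks in when outputSchema is undefined, the same file can be exercised with a plain python interpreter before it is registered in Pig.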

