hadoop - Add Spark to Oozie shared lib


By default, the Oozie shared lib directory provides libraries for Hive, Pig, and Map-Reduce. If I want to run a Spark job on Oozie, it might be better to add the Spark lib jars to Oozie's shared lib instead of copying them to the app's lib directory.
How can I add the Spark lib jars (including spark-core and its dependencies) to Oozie's shared lib? Any comment / answer is appreciated.

The Spark action is scheduled to be released with Oozie 4.2.0, though the doc seems to be a bit behind. See the related JIRA here: Oozie JIRA - Add Spark action executor

Cloudera's release CDH 5.4 already has it though; see the official doc here: CDH 5.4 Oozie doc - Oozie Spark Action Extension
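For orientation, here is a rough sketch of what a workflow using the Spark action might look like once it is available. The element names follow the spark-action 0.1 schema as I understand it, so check them against the doc linked above. The class name, jar path, master value and file name are placeholders I made up for illustration.

```shell
# Write a sample workflow definition locally.
# The quoted 'EOF' keeps ${...} literal so Oozie resolves it, not the shell.
cat > spark-action-workflow.xml <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.5" name="spark-wf">
  <start to="spark-node"/>
  <action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- placeholder master, job name, class and jar -->
      <master>yarn-cluster</master>
      <name>MySparkJob</name>
      <class>com.example.MySparkJob</class>
      <jar>${nameNode}/user/me/apps/spark-wf/lib/my-spark-job.jar</jar>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Spark action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF
```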

With an older version of Oozie, the jars can be shared through various approaches. The first approach may work best. The complete list follows anyway:

Below are the various ways to include a jar with your workflow:

Set oozie.libpath=/path/to/jars,another/path/to/jars in job.properties.

This is useful if you have many workflows that all need the same jar; you can put it in one place in HDFS and use it with many workflows. The jars will be available to all actions in that workflow. There is no need to ever point this at the ShareLib location. (I see that in a lot of workflows.) Oozie knows where the ShareLib is and will include it automatically if you set oozie.use.system.libpath=true in job.properties.
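A minimal sketch of this first approach, assuming the Spark jars are kept in a shared HDFS directory. The directory /user/shared/spark-jars, the local jar location and the application path are all hypothetical; substitute your own.

```shell
# Write a job.properties that adds a shared jar directory to the workflow
# classpath and keeps the system ShareLib enabled. Paths are placeholders.
cat > job.properties <<'EOF'
oozie.use.system.libpath=true
oozie.libpath=/user/shared/spark-jars
oozie.wf.application.path=/user/me/apps/spark-wf
EOF

# Upload the Spark jars once; every workflow that sets oozie.libpath to this
# directory can then use them. Skipped when no HDFS client is on the PATH.
SPARK_JARS="${SPARK_JARS:-/opt/spark/lib}"   # assumed local jar location
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p /user/shared/spark-jars
  hdfs dfs -put -f "$SPARK_JARS"/*.jar /user/shared/spark-jars/
fi
```

Several directories can be listed in oozie.libpath, separated by commas, as in the example setting above.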

Create a directory named “lib” next to your workflow.xml in HDFS and put jars in there.

This is useful if you have some jars that you only need for one workflow. Oozie will automatically make those jars available to all actions in that workflow.
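As a sketch, the application directory can be staged locally and then uploaded as a whole. The directory name spark-wf, the local jar location and the HDFS target are hypothetical, and the empty workflow.xml is only a stand-in for your real workflow definition.

```shell
APP_DIR=spark-wf                             # hypothetical application name
SPARK_JARS="${SPARK_JARS:-/opt/spark/lib}"   # assumed local jar location

# workflow.xml and lib/ sit side by side in the application directory.
mkdir -p "$APP_DIR/lib"
touch "$APP_DIR/workflow.xml"                # stand-in for the real workflow

# Copy whatever jars exist locally into lib/.
for jar in "$SPARK_JARS"/*.jar; do
  if [ -e "$jar" ]; then
    cp "$jar" "$APP_DIR/lib/"
  fi
done

# Upload the whole directory; skipped when no HDFS client is on the PATH.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -mkdir -p /user/me/apps
  hdfs dfs -put -f "$APP_DIR" /user/me/apps/
fi
```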

Specify a tag in an action with the path to a single jar; you can have multiple such tags.

This is useful if you want some jars only for a specific action and not all actions in a workflow. The downside is that you have to specify them in your workflow.xml, so if you ever need to add/remove some jars, you have to change your workflow.xml.
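The tag name did not survive in the text above. Assuming it refers to the workflow schema's `<file>` element, a Java action that pulls in two jars might be sketched like this. The main class, jar names and paths are made up for illustration.

```shell
# Write a sample workflow with per-action jars.
# The quoted 'EOF' keeps ${...} literal so Oozie resolves it, not the shell.
cat > workflow.xml <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.4" name="spark-wf">
  <start to="spark-driver"/>
  <action name="spark-driver">
    <java>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <main-class>com.example.MySparkDriver</main-class>
      <!-- one <file> element per jar; only this action sees them -->
      <file>/user/shared/spark-jars/spark-core.jar</file>
      <file>/user/shared/spark-jars/some-dependency.jar</file>
    </java>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Java action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF
```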

Add jars to the ShareLib (e.g. /user/oozie/share/lib/lib_/pig)

While this will work, it’s not recommended for two reasons. First, the additional jars will be included with every workflow using that ShareLib, which may be unexpected to those workflows and users. Second, when upgrading the ShareLib, you’ll have to recopy the additional jars to the new ShareLib.

Quoted from Robert Kanter's blog post: How-to: Use the ShareLib in Apache Oozie (CDH 5)

