python - PySpark Streaming example does not seem to terminate

August 15, 2015

I am trying to understand the Python API of Spark Streaming with a simple example.

    from pyspark.streaming import StreamingContext

    # `sc` is the SparkContext provided by the pyspark shell
    dvc = [[-0.1, -0.1], [0.1, 0.1], [1.1, 1.1], [0.9, 0.9]]
    # Turn each inner list into its own single-partition RDD
    dvc = [sc.parallelize(i, 1) for i in dvc]
    ssc = StreamingContext(sc, 2.0)  # 2-second batch interval
    input_stream = ssc.queueStream(dvc)

    def get_output(rdd):
        print(rdd.collect())

    input_stream.foreachRDD(get_output)
    ssc.start()

This prints the required output, but then prints a lot of empty lists at the end and does not terminate. Can someone tell me what might be going wrong?

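Streaming is, in most cases (unless terminated by conditions in your code), supposed to be infinite. The purpose of a streaming application is to consume data coming in at regular intervals. Hence, after processing the first 4 RDDs (i.e. [[-0.1, -0.1], [0.1, 0.1], [1.1, 1.1], [0.9, 0.9]]) you have nothing left in the queue, whereas Spark Streaming builds on the notion that something new might still come into the queueStream. The empty lists are the empty batches it keeps producing every 2 seconds while it waits.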
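If you do want the example to end, one way to add such a terminating condition is to count the non-empty batches and stop the context from the main thread once all of them have been seen. The following is only a rough sketch: the helper name run_until_drained, its parameters, and the demo() function are made up for illustration, and it assumes you know in advance how many batches were queued. It relies on foreachRDD, isEmpty, awaitTerminationOrTimeout and stop from the PySpark API.

```python
import threading
import time


def run_until_drained(ssc, stream, expected_batches, handler,
                      poll_seconds=1.0, max_seconds=60.0):
    """Hypothetical helper: start `ssc`, pass every non-empty batch to
    `handler`, and stop once `expected_batches` non-empty batches were seen
    (or `max_seconds` elapsed). Returns the number of batches handled."""
    done = threading.Event()
    seen = [0]  # mutable counter shared with the callback

    def on_batch(rdd):
        if rdd.isEmpty():
            return  # skip the empty batches produced once the queue is drained
        handler(rdd.collect())
        seen[0] += 1
        if seen[0] >= expected_batches:
            done.set()  # only signal here; the main thread does the stopping

    stream.foreachRDD(on_batch)
    ssc.start()
    deadline = time.time() + max_seconds
    while not done.is_set() and time.time() < deadline:
        # Returns True if the context has already terminated for another reason
        if ssc.awaitTerminationOrTimeout(poll_seconds):
            break
    ssc.stop(stopSparkContext=False, stopGraceFully=False)
    return seen[0]


def demo():
    """Assumes pyspark is installed; uses a local master chosen for illustration."""
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "queue-stream-demo")
    data = [[-0.1, -0.1], [0.1, 0.1], [1.1, 1.1], [0.9, 0.9]]
    rdds = [sc.parallelize(batch, 1) for batch in data]
    ssc = StreamingContext(sc, 2.0)
    run_until_drained(ssc, ssc.queueStream(rdds), len(rdds), print)
    sc.stop()
```

Calling demo() in an environment with PySpark available should process the four queued RDDs and then return instead of running forever. The stop is issued from the main thread rather than from inside the foreachRDD callback, since stopping the context from within a batch job can hang.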

If you are doing a one-time ETL, you might consider dropping streaming altogether.
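For a fixed set of inputs like the one in the question, the non-streaming version could look roughly like this. It is a sketch only: collect_batches is a hypothetical helper name, and `sc` is assumed to be an existing SparkContext such as the one the pyspark shell provides.

```python
def collect_batches(sc, batches, num_slices=1):
    """Hypothetical helper: run each batch as an ordinary (non-streaming)
    RDD job and return the collected results, one list per batch."""
    return [sc.parallelize(batch, num_slices).collect() for batch in batches]
```

With the data from the question, collect_batches(sc, [[-0.1, -0.1], [0.1, 0.1], [1.1, 1.1], [0.9, 0.9]]) runs four small batch jobs and then returns, so there is nothing left running that needs to be stopped.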
