2016-08-10

Spark Streaming Flume: missing packages?

I am trying to run the Flume streaming word-count example, but I cannot get my jar files to work. The example at https://github.com/spark-packages/dstream-flume/blob/master/examples/src/main/python/streaming/flume_wordcount.py points to this command:

bin/spark-submit --jars \ 
     external/flume-assembly/target/scala-*/spark-streaming-flume-assembly-*.jar 

I have no idea what this "external" directory is.
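From what I can tell (this is an assumption on my part), external/ is a module directory in the Spark source tree; the prebuilt binary distribution I am using does not contain it. If I understand the build right, that assembly jar would come from building the module from source, something like:

```shell
# Assumption: this runs from a checkout of the Spark source tree
# (branch-1.6), not from the prebuilt binary distribution.
git clone -b branch-1.6 https://github.com/apache/spark.git
cd spark
# Build just the flume-assembly module and the modules it depends on
build/mvn -DskipTests -pl external/flume-assembly -am package
# The jar should then appear under
# external/flume-assembly/target/scala-2.10/
```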

Into the lib directory of my Spark (1.6.0) install I put several jars (I tried both 1.6.0 and 1.6.2):

$ pwd 
/Users/romain/Informatique/zoo/spark-1.6.0-bin-hadoop2.4/lib 
$ ls *flume* 
spark-streaming-flume-assembly_2.10-1.6.0.jar 
spark-streaming-flume-assembly_2.10-1.6.2.jar 

spark-streaming-flume-sink_2.10-1.6.2.jar 
spark-streaming-flume-sink_2.10-1.6.0.jar 

spark-streaming-flume_2.10-1.6.0.jar 
spark-streaming-flume_2.10-1.6.2.jar 

Then I run:

$ ./bin/pyspark --master ip:7077 --total-executor-cores 1 --packages com.databricks:spark-csv_2.10:1.4.0 \
  --jars /Users/romain/Informatique/zoo/spark-1.6.0-bin-hadoop2.4/lib/spark-streaming-flume-sink_2.10-1.6.0.jar \
  --jars /Users/romain/Informatique/zoo/spark-1.6.0-bin-hadoop2.4/lib/spark-streaming-flume_2.10-1.6.0.jar \
  --jars /Users/romain/Informatique/zoo/spark-1.6.0-bin-hadoop2.4/lib/spark-streaming-flume-assembly_2.10-1.6.0.jar
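One thing I am now unsure about: I believe spark-submit/pyspark treats a repeated --jars flag as an override, so only the last occurrence actually takes effect; the flag expects a single comma-separated list. A small sketch of building that single value (the paths are my own, the comma-join is the point):

```python
# Sketch: --jars expects ONE comma-separated list; repeating the flag
# makes each occurrence replace the previous one, so only the last
# jar would actually be shipped to the executors.
lib = "/Users/romain/Informatique/zoo/spark-1.6.0-bin-hadoop2.4/lib"
jars = [
    lib + "/spark-streaming-flume-sink_2.10-1.6.0.jar",
    lib + "/spark-streaming-flume_2.10-1.6.0.jar",
    lib + "/spark-streaming-flume-assembly_2.10-1.6.0.jar",
]
jars_arg = ",".join(jars)  # single value for a single --jars flag
print("--jars " + jars_arg)
```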

The Python notebook server starts up fine, but then when I try to create the Flume stream:

from pyspark import SparkContext
from pyspark import SparkConf
from pyspark.streaming import StreamingContext
from pyspark.streaming.flume import FlumeUtils

# stop any contexts left over from a previous run
try:
    sc.stop()
except:
    pass
try:
    ssc.stop()
except:
    pass

conf = SparkConf()
conf.setAppName("Streaming Flume")
conf.set("spark.executor.memory", "1g")
conf.set("spark.driver.memory", "1g")
conf.set("spark.cores.max", "5")
conf.set("spark.driver.extraClassPath", "/Users/romain/Informatique/zoo/spark-1.6.0-bin-hadoop2.4/lib/")
conf.set("spark.executor.extraClassPath", "/Users/romain/Informatique/zoo/spark-1.6.0-bin-hadoop2.4/lib/")

sc = SparkContext(conf=conf)
ssc = StreamingContext(sc, 10)
stream = FlumeUtils.createStream(ssc, "localhost", 4949)  # port as an int

it fails with:

________________________________________________________________________________________________ 

    Spark Streaming's Flume libraries not found in class path. Try one of the following. 

    1. Include the Flume library and its dependencies with in the 
    spark-submit command as 

    $ bin/spark-submit --packages org.apache.spark:spark-streaming-flume:1.6.0 ... 

    2. Download the JAR of the artifact from Maven Central http://search.maven.org/, 
    Group Id = org.apache.spark, Artifact Id = spark-streaming-flume-assembly, Version = 1.6.0. 
    Then, include the jar in the spark-submit command as 

    $ bin/spark-submit --jars <spark-streaming-flume-assembly.jar> ... 

________________________________________________________________________________________________ 
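If I read option 2 of that message correctly, the assembly jar alone is supposed to bundle both the flume and flume-sink classes, so a single --jars pointing at just that jar should be enough. I assume the invocation would look something like this (with my local path):

```shell
./bin/pyspark --master ip:7077 --total-executor-cores 1 \
  --packages com.databricks:spark-csv_2.10:1.4.0 \
  --jars /Users/romain/Informatique/zoo/spark-1.6.0-bin-hadoop2.4/lib/spark-streaming-flume-assembly_2.10-1.6.0.jar
```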

I tried adding

--packages org.apache.spark:spark-streaming-flume-sink.1.6.0 

at the end of my spark-submit command, but then I hit another problem:

org.apache.spark#spark-streaming-flume-sink added as a dependency 
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0 
    confs: [default] 
:: resolution report :: resolve 2344ms :: artifacts dl 0ms 
    :: modules in use: 
    --------------------------------------------------------------------- 
    |     |   modules   || artifacts | 
    |  conf  | number| search|dwnlded|evicted|| number|dwnlded| 
    --------------------------------------------------------------------- 
    |  default  | 1 | 0 | 0 | 0 || 0 | 0 | 
    --------------------------------------------------------------------- 

:: problems summary :: 
:::: WARNINGS 
     module not found: org.apache.spark#spark-streaming-flume-sink;1.6.0 

    ==== local-m2-cache: tried 

     file:/Users/romain/.m2/repository/org/apache/spark/spark-streaming-flume-sink/1.6.0/spark-streaming-flume-sink-1.6.0.pom 

     -- artifact org.apache.spark#spark-streaming-flume-sink;1.6.0!spark-streaming-flume-sink.jar: 

     file:/Users/romain/.m2/repository/org/apache/spark/spark-streaming-flume-sink/1.6.0/spark-streaming-flume-sink-1.6.0.jar 

    ==== local-ivy-cache: tried 

     /Users/romain/.ivy2/local/org.apache.spark/spark-streaming-flume-sink/1.6.0/ivys/ivy.xml 

    ==== central: tried 

     https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-flume-sink/1.6.0/spark-streaming-flume-sink-1.6.0.pom 

     -- artifact org.apache.spark#spark-streaming-flume-sink;1.6.0!spark-streaming-flume-sink.jar: 

     https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-flume-sink/1.6.0/spark-streaming-flume-sink-1.6.0.jar 

    ==== spark-packages: tried 

     http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-flume-sink/1.6.0/spark-streaming-flume-sink-1.6.0.pom 

     -- artifact org.apache.spark#spark-streaming-flume-sink;1.6.0!spark-streaming-flume-sink.jar: 

     http://dl.bintray.com/spark-packages/maven/org/apache/spark/spark-streaming-flume-sink/1.6.0/spark-streaming-flume-sink-1.6.0.jar 

     :::::::::::::::::::::::::::::::::::::::::::::: 

     ::   UNRESOLVED DEPENDENCIES   :: 

     :::::::::::::::::::::::::::::::::::::::::::::: 

     :: org.apache.spark#spark-streaming-flume-sink;1.6.0: not found 

     :::::::::::::::::::::::::::::::::::::::::::::: 



:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS 
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: org.apache.spark#spark-streaming-flume-sink;1.6.0: not found] 
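Looking at the log, Ivy searched for spark-streaming-flume-sink without a Scala suffix; on Maven Central the artifact seems to be published as spark-streaming-flume-sink_2.10. I also notice that the coordinate as I wrote it above has a dot where the groupId:artifactId:version format needs a colon. A small sanity check of the coordinate shape (this helper is my own, not part of Spark):

```python
# Hypothetical helper: check that a --packages coordinate has the
# groupId:artifactId:version shape that spark-submit expects.
def looks_like_maven_coordinate(coord):
    parts = coord.split(":")
    return len(parts) == 3 and all(parts)

# The version glued to the artifact with a '.' gives only two parts:
print(looks_like_maven_coordinate("org.apache.spark:spark-streaming-flume-sink.1.6.0"))  # False
# ... versus three colon-separated parts, with the Scala suffix:
print(looks_like_maven_coordinate("org.apache.spark:spark-streaming-flume-sink_2.10:1.6.0"))  # True
```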

I have never used a pom.xml - maybe I should?
