Running Spark from an IDE or Maven

Source: http://*.com/questions/26892389/org-apache-spark-sparkexception-job-aborted-due-to-stage-failure-task-from-app

  1. Create a fat jar (one that includes all dependencies) using the Maven Shade Plugin. Example pom:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.2</version>
  <configuration>
    <filters>
      <filter>
        <artifact>*:*</artifact>
        <excludes>
          <exclude>META-INF/*.SF</exclude>
          <exclude>META-INF/*.DSA</exclude>
          <exclude>META-INF/*.RSA</exclude>
        </excludes>
      </filter>
    </filters>
  </configuration>
  <executions>
    <execution>
      <id>job-driver-jar</id>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>driver</shadedClassifierName>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
          <!--
          Some care is required:
          http://doc.akka.io/docs/akka/snapshot/general/configuration.html
          -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
            <resource>reference.conf</resource>
          </transformer>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>mainClass</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
    <execution>
      <id>worker-library-jar</id>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>worker</shadedClassifierName>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
  2. Now we have to send the compiled jar file to the cluster. To do this, specify the jar file in the Spark config like this:

SparkConf conf = new SparkConf()
    .setAppName("appName")
    .setMaster("spark://machineName:7077")
    .setJars(new String[] {"target/appName-1.0-SNAPSHOT-driver.jar"});

  3. Run mvn clean package to create the jar file. It will be created in your target folder.

  4. Run from your IDE or with the Maven command:

mvn exec:java -Dexec.mainClass="className"

This does not require spark-submit. Just remember to run mvn clean package before running, so that the jar sent to the cluster is up to date.
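
Because running from the IDE or through mvn exec:java does not package anything by itself, a small guard can catch a missing jar before the job is submitted. The following is only a sketch: the class ShadedJarCheck and its two methods are hypothetical helpers, not part of Spark or Maven, and it assumes Maven's default final name (artifactId-version) with the classifier from the shade configuration appended.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ShadedJarCheck {

    /**
     * Builds <targetDir>/<artifactId>-<version>-<classifier>.jar.
     * Assumption: the project does not override Maven's default finalName.
     */
    public static Path shadedJarPath(String targetDir, String artifactId,
                                     String version, String classifier) {
        return Paths.get(targetDir, artifactId + "-" + version + "-" + classifier + ".jar");
    }

    /** Returns the jar path as a string, or fails with a hint to package first. */
    public static String requireJar(Path jar) {
        if (!Files.isRegularFile(jar)) {
            throw new IllegalStateException(
                "Jar not found: " + jar + " (run 'mvn clean package' first)");
        }
        return jar.toString();
    }

    public static void main(String[] args) {
        // "appName" and "1.0-SNAPSHOT" are the placeholder values used in this answer.
        Path jar = shadedJarPath("target", "appName", "1.0-SNAPSHOT", "driver");
        try {
            // The returned string is what would go into setJars(new String[] { ... }).
            System.out.println("Using jar: " + requireJar(jar));
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Calling requireJar just before building the SparkConf turns a forgotten package step into an immediate, readable error on the driver side.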

If you don't want to hardcode the jar path, you can do this:

  1. In the config, write:

SparkConf conf = new SparkConf()
    .setAppName("appName")
    .setMaster("spark://machineName:7077")
    .setJars(JavaSparkContext.jarOfClass(this.getClass()));

  2. Create the fat jar (as above) and, after running the package command, run it with java:

java -jar target/application-1.0-SNAPSHOT-driver.jar

This picks up the jar from which the class was loaded.
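
To make the idea of "the jar the class was loaded from" concrete, here is a minimal standard-library sketch. It is an illustration of the concept only: the class JarLocator and its methods are hypothetical, and Spark's own jarOfClass may be implemented differently.

```java
import java.net.URL;
import java.util.Optional;

public class JarLocator {

    /**
     * Extracts the jar file path from a resource URL of the form
     * jar:file:/path/to/app.jar!/com/example/Main.class.
     * Returns empty for anything else (for example a class in target/classes).
     */
    public static Optional<String> jarPathFromResourceUrl(String url) {
        String prefix = "jar:file:";
        if (url == null || !url.startsWith(prefix)) {
            return Optional.empty();
        }
        int bang = url.indexOf('!');
        if (bang < 0) {
            return Optional.empty();
        }
        return Optional.of(url.substring(prefix.length(), bang));
    }

    /** Looks up the .class resource of cls and derives the enclosing jar, if any. */
    public static Optional<String> jarOf(Class<?> cls) {
        URL resource = cls.getResource("/" + cls.getName().replace('.', '/') + ".class");
        return jarPathFromResourceUrl(resource == null ? null : resource.toString());
    }

    public static void main(String[] args) {
        System.out.println(jarOf(JarLocator.class).orElse("(not loaded from a jar)"));
    }
}
```

In this sketch, a class loaded from a plain directory (as when an IDE runs compiled classes) yields no jar path, while the same class started with java -jar resolves to the jar file. That is consistent with this variant being run from the packaged jar rather than directly from the IDE.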
