For big data development I have always used the Java Spring Boot framework, loading Scala, Spark, and the Scala compatibility packages: Spark jobs are written in Scala, everything else in Java, and packaging is done with Maven. That setup works quite well.
If all you are building is Spark jobs, though, Scala alone is enough; there is no need to pull in Spring Boot, and the project feels much less awkward that way.
1. What is sbt
sbt (Simple Build Tool) is a build tool in the same family as Ant and Maven, and is the de facto standard build tool for Scala.
Main features:
Native support for compiling Scala code and integrating with many Scala test frameworks
Build definitions written in a Scala-based DSL (domain-specific language)
Dependency management via Ivy
Continuous compilation, testing, and deployment
Integration with the Scala REPL for fast iteration and debugging
Support for mixed Java/Scala projects
2. Create a Scala project and add the Spark dependencies
name := "scalatest"

version := "0.1"

scalaVersion := "2.11.8"

// dependencies
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "2.3.0",
  "org.apache.spark" % "spark-sql_2.11" % "2.3.0",
  "com.alibaba" % "fastjson" % "1.2.49"
)

// resolve conflicts between duplicate versions
assemblyMergeStrategy in assembly := {
  case PathList("javax", "inject", xs @ _*) => MergeStrategy.first
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case "application.conf" => MergeStrategy.concat
  case "git.properties" => MergeStrategy.first
  case PathList("org", "aopalliance", xs @ _*) => MergeStrategy.first
  case "unwanted.txt" => MergeStrategy.discard
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
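As an aside (not part of the build above): sbt's `%%` operator appends the Scala binary suffix to the artifact name automatically, which keeps the dependencies in sync with `scalaVersion` and helps avoid the mixed `_2.11`/`_2.12` artifacts that cause trouble later. A sketch of an equivalent dependency block:

```scala
// %% lets sbt append the Scala binary suffix (_2.11 here) automatically,
// so the artifacts always match scalaVersion:
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.0",
  "org.apache.spark" %% "spark-sql"  % "2.3.0",
  "com.alibaba"      %  "fastjson"   % "1.2.49"  // plain Java library: single %
)
```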
MergeStrategy.deduplicate — the default behavior described above
MergeStrategy.first — pick the first matching file in classpath order
MergeStrategy.last — pick the last one
MergeStrategy.singleOrError — fail with an error message on conflict
MergeStrategy.concat — simply concatenate all matching files and include the result
MergeStrategy.filterDistinctLines — also concatenates, but discards duplicate lines in the process
MergeStrategy.rename — rename the files originating from jar files
MergeStrategy.discard — simply drop the matching files
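To make the `case PathList(...)` patterns less magical: `PathList` is an extractor that splits a path into its `/`-separated segments so you can pattern-match on them. A minimal, self-contained sketch (a toy re-implementation for illustration, not sbt-assembly's actual code):

```scala
// Toy model of sbt-assembly's PathList extractor: split a path into segments
// so patterns like PathList("javax", "inject", _*) can match on them.
object PathList {
  def unapplySeq(path: String): Option[Seq[String]] =
    Some(path.split("/").toSeq)
}

// Returns the name of the strategy a path would fall into,
// mirroring the order of the cases in the build.sbt above.
def strategy(path: String): String = path match {
  case PathList("javax", "inject", _*)            => "first"
  case m if m.toLowerCase.endsWith("manifest.mf") => "discard"
  case PathList("META-INF", _*)                   => "discard"
  case _                                          => "deduplicate"
}
```

For example, `strategy("javax/inject/Inject.class")` selects `first`, while an unmatched path falls through to the default `deduplicate`.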
3. Install sbt-assembly
MacBook-Pro:scalatest zhangying$ cat project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")
Create a plugins.sbt file under the project/ directory in the project root and add the line above; IDEA will then reload the dependencies.
Note that with newer sbt versions, build.sbt must not contain import AssemblyKeys._; adding it fails with an error like:
import AssemblyKeys._
^
[error] Type error in expression
The sbt version I am using:
MacBook-Pro:scalatest zhangying$ cat project/build.properties
sbt.version = 1.2.8
Why bring in sbt-assembly at all? Because sbt package does not bundle the dependencies into the jar — so the resulting jar cannot be run on its own, which is a bit of a trap.
4. Package the Scala project with sbt
MacBook-Pro:scalatest zhangying$ sbt clean compile assembly
You can also run the same tasks from the interactive sbt shell.
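Once assembly succeeds, the fat jar can be handed straight to Spark. A hedged sketch — the jar path, main class, and master are illustrative and depend on your project:

```shell
# Submit the assembled fat jar to Spark
# (class name, master, and jar path are illustrative -- adjust to your project)
spark-submit \
  --class com.example.Main \
  --master local[2] \
  target/scala-2.11/scalatest-assembly-0.1.jar
```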
During packaging I ran into the following duplicate-file conflicts between versions:
[error] deduplicate: different file contents found in the following:
[error] /Users/zhangying/.ivy2/cache/org.apache.spark/spark-core_2.11/jars/spark-core_2.11-2.3.0.jar:org/apache/spark/unused/UnusedStubClass.class
[error] /Users/zhangying/.ivy2/cache/org.apache.arrow/arrow-vector/jars/arrow-vector-0.8.0.jar:git.properties
[error] /Users/zhangying/.ivy2/cache/aopalliance/aopalliance/jars/aopalliance-1.0.jar:org/aopalliance/intercept/Invocation.class
[error] /Users/zhangying/.ivy2/cache/javax.inject/javax.inject/jars/javax.inject-1.jar:javax/inject/Inject.class
[error] /Users/zhangying/.ivy2/cache/org.glassfish.hk2.external/aopalliance-repackaged/jars/aopalliance-repackaged-2.4.0-b34.jar:org/aopalliance/aop/AspectException.class
MacBook-Pro:org.apache.spark zhangying$ ll
total 0
drwxr-xr-x  26 zhangying staff  884  7 31 16:19 ./
drwxr-xr-x 209 zhangying staff 7106  8  1 21:37 ../
drwxr-xr-x  10 zhangying staff  340  7 31 16:22 spark-catalyst_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:34 spark-catalyst_2.12/
drwxr-xr-x  10 zhangying staff  340  7 31 16:19 spark-core_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:34 spark-core_2.12/
drwxr-xr-x   7 zhangying staff  238  7 31 15:30 spark-graphx_2.11/
drwxr-xr-x  10 zhangying staff  340  7 31 16:19 spark-kvstore_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:35 spark-kvstore_2.12/
drwxr-xr-x  10 zhangying staff  340  7 31 16:19 spark-launcher_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:35 spark-launcher_2.12/
drwxr-xr-x   7 zhangying staff  238  7 31 15:30 spark-mllib-local_2.11/
drwxr-xr-x  10 zhangying staff  340  7 31 16:19 spark-network-common_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:34 spark-network-common_2.12/
drwxr-xr-x  10 zhangying staff  340  7 31 16:19 spark-network-shuffle_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:35 spark-network-shuffle_2.12/
drwxr-xr-x   8 zhangying staff  272  7 31 16:19 spark-parent_2.11/
drwxr-xr-x   5 zhangying staff  170  7 31 15:32 spark-parent_2.12/
drwxr-xr-x  10 zhangying staff  340  7 31 16:22 spark-sketch_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:35 spark-sketch_2.12/
drwxr-xr-x  10 zhangying staff  340  7 31 16:21 spark-sql_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:34 spark-sql_2.12/
drwxr-xr-x  10 zhangying staff  340  7 31 16:19 spark-tags_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:34 spark-tags_2.12/
drwxr-xr-x  10 zhangying staff  340  7 31 16:19 spark-unsafe_2.11/
drwxr-xr-x   7 zhangying staff  238  7 31 15:34 spark-unsafe_2.12/
Two ways to fix this: 1. delete the redundant versions from the Ivy cache (as the listing shows, both _2.11 and _2.12 artifacts are cached); 2. resolve the conflicts with merge strategies in build.sbt.
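For option 1, the redundant artifacts can be removed from the Ivy cache directly. A sketch — the cache path and the glob are illustrative, so check with `ls` first:

```shell
# Remove the redundant Scala 2.12 artifacts from the local Ivy cache
# (path and glob are illustrative -- list them with ls before deleting)
rm -rf ~/.ivy2/cache/org.apache.spark/*_2.12
```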
// resolve conflicts between duplicate versions
assemblyMergeStrategy in assembly := {
  case PathList("javax", "inject", xs @ _*) => MergeStrategy.first
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case "application.conf" => MergeStrategy.concat
  case "git.properties" => MergeStrategy.first
  case PathList("org", "aopalliance", xs @ _*) => MergeStrategy.first
  case "unwanted.txt" => MergeStrategy.discard
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
Note: the merge strategy matches the path inside the jar — the part after the colon in each error line (e.g. javax/inject/Inject.class) — not the jar's filesystem path at the beginning of the line.
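A quick way to check what the strategy will actually be matched against is to split one of the error lines on the colon — a small sketch:

```scala
// One of the deduplicate error lines from the output above.
val errorLine =
  "/Users/zhangying/.ivy2/cache/javax.inject/javax.inject/jars/javax.inject-1.jar:javax/inject/Inject.class"

// The merge strategy is matched against the path *inside* the jar
// (after the ':'), not against the jar's filesystem path.
val inJarPath = errorLine.split(":").last
println(inJarPath)  // javax/inject/Inject.class
```

So it is `javax/inject/Inject.class` that `PathList("javax", "inject", xs @ _*)` matches, which is why the filesystem prefix never appears in the patterns.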
Please credit the source when reposting.
Author: 海底苍鹰
URL: http://blog.51yip.com/hadoop/2165.html