Reading data with SQL works much like the relational databases we are used to, so it is easy to pick up; of course, there is no UPDATE or DELETE.
1. Start spark-shell
# spark-shell --master yarn
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://bigserver1:4040
Spark context available as 'sc' (master = yarn, app id = application_1547025808071_0015).   //sc
Spark session available as 'spark'.   //spark
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.0
      /_/

Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_191)
Type in expressions to have them evaluated.
Type :help for more information.
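As the banner notes, spark-shell pre-creates two entry points: the SparkContext as sc and the SparkSession as spark. Before running queries you can sanity-check that the session can see the Hive database; a minimal sketch using the standard Catalog API (the tanktest database is the one queried below):

// Confirm the session is up and can see the Hive database
spark.version                                // e.g. 2.4.0
spark.catalog.listTables("tanktest").show()  // lists tables in the tanktest database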
2. Method 1
scala> val sqlDF = spark.sql("SELECT * FROM tanktest.test");
sqlDF: org.apache.spark.sql.DataFrame = [id: int, name: string]
scala> sqlDF.show();
+---+---------+
| id| name|
+---+---------+
| 1| tank|
| 2| zhang|
| 3| ying|
| 5|tanktest1|
| 6|tanktest2|
| 4| tanktest|
| 7| denggei|
+---+---------+
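The SQL string is not the only way in: the same Hive table can also be read through the DataFrame API. A minimal sketch of the equivalent calls, using the standard spark.table entry point (tableDF is a name made up here):

// Equivalent DataFrame read of the same Hive table
val tableDF = spark.table("tanktest.test")
tableDF.show()

// Column expressions take the place of a WHERE clause
import spark.implicits._
tableDF.filter($"id" > 3).show()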
3. Method 2
scala> val test = spark.sqlContext.sql("SELECT * FROM tanktest.test");
test: org.apache.spark.sql.DataFrame = [id: int, name: string]
scala> test.show();
+---+---------+
| id| name|
+---+---------+
| 1| tank|
| 2| zhang|
| 3| ying|
| 5|tanktest1|
| 6|tanktest2|
| 4| tanktest|
| 7| denggei|
+---+---------+
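spark.sqlContext is simply the SQLContext wrapped inside the same SparkSession, so the two share one catalog and one set of temp views. A small sketch showing that state registered through one is visible to the other (the view name test_view is made up for illustration):

// A temp view created via the session... (sqlDF is from method 1)
sqlDF.createOrReplaceTempView("test_view")
// ...is visible through its sqlContext, since they share state
spark.sqlContext.sql("SELECT count(*) FROM test_view").show()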
4. Method 3
scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext
scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
warning: there was one deprecation warning; re-run with -deprecation for details
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@4ac73165
scala> val df = sqlContext.sql("SELECT * FROM tanktest.test")
df: org.apache.spark.sql.DataFrame = [id: int, name: string]
scala> df.show();
+---+---------+
| id| name|
+---+---------+
| 1| tank|
| 2| zhang|
| 3| ying|
| 5|tanktest1|
| 6|tanktest2|
| 4| tanktest|
| 7| denggei|
+---+---------+
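The deprecation warning above appears because, since Spark 2.0, constructing a SQLContext directly is discouraged; SparkSession is the recommended entry point. Outside the shell, a standalone application would build its own session; a minimal sketch, assuming the Hive dependencies are on the classpath (the app name is hypothetical):

import org.apache.spark.sql.SparkSession

// Build a session with Hive support so tanktest.test is resolvable
val spark = SparkSession.builder()
  .appName("ReadHiveTable")   // hypothetical app name
  .enableHiveSupport()
  .getOrCreate()

spark.sql("SELECT * FROM tanktest.test").show()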
Please credit the source when reposting:
Author: 海底苍鹰
URL: http://blog.51yip.com/hadoop/2043.html