Reading data through SQL works much like the relational databases we are used to, which makes it easy to pick up. Of course, there is no UPDATE or DELETE.
1. Start spark-shell
# spark-shell --master yarn
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://bigserver1:4040
Spark context available as 'sc' (master = yarn, app id = application_1547025808071_0015).   //sc
Spark session available as 'spark'.   //spark
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.0
      /_/

Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_191)
Type in expressions to have them evaluated.
Type :help for more information.
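When launching spark-shell on YARN you can also size the job at startup. A minimal sketch of a launch command; the executor counts and memory values below are illustrative assumptions, not taken from the original post:

```shell
# Launch spark-shell on YARN with explicit resources (values are examples only)
spark-shell --master yarn \
  --num-executors 4 \
  --executor-memory 2g \
  --executor-cores 2
```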
2. Method 1
scala> val sqlDF = spark.sql("SELECT * FROM tanktest.test");
sqlDF: org.apache.spark.sql.DataFrame = [id: int, name: string]

scala> sqlDF.show();
+---+---------+
| id|     name|
+---+---------+
|  1|     tank|
|  2|    zhang|
|  3|     ying|
|  5|tanktest1|
|  6|tanktest2|
|  4| tanktest|
|  7|  denggei|
+---+---------+
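Note that `spark.sql` returns an ordinary DataFrame, so you can keep transforming it before calling `show()`. A minimal local-mode sketch; the in-memory rows below stand in for the Hive table `tanktest.test`, which only exists on the author's cluster:

```scala
import org.apache.spark.sql.SparkSession

// Local-mode sketch of method 1: query a temp view instead of the
// Hive table tanktest.test (in-memory data is a stand-in).
object SqlDemo {
  def run(): Long = {
    // spark-shell already provides `spark`; a standalone app builds its own
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("sql-demo")
      .getOrCreate()
    import spark.implicits._

    // In-memory rows standing in for tanktest.test
    Seq((1, "tank"), (2, "zhang"), (3, "ying"))
      .toDF("id", "name")
      .createOrReplaceTempView("test")

    // Same pattern as in the shell: spark.sql returns a DataFrame
    val n = spark.sql("SELECT * FROM test WHERE id > 1").count()
    spark.stop()
    n
  }
}
```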
3. Method 2
scala> val test = spark.sqlContext.sql("SELECT * FROM tanktest.test");
test: org.apache.spark.sql.DataFrame = [id: int, name: string]

scala> test.show();
+---+---------+
| id|     name|
+---+---------+
|  1|     tank|
|  2|    zhang|
|  3|     ying|
|  5|tanktest1|
|  6|tanktest2|
|  4| tanktest|
|  7|  denggei|
+---+---------+
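`spark.sqlContext` is simply the SQLContext wrapped inside the existing SparkSession, so the two share the same catalog: a view registered through one is visible through the other. A local-mode sketch with in-memory stand-in data (names are assumptions for illustration):

```scala
import org.apache.spark.sql.SparkSession

// Shows that spark.sqlContext shares state with the SparkSession itself.
object SqlContextDemo {
  def run(): Long = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("sqlcontext-demo")
      .getOrCreate()
    import spark.implicits._

    // Register a temp view through the session...
    Seq((1, "tank"), (2, "zhang")).toDF("id", "name")
      .createOrReplaceTempView("test")

    // ...and read it back through the wrapped SQLContext (method 2)
    val n = spark.sqlContext.sql("SELECT * FROM test").count()
    spark.stop()
    n
  }
}
```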
4. Method 3
scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext

scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
warning: there was one deprecation warning; re-run with -deprecation for details
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@4ac73165

scala> val df = sqlContext.sql("SELECT * FROM tanktest.test")
df: org.apache.spark.sql.DataFrame = [id: int, name: string]

scala> df.show();
+---+---------+
| id|     name|
+---+---------+
|  1|     tank|
|  2|    zhang|
|  3|     ying|
|  5|tanktest1|
|  6|tanktest2|
|  4| tanktest|
|  7|  denggei|
+---+---------+
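As the deprecation warning above suggests, since Spark 2.x the recommended entry point is a SparkSession built with Hive support rather than `new SQLContext(sc)`. A hedged sketch of the modern equivalent; it queries a temp view instead of the author's Hive table, and assumes the spark-hive dependency is on the classpath:

```scala
import org.apache.spark.sql.SparkSession

// Modern replacement for the deprecated `new SQLContext(sc)`:
// build a SparkSession with Hive support enabled.
object HiveSessionDemo {
  def run(): Long = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("hive-demo")
      .enableHiveSupport()  // lets spark.sql see Hive tables such as tanktest.test
      .getOrCreate()
    import spark.implicits._

    // Temp view standing in for a real Hive table
    Seq((1, "tank"), (2, "zhang")).toDF("id", "name")
      .createOrReplaceTempView("test")

    val n = spark.sql("SELECT * FROM test").count()
    spark.stop()
    n
  }
}
```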
Please credit the source when reposting.
Author: 海底苍鹰
URL: http://blog.51yip.com/hadoop/2043.html