$ spark-shell
20/02/29 14:22:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1582957351388).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.0-preview2
      /_/

Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 1.8.0_212)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

scala> spark.version
res1: String = 3.0.0-preview2

scala> val strings = spark.read.text("README.md")
strings: org.apache.spark.sql.DataFrame = [value: string]

scala> strings.show(10, false)
+--------------------------------------------------------------------------------+
|value                                                                           |
+--------------------------------------------------------------------------------+
|# Apache Spark                                                                  |
|                                                                                |
|Spark is a unified analytics engine for large-scale data processing. It provides|
|high-level APIs in Scala, Java, Python, and R, and an optimized engine that     |
|supports general computation graphs for data analysis. It also supports a       |
|rich set of higher-level tools including Spark SQL for SQL and DataFrames,      |
|MLlib for machine learning, GraphX for graph processing,                        |
|and Structured Streaming for stream processing.                                 |
|                                                                                |
|<https://spark.apache.org/>                                                     |
+--------------------------------------------------------------------------------+
only showing top 10 rows

scala> strings.count()
res3: Long = 109

scala>