How does import spark.sqlContext.implicits._ work in Scala?
Question
I'm new to Scala.
Here's what I'm trying to understand.
This code snippet gives me an RDD[Int] and doesn't offer the option to call toDF:
var input = spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9))
But when I add import spark.sqlContext.implicits._, it gives me the option to call toDF:
import spark.sqlContext.implicits._
var input = spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9)).toDF
So I looked into the source code: implicits is present in the SQLContext class as an object. I can't understand how an RDD instance is able to call toDF after the import. Can anyone help me understand?
Update
I found the code snippet below in the SQLContext class:
object implicits extends SQLImplicits with Serializable {
protected override def _sqlContext: SQLContext = self
}
Answer 1
Score: 3
toDF is an extension method. With the import you bring the necessary implicits into scope.
For example, Int doesn't have a foo method:
1.foo() // doesn't compile
But if you define an extension method and import the implicit:
object implicits {
implicit class IntOps(i: Int) {
def foo() = println("foo")
}
}
import implicits._
1.foo() // compiles
The compiler transforms 1.foo() into new IntOps(1).foo().
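Under the hood, an implicit class is just sugar for a plain class plus an implicit conversion method. A minimal sketch of roughly what the compiler generates (the names here are illustrative):

import scala.language.implicitConversions

object implicitsDesugared {
  class IntOps(i: Int) {
    def foo(): Unit = println("foo")
  }
  // the compiler synthesizes a conversion like this for `implicit class IntOps`
  implicit def IntOps(i: Int): IntOps = new IntOps(i)
}

import implicitsDesugared._
1.foo() // compiles: rewritten to IntOps(1).foo()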
Similarly,
object implicits extends SQLImplicits ...
abstract class SQLImplicits ... {
...
implicit def rddToDatasetHolder[T : Encoder](rdd: RDD[T]): DatasetHolder[T] = {
DatasetHolder(_sqlContext.createDataset(rdd))
}
implicit def localSeqToDatasetHolder[T : Encoder](s: Seq[T]): DatasetHolder[T] = {
DatasetHolder(_sqlContext.createDataset(s))
}
}
case class DatasetHolder[T] private[sql](private val ds: Dataset[T]) {
def toDS(): Dataset[T] = ds
def toDF(): DataFrame = ds.toDF()
def toDF(colNames: String*): DataFrame = ds.toDF(colNames : _*)
}
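Note the self reference: implicits is an object nested inside SQLContext, and _sqlContext is wired back to the enclosing instance through the class's self alias. A simplified sketch of the same pattern, with hypothetical names:

class MyContext { self =>
  def describe(n: Int): String = s"$n seen by $self"

  // inner object whose extension methods need the enclosing instance
  object implicits {
    implicit class IntOps(i: Int) {
      def describeMe(): String = self.describe(i)
    }
  }
}

val ctx = new MyContext
import ctx.implicits._
println(3.describeMe()) // resolves via the implicit class imported from ctx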
With import spark.sqlContext.implicits._ in scope, the compiler transforms spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9)).toDF into rddToDatasetHolder(spark.sparkContext.parallelize...).toDF, i.e. DatasetHolder(_sqlContext.createDataset(spark.sparkContext.parallelize...)).toDF.
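You can make the rewrite visible by spelling the conversion out by hand. A sketch, assuming a local SparkSession named spark; the explicit second argument is the Int encoder (newIntEncoder, also defined in SQLImplicits) that the import would otherwise supply implicitly:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("toDF-demo").getOrCreate()
val rdd = spark.sparkContext.parallelize(List(1, 2, 3, 4, 5))

// explicit form of what `rdd.toDF` becomes after the import
val df = spark.sqlContext.implicits
  .rddToDatasetHolder(rdd)(spark.sqlContext.implicits.newIntEncoder)
  .toDF()

df.show()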
You can read more about implicits and extension methods in Scala:
https://stackoverflow.com/questions/10375633/understanding-implicit-in-scala
https://stackoverflow.com/questions/5598085/where-does-scala-look-for-implicits
https://stackoverflow.com/questions/65844327/understand-scala-implicit-classes
https://docs.scala-lang.org/overviews/core/implicit-classes.html
https://docs.scala-lang.org/scala3/book/ca-extension-methods.html
https://docs.scala-lang.org/scala3/reference/contextual/extension-methods.html
https://stackoverflow.com/questions/76033008/how-extend-a-class-is-diff-from-implicit-class
About spark.implicits._
https://stackoverflow.com/questions/39151189/importing-spark-implicits-in-scala
https://stackoverflow.com/questions/50878224/what-is-imported-with-spark-implicits
https://stackoverflow.com/questions/45724290/workaround-for-importing-spark-implicits-everywhere