Which storage levels are cleared by PySpark's `clearCache()`?


Question

Judging by the docs, it seems like `spark.sql.Catalog.clearCache()` only clears DataFrames that are persisted in memory.

If I were to persist a table on disk (`df.persist(StorageLevel.DISK_ONLY)`), would `clearCache()` unpersist it too?
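A minimal sketch of the scenario in question, assuming a local SparkSession and a placeholder DataFrame `df`:

```python
from pyspark import StorageLevel
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder DataFrame persisted to disk only
df = spark.range(10)
df.persist(StorageLevel.DISK_ONLY)
df.count()  # materialize the persisted blocks

# Does this clear only in-memory caches, or DISK_ONLY data as well?
spark.catalog.clearCache()
```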

Answer 1

Score: 0

In Spark, caching is one of the options for data persistence. `clearCache()` will not unpersist the data in your example; use `unpersist()` instead. It marks the DataFrame as non-persistent and removes all of its blocks from memory and disk.
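A minimal sketch of the suggested fix, continuing the placeholder `df` from the question:

```python
# Explicitly unpersist instead of relying on clearCache().
# blocking=True waits until all blocks are removed from memory and disk.
df.unpersist(blocking=True)

print(df.is_cached)  # False: the DataFrame is no longer persisted
```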
