Maven dependency "Cannot resolve symbol 'VectorAssembler'" in IntelliJ IDEA
Question
IntelliJ IDEA cannot import Spark MLlib, even though I added the dependency in Maven. There are no problems with the other parts of Spark. In Project Structure -> Libraries, spark-mllib is present.
import org.apache.spark.ml.feature.VectorAssembler; -> Cannot resolve symbol 'VectorAssembler'
pom.xml:
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>3.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>3.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_2.12</artifactId>
        <version>3.0.0</version>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>3.3.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies>
[Project structure][1]
I tried refreshing Maven and clearing the local Maven repository folder. Nothing helped.
Answer 1
Score: 0
You specified the mllib dependency with runtime scope - this means the dependency is required for execution but not for compilation, so it is not put on the classpath used to compile your code. See this blog post for a description of the different scopes available in Maven.
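You can confirm how each Spark artifact is being resolved by printing the dependency tree (dependency:tree is a standard goal of the Maven dependency plugin; the output line below is illustrative for this pom):

mvn dependency:tree -Dincludes=org.apache.spark

[INFO] +- org.apache.spark:spark-mllib_2.12:jar:3.0.0:runtime

The trailing runtime shows why the compiler cannot see VectorAssembler.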
Replace all Spark dependencies (mllib, core, sql) with a single dependency (and also remove the Hadoop dependencies):
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_${spark.scala.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
</dependency>
where the variables are defined as:
<properties>
    <spark.version>3.0.1</spark.version>
    <spark.scala.version>2.12</spark.scala.version>
</properties>
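Note that provided scope also keeps the jars off the runtime classpath, so when running the application directly from IntelliJ IDEA you may need to enable the run-configuration option that includes dependencies with provided scope (or temporarily use compile scope while developing locally).
Once the dependency is on the compile classpath, a minimal sketch like the following should compile and run. This is only an illustration: the class name, column names, and the local[*] master are made up for the example.

import java.util.Arrays;

import org.apache.spark.ml.feature.VectorAssembler;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.StructType;

public class VectorAssemblerCheck {
    public static void main(String[] args) {
        // Local session just to verify that the import resolves and the class loads
        SparkSession spark = SparkSession.builder()
                .appName("VectorAssemblerCheck")
                .master("local[*]")
                .getOrCreate();

        // Hypothetical two-column numeric dataset
        StructType schema = new StructType()
                .add("x", "double")
                .add("y", "double");
        Dataset<Row> df = spark.createDataFrame(
                Arrays.asList(RowFactory.create(1.0, 2.0), RowFactory.create(3.0, 4.0)),
                schema);

        // Assemble the numeric columns into a single vector column
        VectorAssembler assembler = new VectorAssembler()
                .setInputCols(new String[]{"x", "y"})
                .setOutputCol("features");

        assembler.transform(df).show();
        spark.stop();
    }
}

If this compiles but still shows red in the editor, re-import the Maven project so IDEA rebuilds its module classpaths.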