Getting shortestPaths in GraphFrames with Java

Question

I am new to Spark and GraphFrames.

When I wanted to learn about the shortestPaths method in GraphFrames, the GraphFrames documentation gave me sample code in Scala, but not in Java.

Their documentation provides the following Scala code:

import org.graphframes.{examples, GraphFrame}
val g: GraphFrame = examples.Graphs.friends  // get example graph

val results = g.shortestPaths.landmarks(Seq("a", "d")).run()
results.select("id", "distances").show()
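To make concrete what the Scala sample computes, here is a minimal plain-Java sketch with no Spark or GraphFrames dependency. It computes, for every vertex, the hop count to each landmark along directed edges. The class name `LandmarkDistances`, the method `distancesToLandmarks`, and the toy edge list are hypothetical and only for illustration; this is not how GraphFrames implements shortestPaths internally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LandmarkDistances {

    // For every vertex, returns a map landmark -> hop count along directed edges.
    // A landmark that cannot be reached from a vertex is absent from that vertex's map.
    // Assumption: every landmark and every edge endpoint is listed in `vertices`.
    static Map<String, Map<String, Integer>> distancesToLandmarks(
            List<String> vertices, List<String[]> edges, List<String> landmarks) {
        Map<String, List<String>> incoming = new TreeMap<>(); // dst -> list of src
        Map<String, Map<String, Integer>> result = new TreeMap<>();
        for (String v : vertices) {
            incoming.put(v, new ArrayList<>());
            result.put(v, new TreeMap<>());
        }
        for (String[] e : edges) {
            incoming.get(e[1]).add(e[0]); // e[0] = src, e[1] = dst
        }
        // Breadth-first search backwards from each landmark over reversed edges.
        for (String landmark : landmarks) {
            Deque<String> queue = new ArrayDeque<>();
            result.get(landmark).put(landmark, 0);
            queue.add(landmark);
            while (!queue.isEmpty()) {
                String current = queue.poll();
                int next = result.get(current).get(landmark) + 1;
                for (String src : incoming.get(current)) {
                    if (!result.get(src).containsKey(landmark)) {
                        result.get(src).put(landmark, next);
                        queue.add(src);
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical toy graph; "g" is an isolated vertex.
        List<String> vertices = List.of("a", "b", "c", "d", "e", "f", "g");
        List<String[]> edges = List.of(
                new String[] {"a", "b"}, new String[] {"b", "c"},
                new String[] {"c", "b"}, new String[] {"f", "c"},
                new String[] {"e", "f"}, new String[] {"e", "d"},
                new String[] {"d", "a"}, new String[] {"a", "e"});
        distancesToLandmarks(vertices, edges, List.of("a", "d"))
                .forEach((id, distances) -> System.out.println(id + " -> " + distances));
    }
}
```

The search runs backwards from each landmark over reversed edges, so one pass per landmark yields the distance from every vertex that can reach it.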

In Java, I tried:

import java.util.Arrays;
import org.graphframes.GraphFrame;
import scala.collection.Seq;
import scala.collection.JavaConverters;

GraphFrame g = new GraphFrame(..., ...);
// Convert a Java list of landmark IDs into a Scala Seq
Seq<Object> landmarkSeq = JavaConverters.collectionAsScalaIterableConverter(Arrays.asList((Object) "a", (Object) "d")).asScala().toSeq();
g.shortestPaths().landmarks(landmarkSeq).run().show();

or:

g.shortestPaths().landmarks(new ArrayList<Object>(List.of((Object) "a", (Object) "d"))).run().show();

The casts to java.lang.Object were necessary because the API requires Seq<Object> or ArrayList<Object>; passing an ArrayList<String> does not compile.

After running the code, I saw the message:

Exception in thread "main" org.apache.spark.sql.AnalysisException: You're using untyped Scala UDF, which does not have the input type information. Spark may blindly pass null to the Scala closure with primitive-type argument, and the closure will see the default value of the Java type for the null argument, e.g. `udf((x: Int) => x, IntegerType)`, the result is 0 for null input. To get rid of this error, you could:
1. use typed Scala UDF APIs(without return type parameter), e.g. `udf((x: Int) => x)`
2. use Java UDF APIs, e.g. `udf(new UDF1[String, Integer] { override def call(s: String): Integer = s.length() }, IntegerType)`, if input types are all non primitive
3. set spark.sql.legacy.allowUntypedScalaUDF to true and use this API with caution;

To follow option 3, I added this line:

System.setProperty("spark.sql.legacy.allowUntypedScalaUDF", "true");

but nothing changed.

Since there are few code samples or Stack Overflow questions about GraphFrames in Java, I could not find anything useful while searching.

Could anyone experienced in this area help me solve this problem?

Answer 1

Score: 0

This seems to be a bug in GraphFrames 0.8.0.

See issue #367 on github.com.
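Separately from the bug itself, note that spark.sql.legacy.allowUntypedScalaUDF is a Spark SQL configuration key, and such keys are normally set on the SparkSession rather than through a JVM system property added after Spark has started. A minimal sketch, assuming Spark 3.x on the classpath and a local master; the class name and app name are hypothetical, and whether this setting works around the GraphFrames error is not confirmed here:

```java
import org.apache.spark.sql.SparkSession;

public class UntypedUdfConfig {
    public static void main(String[] args) {
        // Set the flag while building the session.
        SparkSession spark = SparkSession.builder()
                .appName("shortest-paths-demo") // hypothetical app name
                .master("local[*]")             // assumption: local run
                .config("spark.sql.legacy.allowUntypedScalaUDF", "true")
                .getOrCreate();

        // Alternative: set it on an already running session.
        spark.conf().set("spark.sql.legacy.allowUntypedScalaUDF", "true");

        // Read the value back to confirm the session sees it.
        System.out.println(spark.conf().get("spark.sql.legacy.allowUntypedScalaUDF"));

        spark.stop();
    }
}
```

Either way of setting the flag should take effect before the shortestPaths call is made on a GraphFrame built from this session.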

huangapple
  • Posted on August 27, 2020, 13:13:39
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63609595.html