Creating generic classes in Java containing an Array
Question
I'm trying to solve a classic problem (find the top K elements in a random collection). I'm trying to do this with Generics, but I keep on getting strange "errors". I'm not sure if what I'm doing is the right way of doing these things, but I don't see another way to be honest.
Interface declaration
public interface TopK<T> {
    T[] getMostOccurrences(T[] items, int k);
}
Implementation
public class TopKHeap<T> implements TopK<T> {
    private T[] items;
    private Map<T, Integer> occurrences;

    @Override
    public T[] getMostOccurrences(T[] items, int k) {
        this.items = items;
        countOccurrences();
        PriorityQueue<T> minHeap = new PriorityQueue<>((n1, n2) -> occurrences.get(n1) - occurrences.get(n2));
        for (T t : occurrences.keySet()) {
            minHeap.add(t);
            if (minHeap.size() > k) {
                minHeap.poll();
            }
        }
        List<T> topItems = new ArrayList<>(k);
        for (int idx = k - 1; idx >= 0; idx--) {
            topItems.add(0, minHeap.poll());
        }
        return (T[]) topItems.toArray();
    }

    private void countOccurrences() {
        occurrences = new HashMap<>();
        for (T t : items) {
            occurrences.put(t, occurrences.getOrDefault(t, 0) + 1);
        }
    }
}
Test Case
@Test
public void testTopItems() {
    String[] input = {
        "John",
        "John",
        "John",
        "Jane",
        "Jane",
        "Jane",
        "Jane",
        "Michael",
        "Emily",
        "Emily"
    };
    TopK<String> top = new TopKHeap<>();
    assertThat(new String[]{"Jane", "John"}, Matchers.arrayContaining(top.getMostOccurrences(input, 2)));
}
I constantly get the following error:
java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.String; ([Ljava.lang.Object; and [Ljava.lang.String; are in module java.base of loader 'bootstrap')
I've tried a couple of different approaches here. Initially I didn't use the ArrayList, but used T[] topItems = (T[]) new Object[k];, which yielded the same result. I replaced the array with an ArrayList, because Collections in general are better at handling Generics in Java, but apparently that didn't solve it.
The question is two-fold I guess. Why does it keep on throwing that error? And second question is, what is a more elegant way to solve this problem? By problem I mean, creating a generic class that uses Arrays.
Edit:
The method itself runs fine, but the moment I create a reference to the result of the output, it throws the ClassCastException. So whenever I do this, it falls over:
String[] result = top.getMostOccurrences(input, 2);
Answer 1
Score: 2
You asked two questions:
- Why is the exception occurring?
- What would be a more elegant solution to the problem of creating generic classes using arrays?
For future posts, please read: Can I ask only one question per post?
Why is the exception occurring?
There are two parts to this answer: first, why the exception occurs at all; second, why it occurs where it does, and why it does not occur when the returned value is ignored on the calling side.
Why does the exception occur?
Looking at the implementation of method getMostOccurrences(...) in TopKHeap, we see that (T[]) topItems.toArray() is returned. topItems is a List, and List::toArray() returns an Object[]. That means that any type check asserting that the value returned by TopKHeap::getMostOccurrences is something more specific than Object[] has to fail.
Since the test requires a String[] where the method actually delivers an Object[], we see a ClassCastException.
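To make the runtime types visible, here is a minimal sketch (the class name ToArrayRuntimeType and the helper sampleNames are made up for illustration) that compares the two toArray overloads of List. It only assumes the documented behaviour of the no-arg overload returning an Object[] and of the array-taking overload reusing the runtime type of its argument.

```java
import java.util.ArrayList;
import java.util.List;

class ToArrayRuntimeType {
    // Hypothetical helper: builds a small list to experiment with.
    static List<String> sampleNames() {
        List<String> names = new ArrayList<>();
        names.add("Jane");
        names.add("John");
        return names;
    }

    public static void main(String[] args) {
        List<String> names = sampleNames();

        // The no-arg overload allocates a plain Object[], whatever the list's type argument is.
        Object[] untyped = names.toArray();
        System.out.println(untyped.getClass().getSimpleName()); // Object[]

        // The overload taking an array produces an array of that argument's runtime type.
        String[] typed = names.toArray(new String[0]);
        System.out.println(typed.getClass().getSimpleName()); // String[]

        // An unchecked cast to T[] cannot change what the array is at runtime.
        System.out.println(untyped instanceof String[]); // false
    }
}
```

The cast (T[]) in the question only silences the compiler; the object behind it is still an Object[].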
Why does the exception occur when and where it occurs?
Instead of looking at the original code, we are going to look at a simplified version of the same problem:
class Test {
    public static void main (String... args) {
        Test.<String>foo();
        Object bar = Test.<String>foo();
        String baz = Test.<String>foo(); // <- ClassCastException thrown here
    }
    static <T> T foo() {
        return (T) new Object();
    }
}
This code will throw a ClassCastException on line 5. The obvious questions are:
- Why do the two calls to Test.<String>foo() on lines 3 and 4 not throw a ClassCastException?
- Why is the exception thrown on line 5, not on the return ... in method foo()?
Both questions have the same answer: the JLS does not define where to place the type check, "and it is up to the compiler implementation to decide where to insert casts or not, as long as the erased code meets the type safety rules of non-generic code." (kudos to newacct).
In short: the compiler is free to place the typecast where it sees fit. And in the implementation used in this answer, the compiler placed the type safety check at the assignment rather than at the method return.
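The same effect can be reproduced with arrays, which is closer to the code in the question. The sketch below uses made-up names (CastPlacementDemo, uncheckedToArray) and relies on how javac is commonly observed to behave; as noted above, the JLS does not pin down where the cast goes, so the Object[] assignment passing without a check is an assumption about the compiler rather than a guarantee.

```java
import java.util.ArrayList;
import java.util.List;

class CastPlacementDemo {
    // Unchecked cast: after erasure this method simply returns an Object[].
    @SuppressWarnings("unchecked")
    static <T> T[] uncheckedToArray(List<T> list) {
        return (T[]) list.toArray();
    }

    // Returns true if assigning the result to a String[] variable fails at runtime.
    static boolean failsWhenAssignedToStringArray(List<String> list) {
        try {
            // The erased return type is Object[], so a cast to String[] is needed here.
            String[] typed = uncheckedToArray(list);
            return typed == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("Jane");

        // Target type Object[] matches the erased return type, so no cast is required
        // (assumption: javac does not insert one here).
        Object[] loose = CastPlacementDemo.<String>uncheckedToArray(names);
        System.out.println(loose.length);

        System.out.println(failsWhenAssignedToStringArray(names));
    }
}
```

This mirrors the edit in the question: calling the method is harmless, and the failure only shows up once the result is used as a String[].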
What would be a more elegant solution to the problem of creating generic classes using arrays?
First and foremost, try to avoid mixing arrays and generics. Arrays are covariant and reified (their element type is retained at runtime), while generics are invariant and erased, so using them in combination is a recipe for trouble.
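One way to follow that advice for the top-K problem is to accept and return collections instead of arrays. The following is only a sketch under that assumption; the class TopKList and the method mostFrequent are hypothetical names, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class TopKList {
    // Returns up to k distinct items, most frequent first.
    static <T> List<T> mostFrequent(Collection<T> items, int k) {
        Map<T, Integer> counts = new HashMap<>();
        for (T item : items) {
            counts.merge(item, 1, Integer::sum);
        }

        // Min-heap on the count: the least frequent candidate is evicted first.
        PriorityQueue<T> minHeap =
                new PriorityQueue<>((a, b) -> Integer.compare(counts.get(a), counts.get(b)));
        for (T item : counts.keySet()) {
            minHeap.add(item);
            if (minHeap.size() > k) {
                minHeap.poll();
            }
        }

        // Polling yields ascending counts, so reverse to get most frequent first.
        List<T> result = new ArrayList<>(minHeap.size());
        while (!minHeap.isEmpty()) {
            result.add(minHeap.poll());
        }
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "John", "John", "John", "Jane", "Jane", "Jane", "Jane",
                "Michael", "Emily", "Emily");
        System.out.println(mostFrequent(input, 2)); // [Jane, John]
    }
}
```

Because no array is ever created, there is no unchecked cast and nothing that can fail with a ClassCastException at the call site.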
Second, if you cannot avoid it, try to use the array as internal state. The implementation of ArrayList uses an Object[] internally as backing data structure, but never leaks the array to the outside (ArrayList::toArray returns a copy of the internal array).
If you have to leak an internal array in a generic way, the implementation tends to get clumsy (expecting Class<T> instances as parameters for array creation) and rely on reflection, as shown in this answer by Michael Queue.
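For illustration, a rough sketch of what such a Class<T>-based helper tends to look like, next to a reflection-free variant that takes an array constructor reference. The class and method names (TypedArrays, toArrayByClass, toArrayByFactory) are invented for this example and are not taken from the linked answer.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

class TypedArrays {
    // Reflection-based: the caller passes the element type as a Class token.
    @SuppressWarnings("unchecked")
    static <T> T[] toArrayByClass(List<T> list, Class<T> elementType) {
        // Array.newInstance returns Object, hence the unchecked cast.
        T[] target = (T[]) Array.newInstance(elementType, list.size());
        return list.toArray(target);
    }

    // Reflection-free: the caller passes an array constructor such as String[]::new.
    static <T> T[] toArrayByFactory(List<T> list, IntFunction<T[]> arrayFactory) {
        return list.toArray(arrayFactory.apply(list.size()));
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("Jane");
        names.add("John");

        String[] viaClass = toArrayByClass(names, String.class);
        String[] viaFactory = toArrayByFactory(names, String[]::new);
        System.out.println(viaClass.length + " " + viaFactory.length);
    }
}
```

Either way, the caller has to supply the type information that erasure removed from T, which is where the clumsiness comes from.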
Answer 2
Score: 0
Why does it keep on throwing that error?
ClassCastException is thrown because (T[]) topItems.toArray(); returns an Object[], and you are trying to cast that to String[] in the JUnit assertion assertThat(new String[]{"Jane", "John"}, Matchers.arrayContaining(top.getMostOccurrences(input, 2)));
What is a more elegant way to solve this problem?
The <T> T[] toArray(T[] a); method of List takes an array of type T[] and returns a T[]. You can get the component type of the input array using reflection (an array's type is retained at runtime):
    Class<?> itemsClass = items.getClass().getComponentType();
    return topItems.toArray(((T[]) Array.newInstance(itemsClass, topItems.size())));
Here items.getClass().getComponentType() returns the class of the array's elements. Even though arrays are covariant, you can still use this safely, as the type is enforced by TopK<String> top = new TopKHeap<>();
Complete method:
    // Requires: import java.lang.reflect.Array;
    @Override
    public T[] getMostOccurrences(T[] items, int k) {
        this.items = items;
        countOccurrences();
        // Min-heap ordered by occurrence count, so the least frequent item is polled first.
        PriorityQueue<T> minHeap =
            new PriorityQueue<>((n1, n2) -> occurrences.get(n1) - occurrences.get(n2));
        for (T t : occurrences.keySet()) {
            minHeap.add(t);
            if (minHeap.size() > k) {
                minHeap.poll();
            }
        }
        // Insert at the front so the most frequent item ends up first.
        List<T> topItems = new ArrayList<>(k);
        for (int idx = k - 1; idx >= 0; idx--) {
            topItems.add(0, minHeap.poll());
        }
        // Create the result array with the same runtime component type as the input array.
        Class<?> itemsClass = items.getClass().getComponentType();
        return topItems.toArray(((T[]) Array.newInstance(itemsClass, topItems.size())));
    }
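To see what the reflective lookup does in isolation, here is a small sketch with made-up names (ComponentTypeDemo, toArrayLike). It shows that the result array takes the runtime component type of the input array, which may be narrower than the static type T when an array is passed covariantly.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

class ComponentTypeDemo {
    // Builds the result array from the runtime component type of 'items'.
    @SuppressWarnings("unchecked")
    static <T> T[] toArrayLike(T[] items, List<T> selected) {
        Class<?> componentType = items.getClass().getComponentType();
        return selected.toArray((T[]) Array.newInstance(componentType, selected.size()));
    }

    public static void main(String[] args) {
        String[] names = {"John", "Jane", "Jane"};
        List<String> top = new ArrayList<>();
        top.add("Jane");
        String[] result = toArrayLike(names, top); // assignment succeeds: it really is a String[]
        System.out.println(result.getClass().getSimpleName()); // String[]

        // An Integer[] passed as Number[]: the result follows the runtime type, not T.
        Number[] numbers = new Integer[] {1, 2, 2};
        List<Number> topNumbers = new ArrayList<>();
        topNumbers.add(numbers[1]);
        Number[] picked = toArrayLike(numbers, topNumbers);
        System.out.println(picked.getClass().getSimpleName()); // Integer[]
    }
}
```

In the top-K method every selected element comes from the input array itself, so it always fits into an array of that component type.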
Answer 3
Score: 0
My answers are a lot simpler (in my opinion) than the two that preceded mine:
> Q1: „...Why does it keep on throwing that error?...“
A1: It throws that error because you're calling the no-arg List.toArray(), which is specified to return an Object[].
So you get a ClassCastException when you do String[] result = top.getMostOccurrences(input, 2), for the same reason you can't do String notAString = new Object(). In other words: a String IS AN Object, but an Object IS NOT A String.
Going the other way around would be OK though: Object[] OK = new String[]{ "foo", "bar" }
> Q2: „...a more elegant way to solve this problem?...“
A2: That depends on what you consider to be „elegant“. In my opinion, the solution I implemented here meets your requirements…
public class TopKHeap<T extends String> implements TopK<T> { /* <-- An explicit bound on T */
    ...
    @Override
    public T[] getMostOccurrences(int k, T...items) { /* <-- Not essential; just my personal preference */
        ...
        T[] typeEnforcer = copyOf( items, topItems.size( ) ); /* <-- This is what makes it type safe... */
        return topItems.toArray( typeEnforcer ); /* <-- ...no need to cast */
    }
}
Then you'd use that just like you originally did. In my demo (https://www.browxy.com#USER_306319), though…
TopK<String> top = new TopKHeap<>();
String[] output = top.getMostOccurrences( 2, input);
for(String name : output )
    out.printf("%s ", name);
…I just print out what I get…
Jane John
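The core of this approach can be tried out on its own. The sketch below assumes that the copyOf call in the snippet above is java.util.Arrays.copyOf (the snippet does not show its import), uses invented names (CopyOfDemo, toArrayLike), and is written without the T extends String bound.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CopyOfDemo {
    // Arrays.copyOf returns an array of exactly the same runtime class as 'prototype',
    // so neither a cast nor a Class token is needed.
    static <T> T[] toArrayLike(T[] prototype, List<T> list) {
        T[] target = Arrays.copyOf(prototype, list.size());
        // The list fills every slot, overwriting whatever copyOf carried over.
        return list.toArray(target);
    }

    public static void main(String[] args) {
        String[] input = {"John", "Jane", "Jane", "Emily"};
        List<String> top = new ArrayList<>();
        top.add("Jane");
        top.add("John");

        String[] output = toArrayLike(input, top);
        for (String name : output) {
            System.out.printf("%s ", name);
        }
    }
}
```

The input array acts as a prototype here: only its runtime type matters, since all copied elements are replaced by the contents of the list.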