How to measure image classification model robustness?

Question

Image classification models trained on animal classification data such as iNaturalist or iWildcam sometimes develop spurious correlations with the background. How can we measure the model performance limitations caused only by such spurious correlations, as opposed to other plausible (non-spurious) reasons (e.g., two animals that genuinely look very similar)?
Answer 1

Score: 0
Google [1][4] define In-Distribution robustness as a model's performance on a hold-out test set from the same data. Out-of-Distribution (OOD) robustness, which is the focus of the question, is the model's performance at classifying the same objects on a different dataset. The benchmark datasets Google used to demonstrate their state-of-the-art model "ViT-Plex" were CIFAR-10 vs CIFAR-100 [2], CIFAR-100 vs CIFAR-10, ImageNet vs Places365, and RETINA. Papers With Code also lists multiple other benchmark datasets for OOD detection [3].
[1] https://ai.googleblog.com/2022/07/towards-reliability-in-deep-learning.html
[2] https://paperswithcode.com/sota/out-of-distribution-detection-on-cifar-100-vs
[3] https://paperswithcode.com/task/out-of-distribution-detection
[4] https://arxiv.org/pdf/2207.07411.pdf
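The comparison described above can be sketched as a simple metric: evaluate the same model on an in-distribution hold-out set and on an out-of-distribution set, then look at the accuracy gap. A large gap suggests the model is leaning on distribution-specific cues (such as the background) rather than the object itself. This is a minimal illustrative sketch, not the evaluation protocol from the cited papers; the prediction arrays below are hypothetical stand-ins for real model outputs.

```python
import numpy as np

def accuracy(preds, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return float(np.mean(np.asarray(preds) == np.asarray(labels)))

def robustness_gap(id_preds, id_labels, ood_preds, ood_labels):
    """In-distribution accuracy minus out-of-distribution accuracy.

    A gap near 0 means performance transfers across the distribution
    shift; a large positive gap hints at reliance on spurious cues.
    """
    return accuracy(id_preds, id_labels) - accuracy(ood_preds, ood_labels)

# Hypothetical predictions on a hold-out (ID) set and a shifted (OOD) set
id_preds, id_labels = [0, 1, 1, 0], [0, 1, 1, 0]    # perfect ID accuracy
ood_preds, ood_labels = [0, 0, 1, 1], [0, 1, 1, 0]  # half right under shift
print(robustness_gap(id_preds, id_labels, ood_preds, ood_labels))  # → 0.5
```

Note that the gap alone cannot separate spurious-correlation failures from genuinely hard cases (e.g., two visually similar species); for that, the OOD set must be constructed so the shift isolates the suspected cue, such as the same species photographed against different backgrounds.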
Comments