ARFaceTrackingConfiguration: How to distinguish pictures from real faces?
Question
We have several apps in the App Store that use ARFaceTrackingConfiguration to detect the user's face on iOS devices with Face ID cameras.
As you might have seen, ARKit will also track pictures of faces that you put in front of your iPad Pro or iPhone X, as if they were real faces. For example, take a picture from one of our apps (to replicate this, download and run Apple's example app for ARFaceTrackingConfiguration):
Now I have noticed that internally ARKit treats real faces differently than it does pictures of faces. Generally (for both ARWorldTrackingConfiguration and ARFaceTrackingConfiguration), ARKit tries to match real-world sizes and virtual object sizes, i.e. an object that is 10x10 cm in your 3D editing software will match a real-world object of the same 10x10 cm.
BUT when face tracking is used and the phone detects an abnormally sized face (a small 4 cm wide face as in the picture above, or a poster of a person where the face is much bigger), it will scale the FaceGeometry as if the detected face were a normal-sized head, i.e. the head width will measure around 14 cm. All virtual objects will then be scaled accordingly, which makes them the wrong size in the real world. See the next picture:
The 3D model of the glasses is about 14 cm wide, yet it is presented as a 4 cm object.
In comparison, if you put the glasses on a real 3D face, they will be the correct size: on a small person's head (around 12 cm) they will be slightly too big, and on a big person's head (around 16 cm) they will be slightly too small (as they keep their real 14 cm width in both cases).
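The scaling described above can be written out as simple arithmetic. This is only a sketch of the observed behaviour, not ARKit's documented algorithm: the function name `apparentWidth` and the default of 0.14 m (taken from the roughly 14 cm head width observed above) are assumptions made for illustration. Lengths are in meters, the unit ARKit uses.

```swift
/// Real-world width at which a virtual model appears when the whole scene is
/// rescaled so that the detected face spans `assumedFaceWidth`.
/// Hypothetical helper: 0.14 m is the observed value, not an ARKit constant.
func apparentWidth(modelWidth: Double,
                   realFaceWidth: Double,
                   assumedFaceWidth: Double = 0.14) -> Double {
    // Everything in the scene shrinks or grows by the same ratio as the face.
    let scale = realFaceWidth / assumedFaceWidth
    return modelWidth * scale
}

// 14 cm glasses on a 4 cm wide printed face appear roughly 4 cm wide.
let onPhoto = apparentWidth(modelWidth: 0.14, realFaceWidth: 0.04)
print(onPhoto)
```

On a real face tracked with depth data no such rescaling seems to happen, so the glasses keep their 14 cm width regardless of head size.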
I can even see ARKit switching between two modes:
1. Flat face detection using just the camera image
2. Face detection using the Face ID TrueDepth camera
This is especially prominent when you hold a baby in front of the app.
With a baby's head, ARKit will first attempt to scale up everything so that the baby's head is 14 cm wide in the virtual scene and the glasses fit as they would on an adult.
Then, usually 1-2 s after the head appears in the camera, ARFaceTrackingConfiguration will switch from mode (1) to mode (2) and show the real size of the 3D object, which leads to super-cute pictures of small baby heads with adult-sized glasses (not shown here, as SO isn't for sharing baby pictures).
So, now for the question:
Is there a way of determining whether ARKit is in mode 1 or mode 2?
Answer 1
Score: 3
There's no way to do this in the ARKit 3.0 API at the moment.
> An ARKit session running ARFaceTrackingConfiguration constantly receives data from the motion sensors at 1000 Hz, from the front RGB camera at 60 Hz, and from the IR camera at 15 Hz. The TrueDepth sensor works for as long as the session is running; you can't manually stop it in ARKit.
The working distance in ARFaceTrackingConfiguration is approximately 15...100 cm, so you can effectively detect up to 3 faces in ARKit 3.0 within that distance. But there is a logical bug in ARKit face detection: you can track your own face at the same time as a big face on a poster behind you (although the face on the poster is flat, because all of its points lie at the same depth). So the scale of the canonical mask depends on the size of the detected face (as you said), but ARKit can't adapt the scale of that canonical mask (ARFaceGeometry) instantly, because face tracking is very CPU intensive.
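The remark that a face on a poster is flat suggests one possible way to tell a picture from a real face: look at how much the depth values vary across the face region. The sketch below is an assumption, not something ARKit provides. It presumes you have already copied the depth samples (in meters) for the face region out of the frame's depth map, and both the helper names and the 5 mm threshold are hypothetical values that would need tuning.

```swift
/// Standard deviation of the valid depth samples (meters) from the face region.
/// Returns nil when there are too few valid samples to judge.
func depthSpread(_ samples: [Float]) -> Float? {
    // Depth maps may contain NaN or non-positive values where nothing was measured.
    let valid = samples.filter { $0.isFinite && $0 > 0 }
    guard valid.count > 1 else { return nil }
    let mean: Float = valid.reduce(0, +) / Float(valid.count)
    let squaredSum: Float = valid.reduce(0) { $0 + ($1 - mean) * ($1 - mean) }
    return (squaredSum / Float(valid.count)).squareRoot()
}

/// Assumed threshold: a real face has centimeters of relief (nose vs. cheeks),
/// while a photo held parallel to the camera has almost none.
let flatnessThreshold: Float = 0.005

/// True if the region looks flat, false if it has relief, nil if undecidable.
func looksFlat(_ samples: [Float]) -> Bool? {
    guard let spread = depthSpread(samples) else { return nil }
    return spread < flatnessThreshold
}
```

A photo that is tilted relative to the camera would show a depth gradient and defeat this simple check; fitting a plane to the samples and measuring the residual would likely be more robust.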
Apple's TrueDepth module has such a narrow working distance range because the 30K dots coming from the IR projector must have a definite brightness, blurriness, coverage and dot size to be used effectively by ARKit.
With this code you can test whether the TrueDepth module is involved in the process:

    import ARKit
    import UIKit

    @available(iOS 13.0, *)
    class ViewController: UIViewController {

        @IBOutlet var sceneView: ARSCNView!

        override func viewDidLoad() {
            super.viewDidLoad()
            // Receive per-frame callbacks from the AR session.
            sceneView.session.delegate = self
        }
    }

    @available(iOS 13.0, *)
    extension ViewController: ARSessionDelegate {

        func session(_ session: ARSession, didUpdate frame: ARFrame) {
            // capturedDepthData is nil for frames that carry no TrueDepth data.
            print(frame.capturedDepthData?.depthDataQuality as Any)
        }
    }

Usually only every fourth frame carries depth data, so a value is printed for one frame in four and nil for the rest (sometimes the gap is larger than 4 frames).
There's only one case in which the TrueDepth sensor doesn't contribute to the RGB data: when you move the smartphone too close to a poster or too close to your face. In that case you'll see only nil values being printed.
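Because depth data normally arrives only on some frames, a single nil says little; what matters is whether any depth data has arrived recently. A minimal sketch of that idea follows. The type `DepthPresenceTracker` and its window size are hypothetical, and this only reports whether depth frames are arriving. It does not expose ARKit's internal tracking mode, which, as stated above, has no API.

```swift
/// Remembers whether each of the last `window` frames carried depth data.
struct DepthPresenceTracker {
    private var recent: [Bool] = []
    let window: Int

    // 30 frames is an assumed default (about half a second at 60 Hz).
    init(window: Int = 30) {
        self.window = window
    }

    /// Call once per frame, e.g. with `frame.capturedDepthData != nil`.
    mutating func record(hasDepth: Bool) {
        recent.append(hasDepth)
        if recent.count > window { recent.removeFirst() }
    }

    /// True if at least one frame in the window carried depth data.
    var depthSeenRecently: Bool {
        recent.contains(true)
    }
}

var tracker = DepthPresenceTracker(window: 8)
// Simulate the usual pattern: depth on every fourth frame.
for i in 0..<8 { tracker.record(hasDepth: i % 4 == 0) }
print(tracker.depthSeenRecently)  // true
// Simulate holding the phone too close: only nils arrive.
for _ in 0..<8 { tracker.record(hasDepth: false) }
print(tracker.depthSeenRecently)  // false
```

In the session delegate, `record(hasDepth:)` would be called from `session(_:didUpdate:)`, and the window should be comfortably longer than the typical four-frame gap.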