How to Convert a Screen Point to Real-World Position Using Depth
Lightship's depth map output allows for dynamically placing objects in an AR scene without the use of planes or a mesh. This How-To covers the process of choosing a point on the screen and placing an object in 3D space by using the depth output.
Prerequisites
You will need a Unity project with Lightship AR enabled. For more information, see Installing ARDK 3.
If this is your first time using depth, Accessing and Displaying Depth Information provides a simpler use case for depth and is easier to start with.
Steps
If the main scene is not AR-ready, set it up:
Remove the Main Camera.
Add an ARSession and XROrigin to the Hierarchy, then add an AR Occlusion Manager Component to XROrigin. If you want higher quality occlusions, see How to Set Up Real-World Occlusion to learn how to use the
LightshipOcclusionExtension
.Create a script that will handle depth picking and placing prefabs. Name it
Depth_ScreenToWorldPosition
.Add Camera Update Events:
This will store the most recently used
DisplayMatrix
to later align the depth texture to the screen space.Add a serialized
ARCameraManager
and a privateMatrix4x4
field to the top of the script.[SerializeField]
private ARCameraManager _arCameraManager;
private Matrix4x4 _displayMatrix;Subscribe to
OnCameraFrameEventReceived
for receiving camera updates.private void OnCameraFrameEventReceived(ARCameraFrameEventArgs args)
{
// Cache the screen to image transform
if (args.displayMatrix.HasValue)
{
#if UNITY_IOS
_displayMatrix = args.displayMatrix.Value.transpose;
#else
_displayMatrix = args.displayMatrix.Value;
#endif
}
}noteYou must transpose the DisplayMatrix on iOS.
Collect Depth Images on Update
Add a serialized
AROcclusionManager
and a privateXRCpuImage
field.[SerializeField]
private AROcclusionManager _occlusionManager;
private XRCpuImage? _depthimage;noteDepth images are surfaced from the Machine Learning model rotated 90 degrees and need to be sampled with respect to the display transform. The display transform gives a mapping to convert from screen space to the ML coordinate system.
XRCpuImage
is being used instead of a GPU Texture, so theSample(Vector2 uv, Matrix4x4 transform)
method can be used on the CPU.In the
Update
method:- Check that the
XROcclusionSubsystem
is valid and running. - Call
_occlusionManager.TryAcquireEnvironmentDepthCpuImage
to retrieve the latest depth image form theAROcclusionManager
. - Dispose the old depth image and cache the new value.
void Update()
{
if (!_occlusionManager.subsystem.running)
{
return;
}
if (_occlusionManager.TryAcquireEnvironmentDepthCpuImage(out var image))
{
// Dispose the old image
_depthImage?.Dispose();
_depthImage = image;
}
else
{
return;
}
}- Check that the
Set up code to Handle Touch Inputs:
- Create a private Method named "HandleTouch".
- In editor, we'll use "Input.MouseDown" to detect mouse clicks.
- For phone, the "Input.GetTouch"
- Then, get the 2D screenPosition Coordinates from the device.
private void HandleTouch()
{
// in the editor we want to use mouse clicks, on phones we want touches.
#if UNITY_EDITOR
if (Input.GetMouseButtonDown(0) || Input.GetMouseButtonDown(1) || Input.GetMouseButtonDown(2))
{
var screenPosition = new Vector2(Input.mousePosition.x, Input.mousePosition.y);
#else
//if there is no touch or touch selects UI element
if (Input.touchCount <= 0)
return;
var touch = Input.GetTouch(0);
// only count touches that just began
if (touch.phase == UnityEngine.TouchPhase.Began)
{
var screenPosition = touch.position;
#endif
// do something with touches
}
}
}Convert touch points from the screen to 3D Coordinates using Depth
In the HandleTouch method, check for a valid depth image when a touch is detected.
// do something with touches
if (_depthImage.HasValue)
{
// 1. Sample eye depth
// 2. Get world position
// 3. Spawn a thing on the depth map
}Sample the depth image at the screenPosition to get the z-value
// 1. Sample eye depth
var uv = new Vector2(screenPosition.x / Screen.width, screenPosition.y / Screen.height);
var eyeDepth = (float) _depthImage.Value.Sample(uv, _displayMatrix);Add a
Camera
field to the top of script:[SerializeField]
private Camera _camera;This will use Unity's
Camera.ScreenToWorldPoint
function. Call the method in "HandleTouch" to convert screenPosition and eyeDepth to worldPositions.// 2. Get world position
var worldPosition =
_camera.ScreenToWorldPoint(new Vector3(screenPosition.x, screenPosition.y, eyeDepth));Spawn a GameObject at this location in world space:
Add a
GameObject
field to the top of the script:[SerializeField]
private GameObject _prefabToSpawn;Instantiate a copy of this prefab at this position:
// 3. Spawn a thing on the depth map
Instantiate(_prefabToSpawn, worldPosition, Quaternion.identity);
Add
HandleTouch
to the end of theUpdate
method.private void HandleTouch()
{
// in the editor we want to use mouse clicks, on phones we want touches.
#if UNITY_EDITOR
if (Input.GetMouseButtonDown(0) || Input.GetMouseButtonDown(1) || Input.GetMouseButtonDown(2))
{
var screenPosition = new Vector2(Input.mousePosition.x, Input.mousePosition.y);
#else
//if there is no touch or touch selects UI element
if (Input.touchCount <= 0)
return;
var touch = Input.GetTouch(0);
// only count touches that just began
if (touch.phase == UnityEngine.TouchPhase.Began)
{
var screenPosition = touch.position;
#endif
// do something with touches
if (_depthImage.HasValue)
{
// Sample eye depth
var uv = new Vector2(screenPosition.x / Screen.width, screenPosition.y / Screen.height);
var eyeDepth = (float) _depthImage.Value.Sample(uv, _displayMatrix);
// Get world position
var worldPosition =
_camera.ScreenToWorldPoint(new Vector3(screenPosition.x, screenPosition.y, eyeDepth));
//spawn a thing on the depth map
Instantiate(_prefabToSpawn, worldPosition, Quaternion.identity);
}
}
}Add the
Depth_ScreenToWorldPosition
script as a Component of theXROrigin
in the Hierarchy:- In the Hierarchy window, select the
XROrigin
, then click Add Component in the Inspector. - Search for the
Depth_ScreenToWorldPosition
script, then select it.
- In the Hierarchy window, select the
Create a Cube to use as the object that will spawn into the scene:
- In the Hierarchy, right-click, then, in the Create menu, mouse over 3D Object and select Cube.
- Drag the new Cube object from the Hierarchy to the Assets window to create a prefab of it, then delete it from the Hierarchy. (The Cube in the Assets window should remain.)
Assign the fields in the
Depth_ScreenToWorldPosition
script:- In the Hierarchy window, select the
XROrigin
, then expand theDepth_ScreenToWorldPosition
Component in the Inspector window. - Assign the
XROrigin
to the AROcclusionManager field. - Assign the Main Camera to the Camera field.
- Assign the Main Camera to the ARCameraManager field.
- Assign your new Cube prefab to the Prefab to Spawn field.
- In the Hierarchy window, select the
Try running the scene in-editor using Playback or open Build Settings, then click Build and Run to build to device.
If something did not work, double check the steps above and compare your script to the one below.
Click to show the Depth_ScreenToWorldPosition script
using Niantic.Lightship.AR.Utilities;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;
public class Depth_ScreenToWorldPosition : MonoBehaviour
{
[SerializeField]
private AROcclusionManager _occlusionManager;
[SerializeField]
private ARCameraManager _arCameraManager;
[SerializeField]
private Camera _camera;
[SerializeField]
private GameObject _prefabToSpawn;
private Matrix4x4 _displayMatrix;
private XRCpuImage? _depthImage;
private void OnEnable()
{
_arCameraManager.frameReceived += OnCameraFrameEventReceived;
}
private void OnDisable()
{
_arCameraManager.frameReceived -= OnCameraFrameEventReceived;
}
private void OnCameraFrameEventReceived(ARCameraFrameEventArgs args)
{
// Cache the screen to image transform
if (args.displayMatrix.HasValue)
{
#if UNITY_IOS
_displayMatrix = args.displayMatrix.Value.transpose;
#else
_displayMatrix = args.displayMatrix.Value;
#endif
}
}
void Update()
{
if (!_occlusionManager.subsystem.running)
{
return;
}
if (_occlusionManager.TryAcquireEnvironmentDepthCpuImage(out
var image))
{
// Dispose the old image
_depthImage?.Dispose();
_depthImage = image;
}
else
{
return;
}
HandleTouch();
}
private void HandleTouch()
{
// in the editor we want to use mouse clicks, on phones we want touches.
#if UNITY_EDITOR
if (Input.GetMouseButtonDown(0) || Input.GetMouseButtonDown(1) || Input.GetMouseButtonDown(2))
{
var screenPosition = new Vector2(Input.mousePosition.x, Input.mousePosition.y);
#else
//if there is no touch or touch selects UI element
if (Input.touchCount <= 0)
return;
var touch = Input.GetTouch(0);
// only count touches that just began
if (touch.phase == UnityEngine.TouchPhase.Began)
{
var screenPosition = touch.position;
#endif
// do something with touches
if (_depthImage.HasValue)
{
// Sample eye depth
var uv = new Vector2(screenPosition.x / Screen.width, screenPosition.y / Screen.height);
var eyeDepth = (float) _depthImage.Value.Sample(uv, _displayMatrix);
// Get world position
var worldPosition =
_camera.ScreenToWorldPoint(new Vector3(screenPosition.x, screenPosition.y, eyeDepth));
//spawn a thing on the depth map
Instantiate(_prefabToSpawn, worldPosition, Quaternion.identity);
}
}
}
}
More Information
You can also try combining this How-To with ObjectDetection or Semantics to know where things are in 3D space.