How to Convert a Screen Point to Real-World Position Using Depth

Lightship's depth map output allows for dynamically placing objects in an AR scene without the use of planes or a mesh. This How-To covers the process of choosing a point on the screen and placing an object in 3D space by using the depth output.

Prerequisites

You will need a Unity project with Lightship AR enabled. For more information, see Installing ARDK 3.

tip

If this is your first time using depth, Accessing and Displaying Depth Information provides a simpler use case for depth and is easier to start with.

Steps

If the main scene is not AR-ready, set it up:

Remove the Main Camera.
Add an ARSession and XROrigin to the Hierarchy, then add an AR Occlusion Manager Component to XROrigin. If you want higher quality occlusions, see How to Set Up Real-World Occlusion to learn how to use the LightshipOcclusionExtension.
Create a script that will handle depth picking and placing prefabs. Name it Depth_ScreenToWorldPosition.

Collect Depth Images on Update

Add a serialized AROcclusionManager and a private XRCpuImage field.

[SerializeField]
private AROcclusionManager _occlusionManager;

private XRCpuImage? _depthImage;

Create a new method called UpdateImage:

Check that the XROcclusionSubsystem is valid and running.
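Call _occlusionManager.TryAcquireEnvironmentDepthCpuImage to retrieve the latest depth image from the AROcclusionManager.
Dispose the old depth image and cache the new value. Because XRCpuImage wraps a native resource, also dispose the cached image in OnDestroy, as the complete script at the end of this How-To does.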

private void UpdateImage()
{
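    // Skip if the occlusion subsystem is missing or not running
    if (_occlusionManager.subsystem == null || !_occlusionManager.subsystem.running)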
    {
        return;
    }

    if (_occlusionManager.TryAcquireEnvironmentDepthCpuImage(out var image))
    {
        // Dispose the old image
        _depthImage?.Dispose();

        // Cache the new image
        _depthImage = image;
    }
}

Invoke the UpdateImage method within the Update callback:
```
private void Update()
{
    UpdateImage();
}
```

Calculate the Display Matrix

Depth images surfaced from the machine learning model are oriented to the camera sensor rather than to the screen, so they must be sampled with respect to the current screen orientation. The display transform provides a mapping from screen space to the image coordinate system. This How-To uses XRCpuImage instead of a GPU texture so that the Sample(Vector2 uv, Matrix4x4 transform) method can be used on the CPU.
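To make the idea of a screen-to-image mapping concrete, here is a minimal sketch in plain C# (System.Numerics, so it runs outside Unity). It normalizes a screen pixel to a UV coordinate and pushes it through a 2D transform. The class UvTransformSketch and its methods are hypothetical names, and the 90-degree rotation is only a stand-in for a display transform. The real matrix comes from CameraMath.CalculateDisplayMatrix, and its exact layout is not reproduced here.

```csharp
using System;
using System.Numerics;

public static class UvTransformSketch
{
    // Hypothetical stand-in for a display transform: rotates normalized
    // UV coordinates by 90 degrees about the centre of the unit square.
    public static Matrix3x2 Rotate90AboutCenter() =>
        Matrix3x2.CreateRotation(MathF.PI / 2f, new Vector2(0.5f, 0.5f));

    // Converts a pixel position into normalized [0, 1] screen coordinates.
    public static Vector2 ScreenToUv(Vector2 screenPosition, float screenWidth, float screenHeight) =>
        new Vector2(screenPosition.X / screenWidth, screenPosition.Y / screenHeight);

    // Applies the transform and clamps the result to the unit square so
    // the coordinate always addresses a pixel inside the image.
    public static Vector2 ToImageUv(Vector2 screenUv, Matrix3x2 transform)
    {
        var uv = Vector2.Transform(screenUv, transform);
        return Vector2.Clamp(uv, Vector2.Zero, Vector2.One);
    }
}
```

In this How-To the equivalent work happens inside Sample, which receives the screen-space UV and the display matrix together.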

Add a private Matrix4x4 and a ScreenOrientation field.

private Matrix4x4 _displayMatrix;
private ScreenOrientation? _latestScreenOrientation;

Create a new method called UpdateDisplayMatrix.
Check that the script has a valid XRCpuImage cached.
Check if the matrix needs to be recalculated by testing whether the screen orientation has changed.
Call CameraMath.CalculateDisplayMatrix to calculate a matrix that transforms the screen coordinates to image coordinates.

private void UpdateDisplayMatrix()
{
    // Make sure we have a valid depth image
    if (_depthImage is {valid: true})
    {
        // The display matrix only needs to be recalculated if the screen orientation changes
        if (!_latestScreenOrientation.HasValue ||
            _latestScreenOrientation.Value != XRDisplayContext.GetScreenOrientation())
        {
            _latestScreenOrientation = XRDisplayContext.GetScreenOrientation();
            _displayMatrix = CameraMath.CalculateDisplayMatrix(
                _depthImage.Value.width,
                _depthImage.Value.height,
                Screen.width,
                Screen.height,
                _latestScreenOrientation.Value,
                invertVertically: true);
        }
    }
}

Invoke the UpdateDisplayMatrix method within the Update callback:

private void Update()
{
    ...
    UpdateDisplayMatrix();
}

Set Up Code to Handle Touch Inputs

Create a private method named HandleTouch.
In the editor, use Input.GetMouseButtonDown to detect mouse clicks.
On a phone, use Input.GetTouch to read the first touch.
Then get the 2D screenPosition coordinates from the mouse or touch.

private void HandleTouch()
{
    // in the editor we want to use mouse clicks, on phones we want touches.
#if UNITY_EDITOR
    if (Input.GetMouseButtonDown(0) || Input.GetMouseButtonDown(1) || Input.GetMouseButtonDown(2))
    {
        var screenPosition = new Vector2(Input.mousePosition.x, Input.mousePosition.y);
#else
    // ignore frames without any touch
    if (Input.touchCount <= 0)
        return;
    var touch = Input.GetTouch(0);

    // only count touches that just began
    if (touch.phase == UnityEngine.TouchPhase.Began)
    {
        var screenPosition = touch.position;
#endif
        // do something with touches
    }
}

Convert Touch Points from the Screen to 3D Coordinates Using Depth

In the HandleTouch method, check for a valid depth image when a touch is detected.

    // do something with touches
    if (_depthImage.HasValue)
    {
        // 1. Sample eye depth

        // 2. Get world position

        // 3. Spawn a thing on the depth map
    }

Sample the depth image at the screenPosition to get the z-value

// 1. Sample eye depth
var uv = new Vector2(screenPosition.x / Screen.width, screenPosition.y / Screen.height);
var eyeDepth = _depthImage.Value.Sample<float>(uv, _displayMatrix);
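As a defensive measure, the sampled value can be checked before it is used to place anything. The following is a minimal sketch in plain C#. DepthGuard and IsUsableDepth are hypothetical names that are not part of Lightship, and the default limits of 0.1 and 100 meters are illustrative assumptions.

```csharp
using System;

public static class DepthGuard
{
    // Returns true when eyeDepth is a finite distance strictly between
    // the near and far limits (both limits are assumed values, in meters).
    public static bool IsUsableDepth(float eyeDepth, float near = 0.1f, float far = 100f)
    {
        return !float.IsNaN(eyeDepth)
            && !float.IsInfinity(eyeDepth)
            && eyeDepth > near
            && eyeDepth < far;
    }
}
```

In HandleTouch, a check like this would sit between sampling eyeDepth and converting it to a world position. Passing the camera's nearClipPlane and farClipPlane is one option for the limits.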

Add a Camera field to the top of script:
```
[SerializeField]
private Camera _camera;
```
The script uses Unity's Camera.ScreenToWorldPoint function for the conversion. Call it in HandleTouch to convert screenPosition and eyeDepth to a world position.
```
// 2. Get world position
var worldPosition =
    _camera.ScreenToWorldPoint(new Vector3(screenPosition.x, screenPosition.y, eyeDepth));
```
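Camera.ScreenToWorldPoint treats the z component of its input as the distance from the camera in world units, which is why the sampled eye depth can be passed in directly. The sketch below, in plain C# with System.Numerics, shows the underlying pinhole unprojection under simplifying assumptions. It assumes a symmetric perspective frustum described by a vertical field of view, and it returns the point in camera space rather than world space. UnprojectSketch and ScreenToCameraPoint are hypothetical names, and an AR camera's actual projection may differ from this model.

```csharp
using System;
using System.Numerics;

public static class UnprojectSketch
{
    // Unprojects a pixel and its eye depth to a camera-space point
    // (x right, y up, z forward), assuming a symmetric perspective camera.
    public static Vector3 ScreenToCameraPoint(
        Vector2 screenPosition,
        float eyeDepth,
        float screenWidth,
        float screenHeight,
        float verticalFovDegrees)
    {
        // Map pixel coordinates to [-1, 1] with (0, 0) at the screen centre.
        float ndcX = screenPosition.X / screenWidth * 2f - 1f;
        float ndcY = screenPosition.Y / screenHeight * 2f - 1f;

        // Half extents of the view frustum at the sampled depth.
        float halfHeight = eyeDepth * MathF.Tan(verticalFovDegrees * MathF.PI / 360f);
        float halfWidth = halfHeight * (screenWidth / screenHeight);

        return new Vector3(ndcX * halfWidth, ndcY * halfHeight, eyeDepth);
    }
}
```

A tap at the centre of the screen lands on the camera's forward axis at the sampled depth. Taps farther from the centre move sideways in proportion to that depth, which is why an accurate depth sample matters for placement.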
Spawn a GameObject at this location in world space:

Add a GameObject field to the top of the script:

[SerializeField]
private GameObject _prefabToSpawn;

Instantiate a copy of this prefab at this position:

// 3. Spawn a thing on the depth map
Instantiate(_prefabToSpawn, worldPosition, Quaternion.identity);

Add a call to HandleTouch at the end of the Update method. The completed HandleTouch method should look like this:

    private void HandleTouch()
    {
        // in the editor we want to use mouse clicks, on phones we want touches.
#if UNITY_EDITOR
        if (Input.GetMouseButtonDown(0) || Input.GetMouseButtonDown(1) || Input.GetMouseButtonDown(2))
        {
            var screenPosition = new Vector2(Input.mousePosition.x, Input.mousePosition.y);
#else
        // ignore frames without any touch
        if (Input.touchCount <= 0)
            return;
        var touch = Input.GetTouch(0);

        // only count touches that just began
        if (touch.phase == UnityEngine.TouchPhase.Began)
        {
            var screenPosition = touch.position;
#endif
            // do something with touches
            if (_depthImage.HasValue)
            {
                // Sample eye depth
                var uv = new Vector2(screenPosition.x / Screen.width, screenPosition.y / Screen.height);
                var eyeDepth = _depthImage.Value.Sample<float>(uv, _displayMatrix);
                
                // Get world position
                var worldPosition =
                    _camera.ScreenToWorldPoint(new Vector3(screenPosition.x, screenPosition.y, eyeDepth));
                
                //spawn a thing on the depth map
                Instantiate(_prefabToSpawn, worldPosition, Quaternion.identity);
            }
        }
    }

Add the Depth_ScreenToWorldPosition script as a Component of the XROrigin in the Hierarchy:
1. In the Hierarchy window, select the XROrigin, then click Add Component in the Inspector.
2. Search for the Depth_ScreenToWorldPosition script, then select it.
Create a Cube to use as the object that will spawn into the scene:
1. In the Hierarchy, right-click, then, in the Create menu, mouse over 3D Object and select Cube.
2. Drag the new Cube object from the Hierarchy to the Assets window to create a prefab of it, then delete it from the Hierarchy. (The Cube in the Assets window should remain.)
Assign the fields in the Depth_ScreenToWorldPosition script:
1. In the Hierarchy window, select the XROrigin, then expand the Depth_ScreenToWorldPosition Component in the Inspector window.
2. Assign the XROrigin to the AROcclusionManager field.
3. Assign the Main Camera to the Camera field.
4. Assign your new Cube prefab to the Prefab to Spawn field.
Try running the scene in the editor using Playback, or open Build Settings and click Build and Run to build to a device.
If something did not work, double check the steps above and compare your script to the one below.

The complete Depth_ScreenToWorldPosition script:

using Niantic.Lightship.AR.Utilities;
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class Depth_ScreenToWorldPosition : MonoBehaviour
{
    [SerializeField]
    private AROcclusionManager _occlusionManager;

    [SerializeField]
    private Camera _camera;

    [SerializeField]
    private GameObject _prefabToSpawn;

    private Matrix4x4 _displayMatrix;
    private XRCpuImage? _depthImage;
    private ScreenOrientation? _latestScreenOrientation;

    private void Update()
    {
        UpdateImage();
        UpdateDisplayMatrix();
        HandleTouch();
    }

    private void OnDestroy()
    {
        // Dispose the cached depth image
        _depthImage?.Dispose();
    }

    private void UpdateImage()
    {
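        // Skip if the occlusion subsystem is missing or not running
        if (_occlusionManager.subsystem == null || !_occlusionManager.subsystem.running)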
        {
            return;
        }

        if (_occlusionManager.TryAcquireEnvironmentDepthCpuImage(out var image))
        {
            // Dispose the old image
            _depthImage?.Dispose();

            // Cache the new image
            _depthImage = image;
        }
    }

    private void UpdateDisplayMatrix()
    {
        // Make sure we have a valid depth image
        if (_depthImage is {valid: true})
        {
            // The display matrix only needs to be recalculated if the screen orientation changes
            if (!_latestScreenOrientation.HasValue ||
                _latestScreenOrientation.Value != XRDisplayContext.GetScreenOrientation())
            {
                _latestScreenOrientation = XRDisplayContext.GetScreenOrientation();
                _displayMatrix = CameraMath.CalculateDisplayMatrix(
                    _depthImage.Value.width,
                    _depthImage.Value.height,
                    Screen.width,
                    Screen.height,
                    _latestScreenOrientation.Value,
                    invertVertically: true);
            }
        }
    }

    private void HandleTouch()
    {
        // In the editor we want to use mouse clicks, on phones we want touches.
#if UNITY_EDITOR
        if (Input.GetMouseButtonDown(0) || Input.GetMouseButtonDown(1) || Input.GetMouseButtonDown(2))
        {
            var screenPosition = new Vector2(Input.mousePosition.x, Input.mousePosition.y);
#else
        // Ignore frames without any touch
        if (Input.touchCount <= 0)
            return;
        var touch = Input.GetTouch(0);

        // Only count touches that just began
        if (touch.phase == UnityEngine.TouchPhase.Began)
        {
            var screenPosition = touch.position;
#endif
            // Do something with touches
            if (_depthImage is {valid: true})
            {
                // Sample eye depth
                var uv = new Vector2(screenPosition.x / Screen.width, screenPosition.y / Screen.height);
                var eyeDepth = _depthImage.Value.Sample<float>(uv, _displayMatrix);

                // Get world position
                var worldPosition =
                    _camera.ScreenToWorldPoint(new Vector3(screenPosition.x, screenPosition.y, eyeDepth));

                // Spawn a thing on the depth map
                Instantiate(_prefabToSpawn, worldPosition, Quaternion.identity);
            }
        }
    }
}

More Information

You can also try combining this How-To with Object Detection or Semantics to find where recognized things are in 3D space.

Placing Cubes with Depth and Object Detection
