Semantics
Semantic segmentation is the process of assigning class labels to specific regions in an image. This richer understanding of the environment enables a variety of creative AR features. For example, an AR pet could identify ground to run along, AR planets could fill the sky, the real-world ground could turn into AR lava, and more!
ARDK 3 provides this feature through an ARSemanticSegmentationManager
that implements a new XR Semantics subsystem. ARSemanticSegmentationManager
serves semantic predictions as a buffer of unsigned integers for each pixel of the depth map. Each of a pixel’s 32 bits correspond to a semantic class and is either enabled (value is 1) or disabled (value is 0), depending on whether a part of an object in that class exists at that pixel. The ARDK also provides a normalized (values between 0 and 1) confidence map for each category that applications can query individually.
Each pixel can have more than one class label, e.g. ground
and natural_ground
.
Available Semantic Channels
The following table lists the current set of semantic channels. Because the ordering of channels in this list may change with new versions of ARDK, we recommend that you use names rather than indices in your app. Use LightshipSemanticsSubsystem.GetChannelNames
to verify names at runtime.
Index | Channel Name | Notes |
---|---|---|
[0] | sky | Includes clouds. Does not include fog. |
[1] | ground | Includes everything in natural_ground and artificial_ground . Ground may be more reliable than the combination of the two where there is ambiguity about natural vs artificial. |
[2] | natural_ground | Includes dirt, grass, sand, mud, and other organic / natural ground. Ground with heavy vegetation or foliage may be detected as foliage instead. |
[3] | artificial_ground | Includes roads, sidewalks, tracks, carpets, rugs, flooring, paths, gravel, and some playing fields. |
[4] | water | Includes rivers, seas, ponds, lakes, pools, waterfalls, some puddles. Does not include drinking water, water in cups, bowls, sinks, baths. Water with strong reflections may be detected as what is in the reflection instead of as water. |
[5] | person | Includes body parts, hair, and clothes worn. Does not include accessories or carried objects. Does not distinguish between individuals. Mannequins, toys, statues, paintings, and other artistic expressions of a person are not considered a “person”, though some detections may occur. Photorealistic images of a person are detected as a “person”. Model performance may suffer when a person is partially visible in the image or is in an unusual body posture, such as when crouching or widely spreading their arms. |
[6] | building | Includes residential and commercial buildings, both modern and traditional. Should not be considered synonymous with walls. |
[7] | foliage | Includes bushes, shrubs, leafy parts and trunks of trees, potted plants, and flowers. |
[8] | grass | Grassy ground, e.g. lawns, rather than tall grass. |
Experimental Semantic Channels
In addition to these standard channels, Lightship also offers a set of experimental semantic channels. For more information, see Experimental Semantic Channels.
Example of Usage
In this example, the semantic segmentation system detects the sky as sky
and sets index[0]
to 1, then detects the ground as ground
and sets index[1]
to 1, as shown in the array diagrams.