ACM ISS 2023
Sun 5 - Wed 8 November 2023

Segmenting and determining the 3D bounding boxes of objects of interest in RGB videos is an important task for a variety of applications such as augmented reality, navigation, and robotics. Supervised machine learning techniques are commonly used for this, but they need training datasets: sets of images with associated 3D bounding boxes manually defined by human annotators using a labelling tool. However, precisely placing 3D bounding boxes can be difficult using conventional 3D manipulation tools on a 2D interface. To alleviate that burden, we propose a novel technique with which 3D bounding boxes can be created by simply drawing 2D bounding rectangles on multiple frames of a video sequence showing the object from different angles. The method uses reconstructed dense 3D point clouds from the video and computes tightly fitting 3D bounding boxes of desired objects selected by back-projecting the 2D rectangles. We show concrete application scenarios of our interface, including training dataset creation and editing 3D spaces and videos. An evaluation comparing our technique with a conventional 3D annotation tool shows that our method results in higher accuracy. We also confirm that the bounding boxes created with our interface have a lower variance, likely yielding more consistent labels and datasets.
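The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ideal pinhole camera with known per-frame poses, uses a tiny hand-made point cloud, and fits an axis-aligned box rather than the tightly fitting box the paper computes. All names (`Camera`, `select_points`, `axis_aligned_box`) are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Camera:
    """Hypothetical pinhole camera: x_cam = R @ x_world + t, no lens distortion."""
    fx: float
    fy: float
    cx: float
    cy: float
    R: Sequence[Sequence[float]]  # 3x3 world-to-camera rotation
    t: Sequence[float]            # world-to-camera translation

    def project(self, p):
        """Return pixel (u, v), or None if the point is behind the camera."""
        x, y, z = (sum(self.R[i][j] * p[j] for j in range(3)) + self.t[i]
                   for i in range(3))
        if z <= 0:
            return None
        return (self.fx * x / z + self.cx, self.fy * y / z + self.cy)


def select_points(points, views):
    """Keep points whose projection lies inside the 2D rectangle in every view.

    Each view is (camera, (u_min, v_min, u_max, v_max)). This is equivalent to
    intersecting the back-projected frusta of the drawn rectangles.
    """
    selected = []
    for p in points:
        inside_all = True
        for cam, (u0, v0, u1, v1) in views:
            uv = cam.project(p)
            if uv is None or not (u0 <= uv[0] <= u1 and v0 <= uv[1] <= v1):
                inside_all = False
                break
        if inside_all:
            selected.append(p)
    return selected


def axis_aligned_box(points):
    """Min and max corners of the selected points (a simplifying assumption)."""
    if not points:
        raise ValueError("no points selected")
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Toy scene: three object points near the origin plus two distractors.
cloud = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (-0.5, -0.5, -0.5),
         (0.0, 0.0, 3.0), (3.0, 0.0, 0.0)]

# Front view looks along world +z; side view looks along world +x.
front = Camera(100, 100, 100, 100,
               R=[[1, 0, 0], [0, 1, 0], [0, 0, 1]], t=[0, 0, 5])
side = Camera(100, 100, 100, 100,
              R=[[0, 0, -1], [0, 1, 0], [1, 0, 0]], t=[0, 0, 5])

rect = (80, 80, 120, 120)  # same 2D rectangle drawn in both frames
selected = select_points(cloud, [(front, rect), (side, rect)])
box = axis_aligned_box(selected)
print(box)
```

In this toy scene the point at (0, 0, 3) falls inside the rectangle in the front view and is rejected only by the side view, which is why rectangles from more than one viewpoint are needed to bound the object in depth.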

Wed 8 Nov

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:15
Session 6: Immersion, Audio, and Multimedia (Papers) at Schenley Ballroom
Chair(s): Tigmanshu Bhatnagar
11:00
25m
Talk
Embodied Provenance for Immersive Sensemaking
Papers
A: Yidan Zhang Monash University, A: Barrett Ens Monash University, A: Kadek Satriadi Monash University, A: Ying Yang Monash University, A: Sarah Goodwin Monash
11:25
25m
Talk
Hum-ble Beginnings: Developing Touch- and Proximity-Input-Based Interfaces for Zoo-Housed Giraffes’ Audio Enrichment (Honorable Mention)
Papers
A: Alana Grant University of Glasgow, A: Vilma Kankaanpää University of Glasgow, A: Ilyena Hirskyj-Douglas University of Glasgow
11:50
25m
Talk
Interactive 3D Annotation of Objects in Moving Videos from Sparse Multi-view Frames
Papers
A: Kotaro Oomori The University of Tokyo, A: Wataru Kawabe The University of Tokyo, A: Fabrice Matulic Preferred Networks, A: Takeo Igarashi The University of Tokyo, A: Keita Higuchi Preferred Networks