OmniObject3D

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation

CVPR 2023 (Award Candidate)

Tong Wu¹,², Jiarui Zhang¹,³, Xiao Fu¹, Yuxin Wang¹,⁴, Jiawei Ren⁵, Liang Pan⁵,
Wayne Wu¹, Lei Yang¹,³, Jiaqi Wang¹, Chen Qian¹, Dahua Lin¹,²✉, Ziwei Liu⁵✉

¹Shanghai Artificial Intelligence Laboratory, ²The Chinese University of Hong Kong, ³SenseTime Research,
⁴Hong Kong University of Science and Technology, ⁵S-Lab, Nanyang Technological University


Abstract

We propose OmniObject3D, a large-vocabulary 3D object dataset with a massive collection of high-quality real-scanned 3D objects, to facilitate the development of 3D perception, reconstruction, and generation in the real world.

OmniObject3D has several appealing properties:
1) Large Vocabulary: It comprises 6,000 scanned objects in 190 daily categories, sharing common classes with popular 2D datasets (e.g., ImageNet and LVIS), benefiting the pursuit of generalizable 3D representations.
2) Rich Annotations: Each 3D object is captured with both 2D and 3D sensors, providing textured meshes, point clouds, multi-view rendered images, and multiple real-captured videos.
3) Realistic Scans: The professional scanners support high-quality object scans with precise shapes and realistic appearances.

With the vast exploration space offered by OmniObject3D, we carefully set up four evaluation tracks: a) robust 3D perception, b) novel-view synthesis, c) neural surface reconstruction, and d) 3D object generation.

Statistics and Distribution

Dataset	Real	Full 3D	Video	#Objects	#Classes	R^LVIS(%)
ShapeNet		✓		51k	55	4.1
ModelNet		✓		12k	40	2.4
3D-Future		✓		16k	34	1.3
ABO		✓		8k	63	3.5
Toys4K		✓		4k	105	7.7
CO3D	✓		✓	19k	50	4.2
DTU	✓	✓		124	-	0
ScanObjectNN	✓			15k	15	1.3
GSO	✓	✓		1k	17	0.9
AKB-48	✓	✓		2k	48	1.8
Ours	✓	✓	✓	6k	190	10.8

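Table 1. A comparison between OmniObject3D and other commonly used 3D object datasets. R^LVIS (%) is the percentage of LVIS categories that a dataset covers. OmniObject3D is the largest among the real-scanned datasets that provide full 3D models, and it is the only one listed that combines real scans, full 3D models, and videos.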
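The comparison in Table 1 can be checked mechanically. The sketch below transcribes the table rows into Python tuples and filters them; the tuple layout and the function names are illustrative assumptions, not part of any released OmniObject3D tooling, and object counts are the rounded figures shown in the table.

```python
# Rows transcribed from Table 1:
# (name, real, full_3d, video, num_objects, num_classes, r_lvis_percent)
# Counts such as "51k" are rounded in the table; DTU lists no class count (None).
TABLE_1 = [
    ("ShapeNet",     False, True,  False, 51_000, 55,   4.1),
    ("ModelNet",     False, True,  False, 12_000, 40,   2.4),
    ("3D-Future",    False, True,  False, 16_000, 34,   1.3),
    ("ABO",          False, True,  False,  8_000, 63,   3.5),
    ("Toys4K",       False, True,  False,  4_000, 105,  7.7),
    ("CO3D",         True,  False, True,  19_000, 50,   4.2),
    ("DTU",          True,  True,  False,    124, None, 0.0),
    ("ScanObjectNN", True,  False, False, 15_000, 15,   1.3),
    ("GSO",          True,  True,  False,  1_000, 17,   0.9),
    ("AKB-48",       True,  True,  False,  2_000, 48,   1.8),
    ("Ours",         True,  True,  True,   6_000, 190, 10.8),
]


def largest_real_full_3d(rows):
    """Name of the real-scanned dataset with full 3D models and the most objects."""
    candidates = [r for r in rows if r[1] and r[2]]
    return max(candidates, key=lambda r: r[4])[0]


def widest_lvis_coverage(rows):
    """Name of the dataset with the highest R^LVIS value."""
    return max(rows, key=lambda r: r[6])[0]


def has_all_modalities(rows):
    """Names of datasets that are real, full 3D, and include video."""
    return [r[0] for r in rows if r[1] and r[2] and r[3]]
```

Note that CO3D and ScanObjectNN list more objects than OmniObject3D, but neither is marked as providing full 3D models, which is why the filter on both flags matters.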

Figure 1. Semantic distribution of our dataset.

Benchmarks

Robust 3D Perception

OmniObject3D boosts robustness analysis of point cloud classification by disentangling the two critical out-of-distribution (OOD) challenges introduced in the paper, i.e., OOD styles and OOD corruptions.

Figure 2. Analysis on robustness to OOD styles and OOD corruptions.
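To make the idea of an OOD corruption concrete, here is a minimal sketch of two perturbations that are commonly applied to point clouds when testing classifier robustness: Gaussian jitter and random point dropping. These particular corruptions, the function names, and the default parameters are assumptions chosen for illustration; the corruption set actually used in the benchmark is defined in the paper.

```python
import random


def jitter(points, sigma=0.01, seed=0):
    """Add zero-mean Gaussian noise with std `sigma` to every coordinate."""
    rng = random.Random(seed)
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in points]


def drop_points(points, keep_ratio=0.5, seed=0):
    """Keep a random subset of the points (at least one point is kept)."""
    rng = random.Random(seed)
    k = max(1, int(len(points) * keep_ratio))
    return rng.sample(points, k)


# A toy "clean" cloud: the 8 corners of the unit cube.
clean = [(float(x), float(y), float(z))
         for x in (0, 1) for y in (0, 1) for z in (0, 1)]
noisy = jitter(clean, sigma=0.02)
sparse = drop_points(clean, keep_ratio=0.5)
```

A robustness study would then compare a classifier's accuracy on `clean` inputs against its accuracy on inputs such as `noisy` and `sparse`, separately from any shift in object style.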

Novel View Synthesis

We study several representative methods on OmniObject3D for novel view synthesis (NVS) in two settings: 1) training on a single scene with densely captured images and 2) learning priors across scenes from the dataset to explore the generalization ability of NeRF-style models. We show examples of single-scene NVS by Mip-NeRF.

We show examples of cross-scene NVS by pixelNeRF, MVSNeRF, and IBRNet given 3 views (ft denotes fine-tuned with 13 views).

Neural Surface Reconstruction

Precise surface reconstruction from multi-view images enables a broad range of applications. We include representative methods for both dense-view and sparse-view surface reconstruction. We show examples of dense-view surface reconstruction by NeuS. More results on the sparse-view setting can be found in the paper.
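This page does not state how reconstruction quality is measured. One widely used choice for comparing a reconstructed surface with a scanned reference is the Chamfer distance between points sampled from both. The sketch below is a brute-force version under that assumption; the exact variant (squared or unsquared distances, summed or averaged directions) differs between papers and is not taken from this page.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty 3D point sets.

    Each direction averages the Euclidean distance from every point in one
    set to its nearest neighbour in the other; the two directions are summed.
    Brute force, O(len(a) * len(b)), so only suitable for small point sets.
    """
    def one_sided(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_sided(a, b) + one_sided(b, a)


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]
score = chamfer_distance(reference, shifted)
```

A lower value indicates that the two point sets lie closer together; identical sets give zero.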

3D Object Generation

State-of-the-art generative models can directly generate textured 3D meshes. We train GET3D on OmniObject3D and show examples of the generated shapes.

Concurrent Works

Some concurrent works also focus on building large-scale 3D object datasets:

Objaverse is a massive dataset with 800K+ annotated 3D objects collected from Sketchfab.
ScanNeRF provides an effective pipeline for scanning real objects in quantity with little effort, aimed at evaluating neural rendering frameworks.

BibTeX

@inproceedings{wu2023omniobject3d,
  author    = {Tong Wu and Jiarui Zhang and Xiao Fu and Yuxin Wang and Jiawei Ren and Liang Pan and Wayne Wu and Lei Yang and Jiaqi Wang and Chen Qian and Dahua Lin and Ziwei Liu},
  title     = {OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2023}
}