<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="x-ua-compatible" content="ie=edge">
<title>MVImgNet</title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- <base href="/"> -->
<!--FACEBOOK-->
<!-- <meta property="og:image" content="https://jonbarron.info/mipnerf/img/rays_square.png"> -->
<meta property="og:image:type" content="image/png">
<meta property="og:image:width" content="682">
<meta property="og:image:height" content="682">
<meta property="og:type" content="website" />
<meta property="og:url" content="https://jonbarron.info/mipnerf/"/>
<meta property="og:title" content="MVImgNet" />
<meta property="og:description" content="Project page for MVImgNet: A Large-scale Dataset of Multi-view Images." />
<!--TWITTER-->
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="MVImgNet" />
<meta name="twitter:description" content="Project page for MVImgNet: A Large-scale Dataset of Multi-view Images." />
<meta name="twitter:image" content="" />
<!-- <link rel="apple-touch-icon" href="apple-touch-icon.png"> -->
<!-- <link rel="icon" type="image/png" href="img/seal_icon.png"> -->
<!-- Place favicon.ico in the root directory -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.4.0/css/font-awesome.min.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.8.0/codemirror.min.css">
<link rel="stylesheet" href="css/bootstrap.min.css">
<link rel="stylesheet" href="css/app.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/js/bootstrap.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.8.0/codemirror.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/1.5.3/clipboard.min.js"></script>
<script src="js/jquery-3.6.3.min.js"></script>
<script src="js/app.js"></script>
</head>
<body>
<div class="container" id="main">
<div class="row">
<h2 class="col-md-12 text-center">
<b>MVImgNet: A Large-scale Dataset of Multi-view Images</b> <br>
<small>
CVPR 2023
</small>
</h2>
</div>
<div class="row">
<div class="col-md-12 text-center">
<ul class="list-inline">
<li>
<a href="mailto:[email protected]">
Xianggang Yu <sup style="font-size:small">*</sup>
</a>
</li>
<li>
<a href="https://mutianxu.github.io/">
Mutian Xu <sup style="font-size:small">*</sup>
</a>
</li>
<li>
<a href="mailto:[email protected]">
Yidan Zhang <sup style="font-size:small">*</sup>
</a>
</li>
<li>
<a href="mailto:[email protected]">
Haolin Liu <sup style="font-size:small">*</sup>
</a>
</li>
<li>
<a href="mailto:[email protected]">
Chongjie Ye <sup style="font-size:small">*</sup>
</a>
</li><br>
<li>
<a href="mailto:[email protected]">
Yushuang Wu
</a>
</li>
<li>
<a href="mailto:[email protected]">
Zizheng Yan
</a>
</li>
<li>
<a href="maito:[email protected]">
Chenming Zhu
</a>
</li>
<li>
<a href="mailto:[email protected]">
Zhangyang Xiong
</a>
</li>
<li>
<a href="mailto:[email protected]">
Tianyou Liang
</a>
</li><br>
<li>
<a href="https://guanyingc.github.io/">
Guanying Chen
</a>
</li>
<li>
<a href="mailto:[email protected]">
Shuguang Cui
</a>
</li>
<li>
<a href="https://gaplab.cuhk.edu.cn/">
Xiaoguang Han <sup>†</sup>
</a>
</li>
</ul>
<span id="cuhksz"><a href="https://gaplab.cuhk.edu.cn/" class='gap'>GAP Lab, The Chinese University of Hong Kong, Shenzhen</a> </span>
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<image src="img/teaser.png" class="teaser_img" id="teaser" alt="overview"><br>
</div>
</div>
<!--
<div class="row">
<div class="col-md-4 col-md-offset-4 text-center">
<ul class="nav nav-pills nav-justified">
<li>
<a href="https://arxiv.org/abs/2103.13415">
<image src="img/mip_paper_image.jpg" height="60px">
<h4><strong>Paper</strong></h4>
</a>
</li>
<li>
<a href="https://youtu.be/EpH175PY1A0">
<image src="img/youtube_icon.png" height="60px">
<h4><strong>Video</strong></h4>
</a>
</li>
<li>
<a href="https://github.com/google/mipnerf">
<image src="img/github.png" height="60px">
<h4><strong>Code</strong></h4>
</a>
</li>
</ul>
</div>
</div> -->
<div class="abstract">
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Abstract
</h3>
<p class="text-justify">
Being data-driven is one of the most iconic properties of deep learning algorithms. The birth of <a href="https://image-net.org/" style="font-weight:initial">ImageNet</a> drove a remarkable trend of "learning from large-scale data" in computer vision. Pretraining on ImageNet to obtain rich universal representations has been shown to benefit various 2D visual tasks and has become a standard in 2D vision. However, due to the laborious collection of real-world 3D data, there is still no generic dataset serving as a counterpart of ImageNet in 3D vision, so how such a dataset could impact the 3D community remains largely unexplored. To remedy this, we introduce <span><strong>MVImgNet</strong></span>, a large-scale dataset of multi-view images, which is highly convenient to collect by shooting videos of real-world objects in daily life. It contains <span><strong>6.5 million</strong></span> frames from <span><strong>219,188</strong></span> videos covering objects from <span><strong>238</strong></span> classes, with rich annotations of object masks, camera parameters, and point clouds. The multi-view attribute endows our dataset with 3D-aware signals, making it a soft bridge between 2D and 3D vision. <br> <br>
We conduct pilot studies to probe the potential of MVImgNet on a variety of 3D and 2D visual tasks, including radiance field reconstruction, multi-view stereo, and view-consistent image understanding, where MVImgNet demonstrates promising performance while leaving many possibilities open for future exploration. <br> <br>
Besides, via dense reconstruction on MVImgNet, a 3D object point cloud dataset, called <span><strong>MVPNet</strong></span>, is derived, covering <span><strong>80,000</strong></span> samples from <span><strong>150</strong></span> categories, with a class label for each point cloud. Experiments show that MVPNet can benefit real-world 3D object classification while posing new challenges to point cloud understanding. <br> <br>
MVImgNet and MVPNet will be publicly available soon, and we hope they will inspire the broader vision community.
</p>
</div>
</div>
</div>
<div class="taxonomy">
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Dataset -- MVImgNet
</h3>
<h4>
Statistics
</h4>
<p class="text-justify">
The statistics of MVImgNet are shown in <a href="#table_stat">Tab. 1</a>. MVImgNet includes 238 object classes, comprising 6.5 million frames from 219,188 videos. <a href="#part_data">Fig. 1</a> shows some frames randomly sampled from MVImgNet. The annotations comprehensively cover object masks, camera parameters, and point clouds.
</p>
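<p class="text-justify">
As an illustration of how the per-capture annotations (frames plus object masks) might be consumed, a minimal loading sketch is given below. The directory layout and file names used here are hypothetical, chosen only for this example, and may differ from the released data format.
</p>
<pre><code>
# Hypothetical sketch: pair each frame of one multi-view capture with its object mask.
# The images/ and masks/ layout and the file naming are assumptions for illustration only.
from pathlib import Path

def load_capture(capture_dir: str):
    """Collect (image, mask) file pairs for a single multi-view capture."""
    root = Path(capture_dir)
    frames = []
    for image_path in sorted((root / "images").glob("*.jpg")):
        mask_path = root / "masks" / (image_path.stem + ".png")  # assumed naming
        frames.append({
            "image": image_path,
            "mask": mask_path if mask_path.exists() else None,
        })
    return frames

if __name__ == "__main__":
    frames = load_capture("MVImgNet/category_0/capture_0000")  # hypothetical path
    print(f"Found {len(frames)} frames")
</code></pre>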
<div class="stat_img">
<figure>
<image src="img/dataset_stat.png" class="img-responsive" id="table_stat">
<figcaption>
<span class="boldt">Tab. 1:</span> The statistic of the data amount generated from the pipeline, and valid amount after our cleaning, also the GPU hours for the processing.
</figcaption>
</figure>
<figure>
<image src="img/dspano_lowd.png" class="img-responsive" id="part_data"></image>
<figcaption>
<span class="boldt">Fig.1:</span> A variety of multi-view images in MVImgNet.
</figcaption>
</figure>
<!-- <figure style="clear:both;"></figure> -->
</div>
<h4>
Category taxonomy
</h4>
<figure>
<div id="tox_mvi"></div>
<figcaption>
<span class="boldt">Fig.2:</span> Taxonomy figure of the MVImgNet, where the angle of each class denotes its actual data proportion. <span class="boldt">Interior:</span> Parent class. <span class="boldt">Exterior:
</span> Children class.
<!-- Hover on the sector introduces the exact amount and ratio of the category. -->
</figcaption>
</figure>
</div>
</div>
</div>
<div class="taxonomy">
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Dataset -- MVPNet
</h3>
<p class="text-justify">
Derived from the dense reconstruction on MVImgNet (as mentioned above), a new large-scale real-world 3D object point cloud dataset, MVPNet, is obtained, containing 80,000 point clouds across 150 categories. As listed in <a href="#mvp_table">Tab. 2</a>, compared with existing 3D object datasets, MVPNet contains a conspicuously larger number of real-world object point clouds, with abundant categories covering many common objects in real life. The category distribution of MVPNet is shown in <a href="#mvp_cls">Fig. 4</a>.
</p>
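<p class="text-justify">
To make the classification use case concrete, the sketch below shows one way a labeled MVPNet point cloud could be prepared for a point-cloud classifier (random sub-sampling, centering, and scale normalization). The .npy file format assumed here is hypothetical and only for illustration.
</p>
<pre><code>
# Hypothetical sketch: load one MVPNet sample (point cloud + class label) and
# normalize it for a point-cloud classifier. The .npy layout is an assumption.
import numpy as np

def prepare_sample(points_path: str, label: int, num_points: int = 1024):
    """Return (num_points, 3) normalized xyz coordinates and the class label."""
    points = np.load(points_path)                 # assumed shape: (N, 3) or (N, 6)
    xyz = points[:, :3].astype(np.float32)
    choice = np.random.choice(len(xyz), num_points, replace=(num_points > len(xyz)))
    xyz = xyz[choice]
    xyz -= xyz.mean(axis=0)                       # center at the origin
    xyz /= np.max(np.linalg.norm(xyz, axis=1))    # scale into the unit sphere
    return xyz, label
</code></pre>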
<div class="panojs">
<div>
<figure>
<image src="img/mvp_compare_table.png" class="img-responsive" id="mvp_table"></image>
<figcaption>
<span class="boldt">Tab.2:</span> Comparison between MVPNet and existing datasets
of 3D object point clouds.
<!-- Hover on the sector introduces the exact amount and ratio of the category. -->
</figcaption>
</figure>
</div>
<div>
<figure id="fig_panopc">
<image src="img/pcpano_lowd.png" class="img-responsive" id="panopc"></image>
<figcaption>
<span class="boldt">Fig.3:</span> Some dense reconstructed 3D point clouds sampled from MVPNet
<!-- Hover on the sector introduces the exact amount and ratio of the category. -->
</figcaption>
</figure>
</div>
<div>
<figure id="mvp_tax_js">
<div id="mvp_cls"></div>
<figcaption>
<span class="boldt">Fig.4:</span> MVPNet categories.
<!-- Hover on the sector introduces the exact amount and ratio of the category. -->
</figcaption>
</figure>
</div>
<div style="clear:both;"></div>
</div>
</div>
</div>
</div>
<div class="row">
<div class="col-md-8 col-md-offset-2">
<h3>
Video
</h3>
<div class="text-center">
<!-- <div style="position:relative;padding-top:56.25%;"> -->
<iframe src="//player.bilibili.com/player.html?aid=865356102&bvid=BV1o54y1M7Ye&cid=1034904470&page=1" allowfullscreen="allowfullscreen" width="100%" height="500" scrolling="no" frameborder="0" sandbox="allow-top-navigation allow-same-origin allow-forms allow-scripts" > </iframe>
<!-- </div> -->
</div>
</div>
</div>
</div>
<script src="js/echarts.min.js"></script>
<script type="text/javascript" src="js/tax.js"></script>
<script type="text/javascript" src="js/mvp_cls.js"></script>
</body>
</html>