:wave: Welcome to the 2nd Workshop and Challenge on Computer Vision In The Built Environment For The Design, Construction and Operation of Buildings organized at :wave: cvpr2022

Building on the success of the 1st workshop, the 2nd Workshop on Computer Vision in the Built Environment continues to connect the domains of Architecture, Engineering, and Construction (AEC) with that of Computer Vision by establishing a common ground for interaction and identifying shared research interests. Specifically, this workshop focuses on the as-is semantic status of built environments and the changes that take place within them over time. These topics will be presented through the dual lens of Computer Vision and AEC and Facility Management (AEC-FM), highlighting the limitations and bottlenecks related to developing applications for this specific domain. The objective is for attendees to learn more about AEC-FM and the variety of real-world problems that, if solved, could have a tangible impact on this multi-trillion-dollar industry as well as the overall quality of life across the globe.

The workshop will begin by establishing ways to acquire the as-is status of a space in a granular and hierarchical way: some of the speakers are experts in acquiring the spatial layout, whereas others focus on object categories and their attributes. Building on this static scene understanding, we introduce the impact of time, as change that is either explicitly observed (a human interacting with an object) or implicitly inferred (capturing the as-is status of a scene at different timestamps). The combination of static and dynamic understanding of 3D scenes is at the core of the AEC-FM industry, yet it is currently missing. One example is that architects typically design living spaces without any feedback from their previous designs. Another example is that 5-12% of non-estimated construction cost (which corresponds annually to billions of dollars in the US alone) is due to rework that originates from misinterpretation of design documents and the dynamically changing environment of construction sites.

To further establish connections between the two domains and identify what we can do right now and what is still hard to solve, we will host the 2nd International Scan-to-BIM competition, which targets acquiring the semantic as-is status of buildings given their 3D point clouds. Specifically, we will focus on the tasks of floorplan reconstruction and 3D building model reconstruction and present appropriate interdisciplinary metrics for evaluating them. Last year we observed that a large gap remains before these problems can be considered solved and actually meet the needs of practitioners. We regard this workshop as the ideal environment for understanding the challenges and steps forward, given that it brings together the research and practitioner communities from multiple disciplines.

The workshop will therefore consist of two parts: invited keynote talks and a Scan-to-BIM challenge.


:hourglass_flowing_sand: Important Dates

NOTE: The submission/release times are 00:00:00 UTC on the specified date.


:calendar: Schedule

The workshop took place on 19 June 2022 from 09:00 to 18:00. The recording can be found here.

NOTE: Times are shown in Central Standard Time. Please take this into account if joining the workshop virtually.

| Time (PDT) | Duration | Event |
|---|---|---|
| 09:00 - 09:30 | 30 mins | Introduction To The Workshop & Challenge |
| 09:30 - 10:00 | 30 mins | Burcu Akinci – Lessons learned from decades of research in utilizing computer vision to support construction and infrastructure management |
| 10:00 - 10:30 | 30 mins | Angela Dai – Learning from Synthetic 3D Priors for Real-World 3D Perception |
| 10:30 - 11:15 | 45 mins | Winner Presentations, 2D Floorplan Reconstruction |
| 11:15 - 11:30 | 15 mins | Coffee Break |
| 11:30 - 12:00 | 30 mins | Shirley Dyke – Applying Machine Learning to Support Disaster Reconnaissance |
| 12:00 - 12:30 | 30 mins | Siyu Tang – Human Motion Capture and Synthesis in 3D Scenes |
| 12:30 - 13:15 | 45 mins | Winner Presentations, 3D Building Model Reconstruction |
| 13:15 - 14:15 | 60 mins | Lunch Break |
| 14:15 - 15:00 | 45 mins | Community Engagement |
| 15:00 - 15:30 | 30 mins | Chen Feng – Weakly and Self Supervised Robot Perception: from Scene Understanding to Mobile Construction in AEC |
| 15:30 - 16:00 | 30 mins | Thomas Funkhouser – Neural Scene Representations in Urban Environments |
| 16:00 - 16:30 | 30 mins | Federico Tombari – 3D scene understanding with scene graphs and self-supervision for AR and indoor design |
| 16:30 - 17:00 | 30 mins | Coffee Break |
| 17:00 - 17:45 | 45 mins | Panel Discussion |
| 17:45 - 18:00 | 15 mins | Concluding Remarks |

:microphone: Keynote Speakers

Burcu Akinci
Professor, CEE
CMU

Angela Dai
Professor, CS
TU Munich

Shirley J. Dyke
Professor, ME & CEE
Purdue

Chen Feng
Professor, ME & CEE
NYU

Thomas Funkhouser
Senior Research Scientist
Google

Siyu Tang
Professor, CS
ETHZ

Federico Tombari
Senior Staff Research Scientist and Manager
Google

Burcu Akinci is Paul Christiano Professor of Civil & Environmental Engineering at Carnegie Mellon University and a member of the National Academies of Construction. She previously served as Associate Dean for Research for the College of Engineering and Director of Engineering Research Accelerator at Carnegie Mellon. She earned a bachelor’s degree in civil engineering from the Middle East Technical University (Ankara, Turkey), an MBA from Bilkent University (Ankara, Turkey), and Master’s and PhD degrees in Civil and Environmental Engineering with a specialization in Construction Engineering and Management from Stanford University. Dr. Akinci’s research focuses on the use and integration of building information models with data capture technologies, such as 3D imaging and embedded sensors, to create digital twins of construction projects and infrastructure operations, and to develop approaches that support proactive and predictive operations and management.

Angela Dai is an Assistant Professor at the Technical University of Munich. Her research focuses on understanding how the 3D world around us can be modeled and semantically understood, leveraging generative deep learning towards enabling understanding and interaction with real-world 3D/4D scenes for content creation and virtual or robotic agents. Previously, she received her PhD in computer science from Stanford in 2018 and her BSE in computer science from Princeton in 2013. Her research has been recognized through a ZDB Junior Research Group Award, an ACM SIGGRAPH Outstanding Doctoral Dissertation Honorable Mention, as well as a Stanford Graduate Fellowship.

Shirley J. Dyke holds a joint appointment in Mechanical Engineering and Civil Engineering at Purdue University. She is the Director of Purdue’s Intelligent Infrastructure Systems Lab and the Director of the NASA-funded Resilient ExtraTerrestrial Habitat Institute. Dyke is the Editor-in-Chief of the journal Engineering Structures. Her research focuses on “intelligent” structures, and her innovations encompass structural health monitoring and machine learning for structural damage assessment and reconnaissance support. She received a B.S. in Aeronautical and Astronautical Engineering from the University of Illinois, Champaign-Urbana in 1991 and a Ph.D. in Civil Engineering from the University of Notre Dame in 1996. She was awarded the Presidential Early Career Award for Scientists and Engineers from NSF (1998), the International Association on Structural Safety and Reliability Junior Research Award (2001) and the ANCRiSST Young Investigator Award (2006).

Chen Feng is an assistant professor at NYU, appointed across departments including civil and mechanical engineering, and computer science. His lab AI4CE (pronounced as A-I-force) aims to advance robot vision and machine learning through multidisciplinary use-inspired research that originates from civil/mechanical engineering domains. Before NYU, Chen was a research scientist in the computer vision group at Mitsubishi Electric Research Labs (MERL) in Cambridge, MA, focusing on localization, mapping, and deep learning for self-driving cars and robotics. Chen holds a Bachelor’s degree in geospatial engineering from Wuhan University in China, and a master’s degree in electrical engineering and a Ph.D. in civil engineering, both from the University of Michigan at Ann Arbor.

Thomas Funkhouser is a Senior Research Scientist at Google and the David M. Siegel Professor, Emeritus, at the CS Department, Princeton University. Thomas joined Princeton University in 1998 as an assistant professor. He became an associate professor in 2003 and a full professor in 2009. Before coming to Princeton, he worked for four years on the technical staff at Bell Laboratories. He holds a Ph.D. in computer science from the University of California, Berkeley (1993), a Master’s in computer science from UCLA, and a Bachelor’s in biological sciences from Stanford. Among Professor Funkhouser’s honors and awards are the ACM SIGGRAPH Computer Graphics Achievement Award (2014), Sloan Foundation Fellowship (1999), and National Science Foundation Career Award (2000).

Siyu Tang has been an assistant professor at ETH Zürich in the Department of Computer Science since January 2020. She received an early career research grant to start her own research group at the Max Planck Institute for Intelligent Systems in November 2017. She was a postdoctoral researcher at the same institute, advised by Dr. Michael Black. She finished her PhD at the Max Planck Institute for Informatics and Saarland University in 2017, under the supervision of Professor Bernt Schiele. Before that, she received her Master’s degree in Media Informatics at RWTH Aachen University, advised by Prof. Bastian Leibe, and her Bachelor’s degree in Computer Science at Zhejiang University, China. She has received several awards for her research, including the Best Paper Award at BMVC 2012 and 3DV 2020, Best Paper Award Finalist at CVPR 2021, an ELLIS PhD Award and a DAGM-MVTec Dissertation Award.

Federico Tombari is Senior Staff Research Scientist and Manager at Google where he leads an applied research team in computer vision and ML. He is also a Lecturer (PrivatDozent) at the Technical University of Munich (TUM). He has 230+ peer-reviewed publications in CV/ML and applications to robotics, autonomous driving, healthcare and augmented reality. He got his PhD from the University of Bologna and his Habilitation from TUM. In 2018 he was co-founder and managing director of Pointu3D, a startup then acquired by Google. He regularly serves as Chair and Associate Editor for international conferences and journals (RA-L, ECCV18/22, IROS20/21/22, ICRA20/22, 3DV19/20/21 among others). He was the recipient of two Google Faculty Research Awards, one Amazon Research Award, 5 Outstanding Reviewer Awards (3x CVPR, ICCV21, NeurIPS 2021).

:checkered_flag: Challenge

The workshop will host the 2nd International Scan-to-BIM challenge. The challenge will include the following tasks:

I. 2D Floorplan Reconstruction
II. 3D Building Model Reconstruction

[GitHub] — [2D Challenge] — [3D Challenge]

2D Floor Plan Reconstruction

The 2D Floorplan Reconstruction challenge contains a total of 31 buildings with multiple floors each and dozens of rooms on each floor. Of these, 20 buildings are designated as the training set, with a total of 49 point clouds. The validation and testing sets contain 5.5 buildings with 21 point clouds each. For each model, there is an aligned point cloud in LAZ format. For the training and validation sets, a corresponding floorplan aligned with the coordinate system of the point cloud is also provided. The challenge data and evaluation code can be found in this GitHub repository. The submission should be made in the same JSON format as the provided ground truth. We include metrics to evaluate the reconstruction of the walls, doors, and columns, as well as floor area in 2D:

  1. Geometric Metrics
    a. IoU of each room (a room is defined as a completely separated area with walls and doors).
    b. Accuracy of endpoints: Precision/Recall at 3 different thresholds (5cm, 10cm, and 20cm), as well as the F-measure at each threshold, will be evaluated in the coordinate system of the point cloud. The provided endpoints will be matched with the Hungarian algorithm to the point cloud, and every matched point that is within the given threshold will count as a match.
    c. Orientation: For each line matched between the prediction and the ground truth, we will compute the cosine similarity between the two lines as the normalized dot product. If a line is not matched, its cosine metric will be zero. Finally, the metric will be averaged over all the ground truth lines.

  2. Topological Metrics
    a. Warping error: The warping error will first warp the predicted floorplan to the ground truth with a homotopic deformation, and then compute the pixels that cannot match after the deformation.
    b. Betti number error: The Betti number error will compare the Betti numbers between the prediction and the ground truth and output the absolute value of the difference.
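
To make the Betti number error more concrete, here is a minimal sketch that is not the official evaluation code. It assumes that the floorplan has been rasterized into a binary grid in which wall pixels are 1 and everything else is 0, that foreground pixels are 8-connected and background pixels are 4-connected, and that the error is reported per Betti number. Under these assumptions, the 0th Betti number counts connected wall components and the 1st counts enclosed regions (rooms). The function names are hypothetical.

```python
def count_components(grid, value, diagonal):
    """Count connected regions of cells equal to `value` using flood fill."""
    rows, cols = len(grid), len(grid[0])
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if diagonal:  # 8-connectivity adds the four diagonal neighbours
        steps += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or (r, c) in seen:
                continue
            count += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return count


def betti_numbers(mask):
    """Return (b0, b1) of a non-empty binary raster (1 = wall, 0 = free)."""
    cols = len(mask[0])
    # Pad with background so the outside forms a single background region.
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in mask]
              + [[0] * (cols + 2)])
    b0 = count_components(padded, 1, diagonal=True)
    # Every background region except the outside one is an enclosed hole.
    b1 = count_components(padded, 0, diagonal=False) - 1
    return b0, b1


def betti_error(pred_mask, gt_mask):
    """Absolute difference of each Betti number (assumed error definition)."""
    p0, p1 = betti_numbers(pred_mask)
    g0, g1 = betti_numbers(gt_mask)
    return abs(p0 - g0), abs(p1 - g1)


# Ground truth has two rooms; the prediction misses the dividing wall.
gt = [[1, 1, 1, 1, 1],
      [1, 0, 1, 0, 1],
      [1, 1, 1, 1, 1]]
pred = [[1, 1, 1, 1, 1],
        [1, 0, 0, 0, 1],
        [1, 1, 1, 1, 1]]
print(betti_error(pred, gt))
```

In this toy case both rasters have a single wall component, but the prediction encloses one room instead of two, so only the 1st Betti number differs.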

3D Building Model Reconstruction

The training data consists of 11 floors from 7 buildings. For each model, there is an aligned point cloud in LAZ format. The 3D building coordinates for walls, columns and doors are presented in 3 separate JSON files. We focus on the reconstruction of walls, columns, and doors. The challenge data and evaluation code can be found in this GitHub repository. The submission should be made in the same JSON format as the provided ground truth. We evaluate the submissions on a variety of metrics:

  1. 3D IoU of the 3D bounding box of each wall
  2. Accuracy of the endpoints: Precision/Recall at 3 different thresholds (5cm, 10cm, and 20cm), as well as the F-measure, will be evaluated in the coordinate system of the point cloud. The provided endpoints will be matched with the Hungarian algorithm to the point cloud, and every matched point that is within the given threshold will count as a match. We evaluate each of the three semantic types (i.e., wall, column, door) separately.
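
The endpoint metric can be sketched as follows. This is an illustration under stated assumptions, not the official evaluation code: coordinates are taken to be in meters, predicted endpoints are matched one-to-one to ground truth endpoints by minimizing the total distance, and a matched pair counts as a hit when its distance is within the threshold. For brevity the optimal assignment is found here by brute-force enumeration, which is only practical for a handful of points; the Hungarian algorithm (for example SciPy's `linear_sum_assignment`) finds an assignment of the same minimum cost in polynomial time. The function names are hypothetical.

```python
from itertools import permutations
from math import dist


def match_endpoints(pred, gt):
    """Return (pred_index, gt_index) pairs minimizing the total distance.

    Brute-force stand-in for the Hungarian algorithm; tiny inputs only.
    """
    if not pred or not gt:
        return []
    swap = len(pred) > len(gt)
    small, large = (gt, pred) if swap else (pred, gt)
    best_cost, best_perm = float("inf"), ()
    # Try every way of assigning each point of the smaller set
    # to a distinct point of the larger set.
    for perm in permutations(range(len(large)), len(small)):
        cost = sum(dist(small[i], large[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    pairs = list(enumerate(best_perm))
    return [(j, i) for i, j in pairs] if swap else pairs


def endpoint_prf(pred, gt, threshold):
    """Precision, recall and F-measure of endpoints at one threshold."""
    pairs = match_endpoints(pred, gt)
    hits = sum(1 for i, j in pairs if dist(pred[i], gt[j]) <= threshold)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gt) if gt else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy example: three ground truth endpoints, four predicted ones.
gt_points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
pred_points = [(0.03, 0.0), (1.0, 0.08), (5.0, 5.0), (1.3, 1.0)]
for threshold in (0.05, 0.10, 0.20):  # 5cm, 10cm, 20cm in meters
    print(threshold, endpoint_prf(pred_points, gt_points, threshold))
```

Because `math.dist` accepts points of any dimension, the same sketch works for 2D and 3D endpoints. To score each semantic type separately, one would call `endpoint_prf` once per type on the endpoints of that type only.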

We decided NOT to provide proprietary formats such as Autodesk Revit files, since no good open-source APIs exist to read them. We also decided not to provide an open format such as DXF, because DXF exports designate wall junctions arbitrarily: a corner belongs to only one of the walls in the DXF file, and which wall it is assigned to is arbitrary. In our JSON format, we provide walls as middle lines + thickness. The middle lines connect to each other at corners, so there is no ambiguity about which part of the corner belongs to which wall. We would like to note that ALL submissions need to be constructed automatically. Manual reconstructions are against the spirit of this challenge and will not be allowed.
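
As an illustration of the "middle line + thickness" idea, the sketch below expands a wall into its rectangular 2D footprint. The JSON field names (`walls`, `start`, `end`, `thickness`) are hypothetical and chosen only for this example; the actual schema is the one defined by the ground truth files in the challenge repository.

```python
import json
from math import hypot

# Hypothetical snippet; NOT the challenge's actual JSON schema.
EXAMPLE = """
{"walls": [
  {"start": [0, 0], "end": [4, 0], "thickness": 0.2},
  {"start": [4, 0], "end": [4, 3], "thickness": 0.2}
]}
"""


def wall_footprint(start, end, thickness):
    """Return the four corners of a wall given its middle line and thickness."""
    (x0, y0), (x1, y1) = start, end
    length = hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("wall middle line has zero length")
    # Unit normal of the middle line, scaled to half the wall thickness.
    nx = -(y1 - y0) / length * thickness / 2
    ny = (x1 - x0) / length * thickness / 2
    return [(x0 + nx, y0 + ny), (x1 + nx, y1 + ny),
            (x1 - nx, y1 - ny), (x0 - nx, y0 - ny)]


walls = json.loads(EXAMPLE)["walls"]
footprints = [wall_footprint(w["start"], w["end"], w["thickness"])
              for w in walls]

# The two middle lines share the endpoint (4, 0), so the corner is
# defined by the lines themselves rather than assigned to one wall.
shared_corner = walls[0]["end"] == walls[1]["start"]
print(footprints[0], shared_corner)
```

The point of the representation is visible in the last lines: adjacency is encoded by a shared endpoint of two middle lines, so no wall has to "own" the corner.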


:trophy: Challenge Winners

2D Floor Plan Reconstruction

| Team | Precision (5cm) | Precision (10cm) | Precision (20cm) | Recall (5cm) | Recall (10cm) | Recall (20cm) | IoU | Warping Error | Betti Error |
|---|---|---|---|---|---|---|---|---|---|
| Seg2Plan | 0.052 | 0.203 | 0.335 | 0.015 | 0.065 | 0.114 | 0.657 | 0.249 | 1.076 |
| S2FP | 0.020 | 0.085 | 0.146 | 0.048 | 0.220 | 0.375 | 0.517 | 0.188 | 1.140 |
| FLKPP | 0.016 | 0.068 | 0.132 | 0.032 | 0.129 | 0.253 | 0.374 | 0.258 | 1.128 |
| VecIM | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.731 | 1.646 |

3D Building Model Reconstruction

| Team | Average IoU | Columns IoU | Doors IoU | Walls IoU | 5cm Average F1 | 10cm Average F1 | 20cm Average F1 | 10cm Columns F1 | 10cm Doors F1 | 10cm Walls F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| Seg2BIM | 0.309 | 0.470 | 0.260 | 0.266 | 0.417 | 0.515 | 0.577 | 0.618 | 0.494 | 0.477 |
| FLKPP | 0.231 | 0.372 | 0.230 | 0.152 | 0.316 | 0.454 | 0.584 | 0.608 | 0.367 | 0.452 |
| PointToBIM | 0.170 | 0.396 | 0.061 | 0.150 | 0.276 | 0.366 | 0.448 | 0.633 | 0.165 | 0.415 |
| BoxDetector | 0.024 | 0.038 | 0.006 | 0.033 | 0.109 | 0.171 | 0.258 | 0.167 | 0.144 | 0.197 |

Teams


:question: Questions

Contact the organisers at cv4aec.3d@gmail.com


:construction_worker: Organizers

Iro Armeni
Postdoctoral Researcher, CS & CEE
ETHZ

Erzhuo Che
Assistant Professor, CEE
Oregon State

Martin Fischer
Professor, CEE
Stanford

Yasutaka Furukawa
Associate Professor, CS
Simon Fraser

Daniel Hall
Assistant Professor, CEE
ETHZ

Jaehoon Jung
Assistant Professor, CEE
Oregon State

Fuxin Li
Associate Professor, CS
Oregon State

Michael Olsen
Associate Professor, CEE
Oregon State

Marc Pollefeys
Professor, CS
ETHZ

Yelda Turkan
Assistant Professor, CEE
Oregon State