
:wave: Welcome to the 5th Workshop on Computer Vision for the Built World, organized at CVPR 2026! :wave:

This workshop bridges the fields of Architecture, Engineering, and Construction (AEC) with Computer Vision by focusing on how construction, the most dynamic, data-rich, and physically grounded phase of the built environment, can inform the way we design. Construction sites continuously evolve in geometry, appearance, and topology, offering a uniquely challenging yet structured setting for advancing computer vision tasks such as spatiotemporal modeling, semantic reasoning, and multimodal understanding. At the same time, the representations learned from construction data — capturing how things are actually built, changed, and adapted — can feed back into design processes, informing more generative, data-driven, and sustainable decision-making.

The workshop explores how visual and multimodal data, including 3D scans, imagery, sensor streams, and language, can be used to model and predict the evolution of the built environment and inspire generative frameworks that translate these insights into actionable design knowledge. The goal is to connect bottom-up scene understanding with top-down design generation, effectively closing the loop between “as-built” and “as-designed.” Construction thus becomes not only an application domain but also an experimental testbed for foundational computer vision research — providing real-world complexity, scale, and temporal dynamics rarely captured in synthetic datasets.

Through paper submissions, keynote talks, and the Nothing Stands Still construction-data challenge, participants will engage with real-world, challenging testbeds that advance spatiotemporal 3D modeling, multimodal understanding, and semantic reasoning of evolving scenes in core vision research.

The workshop will consist of: invited keynote talks, paper submissions, and the Nothing Stands Still Challenge.


:newspaper: News


:dart: Topics


:hourglass_flowing_sand: Important Dates

NOTE: The submission/release times are 11:59:59 UTC on the specified date.

Archival Paper Submission (8 pages)

Non-Archival Paper Submission (4 pages)

Nothing Stands Still Challenge


:calendar: Schedule

The workshop will take place on 3 June 2026 as a half-day in-person event (4 hours).

NOTE: The schedule is tentative. Exact times will be updated closer to the workshop date.

| Time | Duration | Session |
|------|----------|---------|
| 0:00 – 0:10 | 10 mins | Welcome & Introduction |
| 0:10 – 0:40 | 30 mins | Keynote 1 |
| 0:40 – 1:10 | 30 mins | Keynote 2 |
| 1:10 – 1:50 | 40 mins | Challenge Winners Session (10 min intro + 30 min presentations) |
| 1:50 – 2:30 | 40 mins | Poster Session + Coffee Break |
| 2:30 – 3:00 | 30 mins | Keynote 3 |
| 3:00 – 3:30 | 30 mins | Keynote 4 |
| 3:30 – 4:00 | 30 mins | Oral Presentations (Best papers) |
| 4:00 | 5–10 mins | Conclusion & Closing Remarks |

:microphone: Keynote Speakers

Semiha Ergan
Professor, CEE & CSE
NYU

Konrad Schindler
Professor, CEE
ETH Zurich

Jia Deng
Professor, CS
Princeton

Huaizu Jiang
Assistant Professor, CS
Northeastern

Semiha Ergan is a faculty member at the Department of Civil and Urban Engineering and Computer Science and Engineering at New York University, and an associated faculty at the Center for Urban Science and Progress (CUSP). With her background in civil engineering, AI and informatics, she leads the Building Informatics and Visualization Lab (biLAB) at NYU Tandon School of Engineering. BiLAB specializes in utilizing cutting-edge AI and sensing technologies to tackle challenges observed during the design, construction, and operation of facilities. The research team detects, quantifies, and visualizes patterns over time, leveraging data obtained from reality capture technologies (e.g., cameras, laser scanners) and embedded sensing. By exploiting the intersection of BIM, AI, robotics, and manufacturing processes, the lab enhances the scalability and efficiency of construction methods, particularly in modular construction contexts. Her work has been supported by DOE BTO, various programs of NSF, DARPA, and private organizations. Her achievements include NYU’s 2023 Distinguished Teacher Award, 2024 Inclusive Excellence Award, and 2015 DARPA Young Faculty Award.

Konrad Schindler is a professor at the Department of Civil, Environmental and Geomatic Engineering, Institute of Geodesy and Photogrammetry at ETH Zurich. He completed his PhD in Computer Science at Graz University of Technology, Austria in 2003. He has published numerous papers on photogrammetry, remote sensing, computer vision, and image interpretation. He has received several awards, including the U.V. Helava Award from ISPRS in 2012 and the Marr Prize Honourable Mention from the IEEE Computer Society in 2013. Konrad has been serving as an Associate Editor of the ISPRS Journal of Photogrammetry and Remote Sensing since 2011, and was a Technical Commission President of ISPRS from 2012 to 2016. His research interests include computer vision, photogrammetry, and remote sensing, with a focus on image understanding, information extraction, and 3D reconstruction.

Jia Deng is a Professor of Computer Science at Princeton University. His research focuses on computer vision and machine learning. He received his Ph.D. from Princeton University and his B.Eng. from Tsinghua University, both in computer science. He is a recipient of the Sloan Research Fellowship, the NSF CAREER award, the ONR Young Investigator award, an ICCV Marr Prize, a CVPR test-of-time award and two ECCV Best Paper Awards. His recent work demonstrates how procedural and generative approaches can create complex, realistic indoor scenes, bridging vision research and design.

Huaizu Jiang is an assistant professor in the Khoury College of Computer Sciences at Northeastern University. His research interests include computer vision, computational photography, machine learning, natural language processing, and artificial intelligence. Prior to joining Northeastern University, he was a Postdoc Researcher at Caltech and a Visiting Researcher at NVIDIA. He obtained his Ph.D. from UMass Amherst, advised by Prof. Erik Learned-Miller. His awards include the 2019-2020 NVIDIA Graduate Fellowship, 2019 Adobe Fellowship, and 2018 Outstanding Reviewer at IEEE/CVF CVPR. His recent work demonstrates how generative vision models can transform 2D building plans into realistic 3D environments, bridging perception and design.


:paperclip: Call for Papers

We invite submissions exploring the intersection of Computer Vision and the Built Environment, focusing on applications that transform how we understand, model, and design buildings and construction sites. Construction sites and building lifecycles are dynamic, complex, and data-rich, providing an ideal real-world testbed for advancing computer vision methods while generating actionable insights for design, sustainability, and circular practices.

Both short non-archival papers (4 pages) and long archival papers (8 pages) are welcome. Submissions should:

The best two long archival papers and the best short non-archival paper will be presented during the workshop in the Oral Presentation session.

We also welcome papers that have been accepted to the main conference; these will be treated as long non-archival papers (8 pages). Please indicate in the manuscript submission that the paper has been accepted at the main conference.

Topics include but are not limited to:

Each submission will be reviewed by at least two program committee members, chosen to provide complementary expertise across computer vision and AEC domains.


:checkered_flag: Nothing Stands Still Challenge

The workshop will host the 2026 Nothing Stands Still (NSS) Dataset Challenge, introducing a unique real-world testbed for computer vision research. Previously run as part of a robotics conference workshop, the NSS challenge is now joining the computer vision community for the first time, reflecting its relevance for understanding complex, dynamic environments at scale. Full details from prior challenges are available at: nothing-stands-still.com/challenge

The challenge focuses on spatiotemporal 3D point cloud registration of evolving construction sites, which feature dramatic changes in geometry, topology, and appearance over time. These dynamic environments make construction sites an ideal testbed for cutting-edge computer vision tasks, including scene reconstruction, semantic understanding, predictive modeling, and temporal reasoning. To expand the scope of the challenge, we aim to add semantic annotations, enabling participants to reason not only about geometry but also about functional elements, building components, and how they evolve over time.
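
As a rough illustration of the task (and not the official NSS evaluation protocol, which will be specified with the dataset release), pairwise rigid registration quality is commonly summarized by the relative rotation error and relative translation error between a predicted and a ground-truth transform. The short Python/NumPy sketch below computes these two quantities for 4x4 homogeneous transforms; all names and the example values are illustrative.

```python
# Illustrative only: common pairwise registration error metrics,
# not the official NSS challenge evaluation protocol.
import numpy as np

def registration_errors(T_est: np.ndarray, T_gt: np.ndarray):
    """Relative rotation error (degrees) and translation error (same units
    as the point clouds, e.g. meters) between two 4x4 rigid transforms."""
    R_est, t_est = T_est[:3, :3], T_est[:3, 3]
    R_gt, t_gt = T_gt[:3, :3], T_gt[:3, 3]

    # Angle of the residual rotation R_gt^T @ R_est.
    cos_angle = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    rre = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

    # Euclidean distance between the two translation vectors.
    rte = np.linalg.norm(t_est - t_gt)
    return rre, rte

# Example: a prediction that is off by a 5-degree yaw and a 10 cm shift.
theta = np.radians(5.0)
T_gt = np.eye(4)
T_est = np.eye(4)
T_est[:3, :3] = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                          [np.sin(theta),  np.cos(theta), 0.0],
                          [0.0,            0.0,           1.0]])
T_est[:3, 3] = [0.1, 0.0, 0.0]
print(registration_errors(T_est, T_gt))  # approx. (5.0, 0.1)
```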

Evaluation

Challenge Timeline

| Milestone | Date |
|-----------|------|
| Dataset Release & Registration Opens | January 15, 2026 |
| Submission Window Opens (evaluation server live) | February 15, 2026 |
| Submission Deadline | April 30, 2026 |
| Review & Evaluation | April 30 – May 4, 2026 |
| Notification of Challenge Winners | May 5, 2026 |
| Workshop Presentation & Awards | June 3, 2026 |

Ethical Note: All construction sites in the dataset are located in North America, which may limit the generalization of models trained on this data. Participants are encouraged to consider methods for robust and fair modeling across varied environments.

By combining dynamic geometry, large-scale scene evolution, and future semantic reasoning, the NSS challenge offers the CV community a rigorous, high-impact platform to test algorithms in scenarios that closely mimic real-world challenges in construction, renovation, and modular building reuse — bridging the gap between technical innovation and tangible societal impact.


:question: Questions

Contact the organizers at cv4aec.3d@gmail.com


Organizers

Iro Armeni
Assistant Professor, CEE
Stanford

Fuxin Li
Associate Professor, CS
Oregon State

Michael Olsen
Dean's Professor, CCE
Oregon State

Yelda Turkan
Associate Professor, CCE
Oregon State

Marc Pollefeys
Professor, CS
ETH Zurich

Sayan Deb Sarkar
PhD, CEE
Stanford

Emily Steiner
PhD, EE
Stanford

Tao Sun
PhD, CEE
Stanford