
:wave: Welcome to the 5th Workshop on Computer Vision for the Built World, organized at CVPR 2026! :wave:

This workshop bridges the fields of Architecture, Engineering, and Construction (AEC) with Computer Vision by focusing on how construction, the most dynamic, data-rich, and physically grounded phase of the built environment, can inform the way we design. Construction sites continuously evolve in geometry, appearance, and topology, offering a uniquely challenging yet structured setting for advancing computer vision tasks such as spatiotemporal modeling, semantic reasoning, and multimodal understanding. At the same time, the representations learned from construction data — capturing how things are actually built, changed, and adapted — can feed back into design processes, informing more generative, data-driven, and sustainable decision-making.

The workshop explores how visual and multimodal data, including 3D scans, imagery, sensor streams, and language, can be used to model and predict the evolution of the built environment and inspire generative frameworks that translate these insights into actionable design knowledge. The goal is to connect bottom-up scene understanding with top-down design generation, effectively closing the loop between “as-built” and “as-designed.” Construction thus becomes not only an application domain but also an experimental testbed for foundational computer vision research — providing real-world complexity, scale, and temporal dynamics rarely captured in synthetic datasets.

Through paper submissions, keynote talks, and the Nothing Stands Still construction-data challenge, participants will engage with real-world, challenging testbeds that advance spatiotemporal 3D modeling, multimodal understanding, and semantic reasoning of evolving scenes in core vision research.

The workshop will consist of: invited keynote talks, paper submissions, and the Nothing Stands Still Challenge.


:newspaper: News


:dart: Topics


:hourglass_flowing_sand: Important Dates

NOTE: The submission/release times are 11:59:59 UTC on the specified date.

Archival Paper Submission (8 pages)

Non-Archival Paper Submission (4 pages)

Nothing Stands Still Challenge


:calendar: Schedule

The workshop will take place on 3 June 2026 as a half-day in-person event (4 hours).

NOTE: The schedule is tentative. Exact times will be updated closer to the workshop date.

| Time | Duration | Session |
|------|----------|---------|
| 0:00 – 0:10 | 10 mins | Welcome & Introduction |
| 0:10 – 0:40 | 30 mins | Keynote 1 |
| 0:40 – 1:10 | 30 mins | Keynote 2 |
| 1:10 – 1:50 | 40 mins | Challenge Winners Session (10 min intro + 30 min presentations) |
| 1:50 – 2:30 | 40 mins | Poster Session + Coffee Break |
| 2:30 – 3:00 | 30 mins | Keynote 3 |
| 3:00 – 3:30 | 30 mins | Keynote 4 |
| 3:30 – 4:00 | 30 mins | Oral Presentations (Best papers) |
| 4:00 | 5–10 mins | Conclusion & Closing Remarks |

:microphone: Keynote Speakers

Semiha Ergan
Professor, CEE & CSE
NYU

Konrad Schindler
Professor, CEE
ETH Zurich

Jia Deng
Professor, CS
Princeton

Huaizu Jiang
Assistant Professor, CS
Northeastern

Semiha Ergan is a faculty member at the Department of Civil and Urban Engineering and Computer Science and Engineering at New York University, and an associated faculty at the Center for Urban Science and Progress (CUSP). With her background in civil engineering, AI and informatics, she leads the Building Informatics and Visualization Lab (biLAB) at NYU Tandon School of Engineering. BiLAB specializes in utilizing cutting-edge AI and sensing technologies to tackle challenges observed during the design, construction, and operation of facilities. The research team detects, quantifies, and visualizes patterns over time, leveraging data obtained from reality capture technologies (e.g., cameras, laser scanners) and embedded sensing. By exploiting the intersection of BIM, AI, robotics, and manufacturing processes, the lab enhances the scalability and efficiency of construction methods, particularly in modular construction contexts. Her work has been supported by DOE BTO, various programs of NSF, DARPA, and private organizations. Her achievements include NYU’s 2023 Distinguished Teacher Award, 2024 Inclusive Excellence Award, and 2015 DARPA Young Faculty Award.

Konrad Schindler is a professor at the Department of Civil, Environmental and Geomatic Engineering, Institute of Geodesy and Photogrammetry at ETH Zurich. He completed his PhD in Computer Science at Graz University of Technology, Austria in 2003. He has published numerous papers on photogrammetry, remote sensing, computer vision, and image interpretation. He has received several awards, including the U.V. Helava Award from ISPRS in 2012 and the Marr Prize Honourable Mention from the IEEE Computer Society in 2013. Konrad has been serving as an Associate Editor of the ISPRS Journal of Photogrammetry and Remote Sensing since 2011, and was a Technical Commission President of ISPRS from 2012 to 2016. His research interests include computer vision, photogrammetry, and remote sensing, with a focus on image understanding, information extraction, and 3D reconstruction.

Jia Deng is a Professor of Computer Science at Princeton University. His research focuses on computer vision and machine learning. He received his Ph.D. from Princeton University and his B.Eng. from Tsinghua University, both in computer science. He is a recipient of the Sloan Research Fellowship, the NSF CAREER award, the ONR Young Investigator award, an ICCV Marr Prize, a CVPR test-of-time award and two ECCV Best Paper Awards. His recent work demonstrates how procedural and generative approaches can create complex, realistic indoor scenes, bridging vision research and design.

Huaizu Jiang is an assistant professor in the Khoury College of Computer Sciences at Northeastern University. His research interests include computer vision, computational photography, machine learning, natural language processing, and artificial intelligence. Prior to joining Northeastern University, he was a Postdoc Researcher at Caltech and a Visiting Researcher at NVIDIA. He obtained his Ph.D. from UMass Amherst, advised by Prof. Erik Learned-Miller. His awards include the 2019-2020 NVIDIA Graduate Fellowship, 2019 Adobe Fellowship, and 2018 Outstanding Reviewer at IEEE/CVF CVPR. His recent work demonstrates how generative vision models can transform 2D building plans into realistic 3D environments, bridging perception and design.


:paperclip: Call for Papers

We invite submissions exploring the intersection of Computer Vision and the Built Environment, focusing on applications that transform how we understand, model, and design buildings and construction sites. Construction sites and building lifecycles are dynamic, complex, and data-rich, providing an ideal real-world testbed for advancing computer vision methods while generating actionable insights for design, sustainability, and circular practices.

Both short non-archival papers (4 pages) and long archival papers (8 pages) are welcome. Submissions should:

The best two long archival papers and the best short non-archival paper will be presented during the workshop in the Oral Presentation session.

We also welcome papers that have been accepted to the main conference; these will be treated as long non-archival papers (8 pages). Please indicate in the manuscript submission that the paper has been accepted at the main conference.

Topics include but are not limited to:

Each submission will be reviewed by at least two program committee members, chosen to provide complementary expertise across computer vision and AEC domains.


:checkered_flag: Nothing Stands Still Challenge

The workshop will host the 2026 Nothing Stands Still (NSS) Dataset Challenge, introducing a unique real-world testbed for computer vision research. Previously run as part of a robotics conference workshop, the NSS challenge is now joining the computer vision community for the first time, reflecting its relevance for understanding complex, dynamic environments at scale. Full details from prior challenges are available at: nothing-stands-still.com/challenge

The challenge focuses on spatiotemporal 3D point cloud registration of evolving construction sites, which feature dramatic changes in geometry, topology, and appearance over time. These dynamic environments make construction sites an ideal testbed for cutting-edge computer vision tasks, including scene reconstruction, semantic understanding, predictive modeling, and temporal reasoning. To expand the scope of the challenge, we aim to add semantic annotations, enabling participants to reason not only about geometry but also about functional elements, building components, and how they evolve over time.
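
As a rough illustration of the task (and not the official NSS evaluation protocol, which will be specified with the dataset release), pairwise rigid registration quality is commonly summarized by the relative rotation error and relative translation error between a predicted and a ground-truth transform. The short Python/NumPy sketch below computes these two quantities for 4x4 homogeneous transforms; all names and the example values are illustrative.

```python
# Illustrative only: common pairwise registration error metrics,
# not the official NSS challenge evaluation protocol.
import numpy as np

def registration_errors(T_est: np.ndarray, T_gt: np.ndarray):
    """Relative rotation error (degrees) and translation error (same units
    as the point clouds, e.g. meters) between two 4x4 rigid transforms."""
    R_est, t_est = T_est[:3, :3], T_est[:3, 3]
    R_gt, t_gt = T_gt[:3, :3], T_gt[:3, 3]

    # Angle of the residual rotation R_gt^T @ R_est.
    cos_angle = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    rre = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

    # Euclidean distance between the two translation vectors.
    rte = np.linalg.norm(t_est - t_gt)
    return rre, rte

# Example: a prediction that is off by a 5-degree yaw and a 10 cm shift.
theta = np.radians(5.0)
T_gt = np.eye(4)
T_est = np.eye(4)
T_est[:3, :3] = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                          [np.sin(theta),  np.cos(theta), 0.0],
                          [0.0,            0.0,           1.0]])
T_est[:3, 3] = [0.1, 0.0, 0.0]
print(registration_errors(T_est, T_gt))  # approx. (5.0, 0.1)
```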

Evaluation

Challenge Timeline

| Milestone | Date |
|-----------|------|
| Dataset Release & Registration Opens | January 15, 2026 |
| Submission Window Opens (evaluation server live) | February 15, 2026 |
| Submission Deadline | April 30, 2026 |
| Review & Evaluation | April 30 – May 4, 2026 |
| Notification of Challenge Winners | May 5, 2026 |
| Workshop Presentation & Awards | June 3, 2026 |

Ethical Note: All construction sites in the dataset are located in North America, which may limit the generalization of models trained on this data. Participants are encouraged to consider methods for robust and fair modeling across varied environments.

By combining dynamic geometry, large-scale scene evolution, and future semantic reasoning, the NSS challenge offers the CV community a rigorous, high-impact platform to test algorithms in scenarios that closely mimic real-world challenges in construction, renovation, and modular building reuse — bridging the gap between technical innovation and tangible societal impact.


:question: Questions

Contact the organizers at cv4aec.3d@gmail.com


Organizers

Iro Armeni
Assistant Professor, CEE
Stanford

Fuxin Li
Associate Professor, CS
Oregon State

Michael Olsen
Dean's Professor, CCE
Oregon State

Yelda Turkan
Associate Professor, CCE
Oregon State

Marc Pollefeys
Professor, CS
ETH Zurich

Sayan Deb Sarkar
PhD, CEE
Stanford

Emily Steiner
PhD, EE
Stanford

Tao Sun
PhD, CEE
Stanford