ITcon Vol. 24, pg. 511-526, http://www.itcon.org/2019/28

Single- and multi-label classification of construction objects using deep transfer learning methods

published: December 2019
editor(s): Dermott McMeel & Vicente A. Gonzalez
authors: Nipun D. Nath, Ph.D. Student,
Texas A&M University;
nipundebnath@tamu.edu and http://people.tamu.edu/~nipundebnath/

Theodora Chaspari, Assistant Professor,
Texas A&M University;
chaspari@tamu.edu and https://chaspari.engr.tamu.edu/

Amir H. Behzadan, Associate Professor,
Texas A&M University;
abehzadan@tamu.edu and http://people.tamu.edu/~abehzadan/
summary: Digital images are extensively used to increase the accuracy and timeliness of progress reports, safety training, requests for information (RFIs), productivity monitoring, and claims and litigation. While these images can be sorted using date and time tags, searching an image dataset for specific visual content is not trivial. In pattern recognition, generating metadata tags that describe image contents (objects, scenes) or appearance (colors, context) is referred to as multi-label image annotation. Given the large number and diversity of construction images, it is desirable to generate image tags automatically. Previous work has applied pattern matching to synthetic images or images obtained from constrained settings. In this paper, we present deep learning (particularly, transfer learning) algorithms to annotate construction imagery from unconstrained real-world settings with high fidelity. We propose convolutional neural network (CNN)-based algorithms that take an image's RGB pixel values as input and output the labels of detected objects. In particular, we investigate two categories of classification tasks: single-label classification, in which a single class (among multiple predefined classes) is assigned to an image, and multi-label classification, in which a set of one or more classes is assigned to an image. For both cases, a VGG-16 model pre-trained on the ImageNet dataset is further trained on construction images retrieved with web mining techniques and labeled by human annotators. Testing the trained model on previously unseen photos yields an accuracy of ~90% for single-label classification and ~85% for multi-label classification, indicating the high sensitivity and specificity of the designed methodology in reliably identifying the contents of construction imagery. A minimal code sketch of this transfer-learning setup is given after the citation entry below.
keywords: Deep learning, transfer learning, convolutional neural networks, construction photos, web mining, multi-class classification, multi-label classification
citation:Nath N D, Chaspari T, Behzadan A H (2019). Single- and multi-label classification of construction objects using deep transfer learning methods, ITcon Vol. 24, Special issue Virtual, Augmented and Mixed: New Realities in Construction, pg. 511-526, https://www.itcon.org/2019/28
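As a supplement to the summary above, the following is a minimal sketch of the transfer-learning setup it describes: a VGG-16 convolutional base pre-trained on ImageNet, topped with a softmax head for single-label classification or a sigmoid head for multi-label classification. It assumes Keras/TensorFlow; the class count, head size, and training settings are illustrative placeholders, not the paper's exact architecture or hyperparameters.

import tensorflow as tf

NUM_CLASSES = 5      # number of construction object classes (hypothetical)
MULTI_LABEL = True   # True: multi-label (sigmoid); False: single-label (softmax)

# VGG-16 convolutional base pre-trained on ImageNet; the original ImageNet
# classifier head is discarded (include_top=False) and the base is frozen.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False

# New classification head to be trained on the construction images
# (inputs are expected to be preprocessed with
# tf.keras.applications.vgg16.preprocess_input).
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
x = tf.keras.layers.Dense(256, activation="relu")(x)
x = tf.keras.layers.Dropout(0.5)(x)

if MULTI_LABEL:
    # Multi-label: one independent sigmoid per class, binary cross-entropy loss.
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="sigmoid")(x)
    loss = "binary_crossentropy"
else:
    # Single-label: softmax over mutually exclusive classes.
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    loss = "categorical_crossentropy"

model = tf.keras.Model(inputs=base.input, outputs=outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss=loss,
              metrics=["accuracy"])

# model.fit(train_images, train_labels,
#           validation_data=(val_images, val_labels), epochs=10)

Freezing the base and training only the new head is the simplest variant of transfer learning; unfreezing the top convolutional blocks for a further low-learning-rate pass is a common refinement, though the paper's exact training procedure is not detailed in this summary.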