SEMI-AUTOMATIC 3D CRACK MAP GENERATION AND WIDTH EVALUATION FOR STRUCTURAL MONITORING OF REINFORCED CONCRETE STRUCTURES

SUMMARY: Bridge inspection is a time-consuming, expensive, but indispensable task. In this work, a new semi-automatic workflow for a concrete bridge condition assessment system is developed and discussed. The workflow consists of three main parts merged in the new methodology. The elements are the data acquisition with cameras, the automated damage detection and localization using a neural network


INTRODUCTION
The worldwide traffic infrastructure ensures our lifestyle. Goods are transported across entire continents by road and rail. Dilapidated bridges not only lead to delays in freight traffic in the event of unforeseen closures but also pose a risk to human life. Accidents such as the collapse of the Morandi Bridge in Genoa, Italy, in 2018 or the collapse of the Mexico City Metro in 2021, both with deaths and injuries, and incidents such as the subsidence of the Salzbach valley bridge near Wiesbaden, Germany, in 2021 show the necessity and relevance of bridge inspection and condition assessment. In addition to the accidents that have already occurred, the age of many bridges, the increasing traffic load, and the increasingly frequent severe weather events and natural disasters make regular monitoring of the infrastructure imperative.
In Germany, there are more than 25,000 railway bridges with an average age of about 60 years (Allianz Pro Schiene, 2017) and about 39,500 highway bridges, many of them built in the 1960s and thereafter (BMVI, 2019), as well as many more bridges in cities and rural areas. According to the standard DIN 1076 (DIN, 1999), every engineering structure of the transport infrastructure must be inspected visually every year. Every six years there is a main inspection, followed by a simple inspection three years later. These regular inspections are supplemented by inspections after unexpected events such as accidents or natural disasters. Today, inspections of reinforced concrete structures are still carried out visually and in person with the help of a crack scale card and a hammer. The latter enables the identification of delamination and cavities by a sound response analysis. Identified defects and damages are documented in an inspection report and assessed regarding their impact based on the regularly updated damage example catalog from the guideline RI-EBW-PRÜF (Bundesministerium für Verkehr und digitale Infrastruktur, 2019). Every observed damage is given a grade in the categories of stability, traffic safety, and durability (Bundesministerium für Verkehr und digitale Infrastruktur, 2017). These grades are based on a catalog with example grades for nearly all defects that may be encountered (Bundesministerium für Verkehr und digitale Infrastruktur, 2019) and hence provide a solid base for all inspections in Germany. Further algorithm-based processing (Haardt, 2013) of the defect-describing grades provides values for every structural component and ultimately for the whole structure. The purpose of this method is to store all the data in a nationwide database to monitor structural aging and support decision-making (determining the end of service life, load-decreasing measures, maintenance, and repair). While this procedure generally provides reliable and comparable results, three drawbacks exist. First, defects not visible on the surface are not detected. Second, older bridges have the same inspection intervals as newer ones, although they require more thorough and time-consuming inspections (Holst, 2016). Third, the inspection results might not be comparable due to human subjectivity (Weller, 2021).
The overall process of crack inspection is time-consuming due to fine structures, large surfaces, and difficult-to-reach areas that require lifting platforms. In the future, this task could be completed by unmanned aerial vehicles (UAV) or unmanned ground vehicles (UGV). Therefore, this paper proposes an alternative semi-automated workflow for crack map generation and crack width evaluation. The objectives of this work are: (1) applying sub-millimeter crack detection on RGB images using a neural network for semantic segmentation, (2) deriving point cloud data and camera poses from a photogrammetry pipeline based on structure from motion, (3) projecting segmented crack pixels onto the surface of a CAD model, (4) measuring the width along the cracks, (5) comparing the results to manual inspection results, and (6) testing the workflow on a real concrete bridge with challenging conditions including crack-similar structures.
In section 2, the current state of the art of automated bridge inspection is summarized and discussed. Section 3 describes the authors' approach using a concrete probe, followed by a description of the data acquisition for a real concrete bridge in section 4. Section 5 describes the implemented automated damage detection and localization. Moreover, the crack parameters are derived and analyzed. In section 6, the concept of an automated crack detection methodology is discussed. The work is summarized with a conclusion and an outlook in sections 7 and 8.

Methods used in practice
For reinforced concrete bridge inspection, automated procedures are used only to a very limited extent in practice. The inspection of cracks consists of visually checking the overall bridge surface, detecting all cracks, measuring the cracks' width and length, determining the cracks' type, position, and orientation relative to the bridge component, documenting all information, and lastly evaluating the danger or risk posed by the crack for the stability and durability of the component or overall bridge (Bundesministerium für Verkehr und digitale Infrastruktur, 2019). All the listed tasks are mostly performed manually by close-hand inspection and documented in text and photos. The evaluation and grading of structural damages in Germany are supported by a catalog with damage examples and an online tool (WPM Ingenieure GmbH, 2022), which calculates an algorithm-based condition grade from the inspectors' entries. The inspectors have a certain leeway to weight damages as more or less severe. However, on site at the bridge, they receive very little and infrequent support through automated procedures. The guidelines for the preservation of buildings RI-EBW-PRÜF 2017 (Bundesministerium für Verkehr und digitale Infrastruktur, 2017) describe the use of automated or visual procedures. Accordingly, damages detected using automated or visual inspection must be inspected again close-hand (Bundesministerium für Verkehr und digitale Infrastruktur, 2017). Only the use of laser-based methods for tunnel inspection and visual methods for the inspection of bridge cables are listed. Divergent practices require the consent of the Federal Ministry of Transport, Building and Urban Affairs (Bundesministerium für Verkehr und digitale Infrastruktur, 2017). Therefore, crack detection today is still performed manually. To the knowledge of the authors, photogrammetric approaches based on drones are used to detect cracks in research (Mongelli et al., 2017) or are planned to be integrated into the inspection process (BAST, 2022). However, image processing usually yields only the crack length and not the crack width, which is due to resolution limitations. Moreover, in most cases, it is used to inspect the infrastructure in omnipresence or to process a photogrammetric point cloud or a textured mesh.

The current situation regarding crack detection and localization
To assess the condition of concrete bridges, surface damages including cracks could be mapped manually. For the automated detection and measurement of cracks, different sensor systems such as laser scanners or cameras can be used. The major challenge is the minimum crack width of 0.2 mm required by the standard DIN EN 1992-2:2010-12 (DIN, 2010). As tested and evaluated in our previous work (Merkle, Schmitt and Reiterer, 2020b), the resolution of laser scanning is not sufficient for such thin cracks. In contrast, the ground sampling distance (GSD) of camera-based detection can be in the range of 0.1 mm, which covers the required minimum crack width at the expense of the inspectable area per image. Halving the GSD means that either the resolution of the image must be quadrupled by using another sensor, or one image must be replaced by multiple images taken with the same sensor at a closer working distance or with a lens of double focal length. Depending on the overlap required for the photogrammetric approach, more than four images are needed, which can increase the acquisition time and lead to a longer processing time for the photogrammetric workflow. However, for now, the photogrammetric approach is the only approach allowing both the detection of thin cracks and the mapping of the overall bridge surface.
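The relation between GSD, overlap, and image count described above can be illustrated with a short calculation. The following is a minimal sketch for a flat, fronto-parallel surface; the function name, the sensor size of 4000 × 3000 pixels, the surface size, and the overlap of 60 % are assumptions chosen for illustration and do not stem from the acquisition described in this work.

```python
import math


def images_needed(area_w_mm, area_h_mm, px_w, px_h, gsd_mm, overlap=0.0):
    """Estimate how many images cover a flat area at a given GSD.

    Hypothetical helper: assumes a regular grid of fronto-parallel images
    and the same fractional overlap in both directions.
    """
    # Footprint of a single image on the surface in millimeters.
    foot_w = px_w * gsd_mm
    foot_h = px_h * gsd_mm
    # Distance between neighboring image centers after applying the overlap.
    step_w = foot_w * (1.0 - overlap)
    step_h = foot_h * (1.0 - overlap)
    eps = 1e-9  # guards against floating-point noise in the ceiling
    cols = 1 + max(0, math.ceil((area_w_mm - foot_w) / step_w - eps))
    rows = 1 + max(0, math.ceil((area_h_mm - foot_h) / step_h - eps))
    return cols * rows


# Assumed example: 0.8 m x 0.6 m surface, 4000 x 3000 pixel sensor.
print(images_needed(800, 600, 4000, 3000, gsd_mm=0.2))               # one image
print(images_needed(800, 600, 4000, 3000, gsd_mm=0.1))               # GSD halved
print(images_needed(800, 600, 4000, 3000, gsd_mm=0.1, overlap=0.6))  # with overlap
```

Without overlap, halving the GSD turns one image into four; with the overlap needed for feature matching, the count grows further, which is the cost in acquisition and processing time noted in the text.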
Due to the diverse bridge surfaces, classical image processing and classical machine learning are not able to deal with diverse bridge scenarios. Therefore, recent works use image processing based on convolutional neural networks (Vignesh et al., 2021; Xu et al., 2019; Vashpanov et al., 2019; Li et al., 2020; Li et al., 2021). An et al. (2021) use a combination of a convolutional neural network and cluster segmentation. In the end, they calculate the mean crack width based on the crack area and the crack length, the latter derived from the half perimeter. Chen and Shen (2023) propose a new network architecture that performs better than U-Net and DeepLabV3. They use a skeleton algorithm to find the center line of the crack. However, they do not apply their method to sub-millimeter cracks. Ye et al. (2023) calculate the crack width based on parallelograms after cutting the images into longitudinal and transversal segments. Hadinata et al. (2023) use three different damage classes: voids, cracks, and spalling. This can help to detect a void within a crack as the widest crack width. However, they do not apply crack width measurement. Zhang and Guo (2023) apply a regularization approach to improve the segmentation of fine structures with low contrast. Recent works show that detecting fine crack structures within high-resolution images is still a challenging task. Most of the recent work does not cover sub-millimeter crack detection and width measurement. Jahanshahi and Masri (2013) and Jahanshahi et al. (2013) start with a crack width of 0.4 mm. The crack width is still overestimated, although averaging over the local neighborhood makes the crack width estimation more stable when relying on segmentation results. For correcting the perspective error, they use local SfM. Another option is combining a camera with a four-point laser emitter as proposed by Li et al. (2022). This allows for deriving the GSD on planar surfaces. A further option is using a stereo camera or a flash LiDAR to derive depth images.
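The idea of estimating a mean crack width from the crack area and a length taken as half the perimeter, mentioned above, can be sketched in a few lines. This is only an illustration of the principle under the assumption of a thin, elongated segment; it is not the implementation of the cited authors, and the function name and the example values are hypothetical.

```python
def mean_crack_width_mm(area_px, perimeter_px, gsd_mm):
    """Mean crack width from segment area and half perimeter.

    Hypothetical helper: for a thin, elongated segment the half perimeter
    approximates the crack length, so area / length approximates the width.
    """
    length_px = perimeter_px / 2.0
    return area_px / length_px * gsd_mm


# Assumed example: a 100 x 3 pixel rectangle imaged at a GSD of 0.1 mm.
area = 100 * 3
perimeter = 2 * (100 + 3)
print(mean_crack_width_mm(area, perimeter, gsd_mm=0.1))
```

Because the half perimeter also contains the width itself, the estimate falls slightly below the true width of 0.3 mm in this example; the deviation shrinks as the crack becomes longer relative to its width.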
Apart from crack detection, crack localization in 3D is important since the meaning of a crack is highly dependent on its position and relative orientation. Liu et al. (2020) present a projection workflow for a bridge pier. They use structure from motion (SfM) to reconstruct a 3D mesh. The minimum detectable crack width is in the millimeter range. However, the proposed workflow depends on a gap-free reconstruction without false artifacts caused by the meshing process. While the minimum detectable crack width depends on the sensor and image acquisition constraints, the measurement accuracy depends on the projection and mesh quality. For the localization of image information, photogrammetric approaches based on SfM can be used, as proposed for thermal images (Merkle and Reiterer, 2021), visual damage inspection, or deformation analysis (Hallermann et al., 2018).

Our contribution
The main research questions addressed in this work are: (1) whether sub-millimeter crack detection using semantic segmentation is sufficient for challenging surfaces including crack-similar structures, (2) what the tradeoff between context, defined by the field of view, and ground sampling distance is, (3) whether the combination of photogrammetry and a reference CAD model allows sufficient crack width measurement, (4) how the proposed crack map generation can be used as part of an automated crack inspection, and (5) whether the crack data can be used for further applications such as estimating the corrosion process of the reinforcement. None of the listed crack localization procedures considers an already existing as-planned CAD model; instead, they use a representation as a point cloud or a mesh derived from the acquired image data. Since meshing a concrete bridge based on photogrammetry to a new model is highly challenging and time-consuming, we present an approach that projects automatically segmented images with crack information directly onto the bridge's as-planned model. This approach facilitates the comparison between time-shifted inspections, reduces processing and manual modeling time, and enables the adoption of existing building information modeling (BIM) models, which have been mandatory in Germany since December 2020, at least in the planning process, for publicly tendered infrastructure projects like bridges (Werthmann, 2021). For most of the existing bridges, there is no 3D CAD model available. However, the 3D CAD model would have to be created only once per structure, since the same model can be used for subsequent inspections.

CONCEPT AND PRELIMINARY EXPERIMENTAL STUDIES
The bridges' geometry can be complex and the surface diverse. Therefore, as a first test, cylindrical concrete probes with cracks resulting from compression tests were used, as shown in Figure 1, to test a processing flow as depicted in Figure 2. A total of 34 images were taken by hand from different angles around the symmetry axis. Images including cracks were manually annotated using polygons. Using the SfM-based photogrammetric workflow of the open-source software Meshroom (AliceVision, 2021), both the point cloud of the probe and the camera poses were calculated. The processing pipeline is stopped after the sparse point cloud computation and before the time-consuming densification starts. In parallel, a CAD model of the cylinder was created and exported as an STL file. By manually picking three points per circular edge within the processed point cloud, both the scale and the transformation matrix could be derived to match the point cloud with the mesh, as shown in Figure 3. Since a cylinder is rotationally symmetric and has no distinctive features, the rotational alignment of the point cloud and the cylinder is arbitrary, which is not a problem for a first test. After the point cloud, the camera poses were also transformed into CAD coordinates. As a next step, the intrinsic calibration of the camera was derived using a checkerboard and OpenCV. For each annotated image, the mesh is transformed into camera coordinates. Thereafter, by using the camera matrix, the faces of the CAD mesh can be projected into the image in UV coordinates.
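The projection of a mesh face into the image, the visibility check via the face normal, and the mapping of a crack pixel back onto the face via barycentric coordinates, as described in this section, can be sketched as follows. This is a minimal sketch with NumPy under stated assumptions: an ideal pinhole camera without lens distortion, faces with outward-pointing normals, and hypothetical function names and example values. It is not the implementation used in this work.

```python
import numpy as np


def project_points(K, R, t, pts_world):
    """Pinhole projection of Nx3 points; returns (Nx2 pixels, Nx3 camera coords)."""
    pts_cam = pts_world @ R.T + t      # CAD/world -> camera coordinates
    uvw = pts_cam @ K.T                # apply the camera matrix
    return uvw[:, :2] / uvw[:, 2:3], pts_cam


def faces_camera(tri_cam):
    """True if the face normal points towards the camera at the origin."""
    normal = np.cross(tri_cam[1] - tri_cam[0], tri_cam[2] - tri_cam[0])
    return float(np.dot(normal, tri_cam.mean(axis=0))) < 0.0


def barycentric(uv, tri_uv):
    """Barycentric coordinates of a pixel relative to a projected triangle."""
    a, b, c = tri_uv
    T = np.column_stack((b - a, c - a))
    l1, l2 = np.linalg.solve(T, uv - a)
    return np.array([1.0 - l1 - l2, l1, l2])


def pixel_to_surface(uv, tri_uv, tri_world):
    """3D position of a pixel on the face, or None if it lies outside the face."""
    w = barycentric(uv, tri_uv)
    if np.any(w < 0.0):
        return None
    return w @ tri_world               # weighted sum of the 3D vertices


# Assumed example: camera at the origin looking along +z, one face at z = 2.
K = np.array([[1000.0, 0.0, 500.0], [0.0, 1000.0, 500.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
tri_world = np.array([[0.0, 0.0, 2.0], [0.0, 1.0, 2.0], [1.0, 0.0, 2.0]])
tri_uv, tri_cam = project_points(K, R, t, tri_world)
if faces_camera(tri_cam):
    print(pixel_to_surface(np.array([600.0, 700.0]), tri_uv, tri_world))
```

Note that interpolating with image-space barycentric coordinates is exact only for faces parallel to the image plane; for tilted faces it neglects perspective foreshortening, so the error stays small only when the faces are small relative to their distance from the camera.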
Only faces visible to the camera are used. Since a cylinder is a simple geometry and has a convex body, it is sufficient to check for each face whether the component of the normal vector in the camera direction points towards the camera. Otherwise, the face is shaded. In our case, each image contained the overall cylinder. In the case of partial coverage, even unshaded faces would not be in the image due to the limited field of view (FOV). By using an algorithm to find points within a path given by the face's vertices, the crack pixels inside the currently projected face are selected, as shown in Figure 4. Afterwards, the selected pixels are transformed from UV coordinates into barycentric coordinates based on the projected face vertices. Thereby, the corresponding 3D position of the crack pixel can be calculated by using the 3D coordinates of the face vertices and applying the barycentric coordinates in 3D space. This approach requires that all three vertices of a face are within the image area since points outside the image are not covered correctly by the intrinsic camera calibration. Therefore, the resolution of the mesh should be high enough relative to the FOV. Alternatively, the mesh could be adapted by splitting a face into smaller faces until all relevant vertices are within the image containing damage information. However, this approach was not used in our workflow to keep the processing time short. By repeating this projection procedure for all annotated images, imprinted points of all cracks on the cylinder surface are derived. To avoid double projection of cracks included in multiple images, as shown in Figure 5, faces are blocked if they already contain projected pixels. The shift between such double projections is a result of the accuracy of the intrinsic calibration, the disparity between the ideal mesh and the real probe, and the accuracy of the matching of point cloud and mesh. Moreover, the accuracy of the photogrammetric workflow and the accuracy of the crack annotation contribute to this error. Especially due to the round shape of the cylinder with a small radius, the shift depends on the angle between the normal vector and the ray to each pixel. For this experiment and the workflow used, the maximum shift is below five millimeters. The final projection is shown in Figure 6. It does not include double projections. This is the first step of testing the projection concept and shows that problems arise even with this simple geometry. Since manual annotations were used for this test, crack width, size, and orientation are not evaluated here but later in section 5, where automated crack detection is used. During annotation, it was observed that cracks seen from a flat angle are not visible in the images or appear thinner. This effect is amplified by the cylindrical shape, which is comparable to a bridge pier or other complex structures of a real bridge. Therefore, the next step is transferring the gained knowledge to a real bridge scenario with complex geometry, multiple parts, bigger dimensions, and diverse surface conditions. For this purpose, a small bridge close to Freiburg was selected from more than twenty bridges that were surveyed.
The reinforced concrete bridge crossing the small water channel Altgraben along the course of the federal road B3 in the vicinity of the city of Freiburg in southern Germany has been chosen for testing due to its accessibility, multiple cracks, a diverse realistic surface with dirt, graffiti, and vegetation, the construction type, and the building age, which is 40 years and close to the average bridge age for federal roads of 38 years (Federal Highway Research Institute, 2021a). Figure 7 depicts the southern side of the bridge and Figure 8 shows the sectional view of the east superstructure. The bridge was constructed in 1981 and has a military load classification (MLC) of 60 for traffic in one direction and 40 in both directions. According to the bridge's construction book, the bridge span is 8.8 meters while the width is 26.5 meters. The construction consists of two similar parallel bridge superstructures resting on abutments and connected by a toothed joint. The overall condition of the bridge was assessed with a condition grade of 2.2 during an inspection in 2017 (Federal Highway Research Institute, 2021b), which means the bridge is in a satisfactory state (range from 1.0, best state, to 4.0, worst state). This grade includes stability, durability, and traffic safety. The authors were able to read the unpublished inspection report from 2017. The cracks listed in this report had a crack width of up to 0.4 mm. For both manual and automated inspection, the light conditions are challenging due to the low height of the bridge and the comparatively large bridge width.

Main bridge inspection
The authors had the chance to participate in the main bridge inspection of the selected bridge in October 2020. On the one hand, this serves as a reference; on the other hand, it provides a better understanding of the challenges in practice and of possible improvements through the developed methodology. During the inspection, the accessible concrete surface was checked for defects by tapping the concrete with a hammer. Since trained personnel can distinguish the sound of a blow on intact concrete from that on damaged concrete, flaws can be found reliably. Porous concrete is then removed to inspect the condition of the reinforcement underneath. Identified surface cracks were documented after measuring their width using a crack scale card. These measurements are conducted at the widest part of a crack (the site is marked afterwards with chalk) at the inspection staff's discretion. Moreover, small spalling or pores within the crack are not considered for the crack width. This information is important for the later automated crack width evaluation. Finally, road marks, sidewalks, and traffic signs were visually inspected for the traffic safety assessment.

Photogrammetric acquisition
As part of our work, we concentrate on the underside of the concrete bridge since damages there mainly affect the safety and durability of the bridge, while on top of the bridge only the traffic cover can be inspected. The proposed method is tested on the underside surface but is not limited to it. For the data acquisition, we used a smartphone camera with the specifications listed in Table 1. A smartphone camera is a lightweight option that offers sufficient resolution, is easy to use, and is often already used in close-hand inspections. Especially in dynamic light and distance conditions, recent smartphone cameras perform well, and it is to be expected that their performance and technology will advance. However, the results of the proposed method can be improved by using a digital single-lens reflex (DSLR) camera with a high-quality lens and a big sensor. In contrast to the cylindrical probe, many images are necessary to cover the surface of a bridge. At the first inspection, the overall bridge must be covered. From the second inspection onwards, it would also be possible to take images only of the visible cracks since the previous images can be used for the photogrammetric processing of the new camera poses. In the case of the selected test bridge, around 1000 images were taken. The strategy is to take enough images from different positions and angles to cover the overall bridge underside. Furthermore, close-distance images are required, depending on the image resolution, to detect or even measure the cracks. Lastly, images at medium distance are required to allow feature matching between all images. Depending on the GSD difference between close-distance and long-distance images, even multiple intermediate steps are required. This depends on the feature descriptor used, in our case the scale-invariant feature transform (SIFT), and on the size of the features in a bridge scenario. A small test including 20 images of a crack at distances of 10 to 200 cm, listed in Table 2, shows that an image taken at a 10 cm distance from a crack and an image taken at 120 cm can be matched when setting the describer preset to "ultra" in the software Meshroom. This corresponds to a GSD ratio of more than 10. At "high" settings, the images can be matched when taken at distances of 10 to 100 cm, and at "normal" settings when taken at distances of 10 to 50 cm from a crack. These setting descriptions are not absolute measures, but they show how strongly the matching depends on the settings of the photogrammetric workflow. The higher settings come with a higher processing time. Moreover, this does not mean that matching also works from 1 m to 12 m since this highly depends on the texture of the surface of the structure. The features usually include concrete structures and texture due to the formwork, geometries like edges and corners, features within the ground surface below the bridge, and coatings like vegetation, graffiti, or dirt. Therefore, a more conservative approach should be used, with intermediate images connecting far and close images. Another approach would be a real-time check of whether the images can be matched, as used for visual simultaneous localization and mapping. For the acquisition of the bridge, both side walls were captured from an 8 m distance at angles of 90° and 45° and from a 4 m distance at 90°, as shown in Figure 9. Moreover, the edges of the four corner areas, later used for registration, were additionally imaged from different distances and angles. Since the ceiling is low (~2 m), images were taken close to the ground pointing upwards. Due to the water flow, images pointing diagonally to the ceiling above it were added. Lastly, both sides of the bridge were imaged as well as possible given the surrounding vegetation. Additionally, further images were taken of corners, small areas around the bearings, and the connection interface of the two bridge parts to reach a higher level of coverage of the entire underside of the bridge. The low texture of the ceiling combined with bad lighting conditions, the water flow separating the photogrammetric reconstruction, and the geometry of the bridge with a high ratio of width to length and height led to a longitudinal shift between the two abutments in a first run. This problem was solved by capturing the ceiling images from as low as possible. Moreover, the variation of distance and angle including enough overlap was essential.
In the area of cracks, we took further images at a distance of around 30 cm to reach a GSD of 0.1 mm. Based on the sampling theorem, this is necessary to safely detect 0.2 mm cracks. Another factor is the use of RGB images, which have lower contrast than grayscale images and whose color values are interpolated. As shown later, the detection of cracks is also possible with a GSD bigger than the crack width. However, one pixel then represents both the crack and its surroundings, which leads to a lower black value and lower contrast and depends on a bright surrounding surface.
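The relation between working distance, GSD, and the smallest safely detectable crack width can be illustrated with the pinhole camera model. The following is a minimal sketch; the pixel pitch of 1.4 µm and the focal length of 4.2 mm are hypothetical values chosen only so that the numbers are easy to follow. They are not taken from Table 1, and the function names are assumptions as well.

```python
def ground_sampling_distance(pixel_pitch_um, focal_length_mm, distance_mm):
    """GSD in mm per pixel for a fronto-parallel surface (pinhole model)."""
    return (pixel_pitch_um / 1000.0) * distance_mm / focal_length_mm


def max_distance_for_crack(crack_width_mm, pixel_pitch_um, focal_length_mm,
                           pixels_per_width=2.0):
    """Largest working distance at which a crack spans the requested pixels.

    pixels_per_width=2 reflects the sampling argument in the text:
    a 0.2 mm crack calls for a GSD of 0.1 mm.
    """
    gsd_required_mm = crack_width_mm / pixels_per_width
    return gsd_required_mm * focal_length_mm / (pixel_pitch_um / 1000.0)


# Hypothetical camera: 1.4 um pixel pitch, 4.2 mm focal length.
print(ground_sampling_distance(1.4, 4.2, distance_mm=300.0))  # GSD at 30 cm
print(max_distance_for_crack(0.2, 1.4, 4.2))                  # distance for 0.2 mm
```

With these assumed values, a working distance of 30 cm corresponds to a GSD of about 0.1 mm, so a 0.2 mm crack covers roughly two pixels.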
The result of the photogrammetric approach using Meshroom is the point cloud with the corresponding camera poses, depicted in Figure 10. The ceiling area above the water flow is sparser due to the angled acquisitions, which lead to a distortion of the texture that reduces the distinctive features. The point cloud was processed with normal settings to reduce computation time. At higher feature extraction (FE) settings, the ceiling is expected to be better covered, as previously seen in the comparison in Table 2. The processing time is split into four parts. First, the SfM of the overall images is processed. Afterwards, a multi-SfM approach for three different image sets, each containing 20 different images from 10 to 200 cm distance, is processed. The processing time to obtain the point cloud and camera poses is 54 minutes using an Nvidia RTX 2080 Super. Another 47 minutes are required for the meshing, which is used to obtain a densified point cloud. Lastly, a further 47 minutes would be required to obtain the textured mesh, depicted in Figure 11, which is not used in the proposed workflow.

Neural network
As a neural network, a code base (Negassi, Wagner and Reiterer, 2022) including optional augmentation of the U-Net architecture with the default number of layers and features (Ronneberger, Fischer and Brox, 2015) is used.
With its roots in biomedical image segmentation, it allows fast and precise segmentation of fine structures. The overall pipeline of this work is not limited to U-Net, which can be replaced, e.g., by the network proposed by Chen and Shen (2023). Our goal is both to detect the cracks and to measure the crack size. The detection task does not require segmentation since object detection would be sufficient. Moreover, based on the detected bounding box, the spatial size of the crack and its orientation could be derived by defining different classes for different orientations such as oblique crack, longitudinal crack, and transverse crack. For measuring the length along the crack and its width, and for capturing the precise crack shape, pixel-wise segmentation is required. The quality and performance of the prediction, especially for structures as thin as cracks smaller than 0.3 mm, which correspond to very few crack pixels compared to the overall image resolution, mainly depend on the amount, diversity, and quality of the annotated data, the camera used, and the acquisition distance and angle of the training data.

Training and test data set
As part of this work, 126 images in total were annotated. Of these, 76 images are used for training and 25 images each for validation and testing. Due to the high resolution of 9 MP, the images are cropped into 16 crops, each with a resolution of 864 × 864 pixels. An attempt was made manually to create a homogeneous distribution. Moreover, it was checked that each damage occurs only once within the data set. Some exceptions are included. However, the different perspectives, distances, and areas of a crack vary in such a way that the test data differ from the training and validation data. The data set is very small. To train a stable network that works for all kinds of concrete bridges and crack types, a huge and diverse data set is required. The use of this small data set, however, shows that even a small number of images can be effective for identifying different crack structures on the same bridge. To avoid overfitting, the best checkpoint on the validation data set is used.
Annotating the crack images pixel-wise is time-consuming. The annotation of one image with 9 MP takes between 15 and 25 minutes. For this, we used an internal tool that allows drawing polygons around the crack outline. One example is shown in Figure 12. The annotation time and quality depend on the number and precision of the selected vertices to create the polygon. Since the transition from the crack to the undamaged area can be smooth, it is difficult to select the correct crack border. Here, a more conservative approach was chosen to reduce time and effort and to increase safety due to the critical effect of underestimated cracks. Another challenge is not to include the widening of the crack due to small spalling, which often appears along the route of the crack. Furthermore, gaps within a crack can be either connected or separated when annotating, depending on the gap distance.
With a GSD of 0.1 mm, a 0.2 mm crack is difficult to annotate. Without resampling, the zoomed-in image appears as depicted in Figure 13. As part of this work, resampling is only used for annotation but not for the neural network. Using a higher sensor resolution or reducing the working distance should be the first step before resampling the data. The learning rate is set to a balanced value of 0.05, which is in the default range of 0.01 to 0.1. Since there are many more pixels of undamaged concrete than pixels of cracks, the data set is highly unbalanced. Therefore, a weight vector is used to adapt the loss function individually for the two classes. The class weight of "background" is 0.00259493 and that of "crack" is 0.99740507, which is the reciprocal of the percentage of samples within the training data set, normalized so that both weights sum to one.
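Class weights of this kind can be derived from the pixel counts of the training data. The following is a minimal sketch of normalized inverse-frequency weighting; the function name and the pixel counts in the example are hypothetical, since the per-class pixel counts of the training set are not stated here.

```python
def inverse_frequency_weights(pixel_counts):
    """Normalized inverse-frequency class weights that sum to one.

    Hypothetical helper: pixel_counts maps a class name to its number of
    annotated pixels in the training data.
    """
    total = sum(pixel_counts.values())
    # Reciprocal of each class's share of all pixels.
    inverse = {name: total / count for name, count in pixel_counts.items()}
    norm = sum(inverse.values())
    return {name: value / norm for name, value in inverse.items()}


# Assumed example counts: 0.3 % of all pixels belong to cracks.
weights = inverse_frequency_weights({"background": 997_000, "crack": 3_000})
print(weights)
```

For two classes, this normalization makes each weight equal to the other class's share of the pixels. Under that reading, the reported weights would correspond to a crack share of roughly 0.26 % of the training pixels.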

Performance
The neural network converges after 4000 epochs. Out of 312,094 annotated crack pixels and 228,302,306 background pixels in total, 186,799 crack pixels are predicted as true positive (TP), 146,293 as false positive (FP), 125,295 pixels as false negative (FN), and 228,156,013 as true negative (TN). This leads to a recall of 0.599 and a precision of 0.561, which are defined as $\mathrm{recall} = \frac{TP}{TP + FN}$ and $\mathrm{precision} = \frac{TP}{TP + FP}$. The intersection over union (IoU) is defined as $\mathrm{IoU} = \frac{TP}{TP + FP + FN}$ and evaluates to 0.408. Due to the low number of pixels per crack width, the possibility of branches, the width variation along the crack, and the jagged shape with small spalling, it is important to understand what the IoU, the recall, and the precision mean for this specific damage. The higher the IoU, the lower the number of FP or FN. An IoU of 1 would mean that annotation and prediction are identical. Since the IoU of 0.408 is quite low, it is important to look at the recall and precision. The recall of 0.599 is higher than the precision of 0.561. This means that fewer pixels are classified as FN than as FP. Due to the unbalanced class distribution, it is more probable that an FP appears than an FN. Since the number of images for training, validation, and testing is small, the statistics are still highly dependent on the image selection. Many images from a far distance include FP since other features appear as cracks from a greater distance. The conservative annotation leads to cracks being predicted wider, which means more FP. However, there are also cracks that are not detected or only partially detected. This leads to FN. Optimizing the IoU is a good option to balance FN and FP. In the case of damage detection, FN should be weighted higher.
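The reported metrics can be reproduced directly from the pixel counts given above. The following short sketch uses the standard definitions of recall, precision, and IoU; the function name is an assumption chosen for illustration.

```python
def segmentation_metrics(tp, fp, fn):
    """Recall, precision, and IoU of the crack class from pixel counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    iou = tp / (tp + fp + fn)
    return recall, precision, iou


# Pixel counts of the test data as stated in the text.
recall, precision, iou = segmentation_metrics(tp=186_799, fp=146_293, fn=125_295)
print(round(recall, 3), round(precision, 3), round(iou, 3))  # 0.599 0.561 0.408
```

The true negatives do not enter any of the three metrics, which is why they remain informative despite the strong dominance of background pixels.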
Figure 14 shows further reasons for the moderate IoU. Crack-similar joints, dirt, vegetation, motion blur, GSD limits and variations, or paint complicate the correct prediction. The diversity of this bridge is so high that it cannot be represented by images of itself without overfitting. One solution for this is augmentation. Otherwise, more annotated data from multiple bridges is required. This is not pursued further in this work, since previous work (Negassi, Wagner and Reiterer, 2022) has shown that it can be expected to improve the IoU.

Influence of ground sampling distance
In contrast to detecting larger damages in images, cracks are fine structures with a low number of pixels per crack width. To ensure that the low IoU is mainly a result of the small amount of training data and not of the U-Net convolution architecture, another experiment was conducted to analyze the performance of the neural network regarding scale invariance. For this, intentional overfitting was achieved using a single synthetic test image for training, validation, and testing, as can be seen on the left in Figure 15. The prediction of the overfitted network is shown on the right. The full results with the calculated area ratio are listed in Table 3. Ratios with an error of more than 10 % are marked in red. The most important results are that a 2×2 square and a single-pixel line are not detected. A crack-similar structure with a width of one pixel is detected, but only two of the 39 pixels are predicted as cracks. This shows that a minimum number of pixels is required due to the convolutional layers within the neural network. The jagged shape of a crack increases the detectability. Since the task is not only detection but also measuring, the crack width should be more than 2 pixels to stay within an error of 10 % for the predicted area. This is directly linked to the crack width calculation proposed later in this work and helps to understand the performance of the U-Net.
Apart from this small experiment with a synthetic image, the effect of the GSD on the prediction of the neural network with the real bridge data is shown in Figure 16. The predictions for 10 different working distances from 10 to 100 cm in 10 cm steps are compared. These images were not included in the training or validation data. Therefore, it is an unseen scenario and can be used to check the prediction accuracy. In the closest image at a 10 cm distance, only the crack and no other structures are detected. However, not the entire crack is detected. This can be due to missing images within the training set with a close distance of 10 cm. From a 20 cm distance, most of the crack is detected. This is due to a known GSD and high contrast. With increasing distance, other crack-like structures are also detected as cracks, which can be the result of a higher GSD and lower contrast. Colored structures imaged from larger distances can also lead to false positives: structures thinner than a pixel appear grey because the pixel value is a mixture of concrete and structure. Another effect is that the number of FN increases with increasing distance, since the crack is only detected partially. From experience, a crack area is still detectable if the ratio of GSD to crack width is below 3. However, this is suited only for detection and not for measuring. On the other hand, the number of FP increases, since crack-similar structures are predicted as cracks. This shows that the ratio of GSD to crack width is critical for the precision and recall of the segmentation. However, defining a general relation is difficult since it also depends on the dimensions of the surrounding features in the concrete texture. The best solution would be an almost constant GSD, which is often not feasible in practice.
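The two rules of thumb from this section can be combined into a small check: measuring needs a crack width of more than 2 pixels, while detection remains possible up to a GSD-to-width ratio of about 3. The following sketch is only an illustration of these thresholds; the function and category names are hypothetical.

```python
def classify_observability(crack_width_mm, gsd_mm,
                           min_pixels_measure=2.0, max_ratio_detect=3.0):
    """Classify a crack as measurable, detectable only, or not reliably detectable."""
    pixels_per_width = crack_width_mm / gsd_mm
    if pixels_per_width > min_pixels_measure:
        return "measurable"            # more than 2 pixels across the crack
    if gsd_mm / crack_width_mm < max_ratio_detect:
        return "detectable only"       # GSD-to-width ratio below 3
    return "not reliably detectable"


# A 0.2 mm crack imaged at three example GSD values (in mm)
for gsd in (0.05, 0.17, 0.70):
    print(gsd, classify_observability(0.2, gsd))
```

With a GSD of 0.17 mm, a 0.2 mm crack covers only about 1.2 pixels, so it falls into the "detectable only" category.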

Crack width correction
Figure 17 shows a damaged area with a 0.2 mm crack based on the last inspection and the related segmentation. The thin crack is detected at a working distance of 50 cm, leading to a GSD of 0.17 mm, which is close to the crack width.
Apart from the crack, other vertical or horizontal structures and parts of the chalk label are detected. A possible reason is the confusion of the neural network by cracks within colorful graffiti. Otherwise, the dark color of the crack should make it easy to distinguish a painted crack from a real crack. As soon as the crack is filled with paint, it is difficult even for the inspectors to check whether it is only a crack within the lacquer coating or in the concrete. Another notable observation is the detection of the bottom right crack, which is much thinner than 0.2 mm. However, its width is detected as larger. This problem could not be solved within the neural network due to the limited amount of training data. Therefore, post-processing including adaptive filtering of the grayscale values of the detected crack area is applied, as depicted in Figure 18. This simplified approach depends on the surroundings being brighter than the crack. For later crack width estimation, the medial axis and contour of the filtered crack are derived as shown in Figure 19. In this example, the crack width of 0.2 mm is represented by only 3 to 5 pixels due to a higher GSD.
In summary, a low IoU is a result of a high number of FN and FP. Those can be due to missed or falsely predicted instances, or to instances that are detected only partially or too wide. This is directly linked to the amount of training data. In this work, cracks are detected too wide by the neural network. The synthetic image experiment shows that for crack-similar polygons with a mean width below two pixels, the crack area is predicted more than 10 % larger. The second experiment, testing different GSDs for the same damage, shows that the number of FP increases with larger distance. To achieve a higher IoU, a smaller GSD and more training data are required. Due to the diversity of bridge surfaces, this effort can be reduced by using the proposed post-processing step. Thereby, the predicted crack width is reduced, which allows enhanced crack width evaluation as presented in section 5.7.

Damage localization and 3D referencing
Detecting the cracks within the images is the first step but deriving the location of the crack is also important.
During an inspection, this is done by using chalk labels or by drawing the crack manually into a 2D surface map.
For better documentation and possible subsequent evaluations or simulations, a precise representation of the crack in 3D is beneficial. Challenges in this context are that the textured mesh can suffer from false geometric representations, or that the resolution of the dense point cloud is not high enough to be labeled by the segmented images. Another approach is projecting the image information onto the processed mesh. However, faulty representations of the mesh can lead to false texturing. Therefore, a method of projecting the damage information onto the as-planned CAD model is developed. As an important boundary condition, the disparity between the as-built and as-planned models should be in the millimeter range (or centimeter range on edges due to formwork).
Even with a disparity of 1 cm and a working distance of 1 m, either the crack position is shifted by up to 1 cm or the crack width and length are projected too large by only 1 %, which would equal 0.002 mm for a 0.2 mm crack.
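The 1 % figure follows from similar triangles, assuming that the disparity acts along the viewing direction so that the projected size scales linearly with the distance to the surface. With working distance d, disparity Δd, and crack width w:

```latex
\[
\frac{\Delta w}{w} \approx \frac{\Delta d}{d}
  = \frac{1\,\mathrm{cm}}{100\,\mathrm{cm}} = 1\,\%,
\qquad
\Delta w \approx 0.01 \times 0.2\,\mathrm{mm} = 0.002\,\mathrm{mm}
\]
```

The resulting width error is two orders of magnitude below the crack width itself, so the disparity mainly affects the position of the crack rather than its measured dimensions.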
One drawback can be cracks on an edge, which can lead to projections onto mesh areas behind it. Since only 2D technical drawings were available for the selected bridge presented in section 4.1, a 3D CAD model of the bridge was derived with the software ANSYS, as shown in Figure 20. The CAD data is exported as an STL file including all surface information. For the crack evaluation, only the exposed surfaces are required, which are depicted in Figure 21. This step was performed manually since deriving an automated approach is of minor importance at this point. In the future, it might be possible to automate it by checking the planar contact of connected structures, or by defining a volume or a number of points in 3D from which the exposed surfaces can be derived automatically using raytracing and/or the normal information of the faces.
To project the image information containing the crack pattern onto the CAD model's surface, the point cloud must be scaled and registered. A possible solution would be the use of optical targets, which allows automated detection. In the case of the selected bridge, we used four distinctive features, namely four vertices in the corner areas of the abutments, which are manually picked within the point cloud and the CAD data. Thereby, the transformation matrix and scale are derived using a Kabsch (Kabsch, 1976) least-squares fitting of two 3D point sets. Compared to ICP algorithms, it is limited to rigid transformations but is faster and less sensitive to local minima. The offsets between the four transformed manually picked points and the four corresponding vertices of the CAD mesh are 30.6, 22.4, 26.7, and 44.7 mm. The result of the registration is shown in Figure 22. Due to the uneven concrete surface and the chamfered edges caused by the formwork, the as-built model deviates from the as-planned model. The accuracy of the Kabsch registration method itself is expected to be in the millimeter range, but the deviations of edges and inconsistencies within the photogrammetric point cloud data lead to the offsets of two to five centimeters. Using planes instead of edges can help overcome this problem, since a fitted plane is more accurate if there are no larger deviations within the surfaces. Moreover, there are bridges with curved surfaces where different approaches must be used. If the bridge surface is easily reachable, optical targets might give a more precise result, provided they are placed correctly and at the same position at the next inspection. The best solution would be permanent metal markers embedded in the bridge structure. The transformation and scale are also applied to the camera poses shown in Figure 9 and Figure 23.
The next step is projecting the crack information onto the CAD surface. The projection method tested on cylindrical concrete specimens in section 3 was based on projecting the face vertices into image coordinates and deriving the 3D coordinates using barycentric coordinates. However, this is not sufficient for a real bridge scenario, because images cover faces only partially. This would lead to projections of vertices outside of the field of view (FoV), where no intrinsic calibration including distortion parameters is known. Therefore, for the selected test bridge, the 3D crack point derivation is based on raytracing. The rest of the processing flow is the same as depicted in Figure 2, except for the grey box. Faces are solely used for the occlusion check, the discretization of the overall bridge surface, and avoiding multiple projections of the same damage. The pseudocode of this method is given in Algorithm 1. The first step is the transformation of the camera poses into CAD coordinates. Thereafter, the first predicted image is read. If the image includes crack information, there is a loop over all faces. For each face, one must check whether it is in the FoV. This is done by checking whether it is in front of the camera and whether its normal is pointing towards the camera. Otherwise, it would not be a front face. Moreover, the face must not be occluded by another face. Here, a simplified approach is chosen by checking whether a ray from the face center to the camera intersects with another face. If the face is not occluded, the vertices are projected into the image using ProjectPoints of OpenCV. Then, the image is masked by the outline of the face, defining the pixels within the face. After checking that the crack information of the image has not already been projected onto other faces, a ray is defined for each crack pixel and its intersection with the face plane is calculated. Thereby, a crack point cloud for each image is derived. One example can be seen in Figure 24. A corresponding real image is depicted in Figure 25.
The projection algorithm is tested on the selected bridge, which contains complex structures leading to occlusions. Moreover, many cracks start in the corners of the abutments, which is common for bridges. This allows testing the accuracy and the correct face selection of the proposed projection method. As shown in Figure 24, the crack does not start in the corner of the mesh. The offset is in the range of a few centimeters. Moreover, for four different cracks crossing the border between faces, the interface regions were analyzed. The proposed method does not create inconsistencies between neighboring faces. However, due to global projection shifts, cracks close to an edge of a structure, or cracks leading around an edge, can be projected onto faces that are not occluded but lie sideways behind.
As depicted in Figure 24, the projected point cloud data contains both cracks and crack-similar structures. Since there are multiple damages, the next step is clustering the data into individual point clouds using density-based spatial clustering of applications with noise (DBSCAN). The maximum distance of points within one crack is adjustable and should be defined based on the GSD and rules for close-hand inspection. However, these rules are not defined yet, since cracks are not documented in such a high level of detail during inspection. In some cases, the inspector documents two cracks as one if they subjectively belong together.
In the case of crack patterns, this is even more challenging and is currently based on the experience and individual judgment of the inspector. As a first test, we set the maximum distance between two cracks to 10 cm. Cracks with smaller distances are considered as one. Moreover, a minimum number of points per crack can be defined to remove noise, which can occur due to false predictions of the segmentation. However, this parameter depends on the GSD and should not be used. A better way of removing noise is based on geometric dimensions. Another solution is removing straight structures like edges or interfaces from the formwork. Since straight cracks are also possible, aggressive filtering could miss a critical crack. For a conservative approach, we keep the crack-like structures, since their number is expected to decrease if the neural network is trained with a bigger training set. Since optimization and training with a huge data set are not part of this work, the following is a prospective subsequent method for deriving crack parameters.
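The registration of the photogrammetric point cloud to the CAD model from a few manually picked point pairs can be illustrated with a minimal sketch. It assumes NumPy and combines the Kabsch rotation estimate with a least-squares scale factor in the style of a similarity fit; how the scale is obtained in the presented pipeline is not specified, so this part is an assumption. The coordinates are made up, and `kabsch_with_scale` is a hypothetical helper name.

```python
import numpy as np


def kabsch_with_scale(source, target):
    """Estimate scale s, rotation R, translation t with target ~ s * R @ source + t."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    src = source - src_centroid
    tgt = target - tgt_centroid

    # Kabsch step: SVD of the cross-covariance matrix of the centered point sets
    U, S, Vt = np.linalg.svd(src.T @ tgt)
    # Guard against a reflection so that R is a proper rotation (det = +1)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T

    # Least-squares scale factor (assumption: similarity transform)
    scale = (S * np.diag(D)).sum() / (src ** 2).sum()
    t = tgt_centroid - scale * R @ src_centroid
    return scale, R, t


# Four hypothetical picked points and their CAD counterparts (synthetic example)
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
picked = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [4.0, 3.0, 0.0], [0.0, 3.0, 2.0]])
cad = 2.5 * picked @ R_true.T + np.array([10.0, -5.0, 1.0])

scale, R, t = kabsch_with_scale(picked, cad)
residuals = np.linalg.norm(cad - (scale * picked @ R.T + t), axis=1)
print(scale, residuals.max())
```

On real data, the per-point residuals correspond to offsets such as those reported above for the four abutment vertices; in this synthetic example they vanish up to rounding.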

Crack parameter derivation
The main crack parameters are geometric parameters like width, length, depth, and orientation relative to the bridge component (unless the damage is crazing), together with a time stamp. Further information includes the type of crack, such as settlement cracks, dry cracks inside or outside the spray region of roads, water-bearing cracks, cracks within the bearing pedestal, and cracks caused by alkali-silica reaction with no/starting/advanced structure relaxation. In this work, we focus on the main geometric parameters, which are length and width.
For each crack, we compute the length and orientation. The length is based on the distance between the two points of the crack point cloud that are farthest apart. Further options, such as a fitted trajectory as depicted in Figure 26, or summing all distances along the medial axis, can be used. However, as part of a real inspection, only the total length is measured. As a simplification, the two end points are not identified as start and end points. Knowing the exact start and end point would require either knowledge of the course along the crack, which is challenging due to diverse crack shapes, or comparing the same crack at a later state in time when it has grown. By comparing all combinations of distances between the start points at the first state with the state later in time, the growing direction can be derived. However, the growing direction only means that the crack is growing more in one direction than in the other, since there might be cracks that grow in both directions. The orientation of the crack is derived by calculating the vector between the two end points. Since the orientation is specified relative to the main axis of each bridge component, further information is required, which can be derived from BIM models in the future or must be defined manually once. For this, the component onto which the crack is projected must be identified. This separation is not done in this work but is part of future research.
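The length and orientation derivation can be sketched in a few lines. The example below searches the farthest point pair by brute force, which is quadratic in the number of points and therefore only suited for small clusters; the function name and the sample coordinates (in meters) are illustrative assumptions.

```python
import math
from itertools import combinations


def crack_length_and_orientation(points):
    """Return the largest point-to-point distance and the unit vector between those points."""
    p, q = max(combinations(points, 2), key=lambda pair: math.dist(pair[0], pair[1]))
    length = math.dist(p, q)
    # The sign of the direction is arbitrary: start and end point are not distinguished
    direction = tuple((b - a) / length for a, b in zip(p, q))
    return length, direction


# Hypothetical 3D crack points of one cluster
crack_points = [(0.0, 0.0, 0.0), (0.1, 0.02, 0.0), (0.2, 0.01, 0.1), (0.3, 0.0, 0.4)]
length, direction = crack_length_and_orientation(crack_points)
print(length, direction)
```

Expressing the direction relative to the main axis of the bridge component would additionally require the component's axis, which is not available in this sketch.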
The next parameter is the crack width. As the authors experienced while accompanying a real close-hand inspection, the crack width is usually measured at the widest position, which must not coincide with a pore or small delamination. Given the performance of the segmentation, it is difficult to identify the widest position, since pores are sometimes included and the crack is partially segmented too wide. This effect is expected to partially disappear when using a bigger training data set and including multiclass segmentation. However, single false positives are still expected to remain. Therefore, for the segmentation results, the full histogram of the crack widths and not only the maximum crack width is derived. In contrast to previous works, the crack width is measured in the projected 3D points. This method also works for uneven surfaces or cracks within bigger delaminations, where the GSD can vary strongly. The crack width for one medial axis point is defined as the sum of twice the distance to the closest contour point and the GSD of the respective contour point. This simplification saves computational time since the different contour sides do not have to be distinguished. In this more conservative approach, the crack width is defined between the contour pixel borders. However, for the proposed pipeline, different crack width definitions can be used, e.g., the distance between the contour pixel centers or adding just a fraction of one GSD depending on the brightness of the contour pixel.
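The width definition above, twice the distance from a medial axis point to its closest contour point plus the GSD at that contour point, can be written down directly. The sketch below uses a brute-force nearest-neighbor search and made-up coordinates in millimeters; the function name is hypothetical.

```python
import math


def local_crack_widths(medial_points, contour_points, contour_gsd):
    """Width per medial axis point: 2 * distance to closest contour point + its local GSD."""
    widths = []
    for m in medial_points:
        # Index of the closest contour point, regardless of the contour side
        idx = min(range(len(contour_points)),
                  key=lambda i: math.dist(m, contour_points[i]))
        widths.append(2.0 * math.dist(m, contour_points[idx]) + contour_gsd[idx])
    return widths


# Hypothetical projected 3D points (mm) and local GSD per contour point (mm)
medial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
contour = [(0.0, 0.1, 0.0), (0.0, -0.1, 0.0), (1.0, 0.15, 0.0), (1.0, -0.15, 0.0)]
gsd = [0.1, 0.1, 0.1, 0.1]
widths = local_crack_widths(medial, contour, gsd)
print(widths)
```

Collecting these values over all medial axis points of an image yields the width histogram discussed in the following paragraph.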
For testing the proposed method on the selected bridge, three cracks are chosen from the cracks detected and measured during the last main inspection. For each of the three cracks, Figure 27 shows a crack image with medial axis and contour points in 2D, a detail view of the projected points in 3D with three random measurements of the local crack width, and the histogram of the automatically derived crack widths for the respective image. The histograms are overlaid with lines representing the multiples of the mean image GSD. This helps to understand that the local crack widths are always close to a multiple of the GSD. The deviation is due to the local GSD and the orientation of the crack width. The distributions are largely consistent with the inspection measurements. The maximum crack width is slightly overestimated. This can be due to local segmentation results. Crack areas with a width of only one GSD appear on the left of the histogram. However, these are not counted as real measurements but rather as detections only. The derived crack width results can be used for further assessment, as explained in the section on corrosion estimation based on the 3D crack map.
Since identifying the maximum global crack width is still challenging, further post-processing steps are necessary. A minimum neighborhood as used by (Jahanshahi et al., 2013) can be defined. For this, the maximum length of a spalling must be defined. Another solution is removing crack width areas that do not have a minimum length or length-to-width ratio.
In the following, the challenge of defining the crack width is briefly discussed. Figure 28 and Figure 29 depict the measurement of a 0.2 mm crack using a crack scale card. The detailed view in Figure 29 shows that the crack appears wider than 0.2 mm depending on the definition of the crack flanks. The slope, roughness, and curvature vary strongly. Moreover, shadow and color make precise measurement more difficult. Here, one could go back to the manual annotation of the data. Nevertheless, improving the annotation and the number of vertices of the polygons would highly increase the annotation time per image. Another annotation approach would be drawing trajectories within the crack and using a semi-automated approach of deriving a polygon by stretching the crack width at multiple points of the trajectory. However, it must always be weighed up whether the semi-automated approach reduces annotation time while keeping or increasing the annotation quality. In this case, a trajectory could reduce the annotation time by a factor of two, since not both crack sides must be annotated. However, applying the crack width correctly along the crack length requires a quality check. The difference between both annotation approaches must be further evaluated and is not part of this work. Furthermore, it would require a better definition of the maximum crack width. The current manual approach is highly subjective. This makes creating high-quality ground truth data challenging. For that, either a precise description is required, or the ground truth data must be annotated or at least checked by trained inspectors.
Also important is the correct handling of multiple overlapping images. Due to the projection error, which results from the difference between the as-built and as-planned model, camera calibration errors, errors in the automated crack detection, and the error included in the SfM, there can be a misalignment within a crack. Figure 30 and, in detail, Figure 31 show an example of a projection error resulting from the accuracies listed in section 3. This error is in the centimeter range when combining close images and far images. This leads to inconsistent representations of damages acquired by multiple images. One solution is to image damages completely, which is not possible for damages with large dimensions while keeping the required GSD using standard camera sensors. Another possibility is using optimization to reduce the error by trying to match the sub-point clouds. Alternatively, the images representing the same damage could be stitched in 2D and then projected together using a homography matrix. Of course, the listed errors could be reduced, but millimeter accuracy is not expected to be reached, and even that would, in the case of a 0.2 mm crack, still be a gap that leads to a double representation of a crack. Furthermore, the CAD mesh can be split into a fine mesh, where each face can only receive the projection of one image. This leads to inconsistency at the face interfaces and is highly dependent on the face size. A different way is doing crack detection on a texture map and projecting the damage information from the photogrammetric mesh to the as-planned model mesh using, e.g., the closest distance to the mesh. Lastly, the projected point clouds can be matched in 3D using an iterative closest point (ICP) algorithm or different approaches. The correct handling of this problem is important both for overlapping images acquired during one data acquisition and for images from different inspection times, where one wants to compare the damage and monitor the crack evolution. The maximum errors depicted in Figure 31 are for the worst-case scenario. Usually, it can be assumed that the damage is captured with a similar or higher GSD and from the same distance and not from 20 different distances. Switching the sensor system with different calibration accuracies can lead to further errors, but this is not expected within one data acquisition. The last option is using further sensors for localization in 3D. By decoupling localization from damage detection and measurement, there are fewer dependencies on GSD, number of images, etc.
As listed, there are multiple approaches to solving the problem. The easiest approach is to block areas where damages were already projected, either per face or by using, e.g., a bounding box. The highest-quality approach is stitching the 2D images of one damage if it is not covered by one image. Thereby, the damage is not only documented in 3D but also in 2D, which can be used by the inspectors to double-check the results. Apart from the challenges, it is shown that representing submillimeter structures in 3D is possible with the proposed projection approach onto the CAD mesh.
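The core of the projection approach, intersecting the viewing ray of a crack pixel with the plane of a CAD face, can be sketched as follows. The example assumes that the ray direction has already been derived from the calibrated intrinsics and the registered camera pose, and it omits the check whether the hit point lies inside the face outline; all names and coordinates are illustrative.

```python
def dot(a, b):
    """Scalar product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def intersect_ray_with_plane(origin, direction, plane_point, plane_normal, eps=1e-12):
    """Return the 3D intersection of a ray with a plane, or None if there is none."""
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None                      # ray runs parallel to the face plane
    offset = tuple(p - o for p, o in zip(plane_point, origin))
    s = dot(offset, plane_normal) / denom
    if s <= 0.0:
        return None                      # plane lies behind the camera
    return tuple(o + s * d for o, d in zip(origin, direction))


# Hypothetical camera 1 m above a face lying in the plane z = 0
camera_center = (0.0, 0.0, 1.0)
pixel_ray = (0.1, 0.0, -1.0)             # viewing direction of one crack pixel
crack_point = intersect_ray_with_plane(camera_center, pixel_ray,
                                       plane_point=(0.0, 0.0, 0.0),
                                       plane_normal=(0.0, 0.0, 1.0))
print(crack_point)
```

Repeating this for every crack pixel of an image yields the per-image crack point cloud, whose alignment across overlapping images is the issue discussed above.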

Conceptual methodology for an automated crack inspection in practice
Automated data acquisition can be achieved using mobile platforms like UGVs or UAVs equipped with cameras. This requires mission planning, which can be either manually defined or automatically derived from the bridge's CAD data or point cloud data in combination with set parameters including sensor and lens specifications, minimum or maximum working distance, and the maximum camera angle to the bridge surface. Different approaches and concepts for automated inspection using UAVs (Angel Ortega et al., 2020; Bolourian and Hammad, 2020; Bono et al., 2022; Chen et al., 2019; Ivić et al., 2022; Shi, Mehrooz and Jacobsen, 2021) and UGVs (Charron et al., 2001; Merkle, Schmitt and Reiterer, 2020a; Peel et al., 2018) already exist. Due to the missing Global Navigation Satellite System (GNSS) signal below a bridge, navigation usually requires simultaneous localization and mapping (SLAM) based on either cameras or Light Detection and Ranging (LiDAR). The precise poses can be derived in post-processing using the overall trajectory and map information.
There are multiple possible scenarios for how the presented crack detection and measurement approach can be applied to real bridge inspections. In the first one, the bridge is completely imaged, including both damaged and non-damaged areas, from a distance that allows detecting the cracks but not measuring the crack width. Using a photogrammetric or further SLAM approach, the camera poses are derived. All images are segmented. Lastly, the crack information of the images containing damage information is projected onto the CAD model. In the manual inspection, only the areas with projected cracks must be double-checked by an inspector. For this scenario, the reliability of the damage detection must be high enough for this method to be certified and added to the given regulations.
The second scenario assumes that the complete automated detection is done after the manual inspection in case additional digital documentation with a detailed 3D crack map is required.
A third scenario relies on taking images of detected damages in parallel to the manual inspection by the inspector. Assuming that the bridge was already completely photogrammetrically recorded, the camera pose can be derived, and the data can be projected onto the CAD model. If required, the damage detections are double-checked by a person for verification.
Regarding the crack width, either a sufficient GSD from a far distance or an iterative approach with varying distances can be used. In the case of a mobile platform, a far-distance image can already be used for crack detection. If a crack is detected, either multiple close-range images can be taken to cover the complete crack or only one image of the widest crack width position is taken. The latter is difficult and requires the expertise of the inspector. Another approach is that the crack width and its position marked with chalk could be used to correct the acquired 3D crack data. However, without a sufficient GSD, measuring the crack width is highly challenging. Therefore, the assessment of the crack data, presented in the section on corrosion estimation, still requires reliable crack width information.
To make use of the obtained crack parameters, high data quality is required. This is even more true for prestressed concrete, where crack widths of 0.1 mm and below can already indicate a loss of prestress in the structure. As the number of results provided by an automated crack inspection might be huge (depending on bridge size and condition), an automated evaluation method is also needed. Therefore, the severity of an identified crack should not only be based on crack width; the date of crack occurrence and the crack position should also be considered. As shown in Section 6, this information can be used to obtain degraded structural parameters (e.g., reduced rebar diameters). However, further research is needed to quantify the uncertainties resulting from the assumptions made.

Corrosion estimation based on 3D crack map
A possible future application of the obtained 3D crack map with crack width information is the estimation of chloride-induced reinforcement corrosion.Based on (Bundesministerium für Verkehr und digitale Infrastruktur, 2019), chloride exposure is to be assessed depending on the chloride penetration depth.According to (Zilch, 2011), the effect of cracks on the chloride ingression can be estimated based on the durability assessment (grades: 1 -best and 4 -worst).When equal grades for crack assessment and chloride ingression are linked, the interaction between the depth of chloride ingression and crack width can be established as listed in  According to (Zilch, 2011), a limit state equation for depassivation of the reinforcement can be described empirically by: where   () is the time-dependent limit state function for chloride-induced reinforcement depassivation.Further based on (Bundesministerium für Verkehr und digitale Infrastruktur, 2017) and (Haardt, 2013) the depth of chloride ingression can be calculated by: where   ((Pommerening, Freitag and Stadler, 2008); (Novak, Brosge and Reichert, 2003)) is the chloride migration coefficient for exposure period [cm² / a],   is the proportion of exposure duration from concrete age [-], and  is the time of exposure [a].Assuming that  is equal to the age of the considered concrete part,   is available from local measurements or experience, and the as-planned concrete cover for the position of a detected crack can be derived from an existing BIM-model, the calculation of   () is possible.Furthermore, by setting   () to zero and solving for , the time it takes the chloride ions to reach the reinforcement can be determined.
By using d* in equation ( 3), a limit state value for each identified crack width range can be calculated.However, a case distinction is to be made regarding  in the case that d* is equal to two-thirds of the concrete cover.The first case depicts the period the chloride ions already had to diffuse into the concrete cover (typically the age of the considered concrete part) and the second case describes the case that chloride ingression starts from two-thirds of the concrete cover using the period between crack occurrence and the current date as .Consequently, in the case that d*=1/3 , equation ( 3) can be formulated as: where dc is the current time and do the time of the first crack occurrence.The upper condition calculates whether chloride ingression reached the reinforcement without the presence of the crack (in case the crack just formed and did not accelerate the ingression).The lower condition covers the case that after crack formation the chloride ions still need to pass the remaining one-third of the concrete cover (checking whether the time difference between crack occurrence and date of condition assessment is sufficient for reaching the reinforcement).If the upper condition results in a negative value and the lower condition in a positive one, it could be a hint of a crack that is already induced by the corrosion of the reinforcement.
After having determined   () for every detected crack, the following outcomes are possible.Subsequently, with the time after depassivation already known, an empirical model for corrosion-induced steel loss can be formulated according to (Zilch, 2011).
where s is the absolute steel erosion [µm], rcl is the empirical rate of corrosion-induced steel erosion [µm / a] (for example a value of 30µm/a is suggested in (Novak et al., 2003)) and td is the elapsed time since depassivation [a].
Given that rcl can be estimated conservatively (values might be taken from literature e.g.(Novak, Brosge and Reichert, 2003) and (Hunkeler, Muehlan and Ungricht, 2006)) the reduced cross-section of the reinforcement (assuming that corrosion appears constantly around the rebar circumference) is given by: where A* is the reduced cross-section [cm²] and d s is the rebar diameter [cm].Monitoring the reduced crosssection over time for each detected and measured crack using the proposed crack map generation including width information, and corresponding rebar location, the degradation process can be modelled.In addition, the proposed approach might help with the identification of hotspots or locations that require special attention during the next inspection.
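The steel loss and the reduced cross-section can be illustrated with a small numerical sketch. It assumes a linear loss model, s = r_cl · t_d, and a uniform attack around the circumference, A* = π/4 · (d_s − 2s)²; both forms are assumptions that are consistent with the symbol definitions above and are not taken from the cited references. The rebar diameter and the elapsed time are hypothetical values.

```python
import math


def steel_loss_um(rate_um_per_year, years_since_depassivation):
    """Assumed linear model: loss s = r_cl * t_d in micrometers (no loss before depassivation)."""
    return rate_um_per_year * max(years_since_depassivation, 0.0)


def reduced_cross_section_cm2(rebar_diameter_cm, loss_um):
    """Assumed uniform attack: A* = pi / 4 * (d_s - 2 s)^2 with s converted from um to cm."""
    loss_cm = loss_um * 1e-4
    remaining_diameter = max(rebar_diameter_cm - 2.0 * loss_cm, 0.0)
    return math.pi / 4.0 * remaining_diameter ** 2


# Rate of 30 um/a as mentioned in the text; 10 years and 1.6 cm diameter are hypothetical
loss = steel_loss_um(rate_um_per_year=30.0, years_since_depassivation=10.0)
area = reduced_cross_section_cm2(rebar_diameter_cm=1.6, loss_um=loss)
print(loss, area)
```

Evaluating these two functions for every crack and inspection date gives the time series of reduced cross-sections that the monitoring concept relies on.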

CONCLUSION
This work presents a semi-automated approach for detecting and localizing thin cracks. With the sensor system used, detecting 0.2 mm cracks is possible. Due to the challenging surface of the test bridge, the performance of the crack detection is limited, with both false negatives and false positives occurring. It is expected that a bigger and more diverse dataset for training the neural network will improve the results. Furthermore, this work proposes an approach to measuring the crack width by projecting the contour and medial axis pixels of the cracks onto the bridge mesh.
The presented workflow overcomes the limitations of planar surfaces and of texturizing the overall mesh, and allows the use of the same CAD model for multiple inspections and for monitoring crack growth over time. Based on the projected points, the widths along a crack are derived in 3D. Since the crack width is partially overestimated by the neural network, a further step such as classical post-processing is required. Within this work, adaptive filtering based on the greyscale values of the cracked area is applied. The accuracy depends strongly on a sufficient GSD, which is still challenging with current camera systems when aiming for a big field of view. To measure the crack width, a GSD lower than the crack width is required. The histograms of the measured crack widths of different examples match well with the official inspection results. Deriving the maximum crack width is still challenging due to local spalling within a crack or local false positives. The authors assume that this problem remains even when using higher resolution or more advanced neural networks. However, by defining a local minimum crack length for the maximum crack width, or by further heuristics, those outliers could be removed. The resulting crack width and the referenced 3D crack data can then be used for further analysis.
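The heuristic of requiring a local minimum crack length for the maximum crack width can be sketched as follows. The sketch assumes that widths are sampled at a constant spacing along the medial axis, and it only accepts a width as the maximum if every sample in a contiguous stretch of at least `min_length_mm` reaches it. The function name, its parameters, and the sample values are hypothetical and not taken from the presented implementation.

```python
import math
from collections import deque
from typing import Sequence


def sustained_max_width(widths_mm: Sequence[float],
                        spacing_mm: float,
                        min_length_mm: float) -> float:
    """Largest width that is sustained over a stretch of at least min_length_mm.

    Implemented as the maximum of a sliding-window minimum, so isolated spikes
    (e.g. local spalling or single false positives) cannot define the result.
    If the crack is shorter than min_length_mm, the whole crack is one window.
    """
    if not widths_mm:
        raise ValueError("no width samples")
    # n samples at constant spacing span (n - 1) * spacing
    window = math.ceil(round(min_length_mm / spacing_mm, 9)) + 1
    window = max(1, min(window, len(widths_mm)))

    best = 0.0
    candidates = deque()  # sample indices; front is the minimum of the window
    for i, width in enumerate(widths_mm):
        while candidates and widths_mm[candidates[-1]] >= width:
            candidates.pop()
        candidates.append(i)
        if candidates[0] <= i - window:
            candidates.popleft()  # front index has left the window
        if i >= window - 1:
            best = max(best, widths_mm[candidates[0]])
    return best


if __name__ == "__main__":
    # hypothetical widths [mm] every 1 mm; 1.5 mm is a single-sample outlier
    samples = [0.2, 0.2, 0.3, 1.5, 0.3, 0.3, 0.4, 0.4, 0.4, 0.2]
    print("naive maximum:    ", max(samples))
    print("sustained maximum:", sustained_max_width(samples, 1.0, 2.0))
```

The window length is a design choice that trades robustness against sensitivity: a longer minimum length suppresses more outliers but may also hide short, genuinely wide sections of a crack.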
At the end of this work, a conceptual methodology for automated crack inspection in practice is discussed, taking into account different scenarios and the accompanying challenges. Though relying on several assumptions, regular crack map generation using the presented scheme could help to reveal the actual deterioration process of bridge structures. This would include the possibility of clearly identifying newer cracks and separating them from already identified ones. Such information could be most useful for planning and lifecycle management of road networks.

FUTURE WORK
As part of this paper, a new approach to crack detection, localization, and measurement is proposed. Moreover, a subsequent way of crack assessment is presented. Currently, the approach has the following limitations: partial overestimation of the crack width, detection of crack-like structures as cracks depending on the GSD, projection errors in the mm to cm range, dependency on a robust image set for the photogrammetric approach, and a sensor resolution that must allow a low GSD while keeping a big FOV to reduce the number of required images. Therefore, future work includes improving the accuracy of the proposed approach, creating a bigger data set, and testing high-resolution sensors in combination with further algorithms to overcome the problem of crack width measurement and to deal with overlapping images. Further information can be derived using a component-based interpretation of damage instead of treating the overall inspectable surface as one mesh. This allows longitudinal, transverse, and diagonal cracks to be differentiated depending on the main component axis, which is required for damage grading. Furthermore, multiple classes can be detected, such as spalling and repaired damage. This will increase the probability of identifying the maximum crack width correctly. Since other damage types have bigger dimensions compared to crack widths and do not have to be detected with millimeter precision, they will be less critical. Apart from further research, more precise rules for the manual inspection must be defined, for example a clear definition of the crack width and of the difference between crazing and a normal crack. Moreover, the maximum distance and change of direction between two cracks must be defined for a correct distinction of cracks. The combination of precise rules, including example data, improved technologies, and ways to certify an automatic system will pave the way for the bridge inspection of the future.
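The trade-off between a low GSD and a big FOV mentioned above can be made concrete with a small calculation. The sketch assumes an ideal pinhole camera looking perpendicularly at a flat surface, for which GSD = pixel pitch × working distance / focal length, and a regular image grid with constant overlap. All camera values, the surface size, and the function names are hypothetical and do not describe the camera used in this work.

```python
import math
from typing import Tuple


def gsd_mm(pixel_pitch_um: float, focal_length_mm: float, distance_mm: float) -> float:
    """Ground sampling distance [mm/pixel] of an ideal pinhole camera."""
    return (pixel_pitch_um / 1000.0) * distance_mm / focal_length_mm


def footprint_mm(gsd: float, width_px: int, height_px: int) -> Tuple[float, float]:
    """Surface area covered by one image (width, height) [mm]."""
    return gsd * width_px, gsd * height_px


def images_required(area_w_mm: float, area_h_mm: float,
                    footprint: Tuple[float, float], overlap: float) -> int:
    """Number of images on a regular grid covering the area with the given overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    counts = []
    for length, covered in ((area_w_mm, footprint[0]), (area_h_mm, footprint[1])):
        step = covered * (1.0 - overlap)  # advance between neighbouring images
        extra = max(0.0, length - covered)
        counts.append(math.ceil(extra / step) + 1)
    return counts[0] * counts[1]


if __name__ == "__main__":
    crack_width_mm = 0.2
    for distance in (500.0, 1000.0, 2000.0, 4000.0):  # working distance [mm]
        g = gsd_mm(4.0, 50.0, distance)  # hypothetical 4 um pixels, 50 mm lens
        fp = footprint_mm(g, 6000, 4000)  # hypothetical 24 MP sensor
        n = images_required(10000.0, 2000.0, fp, overlap=0.6)
        ok = "yes" if g < crack_width_mm else "no"
        print(f"distance {distance:6.0f} mm: GSD {g:.3f} mm, images {n:5d}, "
              f"GSD below crack width: {ok}")
```

Under these assumptions, doubling the working distance doubles the GSD while cutting the number of images roughly by a factor of four, which illustrates why a higher sensor resolution is needed to keep both the GSD and the image count low.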

Figure 1: Cylindrical concrete specimen (diameter of 15 cm and height of 30 cm) with cracks.

Figure 2: Processing workflow to project crack pixels of images onto faces of the CAD mesh.

Figure 3: Scaling and transformation of the sparse point cloud and camera poses (red) to CAD coordinates.

Figure 4: Crack pixels masked by the currently projected face.

Figure 5: Manually created CAD mesh with a double projection of one image (blue) and another image (red) leading to spatial shift.

Figure 6: Manually created CAD mesh with projected points representing cracks by projecting only damage information of the image with the lowest angle to the face normal per face.

Figure 8: Sectional view of reference bridge.

Figure 9: Top view (top) and side view (bottom) of the bridge with camera poses (in red) for data acquisition. The arrows indicate the photographic direction.

Figure 11: Perspective of the underside of the textured mesh including problems in the ceiling area.

Figure 12: Annotated image (left) and detailed view (right) with the polygon in red.

Figure 13: Crop of a crack image (original left, resampled image right).

Figure 16: Prediction of crack with 0.2 mm width from different working distances (10 cm to 100 cm).

Figure 17: Image (left) and prediction (right) of a cracked area labeled with chalk during the last inspection.

Figure 18: Prediction of a crack (left), a crop of it (middle), and the result of adaptive filtering of the grayscale values in the predicted area (right).

Figure 20: CAD model of the bridge based on 2D construction plans.

Figure 21: Inspectable surface of the underside of the bridge manually derived from CAD model without bearings.

Figure 22: Registration of point cloud and CAD including scaling and transformation by using point matching of distinctive features.

Figure 23: Camera poses in CAD coordinates overlaid with the inspectable surface.

Figure 24: Point cloud clustered into individual cracks or crack-like structures using DBSCAN. The numbers refer to the different clusters.

Figure 25: Crack and crack-like structures.

Figure 26: Trajectories fitted to individual cracks using a defined distance between trajectory points.

Figure 27: Crack images (left), detailed view of crack point clouds (medial axis (red) and contour (blue)) with exemplary width measurements (middle), and histogram of crack width (multiples of the GSD are dashed) (right).

Figure 28: Crack width measurement of a 0.2 mm crack using a crack scale card.

Figure 29: Detail view of crack measurement.

Figure 30: Projection error leading to inconsistent projections of 20 images taken from 10 cm to 200 cm distance.

Figure 31: Detailed view of projection errors leading to errors in the centimeter range for a worst-case scenario (combination of close and far distance images: ca. 38 mm deviation of crack location).

1. Limit state value > 0: chloride ions have not yet reached the reinforcement and the remaining time can be derived.
2. Limit state value = 0: chloride ions have just reached the reinforcement and depassivation starts.
3. Limit state value < 0: depassivation has already occurred. The time elapsed since depassivation can be calculated by solving equation (4) for the time and taking the absolute value of the limit state value as xc.

Table 1: Specification of the camera used.

Table 2: Test of a maximum combination of working distances for SfM depending on the setting, with the respective processing times for feature extraction (FE), image matching (IM), feature matching (FM), and SfM.
Columns: FE setting | Max. GSD scale [mm] | FE [s] | IM [s] | FM [s] | SfM [s]

Table 3: Comparison of mask and prediction of synthetic squares, rectangles, and crack-like polygons in number of pixels. The ratio of the areas is marked in red if the error is more than 10 %.