Estimating the performance of multi-rotor unmanned aerial vehicle structure-from-motion (UAVSfM) imagery in assessing homogeneous and heterogeneous forest structures: a comparison to airborne and terrestrial laser scanning

The implementation of Unmanned Aerial Vehicles (UAVs) and Structure-from-Motion (SfM) photogrammetry in assessing forest structures for forest inventory and biomass estimation has shown great promise in reducing costs and labour intensity while providing reasonable accuracy. Tree Height (TH) and Diameter at Breast Height (DBH) are two major variables in biomass assessment. UAV-based TH estimations depend on reliable Digital Terrain Models (DTMs), while UAV-based DBH estimations depend on a reliable dense photogrammetric point cloud. The main aim of this study was to evaluate the performance of multi-rotor UAV photogrammetric point clouds in estimating homogeneous and heterogeneous forest structures, and to compare them to more accurate LiDAR data obtained from Aerial Laser Scanners (ALS) and Terrestrial Laser Scanners (TLS), and to more conventional means such as manual field measurements. TH was assessed using DTMs derived from UAVSfM and LiDAR point clouds, while DBH was assessed by comparing the UAVSfM photogrammetric point cloud to LiDAR point clouds, as well as to manual measurements. The results indicated a high correlation between UAVSfM TH and ALSLiDAR TH (R² = 0.9258) for homogeneous forest structures, while lower correlations between UAVSfM TH and TLSLiDAR TH (R² = 0.8614) and between UAVSfM TH and ALSLiDAR TH (R² = 0.8850) were achieved for heterogeneous forest structures. A moderate correlation was obtained between UAVSfM DBH and field measurements (R² = 0.5955) for homogeneous forest structures, as well as between UAVSfM DBH and TLSLiDAR DBH (R² = 0.5237), but a low correlation was obtained between UAVSfM DBH and UAVLiDAR DBH (R² = 0.1114). The study demonstrated that UAV-acquired imagery can be used to accurately estimate TH in both forest types, but has challenges estimating DBH.
The research does not suggest that UAVSfM serves as a replacement for higher-cost, more accurate LiDAR data, but rather as a cheaper, adequate alternative in forestry management, depending on accuracy requirements.


Introduction
Many developing countries rely on National Forest Inventories (NFIs) for their biomass and carbon stock estimates. These estimates support efforts to be part of the Reducing Emissions from Deforestation and forest Degradation plus forest conservation, sustainable management of forests and enhancement of forest carbon stocks (REDD+) mechanism, which provides developing countries with a financial incentive for reducing forest degradation and deforestation. One of the requirements to benefit from the REDD+ mechanism is that participating countries of the United Nations Framework Convention on Climate Change (UNFCCC) report their verified national biomass and carbon estimates. These countries are therefore expected to have capable systems for carbon monitoring, and technologies or methodologies with which to obtain this data. Unfortunately, many of these countries do not have comprehensive NFIs, which are run at high operational cost and are highly labour intensive. Due to these limitations in the conventional acquisition of the necessary measurements and subsequent estimation of biomass, Remote Sensing has played a vital role in the last few decades in estimating above-ground biomass more efficiently and cost-effectively (Günlü et al., 2014).
Remote Sensing has been used extensively in forest management and monitoring as it provides observations over a large area and can be repeated with ease after the initial application, thereby offering a time-saving alternative. Techniques such as the use of satellite imagery are a popular, inexpensive, and valuable alternative to conventional field measurement methods (Maina et al., 2017). However, satellite imagery usually suffers from relatively poor spatial resolution, and is periodically captured with extensive cloud cover, making processing challenging. Radio Detection and Ranging (RADAR), Aerial Laser Scanning (ALS), Terrestrial Laser Scanning (TLS), and optical images such as satellite imagery or large-scale photography have proven to be useful substitutes for conventional methods, but are expensive, labour intensive, and time-consuming. These systems have been widely used in forest management with varying success due to differences in vegetation types, environmental conditions, forest canopy cover, and the methods used.
For instance, RADAR systems such as space-borne Synthetic Aperture RADAR (SAR) have the advantage of being able to operate regardless of weather and daylight and are able to penetrate forest canopies. However, challenges exist with polarization, land cover, terrain properties, and the incidence angle of the sensor (Maina et al., 2017), as well as poor spatial resolution. Distinguishing vegetation types is also a challenge for RADAR, as it is hampered by poor spectral resolution.
ALS and TLS are also weather and daylight independent, and have shown great potential for forest inventory acquisition and biomass estimation in varying forest structures (Iizuka et al., 2018). ALS, however, often has difficulty adequately capturing below-canopy forest structures such as complete tree trunks in very dense forests, depending on the sensor used, while TLS is able to capture below-canopy forest structures but falls short at capturing the top of the forest canopy (Wilkes et al., 2017). Both can be labour intensive, costly, and time consuming.
Unmanned Aerial Vehicles (UAVs) have been utilised over the last few decades in numerous surveying and monitoring applications. They have gained use in forestry management in recent years, especially with Structure-from-Motion (SfM) photogrammetry and readily available stereo-matching software, for estimating forest variables in Pinus forests with relative success (Guerra-Hernández et al., 2016; Galidaki et al., 2017; Mlambo et al., 2017). Low-altitude UAV imagery has been used to assess forest canopy height (Lisein et al., 2013), to generate regression models to calculate individual tree biomass (Jones et al., 2007), and to produce digital surface models using photogrammetric methods (St-Onge et al., 2008). UAV imagery has also been used to estimate biomass in dry woodlands of Malawi, to estimate Japanese Cypress (Chamaecyparis obtusa) TH and DBH from digital surface models and orthophotos using UAVSfM imagery (Iizuka et al., 2018), and to compare UAVSfM-derived point cloud data of tree variables to LiDAR data (Puliti et al., 2015).
Although UAVs and SfM have been used successfully in these studies, their results are not without flaws, even if marginal. Puliti et al. (2015) recorded correlations of R² = 0.710, R² = 0.970, R² = 0.600, R² = 0.600, and R² = 0.850 for the variables Lorey's Mean Height (hL), Dominant Height (hdom), Stem Number (N), Basal Area (G), and Stem Volume (V) respectively, but only after combining SfM and ALS data, due to a deficiency in ground imagery data resulting from UAVSfM only achieving limited penetration of top-canopy forests. This is a known limitation of SfM.
From the studies mentioned above, it is evident that the accuracy of forest variables, such as TH, depends on the ability of the sensor to acquire not only above-canopy forest structures, but also below-canopy forest structures, such as the ground. As such, the accuracy of TH estimations, for example, is a function of an accurate Digital Terrain Model (DTM). ALS sensors are able to acquire this with ease, as they are airborne sensors whose narrow laser beams are able to penetrate the small gaps between the vegetation. TLS can also achieve this, but it requires more manoeuvring around obstacles and a number of setups. A photogrammetric point cloud can be obtained of dense forests; however, it is often limited to the top canopy, as no narrow laser beams are available to penetrate the small gaps in the vegetation. Photogrammetrically derived terrain model data requires stereo image coverage of the ground, which is unlikely in areas with dense tree canopy structures. Oblique imagery can be incorporated to acquire additional below-canopy structures. In this study, in addition to nadir and oblique imagery, tessellated façade aerial imagery was incorporated, as was done by Carnevali et al. (2018) when using UAVs and photogrammetry for modelling historical buildings for architectural purposes.
The tessellated façade imagery served as an alternative to terrestrial photography using DSLR cameras for acquiring sufficient below-canopy forest structure data, thereby creating a completely UAV-reliant approach to acquiring forestry data.
When estimating TH and DBH using photogrammetric point cloud data, several phases are involved. For TH, the point cloud first needs to be classified into ground and non-ground (vegetation etc.) points. The ground points can then be used to create a continuous ground surface (DTM), while the ground and non-ground points together can be used to create a Digital Surface Model (DSM), which shows the elevation of all present structures. Finally, a Canopy Height Model (CHM) can be created, which shows the absolute height of any present structures and thereby yields the TH.
As for DBH, the measurements can be extracted from the photogrammetric point cloud at 1.37 m above ground level (Malone et al., 2009), or the trunks can be modelled as cylinders and their diameters extracted (Olofsson and Holmgren, 2017).
ALS and TLS LiDAR data are generally considered to be the best for creating DTMs and extracting forest variables (Colomina and Molina, 2014), as they are more reliable and technologically better suited to data capture in dense forest structure environments. However, these technologies can incur high costs. UAVSfM may offer a cheaper alternative, within limits.
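The final step of this workflow, subtracting the terrain surface from the top surface, can be illustrated with a minimal sketch. It assumes both models have already been rasterised onto the same grid; the function name `canopy_height_model`, the use of `None` for empty cells, the clamping of negative differences to zero, and the sample elevations are all illustrative assumptions rather than details of the study.

```python
def canopy_height_model(dsm, dtm):
    """Return CHM = DSM - DTM for two equally shaped grids (lists of rows).

    Cells that are None in either grid stay None. Negative differences
    (surface below terrain) are clamped to 0.0, which is an assumption
    of this sketch and not a step described in the study.
    """
    if len(dsm) != len(dtm) or any(len(a) != len(b) for a, b in zip(dsm, dtm)):
        raise ValueError("DSM and DTM must have the same grid shape")
    chm = []
    for surface_row, terrain_row in zip(dsm, dtm):
        row = []
        for surface, terrain in zip(surface_row, terrain_row):
            if surface is None or terrain is None:
                row.append(None)  # no data in one of the models
            else:
                row.append(max(surface - terrain, 0.0))
        chm.append(row)
    return chm


# Illustrative elevations in metres (made-up values, not study data).
dsm = [[42.0, 55.5], [41.8, None]]
dtm = [[41.5, 41.6], [41.9, 41.7]]
chm = canopy_height_model(dsm, dtm)
```

Each CHM cell then holds a height above ground, so the value at a tree crown approximates that tree's TH.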

The Study Area
The three study areas chosen for this research project, HomoFS, HeteroFS1, and HeteroFS2, are located at Rondebosch Common, the University of Cape Town, and Steenbras Dam Nature Reserve respectively, all in Cape Town (Figure 1). Cape Town lies along the western coastline of the Western Cape province of South Africa at latitude 33°55'33.0" S, longitude 18°25'23.6" E, approximately 30 m above mean sea level. Cape Town was chosen because of its unique climate in comparison to the rest of the country: it has a winter-rainfall Mediterranean climate, compared to the subtropical summer-rainfall climate experienced by the rest of the country (Tuswa et al., 2019). This makes Cape Town, and the Western Cape as a whole, a unique location for the growth of several vegetation types, including woody homogeneous and heterogeneous Pinus forest structures, that are unique to this region of South Africa and the world. HomoFS, HeteroFS1, and HeteroFS2 were planned at each of the respective study areas as Regions of Interest (ROIs).
Figure 1. The location of the study areas and their relative location to each other.

Equipment
The data used in this research endeavour include 20MP

Design, Field Measurement and Ground Control Collection
Field measurements were used to obtain the DBH of trees at HomoFS (Rondebosch Common), while at the other two sites, HeteroFS1 (University of Cape Town) and HeteroFS2 (Steenbras Dam), DBH was collected using TLS (TLSLiDAR), ALS (ALSLiDAR) and a UAV with ALS (UAVLiDAR). The circumferences of 30 trees with DBH ≥ 5 cm at HomoFS were collected on 20 June 2018, using a measuring tape wrapped around each tree trunk 1.37 m above ground level (Malone et al., 2009). The DBH was calculated using Equation 1, suggested by González-Jaramillo et al. (2019), where $D$ is the tree diameter, $C$ is the tree circumference, and $\pi$ is equal to 3.14:
$$D = \frac{C}{\pi} \qquad (1)$$
Ground Control Points (GCPs) were included for each UAVSfM survey of each site to aid in correcting shifts and distortions due to possible loss of, or poor, Inertial Measurement Unit (IMU) and Global Navigation Satellite System (GNSS) measurements recorded by the UAV during flight (Puliti, 2017), and to possibly improve overall image registration (Dandois et al., 2015). Easily identifiable features such as road markings or curb corners were used as GCP markers where possible. Where no such features existed, 1 m by 1 m black and white checkered mats were laid out as GCPs to be captured in the UAV imagery.
To achieve this, 30 cm long, 10 mm round iron pegs were hammered into the desired positions and the mat centres aligned with the pegs. The centres of the mats were surveyed for their precise vertical and horizontal positions. In this study, all GCPs were surveyed using a Trimble R4 differential Global Navigation Satellite System (dGNSS) unit. The dGNSS unit comprised two Trimble R4 receivers: a base receiver and a rover receiver. Both receivers were set to observe pseudo-range and carrier phase signals of the Global Positioning System (GPS) and the Global Navigation Satellite System (GLONASS) to provide the precise positions of the GCPs. Three different GNSS survey styles were used in this study. At HomoFS, a base was established, and the rover was used to navigate to and survey in the positions of three Town Survey Marks (TSMs) around the area, as well as the GCPs in
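Returning to the field DBH measurements described in this section, the conversion from a taped circumference to a diameter is simple enough to sketch in a few lines. The function name `dbh_from_circumference` and the sample circumference are illustrative assumptions; the optional `pi` argument only exists so that the rounded value of 3.14 quoted in the text can be reproduced.

```python
import math


def dbh_from_circumference(circumference_m, pi=math.pi):
    """Diameter of a trunk from its taped circumference: D = C / pi."""
    if circumference_m <= 0:
        raise ValueError("circumference must be positive")
    return circumference_m / pi


# A hypothetical 0.942 m circumference, using pi = 3.14 as in the text.
dbh = dbh_from_circumference(0.942, pi=3.14)
```

Using `math.pi` instead of 3.14 changes the result by about 0.05%, which is well below the tape-measurement uncertainty one would expect in the field.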

Collection of UAV Imagery
Nadir and oblique UAV imagery at HomoFS was acquired on three different days: 20 June 2018, 22 June 2018, and 18 July 2018. Nadir and oblique UAV imagery was also captured over two days for HeteroFS1, with the tessellated façade imagery being captured on a third day: 12 February 2019, 23
Before every take-off for each survey, the mats were laid out over the peg positions and the corners fastened to the ground, and the take-off and landing zones were cleared of obstructions, e.g., by positioning the take-off and landing zone away from the tree edge. Since the GCPs were surveyed in the initial ground control survey, there was no need to survey them again. All the nadir and oblique flights were planned and executed on a Samsung Galaxy Note 5 smartphone with Pix4D Capture (Pix4Dcapture, 2019) (Figure 3). In all UAVSfM flights, forward lap and side lap were set to 80% to ensure adequate overlap of subsequent imagery. Due to South African Civil Aviation Authority (SACAA) regulations on safe and acceptable operations of a Remotely Piloted Aircraft System (RPAS), no flight was planned for, or flown, above 120 m (400 ft) above ground level (AGL) (SACAA, 2021). Table 1 below summarises the nadir flight parameters for each study area, while Table 2 shows the oblique flight parameters. Images flown at higher AGL provided a wider scene, with more features available for the image matching phase, while the images acquired at lower AGL provided higher resolution imagery for finer detail.
The Table 3. respectively.
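To make the 80% overlap setting concrete, the sketch below relates flying height, image footprint, and the distance between consecutive exposures. It is a geometric illustration only: the field-of-view value is hypothetical (the camera geometry is not given in this section), the function names are invented for the example, and flat terrain with a nadir-pointing camera is assumed.

```python
import math


def ground_footprint(altitude_m, fov_deg):
    """Ground length covered by one nadir image along one axis (flat terrain)."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def exposure_spacing(footprint_m, overlap):
    """Distance between consecutive exposures (or flight lines) for a given overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint_m * (1.0 - overlap)


# Hypothetical 90 degree field of view at the 120 m AGL ceiling.
footprint = ground_footprint(120.0, 90.0)
spacing = exposure_spacing(footprint, 0.80)
```

With 80% overlap each ground point appears in roughly five consecutive images along a flight line, which is what gives the image matching phase enough redundancy.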
However, the survey of HeteroFS2 was discarded for various reasons: some targets placed on the tree trunks fell off during the survey, making referencing between stations challenging; rampant below-canopy subspecies growth obstructed line-of-sight between some targets and the scan positions; the absence of distinct features in a monotonous environment made referencing scans difficult; and the thick forest top-canopy prevented the in-built GPS from receiving the satellite signals needed to assist with instrument orientation and referencing. These challenges meant that the scans could not be registered with high enough accuracy and low enough residuals to constitute an accurate and successful survey; a return to site to repeat the survey was also not possible within a reasonable timeframe. The scans were acquired using a Z+F Imager® 5010X Terrestrial Laser Scanner. The scanner, which weighed a total of 11 kg with the battery included, was equipped with an IMU and a GNSS.
Several A4 sheets with opposing black triangles were taped onto as many tree trunks as possible to maximise the chances of clear line of sight between scan positions, while some A4 sheets were placed on easily identifiable positions on the parking bays. Some of these markers were used as CPs to validate the accuracy of the generated point cloud and models. The mats used in the UAVSfM surveys were also placed on their respective positions to be surveyed during the LiDAR survey to assist with registration of the scans. A total of 44 targets were initially placed throughout HeteroFS1 to assist with referencing the images on both the horizontal and vertical planes; however, González-Jaramillo et al. (2019) argue that no more than 6 targets are necessary. Scan registration was done in Z+F LaserControl® v9.0.2.24038, and cleaning in Autodesk Recap v6.0.0 (Autodesk, 2021). Table 4 shows the TLS registration statistics.

Image Processing
The imagery was processed in Agisoft Metashape v1.5.1 (Agisoft, 2017), and an orthophoto of the scene was finally created. The processes followed were similar to those followed by Puliti et al. (2015). Automated batch files in the software were created for each of the nadir, oblique and tessellated façade imagery datasets to speed up processing. These individual point clouds were later combined into one dense point cloud (Figure 4). Processing was done on a Lenovo Y70 (Lenovo, 2020) high-performance gaming laptop with an Intel® Core™ i7 processor, an NVIDIA® GTX-860M graphics card with 4 GB VRAM, and 16 GB DDR3L RAM. Due to extensive invasion of below-canopy subspecies, which caused numerous challenges including shadows and obstruction of tree trunks, the photogrammetric point cloud for HeteroFS2 was discarded after several attempts to reprocess the dataset failed to give more favourable results.

Generation of DTMs
Terrain models were created for HomoFS and HeteroFS1 from both the UAVSfM photogrammetric point cloud and the ALSLiDAR data: a 10 cm Digital Surface Model (DSM), a 10 cm Digital Elevation Model (DEM), and a 10 cm Canopy Height Model (CHM) for each point cloud dataset (Iizuka et al., 2018; Guerra-Hernández et al., 2016). Creating higher resolution terrain models, i.e. finer than 10 cm, would increase processing time: various attempts using a grid spacing of less than 10 cm led to longer processing times and software crashes. Both the LiDAR data and the UAVSfM photogrammetric point cloud were classified into appropriate ground and non-ground (vegetation) point classes. The DSMs were modelled using both the classified ground and non-ground point classes, while the DEMs were modelled using only the classified ground point class (Figure 6). The CHM, which represents the absolute tree height, was the difference between the two (Lim et al., 2003; González-Jaramillo et al., 2019):
$$\mathrm{CHM} = \mathrm{DSM} - \mathrm{DEM} \qquad (2)$$
All the terrain models were created using the Triangulated Irregular Network (TIN) method to create a surface.
Figure 6. A 10 cm DSM and a 10 cm DEM of some of the trees in the HomoFS area.
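As an illustration of how a TIN surface yields an elevation at an arbitrary grid cell, the sketch below linearly interpolates inside a single triangle using barycentric weights. It is not the routine used by the software in this study; the function name `interpolate_in_triangle`, the tolerance, and the sample coordinates are assumptions made for the example, and building the triangulation itself is left out.

```python
def interpolate_in_triangle(p, a, b, c, tol=1e-12):
    """Linearly interpolate z at p = (x, y) inside triangle a, b, c.

    a, b and c are (x, y, z) vertices of one TIN facet. Returns None
    when p lies outside the triangle, so a caller can try the next facet.
    """
    px, py = p
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if abs(det) < tol:
        raise ValueError("degenerate triangle (vertices are collinear)")
    # Barycentric weights of p with respect to the three vertices.
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    w_c = 1.0 - w_a - w_b
    if min(w_a, w_b, w_c) < -tol:
        return None
    return w_a * az + w_b * bz + w_c * cz


# Made-up ground points (x, y, z) in metres.
triangle = ((0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0))
z_inside = interpolate_in_triangle((2.0, 2.0), *triangle)
z_outside = interpolate_in_triangle((20.0, 20.0), *triangle)
```

Evaluating this at the centre of every 10 cm cell, once for the ground-only triangulation and once for the all-points triangulation, would give the two rasters that Equation 2 subtracts.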

DBH
All non-ground (vegetation) points above 1.37 m (DBH) (Malone et al., 2009) and below 1.00 m from the average ground level were temporarily reclassified into a separate class, leaving several vertical 0.37 m long stem cylinders of vegetation points in the vegetation class that formed the trunks, as was done by Olofsson and Holmgren (2017). These cylinders were used to measure the diameters of the tree trunks by fitting best-fit circular or elliptical vectors (Figure 7) around the vegetation points, and extracting the perpendicular measurements across the circles to obtain the average DBH. The cylinders were also used to mark the individual tree positions so their respective TH measurements could be extracted. Point features were created at the centre of each cylinder to mark its position. This procedure was repeated on all UAVSfM photogrammetric point cloud data for HomoFS and HeteroFS1, as well as on the TLSLiDAR and UAVLiDAR data acquired for HeteroFS1. A total of 32 trees were assessed in the HomoFS study area, while a total of 20 trees were assessed in the HeteroFS1 study area.
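The slice-and-fit idea above can be sketched as follows. The example keeps the points of one tree between 1.00 m and 1.37 m above ground and fits a circle with an algebraic least-squares (Kåsa-style) fit. This is a stand-in technique chosen for brevity, not necessarily the fitting routine used in the study, and the function names and synthetic trunk points are assumptions.

```python
import math


def stem_slice(points, ground_z, lower=1.00, upper=1.37):
    """Keep (x, y) of points whose height above ground lies in [lower, upper]."""
    return [(x, y) for x, y, z in points if lower <= z - ground_z <= upper]


def fit_circle(xy):
    """Algebraic least-squares circle fit; returns (centre_x, centre_y, radius)."""
    n = len(xy)
    if n < 3:
        raise ValueError("at least three points are needed")
    mx = sum(x for x, _ in xy) / n
    my = sum(y for _, y in xy) / n
    u = [x - mx for x, _ in xy]  # coordinates centred on the mean
    v = [y - my for _, y in xy]
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    rhs_u = 0.5 * sum(a * (a * a + b * b) for a, b in zip(u, v))
    rhs_v = 0.5 * sum(b * (a * a + b * b) for a, b in zip(u, v))
    det = suu * svv - suv * suv
    if abs(det) < 1e-18:
        raise ValueError("points are collinear; no circle can be fitted")
    uc = (rhs_u * svv - rhs_v * suv) / det
    vc = (suu * rhs_v - suv * rhs_u) / det
    radius = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mx + uc, my + vc, radius


# Synthetic trunk: radius 0.15 m centred at (2, 3), ground at z = 10 m.
trunk = [(2 + 0.15 * math.cos(math.radians(k * 30)),
          3 + 0.15 * math.sin(math.radians(k * 30)), 11.2) for k in range(12)]
cloud = trunk + [(2.0, 3.1, 10.5), (2.1, 3.0, 12.0)]  # points outside the slice
slice_xy = stem_slice(cloud, ground_z=10.0)
cx, cy, r = fit_circle(slice_xy)
dbh = 2.0 * r
```

On real photogrammetric data only part of the trunk circumference is usually reconstructed, so the fit is driven by an arc rather than a full ring and becomes more sensitive to noise, which is consistent with the DBH difficulties reported later.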

TH Extraction
For each area, the TH was extracted from the modelled CHMs. When assessing TH in HomoFS, both a 10 cm resolution CHM derived from the UAVSfM point cloud and a CHM derived from ALSLiDAR data were used.
When assessing TH in HeteroFS1, three 10 cm CHMs were used: a CHM derived from the UAVSfM point cloud, a CHM derived from TLSLiDAR data, and a CHM derived from ALSLiDAR data. A total of 30 trees were assessed in the HomoFS dataset, while 20 trees were assessed in the HeteroFS1 dataset. The average individual TH was extracted by measuring perpendicular distances across each relative tree position on each CHM.

Evaluating the performance of UAVSfM derived tree variables
Various statistical tools were applied to evaluate the utility of the UAVSfM-derived point cloud against LiDAR data in assessing DBH and TH. Pearson's correlation (Pearson's r), outlined in Equation 3 (Maina et al., 2017; Jayathunga et al., 2018), was used extensively on the results to measure how well the various UAVSfM datasets relate to their LiDAR dataset counterparts, i.e. the strength of the relationship between both variables:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \qquad (3)$$
where $n$ is the total population number; $x$ represents the UAVSfM DBH or TH values; and $y$ represents the various LiDAR dataset DBH or TH values. The correlation coefficient measures the strength of the linear relationship between the two sets of data being compared. The accuracy of Pearson's r obtained for the DBH and TH data was validated using the leave-one-out cross validation (CV) technique, as suggested by Jayathunga et al. (2018). The root mean square error (RMSE) of the data was also determined using Equation 4:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (4)$$
where $n$ is the number of samples; $y_i$ is the observed LiDAR DBH or TH value; and $\hat{y}_i$ is the UAVSfM DBH or TH value. This was done to evaluate the average separation of each sample measurement from the best-fit line.
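The statistics described above can be sketched in plain Python as follows. Pearson's r uses the sum-based form and the RMSE follows the usual root-mean-square definition. The leave-one-out routine is only one plausible reading of the validation step, since the text does not spell out the procedure: it refits a simple linear regression with each sample held out and scores the held-out predictions. All function names and the sample heights are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient in its sum-based computational form."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    den = math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    if den == 0:
        raise ValueError("one of the variables has zero variance")
    return (n * sxy - sx * sy) / den


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def loocv_rmse(x, y):
    """RMSE of predictions for each sample from a line fitted without it."""
    predictions = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        predictions.append(slope * x[i] + intercept)
    return rmse(y, predictions)


# Made-up tree heights in metres (not study data).
uav_th = [12.1, 15.4, 9.8, 20.3, 17.6]
lidar_th = [11.5, 15.9, 9.1, 19.2, 18.0]
r = pearson_r(uav_th, lidar_th)
r_squared = r * r  # equals R² for a simple linear regression
error = rmse(lidar_th, uav_th)
cv_error = loocv_rmse(uav_th, lidar_th)
```

A cross-validated error close to the ordinary RMSE would suggest that no single tree dominates the fitted relationship.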

UAVSfM against TLSLiDAR, UAVLiDAR and Field Measured DBH
UAVSfM DBH measurements were compared to field measured DBH values at HomoFS using a total of 32 sample trees in the area, while at HeteroFS1 the same comparison was done but with TLSLiDAR and UAVLiDAR DBH data using 20 sample trees in both instances.

UAVSfM against TLSLiDAR and ALSLiDAR TH
UAVSfM TH measurements were compared to both TLSLiDAR and ALSLiDAR TH measurements.
When assessing HomoFS, 30 sample trees were used to evaluate the coefficient of determination between UAVSfM and ALSLiDAR data, while 20 sample trees were used in assessing the utility of UAVSfM in estimating TH against TLSLiDAR and ALSLiDAR in HeteroFS1.

Discussion
For the DBH comparison at HomoFS, a moderate coefficient of determination of R² = 0.5955 (59.55%) was obtained, signifying that UAVSfM performs moderately well at estimating field DBH. The large spread in the differences between the UAVSfM and field measurements, from a minimum of 0.008 m to a maximum of 0.438 m, suggests challenges in reconstructing full and accurate tree trunks in various instances using photogrammetry, a consequence of inadequate scene coverage caused by insufficient image cover from multiple perspectives. For the HeteroFS1 comparison, a moderate agreement of R² = 0.5237 (52.37%) was also obtained when comparing UAVSfM DBH to TLSLiDAR DBH, while a poor agreement of R² = 0.1114 (11.14%) was obtained when comparing UAVSfM DBH to UAVLiDAR DBH (Figure 8). This was because the UAVLiDAR data failed to properly represent the full extent of some tree trunks, making extraction of the actual diameter challenging.
When UAVSfM TH was compared to TLSLiDAR TH at HeteroFS1, a coefficient of determination of R² = 0.8614 (86.14%) was achieved, signifying a strong correlation between the two sets of data. A Pearson's r value of r = 0.9280 (92.80%) was also achieved, indicating a strong association between the two (Figure 9). However, an RMSE of 2.131 m was obtained for this comparison, which shows the average separation from the line of best fit between these two variables, caused by the inability of TLSLiDAR to properly acquire the top-canopy of the forest. On average, the UAVSfM TH values were higher than the TLSLiDAR values. When UAVSfM was compared to ALSLiDAR, a coefficient of determination of R² = 0.8850 (88.50%) was achieved, with a Pearson's r value of r = 0.9407 (94.07%) and an RMSE of 1.683 m. These all indicate good correlation between the two data variables. It can be noted that UAVSfM agrees slightly better with ALSLiDAR than with TLSLiDAR when considering R². This could be attributed to the fact that both datasets are acquired from airborne vehicles and as such are able to capture the full top-canopy. The lower RMSE value also signifies a smaller separation from the line of best fit between the two datasets.
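As a quick consistency check on the TH figures quoted above, the reported Pearson's r values can be squared and compared with the reported coefficients of determination, since R² equals r² for a simple linear fit between two variables. The dictionary layout and variable names below are illustrative; the numbers are the ones reported in this section.

```python
# (Pearson's r, reported R²) for the HeteroFS1 tree height comparisons.
reported = {
    "TLSLiDAR": (0.9280, 0.8614),
    "ALSLiDAR": (0.9407, 0.8850),
}

# Square each r and measure how far it lies from the reported R².
squared = {name: r * r for name, (r, _) in reported.items()}
gaps = {name: abs(squared[name] - r2) for name, (_, r2) in reported.items()}
```

Both gaps are below 0.0005, which is the size of difference one would expect from r having been rounded to four decimal places.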

Conclusion
The study intended to assess the efficacy of using multi-rotor unmanned aerial vehicles in assessing the allometric variables in homogeneous and heterogeneous forest structures that are necessary for rudimentary biomass estimation. The unmanned aerial vehicle structure-from-motion (UAVSfM) techniques applied provided fair reconstruction and characterisation of both homogeneous and heterogeneous forest structures, with results being comparable to high-cost LiDAR data obtained from expensive platforms. Overall, UAVSfM provided moderately similar results to LiDAR data when assessing diameter at breast height (DBH), but highly comparable results when assessing tree height (TH). Although UAVSfM performed well in TH estimation in this study, the CHMs produced can be influenced by the complexity of the top-canopy and by omission of data during the flight capture. As such, additional capture angles, altitudes, flight patterns, high image overlap, multiple-perspective imagery, and capture techniques are necessary to acquire sufficient data to create the stereopairs needed to reconstruct the captured scene extensively. This means that photogrammetric data cannot deliver the same accuracies as LiDAR data when considering ground cover and below-canopy vegetation conditions without significant effort and relative error, but it does provide a cheaper alternative. This is evident in previous studies. Further research can be done on the inclusion of tessellated façade imagery in acquiring images for forestry inventory management.