Urban land use land cover mapping in tropical savannah using Landsat-8 derived normalized difference vegetation index (NDVI) threshold

Land use/land cover maps have been generated at different spatial scales from satellite remote sensing data since the early 1970s. Since then, research has focused on developing classification procedures and improving the quality of the resulting maps. In recent times, demand for detailed, high-accuracy land-use and land-cover (LULC) data has increased owing to the growing complexity of earth processes, while the processing steps have themselves become more complex. This paper explores Landsat 8 derived normalized difference vegetation index (NDVI) thresholds as a means of simplifying the land cover classification process. NDVI images of January, May and December 2018, representing the dry, wet and harmattan seasons, were generated. Thereafter, NDVI values at the locations of a set of training data representing the target urban land covers (water, built-up area, soil, grassland and shrub) were extracted. Using the statistics of the extracted values, NDVI thresholds for the respective land cover types were determined for the classification process. Finally, the classification accuracy was evaluated using the unbiased matrix coefficient technique, which produced overall accuracies of 71.3%, 46.4% and 75.6% at the 95% confidence limit for January, May and December of the year under review, respectively. The results show that NDVI thresholding is a simple and practical alternative for obtaining an LULC map in reasonable time with little data.


Introduction
Since the normalised difference vegetation index (NDVI) was first introduced by Rouse et al. (1976), it has been widely utilised as a tool for vegetation studies at different spatial scales (Anyamba & Eastman 1996; Defries & Townshend 1994; Fensholt et al. 2006). From basic knowledge of the behaviour of plants across the electromagnetic spectrum, it has been established that healthy vegetation absorbs most of the incident visible energy and reflects a large portion of the near-infrared light, while unhealthy or sparse vegetation reflects more visible energy and less near-infrared light (La et al. 1987; Miomir et al. 2018). Thus, the red and near-infrared bands of satellite sensors such as the Advanced Very High-Resolution Radiometer (AVHRR) (Anyamba & Eastman 1996; Miomir et al. 2018), the Moderate Resolution Imaging Spectroradiometer (MODIS) (Aredehey et al. 2018; Kong et al. 2016), Satellite Pour l'Observation de la Terre (SPOT) (de Bie et al. 2011), and Landsat (Aburas et al. 2015; Gandhi et al. 2015) are commonly combined to obtain NDVI.
In the last decade, the need to simplify the process of obtaining land cover data has inspired investigation into the relationship between NDVI and surface cover in the urban environment (Zaitunah et al. 2018). With the growing availability of satellite data of high spatial and temporal resolution, studies on the use of NDVI for land-cover classification have intensified. For example, de Bie et al. (2011) used SPOT vegetation 10-day composite NDVI images of 1998-2002 to produce 11 classes depicting different cover types in part of Nizamabad district, Andhra Pradesh, India, using unsupervised classification with the ISODATA clustering algorithm.
Similarly, Ehsan and Kazem (2013) utilized NDVI derived from Landsat ETM+ for 1990 and 2006 to detect and monitor LULC change in Ardakan, Iran, by applying NDVI image differencing. The resultant NDVI-change image was thresholded and subsequently density sliced into four classes (low, medium, high, and very high) to find changes.
Specifically, the use of NDVI for vegetation phenology has been widely reported in the literature.
One such work, published by Al-doski et al. (2013), detected vegetation change in Halabja City, Iraq, between 1986 and 1990 using NDVI produced from Landsat-5 Thematic Mapper (TM). The authors applied NDVI threshold values of -1 to 0, 0 to 0.2, 0.2 to 0.4, 0.4 to 0.6 and above 0.6 to classify the study area into water bodies, no vegetation, sparse vegetation, moderate vegetation and dense vegetation, respectively. A related study by Aburas et al. (2015) evaluated land-use changes in Seremban District, Peninsular Malaysia, between 1990 and 2010 using NDVI images obtained from Landsat TM. Based on the general knowledge that negative NDVI represents water bodies, values close to zero indicate built-up areas and positive values reveal different vegetation types (Anyamba & Eastman 1996), the District was classified into five classes: non-vegetation, sparse, moderate, high, and dense vegetation.
An investigation by Jeevalakshmi et al. (2016) also reveals the potential of NDVI for land-cover classification. The work, carried out in Chittoor District, Andhra Pradesh, India, using Landsat-8 time series NDVI, was intended to identify a range of values for five different land cover types, namely water body (-0.0175 to -0.328), built-up (-0.019 to 0.060), bare soil (-0.001 to 0.166), sparse vegetation (0.244 to 0.44) and dense vegetation (0.5 and above). In a more recent study, Hashim et al. (2019) classified urban vegetation around the National Monument Park in Kuala Lumpur, Malaysia, using NDVI from very high resolution (0.5 m) Pleiades imagery. The result shows that NDVI values of -1 to 0.199 represent non-vegetated areas, while low and high vegetated areas fall in the ranges of 0.2 to 0.5 and 0.5 to 1.0, respectively. The use of NDVI threshold values for land cover classification requires less time and data because it avoids the complexity of mainstream remote sensing image analysis (Aredehey et al. 2018; Hashim et al. 2019). However, determining the precise range of NDVI values that distinguishes agricultural land, semi-natural areas, artificial surfaces and urban fabric is still a challenge. Most studies that utilized NDVI for urban land cover classification used generalized thresholds. This approach results in broad threshold ranges that are usually not suitable for urban applications at a finer scale (Miomir et al. 2018). The objective of this study is to propose applied NDVI threshold values that simplify and improve the LULC classification process in tropical savannah using Landsat-8 satellite imagery.

Study Area and Data Used
The researchers selected Ilorin City and its suburbs as the study area. Ilorin is located in the south-eastern part of Kwara State in Nigeria (Fig. 1). Ilorin metropolis is an agro-pastoral ecotone in the North-Central zone of the country. Geographically, the study area lies between latitudes 8° 23' 11.4" N and 8° 36' 13.1" N and longitudes 4° 25' 56.04" E and 4° 45' 48.6" E, covering a total area of 868.04 sq. km. The elevation of the area ranges from 250 to 429 m asl. Ilorin has a tropical savannah climate with a marked rainy season from April to November and a dry season from December to March (Olanrewaju 2009). Ilorin also experiences a harmattan season, characterized by low sunshine and colder temperatures, from the end of November until early January. Annual temperatures in the region range from 18 °C to 36.9 °C, and total annual rainfall varies from 1000 mm to 2000 mm (Ajadi et al. 2011; Olanrewaju 2009). Approximately 80% of the annual rainfall occurs between May and September. In addition to the conurbation, the study area comprises a range of land cover and vegetation types, including cropland, deciduous shrubs and grassland (typical meadow steppe). The main agricultural practices in the peri-urban area are nomadic farming and crop cultivation, mainly under a mono-cropping system in which the land is left uncultivated during the fallow season. The dominant planted crops are maize, wheat, rice, beans, millet, guinea corn, yam, cassava, sweet potato and vegetables (Ajadi et al. 2011). Given its diverse topography and cover types, Ilorin metropolis is highly suitable for investigating the vegetation index thresholds most appropriate for urban land cover classification, as proposed in this study. Several parts of the study area were visited for ground truthing.

Image pre-processing and NDVI generation
The Landsat 8 Operational Land Imager (OLI) products used in this study were atmospherically corrected to surface reflectance using COST, an image-based absolute correction method. COST is a simple technique that uses the cosine of the sun zenith angle (cos (TZ)) as an input parameter to estimate the effects of atmospheric absorption and scattering in the image scene (Mahiny & Turner 2007).
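As a rough illustration of the COST idea described above, the sketch below uses the commonly cited form of the model, in which a dark-object (haze) radiance is subtracted and the downward atmospheric transmittance is approximated by cos (TZ). The paper does not state the exact formula or inputs it used, so the function name, parameter names and sample values here are assumptions for illustration only.

```python
import math

def cost_reflectance(radiance, haze_radiance, esun, sun_zenith_deg,
                     earth_sun_dist_au=1.0):
    """Approximate surface reflectance with a COST-style correction.

    radiance          at-sensor spectral radiance of the pixel
    haze_radiance     dark-object (path) radiance estimated from the scene
    esun              mean exo-atmospheric solar irradiance for the band
    sun_zenith_deg    sun zenith angle TZ in degrees
    earth_sun_dist_au Earth-Sun distance in astronomical units
    All names are hypothetical; units must be mutually consistent.
    """
    cos_tz = math.cos(math.radians(sun_zenith_deg))
    tau_z = cos_tz  # COST assumption: downward transmittance ~ cos(TZ)
    numerator = math.pi * earth_sun_dist_au ** 2 * (radiance - haze_radiance)
    return numerator / (esun * cos_tz * tau_z)

# Made-up numbers, chosen only to exercise the function.
example = cost_reflectance(radiance=60.0, haze_radiance=20.0,
                           esun=1500.0, sun_zenith_deg=30.0)
```

Because cos (TZ) enters twice, the correction grows quickly at low sun elevations, which is one reason the haze term needs to be estimated with care.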
Furthermore, the images were projected to the Universal Transverse Mercator (UTM) coordinate system, datum WGS84, zone 31, and subset to the study area. Thereafter, composite images for each season under consideration were constructed, and the Red and Near Infrared bands were extracted to calculate the vegetation index.
NDVI is generated on a per-pixel basis as the normalised difference between the red band (0.63-0.68 µm) and the near-infrared band (0.84-0.88 µm). Several studies have shown that multi-temporal remotely sensed data help to distinguish land cover types with similar spectral signatures (de Bie et al. 2011; Miomir et al. 2018; Usman et al. 2015). Generally, NDVI values range from -1 to +1 (La et al. 1987; Zhao et al. 2017). Surface features such as water, snow and cloud reflect more in the visible band than in the near-infrared band and are therefore represented by negative NDVI values.
Bare soil, rock and man-made objects, on the other hand, have NDVI values of around zero, whereas healthy green vegetation exhibits stronger near-infrared reflectance, resulting in high NDVI values close to +1. In the current study, NDVI is calculated for each of the three sets of images using the expression in Equation 1 (Kinthada 2014).
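\[ \mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \qquad (1) \]
where NIR and RED are the near-infrared (Band 5) and red (Band 4) bands of the Landsat-8 imagery, respectively.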
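The per-pixel NDVI calculation, the normalised difference of the near-infrared and red bands, can be sketched with NumPy as follows. The array values are hypothetical reflectances rather than data from the study, and returning NaN where both bands are zero is a design choice of this sketch, not something the paper specifies.

```python
import numpy as np

def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - RED) / (NIR + RED).

    Pixels where NIR + RED equals zero are returned as NaN.
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical 2x2 surface-reflectance patches (Band 5 = NIR, Band 4 = red).
band5 = np.array([[0.30, 0.05], [0.20, 0.00]])
band4 = np.array([[0.10, 0.08], [0.20, 0.00]])
ndvi_img = ndvi(band5, band4)
```

In this toy patch the vegetated pixel (top left) is strongly positive, the water-like pixel (top right) is negative, and the pixel with equal red and near-infrared reflectance sits at zero.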

NDVI value extraction and threshold
Vegetation indices are helpful in detecting land cover changes caused by human activities such as physical development and agriculture. This is achieved by examining the balance between the energy reflected and emitted by surface objects using the Red and NIR bands (Aburas et al. 2015; Yagci et al. 2014; Zhao et al. 2017). In an urban area surrounded by complex agricultural activities, multi-temporal NDVI data make it possible to distinguish urban features from agricultural land and forest land by analysing changes in vegetation vigour across seasons (Gandhi et al. 2015). In this study, five different urban land cover types (water, developed area, soil/cultivated land, grassland and shrub) were identified, and sample points were carefully selected for each date using stratified random sampling. Steps were taken to ensure that each training set was as spectrally distinct as possible from the training sets for other land cover types by using different Landsat 8 false colour combinations, the researchers' knowledge of the study area, and field data containing points representing different land use/land cover samples collected with a Garmin GPSMAP 78Sc.
Thereafter, the corresponding NDVI values of the sample points were extracted, and the values for each date were split in an 80:20 ratio into threshold and validation datasets, respectively (Jin et al. 2018). The statistics (minimum, maximum and mean) of the threshold dataset for each date were analysed and plotted. Ultimately, the optimal NDVI threshold for urban land cover classification was obtained by averaging the minimum and maximum values of the dates that produced similar maps.
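The threshold derivation described above (an 80:20 split, per-class minimum, maximum and mean, then averaging the minima and maxima across dates that behave alike) can be sketched as below. The sample NDVI values, the random seed and all function names are assumptions made for illustration; the paper does not say how its split was randomised.

```python
import numpy as np

def split_80_20(values, seed=0):
    """Shuffle sampled NDVI values and split them 80:20 (threshold, validation)."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(values, dtype=float))
    cut = int(round(0.8 * shuffled.size))
    return shuffled[:cut], shuffled[cut:]

def class_stats(values):
    """Minimum, maximum and mean NDVI of one class on one date."""
    v = np.asarray(values, dtype=float)
    return {"min": float(v.min()), "max": float(v.max()), "mean": float(v.mean())}

def averaged_range(stats_by_date):
    """Average the per-date minima and maxima into one (low, high) NDVI range."""
    low = float(np.mean([s["min"] for s in stats_by_date]))
    high = float(np.mean([s["max"] for s in stats_by_date]))
    return low, high

# Hypothetical built-up samples for the two dates that behaved alike.
jan_built_up = [0.02, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12]
dec_built_up = [0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12]

jan_train, jan_valid = split_80_20(jan_built_up)
dec_train, dec_valid = split_80_20(dec_built_up)
built_up_range = averaged_range([class_stats(jan_train), class_stats(dec_train)])
```

Repeating this for each land cover class gives one candidate range per class; how overlapping ranges are reconciled into non-overlapping breakpoints is left to the analyst.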

Land cover classification and accuracy assessment
The NDVI image of each date was categorized into the specified land cover classes of interest based on the NDVI threshold values, in the manner of a supervised classification, using the reclassify tool in ArcGIS 10.4. Finally, the classification result was validated with ground truth data made up of the combined 20% validation datasets of the statistically realistic image dates (January 23 and December 25). The accuracy of the classification was analysed using the unbiased matrix coefficient approach proposed by Olofsson et al. (2013, 2014) and Pontius & Millones (2011), considering the user's accuracy, producer's accuracy and overall accuracy of the classification results.
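The accuracy measures named above can be illustrated with an area-weighted error matrix, in which each row of sample counts is rescaled by the proportion of the map occupied by that class before accuracies are computed. This is a minimal sketch of that general estimator as commonly described in the cited literature, not a reproduction of the paper's computation; the counts, area weights and function name are hypothetical, and every map class is assumed to have at least one validation sample.

```python
import numpy as np

def area_weighted_accuracy(counts, map_area_weights):
    """Overall, user's and producer's accuracy from an area-weighted matrix.

    counts[i][j]      number of samples mapped as class i with reference class j
    map_area_weights  share of the mapped area in each map class (any scale)
    """
    n = np.asarray(counts, dtype=float)
    w = np.asarray(map_area_weights, dtype=float)
    w = w / w.sum()                          # normalise weights to sum to 1
    row_totals = n.sum(axis=1, keepdims=True)
    p = w[:, None] * n / row_totals          # estimated area proportions
    diag = np.diag(p)
    overall = float(diag.sum())
    users = diag / p.sum(axis=1)             # per map class (rows)
    producers = diag / p.sum(axis=0)         # per reference class (columns)
    return overall, users, producers

# Hypothetical two-class example.
counts = [[45, 5],
          [10, 40]]
weights = [0.3, 0.7]
overall, users, producers = area_weighted_accuracy(counts, weights)
```

With these made-up numbers the user's accuracies equal the raw row proportions (0.9 and 0.8), while the overall and producer's accuracies shift because the second class covers more of the map.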

Seasonal NDVI Image
The NDVI product provides a visual assessment of vegetation amount and distribution on each acquisition date across the seasons (Fig. 2). The image in Fig. 2   Basically, very low values in the range of negative to near zero (0.1 and below) signify non-vegetated surface cover types, low values (0.1 to 0.3) indicate a slight presence of vegetation such as grass and shrubs, while moderate to high values (above 0.5) represent healthy vegetation or forest land cover (Fensholt et al. 2006; Gandhi et al. 2015). The result obtained herein reveals that very low to low NDVI values are predominant in the study area. Water and developed areas have very low NDVI values (dark shade), which means that these land covers reflect more in the visible band than in the near-infrared. Conversely, the peri-urban area recorded low to moderate NDVI values, represented in varying shades of grey to white (Fig. 2,

NDVI Value Threshold Statistics
The NDVI values of the five urban land cover classes collected through sampling were used to determine the approximate range of each feature. Table 1 presents detailed information about the classes and their NDVI thresholds. The NDVI threshold values are able to identify the land cover types considered. The results obtained for January and December are similar. For example, in both images, the water class falls within negative values to 0.02/0.03, built-up areas within 0.02 to 0.12, soil and at times cultivated land within 0.12 to 0.13/0.14, grassland within 0.12 to 0.16, and shrub within 0.17 to 0.3. In contrast, the NDVI ranges obtained for the image acquired in May are not consistent with these. The maximum, minimum and mean NDVI values of each land cover class for the respective months are plotted in Figure 3. The plot reveals the similarities and differences in the range of NDVI values with which each land cover class is coded on each acquisition date. The curves produced by plotting the January and December NDVI values against the selected features show a similar pattern (Fig. 3a and 3c), unlike those obtained for May (Fig. 3b). This implies that, in the former case, all the features are encoded with relatively the same NDVI spectra. In contrast, the latter case produced diverging curves that show broad and overlapping ranges of NDVI values for the feature classes, indicative of the number of features misidentified and thus misclassified. From the statistical analysis, the ideal threshold is identified (Fig. 3d). In summary, the closer the minimum and maximum curves are to each other, the more difficult the feature is to identify (see feature classes 1, 3 and 4).

Classification and Accuracy Assessment
Implementing the NDVI thresholds produced the final urban land cover classes, in which extremely low values (below 0.03) are classified as water, 0.03 to 0.12 as built-up area, 0.12 to 0.14 as soil/cultivated land, 0.14 to 0.17 as grass, and 0.17 to 0.27 as shrub (Fig. 4). The land cover maps of January (dry season, Fig. 4a) and December (harmattan, Fig. 4c) are similar but differ considerably from the map obtained for May (rainy season). This reflects the impact of seasonal variation and vegetation condition. Except for the increase in the shrub class, the results for the dry and harmattan seasons are similar. In contrast, the result obtained for May is different in all respects (Fig. 4b).
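The reclassification step can be sketched with NumPy in place of the ArcGIS reclassify tool used in the study. The breakpoints are the final thresholds reported above; the numeric class codes, the treatment of each lower bound as inclusive, and the "unclassified" code for NaN pixels and values of 0.27 or more are assumptions of this sketch, since the paper does not state them.

```python
import numpy as np

# Final NDVI breakpoints reported in the text.
BREAKS = np.array([0.03, 0.12, 0.14, 0.17, 0.27])

# Hypothetical class codes; 0 is an assumed catch-all for unassigned pixels.
LABELS = {0: "unclassified", 1: "water", 2: "built-up area",
          3: "soil/cultivated land", 4: "grass", 5: "shrub"}

def reclassify(ndvi):
    """Map NDVI values to class codes 1-5 using the breakpoints above.

    Each interval includes its lower bound and excludes its upper bound.
    NaN pixels and values at or above the last breakpoint receive code 0.
    """
    ndvi = np.asarray(ndvi, dtype=float)
    codes = np.digitize(ndvi, BREAKS) + 1    # 1..6 for finite input
    return np.where((codes > 5) | np.isnan(ndvi), 0, codes)

# Hypothetical NDVI samples, one per expected class plus one out of range.
sample = np.array([-0.10, 0.05, 0.13, 0.15, 0.20, 0.40])
class_codes = reclassify(sample)
class_names = [LABELS[int(c)] for c in class_codes]
```

Applied to a full NDVI raster, the same function returns an integer array of the same shape that can be written out as a thematic map.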
In the May image, most of the study area shows plants in good condition and vegetation with high chlorophyll content (see Fig. 2b), partly due to high soil water content and vegetation regrowth. Quantitative assessment of the land cover maps was done using the unbiased error matrix, and the details are presented in Table 2. In error analysis, it is important to know which classes have the greatest error by examining individual class accuracy using the user's and producer's accuracy, which measure the correctly classified pixels in the reference data (Anees et al. 2020; Jin et al. 2018). For this study, the user's accuracy for the individual classes ranges between 55% and 100% for the first map (Fig. 4a), between 24% and 100% for the second map (except for soil) (Fig. 4b), and between 60% and 100% for the third map (Fig. 4c). Producer's accuracies of 46% to 100%, 5% to 100%, and 13% to 94% were obtained for the January, May and December maps, respectively. In addition, overall accuracies of 71.3%, 46.4% and 75.6% were obtained for January, May and December, respectively. Judging from these results, the degree of error in all the land cover classes generated from the May image is high, which accounts for its low overall accuracy. The classification results obtained for the January and December data, by comparison, are considerably more accurate.

Discussion
It has been widely reported that time series NDVI captures seasonal changes in vegetation cover (Kong et al. 2016; Usman et al. 2015; Zhao et al. 2017). By analogy, features with the same or related spectral signature in an NDVI time series belong to the same category of land cover. This assumption is more practical in urban areas, which have more permanent features. Unlike vegetated regions, where a broad range of NDVI values, typically between 0.3 and +1, is available to classify vegetation into different categories (Ehsan & Kazem 2013; Kinthada 2014; Zhao et al. 2017), the urban setup and its surroundings form a heterogeneous environment that occupies the lower end of the NDVI scale, leaving an extremely narrow width in which to identify different land-cover types.
The curves presented in this study show that the thin line of separation between different urban land-cover types is detectable (Figure 2). This work has revealed how difficult it is to distinguish soil from agricultural land, grassland and shrub using Landsat imagery of single dates only (Figure 2a, 2c, 2d), particularly within the urban peripheries of the City of Ilorin, which are characterised by a mix of open land for physical development, cropland, and grassland for grazing (Ehsan & Kazem 2013; Kinthada 2014; Zhao et al. 2017). Therefore, the dry, rainy and harmattan season images (January, May and December, respectively) were used for urban land-cover classification with the defined NDVI thresholds, because these images contain most of the seasonally driven phenological changes.
Examination of the classified maps has shown the relation between NDVI spectral variability and vegetation amount. Water, built-up area and exposed soil at the city periphery appear fairly consistent in the January 23 and December 25 images (Figure 4a and 4c), while the more vegetated regions reveal slight phenological variance, which could be attributed to stress due to reduced soil moisture (Ajadi et al. 2011; Gandhi et al. 2015). In January, during the dry season, there is a significant loss of soil moisture arising from lack of rainfall, high temperature and high intensity of sunshine. Dry leaves and bush burning expose the soil beneath low-growing grasses and plants, causing low reflectance in the near-infrared band. This accounts for the larger area of bare soil in January (18.0%) compared to December (9.4%), and likewise of built-up area (27.4% against 20.2%) and grassland (39.4% against 32.8%) for the respective dates. For the shrub land cover, however, the December image identified a larger percentage (37.2%) than the January image (14.6%). As mentioned earlier, the weather conditions (low temperature and sunshine, and cold wind) make the vegetation resistant to the diminishing soil moisture content.
Unlike the products of the two dates discussed above, the NDVI generated for May (and subsequently the classified map) shows the impact of seasonal change on land cover, particularly the vegetation cover. The NDVI values vary from very low to fairly high (-0.02 to 0.57, Figure 2b), indicating the presence of a substantial amount of chlorophyll, typical of healthy vegetation. However, the image poorly represents the water and built-up areas and completely underrepresents the soil class (Figure 4b). Critical evaluation of the result shows that only 6 pixels have negative NDVI values, which is responsible for the absence of negative NDVI values among the water sample points (Table 2 and Figure 3b). Rainfall increases soil moisture, thereby accelerating vegetation regrowth and supporting plant health, which could also be responsible for the variant NDVI spectra (Jin et al. 2018). The NDVI spectral conflict of the urban features and water as observed in the present study ( Figure 4b

Conclusion
The three-date data represent the most significant characteristics of the seasons in the study area that are essential for accurate classification of the urban land-cover types. According to the spectral characteristics of the images, water, developed (built-up) area, soil and grass/cropland can be effectively mapped in both the dry and harmattan seasons. In the rainy season image, on the other hand, highly reflective surfaces such as built-up area, water and unused land could be confused with vegetation. The concept of seasons allows urban farm land to be differentiated from perennial green plants in the off-farming season. It also provides a better visual interpretation of the vegetation distribution around the metropolis, which could be useful for urban vegetation mapping. This study has demonstrated the usefulness of NDVI indicators for urban land-cover mapping. The thresholds employed in this study will make urban land-cover classification accessible to non-remote sensing specialists who may need it as an input into climate modelling. The limitation of this study is the very small number of cloud-free images available. In the future, the causes of NDVI spectral variance and mixing during the rainy season will be investigated. The approach employed in this study will also be advanced using higher resolution satellite images such as Sentinel-2 and Pleiades imagery.

Acknowledgement
The Landsat-8 satellite imagery used in this study was downloaded from the United States Geological Survey (USGS) data archive.