8 Soil depth mapping

8.1 Introduction

8.1.1 Purpose

This document is meant to present the methodology we used for soil depth mapping. Our data are based on the sampling campaigns 2022 - 2023 and additional remote sensing observation for bare rock formations. We based our approach on Liu et al. (2022) and added small modifications.

8.1.2 Covariates

All the data listed below are available freely online excepted for the digitized maps (Geology).

(#tab:input_5)Data used in the production of the conditioned soil depth mapping.
Name/ID	Original resolution (m)	Type/Unit	Source
Aspect/TE.1	25	Radians	ESA and Airbus (2022)
DEM/TE.5	25	Meters	ESA and Airbus (2022)
General curvature/TE.7	25	-	ESA and Airbus (2022)
MrRTF/TE.8	25	-	ESA and Airbus (2022)
MrVBF/TE.9	25	-	ESA and Airbus (2022)
Plan curvature/TE.12	25	-	ESA and Airbus (2022)
Profile curvature/TE.14	25	-	ESA and Airbus (2022)
Slope/TE.16	25	Radians	ESA and Airbus (2022)
TPI/TE.21	25	-	ESA and Airbus (2022)
Landsat 5 B2/LA5.1	30	0.52 - 0.60 µm	EROS (2013)
Landsat 5 B3/LA5.2	30	0.63 - 0.69 µm	EROS (2013)
Landsat 5 B4/LA5.3	30	0.76 - 0.90 µm	EROS (2013)
Landsat 5 B7/LA5.4	30	2.08 - 2.35 µm	EROS (2013)
Landsat 5 NDVI/LA5.5	30	-	Mathias Bellat / EROS (2013)
Landsat 5 NDWI/LA5.6	30	-	EROS (2013)
LST Apr-May/LST.1	1000	Kelvin	Hulley and Hook (2018)
LST Feb-Mar/LST.2	1000	Kelvin	Hulley and Hook (2018)
LST Jun-Jul/LST.3	1000	Kelvin	Hulley and Hook (2018)
LST Oct-Nov/LST.4	1000	Kelvin	Hulley and Hook (2018)
Geology/OT.2	1 : 250 000	Lithology	Sissakian, Hagopian, and Hasan (1995), al-mousawi_geological_2007
Landuse/OT.4	10	Landuse class	Zanaga et al. (2021)
PET sum/OT.5	750	mm	Zomer and Trabucco (2022)
Prec. sum/OT.6	1000	mm	Fick and Hijmans (2017)
SRAD sum/OT.7	1000	MJ m\(^{-2}\)	Fick and Hijmans (2017)
Wind sum/OT.9	1000	m s\(^{-1}\)	Fick and Hijmans (2017)
Temp avg/OT.10	1000	°C	Fick and Hijmans (2017)

8.1.2.1 Terrain

All terrain covariates were computed the same way as in the previous chapter.

8.1.2.2 Remote sensing images and indexes

The Landsat 5 images were collected via a Google Earth Engine script on a period covering 1990 - 2010, the median of the composite image from Collection 2 Tier 1 TOA was used. The land surface temperature (LST) were computed also on Google Earth Engine with a time covering from 2012 to 2016 from the Suomi National Polar-Orbiting satellite. We took the average of each period: February - Mars, April - May, June - July and October November from the VIIRS instrument. The javascript codes for scraping these images are available in the supplementary file inside the “8 - Soil depth/code” folder. We computed the following indexes: Normalized Difference Vegetation Index (NDVI) (McFeeters 1996) and Normalized Difference Water Index (NDWI) (Rouse et al. 1974).

\[ NDVI = (NIR - Red) / (NIR + Red) \]

\[ NDWI = (Green - NIR) / (Green + NIR) \]

8.1.3 Soil depths

The soil depths of the 122 soil profiles were measured during the field campaign and can be found found online at https://doi.org/10.1594/PANGAEA.973714. Additional 25 site with no soil (or zero value to soil depth), were added after looking at Bing and Google earth satellite image. We only selected bare rock formation for these sites.

8.1.4 Preparation of the data

All raster were sampled to 25 x 25 m tiles to match the DEM. We used bilinear method excepted for the discrete maps (geology) where we used ngb resampling.

library(raster)
library(terra)
# 1.1 Import data ==============================================================

DEM <- raster("./data/DEM_GLO_25.tif")

rlist<-list.files("./data/remote", full.names=T)
rlist2<-list.files("./data/remote/", full.names=F)
outpath<-"./data/remote/resample/"
outfiles <- paste0(outpath, rlist2)

# 1.1 Resample loop ============================================================
for (i in (1:length(rlist))){
  x<-raster(rlist[i])
  x <- projectRaster(x, crs=CRS("+init=epsg:32638"))
  x<-resample(x,DEM,method = "bilinear") #Careful to implement "ngb" option for discrete values
  x<-crop(x,DEM)
  writeRaster(x,filename=outfiles[i], format="GTiff",overwrite=T)
}


# 1.2 Export as a stack_raster =================================================
rlist <- list.files("./data/remote/resample/", full.names=T)
x <- stack(rlist, DEM)
x <- rast(x)
terra::writeRaster(x, "./export/Stack_layers_depth.tif")

8.2 Soil depth mapping prediction

8.2.1 Preparation of the environment

# 0.1 Prepare environment ======================================================

# Folder check
getwd()

# Set folder direction
setwd(dir="./depth")

# Clean up workspace
rm(list = ls(all.names = TRUE))

# 0.2 Install packages =========================================================

install.packages("pacman")        #Install and load the "pacman" package (allow easier download of packages)
library(pacman)

pacman::p_load(ggplot2, terra, mapview, sf, raster, dplyr, corrplot, viridis, rsample, caret, parsnip, tmap, DescTools, patchwork, quantregForest, wesanderson, cblindplot, grid, gridExtra, ggspatial, cowplot, ncdf4)

# 0.3 Show session infos =======================================================
sessionInfo()

## R version 4.4.0 (2024-04-24 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 22621)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=French_France.utf8  LC_CTYPE=French_France.utf8   
## [3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C                  
## [5] LC_TIME=French_France.utf8    
## 
## time zone: Asia/Tbilisi
## tzcode source: internal
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ncdf4_1.23             cowplot_1.1.3          ggspatial_1.1.9       
##  [4] gridExtra_2.3          cblindplot_0.0.4       wesanderson_0.3.7     
##  [7] quantregForest_1.3-7.1 RColorBrewer_1.1-3     randomForest_4.7-1.2  
## [10] patchwork_1.3.0        DescTools_0.99.58      tmap_3.3-4            
## [13] parsnip_1.2.1          caret_6.0-94           lattice_0.22-6        
## [16] rsample_1.2.1          viridis_0.6.5          viridisLite_0.4.2     
## [19] corrplot_0.95          dplyr_1.1.4            raster_3.6-26         
## [22] sp_2.1-4               sf_1.0-16              mapview_2.11.2        
## [25] terra_1.7-78           ggplot2_3.5.1          pacman_0.5.1          
## [28] DT_0.33                readr_2.1.5           
## 
## loaded via a namespace (and not attached):
##   [1] rstudioapi_0.16.0    jsonlite_1.8.8       magrittr_2.0.3      
##   [4] farver_2.1.2         rmarkdown_2.27       vctrs_0.6.5         
##   [7] base64enc_0.1-3      forcats_1.0.0        htmltools_0.5.8.1   
##  [10] leafsync_0.1.0       haven_2.5.4          cellranger_1.1.0    
##  [13] pROC_1.18.5          sass_0.4.9           parallelly_1.38.0   
##  [16] KernSmooth_2.23-22   bslib_0.8.0          htmlwidgets_1.6.4   
##  [19] plyr_1.8.9           rootSolve_1.8.2.4    lubridate_1.9.3     
##  [22] stars_0.6-6          cachem_1.1.0         lifecycle_1.0.4     
##  [25] iterators_1.0.14     pkgconfig_2.0.3      Matrix_1.7-0        
##  [28] R6_2.5.1             fastmap_1.2.0        future_1.34.0       
##  [31] digest_0.6.36        Exact_3.3            colorspace_2.1-1    
##  [34] furrr_0.3.1          leafem_0.2.3         crosstalk_1.2.1     
##  [37] lwgeom_0.2-14        fansi_1.0.6          timechange_0.3.0    
##  [40] httr_1.4.7           abind_1.4-8          compiler_4.4.0      
##  [43] proxy_0.4-27         withr_3.0.2          DBI_1.2.3           
##  [46] MASS_7.3-60.2        lava_1.8.0           tmaptools_3.1-1     
##  [49] leaflet_2.2.2        classInt_0.4-10      gld_2.6.6           
##  [52] ModelMetrics_1.2.2.2 tools_4.4.0          units_0.8-5         
##  [55] future.apply_1.11.2  nnet_7.3-19          glue_1.7.0          
##  [58] satellite_1.0.5      nlme_3.1-164         reshape2_1.4.4      
##  [61] generics_0.1.3       recipes_1.1.0        gtable_0.3.6        
##  [64] tzdb_0.4.0           class_7.3-22         tidyr_1.3.1         
##  [67] data.table_1.15.4    lmom_3.2             hms_1.1.3           
##  [70] utf8_1.2.4           foreach_1.5.2        pillar_1.9.0        
##  [73] stringr_1.5.1        splines_4.4.0        survival_3.5-8      
##  [76] tidyselect_1.2.1     knitr_1.48           bookdown_0.40       
##  [79] stats4_4.4.0         xfun_0.46            expm_1.0-0          
##  [82] hardhat_1.4.0        timeDate_4032.109    stringi_1.8.4       
##  [85] yaml_2.3.10          boot_1.3-30          evaluate_0.24.0     
##  [88] codetools_0.2-20     tibble_3.2.1         cli_3.6.3           
##  [91] rpart_4.1.23         munsell_0.5.1        jquerylib_0.1.4     
##  [94] dichromat_2.0-0.1    Rcpp_1.0.13          readxl_1.4.3        
##  [97] globals_0.16.3       png_0.1-8            XML_3.99-0.17       
## [100] parallel_4.4.0       gower_1.0.1          listenv_0.9.1       
## [103] mvtnorm_1.3-2        ipred_0.9-15         scales_1.3.0        
## [106] prodlim_2024.06.25   e1071_1.7-14         purrr_1.0.2         
## [109] rlang_1.1.4

8.2.2 Prepare the data

# 01 Import data sets ##########################################################

# 01.1 Import soils depths =====================================================

depth <- read.csv("./data/Soil_depth.csv", sep=";")
depth$Depth <- as.numeric(depth$Depth)

# 01.2 Plot the depth information ==============================================

# Short overview of the values
head(depth)

##          X       Y    Site.Name Depth
## 1 287940.2 4095202 B07_2022_001   100
## 2 286917.0 4104485 B07_2022_002    65
## 3 280594.1 4090376 B07_2022_003    40
## 4 275039.6 4097134 B07_2022_004    80
## 5 289512.9 4086706 B07_2022_006    80
## 6 287922.6 4092158 B07_2022_007     0

# Histogram of the values
ggplot(depth, aes(x = Depth)) + 
  geom_histogram(breaks = seq(0, 100, by = 10), alpha = 0.5, fill = 'steelblue', color = 'black') +
  labs(title ="Number of smaple for each soil depth", x = "Depth (cm)", y = "Number of samples") +
  xlim(0, 100) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))

# 01.3 Transform and plot the data =============================================

depth$sqrt <- sqrt(depth$Depth)

# Histogram of the sqrt values
ggplot(depth, aes(x = sqrt)) + 
  geom_histogram(breaks = seq(0, 10, by = 0.5), alpha = 0.5, fill = 'steelblue', color = 'black') +
  labs(title ="Number of sample for each soil depth in cm", x = "Sqrt-transformed soil depth in cm", y = "Number of samples") +
  xlim(0, 11) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))

# 01.4 Plot the depth spatial information ======================================

# Create a spatial dataframe
sp_df <- st_as_sf(depth, coords = c("X", "Y"), crs = 32638)

# Plot the values of depth regarding the elevation
mapview(sp_df, zcol = "Depth", legend =TRUE, map.types ="Esri.WorldShadedRelief")

# 01.5 Import covariates raster ================================================

Landsat <- raster::stack(list.files("./data/Landsat/", full.names = TRUE))
names(Landsat)

Terrain <- raster::stack(list.files("./data/Terrain/", full.names = TRUE))
names(Terrain)

Others <- raster::stack(list.files("./data/Others/", full.names = TRUE))
names(Others)

LST <- raster::stack(list.files("./data/LST/", full.names = TRUE))
names(LST)

DSM_cov_name <- read.table("./data/Covariates_names_DSM.txt")


df_names <- data.frame()
for (i in 1:length(names(Terrain))) {
  col_name <- names(Terrain)[i]
  if (col_name %in% DSM_cov_name$V2) {
    transformed_name <- DSM_cov_name$V1[DSM_cov_name$V2 == col_name]
  } else {
    transformed_name <- paste0("TE.26")
  }
  df_names[i,1] <- transformed_name
  df_names[i,2] <- names(Terrain)[i]
}

t <- nrow(df_names)
for (i in 1:length(names(Landsat))) {
  c <- paste0("LA5.",i)
  df_names[i+t,1] <- c
  df_names[i+t,2] <- names(Landsat)[i]
}

t <- nrow(df_names)
for (i in 1:length(names(LST))) {
  c <- paste0("LST.",i)
  df_names[i+t,1] <- c
  df_names[i+t,2] <- names(LST)[i]
}

t <- nrow(df_names)
for (i in 1:length(names(Others))) {
  col_name <- names(Others)[i]
  if (col_name %in% DSM_cov_name$V2) {
    transformed_name <- DSM_cov_name$V1[DSM_cov_name$V2 == col_name]
  } else {
    transformed_name <- "OT.10"  #Only one new variable
  }
  df_names[i+t,1] <- transformed_name
  df_names[i+t,2] <- names(Others)[i]
}

write.table(df_names,"./data/Covariates_names_soil_depth.txt")
all_cov_name <- rbind(DSM_cov_name, df_names)
all_cov_name <- distinct(all_cov_name)
x <- raster::stack(Terrain, Landsat, LST, Others)
names(x) <- df_names[,1]
x <- rast(x)

terra::writeRaster(x, "./data/Stack_layers_soil_depth.tif", overwrite = TRUE)
covariates <- stack("./data/Stack_layers_soil_depth.tif")

# 01.6 Plot the covariates maps ================================================

reduce <- aggregate(covariates, fact=10, fun=mean)

plot(reduce)

\(\\[0.5cm]\)

# 01.7 Extract the values ======================================================

# Extract the values of each band for the sampling location
df_cov = raster::extract(covariates, sp_df, method='simple')

# 01.8 Save the data ===========================================================

# Create a csv out of it
write.csv(data.frame(df_cov), "./data/df_cov_soil_depth.csv")

# 02 Check the data ############################################################

# 02.1 Import the data and merge ===============================================
Covdfgis    <- read.csv("./data/df_cov_soil_depth.csv")
ID    <- 1:nrow(Covdfgis)
SoilCov <- cbind(ID, depth, Covdfgis[,-c(1)])

#Check the data
names(SoilCov)
str(SoilCov)

# Prepare data for ML modelling
SoilCovML <- SoilCov[,-c(1:7)]

##  [1] "ID"        "X"         "Y"         "Site.Name" "Depth"     "sqrt"     
##  [7] "TE.1"      "TE.5"      "TE.7"      "TE.8"      "TE.9"      "TE.12"    
## [13] "TE.14"     "TE.16"     "TE.21"     "LA5.1"     "LA5.2"     "LA5.3"    
## [19] "LA5.4"     "LA5.5"     "LA5.6"     "LST.1"     "LST.2"     "LST.3"    
## [25] "LST.4"     "OT.2"      "OT.4"      "OT.5"      "OT.6"      "OT.7"     
## [31] "OT.10"     "OT.9"

## 'data.frame':    147 obs. of  32 variables:
##  $ ID       : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ X        : num  287940 286917 280594 275040 289513 ...
##  $ Y        : num  4095202 4104485 4090376 4097134 4086706 ...
##  $ Site.Name: chr  "B07_2022_001" "B07_2022_002" "B07_2022_003" "B07_2022_004" ...
##  $ Depth    : num  100 65 40 80 80 0 25 100 75 75 ...
##  $ sqrt     : num  10 8.06 6.32 8.94 8.94 ...
##  $ TE.1     : num  5.13 4.27 3.53 4.41 2.6 ...
##  $ TE.5     : num  483 613 352 400 380 ...
##  $ TE.7     : num  -0.003644 0.000265 0 -0.003209 -0.006941 ...
##  $ TE.8     : num  0.0733 0.4736 0.7583 0.0832 0.0127 ...
##  $ TE.9     : num  1.637 0.445 2.258 0.118 0.394 ...
##  $ TE.12    : num  -2.09e-04 1.64e-05 2.24e-04 1.59e-03 3.37e-04 ...
##  $ TE.14    : num  -0.000943 0.000229 -0.000917 0.001199 -0.000281 ...
##  $ TE.16    : num  0.0928 0.06845 0.00189 0.17723 0.15261 ...
##  $ TE.21    : num  -2.078 1.333 -0.852 1.378 -3.201 ...
##  $ LA5.1    : num  0.16 0.18 0.2 0.167 0.196 ...
##  $ LA5.2    : num  0.192 0.225 0.243 0.204 0.23 ...
##  $ LA5.3    : num  0.272 0.288 0.314 0.252 0.28 ...
##  $ LA5.4    : num  0.193 0.248 0.282 0.226 0.264 ...
##  $ LA5.5    : num  0.16 0.123 0.135 0.103 0.117 ...
##  $ LA5.6    : num  -0.244 -0.233 -0.23 -0.206 -0.197 ...
##  $ LST.1    : num  311 310 312 312 314 ...
##  $ LST.2    : num  296 296 296 296 296 ...
##  $ LST.3    : num  327 327 328 329 329 ...
##  $ LST.4    : num  306 305 306 306 307 ...
##  $ OT.2     : int  13 13 5 8 6 8 13 7 8 7 ...
##  $ OT.4     : int  5 11 11 11 11 11 11 5 5 11 ...
##  $ OT.5     : num  2113 2069 2122 2127 2144 ...
##  $ OT.6     : num  734 738 664 647 739 ...
##  $ OT.7     : num  211009 210498 211285 210657 211260 ...
##  $ OT.10    : num  19.5 19 19.9 19.9 20 ...
##  $ OT.9     : num  25.3 24.6 25.5 25.6 25.6 ...

# 02.3 Plot and export the correlation matrix ==================================
# Correlation of the data
corrplot(cor(SoilCovML),  method = "color", col = viridis(200), 
         type = "upper", 
         addCoef.col = "black", # Add coefficient of correlation
         tl.col = "black", tl.srt = 45, # Text label color and rotation
         number.cex = 0.7, # Size of the text labels
         cl.cex = 0.7) # Size of the color legend text

         #cl.lim = c(-1, 1)) # Color legend limits

Regarding the correlation plot, the Landsat 5 bands and land surface temperature layers have correlation index over 0.75, as some transformed layers based on DEM (curvature, slope ,aspect …) and the solar radiation with wind.

# 02.4 Export and save data ====================================================
save(covariates, SoilCov, file = "./export/Save/Pre_process_soil_depth.RData")
rm(list = ls())

8.2.3 Tune the model

# 03 Tune the model ############################################################
load(file = "./export/Save/Pre_process_soil_depth.RData")

# 03.1 Create the train and test set ===========================================

#R Remove the site name and x, y columns
df_soilCov <- SoilCov[,c(1,5:length(SoilCov))]

# Set a random seed
set.seed(1070)

# preprocess the layers
preproc <- preProcess(df_soilCov[,4:length(df_soilCov)], method=c("range"))
df_soilCovTrans <- predict(preproc, df_soilCov[,4:length(df_soilCov)])
df_soilCovTrans <- cbind(df_soilCovTrans, df_soilCov[,1:3])

# Create a 80% split
df_soil <- initial_split(df_soilCovTrans, prop = 0.8, strata = ID)
trainData <- training(df_soil)
testData <- testing(df_soil)

# Implement the index
row.names(testData) <- testData$ID
row.names(trainData) <- trainData$ID

# Create the splits
X_train <- trainData[,-c(27:29)]
y_train <- trainData[,c(28)]
y_train_sqrt <- trainData[,c(29)]
X_test <- testData[,-c(27:29)]
y_test <- testData[,c(28)]
y_test_sqrt <- testData[,c(29)]

# 03.2 Create the model train controls with cross-validation  ==================
TrainControl <- trainControl(method="repeatedcv", 10, 3, allowParallel = TRUE, savePredictions=TRUE, verboseIter = TRUE)

# 03.3 Set the grid for mtry and nodesize ======================================

tuneGrid <- expand.grid(mtry = c(1, 3, 5, 7, 10, 12, 15))
nodesize_values <- c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

all_results <- list()
set.seed(1070)

# 03.4 Run the model tunning ===================================================
# Loop over different nodesize values and different mtry
for (nodesize in nodesize_values) {
  start_time <- Sys.time()
  qrf_model <- train(x = X_train, y = y_train_sqrt,
                  method = "qrf",
                  trControl = TrainControl,
                  tuneGrid = tuneGrid,
                  ntree = 500,     # Number of trees
                  metric = "RMSE",      # Specify the metric you want to optimize
                  nodesize = nodesize     # Minimum node size
)
  end_time <- Sys.time()
  print(end_time - start_time)
  
# Store the results with the corresponding nodesize
results <- qrf_model$results
results$nodesize <- nodesize  # Add the nodesize information
all_results[[as.character(nodesize)]] <- results
}

final_results <- do.call(rbind, all_results)

# View the combined results for all mtry and nodesize combinations
print(final_results)

##      mtry     RMSE  Rsquared      MAE    RMSESD RsquaredSD     MAESD nodesize
## 3.1     1 2.729334 0.4962853 1.842530 0.8103775  0.2462337 0.5842215        3
## 3.2     3 2.833004 0.4773500 1.904092 0.7890278  0.2513403 0.5469537        3
## 3.3     5 2.841813 0.4714421 1.922363 0.7898268  0.2548695 0.5590358        3
## 3.4     7 2.882372 0.4651227 1.941985 0.7952213  0.2559438 0.5670334        3
## 3.5    10 2.910301 0.4604997 1.974741 0.7570323  0.2461620 0.5293020        3
## 3.6    12 2.916578 0.4596933 1.975224 0.7161846  0.2408349 0.5149407        3
## 3.7    15 2.951068 0.4526866 2.001091 0.7184835  0.2388678 0.5245988        3
## 4.1     1 2.773342 0.4929364 1.867376 0.7325954  0.2481142 0.5790523        4
## 4.2     3 2.838034 0.4895359 1.915626 0.7145291  0.2440721 0.5656305        4
## 4.3     5 2.830851 0.4901692 1.905814 0.7078239  0.2359281 0.5483191        4
## 4.4     7 2.878877 0.4750751 1.973288 0.7049437  0.2440490 0.5876352        4
## 4.5    10 2.908365 0.4670291 1.969554 0.6328296  0.2241551 0.5133742        4
## 4.6    12 2.947753 0.4580659 1.994925 0.6831769  0.2171447 0.5480810        4
## 4.7    15 2.982162 0.4583561 2.026137 0.6708416  0.2158921 0.5384385        4
## 5.1     1 2.663055 0.5221769 1.800572 0.7448825  0.2207681 0.5635888        5
## 5.2     3 2.779503 0.5010455 1.867847 0.7509376  0.2170405 0.5612229        5
## 5.3     5 2.789159 0.4988129 1.876542 0.7544052  0.2195226 0.5458377        5
## 5.4     7 2.845951 0.4780145 1.923041 0.7654205  0.2351576 0.5985707        5
## 5.5    10 2.916293 0.4551181 1.983060 0.7682618  0.2436306 0.5844046        5
## 5.6    12 2.901288 0.4604422 1.969209 0.7500825  0.2318361 0.5945813        5
## 5.7    15 2.928156 0.4543225 1.980911 0.7637384  0.2398471 0.5924727        5
## 6.1     1 2.737298 0.5416168 1.848400 0.7511112  0.2307820 0.5255341        6
## 6.2     3 2.769118 0.5411176 1.875529 0.7338547  0.2318195 0.5177122        6
## 6.3     5 2.832541 0.5293194 1.907256 0.6903118  0.2239275 0.4742872        6
## 6.4     7 2.857926 0.5173423 1.931808 0.7112448  0.2242474 0.4928804        6
## 6.5    10 2.926328 0.4900923 1.970553 0.7080537  0.2288874 0.4873877        6
## 6.6    12 2.942479 0.4853409 1.977365 0.6608536  0.2189466 0.4556416        6
## 6.7    15 2.957402 0.4782926 1.985686 0.7045822  0.2218300 0.4805035        6
## 7.1     1 2.730479 0.5095423 1.827464 0.7556501  0.2258124 0.5525021        7
## 7.2     3 2.794204 0.4925000 1.903247 0.8293877  0.2384477 0.5883647        7
## 7.3     5 2.851583 0.4780661 1.933760 0.7945918  0.2344559 0.5876801        7
## 7.4     7 2.893746 0.4675265 1.944191 0.7999453  0.2305590 0.5874803        7
## 7.5    10 2.871070 0.4677090 1.948722 0.7555909  0.2257198 0.5477448        7
## 7.6    12 2.991799 0.4378629 2.030453 0.7139264  0.2149300 0.5409700        7
## 7.7    15 3.007470 0.4320991 2.046692 0.8110356  0.2321078 0.5965111        7
## 8.1     1 2.685270 0.5249273 1.793010 0.6919063  0.2178054 0.4843446        8
## 8.2     3 2.763586 0.5095831 1.848866 0.7294080  0.2303865 0.5100895        8
## 8.3     5 2.779153 0.5088139 1.871082 0.7549698  0.2398836 0.5340549        8
## 8.4     7 2.773158 0.5061415 1.878156 0.7643847  0.2390085 0.5535613        8
## 8.5    10 2.842823 0.4844367 1.922981 0.7414883  0.2325255 0.5451719        8
## 8.6    12 2.779985 0.4986677 1.890600 0.7697851  0.2382519 0.5572981        8
## 8.7    15 2.832611 0.4851692 1.920453 0.7173450  0.2241389 0.5217087        8
## 9.1     1 2.708775 0.5081740 1.822192 0.6939379  0.2420910 0.5060151        9
## 9.2     3 2.843956 0.4809980 1.902569 0.6654775  0.2326433 0.4704214        9
## 9.3     5 2.865376 0.4708315 1.891287 0.6255177  0.2188883 0.4407110        9
## 9.4     7 2.882954 0.4646611 1.915415 0.6112141  0.2152987 0.4561261        9
## 9.5    10 2.895203 0.4690362 1.920161 0.6001704  0.2199620 0.4374223        9
## 9.6    12 2.876605 0.4713040 1.922756 0.5518848  0.1996927 0.4064532        9
## 9.7    15 2.910187 0.4565587 1.927773 0.5937036  0.2140814 0.4306462        9
## 10.1    1 2.669146 0.5193111 1.814073 0.7783866  0.2272123 0.5854923       10
## 10.2    3 2.693680 0.5166215 1.828976 0.7915503  0.2280681 0.5746169       10
## 10.3    5 2.735058 0.5080277 1.850283 0.7329562  0.2181400 0.5332899       10
## 10.4    7 2.799899 0.4926177 1.898245 0.7327455  0.2183946 0.5429160       10
## 10.5   10 2.822178 0.4870636 1.934005 0.7389095  0.2152389 0.5410894       10
## 10.6   12 2.840115 0.4845868 1.934662 0.7306276  0.2082656 0.5552059       10
## 10.7   15 2.828377 0.4889028 1.921956 0.7200878  0.2129413 0.5440597       10
## 11.1    1 2.735141 0.5307591 1.850768 0.7655969  0.2501171 0.5644426       11
## 11.2    3 2.738556 0.5260517 1.863128 0.7605979  0.2525759 0.5524572       11
## 11.3    5 2.752541 0.5154618 1.874229 0.7562428  0.2553953 0.5714878       11
## 11.4    7 2.820722 0.4978160 1.914186 0.7809397  0.2586396 0.5791621       11
## 11.5   10 2.869406 0.4885411 1.950299 0.7591539  0.2539700 0.5947951       11
## 11.6   12 2.876452 0.4770955 1.967754 0.7989715  0.2685377 0.6284647       11
## 11.7   15 2.906921 0.4680887 1.989789 0.7629615  0.2646727 0.6125713       11
## 12.1    1 2.702005 0.5072654 1.818173 0.8076490  0.2471731 0.6133061       12
## 12.2    3 2.754790 0.4953014 1.880478 0.8198944  0.2518294 0.6431279       12
## 12.3    5 2.790744 0.4896488 1.921956 0.8345103  0.2542266 0.6653885       12
## 12.4    7 2.857767 0.4728454 1.958128 0.8065503  0.2489765 0.6457750       12
## 12.5   10 2.855369 0.4734595 1.966860 0.8357491  0.2489488 0.6733135       12
## 12.6   12 2.906046 0.4599975 1.987815 0.7731286  0.2355587 0.6506323       12
## 12.7   15 2.904322 0.4593194 1.995720 0.8032117  0.2424297 0.6707301       12

The two parameters that we are modifying are the minimum node size/ nodesize (min_n) and the number of randomly selected covariates (mtry). We set 500 trees as default.

# 03.5 Plot the different optimization =========================================
gg1 <- ggplot(final_results, aes(x = mtry, y = Rsquared, color = as.factor(nodesize))) +
  geom_line() +
  geom_point() +
  labs(x = "mtry", y = "R²", color = "nodesize") +
  theme_minimal()

gg2 <- ggplot(final_results, aes(x = mtry, y = RMSE, color = as.factor(nodesize))) +
  geom_line() +
  geom_point() +
  labs(x = "mtry", y = expression("RMSE (" *cm^0.5* ")"), color = "nodesize") +
  theme_minimal()


# Combine R2 and RMSE in one plot
combined_plot <- wrap_plots(gg1, gg2, ncol = 2)
plot(combined_plot)

# Save and export the plot
ggsave("./export/Optimisation of the model.pdf", combined_plot, width = 20, height = 10)
ggsave("./export/Optimisation of the model.png", combined_plot, width = 20, height = 10)

8.2.4 Final model

Here we selected optimised nodesize of 6 and mtry of 1 with ntree always set at 500.

# 04.1 Produce the final model #################################################

nodesize <- 6
mtry  <- expand.grid(mtry = 1)

set.seed(1070)
qrf_final <- train(x = X_train, y = y_train_sqrt,
                   method = "qrf",
                   trControl = TrainControl,
                   tuneGrid = mtry,
                   ntree = 500,     # Number of trees
                   metric = "RMSE",      # Specify the metric you want to optimize
                   nodesize = nodesize )


results_qrf <- qrf_final$results

# View R² and RMSE values for final model
print(results_qrf[, c("mtry", "RMSE", "Rsquared")])

##   mtry     RMSE  Rsquared
## 1    1 2.619363 0.5257963

# Select the final model
final_model <- qrf_final$finalModel

# 04.2 Show covariates importance ==============================================

importance_df <- as.data.frame(varImp(final_model))

# Convert row names (variables) to a column for ggplot
importance_df$scale <- (importance_df$Overall/sum(importance_df$Overall)*100)
importance_df$Variable <- rownames(importance_df)

# Plot using ggplot2
gg1 <- ggplot(importance_df, aes(x = reorder(Variable, scale), y = scale)) +
  geom_bar(stat = "identity", fill = "lightblue") +
  coord_flip() +
  xlab("Covariates") +
  ylab("Importance scaled") +
  ggtitle("Variable Importance from QRF Model (%)")

gg1

8.2.5 Model evaluation

We used five metrics to evaluate the model:

RMSE= Root mean square error
R\(^2\)= Coefficient of determination, also called rsquared.
MAE = Mean absolute error
CCC = Concordance correlation coefficient
PICP = Prediction interval coverage probability set at 90% interval

\[ \\[0.5cm] RMSE = \sqrt{\frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{N}} \]

\[ \\[0.5cm] R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y}_i)^2} \]

\[ \\[0.5cm] MAE = \frac{1}{n} \sum_{i=1}^{n}|Y_i - \hat{Y}_i| \]

\[ \\[0.5cm] CCC = \frac{2S_{XY}}{S_{X}^2+S_{Y}^2+(\bar{X}-\bar{Y})^2} \]

\[ \\[0.5cm] PICP = \frac{1}{v} ~ count ~ j \\ j = PL^{L}_{j} \leq t_j \leq PL^{U}_{j} \]

# Save and export the plot
ggsave("./export/Variables importance.png", gg1, width = 10, height = 8)
ggsave("./export/Variables importance.pdf", gg1, width = 10, height = 8)


# 04.3 Predict the test set ====================================================
predictions <- predict(final_model,  X_test)

# 04.4 Get the metrics  ========================================================
metrics <- postResample(predictions[,2], y_test_sqrt)
ccc <- CCC(y_test_sqrt, predictions[,2])
PICP <- (y_test_sqrt >= predictions[,1]) & (y_test_sqrt <= predictions[,3])
PICP <- mean(PICP)*100

Final_stats <- as.data.frame(t(c(metrics, ccc$rho.c$est, PICP)))
colnames(Final_stats) <- c("RMSE", "R²", "MAE", "CCC", "PICP")

# Print the metrics
print(Final_stats)

##   RMSE (cm-0.5)        R²      MAE       CCC   PICP
## 1      2.422883 0.5680413 1.538656 0.7375648 96.875

# Write it
write.table(Final_stats, file="./export/Final_stats.txt", row.names = FALSE)

# 04.5 Save the model and train and test sets ==================================
save(trainData, testData, final_model, qrf_model, all_results, qrf_final, file = "./export/Save/Model.RData")
rm(list = ls())

The final model explains around 56% of the variability (R\(^2\)) of the soil depth with a precision (RMSE) of soil depth around 2.42 cm\(^{-0.5}\).

8.2.6 Mapping the soil depth of the area

Due to the highly detailed data we had to split the prediction into different blocks (16). This process allows the computer to run separately each prediction.

# 05 Predict the map ###########################################################
# 05.1 Import the previous data ================================================
load(file = "./export/save/Model_soil_depth.RData")
stack_raster <- stack("./data/Stack_layers_soil_depth.tif")
cov <- read.csv("./data/df_cov_soil_depth.csv")
cov <- cov[,-1]
cov[] <- lapply(cov , as.numeric)
  
# 05.2 Normalise the values from the rasters ===================================
process_layer <- function(layer) {
  # Convert in numeric 
  layer <- as.numeric(layer)
  # Replace the NAs by median
  median_value <- median(layer, na.rm = TRUE)
  layer[is.na(layer)] <- median_value
  return(layer)
}

scaling_params <- lapply(1:ncol(cov), function(i) {
  list(min = min(cov[[i]], na.rm = TRUE), max = max(cov[[i]], na.rm = TRUE))
})

scale_layer <- function(layer, min_val, max_val) {
  (layer - min_val) / (max_val - min_val)
}

stack_scaled <- stack() 

# Apply transformation
for (i in 1:nlayers(raster_stack)) {
  min_val <- scaling_params[[i]]$min
  max_val <- scaling_params[[i]]$max
  
  # Normalised the layer
  layer_processed <- calc(raster_stack[[i]], process_layer)
  layer_scaled <- calc(layer_processed, function(x) scale_layer(x, min_val, max_val))
  stack_scaled <- addLayer(stack_scaled, layer_scaled)
  cat(round((i/nlayers(raster_stack))*100, 1),"% \n")
}

stack_scaled <- rast(stack_scaled)
sum(is.na(values(stack_scaled)))
names(stack_scaled) <- names(raster_stack)
terra::writeRaster(stack_scaled, "./export/Soil_depth/Stack_raster_normalised_soil_depth.tif", overwrite = TRUE)

# 05.3 Prediction map for soil depth ===========================================
rm(list = ls(all.names = TRUE))
load("./export/save/Model_soil_depth.RData")
raster_stack_normalised <- stack("./export/Soil_depth/Stack_raster_normalised_soil_depth.tif")

# Divide in block for allowing the computer to run it
block_info <- blockSize(raster_stack_normalised )

# Create an empty raster for storing predicted values
predicted_raster <- raster_stack_normalised [[1]]  # Use the first layer as a template
predicted_raster <- writeStart(predicted_raster, "./export/Prediction_depth_sqrt.tif", overwrite = TRUE)

# Loop through each block of the raster stack

for (i in 1:block_info$n) {
  # Read block of raster data
  start_time <- proc.time()
  block <- getValuesBlock(raster_stack_normalised , row = block_info$row[i], nrows = block_info$nrows[i])
  block <- as.data.frame(block)
  
  # Process the block 
  predicted_block <- predict(final_model, block,  what = c(0.05, 0.5, 0.95))  
  
  # Write the predicted block to the output raster
  predicted_raster <- writeValues(predicted_raster, predicted_block[,2], block_info$row[i])
  end_time <- proc.time()
  print(end_time - start_time)
  print(i)
}

# Close the creation of the raster
predicted_raster <- writeStop(predicted_raster)

# Rescale the raster and export it
predicted_raster_sqrt <- (predicted_raster)^2
writeRaster(predicted_raster_sqrt,"./export/Prediction_depth.tif", format = "GTiff",overwrite=T)

8.2.7 Prediction of uncertainty map

The uncertainty of the model is based on the following equation from Yan et al. (2018).

\[ \\[0.5cm] Uncertainty = \frac{Qp ~ 0.95- Qp ~ 0.05}{Qp ~ 0.5} \]

# 05.4 Uncertainty map of soil depth ===========================================

# Create an empty raster for storing predicted values
uncertainty_raster <- raster_stack_normalised [[1]]  # Use the first layer as a template
uncertainty_raster <- writeStart(uncertainty_raster, "./export/Uncertainty_depth_sqrt.tif", overwrite = TRUE)

# Loop through each block of the raster stack

for (i in 1:block_info$n) {
  # Read block of raster data
  start_time <- proc.time()
  block <- getValuesBlock(raster_stack_normalised, row = block_info$row[i], nrows = block_info$nrows[i])
  block <- as.data.frame(block)
  
  # Process the block 
  predicted_block <- predict(final_model, block,  what = c(0.05, 0.5, 0.95))  
  
  # Write the predicted block to the output raster
  values <- (predicted_block[,3] - predicted_block[,1]) /predicted_block[,2]
  uncertainty_raster <- writeValues(uncertainty_raster, values, block_info$row[i])
  end_time <- proc.time()
  print(end_time - start_time)
  print(i)
}

# Close the creation of the raster 
uncertainty_raster <- writeStop(uncertainty_raster)
uncertainty_raster_sqrt <- (uncertainty_raster)^2
writeRaster(uncertainty_raster_sqrt ,"./export/Uncertainty_depth.tif", format = "GTiff",overwrite=T)

8.3 Visualisations and exports

8.3.1 Basic visualisations

# 06 Visualisation of the prediction ###########################################
# 06.1 Plot the depth maps =====================================================
rm(list = ls())
load(file = "./export/save/Pre_process_soil_depth.RData")
predicted_raster <- raster("./export/Prediction_depth.tif")
uncertainty_raster <- raster("./export/Uncertainty_depth.tif")
survey <- st_read("./data/Survey_Area.gpkg", layer = "Survey_Area")
sp_df <- st_as_sf(SoilCov, coords = c("X", "Y"), crs = 32638)

predicted_raster_resize <- aggregate(predicted_raster, fact=5, fun=mean)
uncertainty_resize <- aggregate(uncertainty_raster, fact=5, fun=mean)

# Plot continuous values of the map
mapview(predicted_raster_resize, at = seq(0,100,20), legend = TRUE, layer.name = "Soil depth prediction (cm)",col.regions = wes_palettes$Zissou1Continuous) +
  mapview(sp_df, zcol = "Depth", at = seq(0,100,20),  legend = TRUE, layer.name = "Soil samples depth (cm)", col.regions = wes_palettes$Zissou1Continuous)

\(\\[0.5cm]\)

mapview(uncertainty_resize, legend = TRUE, layer.name = "Soil depth prediction uncertainty", col.regions = brewer.pal(5, "YlGnBu"))

\(\\[0.5cm]\)

8.3.2 Colorblind visualisations

# 06.3 Visualise for colorblind ================================================

wes_palette_hcl <- function(palette_name, n = 7) {
  wes_colors <- wes_palette(palette_name, n = n, type = "continuous")
  hex_colors <- as.character(wes_colors)  
  return(hex_colors)
}

palette_wes <- rev(wes_palette_hcl("Zissou1", n = 7))
palette_blue <- colorRampPalette(c("lightblue", "blue", "darkblue"))(7)

# Replace the cvd palette with palette_wes for depht and palette blue for uncertainty
gg1 <- cblind.plot(uncertainty_resize, cvd = palette_blue)
gg2 <- cblind.plot(uncertainty_resize, cvd = "deuteranopia")
gg3 <- cblind.plot(uncertainty_resize, cvd = "tritanopia")
gg4 <- cblind.plot(uncertainty_resize, cvd = "protanopia")

grid <- grid.arrange(
  arrangeGrob(gg1, bottom = textGrob("Original", gp = gpar(fontsize = 14))),
  arrangeGrob(gg2, bottom = textGrob("Deuteranopia", gp = gpar(fontsize = 14))),
  arrangeGrob(gg3, bottom = textGrob("Tritanopia", gp = gpar(fontsize = 14))),
  arrangeGrob(gg4, bottom = textGrob("Protanopia", gp = gpar(fontsize = 14))),
  nrow = 2, ncol = 2,
  top = textGrob("Soil depth uncertainty for different visions", gp = gpar(fontsize = 16, fontface = "bold")))

ggsave("./export/Visualisation_soil_depth_uncertainty.png", grid, width = 20, height = 10)
ggsave("./export/Visualisation_soil_depth_uncertainty.pdf", grid, width = 20, height = 10)

dev.off()

(#fig:Colorblind plot hide)Soil depth prediction map adapted for colorblindness

\(\\[0.5cm]\)

(#fig:Colorblind plot hide second)Soil depth prediction uncertainty map adapted for colorblind

8.3.3 Export final maps

# 06.4 Export final maps =======================================================

# Replace missing values by zero as it occurs only in mountainous area
uncertainty_raster[is.na(uncertainty_raster)] <- 0
predicted_raster[is.na(predicted_raster)] <- 0


# For GeoTiff format
soil_depth_map <- stack(predicted_raster, uncertainty_raster)
soil_depth_crop <- crop(soil_depth_map, survey)
soil_depth_mask <- mask(soil_depth_crop, survey, inverse=FALSE, updatevalue=NA, updateNA=TRUE)
crs(soil_depth_mask) <- "EPSG:32638"
x <- rast(soil_depth_mask)
x_repro <- project(x, "EPSG:4326")
names(x_repro) <- c("Prediction", "Uncertainty")
terra::writeRaster(x_repro,"./export/Soil_depth_prediction_map.tif", overwrite=TRUE)

# For netCDF format
CDF_df <- lapply(1:nlayers(x_repro), function(i) {
  rast(x_repro[[i]])
})

names(CDF_df) <- c("Prediction", "Uncertainty")

soil_to_netcdf <- function(soil_list, output_file, overwrite = FALSE) {
  # Check if file exists and handle overwrite
  if (file.exists(output_file) && !overwrite) {
    stop("File already exists and overwrite = FALSE")
  }
  
  # If file exists and overwrite is TRUE, remove the existing file
  if (file.exists(output_file) && overwrite) {
    file.remove(output_file)
  }
  
  # Get dimensions and CRS from first raster
  r <- soil_list[[1]]
  nx <- ncol(r)
  ny <- nrow(r)
  crs_string <- crs(r)
  
  # Create longitude and latitude vectors
  ext <- ext(r)
  lon <- seq(from = ext[1], to = ext[2], length.out = nx)
  lat <- seq(from = ext[3], to = ext[4], length.out = ny)  # Changed back to ascending order
  
  # Define dimensions
  londim <- ncdim_def("longitude", "degrees_east", lon)
  latdim <- ncdim_def("latitude", "degrees_north", lat)
  
  # Define units for each soil property
  units_list <- list(
    Prediction = "cm",
    Uncertainty  = "cm"
  )
   
  # Create list of variables with appropriate units
  var_list <- list()
  for (var_name in names(soil_list)) {
    var_list[[var_name]] <- ncvar_def(
      name = var_name,
      units = units_list[[var_name]],
      dim = list(londim, latdim),
      missval = NA
    )
  }
  
  # Create netCDF file
  ncout <- nc_create(output_file, var_list)
  
  # Write data
  for (var_name in names(soil_list)) {
    # Convert SpatRaster to matrix and handle orientation
    values_matrix <- t(as.matrix(soil_list[[var_name]], wide=TRUE))
    values_matrix <- values_matrix[,ncol(values_matrix):1]
    ncvar_put(ncout, var_list[[var_name]], vals = values_matrix)
  }
  
  # Add global attributes
  ncatt_put(ncout, 0, "title", "Soil depth until R horizon ")
  ncatt_put(ncout,0,"institution","Tuebingen University, CRC1070 ResourceCultures")
  ncatt_put(ncout, 0, "description", "Soil depth until R horizon for the top 100 cm in the Northern Kurdsitan region of Irak")
  ncatt_put(ncout,0,"author", "Mathias Bellat PhD. candidate at Tuebingen University (mathias.bellat@uni-tuebingen.de)")
  ncatt_put(ncout, 0, "creation_date", format(Sys.time(), "%Y-%m-%d %H:%M:%S"))
  
  # Add CRS information
  if (!is.na(crs_string)) {
    ncatt_put(ncout, 0, "crs", crs_string)
    ncatt_put(ncout, 0, "spatial_ref", crs_string)
    
    # Add standard CF grid mapping attributes
    ncatt_put(ncout, 0, "Conventions", "CF-1.6")
    ncatt_put(ncout, "longitude", "standard_name", "longitude")
    ncatt_put(ncout, "longitude", "axis", "X")
    ncatt_put(ncout, "latitude", "standard_name", "latitude")
    ncatt_put(ncout, "latitude", "axis", "Y")
  }
  
  # Add variable descriptions
  var_descriptions <- list(
    Prediction = "Soil depth until R horizon up to 100 cm",
    Uncertainty = "Uncertainty of the Quantile Random Forest of soil depth prediction based on 0.05 - 0.95 interval"
  )
  
  # Add variable-specific attributes
  for (var_name in names(soil_list)) {
    ncatt_put(ncout, var_list[[var_name]], "long_name", var_descriptions[[var_name]])
  }
  
  # Close the file
  nc_close(ncout)
}

soil_to_netcdf(CDF_df, "./export/Soil_depth_prediction_map.nc", overwrite = TRUE)

# 06.5 Final visualisations ====================================================
#Change parameters for uncertainy and predictions
raster_resize <- aggregate(soil_depth_map, fact=5, fun=mean)
rasterdf <- raster::as.data.frame(raster_resize, xy = TRUE)
rasterdf <- rasterdf[complete.cases(rasterdf),]
  
bounds <- st_bbox(survey)
xlim_new <- c(bounds["xmin"] - 3000, bounds["xmax"] + 3000)
ylim_new <- c(bounds["ymin"] - 3000, bounds["ymax"] + 3000)
  
palette_wes <- rev(wes_palette("Zissou1", type = "continuous"))
palette_blue <- colorRampPalette(c("lightblue", "blue", "darkblue"))(7)

gg1 <- ggplot() +
 geom_raster(data = rasterdf,
                  aes(x = x, y = y,fill = Uncertainty )) +
      ggtitle("Soil depth prediction uncertainty map") +
      scale_fill_gradientn(colors = palette_blue,
                           name = "Uncertainty") +
      annotation_scale(
        location = "br",       
        width_hint = 0.3,     
        height = unit(0.3, "cm"),
        line_col = "black",      
        text_col = "black",    
        bar_cols = c("white", "red") 
      ) +
      annotate("text", label = paste("Projection: WGS84 UTM 38N"), 
               x = Inf, y = -Inf, hjust = 1.5, vjust = -3, size = 3, color = "black") +
      annotation_north_arrow(location = "tr", which_north = "true", height = unit(1, "cm"), width = unit(0.75, "cm")) +
      theme_bw() +
      theme(
        panel.grid = element_blank(),
        axis.text.y = element_text(angle = 90, vjust = 0.5, hjust = 0.5),
        axis.title = element_blank(),       
        legend.position = "right")+
      coord_equal(ratio = 1) 

ggsave("./export/Visualisation_soil_depth_uncertainty_prediction_maps.pdf", gg1, width = 20, height = 10)
ggsave("./export/Visualisation_soil_depth_uncertainty_prediction_maps.png", gg1, width = 20, height = 10)

(#fig:final plot hide)Final soil depth prediction map

(#fig:final plot second hide)Final soil depth prediction uncertainty map adapted

# 06.6 Simplified version for publication =======================================
  
gg1 <- ggplot() +
      geom_raster(data = rasterdf, aes(x = x, y = y, fill = Prediction)) +
      ggtitle("Soil depth prediction map") +  
      scale_fill_gradientn(colors = palette_wes, name = "Soil depth in cm") + 
      theme_void() +  
      theme(
        legend.position = "right",
        plot.title = element_text(size = 8),
        legend.title = element_text(size = 10),  
        legend.text = element_text(size = 8),
        legend.key.size = unit(0.7, "cm")) +
      coord_equal(ratio = 1)  
    
gg2 <- ggplot() +
  geom_raster(data = rasterdf, aes(x = x, y = y, fill = Uncertainty)) +
  ggtitle("Soil depth uncertainty prediction map") +  
  scale_fill_gradientn(colors = palette_blue, name = "Uncertainty") + 
  theme_void() +  
  theme(
    legend.position = "right",
    plot.title = element_text(size = 8),
    legend.title = element_text(size = 10),  
    legend.text = element_text(size = 8),
    legend.key.size = unit(0.7, "cm")) +
  coord_equal(ratio = 1)  

combined_plot <- plot_grid(gg1, gg2, ncol = 2) 
 
ggsave("./export/Publication_soil_depth_map.pdf", combined_plot, width = 20, height = 10)
ggsave("./export/Publication_soil_depth_map.png", combined_plot, width = 20, height = 10)

dev.off()

(#fig:publication plot hide)Simplified plot of the prediction maps

EROS. 2013. “Landsat 4-5 Thematic Mapper Level-2, Collection 2.” U.S. Geological Survey. https://doi.org/10.5066/P9IAXOVV.

ESA, and Airbus. 2022. “Copernicus DEM.” https://doi.org/10.5270/ESA-c5d3d65.

Fick, Stephen E., and Robert J. Hijmans. 2017. “WorldClim 2: New 1‐km Spatial Resolution Climate Surfaces for Global Land Areas.” International Journal of Climatology 37 (12): 4302–15. https://doi.org/10.1002/joc.5086.

Hulley, Glynn, and Simon Hook. 2018. “VIIRS/NPP Land Surface Temperature Daily L3 Global 1km SIN Grid Day V001.” NASA EOSDIS Land Processes Distributed Active Archive Center. https://doi.org/10.5067/VIIRS/VNP21A1D.001.

Liu, Feng, Fei Yang, Yu-guo Zhao, Gan-lin Zhang, and De-cheng Li. 2022. “Predicting Soil Depth in a Large and Complex Area Using Machine Learning and Environmental Correlations.” Journal of Integrative Agriculture 21 (8): 2422–34. https://doi.org/10.1016/S2095-3119(21)63692-4.

McFeeters, S. K. 1996. “The Use of the Normalized Difference Water Index (NDWI) in the Delineation of Open Water Features.” International Journal of Remote Sensing 17 (7): 1425–32. https://doi.org/10.1080/01431169608948714.

Rouse, J. W., R. H. Haas, D. W. Deering, J. A. Schell, and J. C. Harlan. 1974. “Monitoring the Vernal Advancement and Retrogradation (Green Wave Effect) of Natural Vegetation.” E75-10354. https://ntrs.nasa.gov/citations/19750020419.

Sissakian, Varoujan, Dikran Hagopian, and Eman Hasan. 1995. “Geological Map of Al-Mosul Quadrangle.” Geological map. Baghdad: GEOSURV.

Vaysse, Kévin, and Philippe Lagacherie. 2017. “Using Quantile Regression Forest to Estimate Uncertainty of Digital Soil Mapping Products.” Geoderma 291 (April): 55–64. https://doi.org/10.1016/j.geoderma.2016.12.017.

Yan, Fapeng, Wei Shangguan, Jing Zhang, and Bifeng Hu. 2018. “Depth-to-Bedrock Map of China at a Spatial Resolution of 100 Meters.” https://doi.org/10.5194/essd-2018-103.

Zanaga, Daniele, Ruben Van De Kerchove, Wanda De Keersmaecker, Niels Souverijns, Carsten Brockmann, Ralf Quast, Jan Wevers, et al. 2021. “ESA WorldCover 10 m 2020 V100.” Zenodo. https://doi.org/10.5281/ZENODO.5571936.

Zomer, Robert, and Antonio Trabucco. 2022. “Global Aridity Index and Potential Evapotranspiration (ET0) Database: Version 3.” figshare. https://doi.org/10.6084/M9.FIGSHARE.7504448.V6.

References

EROS. 2013. “Landsat 4-5 Thematic Mapper Level-2, Collection 2.” U.S. Geological Survey. https://doi.org/10.5066/P9IAXOVV.

ESA, and Airbus. 2022. “Copernicus DEM.” https://doi.org/10.5270/ESA-c5d3d65.

Sissakian, Varoujan, Dikran Hagopian, and Eman Hasan. 1995. “Geological Map of Al-Mosul Quadrangle.” Geological map. Baghdad: GEOSURV.

Yan, Fapeng, Wei Shangguan, Jing Zhang, and Bifeng Hu. 2018. “Depth-to-Bedrock Map of China at a Spatial Resolution of 100 Meters.” https://doi.org/10.5194/essd-2018-103.

Zomer, Robert, and Antonio Trabucco. 2022. “Global Aridity Index and Potential Evapotranspiration (ET0) Database: Version 3.” figshare. https://doi.org/10.6084/M9.FIGSHARE.7504448.V6.