Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GeoAI_2021_unit_02_EX_Remote_Sensing_classification_hands_on #3

Open
Baldl opened this issue Nov 8, 2021 · 40 comments
Open

GeoAI_2021_unit_02_EX_Remote_Sensing_classification_hands_on #3

Baldl opened this issue Nov 8, 2021 · 40 comments

Comments

@Baldl
Copy link

Baldl commented Nov 8, 2021

I got two error messages with the variables prediction_cart & prediction_rf because the raster stack is not named like the data frame. To rename the stack something like names(stack) = c("band1","band2",.....) should work.

Copy link

nilsbo123 commented Nov 10, 2021

Recursive Partitioning and Regression Trees :

prediction_cart <- raster::predict(stack, cart, type='class', progress = 'text')

error message : Error in eval(predvars, data, env) : object 'band1' not found

I guess thats because we labeled DF2 in different way than stack, so I changed in the step before
names(DF) = c("id","band1","band2","band3","band4","band5","band6","band7","band8","band9","band10","band11")

to :

names(DF) = c("id","S2B2A_20210613_108_MOF_BOA_10.1","S2B2A_20210613_108_MOF_BOA_10.2","S2B2A_20210613_108_MOF_BOA_10.3","S2B2A_20210613_108_MOF_BOA_10.4","S2B2A_20210613_108_MOF_BOA_10.5","S2B2A_20210613_108_MOF_BOA_10.6","S2B2A_20210613_108_MOF_BOA_10.7","S2B2A_20210613_108_MOF_BOA_10.8","S2B2A_20210613_108_MOF_BOA_10.9","S2B2A_20210613_108_MOF_BOA_10.10","S2B2A_20210613_108_MOF_BOA_10.11")

which are the original label names from the raster layer (stack) and it worked : )
Hope there is no mistake !!

Copy link
Contributor

gisma commented Nov 12, 2021

@nilsbo123that is exactly the reason. however, my suggestion would be to rename the raster stack. preferably directly at the time of creation.

# pasting a filename
fn_noext=xfun::sans_ext(basename(list.files(paste0(envrmt$path_research_area,"/BOA/"),pattern = "S2B2A")))
fn = basename(list.files(paste0(envrmt$path_research_area,"/BOA/"),pattern = "S2B2A"))
# creating a raster stack
stack=raster::stack(paste0(envrmt$path_research_area,"/BOA/",fn))
names(stack) = c("band1","band2","band3","band4","band5","band6","band7","band8","band9","band10","band11")

Copy link

The small code section of "Digitize fast forward traning data" takes a lot of time to run on my system to a point, where I am not able to continue:

fields <- mapview::viewRGB(stack, r = 8, g = 4, b = 3) %>% mapedit::editMap()

Is there any solution for this? It takes me hours and it seems to go on forever without finishing and I am not sure what could be the problem.

Copy link

GeoKL commented Nov 18, 2021

To save my results I had to adjust the code and change it to "saveRDS(train_areas, file.path(envrmt$path_sentinel,"train_areas.RDS"))". Without this change, the data was saved under a wrong name in the data folder.

@gisma
Copy link
Contributor

gisma commented Nov 18, 2021

The small code section of "Digitize fast forward traning data" takes a lot of time to run on my system to a point, where I am not able to continue:

fields <- mapview::viewRGB(stack, r = 8, g = 4, b = 3) %>% mapedit::editMap()

Is there any solution for this? It takes me hours and it seems to go on forever without finishing and I am not sure what could be the problem.
Sorry for the delay .
The problem is probably that you are loading a big data set (presumingly the DOP data) in mapview. Mapview is an Instance of yours browser DOM. Avtually this means if you open it in RStudio/Browser you are limitetd in Menory access corresponding to the memory your brwoser is allowed to use.
To make it short: (1) close all running nstances of your browser and run a clean Rstudio. (2) Reduce the RGB dataset using resample or a PCA. (3) and probably the best appoach do it in QGIS

@gisma
Copy link
Contributor

gisma commented Nov 18, 2021

To save my results I had to adjust the code and change it to "saveRDS(train_areas, file.path(envrmt$path_sentinel,"train_areas.RDS"))". Without this change, the data was saved under a wrong name in the data folder.

I do not exactly understand what you mean with wrong name? The name is case senitive hoever it should not matter for reainging and writing.

Copy link

nilsbo123 commented Nov 18, 2021

When you print envrmt$path_sentinel the last backslash is missing (don't know why), thats why the two strings aren't connected with a backslash

print(paste0(envrmt$path_sentinel,"train_areas.rds")) --> 

C:/Users/nils/Documents/edu/geoAI/data/sentineltrain_areas.rds

And is safed with the "wrong" name in data folder.
I guess GeoKL just added the backslash in "/train_areas.rds" or added a backslash at the end in appendProjectDirList

appendProjectDirList = c("data/sentinel//",
                         "data/vector_data//",
                         "data/sentinel/S2//",
                         "data/sentinel/SAFE//",
                         "data/sentinel/research_area//")
print(envrmt$path_sentinel) -> C:/Users/nils/Documents/edu/geoAI/data/sentinel/

Copy link

When I try to apply the classification on the airborne imagery dataset it takes really long to calculate the DF (DF <- raster::extract(stack, tp, df=TRUE)) and the resulting object is quite big. The further calculations for the different classifications then either take long too or I get error messages like the folowing when I try to run the random forest classification code: "model fit failed for Fold01: mtry=2 Error : kann Vektor der Gr��e 5.1 GB nicht allozieren".
Is there a possibilty to reduce the resolution of the marburg_dop image or is there another solution for the problem?

@envimar
Copy link

envimar commented Nov 19, 2021

For the training data extraction you can consider an alternative approach:

extr = exact_extract(rasterStack, tp, include_cols = c( "class",  "id"))
extr = extr[sapply(extr, is.data.frame)]
df = do.call(rbind, extr)

Them memory problem is unfortunately pretty difficult to address.

  1. increase the memory limit
# check memory limit
m=memory.limit()
# set new memory limit e.g.
memory.limit(size=m + a senseful number) 

see documentation
2. clean up your environment do not save .RData restart just load the essential packages etc.
3. buy additional RAM
Most of the other solutions are tricky because you have to chunk down the data etc.

@envimar
Copy link

envimar commented Nov 19, 2021

The small code section of "Digitize fast forward traning data" takes a lot of time to run on my system to a point, where I am not able to continue:

fields <- mapview::viewRGB(stack, r = 8, g = 4, b = 3) %>% mapedit::editMap()

Is there any solution for this? It takes me hours and it seems to go on forever without finishing and I am not sure what could be the problem.

you could aggregate the image (with loss of visual information) but for digitizing this would be ok.

# 
stack_by2 = aggregate(stack,2)
# if still to big 
stack_by2 = aggregate(stack,4)

Copy link

@gisma @mforoushani I tried to run the script with the aerial image "marburg_dop.tif". Therefore I safed trainings areas consisting of buildings, field, meadows, forest and roads under ther filename "/train_areas_marburg.rds". I then transformed them in the correct crs, extracted them and renamed the to "Marburg_01", "Marburg_02", "Marburg_03" and "Marburg_04", added the "class" category and deleted the geometry. Then I tried to make kmean prediction using 25000 samples, 5 classes, 25 starts and 250 iterations. I showed the foto with mapview and 4 collums, changed them to factors, sown the classification tree and tried to predict my cart. Then I got the message:
"Error in eval(predvars, data, env) : object 'Marburg_01' not found
In addition: Warning message:
In .Internal(gc(verbose, reset, full)) :
closing unused connection 3 (C:/Users/jomue/edu/geoAI/tmp/r_tmp_2022-02-10_173529_15172_60669.gri)"
What do you think is the reason and how can I fix this problem?

@gisma
Copy link
Contributor

gisma commented Feb 10, 2022

@Muenchj4 I need the source coude to have a chance of reproducing it

@Muenchj4
Copy link

@gisma My script looks like this:
library(envimaR)
source(file.path(envimaR::alternativeEnvi(root_folder = "C:/Users/jomue/edu/geoAI",
alt_env_id = "COMPUTERNAME",
alt_env_value = "PCRZP",
alt_env_root_folder = "F:/BEN/edu/geoAI"),
"src/000-rspatial-setup.R"))

add define project specific subfolders

appendProjectDirList = c("data/sentinel/",
"data/vector_data/",
"data/sentinel/S2/",
"data/sentinel/SAFE/",
"data/sentinel/research_area/")

   # 2 - define variables
   #---------------------
   
  
   
   # creating a raster stack
   stack=raster::stack(paste0(envrmt$path_doc,"/marburg_dop.tif"))
   # sentinel true and false color composite 
   mapview::viewRGB(stack, r = 4, g = 3, b = 2)    

   train_area <- mapview::viewRGB(stack, r = 4, g = 4, b = 3) %>% mapedit::editMap()       
   # add class (text) and id (integer number)
   building <- train_area$finished$geometry %>% st_sf() %>% mutate(class = "building", id = 1)
   # fields only
   train_area <- mapview::viewRGB(stack, r = 4, g = 3, b = 2) %>% mapedit::editMap()       
   fields <- train_area$finished$geometry %>% st_sf() %>% mutate(class = "fields", id = 2)     
   # meadows only
   train_area <- mapview::viewRGB(stack, r = 4, g = 3, b = 2) %>% mapedit::editMap()       
   meadows <- train_area$finished$geometry %>% st_sf() %>% mutate(class = "meadows", id = 3)       
   # forest only
   train_area <- mapview::viewRGB(stack, r = 4, g = 4, b = 3) %>% mapedit::editMap()       
   forest <- train_area$finished$geometry %>% st_sf() %>% mutate(class = "forest", id = 4)       
   #road only
   train_area <- mapview::viewRGB(stack, r = 4, g = 4, b = 3) %>% mapedit::editMap()
   road <- train_area$finished$geometry %>% st_sf() %>% mutate(class = "road", id = 5)    
   # bind it together to one file
   train_areas <- rbind(building, fields, meadows, forest, road)
   # save results
   saveRDS(train_areas, paste0(envrmt$path_doc,"/train_areas_marburg.rds"))
   # first we have to project the data into the correct crs
   tp = sf::st_transform(train_areas,crs = sf::st_crs(stack))
   ## next we extract the values from every band of the raster stack 
   # we force the values to be returned as an data frame
   # because extracting the raster way is very slow
   DF <- raster::extract(stack, tp, df=TRUE) 
   # we rename the layers for simplicity
   names(DF) = c("id","Marburg_01","Marburg_02","Marburg_03","Marburg_04")
   # now we add the "class" category which we need later on for training
   # it was dropped during extraction
   DF_sf =st_as_sf(inner_join(DF,tp))       
   # finally we produce a simple data frame without geometry
   DF2 = DF_sf       
   st_geometry(DF2)=NULL       
   ## k-means via RStoolbox
   prediction_kmeans = unsuperClass(stack, nSamples = 25000, nClasses = 5, nStarts = 25,
                                    nIter = 250, norm = TRUE, clusterMap = TRUE,
                                    algorithm = "MacQueen")       
   mapview(prediction_kmeans$map, col = c('darkgreen', 'burlywood', 'green', 'orange'))
   # defining the model 
   cart <- rpart(as.factor(DF2$class)~., data=DF2[,2:4], method = 'class')# the tree       
   rpart.plot(cart, box.palette = 0, main = "Classification Tree")       
   prediction_cart <- raster::predict(stack, cart, type='class', progress = 'text')  
   
   Here the script stopped and the error occurred. 

"Error in eval(predvars, data, env) : object 'Marburg_01' not found
In addition: Warning message:
In .Internal(gc(verbose, reset, full)) :
closing unused connection 3 (C:/Users/jomue/edu/geoAI/tmp/r_tmp_2022-02-10_173529_15172_60669.gri)"

@gisma
Copy link
Contributor

gisma commented Feb 10, 2022

@Muenchj4 pls sent the utput of head(DF) and stack

@Muenchj4
Copy link

@gisma
here it is:
id Marburg_01 Marburg_02 Marburg_03 Marburg_04
1 1 138 153 148 115
2 1 21 27 25 26
3 1 17 13 19 26
4 1 242 243 246 211
5 1 242 243 246 211
6 1 242 243 245 210

@gisma
Copy link
Contributor

gisma commented Feb 10, 2022

still missing the output of stack

@Muenchj4
Copy link

@gisma
class : RasterStack
dimensions : 10000, 20000, 2e+08, 4 (nrow, ncol, ncell, nlayers)
resolution : 0.2, 0.2 (x, y)
extent : 480000, 484000, 5626000, 5628000 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=32 +ellps=GRS80 +units=m +no_defs
names : red, green, blue, nir
min values : 1, 1, 1, 0
max values : 255, 255, 255, 255

@gisma
Copy link
Contributor

gisma commented Feb 10, 2022

So finally I need thehead(DF2) output .
However if you name the DF/DF2 cols c("id","Marburg_01","Marburg_02","Marburg_03","Marburg_04") you need also to name the raster layers according to it c("Marburg_01","Marburg_02","Marburg_03","Marburg_04"). Why are you naming the R,G,B,NIR channels Marburg_XX..?

@Muenchj4
Copy link

head(DF2)
id Marburg_01 Marburg_02 Marburg_03 Marburg_04 class
1 1 138 153 148 115 building
2 1 138 153 148 115 building
3 1 138 153 148 115 building
4 1 138 153 148 115 building
5 1 138 153 148 115 building
6 1 138 153 148 115 building

@Muenchj4
Copy link

I try to run it again with the same names of the RasterLayer as it is "red", "green", "blue" and "nir". Still I am waiting for the output.

@gisma
Copy link
Contributor

gisma commented Feb 10, 2022

It is only necessary that the pred stacks names and the corresponding training data names are identical. Obviously this is fine. Due to the fact that you are working on highresolution data you have to reduce your training data sample doing somthing like
DF = DF2[createDataPartition(DF2$id,list = FALSE,p = 0.05),]

@Muenchj4
Copy link

ok thank you
still I could not train rf, since accuracy and kappa are not applicable for regression models.
I only trained a linear model with rmse. I will aso try with r^2.

@gisma
Copy link
Contributor

gisma commented Feb 10, 2022

why you are training a regression model? To avoid this you need to define the Respons col as of factor type as.factor(DF2$class)

@Muenchj4
Copy link

I now have reduced my training data sample and set the respose col as factor as well. But ist is still trying to build a regression model, which semms not to work with random forest.
this is my script
set.seed(123)

   DF2$class<-as.factor(DF2$class)
   # split data into train and test data and take only a fraction of them
   trainDat =  DF2[createDataPartition(DF2$id,list = FALSE,p = 0.05),]
   

   
   # define a training control object for caret with cross-validation, 10 repeats
   ctrlh = trainControl(method = "cv", 
                        number = 10, 
                        savePredictions = TRUE)
   
   # train random forest via caret model 
   cv_model = train(trainDat[,2:4],
                    trainDat[,5],
                    method = "rf",
                    metric = "Accuracy",
                    trControl = ctrlh,
                    importance = TRUE)
   and I get this message: "Error: Metric Accuracy not applicable for regression models" 

@gisma
Copy link
Contributor

gisma commented Feb 11, 2022

@Muenchj4 as far as I see is your trainDat[,5] the NIR channel and not the class col. pls check. In addition use the metric "Kappa"

Copy link

Does somebody know, how to load the nlcd data used for the example scripts? Under the written Link I cannot find it. There is other nlcd data I can find, but none of them have got tif files.

Copy link
Contributor

gisma commented Feb 12, 2022

Sorry but actually I dont know what do you mean with nlcd data.

@Muenchj4
Copy link

I want to work through the section about supervised classification and I think it would be helpfull to use the same or at least very similar data as in the examples.

Copy link

Trying to write the classification script as rmarkdown file and saving the "meadows" trainings areas following error occurs:
Fehler in st_sf(.): no siple features geometry column present Ruft auf: ... withVisible -> eval -> eval -> %>% -> mutate ->st_sf Ausführung angehalten

Copy link

the error occures writing as usual script and changing into rmarkdown and also writing as Rmarkdown script from the beginning - and only with class "meadows". Does somebody know why?

@mforoushani
Copy link

@Muenchj4 did you get B02? does it work?

@Muenchj4
Copy link

Muenchj4 commented Feb 19, 2022 via email

@Muenchj4
Copy link

Muenchj4 commented Feb 20, 2022

@mforoushani @gisma
Now this Error did not occur again, but the programme forced me to edit already edited layers again and again.
When editing was finished finally I tried to predict on this basic by kmeans and to show the result by mapview, following error occurred:

"Error in res[i] <- readBin(x@file@con, what = dtype, n = 1, size = dsize,  : 
  replacement has length zero
In addition: Warning message:
In rasterCheckSize(x, maxpixels = maxpixels) :
  maximum number of pixels for Raster* viewing is 5e+05 ; 
the supplied Raster* has 2e+08 
 ... decreasing Raster* resolution to 5e+05 pixels
 to view full resolution set 'maxpixels =  2e+08 ' "

In spite of this I tried to define a cart model with rpart function and got the message:

"Graphics error: Plot rendering error"
Because the programme had forced me to edit again it also tried to predict using kmeans again and the error occurred: "Error in res[i] <- readBin(x@file@con, what = dtype, n = 1, size = dsize,  : 
  replacement has length zero
In addition: There were 50 or more warnings (use warnings() to see the first 50)" 

and of course occurred again trying to show by mapview.
Amazingly predicting a cart model worked. Only warnings occurred. Do you know what the reason could be?
LG Münch

@mforoushani
Copy link

@Muenchj4 did you run the file from Marburg or that one from FL (USA)? But overall, I think it happened coz large samples. try to make gridded=TRUE, then maxpixels may be ignored to get a larger sample. You can use the following link and rearrange the plot command. https://rdrr.io/cran/raster/man/plot.html
then we see.
LG

@Muenchj4
Copy link

@mforoushani I used the Marburg File. I will try as you said.
See you

@gisma
Copy link
Contributor

gisma commented Feb 21, 2022

@Muenchj4 I think we go in circles. To support you more effienctly please do one of the following things. Either zip your whole working enviroment togehter with all necessary data and share it with me or push it to github (preferred).
without wanting to offend you, i think the errors lie in the usual and well-known inaccuracies and cursory mistakes that you like to make ;-) The only thing left to do is what I try to convey all the time: read and understand every line of code. I have to do that too, hence the above request. Howevr @mforoushani thank you for your support.

Copy link

@gisma Finally I have finished my prediction scripts, which was not possible before since some packages needed to be updated. Now I tried to plot my prediction, but following error occurred. :

---Plotting predictions---

   par(par(mfrow=c(1,2)))
   # for Sentinel prediction
   plot(prediction_rf$layer,
  •         main = "Cv model sentinel (randome forest)",
    
  •         col = c('darkgreen', 'burlywood', 'green', 'orange'),
    
  •         axes = FALSE,
    
  •    )
    

Error in as.double(y) :
cannot coerce type 'S4' to vector of type 'double'
I do not know why this coercion is necessary and how to manage this.

P.S. My prediction stack looks like this:
class : RasterLayer
dimensions : 561, 769, 431409 (nrow, ncol, ncell)
resolution : 10, 10 (x, y)
extent : 474460, 482150, 5630390, 5636000 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=32 +datum=WGS84 +units=m +no_defs
source : memory
names : layer
values : 1, 4 (min, max)
attributes :
ID value
1 fields
2 forest
3 meadows
4 settlement

@gisma
Copy link
Contributor

gisma commented Feb 26, 2022

@Muenchj4 the message says that there is noch plot method for your object. actually if you look into the example code you you will find that you are not plotting the object@layer but the object

prediction_rf  = predict(stack ,cv_model, progress = "text")
mapview(prediction_rf,col.regions = c('darkgreen', 'burlywood', 'green', 'orange'))

you usually have to check what is the class the object . For instant if you want to plot the classification map of a RSToolbox result you need to choose the layer like exampleobject@map

@Muenchj4
Copy link

Muenchj4 commented Feb 26, 2022 via email

Copy link

At the moment I have some trouble with the number of layers. As you can see here:
DF = DF2[createDataPartition(DF2$id,list = FALSE,p = 0.05),]

prediction_kmeans = unsuperClass(stack, nSamples = 25000, nClasses = 5, nStarts = 25,

  •                              nIter = 250, norm = TRUE, clusterMap = TRUE,
    
  •                              algorithm = "MacQueen")       
    

Error in names<-(*tmp*, value = lnames) :
incorrect number of layer names
In addition: There were 50 or more warnings (use warnings() to see the first 50)
I looked into the stack parameters. As you can see there are 4 layers
class : RasterStack
dimensions : 10000, 20000, 2e+08, 4 (nrow, ncol, ncell, nlayers)
resolution : 0.2, 0.2 (x, y)
extent : 480000, 484000, 5626000, 5628000 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=32 +ellps=GRS80 +units=m +no_defs
names : red, green, blue, nir
min values : 1, 1, 1, 0
max values : 255, 255, 255, 255
and head(DF2) showed me that this was correct:
id red green blue nir class
1 1 44 45 49 63 building
2 1 44 45 49 63 building
3 1 44 45 49 63 building
4 1 44 45 49 63 building
5 1 44 45 49 63 building
But why this error message although the number of layers and the layer names are identical?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants