Speeding up a loop over rasters





I have a large dataset of 30000 rasters. My goal is to extract the mean value within a polygon located inside each raster, and to create a file containing the extracted values together with the dates taken from the raster filenames.

I succeeded in doing this with the following loop:

    for (i in 1:length(rasters2014)){
      a <- raster(rasters2014[i])
      ext[i] <- as.vector(extract(a, poligon2, fun=mean, na.rm=TRUE, df=F))
    }
    output2 = data.frame(ext, filename=filename2014)

The problem is that this loop takes about 2.5 hours to complete. Does anyone have an idea how I could speed up this process?
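For the filename part of the task, here is a minimal sketch of pulling dates out of filenames and pairing them with the extracted values. The filenames, the YYYY-MM-DD pattern, and the placeholder values in ext are assumptions made for illustration; the real naming scheme is not shown above, so the regular expression would need adjusting.

```r
# Hypothetical filenames; the real naming scheme is not shown in the question
filename2014 <- c("ndvi_2014-01-01.tif", "ndvi_2014-01-17.tif", "ndvi_2014-02-02.tif")

# Pull the first YYYY-MM-DD pattern out of each name.
# Note: regmatches() silently drops names that do not match the pattern.
date_txt <- regmatches(filename2014,
                       regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", filename2014))
dates <- as.Date(date_txt, format = "%Y-%m-%d")

# Placeholder values standing in for the extracted polygon means
ext <- c(0.31, 0.35, 0.42)

output2 <- data.frame(date = dates, ext = ext, filename = filename2014)
```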







r r-raster






asked Nov 28 '18 at 12:53 by PaulG








  • It looks like you might be growing ext in a loop (not a good idea!). Try either initializing it to the length you need or using a different loop, for example: out = lapply(rasters2014, function(x) {as.vector(extract(raster(x), poligon2, fun=mean, na.rm=TRUE, df=F))})
    – docendo discimus, Nov 28 '18 at 13:12

  • Thanks for the help! I tried your loop, but unfortunately it doesn't speed things up. I can't shorten the loop either, as I must run it for all 30000 elements.
    – PaulG, Nov 28 '18 at 15:01

  • If each raster has a single band, you can stack the rasters into a multi-band raster and then call extract once (no loop) to get the mean within the polygon for all of the bands.
    – qdread, Nov 28 '18 at 16:26
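The first comment's suggestion can be sketched in base R as follows. Here slow_mean and inputs are hypothetical stand-ins for the per-raster extract() call and the list of files. As the follow-up comment reports, this pattern tidies the loop but may not change the runtime when the per-file work dominates.

```r
# Hypothetical stand-in for the per-raster work (replace with the extract() call)
slow_mean <- function(x) mean(x, na.rm = TRUE)

# Hypothetical stand-in for the 30000 raster files
inputs <- list(c(1, 2, 3), c(4, NA, 6), c(10, 20))

# Option 1: preallocate the result vector, then fill it by index
ext <- numeric(length(inputs))
for (i in seq_along(inputs)) {
  ext[i] <- slow_mean(inputs[[i]])
}

# Option 2: vapply() allocates the result and checks that each call
# returns exactly one numeric value
ext2 <- vapply(inputs, slow_mean, numeric(1))
```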














3 Answers
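If your rasters are all properly aligned (same ncol, nrow, extent, origin, resolution), you could try identifying the cell numbers to be extracted by looking at the first file, then extracting based on those. This could speed up the processing because raster does not need to recompute which cells to extract for every file. Something like this:

    library(raster)

    # Find the cells covered by the polygon once, using the first raster
    # (assumes poligon2 holds a single polygon).
    rast1 <- raster(rasters2014[1])
    cell_df <- extract(rast1, poligon2, cellnumbers = TRUE, df = TRUE)
    cells <- cell_df[, 2]  # columns are: polygon ID, cell number, value

    # Preallocate the list that stores the results
    ext <- vector("list", length(rasters2014))

    for (i in seq_along(rasters2014)) {
      a <- raster(rasters2014[i])
      # Extracting by cell number returns the raw values, so average them here
      ext[[i]] <- mean(extract(a, cells), na.rm = TRUE)
    }

Note that I am also preallocating the list that stores the results, to avoid "growing" a vector inside the loop, which is usually wasteful.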



Alternatively, as suggested by @qdread, you could build a RasterStack using raster::stack(rasters2014, quick = TRUE) and call extract over the stack to avoid the for loop. I don't know which would be faster.
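A minimal sketch of the stack-based alternative, using three tiny in-memory rasters and a square polygon as stand-ins (all made up for illustration) so that it runs without any files. With real data, s would instead come from raster::stack(rasters2014, quick = TRUE), and the polygon would be poligon2.

```r
library(raster)

# Three small, aligned, single-layer rasters standing in for the files
r1 <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10, vals = 1)
r2 <- setValues(r1, 2)
r3 <- setValues(r1, 3)
s <- stack(r1, r2, r3)

# A square polygon standing in for poligon2
poly <- as(extent(2, 6, 2, 6), "SpatialPolygons")

# One call: returns a matrix with one row per polygon and one column per layer
means <- extract(s, poly, fun = mean, na.rm = TRUE)
```

With a single polygon, as.vector(means) gives one value per layer in stack order, which lines up with the order of the input files.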



            HTH







answered Nov 28 '18 at 16:48 by lbusett (edited Nov 29 '18 at 11:01)

























If your polygons do not overlap (and in most cases they don't), an alternative route is

    library(raster)
    # rasterize() needs a Raster object as its template, not a filename
    x <- rasterize(poligon2, raster(rasters2014[1]))
    s <- raster::stack(rasters2014, quick = TRUE)
    z <- zonal(s, x, "mean")

PS: Faster is nicer, but I would suggest getting lunch while this runs.







answered Nov 29 '18 at 3:50 by Robert Hijmans























Thanks for your help! I've tried all of the proposed solutions, and the computation time is generally the same regardless of the method applied. Therefore, I guess that it is just not possible to significantly speed up the computation.







answered Nov 29 '18 at 10:13 by PaulG





























