Simulating data in R




















For part of my statistics assignment I have to study the distribution of the sample mean of Poisson data. I'm asked to create a function poi_bar with inputs n, N and lambda that returns a vector of length N, where each entry is the mean of n numbers drawn from a Poisson distribution with rate lambda.



I've tried a dozen things and searched the internet for hours and have found nothing that tells me how to do this. The closest I was able to get was when I defined the function like this:



poi_bar = function(n, N, lambda) {
  V = rep(c(mean(rpois(n, lambda = lambda))), times = N)
  return(V)
}


To test if this really worked, I tried n = 8, N = 25, lambda = 17, and the result was this:



poi_bar(8,25,17)
 [1] 18.375 18.375 18.375 18.375 18.375 18.375 18.375 18.375
 [9] 18.375 18.375 18.375 18.375 18.375 18.375 18.375 18.375
[17] 18.375 18.375 18.375 18.375 18.375 18.375 18.375 18.375
[25] 18.375


But I want each entry to come from a different sample, not one sample mean repeated twenty-five times.



































  • This is an R question, not an RStudio one. I will edit the question title. – Rui Barradas, Nov 29 '18 at 7:49

  • See the code I added to the answer. It makes it more complete, with an alternative solution meant for speed. – Rui Barradas, Nov 29 '18 at 19:05


























asked Nov 29 '18 at 6:25 by Perry Ainsworth, edited Nov 29 '18 at 7:49 by Rui Barradas

























1 Answer














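You should use replicate, not rep. rep receives the value of mean(rpois(...)) after it has been computed once and repeats that single number N times, whereas replicate re-evaluates the expression on every repetition.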



poi_bar <- function(n, N, lambda) {
  # each of the N evaluations draws a new sample of size n
  V <- replicate(N, mean(rpois(n, lambda = lambda)))
  V
}

set.seed(1234)  # make the example reproducible
poi_bar(8, 25, 17)
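The difference between the two functions can be checked directly. This is only a small sketch; the seed and the variable names once and fresh are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# rep() is handed one number (the expression is evaluated once) and copies it
once <- rep(mean(rpois(8, lambda = 17)), times = 5)

# replicate() evaluates the expression again on every iteration
fresh <- replicate(5, mean(rpois(8, lambda = 17)))

length(unique(once))   # always 1: five copies of the same mean
length(unique(fresh))  # usually more than 1: separate samples
```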


Edit.

Though the answer was already accepted, I realized that there is a faster way of doing the same thing.

The functions colMeans and rowMeans are considerably faster than repeated calls to mean, so what follows checks whether that holds in this use case.

Note that poi_bar is the same function as above, but to make the timings fair I have rewritten it as a one-liner. The original is clearer.



poi_bar = function(n, N, lambda) {
  replicate(N, mean(rpois(n, lambda = lambda)))
}

poi_bar2 = function(n, N, lambda) {
  # for n > 1, replicate() returns an n x N matrix, one sample per column
  colMeans(replicate(N, rpois(n, lambda = lambda)))
}


Now test them and see that the results are identical.



set.seed(1234)
p <- poi_bar(8, 2500, 17)

set.seed(1234)
p2 <- poi_bar2(8, 2500, 17)

identical(p, p2)
#[1] TRUE
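A further variant, not part of the original answer, is to skip replicate altogether and draw all n * N values in one call, then shape them into a matrix. The name poi_bar3 is hypothetical. The result follows the same distribution as poi_bar, but I have not checked whether it matches element for element under the same seed, so treat this as a sketch.

```r
# poi_bar3 is a hypothetical name used only for this sketch
poi_bar3 <- function(n, N, lambda) {
  # draw all n * N values at once; each column is one sample of size n
  draws <- matrix(rpois(n * N, lambda = lambda), nrow = n, ncol = N)
  colMeans(draws)
}

set.seed(1234)
p3 <- poi_bar3(8, 2500, 17)
length(p3)  # one mean per column, so N values
```

Because the matrix is built explicitly, this version also works for n = 1, where replicate would simplify its result to a plain vector and colMeans would fail.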


And the timings. I will use two CRAN packages: microbenchmark to time the two functions and ggplot2 to plot the results.



library(ggplot2)
library(microbenchmark)

mb <- microbenchmark(
  v1 = poi_bar(8, 2500, 17),
  v2 = poi_bar2(8, 2500, 17)
)
print(mb)
autoplot(mb)


[Image: plot of the benchmark timings produced by autoplot(mb)]
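Since the assignment is about the distribution of the sample mean, a quick sanity check on the simulated values may also help. For n independent Poisson(lambda) draws, the sample mean has expectation lambda and variance lambda / n. The sketch below compares the simulated values against those two numbers. The variable name x and the histogram labels are illustrative choices.

```r
poi_bar <- function(n, N, lambda) {
  replicate(N, mean(rpois(n, lambda = lambda)))
}

set.seed(1234)
x <- poi_bar(8, 2500, 17)

mean(x)  # should be close to lambda = 17
var(x)   # should be close to lambda / n = 17 / 8 = 2.125

# rough look at the shape of the distribution
hist(x, main = "Means of n = 8 Poisson(17) draws", xlab = "sample mean")
```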














answered Nov 29 '18 at 7:48 by Rui Barradas, edited Nov 29 '18 at 19:04































