Simulating data in R
For part of my statistics assignment I have to study the distribution of the sample mean of Poisson data. I'm asked to create a function poi_bar with inputs n, N, lambda that returns a vector of length N, where each entry is the mean of n numbers drawn from a Poisson distribution with parameter lambda.
I've tried a dozen things and searched the internet for hours and have found nothing that tells me how to do this. The closest I was able to get was when I defined the function like this:
poi_bar = function(n, N, lambda) {
  V = rep(c(mean(rpois(n, lambda = lambda))), times = N)
  return(V)
}
To test if this really worked, I tried n = 8, N = 25, lambda = 17, and the result was this:
poi_bar(8,25,17)
[1] 18.375 18.375 18.375 18.375 18.375 18.375 18.375 18.375
[9] 18.375 18.375 18.375 18.375 18.375 18.375 18.375 18.375
[17] 18.375 18.375 18.375 18.375 18.375 18.375 18.375 18.375
[25] 18.375
But I want the samples to be different, not just repeat one twenty-five times.
This is an R question not RStudio related. I will edit the question title.
– Rui Barradas
Nov 29 '18 at 7:49
See the code I added to the answer. It makes it more complete, with an alternative solution meant for speed.
– Rui Barradas
Nov 29 '18 at 19:05
asked Nov 29 '18 at 6:25 by Perry Ainsworth
edited Nov 29 '18 at 7:49 by Rui Barradas
1 Answer
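You should use replicate, not rep. rep evaluates its argument once and then repeats that single value, whereas replicate re-evaluates the expression on each of the N iterations, so every entry comes from a fresh sample.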
poi_bar <- function(n, N, lambda) {
  V <- replicate(N, mean(rpois(n, lambda = lambda)))
  V
}
set.seed(1234)
poi_bar(8, 25, 17)
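To make the difference between the two functions concrete, here is a minimal sketch. The variable names repeated and resampled and the seed are chosen only for illustration and are not part of the original answer.

```r
set.seed(42)

# rep() receives a single, already computed number and copies it.
repeated <- rep(mean(rpois(8, lambda = 17)), times = 5)

# replicate() re-evaluates the expression on every iteration,
# so each entry is the mean of a new sample.
resampled <- replicate(5, mean(rpois(8, lambda = 17)))

length(unique(repeated))   # 1: the same value five times
length(unique(resampled))  # more than 1, barring an extremely unlikely coincidence
```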
Edit.
Though the answer was already accepted, I realized that there is a better, faster way of doing the same thing.
Functions colMeans and rowMeans are generally considerably faster than repeated applications of mean, so what follows checks whether that is true in this use case.
Note that the function poi_bar is the same as above, but to make the timings fair I have rewritten it as a one-liner. The original is clearer.
poi_bar <- function(n, N, lambda) {
  replicate(N, mean(rpois(n, lambda = lambda)))
}
poi_bar2 <- function(n, N, lambda) {
  colMeans(replicate(N, rpois(n, lambda = lambda)))
}
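The second version relies on the fact that replicate, when the expression returns a vector of length n, simplifies its result to a matrix with n rows and one column per replication. A small illustration, with m as a throwaway name and arbitrary sizes:

```r
set.seed(1)

# Each call to rpois(3, ...) returns 3 draws; replicate() binds the
# 5 results together as the columns of a 3 x 5 matrix.
m <- replicate(5, rpois(3, lambda = 17))

dim(m)       # 3 rows (draws per sample), 5 columns (samples)
colMeans(m)  # one mean per column, i.e. one mean per sample
```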
Now test them and see that the results are identical.
set.seed(1234)
p <- poi_bar(8, 2500, 17)
set.seed(1234)
p2 <- poi_bar2(8, 2500, 17)
identical(p, p2)
#[1] TRUE
And the timings. I will use two CRAN packages, microbenchmark to time the functions and ggplot2 to plot the results.
library(ggplot2)
library(microbenchmark)
mb <- microbenchmark(
  v1 = poi_bar(8, 2500, 17),
  v2 = poi_bar2(8, 2500, 17)
)
print(mb)
autoplot(mb)
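A further variant, not part of the answer above, drops replicate altogether: draw all n * N values in a single call to rpois and reshape them into a matrix. The name poi_bar3 is hypothetical, and whether it is actually faster would need to be checked with the same benchmark. The last two lines are a rough sanity check against theory, since the mean of n independent Poisson(lambda) draws has expectation lambda and variance lambda / n.

```r
# poi_bar3 is a hypothetical name used only for this sketch.
poi_bar3 <- function(n, N, lambda) {
  # Draw all n * N values at once and arrange them as an n x N matrix,
  # so that each column holds one sample of size n.
  draws <- matrix(rpois(n * N, lambda = lambda), nrow = n, ncol = N)
  colMeans(draws)
}

set.seed(1234)
p3 <- poi_bar3(8, 2500, 17)
length(p3)  # N entries, one per sample
mean(p3)    # should be close to lambda = 17
var(p3)     # should be close to lambda / n = 17 / 8
```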
answered Nov 29 '18 at 7:48 by Rui Barradas
edited Nov 29 '18 at 19:04