How does the Predict function handle continuous values with a 0 in R for a Poisson Log Link Model?












I am using a Poisson GLM on some dummy data to predict ClaimCount from two variables, Frequency and JudicialOrientation.



Dummy Data Frame:



data5 <- data.frame(
  Year = c("2006", "2006", "2006", "2007", "2007", "2007", "2008", "2009", "2010", "2010", "2009", "2009"),
  JudicialOrientation = c("Defense", "Plaintiff", "Plaintiff", "Neutral", "Defense", "Plaintiff",
                          "Defense", "Plaintiff", "Neutral", "Neutral", "Plaintiff", "Defense"),
  Frequency = c(0, 0.06, 0.07, 0.04, 0.03, 0.02, 0, 0.1, 0.09, 0.08, 0.11, 0),
  ClaimCount = c(0, 5, 10, 3, 4, 0, 7, 8, 15, 16, 17, 12),
  Loss = c(100000, 100, 2500, 100000, 25000, 0, 7500, 5200, 900, 100, 0, 50),
  Exposure = c(10, 20, 30, 1, 2, 4, 3, 2, 1, 54, 12, 13)
)


Model GLM:



ClaimModel <- glm(ClaimCount ~ JudicialOrientation + Frequency,
                  family = poisson(link = "log"), offset = log(Exposure),
                  data = data5, na.action = na.pass)

Call:
glm(formula = ClaimCount ~ JudicialOrientation + Frequency, family = poisson(link = "log"),
    data = data5, na.action = na.pass, offset = log(Exposure))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.7555  -0.7277  -0.1196   2.6895   7.4768

Coefficients:
                             Estimate Std. Error z value Pr(>|z|)
(Intercept)                   -0.3493     0.2125  -1.644      0.1
JudicialOrientationNeutral    -3.3343     0.5664  -5.887 3.94e-09 ***
JudicialOrientationPlaintiff  -3.4512     0.6337  -5.446 5.15e-08 ***
Frequency                     39.8765     6.7255   5.929 3.04e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 149.72  on 11  degrees of freedom
Residual deviance: 111.59  on  8  degrees of freedom
AIC: 159.43

Number of Fisher Scoring iterations: 6


The model also uses log(Exposure) as an offset.



I then want to use this GLM to predict claim counts for the same observations:



data5$ExpClaimCount <- predict(ClaimModel, newdata=data5, type="response")


If I understand correctly, the Poisson GLM equation should be:

ClaimCount = exp(-0.3493 - 3.3343*JudicialOrientationNeutral
                 - 3.4512*JudicialOrientationPlaintiff + 39.8765*Frequency + log(Exposure))




However, when I computed this manually in Excel for several observations (for example, =EXP(-0.3493+0+0+LOG(10)) for observation 1), the results did not match the output of predict().



Is my understanding of the GLM equation incorrect?










  • You're probably seeing different results because LOG in Excel is the base-10 logarithm. Try using LN instead.

    – tkmckenzie
    Nov 27 '18 at 22:56













  • @tkmckenzie Exactly. In R, the default is log(x, base = exp(1)).

    – floe
    Nov 27 '18 at 23:14


















r offset glm predict poisson






asked Nov 27 '18 at 21:57









Coldchain9








1 Answer

Your assumption about how predict() works for a Poisson GLM is correct. This can be verified in R:



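co <- coef(ClaimModel)
# Level index of the factor: 1 = Defense (baseline), 2 = Neutral, 3 = Plaintiff.
# factor() is needed in R >= 4.0, where data.frame() keeps strings as character.
lev <- as.integer(factor(data5$JudicialOrientation))
p1 <- with(data5,
           exp(log(Exposure) +                # offset
               co[1] +                        # intercept
               ifelse(lev > 1, co[lev], 0) +  # factor term (0 for the baseline level)
               Frequency * co[4]))            # linear term

all.equal(p1, predict(ClaimModel, type = "response"), check.names = FALSE)
[1] TRUE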
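The same check can also be written without indexing the coefficients by hand. The following is a minimal sketch, not part of the original answer, that rebuilds the data from the question (only the columns the model uses, and without na.action since there are no missing values) and computes the linear predictor from the design matrix plus the stored offset. The names X, eta and manual are illustrative only.

```r
# Dummy data from the question (only the columns used by the model).
data5 <- data.frame(
  JudicialOrientation = c("Defense", "Plaintiff", "Plaintiff", "Neutral", "Defense", "Plaintiff",
                          "Defense", "Plaintiff", "Neutral", "Neutral", "Plaintiff", "Defense"),
  Frequency  = c(0, 0.06, 0.07, 0.04, 0.03, 0.02, 0, 0.1, 0.09, 0.08, 0.11, 0),
  ClaimCount = c(0, 5, 10, 3, 4, 0, 7, 8, 15, 16, 17, 12),
  Exposure   = c(10, 20, 30, 1, 2, 4, 3, 2, 1, 54, 12, 13)
)

ClaimModel <- glm(ClaimCount ~ JudicialOrientation + Frequency,
                  family = poisson(link = "log"),
                  offset = log(Exposure), data = data5)

# Design matrix: intercept, dummy columns for Neutral and Plaintiff, Frequency.
X <- model.matrix(ClaimModel)

# Linear predictor = X %*% beta + offset; the response is exp() of it.
eta <- drop(X %*% coef(ClaimModel)) + ClaimModel$offset
manual <- exp(eta)

# Compare with predict(), both with and without newdata.
all.equal(unname(manual), unname(predict(ClaimModel, type = "response")))
all.equal(unname(manual), unname(predict(ClaimModel, newdata = data5, type = "response")))
```

In the design matrix, a Frequency of 0 simply contributes nothing to the linear predictor, so rows with a zero value need no special handling.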


As indicated in the comments, you probably get different results in Excel because the logarithm uses a different base: LOG in Excel defaults to base 10, whereas log in R defaults to base e (Euler's number).
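To make the effect of the base concrete, here is a small sketch for observation 1 (Defense, Frequency = 0, Exposure = 10), using the rounded intercept from the summary output in the question. The variable names are illustrative, and the values in the comments are approximate because the coefficient is rounded.

```r
# Rounded intercept from the summary output in the question.
intercept <- -0.3493
exposure  <- 10   # observation 1: Defense (baseline level), Frequency = 0

# R's log() is the natural logarithm, which corresponds to LN() in Excel.
pred_natural <- exp(intercept + log(exposure))

# Excel's LOG() defaults to base 10, which corresponds to log10() in R.
pred_base10 <- exp(intercept + log10(exposure))

pred_natural   # about 7.05, i.e. exposure * exp(intercept)
pred_base10    # about 1.92, what the Excel formula with LOG() evaluates to
```

Replacing LOG(10) with LN(10) in the Excel formula should therefore reproduce the R prediction up to rounding of the coefficients.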






  • That explains it. Thank you. This small detail I was not aware of but this helps me immensely. Thanks!

    – Coldchain9
    Nov 28 '18 at 12:59











answered Nov 28 '18 at 4:00









Florian












