Shape of pytorch model.parameter is inconsistent with how it's defined in the model

I'm attempting to extract the weights and biases from a simple network built in PyTorch. My entire network is composed of nn.Linear layers. When I create a layer by calling nn.Linear(in_dim, out_dim), I expect the parameters that I get from calling model.parameters() for that model to be of shape (in_dim, out_dim) for the weight and (out_dim) for the bias. However, the weights that come out of model.parameters() instead have shape (out_dim, in_dim).



The intention of my code is to be able to use matrix multiplication to perform a forward pass using only numpy, not any PyTorch. Because of the shape inconsistency, matrix multiplications throw an error. How can I fix this?



Here is my exact code:



import torch.nn as nn

class RNN(nn.Module):

    def __init__(self, dim_input, dim_recurrent, dim_output):
        super(RNN, self).__init__()

        self.dim_input = dim_input
        self.dim_recurrent = dim_recurrent
        self.dim_output = dim_output

        self.dense1 = nn.Linear(self.dim_input, self.dim_recurrent)
        self.dense2 = nn.Linear(self.dim_recurrent, self.dim_recurrent, bias=False)
        self.dense3 = nn.Linear(self.dim_input, self.dim_recurrent)
        self.dense4 = nn.Linear(self.dim_recurrent, self.dim_recurrent, bias=False)
        self.dense5 = nn.Linear(self.dim_recurrent, self.dim_output)

    # There is a defined forward pass

model = RNN(12, 100, 6)

for i in model.parameters():
    print(i.shape)   # .shape is an attribute, not a method


The output is:



torch.Size([100, 12])
torch.Size([100])
torch.Size([100, 100])
torch.Size([100, 12])
torch.Size([100])
torch.Size([100, 100])
torch.Size([6, 100])
torch.Size([6])


The output should, if I'm correct, be:



torch.Size([12, 100])
torch.Size([100])
torch.Size([100, 100])
torch.Size([12, 100])
torch.Size([100])
torch.Size([100, 100])
torch.Size([100, 6])
torch.Size([6])


What is my issue?










python machine-learning pytorch

asked Nov 24 '18 at 21:25, edited Nov 24 '18 at 21:34 – Samuel Carpenter


  • Please share the relevant code and highlight the exact issue there
    – desertnaut, Nov 24 '18 at 21:29

1 Answer
What you see there does not mean the input and output dimensions were swapped; it is simply how PyTorch stores the weight matrix, namely as (out_features, in_features). When you call print(model) you can see that the input and output features are correct:



RNN(
(dense1): Linear(in_features=12, out_features=100, bias=True)
(dense2): Linear(in_features=100, out_features=100, bias=False)
(dense3): Linear(in_features=12, out_features=100, bias=True)
(dense4): Linear(in_features=100, out_features=100, bias=False)
(dense5): Linear(in_features=100, out_features=6, bias=True)
)
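
If you also want to see which layer each printed shape belongs to, you can iterate over named_parameters() instead of parameters(). A minimal sketch, using the model defined in the question:

# print every parameter's name together with its shape
for name, param in model.named_parameters():
    print(name, tuple(param.shape))
# e.g. dense1.weight (100, 12)  -> (out_features, in_features)
#      dense1.bias   (100,)
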


You can check the source code to see that the weights are actually transposed before calling matmul.






nn.Linear is defined here:
https://pytorch.org/docs/stable/_modules/torch/nn/modules/linear.html#Linear



You can check its forward method; it looks like this:



def forward(self, input):
    return F.linear(input, self.weight, self.bias)





F.linear is defined here:
https://pytorch.org/docs/stable/_modules/torch/nn/functional.html



The relevant line that multiplies the weights is:



output = input.matmul(weight.t())


As mentioned above, the weights are transposed before matmul is applied, which is why their shape is different from what you expected.



So if you want to do the matrix multiplication manually, you can do:



# dummy batch of 5 inputs, each with 12 features
input = torch.rand(5, 12)
# apply layer dense1 (without bias; for the bias just add + model.dense1.bias)
output_first_layer = input.matmul(model.dense1.weight.t())
print(output_first_layer.shape)


Just as you would expect from dense1, it returns:



torch.Size([5, 100])


I hope this explains your observations with the shape :)
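
And since the stated goal was a forward pass in plain NumPy: the same idea carries over, you just have to transpose the weight yourself (or multiply from the other side). Here is a minimal sketch for the first layer only, assuming the model from the question; the variable names are just for illustration:

import numpy as np

# pull dense1's parameters out of PyTorch as plain NumPy arrays
W1 = model.dense1.weight.detach().numpy()   # shape (100, 12) = (out_features, in_features)
b1 = model.dense1.bias.detach().numpy()     # shape (100,)

x = np.random.rand(5, 12)                   # dummy batch of 5 inputs with 12 features
out = x @ W1.T + b1                         # transpose the weight, just like F.linear does
print(out.shape)                            # (5, 100)
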






answered Nov 24 '18 at 22:16, edited Nov 24 '18 at 22:26 – blue-phoenox

  • Is there a reason that the weights are transposed?
    – Samuel Carpenter, Nov 24 '18 at 23:06

  • @SamuelCarpenter actually I don't know :) I asked a question here: stackoverflow.com/questions/53465608/…
    – blue-phoenox, Nov 25 '18 at 7:48

  • @SamuelCarpenter found the answer for this, you can check it out on the link I posted.
    – blue-phoenox, Nov 25 '18 at 9:26