Shape of pytorch model.parameter is inconsistent with how it's defined in the model
I'm attempting to extract the weights and biases from a simple network built in PyTorch. My entire network is composed of nn.Linear layers. When I create a layer with nn.Linear(in_dim, out_dim), I expect the parameters returned by model.parameters() to have shape (in_dim, out_dim) for the weight and (out_dim,) for the bias. However, the weights that come out of model.parameters() instead have shape (out_dim, in_dim).
I want to perform the forward pass using only NumPy matrix multiplication, without any PyTorch. Because of the shape inconsistency, the matrix multiplications throw an error. How can I fix this?
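To make the error concrete, here is a minimal sketch of the NumPy multiplication I'm attempting (dummy data, using the model defined below):

import numpy as np

# Pull the parameters out of the model (defined below) as NumPy arrays.
params = [p.detach().numpy() for p in model.parameters()]
W1, b1 = params[0], params[1]   # I expect (12, 100) and (100,), but W1 is actually (100, 12)

x = np.random.rand(5, 12)       # dummy batch of 5 inputs
out = x @ W1 + b1               # raises a shape-mismatch error: (5, 12) @ (100, 12)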
Here is my exact code:
import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, dim_input, dim_recurrent, dim_output):
        super(RNN, self).__init__()
        self.dim_input = dim_input
        self.dim_recurrent = dim_recurrent
        self.dim_output = dim_output
        self.dense1 = nn.Linear(self.dim_input, self.dim_recurrent)
        self.dense2 = nn.Linear(self.dim_recurrent, self.dim_recurrent, bias=False)
        self.dense3 = nn.Linear(self.dim_input, self.dim_recurrent)
        self.dense4 = nn.Linear(self.dim_recurrent, self.dim_recurrent, bias=False)
        self.dense5 = nn.Linear(self.dim_recurrent, self.dim_output)
    # There is a defined forward pass

model = RNN(12, 100, 6)
for i in model.parameters():
    print(i.shape)
The output is:
torch.Size([100, 12])
torch.Size([100])
torch.Size([100, 100])
torch.Size([100, 12])
torch.Size([100])
torch.Size([100, 100])
torch.Size([6, 100])
torch.Size([6])
The output should, if I'm correct, be:
torch.Size([12, 100])
torch.Size([100])
torch.Size([100, 100])
torch.Size([12, 100])
torch.Size([100])
torch.Size([100, 100])
torch.Size([100, 6])
torch.Size([6])
What is my issue?
python machine-learning pytorch
asked Nov 24 '18 at 21:25, edited Nov 24 '18 at 21:34 – Samuel Carpenter

Please share the relevant code and highlight the exact issue there – desertnaut, Nov 24 '18 at 21:29
1 Answer
What you see is not wrong: (out_dim, in_dim) is simply how nn.Linear stores its weight matrix. When you call print(model) you can see that the input and output features are correct:
RNN(
(dense1): Linear(in_features=12, out_features=100, bias=True)
(dense2): Linear(in_features=100, out_features=100, bias=False)
(dense3): Linear(in_features=12, out_features=100, bias=True)
(dense4): Linear(in_features=100, out_features=100, bias=False)
(dense5): Linear(in_features=100, out_features=6, bias=True)
)
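If it helps, a quick sketch for mapping each shape to its layer (this just uses the standard named_parameters() API on the model from the question):

for name, p in model.named_parameters():
    print(name, tuple(p.shape))
# dense1.weight (100, 12)
# dense1.bias (100,)
# dense2.weight (100, 100)
# ...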
You can check the source code to see that the weights are actually transposed before calling matmul.

nn.Linear is defined here:
https://pytorch.org/docs/stable/_modules/torch/nn/modules/linear.html#Linear

Its forward looks like this:

def forward(self, input):
    return F.linear(input, self.weight, self.bias)

F.linear is defined here:
https://pytorch.org/docs/stable/_modules/torch/nn/functional.html

The relevant line that multiplies the weights is:

output = input.matmul(weight.t())
As mentioned above, the weights are transposed before matmul is applied, so nn.Linear effectively computes output = input @ weight.t() + bias, where weight has shape (out_features, in_features). That is why the shape of the weights is different from what you expected.

So if you want to do the matrix multiplication manually, you transpose the weight yourself:

# dummy input: a batch of 5 vectors of length 12
input = torch.rand(5, 12)
# apply layer dense1 (without bias; for the bias just add + model.dense1.bias)
output_first_layer = input.matmul(model.dense1.weight.t())
print(output_first_layer.shape)
Just as you would expect from your dense1, it returns:
torch.Size([5, 100])
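Since your goal was to do this in plain NumPy rather than PyTorch, here is a minimal sketch of the same idea (assuming you first convert the parameters to NumPy arrays with .detach().numpy(); the variable names are just placeholders):

import numpy as np

W1 = model.dense1.weight.detach().numpy()   # shape (100, 12), i.e. (out_dim, in_dim)
b1 = model.dense1.bias.detach().numpy()     # shape (100,)

x = np.random.rand(5, 12)                   # dummy batch of 5 inputs

# Either transpose the weight ...
out = x @ W1.T + b1                         # shape (5, 100)
# ... or swap the operand order, which is equivalent for this 2-D case:
out_alt = (W1 @ x.T).T + b1

print(out.shape)   # (5, 100)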
I hope this explains your observations with the shape :)
answered Nov 24 '18 at 22:16, edited Nov 24 '18 at 22:26 – blue-phoenox
Is there a reason that the weights are transposed? – Samuel Carpenter, Nov 24 '18 at 23:06

@SamuelCarpenter actually I don't know :) I asked a question here: stackoverflow.com/questions/53465608/… – blue-phoenox, Nov 25 '18 at 7:48

@SamuelCarpenter found the answer for this, you can check it out on the link I posted. – blue-phoenox, Nov 25 '18 at 9:26