Confusion about input shape for Keras Embedding layer
I'm trying to use the Keras Embedding layer to write my own CBoW implementation, to see how it works.
My outputs are the target words I'm trying to predict, each represented by a vector with size equal to my vocabulary. My inputs are the nearby (context) words of each target word, each represented by its one-hot encoded vector.
So for example if my sentence is:
"I ran over the fence to find my dog"
using window size 2, I could generate the following input/output:
[[over, the, to, find], fence] where 'fence' is my target word and 'over', 'the', 'to', 'find' are its nearby words with window 2 (2 in front, 2 in back).
Using sample vocab size of 500 and 100 training samples, after one-hot encoding my input and output, it would have the following dimensions:
y.shape -> (100,500)
X.shape -> (100,4,500)
That is, I have 100 outputs each represented by a 500-sized vector. I have 100 inputs each represented by a series of 4 500-sized vectors.
I have a simple model defined as:
from keras import backend as K
from keras.models import Sequential
from keras.layers import Embedding, Lambda, Dense

model = Sequential()
model.add(Embedding(input_dim=vocabulary_size, output_dim=embedding_size, input_length=2*window_size))
# take the average of the context-word embeddings at the hidden layer
model.add(Lambda(lambda x: K.mean(x, axis=1), output_shape=(embedding_size,)))
model.add(Dense(vocabulary_size, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
However, when I try to fit my model, I get a dimensional exception:
model.fit(X, y, batch_size=10, epochs=2, verbose=1)
ValueError: Error when checking input: expected embedding_6_input to have 2 dimensions, but got array with shape (100, 4, 500)
I can only assume I'm using the Embedding layer incorrectly. I've read both this CrossValidated question and the Keras documentation.
I'm still not sure exactly how the input to this Embedding layer works. I'm fairly certain my input_dim and output_dim are correct, which leaves input_length. According to the CrossValidated question, input_length is the length of my sequence. According to the Keras documentation, my input should have shape (batch_size, input_length).
If my inputs are 4 words, each represented by a one-hot vector of size vocab_size, how do I feed them to the model?
python machine-learning keras word2vec word-embedding
asked Nov 25 '18 at 7:22 by Kevin
edited Nov 28 '18 at 12:16 by today
If the answer resolved your issue, kindly accept it by clicking on the checkmark next to the answer to mark it as "answered" - see What should I do when someone answers my question? – today, Dec 6 '18 at 12:58
1 Answer
The problem is a misunderstanding of what the Embedding layer expects. An Embedding layer is just a trainable look-up table: you give it an integer, which is the index of a word in the vocabulary, and it returns the word vector (i.e. the word embedding) stored at that index. Therefore, its input must be the indices of the words in a sentence, not their one-hot vectors.
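To make the look-up-table idea concrete, here is a minimal NumPy sketch (not Keras code). It assumes a random matrix W standing in for the layer's trainable weights, an arbitrary embedding size of 8, and arbitrary word indices; it shows that indexing rows of W gives the same result as multiplying one-hot vectors by W, which is why the one-hot step is unnecessary.

```python
import numpy as np

vocab_size, embedding_size = 500, 8  # illustrative sizes, not from a real model
rng = np.random.default_rng(0)

# Stand-in for the Embedding layer's weight matrix: one row per vocabulary word.
W = rng.normal(size=(vocab_size, embedding_size))

# Arbitrary word indices for one training sample (4 context words).
indices = np.array([43, 6, 9, 33])

# What an embedding look-up does: pick the rows at those indices.
looked_up = W[indices]  # shape (4, embedding_size)

# The same result via one-hot vectors and a matrix product.
one_hot = np.eye(vocab_size)[indices]  # shape (4, vocab_size)
via_matmul = one_hot @ W               # shape (4, embedding_size)

print(looked_up.shape, np.allclose(looked_up, via_matmul))
```

The look-up avoids building the (4, 500) one-hot array at all, which is the reason the layer takes integer indices as input.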
As an example, if the indices of the words "over", "the", "to" and "find" are 43, 6, 9 and 33 respectively, then the input to the Embedding layer is an array of those indices, i.e. [43, 6, 9, 33]. The training data must therefore have shape (num_samples, num_words_in_a_sentence); in your case, (100, 4). In other words, you don't need to one-hot encode the words of the input data. You can also use word indices as the labels if you switch the loss function to sparse_categorical_crossentropy.
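If the one-hot arrays from the question already exist, one way to get index arrays of the right shape is to take the argmax over the vocabulary axis. The sketch below uses NumPy only and builds synthetic stand-ins for X and y (the names X_onehot, y_onehot, X_idx and y_idx are made up for illustration); it assumes every one-hot vector has exactly one 1.

```python
import numpy as np

num_samples, context_len, vocab_size = 100, 4, 500
rng = np.random.default_rng(0)

# Synthetic stand-ins for the one-hot arrays described in the question.
true_X_idx = rng.integers(0, vocab_size, size=(num_samples, context_len))
true_y_idx = rng.integers(0, vocab_size, size=num_samples)
X_onehot = np.eye(vocab_size, dtype=np.float32)[true_X_idx]  # (100, 4, 500)
y_onehot = np.eye(vocab_size, dtype=np.float32)[true_y_idx]  # (100, 500)

# The position of the single 1 along the vocabulary axis is the word index.
X_idx = X_onehot.argmax(axis=-1)  # (100, 4): input for the Embedding layer
y_idx = y_onehot.argmax(axis=-1)  # (100,): integer labels

print(X_onehot.shape, "->", X_idx.shape)
print(y_onehot.shape, "->", y_idx.shape)
```

X_idx can then be passed to model.fit in place of the one-hot X. The labels can stay one-hot with categorical_crossentropy, or y_idx can be used if the loss is changed to sparse_categorical_crossentropy.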
answered Nov 28 '18 at 12:11 by today
edited Nov 28 '18 at 12:19