Why the “get_output_size” is len(alphabet) + 1 not len(alphabet) in the Keras OCR example?
I am a Keras beginner trying to implement an OCR project with Keras, so I am learning from the Keras OCR example. Here's a link!
I do not understand why get_output_size in the class TextImageGenerator returns len(alphabet) + 1 rather than len(alphabet).
I would appreciate it if someone could tell me why.
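For context, the method in question looks essentially like this (paraphrased from memory of the Keras image_ocr example, so details such as the exact alphabet may differ):

```python
alphabet = 'abcdefghijklmnopqrstuvwxyz '  # characters the model can emit


class TextImageGenerator:
    # ... image and label generation omitted ...

    def get_output_size(self):
        # one softmax unit per character in the alphabet,
        # plus one extra unit (the "+ 1" this question is about)
        return len(alphabet) + 1
```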
keras ocr
asked Nov 22 at 6:11
CaptainSama
2 Answers
It's related to the CTC layer used as the cost function. Reading the scientific papers will give you more perspective, but in short it's an "extra" class used by the model to say "there is no letter here".
Paper by Graves explaining the underlying algorithm
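As a sketch of what that extra class means for the network's output (hypothetical sizes, using the common convention that the blank is the last class index):

```python
import numpy as np

alphabet = 'abcdefghijklmnopqrstuvwxyz '
num_classes = len(alphabet) + 1   # +1 for the CTC blank
blank_index = len(alphabet)       # the extra "there is no letter" class

# At each time-step the softmax output is a distribution over
# num_classes entries: one per real character, plus one for the blank.
timesteps = 5
rng = np.random.default_rng(0)
y_pred = rng.random((timesteps, num_classes))
y_pred /= y_pred.sum(axis=1, keepdims=True)  # normalize each row to a distribution
```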
I will read this paper, thank you.
– CaptainSama
Nov 27 at 8:16
There is one extra character needed in neural networks trained with CTC loss. This extra character essentially means "no character seen at this position" and is called the CTC blank.
It is used to allow different alignments of a text and to allow some white-space between characters (think of an image containing " hello" or "hello " with whitespace around it; for both you want to recognize "hello").
When recognizing the text, these blanks are removed: e.g. when using best path decoding, the best-scoring character at each position is taken, and the blanks are then removed.
To get a better idea of this special CTC blank character, consider the following example. The output of the neural network contains the characters a, b and the CTC blank (denoted as "-").
Picking the best-scoring character at each position t0...t4 gives us "aaa-b". Best path decoding then collapses repeated characters, giving "a-b", and finally removes all blanks, which gives "ab".
If you want some more information, you can look at my CTC article, or this article, or the original paper.
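The decoding steps described above can be sketched like this (a minimal greedy decoder, assuming the blank is the extra last class index):

```python
import numpy as np


def best_path_decode(y_pred, alphabet):
    """Greedy (best path) CTC decoding: take the best-scoring class at
    each time-step, collapse repeated classes, then drop the blanks."""
    blank = len(alphabet)                       # blank is the extra last class
    best = np.argmax(y_pred, axis=1)            # best class per time-step
    collapsed = [c for i, c in enumerate(best)  # collapse repeats
                 if i == 0 or c != best[i - 1]]
    return ''.join(alphabet[c] for c in collapsed if c != blank)
```

With alphabet "ab" (so blank index 2) and per-time-step winners a, a, a, -, b, the decoder collapses "aaa-b" to "a-b" and then strips the blank, returning "ab", matching the walk-through above.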
Thank you for helping me understand this problem. I will read some papers about CTC.
– CaptainSama
Nov 27 at 8:12
answered Nov 22 at 9:25
Daniel GL
answered Nov 22 at 15:18
Harry