How to check if a file is a valid image file?
I am currently using PIL.
    from PIL import Image

    try:
        im = Image.open(filename)
        # do stuff
    except IOError:
        # filename not an image file
        pass
However, while this covers most cases, some image files, such as xcf, svg and psd, are not detected. Psd files throw an OverflowError exception.
Is there some way I could include them as well?
python image identification imghdr
19
It's not particularly common practice to close duplicates across different languages. If you can't find any other Python questions with this leave it open as there could be Python-specific solutions that people want to post that did not make it to the question you posted.
– Paolo Bergantino
May 20 '09 at 18:09
yes, first of all I was really hoping for a Python lib I didn't know about :P and then, as ben pointed out, the magic numbers alone don't validate the entire image.
– Sujoy
May 20 '09 at 18:14
@Sujoy, validating an entire image is nearly impossible, unless you already have a copy of it, because the computer can't tell the difference between a correct colour pixel, and a garbled set of 1s and 0s, as long as all the control (magic numbers) are correct.
– DevinB
May 20 '09 at 18:25
@devinb, agreed, I will just get the magic numbers and be done with it unless someone else comes up with something better that calls for a refactor :)
– Sujoy
May 20 '09 at 18:31
xcf and psd aren't really images, they're project files that contain (often many) images... you could probably make a case for svg though.
– mgalgs
Jan 1 '14 at 19:10
edited Jan 24 '18 at 2:08 by mhaghighat
asked May 20 '09 at 17:55 by Sujoy
8 Answers
Often the first few bytes of a file are a magic number that identifies its format. You could check for this in addition to the exception handling above.
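A minimal sketch of such a magic-number check, using only the standard library. The signature table is an illustrative subset of well-known byte prefixes (PNG, JPEG, GIF, BMP, PSD, XCF), and the helper name `sniff_image_type` is hypothetical. A matching prefix only identifies the format; it says nothing about whether the rest of the file is intact.

```python
# Illustrative subset of well-known file signatures (not exhaustive).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
    (b"8BPS", "psd"),        # Photoshop document
    (b"gimp xcf ", "xcf"),   # GIMP project file
]


def sniff_image_type(path):
    """Return a format name if the file starts with a known signature, else None."""
    with open(path, "rb") as f:
        head = f.read(16)  # the longest signature above is 9 bytes
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return None
```

SVG is not covered here because it is XML text and has no fixed binary prefix.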
6
That won't be sufficient if he's really testing for "valid" images; the presence of a magic number doesn't guarantee that the file hasn't been truncated, for example.
– Ben Blank
May 20 '09 at 18:11
excellent advice, now i just need to figure out what those numbers are. thanks :)
– Sujoy
May 20 '09 at 18:11
@ben, ouch i didnt think of that yet. thats a good point indeed
– Sujoy
May 20 '09 at 18:12
@Ben, how would you expect a library to infer a file has been truncated?
– DevinB
May 20 '09 at 18:25
5
@Ben Blank: True, but solving a problem 99% of the way is often better than not solving it at all.
– Brian R. Bondy
May 20 '09 at 21:14
I have just found the built-in imghdr module. From the Python documentation:
"The imghdr module determines the type of image contained in a file or byte stream."
This is how it works:
    >>> import imghdr
    >>> imghdr.what('/tmp/bass')
    'gif'
Using a module is much better than reimplementing similar functionality.
2
yes imghdr works for most image formats but not all. as per my original problem with svg, xcf and psd files, well those are undetected in imghdr as well
– Sujoy
May 26 '09 at 12:54
2
Your answer is actually better, thanks. Like someone above said: "...but solving a problem 99% of the way is often better than not solving it at all."
– RinkyPinku
Jun 3 '15 at 11:54
2
Worth noting: imghdr.what(path) returns None if the given path is not a recognized image file type. List of currently recognized image types: rgb, gif, pbm, pgm, ppm, tiff, rast, xbm, jpeg, bmp, png, webp, exr.
– patryk.beza
Apr 6 '16 at 15:29
1
Be careful! A valid hdr doesn't mean a valid image (e.g. the image bytes may have been scrambled!)
– Filippo Mazza
Nov 30 '17 at 13:37
1
Per @FilippoMazza 's comment, I can confirm that a bad image that got cut off during transfer can pass this test, but will break when PIL tries to read it.
– kevinmicke
Mar 21 '18 at 19:41
In addition to what Brian is suggesting, you could use PIL's verify method to check if the file is broken.
    im.verify()
"Attempts to determine if the file is broken, without actually decoding the image data. If this method finds any problems, it raises suitable exceptions. This method only works on a newly opened image; if the image has already been loaded, the result is undefined. Also, if you need to load the image after using this method, you must reopen the image file."
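As a rough sketch of how verify() might be wrapped, assuming Pillow is installed (the function name `passes_verify` is hypothetical). The documentation does not say which exceptions count as "suitable", and a comment in this thread reports SyntaxError for corrupt PNG files, so the sketch catches Exception broadly rather than only IOError.

```python
from PIL import Image


def passes_verify(path):
    """Return True if Pillow can open the file and verify() raises nothing."""
    try:
        # verify() must run on a freshly opened image, before any pixel
        # data is loaded; the context manager closes the file afterwards.
        with Image.open(path) as im:
            im.verify()
        return True
    except Exception:
        # Pillow may raise OSError, SyntaxError, or other exceptions,
        # depending on the format and the kind of damage.
        return False
```

A True result means only that verify() found nothing wrong; the image data itself has not been decoded.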
well the main problem is that svg,xcf and psd files cannot be opened with Image.open() hence, no chance of verifying with im.verify()
– Sujoy
May 20 '09 at 19:07
14
My god the PIL documentation is terrible. What exactly is a "suitable exception"?
– Timmmm
Jul 26 '12 at 19:45
Here's the link to the Pillow documentation for Image.verify(). Unfortunately, it's no better, and it looks like they just lifted the paragraph above without adding anything.
– Two-Bit Alchemist
Aug 8 '14 at 18:34
I've seen verify raise SyntaxError for corrupt png files
– Carl
Nov 20 '15 at 3:41
is there a way to verify "WITH actually decoding the image data"?
– Trevor Boyd Smith
Sep 13 '17 at 14:38
You could use python-magic, the Python bindings to libmagic, and then check the MIME types. This won't tell you whether the files are corrupted or intact, but it should be able to determine what type of image each one is.
Well, I do not know about the insides of psd, but I do know that svg is not an image file per se: it is based on XML, so it is essentially a plain text file.
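Since SVG is XML, one possible check is to parse the file and look at the root element. The sketch below uses the standard library's xml.etree.ElementTree; the helper name `looks_like_svg` is hypothetical, and it only confirms that the document is well-formed XML with an svg root, not that it renders correctly. For untrusted input a hardened XML parser would be preferable.

```python
import xml.etree.ElementTree as ET

SVG_NAMESPACE = "http://www.w3.org/2000/svg"


def looks_like_svg(path):
    """Return True if the file is well-formed XML whose root element is svg."""
    try:
        root = ET.parse(path).getroot()
    except (ET.ParseError, OSError):
        return False
    # Namespaced tags look like "{http://www.w3.org/2000/svg}svg";
    # also accept a bare "svg" root with no namespace declared.
    return root.tag in ("{%s}svg" % SVG_NAMESPACE, "svg")
```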
aha, you are right. it is xml. however, it contains some image data embedded in it.
– Sujoy
May 20 '09 at 18:10
On Linux, you could use python-magic (http://pypi.python.org/pypi/python-magic/0.1) which uses libmagic to identify file formats.
AFAIK, libmagic looks into the file and tries to tell you more about it than just the format, such as bitmap dimensions, format version, etc. So you might see this as a superficial test for "validity".
For other definitions of "valid" you might have to write your own tests.
Would checking the file extensions be acceptable or are you trying to confirm the data itself represents an image file?
If checking the file extension is enough, a regular expression or a simple comparison could satisfy the requirement.
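The simple comparison could look like the following sketch. The set of extensions is an arbitrary example, and the helper name `has_image_extension` is hypothetical. It inspects only the file name, never the file contents.

```python
import os

# Example extensions only; adjust to the formats you care about.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".svg", ".xcf", ".psd"}


def has_image_extension(filename):
    """Return True if the file name ends with one of the listed extensions."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in IMAGE_EXTENSIONS
```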
simply checking the extension won't suffice, as one can rename a txt file as jpg or something. I guess only if I can find no other solution will I use extension checking for xcf and svg
– Sujoy
May 20 '09 at 17:59
Understandable, I was just hoping for some clarification before I proceeded to devise a solution that might better suit your needs. Thanks!
– doomspork
May 20 '09 at 18:01
Update
I also implemented the following solution in my Python script here on GitHub.
I also verified that damaged files (jpg) are frequently not 'broken' images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you can still load it with no errors. File truncation, however, always causes errors.
End Update
You can use the Python Pillow (PIL) module, with most image formats, to check if a file is a valid and intact image file.
If you also aim to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but this does not detect all possible image defects; e.g., im.verify() does not detect truncated images (which most viewers load with a greyed area).
Pillow is able to detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. Finally, I suggest using this code:
    import PIL.Image

    try:
        im = PIL.Image.open(filename)
        im.verify()  # I also call verify(); I don't know whether it catches other types of defects
        im.close()   # reopening is necessary in my case
        im = PIL.Image.open(filename)
        im.transpose(PIL.Image.FLIP_LEFT_RIGHT)  # forces the image data to be decoded
        im.close()
    except Exception:
        # manage exceptions here
        pass
In case of image defects this code will raise an exception.
Please consider that im.verify is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations).
With this code you can verify a set of images at about 10 MBytes/sec with standard Pillow or 40 MBytes/sec with the Pillow-SIMD module (modern 2.5 GHz x86_64 CPU).
For the other formats (psd, xcf, ...) you can use the ImageMagick wrapper Wand. The code is as follows:
    import wand.image

    im = wand.image.Image(filename=filename)
    im.flip()  # flip() is a method and must be called; it flips the image in place
    im.close()
But from my experiments, Wand does not detect truncated images; I think it loads the missing parts as a greyed area without any warning.
I read that ImageMagick has an external command, identify, that could do the job, but I have not found a way to invoke that function programmatically and I have not tested this route.
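One way the identify command might be invoked from Python is through the standard subprocess module. This is only a sketch: it assumes ImageMagick's identify executable is on the PATH and that it exits with a non-zero status when it cannot read the file, and the function name `identify_accepts` is hypothetical. Whether identify catches truncated images is not established here.

```python
import shutil
import subprocess


def identify_accepts(path):
    """Return True/False from ImageMagick's identify, or None if it is not installed."""
    exe = shutil.which("identify")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Exit status 0 is taken to mean that identify could read the file.
    return result.returncode == 0
```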
I suggest always performing a preliminary check: verifying that the file size is not zero (or very small) is very cheap:
    import os

    statfile = os.stat(filename)
    filesize = statfile.st_size
    if filesize == 0:
        # manage the 'faulty image' case here
        pass
answered May 20 '09 at 17:58 by Brian R. Bondy (answer suggesting magic numbers)
answered May 24 '09 at 0:29 by Nadia Alramli (answer suggesting imghdr)
edited Aug 8 '14 at 18:32 by Two-Bit Alchemist
answered May 20 '09 at 19:02 by Nadia Alramli (answer suggesting im.verify())
well the main problem is that svg,xcf and psd files cannot be opened with Image.open() hence, no chance of verifying with im.verify()
– Sujoy
May 20 '09 at 19:07
14
My god the PIL documentation is terrible. What is exactly is a "suitable exception"?
– Timmmm
Jul 26 '12 at 19:45
Here's the link to the Pillow documentation for Image.verify(). Unfortunately, it's no better, and it looks like they just lifted the paragraph above without adding anything.
– Two-Bit Alchemist
Aug 8 '14 at 18:34
I've seen verify raise SyntaxError for corrupt png files
– Carl
Nov 20 '15 at 3:41
is there a way to verify "WITH actually decoding the image data"?
– Trevor Boyd Smith
Sep 13 '17 at 14:38
You could use python-magic, the Python bindings to libmagic, and then check the MIME types. This won't tell you whether the files are corrupted or intact, but it should be able to determine what type of image each one is.
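A rough sketch of the MIME check, assuming a current python-magic release that exposes magic.from_file(path, mime=True) and a working libmagic install. The helper names is_image_mime and looks_like_image are made up for illustration, and the import is guarded so the sketch still loads when the binding is missing.

```python
try:
    import magic  # python-magic; also needs the libmagic shared library
except ImportError:  # keep the helpers importable when the binding is missing
    magic = None


def is_image_mime(mime_type):
    """True for MIME types in the image/ family, e.g. image/png."""
    return mime_type.lower().startswith("image/")


def looks_like_image(path):
    """Ask libmagic for the MIME type of `path`; None if python-magic is unavailable."""
    if magic is None:
        return None
    return is_image_mime(magic.from_file(path, mime=True))
```

Which MIME string libmagic reports for xcf, psd or svg depends on its database version, so it is worth printing the raw value for a few sample files before relying on the image/ prefix.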
answered May 20 '09 at 19:29
Kamil Kisiel
Well, I don't know about the internals of psd, but I do know that svg is not an image file per se: it is based on XML, so it is essentially a plain text file.
answered May 20 '09 at 18:03
shylent
aha, you are right. it is xml. however, it contains some image data embedded in it.
– Sujoy
May 20 '09 at 18:10
On Linux, you could use python-magic (http://pypi.python.org/pypi/python-magic/0.1) which uses libmagic to identify file formats.
AFAIK, libmagic looks into the file and tries to tell you more about it than just the format, such as bitmap dimensions, format version, etc. So you might see this as a superficial test for "validity".
For other definitions of "valid" you might have to write your own tests.
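One way such a hand-written test could look is a small magic-number sniffer built on the standard library only. This is a sketch of the idea behind libmagic, not a replacement for it: the signature table is a short hand-picked list, the names SIGNATURES, sniff_bytes and sniff_file are made up for illustration, and the SVG branch is a heuristic because SVG is XML text without a fixed magic number.

```python
# Leading bytes of a few formats, including the psd and xcf files from the question.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"8BPS", "psd"),
    (b"gimp xcf ", "xcf"),
]


def sniff_bytes(head):
    """Guess the format from the first bytes of a file; None if nothing matches."""
    for magic_bytes, name in SIGNATURES:
        if head.startswith(magic_bytes):
            return name
    # Heuristic: an <svg root tag somewhere in the first chunk of text.
    if b"<svg" in head.lower():
        return "svg"
    return None


def sniff_file(path, size=1024):
    """Read the first `size` bytes of `path` and sniff them."""
    with open(path, "rb") as f:
        return sniff_bytes(f.read(size))
```

As the comments on the question point out, matching a signature only shows that the file starts like an image of that type; it says nothing about whether the rest of the data is intact.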
edited May 20 '09 at 18:22
answered May 20 '09 at 18:05
fmarc
Would checking the file extensions be acceptable or are you trying to confirm the data itself represents an image file?
If you can check the file extension, a regular expression or a simple comparison could satisfy the requirement.
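For illustration, the simple comparison could be sketched like this with the standard library. The extension set and the name has_image_extension are assumptions to adapt, and the check looks only at the name, never at the file contents.

```python
import os

# Hypothetical allow-list; extend it with whatever formats you accept.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".xcf", ".psd"}


def has_image_extension(filename):
    """Case-insensitive check of the final extension only."""
    _, ext = os.path.splitext(filename)
    return ext.lower() in IMAGE_EXTENSIONS
```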
answered May 20 '09 at 17:57
doomspork
simply checking the extension won't suffice, as one can rename a txt file to jpg or something. i guess i will fall back to extension checking for xcf and svg only if i can find no other solution
– Sujoy
May 20 '09 at 17:59
Understandable, I was just hoping for some clarification before I proceeded to devise a solution that might better suit your needs. Thanks!
– doomspork
May 20 '09 at 18:01
Update
I also implemented the following solution in my Python script here on GitHub.
I also verified that damaged files (jpg) are frequently not 'broken' images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you can still load it with no errors. File truncation, however, always causes errors.
End Update
You can use the Python Pillow (PIL) module, with most image formats, to check whether a file is a valid and intact image file.
If you also aim to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but it does not detect all possible image defects; e.g., im.verify() does not detect truncated images (which most viewers load with a greyed area).
Pillow is able to detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. In the end I suggest using this code:
from PIL import Image

try:
    im = Image.open(filename)
    im.verify()  # I also run verify; I don't know whether it catches other types of defects
    im.close()   # reopening is necessary after verify()
    im = Image.open(filename)
    im.transpose(Image.FLIP_LEFT_RIGHT)  # the manipulation forces a full decode
    im.close()
except Exception:
    # manage exceptions here
    pass
In case of image defects this code will raise an exception.
Please consider that im.verify is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations).
With this code you can verify a set of images at about 10 MB/s with standard Pillow, or about 40 MB/s with the Pillow-SIMD module (on a modern 2.5 GHz x86_64 CPU).
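The decode step can also be sketched as a small reusable function. This variant is an assumption on my part rather than the code measured above: it calls im.load() to force the decode instead of flipping the image, and the name decodes_fully is made up for illustration.

```python
from PIL import Image  # Pillow, assumed to be installed


def decodes_fully(path):
    """Return True if Pillow can decode the pixel data of the file at `path`."""
    try:
        with Image.open(path) as im:
            im.load()  # forces the full decode that verify() skips
        return True
    except Exception:  # truncated or corrupt data surfaces here as an exception
        return False
```

Two caveats: only the first frame of a multi-frame file is decoded, and if ImageFile.LOAD_TRUNCATED_IMAGES has been set to True elsewhere in the program, Pillow tolerates truncated files and this check will no longer flag them.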
For the other formats (psd, xcf, ...) you can use Wand, the ImageMagick wrapper. The code is as follows:
import wand.image

im = wand.image.Image(filename=filename)
im.flip()  # flip is a method; calling it forces the image to be processed
im.close()
However, in my experiments Wand does not detect truncated images; I think it loads the missing parts as a greyed area without warning.
I read that ImageMagick has an external command, identify, that could do the job, but I have not found a way to invoke that function programmatically and I have not tested this route.
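One possible way to call identify from Python is through subprocess. This is an untested-in-practice sketch in the same spirit as the remark above: it assumes the identify executable is on PATH (some ImageMagick 7 installs expose it as magick identify instead), it treats a non-zero exit status as failure, and whether that status actually catches truncated files would need checking. The name identify_ok is made up for illustration.

```python
import shutil
import subprocess


def identify_ok(path, executable="identify"):
    """Run ImageMagick's identify on `path`.

    Returns True or False for a zero or non-zero exit status, or None
    when the executable cannot be found on PATH.
    """
    exe = shutil.which(executable)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```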
I suggest always performing a preliminary check that the file size is not zero (or very small); it is a very cheap test:
import os

statfile = os.stat(filename)
filesize = statfile.st_size
if filesize == 0:
    # manage the 'faulty image' case here
    pass
edited Nov 28 '18 at 1:30
answered Nov 25 '18 at 19:03
Fabiano Tarlao