Getting PDF page length












In my articles, which are formatted as PDF, one or more pages may be blank, and I want to detect them and remove them from the PDF file. If I can identify the pages that are smaller than 60 KB, I think I can detect the empty ones, because pages that small are probably empty.

I tried this:



// verbatim string (@"...") so the backslashes in the path are not treated as escapes
var reader = new PdfReader(@"D:\_test\file.pdf");
/*
 * reader.FileLength gives me the size of the whole PDF file,
 * but I don't know how to get the size of each page...
 */
for (var i = 1; i <= reader.NumberOfPages; i++)
{
    /*
     * MessageBox.Show(???);
     */
}









  • How about splitting the PDF into multiple PDFs, one for each page, and then measuring their respective sizes?
    – Uwe Keim
    Nov 23 at 8:25












  • @uweKeim, I don't want to split the PDF file page by page. Think about how useful it would be for me to split a storybook page by page. Splitting the file into single pages, removing the blank ones, and then reassembling it doesn't sound professional to me.
    – Colin Henricks
    Nov 23 at 14:59
















c# itext page-size






edited Nov 23 at 15:04

























asked Nov 22 at 17:46









Colin Henricks


1 Answer
I would do this in 2 steps:




  • first go over the document using IEventListener to detect which pages are empty

  • once you've determined which pages are empty, simply create a new document by copying the non-empty pages from the source document into the new document


step 1:



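import java.io.File;
import java.util.ArrayList;
import java.util.List;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor;

List<Integer> emptyPages = new ArrayList<>();
PdfDocument pdfDocument = new PdfDocument(new PdfReader(new File(SRC)));
// page numbers are 1-based, so the bound must be <= to include the last page
for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++) {
    IsEmptyEventListener l = new IsEmptyEventListener();
    new PdfCanvasProcessor(l).processPageContent(pdfDocument.getPage(i));
    if (l.isEmptyPage()) {
        emptyPages.add(i);
    }
}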


Then you need a proper implementation of IsEmptyEventListener, which may be tricky and may depend on your specific document(s). This is a demo:



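import java.util.Set;
import com.itextpdf.kernel.pdf.canvas.parser.EventType;
import com.itextpdf.kernel.pdf.canvas.parser.data.IEventData;
import com.itextpdf.kernel.pdf.canvas.parser.listener.IEventListener;

class IsEmptyEventListener implements IEventListener {
    private int eventCount = 0;
    public void eventOccurred(IEventData data, EventType type) {
        // perhaps count only text rendering events?
        eventCount++;
    }
    // IEventListener also requires this method; null means "all event types"
    public Set<EventType> getSupportedEvents() { return null; }
    // demo threshold: fewer than 32 events counts as an empty page
    public boolean isEmptyPage() { return eventCount < 32; }
}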
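One way to act on the comment in the demo is to tally events per kind and let only some kinds count as content. The sketch below shows that rule in isolation. `Kind` is a stand-in enum invented here so the code runs without iText (in a real listener, the `type` argument of `eventOccurred` would play that role), and the class name and the limit of 32 path events are assumptions to tune against your own documents.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for a listener's bookkeeping; not an iText class.
class BlankPageTally {
    // Stand-in for the kinds of events a content-stream parser reports.
    enum Kind { TEXT, IMAGE, PATH }

    private final Map<Kind, Integer> counts = new EnumMap<>(Kind.class);
    private final int maxPathEvents;

    BlankPageTally(int maxPathEvents) {
        this.maxPathEvents = maxPathEvents;
    }

    // Call once per reported event.
    void record(Kind kind) {
        counts.merge(kind, 1, Integer::sum);
    }

    private int count(Kind kind) {
        return counts.getOrDefault(kind, 0);
    }

    // Any text or image makes the page non-empty; a few stray paths
    // (borders, crop marks) are tolerated up to maxPathEvents.
    boolean isEmptyPage() {
        return count(Kind.TEXT) == 0
            && count(Kind.IMAGE) == 0
            && count(Kind.PATH) <= maxPathEvents;
    }

    public static void main(String[] args) {
        BlankPageTally page = new BlankPageTally(32);
        page.record(Kind.PATH);
        System.out.println(page.isEmptyPage()); // one path only: prints true
        page.record(Kind.TEXT);
        System.out.println(page.isEmptyPage()); // text present: prints false
    }
}
```

Note that this rule would treat a page holding only whitespace text as non-empty, so a real listener may also want to inspect the rendered text before counting it.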


step 2:



Based on this example: https://developers.itextpdf.com/examples/stamping-content-existing-pdfs/clone-reordering-pages



void copyNonBlankPages(List<Integer> blankPages, PdfDocument src, PdfDocument dst) {
    int N = src.getNumberOfPages();
    List<Integer> toCopy = new ArrayList<>();
    // page numbers are 1-based; <= makes sure the last page is considered too
    for (int i = 1; i <= N; i++) {
        if (!blankPages.contains(i)) {
            toCopy.add(i);
        }
    }
    src.copyPagesTo(toCopy, dst);
}
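The page-selection part of step 2 does not depend on iText, so it can be sketched and checked on its own. The names below (`PageSelection`, `pagesToKeep`) are made up for illustration. The sketch assumes 1-based page numbers, as in the loops above, and uses a `HashSet` so that each lookup is constant time instead of a linear `List.contains` scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of iText: picks the pages to copy.
class PageSelection {

    // Returns the 1-based page numbers in [1, pageCount] that are not in blankPages.
    static List<Integer> pagesToKeep(int pageCount, Collection<Integer> blankPages) {
        Set<Integer> blank = new HashSet<>(blankPages);
        List<Integer> keep = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) { // <= so the last page is included
            if (!blank.contains(page)) {
                keep.add(page);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        // Five pages, of which pages 2 and 5 were flagged as blank.
        System.out.println(pagesToKeep(5, Arrays.asList(2, 5))); // prints [1, 3, 4]
    }
}
```

The resulting list is in ascending order and could be handed to `copyPagesTo` in the same way as `toCopy` in the method above.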





        answered Nov 27 at 12:36









        Joris Schellekens
