PDFBox 2.0+ (Java): flattening FreeText annotations created by Foxit

Question


I ran into a very tough issue. We have forms that were supposed to be filled out, but some people used free-text annotation comments in Foxit instead of filling in the form fields, so their text never gets flattened. When our rendering software generates the final document, the annotations are not included.

The solution I tried is to go through the document, read each annotation's text content, write it onto the page so that it appears in the final document, and then remove the annotation itself. The problem is that I don't know the font, line spacing, and so on that the annotation uses, and I cannot find out how to get them from PDFBox, so I cannot recreate the text exactly as the annotation looks on the unflattened form.

Basically, I want to flatten free-text annotations created in Foxit (the Typewriter comment feature).

Here is the code. It works, but I am struggling to figure out how to write the annotations to my final PDF document. Flattening the AcroForm does not help because these are not AcroForm fields! The live code filters out anything that is not a FreeText annotation, but the code below should show my issue.

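import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;

// The class name is a placeholder; the original post shows only the method.
public class FlattenTest {

    public static void main(String[] args) {
        String startDoc = "C:/test2/test.pdf";
        String finalFlat = "C:/test2/test_FLAT.pdf";

        // for testing
        try (PDDocument pdDoc = PDDocument.load(new File(startDoc))) {
            PDAcroForm pdAcroForm = pdDoc.getDocumentCatalog().getAcroForm();

            // clear the NeedAppearances flag (a document may have no AcroForm at all)
            if (pdAcroForm != null) {
                pdAcroForm.setNeedAppearances(false);
            }

            for (PDPage page : pdDoc.getPages()) {
                // annotations that could not be written out stay on the page
                List<PDAnnotation> remaining = new ArrayList<>();

                for (PDAnnotation annot : page.getAnnotations()) {
                    System.out.println(annot.getContents());
                    System.out.println(annot.isPrinted());
                    System.out.println(annot.isLocked());
                    System.out.println(annot.getAppearance());

                    String contents = annot.getContents();
                    if (contents == null || annot.getRectangle() == null) {
                        remaining.add(annot);
                        continue;
                    }

                    // font and size are guesses; this is the part I cannot get right
                    int fontHeight = 14;
                    float height = annot.getRectangle().getLowerLeftY();
                    String s = contents.replaceAll("\t", "    ");

                    try (PDPageContentStream contentStream = new PDPageContentStream(
                            pdDoc, page, PDPageContentStream.AppendMode.APPEND, true, true)) {
                        contentStream.setFont(PDType1Font.TIMES_ROMAN, fontHeight);
                        // one showText call per line; line breaks cannot be passed to showText
                        for (String line : s.split("\\r\\n|\\r|\\n")) {
                            contentStream.beginText();
                            contentStream.newLineAtOffset(annot.getRectangle().getLowerLeftX(), height);
                            contentStream.showText(line);
                            contentStream.endText();
                            height = height + fontHeight * 2;
                        }
                    }
                }
                // replace the list after iterating instead of removing entries inside the loop
                page.setAnnotations(remaining);
            }

            if (pdAcroForm != null) {
                pdAcroForm.flatten();
            }
            pdDoc.save(finalFlat);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}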
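
On the font question: per the PDF specification, a FreeText annotation carries a default appearance string in its /DA entry, typically something like `/Helv 12 Tf 0 g`, which names a font resource and a size. The sketch below is a minimal, PDFBox-independent parser for such a string. It assumes the Foxit typewriter annotations populate /DA in that usual form, which has not been verified here. The class `DefaultAppearance` and its fields are made up for illustration; with PDFBox 2.x the raw string could presumably be read with `annot.getCOSObject().getString(COSName.DA)`.

```java
import java.util.Optional;

/** Hypothetical holder for the font resource name and size found in a /DA string. */
final class DefaultAppearance {
    final String fontName; // resource name without the leading slash, e.g. "Helv"
    final float fontSize;  // operand of the Tf operator

    private DefaultAppearance(String fontName, float fontSize) {
        this.fontName = fontName;
        this.fontSize = fontSize;
    }

    /**
     * Looks for the "/Name size Tf" triple in a default appearance string.
     * Returns an empty Optional when the string is null or has no usable Tf operator.
     */
    static Optional<DefaultAppearance> parse(String da) {
        if (da == null) {
            return Optional.empty();
        }
        String[] tokens = da.trim().split("\\s+");
        for (int i = 2; i < tokens.length; i++) {
            // Tf takes two operands: the font resource name and the size
            if ("Tf".equals(tokens[i]) && tokens[i - 2].startsWith("/")) {
                try {
                    float size = Float.parseFloat(tokens[i - 1]);
                    return Optional.of(new DefaultAppearance(tokens[i - 2].substring(1), size));
                } catch (NumberFormatException e) {
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

Calling `DefaultAppearance.parse("0 0 0 rg /Helv 12 Tf")` yields the name `Helv` and the size `12`. The name is a font resource name, not necessarily the real font name, and /DA says nothing about line spacing, so this only narrows the guesswork rather than removing it.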

Answer 1

Score: 1


This was not a fun one. Even after a million different tests I STILL do not understand all the nuances, but this is the version that appears to flatten all PDF files and annotations, as long as the annotations are visible in the PDF. I tested it against about half a dozen PDF creators, and if an annotation is visible on a page, this hopefully flattens it. I suspect there is a better way, by pulling the matrix and transforming it and whatnot, but this is the only way I got it to work everywhere.

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAppearanceStream;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.util.Matrix;

// The class name is a placeholder; the original post shows only the method.
public class AnnotationFlattener {

    public static void flattenv3(String startDoc, String endDoc) {
        // Optional; needs log4j 1.x on the classpath.
        org.apache.log4j.Logger.getRootLogger().setLevel(org.apache.log4j.Level.INFO);

        try (PDDocument pdDoc = PDDocument.load(new File(startDoc))) {
            // Flatten the real form fields first, if the document has an AcroForm.
            PDAcroForm pdAcroForm = pdDoc.getDocumentCatalog().getAcroForm();
            if (pdAcroForm != null) {
                pdAcroForm.setNeedAppearances(false);
                pdAcroForm.flatten();
            }

            for (PDPage page : pdDoc.getPages()) {
                boolean isContentStreamWrapped = false;

                for (PDAnnotation annotation : page.getAnnotations()) {
                    PDAppearanceStream appearanceStream = annotation.getNormalAppearanceStream();
                    // Only visible annotations that have a normal appearance can be drawn.
                    if (annotation.isInvisible() || annotation.isHidden() || appearanceStream == null) {
                        continue;
                    }
                    PDRectangle bbox = appearanceStream.getBBox();
                    PDRectangle fieldRect = annotation.getRectangle();
                    if (bbox == null || fieldRect == null) {
                        continue;
                    }

                    // Despite the name, this is the width difference between Rect and BBox.
                    float xScale = fieldRect.getWidth() - bbox.getWidth();
                    float lowerLeftX = fieldRect.getLowerLeftX();
                    float lowerLeftY = fieldRect.getLowerLeftY();

                    // Empirical placement rules found by testing several PDF creators.
                    if (bbox.getLowerLeftX() <= 0 && bbox.getLowerLeftY() < 0 && Math.abs(xScale) < 1) {
                        // BASICALLY EQUAL TO 0 WITH ROUNDING
                        lowerLeftY = fieldRect.getLowerLeftY() - bbox.getLowerLeftY();
                        if (bbox.getLowerLeftX() < 0) { // THis is for the o
                            lowerLeftX = lowerLeftX - bbox.getLowerLeftX();
                        }
                    } else if (bbox.getLowerLeftX() == 0 && bbox.getLowerLeftY() < 0 && xScale >= 0) {
                        lowerLeftX = fieldRect.getUpperRightX();
                    } else if (bbox.getLowerLeftY() <= 0 && xScale >= 0) {
                        lowerLeftY = fieldRect.getLowerLeftY() - bbox.getLowerLeftY() - xScale;
                    } else if (bbox.getUpperRightY() <= 0) {
                        if (appearanceStream.getMatrix().getShearY() < 0) {
                            lowerLeftY = fieldRect.getUpperRightY();
                            lowerLeftX = fieldRect.getUpperRightX();
                        }
                    }

                    Matrix transformationMatrix = new Matrix();
                    transformationMatrix.translate(lowerLeftX, lowerLeftY);

                    // Only the first stream appended to a page resets the graphics context.
                    try (PDPageContentStream contentStream = new PDPageContentStream(
                            pdDoc, page, AppendMode.APPEND, true, !isContentStreamWrapped)) {
                        isContentStreamWrapped = true;
                        contentStream.saveGraphicsState();
                        contentStream.transform(transformationMatrix);
                        contentStream.drawForm(new PDFormXObject(appearanceStream.getCOSObject()));
                        contentStream.restoreGraphicsState();
                    }
                }
                // As in the original: every annotation is dropped, including any that were skipped.
                page.setAnnotations(new ArrayList<PDAnnotation>());
            }

            pdDoc.save(endDoc);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
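
The "better way by pulling the matrix" mentioned above is probably the appearance-stream mapping described in the PDF specification (ISO 32000-1, section 12.5.5): transform the BBox by the appearance stream's Matrix, take the upright bounding box of the result, then build a matrix A that scales and translates that box onto the annotation's Rect. The sketch below computes A with plain `java.awt.geom` types so it runs without PDFBox. The class and method names are made up for illustration, and it has not been checked against the Foxit files from the question.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

/** Hypothetical helper that computes where an appearance stream belongs on the page. */
final class AppearancePlacement {

    /**
     * Returns matrix A: it maps the transformed appearance box (the BBox after
     * the form Matrix has been applied) onto the annotation rectangle.
     * Rectangles are given as (minX, minY, width, height) in PDF user space.
     */
    static AffineTransform placement(Rectangle2D bbox, AffineTransform formMatrix, Rectangle2D rect) {
        // Step 1: upright bounding box of the BBox transformed by the form Matrix.
        Rectangle2D tb = formMatrix.createTransformedShape(bbox).getBounds2D();

        // Step 2: scale factors that make the transformed box as large as Rect.
        double sx = tb.getWidth() == 0 ? 1 : rect.getWidth() / tb.getWidth();
        double sy = tb.getHeight() == 0 ? 1 : rect.getHeight() / tb.getHeight();

        // Applied to a point, the calls below run in reverse order:
        // move the box to the origin, scale it, then move it to Rect's lower-left corner.
        AffineTransform a = new AffineTransform();
        a.translate(rect.getX(), rect.getY());
        a.scale(sx, sy);
        a.translate(-tb.getX(), -tb.getY());
        return a;
    }
}
```

Because drawing a form XObject already applies the form's own Matrix, only A would need to be set before `drawForm`. `AffineTransform.getMatrix(double[])` returns the six values in the same a b c d e f order that a PDF matrix uses, so they could be passed on to PDFBox's `Matrix` in place of the hand-tuned translation.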
