Adding a comment to a specific word or run in a docx document using Apache POI

Question

My goal is to search for a word or phrase in a Word .docx document and add a comment to it. I have been referring to the sample code found here, here, and here for adding comments using Apache POI. However, all three examples add comments to a whole paragraph (or even a whole table) rather than to a specific word or run.

I have tried creating an XML cursor at the run level, but I cannot cast the resulting object to the CTMarkupRange needed to set the start and end of the comment.

    // Create comment
    BigInteger cId = getCommentId(comments);
    ctComment = comments.addNewComment();
    ctComment.setAuthor("John Smith");
    ctComment.setInitials("JS");
    ctComment.setDate(new GregorianCalendar(Locale.getDefault()));
    ctComment.addNewP().addNewR().addNewT().setStringValue("Test Comment");
    ctComment.setId(cId);

    // Set CommentRangeStart
    String uri = CTMarkupRange.type.getName().getNamespaceURI();
    String localPart = "commentRangeStart";

    // XmlCursor cursor = p.getCTP().newCursor();
    XmlCursor cursor = r.getCTR().newCursor();
    cursor.toFirstChild();
    cursor.beginElement(localPart, uri);
    cursor.toParent();
    CTMarkupRange commentRangeStart = (CTMarkupRange) cursor.getObject(); // This line throws a ClassCastException
    cursor.dispose();

    commentRangeStart.setId(cId);

    // Set CommentRangeEnd and CommentReference
    p.getCTP().addNewCommentRangeEnd().setId(cId);
    // p.getCTP().addNewR().addNewCommentReference().setId(cId);
    r.getCTR().addNewCommentReference().setId(cId);

EDIT1: Snippet showing the logic for looping through the runs

    for (XWPFParagraph p : paragraphs) {
        List<XWPFRun> runs = p.getRuns();
        if (runs.size() > 0) {
            for (XWPFRun r : runs) {
                String text = r.getText(0);
                for (Map.Entry<String, List<String>> entry : rules.entrySet()) {
                    String key = entry.getKey();
                    List<String> value = entry.getValue();

                    for (int i = 0; i < value.size(); i++) {
                        if (text != null && regexContains(text, value.get(i))) {
                            // Create comment
                            BigInteger cId = getCommentId(comments);
                            ctComment = comments.addNewComment();
                            ctComment.setAuthor("John Smith");
                            ctComment.setInitials("JS");
                            ctComment.setDate(new GregorianCalendar(Locale.getDefault()));
                            ctComment.addNewP().addNewR().addNewT().setStringValue(key);
                            ctComment.setId(cId);

                            // New snippet from Axel Richter
                            p.getCTP().addNewCommentRangeStart().setId(cId);

                            p.getCTP().addNewCommentRangeEnd().setId(cId);
                            p.getCTP().addNewR().addNewCommentReference().setId(cId);
                        }
                    }
                }
            }
        }
    }

Answer 1

Score: 2


This is not as difficult as you might think.

To comment a run inside a paragraph, the comment range start must be placed in the paragraph before the text run, and the comment range end must be placed in the paragraph after the text run. This is exactly what my code examples already did. Of course, all paragraphs in those examples had only one text run.
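The sibling order described here can be illustrated without POI at all. The following is only a sketch using the JDK's DOM API and the standard WordprocessingML namespace; the class and method names are made up for illustration, and in real code the POI calls in the complete example further down create these elements.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of Apache POI.
class CommentRangeOrderSketch {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Creates an empty namespace-aware DOM document.
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a w:p whose single text run is enclosed by a comment range.
    static Element buildCommentedParagraph(Document doc, String text, String commentId) {
        Element p = doc.createElementNS(W_NS, "w:p");

        // 1. Range start goes before the run that is to be commented.
        Element start = doc.createElementNS(W_NS, "w:commentRangeStart");
        start.setAttributeNS(W_NS, "w:id", commentId);
        p.appendChild(start);

        // 2. The commented text run itself.
        Element run = doc.createElementNS(W_NS, "w:r");
        Element t = doc.createElementNS(W_NS, "w:t");
        t.setTextContent(text);
        run.appendChild(t);
        p.appendChild(run);

        // 3. Range end goes after that run.
        Element end = doc.createElementNS(W_NS, "w:commentRangeEnd");
        end.setAttributeNS(W_NS, "w:id", commentId);
        p.appendChild(end);

        // 4. A separate run holds the reference to the comment.
        Element refRun = doc.createElementNS(W_NS, "w:r");
        Element ref = doc.createElementNS(W_NS, "w:commentReference");
        ref.setAttributeNS(W_NS, "w:id", commentId);
        refRun.appendChild(ref);
        p.appendChild(refRun);

        return p;
    }

    // Lists the qualified names of the direct child elements, in document order.
    static List<String> childNames(Element parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element p = buildCommentedParagraph(newDocument(), "second", "1");
        System.out.println(childNames(p));
    }
}
```

What matters is that the range markers are siblings of the run inside the paragraph, not children of the run.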

In the following complete example, the second comment applies only to the word "second". To achieve this, the paragraph has three text runs: the first contains the text "Paragraph with the ", the second contains the text "second" and carries the comment, and the third contains the text " comment.".

    import java.io.*;
    import org.apache.poi.*;
    import org.apache.poi.ooxml.*;
    import org.apache.poi.openxml4j.opc.*;
    import org.apache.xmlbeans.*;
    import org.apache.poi.xwpf.usermodel.*;
    import static org.apache.poi.ooxml.POIXMLTypeLoader.DEFAULT_XML_OPTIONS;
    import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;
    import javax.xml.namespace.QName;
    import java.math.BigInteger;
    import java.util.GregorianCalendar;
    import java.util.Locale;

    public class CreateWordWithComments {

        //a method for creating the CommentsDocument /word/comments.xml in the *.docx ZIP archive
        private static MyXWPFCommentsDocument createCommentsDocument(XWPFDocument document) throws Exception {
            OPCPackage oPCPackage = document.getPackage();
            PackagePartName partName = PackagingURIHelper.createPartName("/word/comments.xml");
            PackagePart part = oPCPackage.createPart(partName, "application/vnd.openxmlformats-officedocument.wordprocessingml.comments+xml");
            MyXWPFCommentsDocument myXWPFCommentsDocument = new MyXWPFCommentsDocument(part);
            String rId = document.addRelation(null, XWPFRelation.COMMENT, myXWPFCommentsDocument).getRelationship().getId();
            return myXWPFCommentsDocument;
        }

        public static void main(String[] args) throws Exception {
            XWPFDocument document = new XWPFDocument();
            MyXWPFCommentsDocument myXWPFCommentsDocument = createCommentsDocument(document);
            CTComments comments = myXWPFCommentsDocument.getComments();
            CTComment ctComment;
            XWPFParagraph paragraph;
            XWPFRun run;

            //first comment
            BigInteger cId = BigInteger.ZERO;
            ctComment = comments.addNewComment();
            ctComment.setAuthor("Axel Ríchter");
            ctComment.setInitials("AR");
            ctComment.setDate(new GregorianCalendar(Locale.US));
            ctComment.addNewP().addNewR().addNewT().setStringValue("The first comment.");
            ctComment.setId(cId);

            paragraph = document.createParagraph();
            paragraph.getCTP().addNewCommentRangeStart().setId(cId); //comment range start is set before text run
            run = paragraph.createRun();
            run.setText("Paragraph with the first comment.");
            paragraph.getCTP().addNewCommentRangeEnd().setId(cId); //comment range end is set after text run
            paragraph.getCTP().addNewR().addNewCommentReference().setId(cId);

            //paragraph without comment
            paragraph = document.createParagraph();
            run = paragraph.createRun();
            run.setText("Paragraph without comment.");

            //second comment
            cId = cId.add(BigInteger.ONE);
            ctComment = comments.addNewComment();
            ctComment.setAuthor("Axel Ríchter");
            ctComment.setInitials("AR");
            ctComment.setDate(new GregorianCalendar(Locale.US));
            ctComment.addNewP().addNewR().addNewT().setStringValue("The second comment. Comments the word \"second\".");
            ctComment.setId(cId);

            paragraph = document.createParagraph();
            run = paragraph.createRun();
            run.setText("Paragraph with the ");
            paragraph.getCTP().addNewCommentRangeStart().setId(cId); //comment range start is set before text run
            run = paragraph.createRun();
            run.setText("second");
            paragraph.getCTP().addNewCommentRangeEnd().setId(cId); //comment range end is set after text run
            run = paragraph.createRun();
            run.setText(" comment.");
            paragraph.getCTP().addNewR().addNewCommentReference().setId(cId);

            //write document
            FileOutputStream out = new FileOutputStream("CreateWordWithComments.docx");
            document.write(out);
            out.close();
            document.close();
        }

        //a wrapper class for the CommentsDocument /word/comments.xml in the *.docx ZIP archive
        private static class MyXWPFCommentsDocument extends POIXMLDocumentPart {

            private CTComments comments;

            private MyXWPFCommentsDocument(PackagePart part) throws Exception {
                super(part);
                comments = CommentsDocument.Factory.newInstance().addNewComments();
            }

            private CTComments getComments() {
                return comments;
            }

            @Override
            protected void commit() throws IOException {
                XmlOptions xmlOptions = new XmlOptions(DEFAULT_XML_OPTIONS);
                xmlOptions.setSaveSyntheticDocumentElement(new QName(CTComments.type.getName().getNamespaceURI(), "comments"));
                PackagePart part = getPackagePart();
                OutputStream out = part.getOutputStream();
                comments.save(out, xmlOptions);
                out.close();
            }
        }
    }
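In an existing document the word to comment usually sits inside a longer run, so the three-run layout used above has to be derived from that run's text first. The following is a minimal sketch of that splitting step using only the JDK. The class and method names are hypothetical and not part of Apache POI, and a match that spans several runs is not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Apache POI.
class RunSplitSketch {

    // Splits the text of one run around the first match of the given regex.
    // Returns {textBefore, matchedText, textAfter}, or null if there is no match.
    static String[] splitAroundFirstMatch(String runText, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(runText);
        if (!matcher.find()) {
            return null;
        }
        return new String[] {
            runText.substring(0, matcher.start()), // goes into the run before the comment range
            matcher.group(),                       // goes into the commented run
            runText.substring(matcher.end())       // goes into the run after the comment range
        };
    }

    public static void main(String[] args) {
        String[] parts = splitAroundFirstMatch("Paragraph with the second comment.", "second");
        for (String part : parts) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Each of the three segments would then go into its own run, with the comment range start and end placed around the middle one, as in the second comment of the example above.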

huangapple
  • Published on September 17, 2020, 17:29:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63935140.html