Publishing to MediaWiki from MS Word
Question
I have inherited a project that intends to make an internal MediaWiki site the multimedia, search, and archiving repository for documents originally authored in Microsoft Word. Today, those Word documents are converted into PDF, and then stored on a file server. But they are not directly searchable, they have proven difficult to update, and the existing file server itself is difficult to integrate with other internal applications that might want to consume & expose the content in those PDFs. Plus this is 2023, not 2003. Enter MediaWiki.
I am sorting through the options that allow the content producers to stick with their existing content production process. They like Word. They are used to it, it's approved company software, and there is training on how to use it effectively. It also handles embedded multimedia.
Some use it on Windows, and some on macOS.
Where I'm stuck is how to bolt something onto the end of their process so that content can be authored in Word, but a WYSIWYG version also gets sent to the MediaWiki. This would need to work for both authoring new content (a new page) and editing existing content. Create/edit in Word --> publish to wiki. Lather/rinse/repeat.
The Word macros described on the MediaWiki page are simply about converting into MediaWiki markup (wikitext). Totally useful and a necessary part of the solution, but only half the problem. The converted doc still has to get published.
I tried using Open Office as a middle layer with Sun's Wiki Publisher. But: A) that involves two extra pieces of software to get approved & installed (OO + extension & a Java runtime); and B) I think the latest (2008!?) version of the extension is no longer compatible with the current versions of Open Office.
The MS add-in for MediaWiki sounds flaky, and from what I've read it only works on Windows installs.
The copy/paste method sounds like the least terrible idea, but it is still pretty terrible for documents that can be very media-rich. And looooooong.
Has anyone successfully implemented frictionless publishing from Word to a MediaWiki? I'd love to learn more.
Answer 1
Score: 1
Short version:
Word documents are saved in a certain folder on the company's server. A script copies them to the machine where the MediaWiki server runs, together with an XML file containing the metadata that describes the files. There, an import script that runs every minute (unless it's already running) picks up the XML and processes all the Word documents. It logs everything to Graylog and moves each document to a success or error folder when done. Conversion is done with Pandoc.
Around 20-50 documents are processed daily, and it started with a backlog of 200,000+ older documents.
For the long version, I think a chat would be better.
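To make the short version a bit more concrete, here is a minimal sketch of what one run of such an import script might look like. It is not the actual script: the folder layout (`success`/`error` subfolders inside a hypothetical inbox), the lock-file name, and the function names are all assumptions. The XML metadata handling and Graylog logging are left out because the answer does not describe their formats. The only external tool assumed is the `pandoc` CLI, which can read `docx` and write `mediawiki`.

```python
import os
import shutil
import subprocess
from pathlib import Path


def pandoc_to_wikitext(docx_path: Path) -> str:
    """Convert one .docx file to MediaWiki markup by calling the pandoc CLI."""
    result = subprocess.run(
        ["pandoc", "--from", "docx", "--to", "mediawiki", str(docx_path)],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def run_import(inbox: Path, out: Path, convert=pandoc_to_wikitext) -> dict:
    """Process every .docx in inbox once; skip the run if a lock file exists."""
    lock = inbox / ".import.lock"  # hypothetical lock-file name
    try:
        # O_EXCL makes creation fail if the lock file is already there,
        # which is how "unless it's already running" is approximated here.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return {"skipped": True, "ok": [], "failed": []}
    os.close(fd)

    summary = {"skipped": False, "ok": [], "failed": []}
    try:
        for sub in ("success", "error"):
            (inbox / sub).mkdir(exist_ok=True)
        out.mkdir(parents=True, exist_ok=True)
        for doc in sorted(inbox.glob("*.docx")):
            try:
                text = convert(doc)
                (out / (doc.stem + ".wiki")).write_text(text, encoding="utf-8")
                shutil.move(str(doc), str(inbox / "success" / doc.name))
                summary["ok"].append(doc.name)
            except Exception:
                # Any failure parks the document in the error folder for review.
                shutil.move(str(doc), str(inbox / "error" / doc.name))
                summary["failed"].append(doc.name)
    finally:
        lock.unlink()  # always release the lock, even after a crash mid-run
    return summary
```

A scheduler such as cron could call `run_import(Path("/srv/inbox"), Path("/srv/wikitext"))` once a minute (both paths are placeholders). If a previous run is still busy, the call returns immediately with `skipped` set to `True`.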
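For the "still has to get published" half of the question, one possible approach (not necessarily what the import script above does) is the MediaWiki Action API: fetch a CSRF token, then send `action=edit` with the page title and the converted wikitext. The sketch below assumes `session` is an already logged-in HTTP session object with `get` and `post` methods, such as a `requests.Session` from the third-party `requests` package. The function name `publish_page` and the default edit summary are made up for illustration, and login is omitted.

```python
def publish_page(session, api_url, title, wikitext, summary="Imported from Word"):
    """Create or overwrite one wiki page through the MediaWiki Action API."""
    # An edit needs a CSRF token, fetched with action=query&meta=tokens.
    token_reply = session.get(
        api_url,
        params={"action": "query", "meta": "tokens", "type": "csrf", "format": "json"},
    ).json()
    token = token_reply["query"]["tokens"]["csrftoken"]

    # action=edit replaces the whole page text, creating the page if needed.
    reply = session.post(
        api_url,
        data={
            "action": "edit",
            "title": title,
            "text": wikitext,
            "summary": summary,
            "format": "json",
            "token": token,
        },
    ).json()
    if "error" in reply:
        raise RuntimeError(reply["error"].get("info", "edit failed"))
    return reply["edit"]["result"]
```

Because the same call covers both new pages and edits to existing ones, re-running it after each Word revision gives the create/edit-then-publish loop described in the question. Images and other embedded media would need separate uploads, which this sketch does not attempt.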