Reading/Writing a MS Word file in PHP
Is it possible to read and write Word (2003 and 2007) files in PHP without using a COM object? I know that I can:
$file = fopen('c:file.doc', 'w+');
fwrite($file, $text);
fclose($file);
but Word will then read it as an HTML file, not as a native .doc file.
Recommended answer
Reading binary Word documents would mean writing a parser that follows the published file format specification for the DOC format. I don't think that is a realistically feasible solution.
You could use the Microsoft Office XML formats for reading and writing Word files; they are compatible with both the 2003 and 2007 versions of Word. For reading, you have to make sure the Word documents are saved in the correct format (in Word 2007 it is called "Word 2003 XML Document"). For writing, you just have to follow the openly available XML schema. I've never used this format to write Office documents from PHP, but I do use it to read an Excel worksheet (saved as "XML Spreadsheet 2003", naturally) and display its data on a web page. Since the files are plain XML, it's no problem to navigate them and work out how to extract the data you need.
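As a rough illustration of the approach above, here is a minimal sketch that writes and reads a Word 2003 XML document with PHP's bundled DOM extension. The helper names `buildWordMl` and `readWordMlParagraphs` are made up for this example, and it assumes the simplest document shape (paragraphs `w:p` containing runs `w:r` containing text `w:t`); files saved by Word itself carry far more markup, such as styles and document properties, which this sketch ignores.

```php
<?php
// Namespace URI of the Word 2003 XML (WordprocessingML 2003) vocabulary.
const WORDML_NS = 'http://schemas.microsoft.com/office/word/2003/wordml';

/**
 * Build a minimal Word 2003 XML document, one paragraph per input string.
 * Hypothetical helper for illustration, not part of any library.
 */
function buildWordMl(array $paragraphs): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    // Processing instruction that associates the .xml file with Word.
    $doc->appendChild(
        $doc->createProcessingInstruction('mso-application', 'progid="Word.Document"')
    );
    $root = $doc->appendChild($doc->createElementNS(WORDML_NS, 'w:wordDocument'));
    $body = $root->appendChild($doc->createElementNS(WORDML_NS, 'w:body'));

    foreach ($paragraphs as $text) {
        $p = $body->appendChild($doc->createElementNS(WORDML_NS, 'w:p'));
        $r = $p->appendChild($doc->createElementNS(WORDML_NS, 'w:r'));
        $t = $r->appendChild($doc->createElementNS(WORDML_NS, 'w:t'));
        // A text node lets DOM escape &, < and > for us.
        $t->appendChild($doc->createTextNode($text));
    }
    return $doc->saveXML();
}

/**
 * Return the plain text of every paragraph in a Word 2003 XML string.
 * Hypothetical helper; formatting, tables and images are not handled.
 */
function readWordMlParagraphs(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('w', WORDML_NS);

    $paragraphs = [];
    foreach ($xpath->query('//w:p') as $p) {
        $text = '';
        // A paragraph may be split into several runs; join their text.
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }
        $paragraphs[] = $text;
    }
    return $paragraphs;
}
```

The string returned by `buildWordMl` can be written to disk with `file_put_contents`, and a document saved from Word as Word 2003 XML can be passed to `readWordMlParagraphs` after loading it with `file_get_contents`.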
The other option, which works only with Word 2007 (unless support for the OpenXML file formats is installed in your Word 2003), would be to resort to OpenXML. As databyss pointed out, the DOCX file format is just a ZIP archive containing XML files. There are a lot of resources on MSDN about the OpenXML file format, so you should be able to figure out how to read the data you want. Writing will be much more complicated, I think; it just depends on how much time you are willing to invest.
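To make the "ZIP archive with XML files" point concrete, here is a minimal sketch that pulls the paragraph text out of a DOCX file. It assumes PHP's zip extension (`ZipArchive`) is available and that the main document part is stored at `word/document.xml`, which is where Word normally puts it. The function name `readDocxParagraphs` is made up for this example.

```php
<?php
// Main WordprocessingML namespace used inside word/document.xml.
const DOCX_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

/**
 * Return the plain text of every paragraph in a DOCX file.
 * Hypothetical helper; assumes the main part lives at word/document.xml
 * and ignores formatting, headers, footers, footnotes and images.
 */
function readDocxParagraphs(string $path): array
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a ZIP archive");
    }
    // Read the main document part straight out of the archive.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found in archive');
    }

    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('w', DOCX_NS);

    $paragraphs = [];
    foreach ($xpath->query('//w:p') as $p) {
        $text = '';
        // Word often splits one paragraph into many runs; join their text.
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }
        $paragraphs[] = $text;
    }
    return $paragraphs;
}
```

Reading is the easy direction. Producing a DOCX that Word accepts additionally requires the content-types and relationship parts of the package, which is part of why writing takes more effort.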
Perhaps you could have a look at PHPExcel, a library that can write and read Excel 2007 files using the OpenXML standard. That should give you an idea of the work involved in reading and writing OpenXML Word documents.