Full-text index returns no results from PDF files in a FILESTREAM table
I have a FILESTREAM table running on SQL Server 2012 on a Windows 8.1 x64 machine. It already has a few PDF and TXT files stored, so I decided to create a full-text index to search through these files using the following commands:
CREATE FULLTEXT CATALOG FileStreamFTSCatalog AS DEFAULT;
CREATE FULLTEXT INDEX ON storage
(FileName Language 1046, File TYPE COLUMN FileExtension Language 1046)
KEY INDEX PK__storage__3214EC077DADCE3C
ON FileStreamFTSCatalog
WITH CHANGE_TRACKING AUTO;
Then, after reading posts from other people with the same problem, I ran these commands:
EXEC sp_fulltext_service @action='load_os_resources', @value=1;
EXEC sp_fulltext_service 'verify_signature', 0;
EXEC sp_fulltext_service 'update_languages';
Exec sp_fulltext_service 'ft_timeout', 600000;
Exec sp_fulltext_service 'ism_size',@value=16;
EXEC sp_fulltext_service 'restart_all_fdhosts';
EXEC sp_help_fulltext_system_components 'filter';
reconfigure with override
I can see that the PDF IFilter is configured:
filter .pdf E8978DA6-047F-4E3D-9C78-CDBE46041603 C:\Program Files\Adobe\Adobe PDF iFilter 11 for 64-bit platforms\bin\PDFFilter.dll 11.0.1.36 Adobe Systems, Inc.
I can even run a query like this:
select * from storage
where contains(*, 'data')
but it returns only the indexed TXT files. So I'm wondering: is there anything else I need to do to start indexing my PDFs? Or do I have to create another table and reinsert all the PDFs I had already stored, even though the TXT files are being indexed just fine?
Update 1:
Opening SQLFTXXX.LOG, I get this message (for the FileTable):
2014-08-20 06:32:09.48 spid29s Warning: No appropriate filter was found during full-text index population for table or indexed view '[text_storage].[dbo].[storage_table]' (table or indexed view ID '355584405', database ID '7'), full-text key value '篰磧'. Some columns of the row were not indexed.
And this one (for the FILESTREAM table):
2014-08-19 22:14:50.58 spid20s Warning: No appropriate filter was found during full-text index population for table or indexed view '[text_storage].[dbo].[storage]' (table or indexed view ID '674101442', database ID '7'), full-text key value '1797'. Some columns of the row were not indexed.
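The "No appropriate filter was found" warning can be cross-checked from inside SQL Server. The following is only a sketch, assuming the database and table names shown in the log above (`text_storage`, `dbo.storage`) and the column name `FileExtension` from the index definition. It lists the filter the engine has registered for `.pdf`, shows the population counters for the table, and lists the stored extensions so they can be compared with the registered document type.

```sql
USE text_storage;  -- database name taken from the log message above

-- Which filter, if any, does the full-text engine have registered for .pdf?
SELECT document_type, path, version, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';

-- Population counters for the table: rows processed versus rows that failed.
SELECT
    OBJECTPROPERTYEX(OBJECT_ID('dbo.storage'), 'TableFulltextPopulateStatus') AS populate_status,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.storage'), 'TableFulltextDocsProcessed')  AS docs_processed,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.storage'), 'TableFulltextFailCount')      AS fail_count;

-- Extensions actually stored in the type column; the full-text engine
-- matches these against the document_type values listed above.
SELECT FileExtension, COUNT(*) AS row_count
FROM dbo.storage
GROUP BY FileExtension;
```

If the first query returns a row for `.pdf` but `fail_count` roughly matches the number of PDF rows, the filter is registered but is not being loaded or is failing during population, which is consistent with the log entries above.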
Recommended answer
I finally found a solution. After trying both the Adobe and Foxit IFilters and getting the same error message, I found another IFilter called "PDFlib". I downloaded it, followed its instructions to make it available to SQL Server, and rebuilt the index. Now my PDFs are indexed and searchable.
I believe the other IFilters would also work if I followed these same instructions for them. I'll try that once I'm done with my tests and update this answer with the results.
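The answer does not quote the vendor's instructions, so the following is only a sketch of what "make it available to SQL Server and rebuild the index" commonly looks like, not the PDFlib procedure itself. It reuses the catalog and table names from the question. Whether the SQL Server service also needs a restart before a newly installed filter is picked up is an assumption to verify on your system.

```sql
-- Allow SQL Server to load filters and word breakers registered with the OS.
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;

-- Optional: skip signature verification for filters that are not signed.
EXEC sp_fulltext_service @action = 'verify_signature', @value = 0;

-- Restart the filter daemon hosts so they reload the registered filters.
EXEC sp_fulltext_service @action = 'restart_all_fdhosts';

-- Confirm that a filter is now listed for .pdf.
SELECT document_type, path, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';

-- Re-index the existing rows so the PDFs already stored are processed again.
ALTER FULLTEXT CATALOG FileStreamFTSCatalog REBUILD;
```

Rebuilding the catalog repopulates every index in it, so the PDFs already stored do not need to be reinserted. To limit the work to one table, `ALTER FULLTEXT INDEX ON storage START FULL POPULATION;` is an alternative.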