NLP programming tools using PHP?
Since big web applications came into existence, searching for data (and doing it lightning fast and accurately) has been one of the most important problems in web applications. For a while, I've been working with Lucene.NET, a C# port of the Lucene project.
I also work in PHP using Zend Framework's Lucene API, which brings me to my question. Most of the time, good indexing requires some NLP tasks such as tokenizing, lemmatizing, and more. The question is:
Do you know of any good NLP programming framework or toolset for PHP?
PS: I'm well aware of the Zend API for Lucene, but indexing data properly is not just a matter of storing it in Lucene and relying on it; you need to perform some extra tasks, like those above.
Recommended answer
I would suggest that you look at Solr, which is a best-practice implementation of Lucene. Solr exposes a REST-style HTTP API and also has a very good PHP client. This lets you leverage the power of Lucene without doing any of the low-level programming yourself to get the NLP features you want. You would also probably want to grab the trunk version of Solr, as NLP development is very active right now and new capabilities are being added every day.
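To give a rough idea of what talking to Solr over HTTP from PHP can look like, here is a minimal sketch that only builds a request URL for Solr's standard `/select` handler. The helper name `buildSolrSelectUrl`, the host, the core name `articles`, and the field `title` are all assumptions made for illustration; they do not come from the answer, and a dedicated PHP client would normally wrap this for you.

```php
<?php
// Hypothetical helper: builds a URL for Solr's standard /select handler.
// The base URL and core name passed in below are placeholders.
function buildSolrSelectUrl(string $baseUrl, string $core, string $query, int $rows = 10): string
{
    $params = [
        'q'    => $query, // query string; Solr analyzes it on the server side
        'rows' => $rows,  // maximum number of documents to return
        'wt'   => 'json', // ask for a JSON response
    ];

    // http_build_query URL-encodes the parameter values.
    return rtrim($baseUrl, '/') . '/' . rawurlencode($core) . '/select?' . http_build_query($params);
}

// Assumed local Solr instance with a core named "articles".
$url = buildSolrSelectUrl('http://localhost:8983/solr', 'articles', 'title:lucene');
echo $url, PHP_EOL;

// With a Solr instance running, the response could be fetched and decoded like this:
// $response = json_decode(file_get_contents($url), true);
// $docs = $response['response']['docs'] ?? [];
```

The point of the sketch is that tokenizing, stemming, and similar analysis are configured on the Solr side (per field type in the schema), so the PHP code only has to send queries and documents over HTTP.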