Creating a Lucene index in a database with JdbcDirectory

2022-01-15 00:00:00 postgresql lucene mysql java

I have a problem with MySQL and PostgreSQL 9.2. This is the error (the French message below means "Large Objects may not be used in auto-commit mode"):

org.apache.lucene.store.jdbc.JdbcStoreException: Failed to execute sql [insert into LUCENE_INDEX_TABLE (name_, value_, size_, lf_, deleted_) values ( ?, ?, ?, current_timestamp, ? )]; nested exception is org.postgresql.util.PSQLException: Les Large Objects ne devraient pas être utilisés en mode auto-commit.
org.postgresql.util.PSQLException: Les Large Objects ne devraient pas être utilisés en mode auto-commit.
at org.postgresql.largeobject.LargeObjectManager.createLO(LargeObjectManager.java:239)
at org.postgresql.largeobject.LargeObjectManager.createLO(LargeObjectManager.java:226)
at org.postgresql.jdbc2.AbstractJdbc2Statement.setBlob(AbstractJdbc2Statement.java:3048)
at org.apache.lucene.store.jdbc.index.AbstractJdbcIndexOutput$1.fillPrepareStatement(AbstractJdbcIndexOutput.java:55)
at org.apache.lucene.store.jdbc.support.JdbcTemplate.executeUpdate(JdbcTemplate.java:174)
at org.apache.lucene.store.jdbc.index.AbstractJdbcIndexOutput.close(AbstractJdbcIndexOutput.java:47)
at org.apache.lucene.store.jdbc.index.RAMAndFileJdbcIndexOutput.close(RAMAndFileJdbcIndexOutput.java:81)
at org.apache.lucene.util.IOUtils.close(IOUtils.java:141)
at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:139)
at org.apache.lucene.index.StoredFieldsWriter.flush(StoredFieldsWriter.java:55)
at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:59)
at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:581)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3587)
at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3376)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3485)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3467)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3451)
at test.lucene.chaima.JDBCIndexer.addIndex(JDBCIndexer.java:137)
at test.lucene.chaima.JDBCIndexer.index(JDBCIndexer.java:92)
at test.lucene.chaima.JDBCIndexer.createAndBuildIndex(JDBCIndexer.java:78)
at test.lucene.chaima.JDBCIndexer.buildIndex(JDBCIndexer.java:69)
at test.lucene.chaima.JDBCIndexer.main(JDBCIndexer.java:172)
org.apache.lucene.store.jdbc.JdbcStoreException: Failed to execute sql [insert into LUCENE_INDEX_TABLE (name_, value_, size_, lf_, deleted_) values ( ?, ?, ?, current_timestamp, ? )]; nested exception is org.postgresql.util.PSQLException: Les Large Objects ne devraient pas être utilisés en mode auto-commit.
org.postgresql.util.PSQLException: Les Large Objects ne devraient pas être utilisés en mode auto-commit.
at org.postgresql.largeobject.LargeObjectManager.createLO(LargeObjectManager.java:239)
at org.postgresql.largeobject.LargeObjectManager.createLO(LargeObjectManager.java:226)
at org.postgresql.jdbc2.AbstractJdbc2Statement.setBlob(AbstractJdbc2Statement.java:3048)
at org.apache.lucene.store.jdbc.index.AbstractJdbcIndexOutput$1.fillPrepareStatement(AbstractJdbcIndexOutput.java:55)
at org.apache.lucene.store.jdbc.support.JdbcTemplate.executeUpdate(JdbcTemplate.java:174)
at org.apache.lucene.store.jdbc.index.AbstractJdbcIndexOutput.close(AbstractJdbcIndexOutput.java:47)
at org.apache.lucene.store.jdbc.index.RAMAndFileJdbcIndexOutput.close(RAMAndFileJdbcIndexOutput.java:81)
at org.apache.lucene.store.ChecksumIndexOutput.close(ChecksumIndexOutput.java:61)
at org.apache.lucene.index.SegmentInfos.finishCommit(SegmentInfos.java:863)
at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3501)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3490)
at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1873)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1812)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1776)
at test.lucene.chaima.JDBCIndexer.index(JDBCIndexer.java:102)
at test.lucene.chaima.JDBCIndexer.createAndBuildIndex(JDBCIndexer.java:78)
at test.lucene.chaima.JDBCIndexer.buildIndex(JDBCIndexer.java:69)
at test.lucene.chaima.JDBCIndexer.main(JDBCIndexer.java:172)
org.apache.lucene.index.IndexNotFoundException: no segments* file found in test.lucene.chaima.MyJDBCDirectory@9506dc4 lockFactory=null: files: [write.lock]
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:667)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:72)
at org.apache.lucene.index.IndexReader.open(IndexReader.java:256)
at test.lucene.chaima.JDBCSearcher.search(JDBCSearcher.java:56)
at test.lucene.chaima.JDBCIndexer.buildIndex(JDBCIndexer.java:70)
at test.lucene.chaima.JDBCIndexer.main(JDBCIndexer.java:172)

I have found many solutions, but none of them solved my problem. If anyone can help, please share the solution; thanks. Here is the source code of my application. It has three classes:

MyJDBCDirectory.java

package test.lucene.chaima;

import java.io.IOException;

import javax.sql.DataSource;

import org.apache.lucene.store.jdbc.JdbcDirectory;
import org.apache.lucene.store.jdbc.JdbcDirectorySettings;
import org.apache.lucene.store.jdbc.JdbcStoreException;
import org.apache.lucene.store.jdbc.dialect.Dialect;
import org.apache.lucene.store.jdbc.support.JdbcTable;

/**
 * The Class MyJDBCDirectory.
 *
 * @author prabhat.jha
 */
public class MyJDBCDirectory extends JdbcDirectory {

    /**
     * Instantiates a new my jdbc directory.
     *
     * @param dataSource
     *            the data source
     * @param dialect
     *            the dialect
     * @param settings
     *            the settings
     * @param tableName
     *            the table name
     */
    public MyJDBCDirectory(DataSource dataSource, Dialect dialect, JdbcDirectorySettings settings, String tableName) {
        super(dataSource, dialect, settings, tableName);
    }

    /**
     * Instantiates a new my jdbc directory.
     *
     * @param dataSource the data source
     * @param dialect the dialect
     * @param tableName the table name
     */
    public MyJDBCDirectory(DataSource dataSource, Dialect dialect, String tableName) {
        super(dataSource, dialect, tableName);
    }

    /**
     * Instantiates a new my jdbc directory.
     *
     * @param dataSource the data source
     * @param settings the settings
     * @param tableName the table name
     * @throws JdbcStoreException the jdbc store exception
     */
    public MyJDBCDirectory(DataSource dataSource, JdbcDirectorySettings settings, String tableName) throws JdbcStoreException {
        super(dataSource, settings, tableName);
    }

    /**
     * Instantiates a new my jdbc directory.
     *
     * @param dataSource the data source
     * @param table the table
     */
    public MyJDBCDirectory(DataSource dataSource, JdbcTable table) {
        super(dataSource, table);
    }

    /**
     * Instantiates a new my jdbc directory.
     *
     * @param dataSource the data source
     * @param tableName the table name
     * @throws JdbcStoreException the jdbc store exception
     */
    public MyJDBCDirectory(DataSource dataSource, String tableName) throws JdbcStoreException {
        super(dataSource, tableName);
    }

    /**
     * (non-Javadoc).
     *
     * @return the string[]
     * @throws IOException Signals that an I/O exception has occurred.
     * @see org.apache.lucene.store.Directory#listAll()
     */
    @Override
    public String[] listAll() throws IOException {
        return super.list();
    }

}

JDBCDatabaseUtil.java

package test.lucene.chaima;

import java.sql.Connection;
import java.sql.SQLException;

import javax.sql.DataSource;

import org.postgresql.ds.PGPoolingDataSource;
import org.postgresql.ds.PGSimpleDataSource;


//import com.mysql.jdbc.jdbc2.optional.MysqlDataSource;


/**
 * The Class JDBCDatabaseUtil.
 * @author prabhat.jha
 */
public class JDBCDatabaseUtil {
    /**
     * Gets the data source.
     *
     * @return the data source
     */
    public static DataSource getDataSource() {
        PGSimpleDataSource dataSource = new PGSimpleDataSource();
        dataSource.setUser("postgres"); 
        dataSource.setPassword("root"); 
        dataSource.setDatabaseName("postgres");
        /*MysqlDataSource dataSource = new MysqlDataSource();
        dataSource.setUser("root");
        dataSource.setPassword("root");
        dataSource.setEmulateLocators(true);
        dataSource.setUrl("jdbc:mysql://localhost:3306/lucene?emulateLocators=true&useUnicode=true&characterEncoding=UTF-8&useFastDateParsing=false");
        */
        return dataSource;
    }

    /**
     * Gets the connection.
     *
     * @return the connection
     * @throws SQLException
     *             the sQL exception
     */
    public static Connection getConnection() throws SQLException {
        //getDataSource().getConnection().setAutoCommit(false);
        return getDataSource().getConnection();
    }
}
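Note the commented-out line in `getConnection()`: it would set `autoCommit(false)` on one connection and then return a *different*, freshly opened one, so the setting never takes effect. The PostgreSQL error above ("Large Objects may not be used in auto-commit mode") points at exactly this. A minimal sketch of a helper that disables auto-commit on the very connection it hands out (the class name is my own, not from Compass; Compass also ships an `org.apache.lucene.store.jdbc.datasource.TransactionAwareDataSourceProxy` intended for wrapping the `DataSource` given to `JdbcDirectory` — check the documentation of your version):

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Hypothetical helper: configure the same connection we return. */
public final class NoAutoCommit {
    private NoAutoCommit() {
    }

    /** Open a connection and disable auto-commit on that same connection. */
    public static Connection open(DataSource dataSource) throws SQLException {
        Connection connection = dataSource.getConnection(); // one connection...
        connection.setAutoCommit(false);                    // ...configured before use
        return connection;                                  // caller commits or rolls back
    }
}
```

The caller then becomes responsible for `commit()`/`rollback()` and `close()`; with auto-commit off, the Large Object writes issued by `JdbcDirectory` happen inside a transaction, which is what the PostgreSQL driver requires.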

JDBCIndexer.java

package test.lucene.chaima;

import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.store.jdbc.JdbcDirectory;
import org.apache.lucene.store.jdbc.dialect.PostgreSQLDialect;
import org.apache.lucene.util.Version;

/**
 * The Class JDBCIndexer.
 *
 * @author chaima
 */
public class JDBCIndexer {

    /** The jdbc directory. */
    private Directory   jdbcDirectory   = null;

    /**
     * Instantiates a new jDBC indexer.
     *
     * @param jdbcDirectory
     *            the jdbc directory
     */
    public JDBCIndexer(Directory jdbcDirectory) {
        super();
        this.jdbcDirectory = jdbcDirectory;
    }
    /**
     * Gets the jdbc directory.
     *
     * @return the jdbc directory
     */
    public Directory getJdbcDirectory() {
        if (jdbcDirectory == null) {
            throw new IllegalStateException("Index not yet build, rerun indexing");
        }
        return jdbcDirectory;
    }

    /**
     * Sets the jdbc directory.
     *
     * @param jdbcDirectory
     *            the new jdbc directory
     */
    public void setJdbcDirectory(Directory jdbcDirectory) {
        this.jdbcDirectory = jdbcDirectory;
    }

    /**
     * Builds the index.
     */
    public void buildIndex() {
        createAndBuildIndex();
    }

    /**
     * Creates the and build index.
     */
    private void createAndBuildIndex() {
        createIndexTable();
        index();
    }

    /**
     * Index.
     */
    private void index() {
        Analyzer analyzer = new SimpleAnalyzer(Version.LUCENE_36);
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
        IndexWriter indexWriter = null;
        try {
            indexWriter = new IndexWriter(getJdbcDirectory(), analyzer,true, IndexWriter.MaxFieldLength.UNLIMITED);
            Boolean locked=indexWriter.isLocked(jdbcDirectory);
            addIndex(indexWriter);
        } catch (CorruptIndexException e) {
            e.printStackTrace();
        } catch (LockObtainFailedException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (indexWriter != null) {
                try {
                    indexWriter.close();
                } catch (CorruptIndexException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                } finally {
                    indexWriter = null;
                }
            }
        }
    }

    /**
     * Adds one document per database row to the index.
     *
     * @param indexWriter
     *            the index writer
     */
    private void addIndex(IndexWriter indexWriter) throws CorruptIndexException, IOException {
        try {
            Connection connection = JDBCDatabaseUtil.getConnection();
            connection.setAutoCommit(false);
            String query = "SELECT id, name, lastname FROM users";
            PreparedStatement pstmt = connection.prepareStatement(query);
            ResultSet resultSet = pstmt.executeQuery();
            while (resultSet.next()) {
                Document document = new Document();
                document.add(new Field("id", String.valueOf(resultSet.getInt(1)), Field.Store.YES, Field.Index.ANALYZED));
                document.add(new Field("name", String.valueOf(resultSet.getString(2)), Field.Store.YES, Field.Index.ANALYZED));
                document.add(new Field("lastname", String.valueOf(resultSet.getString(3)), Field.Store.YES, Field.Index.ANALYZED));

                indexWriter.addDocument(document);
                indexWriter.commit();
            }
            indexWriter.close();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    /**
     * Creates the index table.
     */
    private void createIndexTable() {
        if (this.jdbcDirectory == null) {
            setJdbcDirectory(new MyJDBCDirectory(JDBCDatabaseUtil.getDataSource(), new PostgreSQLDialect(), "LUCENE_INDEX_TABLE"));
        }
        try {
            /**
             * No need to manually create index table, create method will
             * automatically create it.
             */
            boolean exists = ((JdbcDirectory) getJdbcDirectory()).tableExists();
            if (exists) {
                System.out.println("table exists");
            } else {
                System.out.println("table does not exist");
                ((JdbcDirectory) getJdbcDirectory()).create();
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new JDBCIndexer(null).buildIndex();
    }
}

And the users table:

CREATE TABLE users
(
  id integer NOT NULL,
  name character(20),
  lastname character(20),
  CONSTRAINT pk_id PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE users
  OWNER TO postgres;

And the jars:

commons-logging-1.0.4.jar
compass-2.2.0.jar
lucene-core-3.6.1.jar
postgresql-9.2-1002.jdbc4.jar

Accepted answer

I have created my own JdbcDirectory implementation, though it relies on Java EE 6 to take advantage of the @Singleton annotation. The code is not small enough to paste into a StackOverflow post, and it still has a few limitations. The key one: because of database locking semantics, you cannot perform multiple operations in a single transaction from multiple threads.

https://github.com/trajano/doxdb/tree/jdbc-directory-example/doxdb-ejb/src/main/java/net/trajano/doxdb/search/lucene

Looking at your implementation, it seems you also keep the deleted "files", probably because that causes less fragmentation in the database store; in mine I remove the record itself.

I have tagged a version that I am working with, and it seems stable enough for my test loads. Feel free to comment or make suggestions.
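The single-transaction, single-thread limitation can be illustrated with a minimal, self-contained analogue (my own simplification, not the linked doxdb code): bind one connection, with auto-commit off, to each thread. All operations on that thread then join the same transaction, and a second thread necessarily opens a second connection, i.e. a second transaction.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Illustrative sketch: one connection (one transaction) per thread. */
public final class ThreadBoundConnections {
    private final DataSource dataSource;
    private final ThreadLocal<Connection> bound = new ThreadLocal<>();

    public ThreadBoundConnections(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Returns this thread's connection, opening one (auto-commit off) on first use. */
    public Connection connection() throws SQLException {
        Connection c = bound.get();
        if (c == null) {
            c = dataSource.getConnection();
            c.setAutoCommit(false); // every statement on this thread joins one transaction
            bound.set(c);
        }
        return c;
    }

    /** Commits, closes, and unbinds this thread's connection, ending its transaction. */
    public void commit() throws SQLException {
        Connection c = bound.get();
        if (c != null) {
            c.commit();
            c.close();
            bound.remove();
        }
    }
}
```

Because the transaction lives in a `ThreadLocal`, there is no way for two threads to share it, which is the same restriction described above.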
