Java: search for a string and read the following lines (until an empty line)

2022-07-19 00:00:00 search java bufferedreader

The file I am searching looks like this:

            keyword: 
            ====================
            category1:
            ----------
            St2
            Dpe
            Tmot:
            Bnw
            category2:
            ----------
            Rer
            Loo


            keyword2:
            ====================
            .
            .
            .

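What I want to do:

  1. Find the line that contains the keyword (with ":" appended)
  2. Read all of the following lines into a list
  3. Stop when a line is empty

In my example, I call my search function with "keyword", and it should add everything from the "=" line through "Loo" to the list.

I already have a bad solution, but it goes haywire if the keyword is not actually in the text file: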

BufferedReader b = null;

try {
    b = new BufferedReader(new FileReader(txtfile));
} catch (FileNotFoundException e) {
    e.printStackTrace();
}

// search for the keyword and save the corresponding text block in a list
while ((readLine = b.readLine()) != null) 
{
    if(readLine.contains(keyword) && !(readLine.contains("_"+keyword)))
    {
        System.out.println("keyword is: " + readLine);
        while ((readLine = b.readLine()) != null) 
        {
            if(readLine.trim().isEmpty()) // stop at empty line
            {
                break;
            } else {
                arr.add(readLine); // add element to list
            }
        }
    }
}

The !(readLine.contains("_"+keyword)) condition is there because the keyword sometimes also appears as "FUN_keyword:", and I only want to stop at the "keyword:" line.

Question: how can I rewrite this function so that it still works correctly (adding nothing to the list) when the keyword is not in the file?


Solution

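It is not entirely clear what you want to achieve, but I assume you want a result like this:

myKeyword->myValueAssociatedToMyKeyWord

In my opinion, you should break the task into small pieces (functions): reading the file, parsing the blocks, and finding a keyword. You should also define what a keyword is (for example, a line that ends with ':' and is followed by a line containing at least six '=' characters). Do not forget to skip all lines that are of no interest in the result (such as the '-' separator lines).

Here is my result: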

package com.example;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class Example3 {

    public static void main(String[] args) throws IOException {

        // Read file
        final List<String> lines = readWholeFile("text.txt");
        System.out.println("Read " + lines.size() + " lines.");

        // Extract block with keyword
        final Map<String, List<String>> result = mapBlockToKeyword(lines);

        // Print out the result
        for (Map.Entry<String, List<String>> entry : result.entrySet()) {
            String keyword = entry.getKey();
            entry.getValue().forEach(w -> System.out.println(keyword + " -> " + w));
        }

    }

    private static Map<String, List<String>> mapBlockToKeyword(final List<String> lines) {
        final Map<String, List<String>> result = new HashMap<>();
        String lastKeyword = "<undefined>";
        for (int i = 0; i < lines.size(); i++) {
            final String line = lines.get(i);

            // Is it a keyword?
            if(isKeyword(line, lines, i)){
                lastKeyword = line;
                if(result.get(lastKeyword) == null){
                    result.put(lastKeyword, new ArrayList<String>());
                }
                continue;
            } 

            // Is it a line we don't want to put in our result?
            if (  lineHasAtLeastNTimesConsequtiveSameChar(line, 6, '=') || //
                    lineHasAtLeastNTimesConsequtiveSameChar(line, 6, '-') || //
                    line.trim().isEmpty()) {
                    // We don't want '======' to be associated with a keyword,
                    // so skip it.
                    continue;
            }

            // Is it a value to add to keyword ?
            if (result.get(lastKeyword) != null) {
                result.get(lastKeyword).add(line);
            } else {
                System.err.println("Tried to associate a value with a non-existent keyword.");
            }
        }
        return result;
    }

    private static boolean isKeyword(final String currentLine, final List<String> lines, final int idxLine){
        // To be a keyword, a line must end with ':' and be followed by a '=' separator line.
        final boolean hasNextLine = idxLine < lines.size() - 1;
        return hasNextLine
                && stringEndsWithChar(currentLine, ':')
                && lineHasAtLeastNTimesConsequtiveSameChar(lines.get(idxLine + 1), 6, '=');
    }

    private static List<String> readWholeFile(final String path) {

        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line = null;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (FileNotFoundException e) {
            // Would be better in a logger
            System.err.println("Cannot find the file: " + e.getMessage());
            e.printStackTrace();
        } catch (IOException e) {
            // Would be better in a logger
            System.err.println("Cannot read the file: " + e.getMessage());
        }
        return lines;
    }

    private static boolean stringEndsWithChar(String line, char c) {
        if (line != null && line.length() > 1) {
            char lastLineChar = line.charAt(line.length() - 1);
            return lastLineChar == c;
        }
        return false;
    }

    private static boolean lineHasAtLeastNTimesConsequtiveSameChar(final String line, int nTimes, char c) {
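        if (line != null && line.length() >= nTimes) {
            // Quote the character so that regex metacharacters (e.g. '.' or '*') match literally.
            Pattern pattern = Pattern.compile(Pattern.quote(String.valueOf(c)) + "{" + nTimes + ",}");
            return pattern.matcher(line).find();
        }
        return false;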
    }

}

Result:

> Read 19 lines. 
> keyword2: -> . 
> keyword2: -> . 
> keyword2: -> . 
> keyword: -> category1: 
> keyword: -> St2 
> keyword: -> Dpe 
> keyword: -> Tmot: 
> keyword: -> Bnw 
> keyword: -> category2: 
> keyword: -> Rer 
> keyword: -> Loo

I hope this helps.

With this map, you can easily print the data in whatever format you need.
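The map also gives one way to handle the case from the question where the keyword is not in the file: look the keyword up and fall back to an empty list. The sketch below is only an illustration. `KeywordLookup` and `blockFor` are hypothetical names that are not part of the code above, and the hand-built map stands in for the one returned by `mapBlockToKeyword`. It assumes the map keys are the keyword lines exactly as read, including the trailing ':' and without leading whitespace.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class KeywordLookup {

    // Hypothetical helper: returns the block stored for `keyword`, or an
    // empty list when the keyword never appeared in the file.
    static List<String> blockFor(Map<String, List<String>> blocks, String keyword) {
        // Assumption: keys keep the trailing ':' of the keyword line.
        return blocks.getOrDefault(keyword + ":", Collections.emptyList());
    }

    public static void main(String[] args) {
        // Stand-in for the map returned by mapBlockToKeyword(lines).
        Map<String, List<String>> blocks = new HashMap<>();
        blocks.put("keyword:", Arrays.asList("category1:", "St2", "Dpe"));

        System.out.println(blockFor(blocks, "keyword")); // prints [category1:, St2, Dpe]
        System.out.println(blockFor(blocks, "missing")); // prints []
    }
}
```

Because a missing key simply yields an empty list, the caller needs no special handling for keywords that do not occur.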

Still to be done:

  • Unit tests (important!)
  • Printing in the correct format
  • Adapting the code to your goal
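For the unit-test item, one option is to have the extraction read from a BufferedReader, so that a test can feed it a StringReader instead of a real file. The sketch below is not the map-based approach above. It follows the three steps from the question directly, and `BlockReader` and `readBlock` are hypothetical names. It assumes the keyword line, once trimmed, is exactly the keyword followed by ':', which also rules out "FUN_keyword:" without a separate check.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class BlockReader {

    // Hypothetical helper: collects the lines after the "<keyword>:" line up to
    // the next blank line. Returns an empty list if no such line exists.
    static List<String> readBlock(BufferedReader reader, String keyword) {
        List<String> block = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // Exact match after trimming, so "FUN_keyword:" is not taken for "keyword:".
                if (line.trim().equals(keyword + ":")) {
                    while ((line = reader.readLine()) != null && !line.trim().isEmpty()) {
                        block.add(line.trim());
                    }
                    break; // only the first matching block is collected
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return block;
    }

    public static void main(String[] args) {
        String text = "FUN_keyword:\n====\nignored\n\nkeyword:\n====\nSt2\nDpe\n\nkeyword2:\n====\n.\n";

        System.out.println(readBlock(new BufferedReader(new StringReader(text)), "keyword")); // prints [====, St2, Dpe]
        System.out.println(readBlock(new BufferedReader(new StringReader(text)), "missing")); // prints []
    }
}
```

The two calls in `main` correspond to the two cases worth covering first in a real test: a keyword that is present, and one that is absent and therefore leaves the list empty.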
