Java: search for a string and get the following lines (until an empty line)
The file I am searching looks like this:
keyword:
====================
category1:
----------
St2
Dpe
Tmot:
Bnw
category2:
----------
Rer
Loo
keyword2:
====================
.
.
.
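What I want to do:
- Search for the line that contains the keyword (with ":" appended)
- Read all the following lines into a list
- Stop at the first empty line
In my example, I call my search function with "keyword", and it should add everything from the "=" line down to "Loo" to the list.
I already have a crude solution, but it goes crazy if the searched keyword is not actually in the text file: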
    BufferedReader b = null;
    try {
        b = new BufferedReader(new FileReader(txtfile));
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
    // search for the keyword and save the corresponding text block in a list
    while ((readLine = b.readLine()) != null)
    {
        if (readLine.contains(keyword) && !(readLine.contains("_" + keyword)))
        {
            System.out.println("keyword is: " + readLine);
            while ((readLine = b.readLine()) != null)
            {
                if (readLine.trim().isEmpty()) // stop at empty line
                {
                    break;
                } else {
                    arr.add(readLine); // add element to list
                }
            }
        }
    }
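The !(readLine.contains("_" + keyword)) check is there because the keyword sometimes also appears as "FUN_keyword:", and I only want to stop at the "keyword:" line.
Question: how can I rewrite this function so that it still works correctly (adding nothing to the list) if the keyword is not in the file?
Solution
What you want to achieve is not very clear, but I assume you want a result like this:
myKeyword->myValueAssociatedToMyKeyWord
In my opinion, you should break the task into small pieces (functions): reading the file, parsing the blocks, finding a keyword. You should also define what a keyword is (for example, a line that ends with ':' and is followed by a line containing at least six '=' characters). Don't forget to skip all lines that are of no interest in the result (such as the '-' separator lines).
Here is my result: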
    package com.example;

    import java.io.BufferedReader;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.regex.Pattern;

    public class Example3 {

        public static void main(String[] args) {
            // Read file
            final List<String> lines = readWholeFile("text.txt");
            System.out.println("Read " + lines.size() + " lines.");
            // Extract block with keyword
            final Map<String, List<String>> result = mapBlockToKeyword(lines);
            // Print out the result
            for (Map.Entry<String, List<String>> entry : result.entrySet()) {
                String keyword = entry.getKey();
                entry.getValue().forEach(w -> System.out.println(keyword + " -> " + w));
            }
        }

        private static Map<String, List<String>> mapBlockToKeyword(final List<String> lines) {
            final Map<String, List<String>> result = new HashMap<>();
            String lastKeyword = "<undefined>";
            for (int i = 0; i < lines.size(); i++) {
                final String line = lines.get(i);
                // Is it a keyword?
                if (isKeyword(line, lines, i)) {
                    lastKeyword = line;
                    result.putIfAbsent(lastKeyword, new ArrayList<>());
                    continue;
                }
                // Is it a line we don't want to put in our result?
                if (lineHasAtLeastNTimesConsecutiveSameChar(line, 6, '=')
                        || lineHasAtLeastNTimesConsecutiveSameChar(line, 6, '-')
                        || line.trim().isEmpty()) {
                    // We don't want '======' to be associated with a keyword, skip it.
                    continue;
                }
                // Is it a value to add to the current keyword?
                if (result.get(lastKeyword) != null) {
                    result.get(lastKeyword).add(line);
                } else {
                    System.err.println("Tried to associate a value with a non-existent keyword.");
                }
            }
            return result;
        }

        private static boolean isKeyword(final String currentLine, final List<String> lines, final int idxLine) {
            // To be a keyword, the line has to end with ':' and be followed by a '======' line
            final boolean hasNextLine = idxLine < lines.size() - 1;
            if (!hasNextLine) {
                return false;
            }
            final String nextLine = lines.get(idxLine + 1);
            return stringEndsWithChar(currentLine, ':')
                    && lineHasAtLeastNTimesConsecutiveSameChar(nextLine, 6, '=');
        }

        private static List<String> readWholeFile(final String path) {
            List<String> lines = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            } catch (FileNotFoundException e) {
                // Would be better in a logger
                System.err.println("Cannot find the file: " + e.getMessage());
            } catch (IOException e) {
                // Would be better in a logger
                System.err.println("Cannot read the file: " + e.getMessage());
            }
            return lines;
        }

        private static boolean stringEndsWithChar(String line, char c) {
            if (line != null && !line.isEmpty()) {
                return line.charAt(line.length() - 1) == c;
            }
            return false;
        }

        private static boolean lineHasAtLeastNTimesConsecutiveSameChar(final String line, int nTimes, char c) {
            if (line != null && line.length() >= nTimes) {
                // Quote the character so that regex metacharacters are matched literally
                Pattern pattern = Pattern.compile(Pattern.quote(String.valueOf(c)) + "{" + nTimes + ",}");
                return pattern.matcher(line).find();
            }
            return false;
        }
    }
Result:
> Read 19 lines.
> keyword2: -> .
> keyword2: -> .
> keyword2: -> .
> keyword: -> category1:
> keyword: -> St2
> keyword: -> Dpe
> keyword: -> Tmot:
> keyword: -> Bnw
> keyword: -> category2:
> keyword: -> Rer
> keyword: -> Loo
I hope this helps.
With this map, you can easily print the data in whatever format you need.
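As a rough illustration of using such a map, the sketch below looks up a single keyword and falls back to an empty list when the keyword is missing, which is the case the question worries about. The class name BlockLookup, the helpers blockFor and format, and the comma-separated output format are assumptions made for this example; they are not part of the code above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BlockLookup {

    // Returns the block stored under "<keyword>:" or an empty list if the
    // keyword is not present in the map (hypothetical helper).
    static List<String> blockFor(Map<String, List<String>> blocks, String keyword) {
        return blocks.getOrDefault(keyword + ":", Collections.emptyList());
    }

    // Formats a block as "<keyword>: a, b, c". The format is only an example.
    static String format(String keyword, List<String> block) {
        return keyword + ": " + String.join(", ", block);
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the map returned by mapBlockToKeyword
        Map<String, List<String>> blocks = new LinkedHashMap<>();
        blocks.put("keyword:", Arrays.asList("category1:", "St2", "Dpe"));

        System.out.println(format("keyword", blockFor(blocks, "keyword")));
        // A missing keyword yields an empty list instead of an error
        System.out.println(blockFor(blocks, "missing").isEmpty());
    }
}
```

Because the lookup never returns null, the calling code can iterate over the result without a special case for keywords that are not in the file.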
Still to do:
- Unit tests (important!)
- Print in the proper format
- Adapt the code to your goal
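If the goal is only what the question describes (collect the lines after "keyword:" until the first blank line, and collect nothing when the keyword is absent), one possible adaptation is sketched below. It is a minimal sketch rather than a definitive implementation: the class name KeywordBlockReader and the method readBlock are made up for this example, and it assumes the keyword line consists of exactly the keyword followed by ":".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeywordBlockReader {

    // Collects the lines that follow the line "<keyword>:" up to the first
    // blank line (or the end of input). Returns an empty list if no line matches.
    static List<String> readBlock(List<String> lines, String keyword) {
        List<String> block = new ArrayList<>();
        String header = keyword + ":";
        boolean inBlock = false;
        for (String line : lines) {
            if (!inBlock) {
                // Exact match, so "FUN_keyword:" is not mistaken for "keyword:"
                inBlock = line.trim().equals(header);
            } else if (line.trim().isEmpty()) {
                break; // a blank line ends the block
            } else {
                block.add(line);
            }
        }
        return block;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "FUN_keyword:", "ignored", "",
                "keyword:", "====================", "category1:", "St2", "",
                "keyword2:", "====================", ".");
        System.out.println(readBlock(lines, "keyword")); // [====================, category1:, St2]
        System.out.println(readBlock(lines, "missing")); // []
    }
}
```

The list of lines could come from readWholeFile above or from java.nio.file.Files.readAllLines. Working on an in-memory list also makes the function easy to cover with the unit tests mentioned in the list.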