How do I take a compressed file (indexed by word position) and recreate the original file? (Java)

2022-01-08 00:00:00 compression hashtable hashmap java

Background

I have been developing some code that first reads a string and creates a file, then splits the string into an array, then gets the indexes of each word in the array, and finally removes the duplicates and prints the result to a different file. I have already written the code for this; here is a link: https://pastebin.com/gqWH0x0 (there is a menu system as well). It is rather long, so I have not included it in this question.

The compression is done with a HashMap: each distinct word is mapped to the set of array indexes at which it occurs. Here is an example:

Original: "sea sea see sea see see"

Output: see[2, 4, 5],sea[0, 1, 3],

Question

The next stage is getting the output back into its original state. I am relatively new to Java, so I am not aware of the techniques required. The code should be able to take the output file (shown above) and turn it back into the original text.

My current thinking is that you would just rewrite this hashmap (below). Would I be correct in thinking this? I thought I should check with Stack Overflow first!

    // new hashmap: word -> set of indexes at which the word occurs
    Map<String, Set<Integer>> seaMap = new HashMap<>();
    for (int seaInt = 0; seaInt < sealist.length; seaInt++) {
        if (seaMap.containsKey(sealist[seaInt])) {
            // word seen before: add this position to its index set
            Set<Integer> index = seaMap.get(sealist[seaInt]);
            index.add(seaInt);
        } else {
            // first occurrence: create the index set for this word
            Set<Integer> index = new HashSet<>();
            index.add(seaInt);
            seaMap.put(sealist[seaInt], index);
        }
    }
    System.out.print("Compressed: ");
    seaMap.forEach((seawords, seavalues) -> System.out.print(seawords + seavalues + ","));
    System.out.println(" ");

If anyone has any good ideas or answers, please let me know. I am really desperate for a solution!

Link to the current code: https://pastebin.com/gqWH0x0K

Recommended answer

First, you will have to separate the words and their indexes from your compressed line. Using your example:

"see[2, 4, 5],sea[0, 1, 3],"

to get the following strings:

"see[2, 4, 5]" and "sea[0, 1, 3]"

For each of these you must read the indexes, e.g. for the first one:

2, 4 and 5

Now just write the word into an ArrayList (or array) at each of the given indexes.

For the first two steps you can use a regular expression to find each word and its index list, then use String.split and Integer.parseInt to get the individual indexes.

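    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // group 1 is the word, group 2 is the comma-separated index list
    Pattern pattern = Pattern.compile("(.*?)\\[(.*?)\\],");
    String line = "see[2, 4, 5],sea[0, 1, 3],";
    Matcher matcher = pattern.matcher(line);
    while (matcher.find()) {
        String word = matcher.group(1);
        String[] indexes = matcher.group(2).split(", ");
        for (String str : indexes) {
            int index = Integer.parseInt(str);
            // next step: store word at position index in the result list
        }
    }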

Now just check that the result List is big enough (pad it with placeholder entries if the index is beyond its current size) and set the word at each of the found indexes.
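Putting the steps together, a minimal sketch of the whole decompression might look like the following. The class name IndexDecompressor and the method decompress are made up for illustration and are not part of the code in the question. The sketch assumes the compressed line has exactly the format printed by the question's code (word, bracketed index list, trailing comma) and that the original words were separated by single spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class IndexDecompressor {
    // One entry such as "see[2, 4, 5]," : group 1 = word, group 2 = index list.
    private static final Pattern ENTRY = Pattern.compile("(.*?)\\[(.*?)\\],");

    // Rebuilds the original text from a line like "see[2, 4, 5],sea[0, 1, 3],".
    // Assumption: words in the original were separated by single spaces.
    static String decompress(String line) {
        List<String> words = new ArrayList<>();
        Matcher matcher = ENTRY.matcher(line);
        while (matcher.find()) {
            String word = matcher.group(1);
            for (String str : matcher.group(2).split(", ")) {
                int index = Integer.parseInt(str.trim());
                // Make sure the list is big enough: pad with null placeholders.
                while (words.size() <= index) {
                    words.add(null);
                }
                words.set(index, word);
            }
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        // prints: sea sea see sea see see
        System.out.println(decompress("see[2, 4, 5],sea[0, 1, 3],"));
    }
}
```

The padding loop is what makes the order of the entries irrelevant: "see" can be placed at index 2 before "sea" has filled indexes 0 and 1. If the compressed line were missing an index, the corresponding position would stay null and show up as the text "null" in the joined result, so a real implementation would probably want to check for that.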
