Read a text file and store every character occurrence

2022-01-12 00:00:00 arrays char java

I would like to make a Java program that reads a text file and stores every single character occurrence, so it should account for punctuation, letters, numbers, uppercase, lowercase, etc. Given a text file like:

Roses are red,

Violets are blue.

printing the values would look like this:

R : 1

r : 3

i : 1

, : 1

[etc.]

So far I am able to read a file and count words, lines, chars.

package Exercise3;

import java.io.File;
import java.io.IOException;
import java.util.Scanner;
import java.util.StringTokenizer;

public class StringTokenizerDemo1
{
    public static void main(String[] args) throws IOException
    {
        File file = new File("C://Users//guy//Desktop//Practice.txt");
        int words = 0; // word count
        int lines = 0; // line count
        int chars = 0; // char count (spaces are not counted)
        // try-with-resources closes the file scanner when done
        try (Scanner inputFile = new Scanner(file))
        {
            while (inputFile.hasNextLine())
            {
                lines++; // add one to line count
                String line = inputFile.nextLine();
                StringTokenizer token = new StringTokenizer(line, " ");
                while (token.hasMoreTokens())
                {
                    words++; // add one to word count
                    String word = token.nextToken();
                    chars += word.length(); // add to char count
                }
            }
        }
        System.out.println("Lines: " + lines + ", words: " + words + ", chars: " + chars);
    }
}

I have not learned hash maps/tables or tree maps yet; I am looking for advice on how to store all character types and their occurrences using an array, ArrayList, or LinkedList.

Recommended answer

A char is a 16-bit unsigned value, and if you cast it to an int, then you'll get a value between 0 and 65535. That means that you can just use an array to store your characters:

int[] charCounts = new int[65536];

and then when you want to record an occurrence of char c:

charCounts[(int) c]++;

and when you want to read out the counts:

for (int i=0; i<65536; i++)
    if (charCounts[i]>0)
        System.out.println((char)(i)+": "+charCounts[i]);
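Putting the array approach together, here is a minimal sketch. The class name `CharCountArrayDemo` and the helper `addCounts` are made up for illustration, and the two hard-coded lines stand in for whatever `inputFile.nextLine()` returns in the question's loop.

```java
class CharCountArrayDemo {
    // Adds one to the counter of every char in the given line.
    // The index into the array is the char's numeric value (0..65535).
    static void addCounts(int[] charCounts, String line) {
        for (int i = 0; i < line.length(); i++) {
            charCounts[line.charAt(i)]++;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the lines read from the file (assumption for the demo).
        String[] lines = { "Roses are red,", "Violets are blue." };

        int[] charCounts = new int[65536];
        for (String line : lines) {
            addCounts(charCounts, line);
        }

        // Print only the characters that actually occurred.
        for (int i = 0; i < charCounts.length; i++) {
            if (charCounts[i] > 0) {
                System.out.println((char) i + " : " + charCounts[i]);
            }
        }
    }
}
```

Because the loop walks the array by index, the output comes out ordered by character code, and it includes lines such as `R : 1` and `r : 3`. Spaces are counted too, since this sketch processes whole lines rather than tokens.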

There is nothing to stop you doing it with a HashMap<Character,Integer> if you want to do it as an exercise, though it's more heavyweight than it needs to be for this:

HashMap<Character,Integer> map = new HashMap<Character,Integer>();

When you want to record an occurrence of char c:

if (!map.containsKey(c))
    map.put(c,1);
else
    map.put(c,map.get(c)+1);

And when you want to read the counts out:

for (Map.Entry<Character,Integer> entry: map.entrySet())    
    System.out.println(entry.getKey()+": "+entry.getValue());
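For completeness, the map variant could be sketched as a whole program like this. The class name `CharCountMapDemo` and the method `countChars` are illustrative only, and `Map.merge` (available since Java 8) is swapped in as a shorter equivalent of the `containsKey`/`put` pair above.

```java
import java.util.HashMap;
import java.util.Map;

class CharCountMapDemo {
    // Builds a map from each char in the text to its number of occurrences.
    static Map<Character, Integer> countChars(String text) {
        Map<Character, Integer> map = new HashMap<>();
        for (char c : text.toCharArray()) {
            // merge stores 1 for a new key, otherwise adds 1 to the existing count
            map.merge(c, 1, Integer::sum);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Character, Integer> map = countChars("Roses are red, Violets are blue.");
        for (Map.Entry<Character, Integer> entry : map.entrySet()) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

A `HashMap` makes no promise about iteration order, so the lines may not come out sorted by character the way they do with the array.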

Note that for all of this I've assumed you're dealing only with printable characters. If not, you'll want to do something about that when you print them out.
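One possible way to handle that, shown only as a sketch: print control characters (tab, newline and so on) as their code point instead of the raw character. The class `PrintableCharDemo` and the helper `describe` are hypothetical names, and treating only ISO control characters as "non-printable" is an assumption; other invisible characters would need extra checks.

```java
class PrintableCharDemo {
    // Returns the char itself, or a "U+XXXX" label if it is a control character.
    static String describe(char c) {
        if (Character.isISOControl(c)) {
            return String.format("U+%04X", (int) c);
        }
        return String.valueOf(c);
    }

    public static void main(String[] args) {
        String text = "a\tb\n";
        int[] charCounts = new int[65536];
        for (int i = 0; i < text.length(); i++) {
            charCounts[text.charAt(i)]++;
        }
        // Use describe() so control characters show up as readable labels.
        for (int i = 0; i < charCounts.length; i++) {
            if (charCounts[i] > 0) {
                System.out.println(describe((char) i) + " : " + charCounts[i]);
            }
        }
    }
}
```

With this, a tab shows up as `U+0009 : 1` rather than as invisible whitespace in the output.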
