How to extract numbers from a string?

2022-01-17 00:00:00 string numbers parsing java stringtokenizer

I'm using a Java StreamTokenizer to extract the words and numbers in a String, but I have run into a problem with numbers that contain commas: for example, 10,567 is read as 10.0 and ,567.

I also need to remove all non-numeric characters from numbers wherever they occur: for example, $678.00 should become 678.00 and -87 should become 87.

I believe this can be achieved with the whitespaceChars and wordChars methods, but does anyone know how to do it?

The basic StreamTokenizer code at present is:

        BufferedReader br = new BufferedReader(new StringReader(text));
        StreamTokenizer st = new StreamTokenizer(br);
        st.parseNumbers();
        st.wordChars(44, 46); // ASCII comma, - , dot.
        st.wordChars(48, 57); // ASCII 0 - 9.
        st.wordChars(65, 90); // ASCII upper case A - Z.
        st.wordChars(97, 122); // ASCII lower case a - z.
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {                    
                System.out.println("String: " + st.sval);
            }
            else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                System.out.println("Number: " + st.nval);
            }
        }
        br.close(); 

Or could someone suggest a regular expression to achieve this? I'm not sure a regex is useful here, given that any parsing would take place after the tokens have been read from the string.

Thanks,

Mr Morgan.

Recommended answer

StreamTokenizer is outdated; it is better to use Scanner. Here is sample code for your problem:

    String s = "$23.24 word -123";
    Scanner fi = new Scanner(s); // java.util.Scanner
    // Anything other than alphanumeric characters, commas, dots and
    // minus signs is treated as a delimiter and skipped.
    // Backslashes must be doubled inside a Java string literal; the
    // trailing + collapses runs of delimiters so no empty tokens appear.
    fi.useDelimiter("[^\\p{Alnum},\\.-]+");
    while (true) {
        if (fi.hasNextInt())
            System.out.println("Int: " + fi.nextInt());
        else if (fi.hasNextDouble())
            System.out.println("Double: " + fi.nextDouble());
        else if (fi.hasNext())
            System.out.println("word: " + fi.next());
        else
            break;
    }
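
For reference, the Scanner approach above can be wrapped in a small self-contained class and run against the examples from the question. This is only a sketch under some assumptions: the class name NumberExtractor and the helper tokens are hypothetical, the locale is pinned to Locale.US so that the comma is read as a grouping separator, and the minus sign is left out of the allowed characters so that it acts as a delimiter and -87 comes out as 87, as the question asks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

public class NumberExtractor {

    // Hypothetical helper: labels each token as Int, Double or Word.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            // Assumption: US conventions, so ',' groups digits and '.' is the decimal point.
            sc.useLocale(Locale.US);
            // Everything except letters, digits, commas and dots is a delimiter,
            // which drops '$' and '-' from the tokens.
            sc.useDelimiter("[^\\p{Alnum},.]+");
            while (sc.hasNext()) {
                if (sc.hasNextInt()) {
                    out.add("Int: " + sc.nextInt());
                } else if (sc.hasNextDouble()) {
                    out.add("Double: " + sc.nextDouble());
                } else {
                    out.add("Word: " + sc.next());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        tokens("Paid $678.00 for 10,567 items balance -87")
                .forEach(System.out::println);
    }
}
```

With this delimiter a comma that directly follows a word (as in "items,") stays attached to that word, so real input may need the comma handled separately.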

If you want the comma to be treated as the decimal separator, switch to a locale that uses it, for example fi.useLocale(Locale.FRANCE);
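
To illustrate the effect of the locale, here is a minimal sketch that reads the same value written with a comma and with a dot. The class name LocaleDemo and the helper parse are hypothetical and exist only for this example.

```java
import java.util.Locale;
import java.util.Scanner;

public class LocaleDemo {

    // Hypothetical helper: reads a single double using the given locale's number format.
    static double parse(String token, Locale locale) {
        try (Scanner sc = new Scanner(token)) {
            sc.useLocale(locale);
            return sc.nextDouble();
        }
    }

    public static void main(String[] args) {
        // The comma is the decimal separator under French conventions.
        System.out.println(parse("3,14", Locale.FRANCE));
        // The dot is the decimal separator under US conventions.
        System.out.println(parse("3.14", Locale.US));
    }
}
```

Setting the locale explicitly also avoids depending on the default locale of the machine that runs the code.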
