Using C++ boost::split to split a string without splitting quoted text

2021-12-24 00:00:00 split c++ boost

I am using

boost::split(strs, r_strCommandLine, boost::is_any_of("\t "));

to split a string into tokens for parsing a simple script. So far, so good. However, for the following string

command_name first_argument "Second argument which is a quoted string." 

I would like my tokens to be

strs[0] = command_name
strs[1] = first_argument
strs[2] = "Second argument which is a quoted string." 

Of course, I could search for quote characters at the beginning and end of each token and then merge, using " " as the delimiter, every token from one that begins with a quote through one that ends with a quote, recreating the quoted string. But I am wondering if there is a more efficient or elegant way of doing this. Any ideas?

Solution

Example using boost::tokenizer:

#include <string>
#include <iostream>
using std::cout;
using std::string;

#include <boost/tokenizer.hpp>
using boost::tokenizer;
using boost::escaped_list_separator;

typedef tokenizer<escaped_list_separator<char> > so_tokenizer;

int main()
{
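    // The quoted argument is embedded using escaped double quotes.
    string s("command_name first_argument "
             "\"Second argument which is a quoted string.\"");

    // Escape character '\\', separator ' ', quote character '"'.
    so_tokenizer tok(s, escaped_list_separator<char>('\\', ' ', '"'));
    for(so_tokenizer::iterator beg=tok.begin(); beg!=tok.end(); ++beg)
    {
        cout << *beg << "\n";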
    }

    return 0;
}

Output:

command_name
first_argument
Second argument which is a quoted string.

See demo at https://ideone.com/gwCpug .
