How to parse mustache with Boost.Xpressive correctly?
I have tried to write a mustache parser with the excellent Boost.Xpressive from the brilliant Eric Niebler. But since this is my first parser, I am not familiar with the "normal" approach and lingo of compiler writers, and I feel a bit lost after a few days of trial and error. So I come here hoping someone can point out the foolishness of my n00bish ways ;)
This is the HTML code with the mustache templates that I want to extract (http://mustache.github.io/):

    Now <bold>is the {{#time}}gugus {{zeit}} oder nicht{{/time}} <i>for all good men</i> to come to the {007} aid of their</bold> {{country}}. Result: {{#Res1}}Nullum <b>est</b> mundi{{/Res1}}
- The parser I wrote doesn't print anything, but it also doesn't issue a warning at compile time. Earlier I managed to have it print parts of the mustache code, but never all of it correctly.
- I don't know how to loop through all the code to find every occurrence and still access each one the way I can with the "smatch what;" variable. The docs only show how to find the first occurrence with "what", or how to output all occurrences with the "iterator". I actually need a combination of both: once something is found, I need to inspect the tag's name and the content between the tags (which "what" would offer but the "iterator" won't allow) and act accordingly. I guess I could use "actions", but how?
- I think it should be possible to find the tags and the "content between tags" in one swoop, right? Or do I need to parse twice for that, and if so, how?
    #include <boost/xpressive/xpressive_static.hpp>
    #include <boost/xpressive/match_results.hpp>
    #include <algorithm>  // std::for_each
    #include <iostream>
    #include <string>

    typedef std::string::const_iterator It;
    using namespace boost::xpressive;

    std::string str = "Now <bold>is the {{#time}}gugus {{zeit}} oder nicht{{/time}} <i>for all good men</i> to come to the {007} aid of their</bold> {{country}}. Result: {{#Res1}}Nullum <b>est</b> mundi{{/Res1}}";

    // Parser setup --------------------------------------------------------
    mark_tag mtag (1), cond_mtag (2), user_str (3);

    // {{name}}
    sregex brackets = "{{" >> keep ( mtag = repeat<1, 20> (_w) ) >> "}}" ;

    // {{#name}} ... {{/name}}
    sregex cond_brackets = "{{#" >> keep (cond_mtag = repeat<1, 20> (_w) ) >> "}}"
        >> * ( keep (user_str = + (*_s >> +alnum >> *_s) ) | by_ref (brackets) | by_ref (cond_brackets) )
        >> "{{/" >> cond_mtag >> "}}" ;

    sregex mexpression = *( by_ref (cond_brackets) | by_ref (brackets) );

    // Looping + catching the results --------------------------------------
    smatch what2;
    std::cout << " regex_search: " << str << '\n';

    It strBegin = str.begin(), strEnd = str.end();
    int ic = 0;
    do {
        if ( !regex_search ( strBegin, strEnd, what2, mexpression ) ) {
            std::cout << " >> Breakout of this life...! Exit after " << ic << " loop(s)." << std::endl;
            break;
        } else {
            std::cout << "**Loop Nr: " << ic << '\n';
            std::cout << " what2[0] " << what2[0] << '\n'; // whole match
            std::cout << " what2[mtag] " << what2[mtag] << '\n';
            std::cout << " what2[cond_mtag] " << what2[cond_mtag] << '\n';
            std::cout << " what2[user_str] " << what2[user_str] << '\n';

            // display the nested results
            std::for_each (
                what2.nested_results().begin(),
                what2.nested_results().end(),
                output_nested_results() // <--identical function from E.Nieblers documentation
            );
            strBegin = what2[0].second;
        }
        ++ic;
    } while (ic < 6 || strBegin != str.end() );
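On the second and third points of the question (looping over every occurrence while still reading the tag name and the content between the tags), here is a minimal sketch. It deliberately uses std::regex from the standard library instead of Boost.Xpressive so that it has no dependencies; Xpressive's sregex_iterator follows the same interface, so dereferencing it should likewise give a match_results object. The FoundTag struct, the find_tags function and the pattern itself are assumptions made for illustration, not code from the question or the answer.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical record for this sketch: one mustache tag found in the input.
struct FoundTag {
    bool is_section;      // true for {{#name}}...{{/name}}, false for {{name}}
    std::string name;     // the tag name
    std::string content;  // text between the section tags (empty for plain tags)
};

// Collects every top-level tag in `text` in a single pass.
std::vector<FoundTag> find_tags(const std::string& text) {
    // Groups 1 and 2: section name and section content; the back-reference \1
    // forces the closing tag to carry the same name as the opening tag.
    // Group 3: the name of a plain {{variable}} tag.
    static const std::regex tag_re(
        R"re(\{\{#(\w+)\}\}([\s\S]*?)\{\{/\1\}\}|\{\{(\w+)\}\})re");

    std::vector<FoundTag> found;
    for (std::sregex_iterator it(text.begin(), text.end(), tag_re), end;
         it != end; ++it) {
        // Dereferencing the iterator yields a full match object, so the
        // sub-matches are available exactly as with a lone `smatch what`.
        const std::smatch& what = *it;
        if (what[1].matched)
            found.push_back({true, what[1].str(), what[2].str()});
        else
            found.push_back({false, what[3].str(), ""});
    }
    return found;
}
```

Tags nested inside a section are not reported separately here, because the iterator resumes after the closing tag; calling find_tags again on a section's content would be one way to descend into it. A regex of this kind also cannot cope with a section nested inside another section of the same name, which is one reason a real grammar is the more robust choice.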
Recommended answer
Here is the correct, complete code from @sehe, which now works with GCC > 4.8 and Clang on Linux and Windows. Again, many thanks, mate, for this awesome help, even though it means I can bury Xpressive :D
The following lines have been changed or added:
    // --
    #define BOOST_RESULT_OF_USE_DECLTYPE

    // --
    struct to_string_f {
        template <typename T>
        std::string operator()(T const& v) const { return v.to_string(); }
    };

    // --
    section %= "{{" >> sense >> reference [ section_id = to_string(_1) ] >> "}}"
            >> sequence // contents
            > ("{{" >> ('/' >> lexeme [ lit(section_id) ]) >> "}}");

    // --
    phx::function<to_string_f> to_string;
    //#define BOOST_SPIRIT_DEBUG
    #define BOOST_RESULT_OF_USE_DECLTYPE
    #define BOOST_SPIRIT_USE_PHOENIX_V3
    #include <boost/fusion/adapted/struct.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <boost/utility/string_ref.hpp>
    #include <boost/variant.hpp>
    #include <functional>
    #include <iostream>
    #include <map>
    #include <string>
    #include <typeinfo>
    #include <vector>

    namespace mustache {

        // any atom refers directly to source iterators for efficiency
        using boost::string_ref;
        template <typename Kind> struct atom {
            string_ref value;

            atom() { }
            atom(string_ref const& value) : value(value) { }

            friend std::ostream& operator<<(std::ostream& os, atom const& v) {
                return os << typeid(v).name() << "[" << v.value << "]";
            }
        };

        // the atoms
        using verbatim = atom<struct verbatim_tag>;
        using variable = atom<struct variable_tag>;
        using partial  = atom<struct partial_tag>;

        // the template elements (any atom or a section)
        struct section;

        using melement = boost::variant<
                verbatim,
                variable,
                partial, // TODO comments and set-separators
                boost::recursive_wrapper<section>
            >;

        // the template: sequences of elements
        using sequence = std::vector<melement>;

        // section: recursively define to contain a template sequence
        struct section {
            bool       sense; // positive or negative
            string_ref control;
            sequence   content;
        };
    }

    BOOST_FUSION_ADAPT_STRUCT(mustache::section, (bool, sense)(boost::string_ref, control)(mustache::sequence, content))

    namespace qi = boost::spirit::qi;
    namespace phx= boost::phoenix;

    // converts a string_ref attribute to std::string inside a semantic action
    struct to_string_f {
        template <typename T>
        std::string operator()(T const& v) const { return v.to_string(); }
    };

    template <typename Iterator>
        struct mustache_grammar : qi::grammar<Iterator, mustache::sequence()>
    {
        mustache_grammar() : mustache_grammar::base_type(sequence)
        {
            using namespace qi;
            static const _a_type section_id = {}; // local
            using boost::phoenix::construct;
            using boost::phoenix::begin;
            using boost::phoenix::size;

            sequence     = *element;
            element      =
                        !(lit("{{") >> '/') >> // section-end ends the current sequence
                        (partial | section | variable | verbatim);

            reference    = raw [ lexeme [ +(graph - "}}") ] ]
                            [ _val = construct<boost::string_ref>(&*begin(_1), size(_1)) ];

            partial      = qi::lit("{{") >> "> " >> reference >> "}}";

            sense        = ('#' > attr(true))
                         | ('^' > attr(false));

            // remember the opening name in section_id and expect it again at the close
            section     %= "{{" >> sense >> reference [ section_id = to_string(_1) ] >> "}}"
                        >> sequence // contents
                        > ("{{" >> ('/' >> lexeme [ lit(section_id) ]) >> "}}");

            variable     = "{{" >> reference >> "}}";

            verbatim     = raw [ lexeme [ +(char_ - "{{") ] ]
                            [ _val = construct<boost::string_ref>(&*begin(_1), size(_1)) ];

            BOOST_SPIRIT_DEBUG_NODES(
                    (sequence)(element)(partial)(variable)(section)(verbatim)
                    (reference)(sense)
                )
        }
      private:
        phx::function<to_string_f> to_string;
        qi::rule<Iterator, mustache::sequence()> sequence;
        qi::rule<Iterator, mustache::melement()> element;
        qi::rule<Iterator, mustache::partial()>  partial;
        qi::rule<Iterator, mustache::section(), qi::locals<std::string> > section;
        qi::rule<Iterator, bool()>                sense; // positive or negative
        qi::rule<Iterator, mustache::variable()> variable;
        qi::rule<Iterator, mustache::verbatim()> verbatim;
        qi::rule<Iterator, boost::string_ref()>  reference;
    };

    namespace Dumping {
        // writes the parsed template back out in mustache syntax
        struct dumper : boost::static_visitor<std::ostream&>
        {
            std::ostream& operator()(std::ostream& os, mustache::sequence const& v) const {
                for(auto& element : v)
                    boost::apply_visitor(std::bind(dumper(), std::ref(os), std::placeholders::_1), element);
                return os;
            }
            std::ostream& operator()(std::ostream& os, mustache::verbatim const& v) const {
                return os << v.value;
            }
            std::ostream& operator()(std::ostream& os, mustache::variable const& v) const {
                return os << "{{" << v.value << "}}";
            }
            std::ostream& operator()(std::ostream& os, mustache::partial const& v) const {
                return os << "{{> " << v.value << "}}";
            }
            std::ostream& operator()(std::ostream& os, mustache::section const& v) const {
                os << "{{" << (v.sense?'#':'^') << v.control << "}}";
                (*this)(os, v.content);
                return os << "{{/" << v.control << "}}";
            }
        };
    }

    namespace ContextExpander {

        struct Nil { };

        using Value = boost::make_recursive_variant<
                Nil,
                double,
                std::string,
                std::map<std::string, boost::recursive_variant_>,
                std::vector<boost::recursive_variant_>
            >::type;

        using Dict  = std::map<std::string, Value>;
        using Array = std::vector<Value>;

        static inline std::ostream& operator<<(std::ostream& os, Nil const&)     { return os << "#NIL#"; }
        static inline std::ostream& operator<<(std::ostream& os, Dict const& v)  { return os << "#DICT(" << v.size() << ")#"; }
        static inline std::ostream& operator<<(std::ostream& os, Array const& v) { return os << "#ARRAY(" << v.size() << ")#"; }

        // renders the parsed template against a context value
        struct expander : boost::static_visitor<std::ostream&>
        {
            std::ostream& operator()(std::ostream& os, Value const& ctx, mustache::sequence const& v) const {
                for(auto& element : v)
                    boost::apply_visitor(std::bind(expander(), std::ref(os), std::placeholders::_1, std::placeholders::_2), ctx, element);
                return os;
            }

            template <typename Ctx>
            std::ostream& operator()(std::ostream& os, Ctx const&/*ignored*/, mustache::verbatim const& v) const {
                return os << v.value;
            }

            std::ostream& operator()(std::ostream& os, Dict const& ctx, mustache::variable const& v) const {
                auto it = ctx.find(v.value.to_string());
                if (it != ctx.end())
                    os << it->second;
                return os;
            }

            template <typename Ctx>
            std::ostream& operator()(std::ostream& os, Ctx const&, mustache::variable const&) const {
                return os;
            }

            std::ostream& operator()(std::ostream& os, Dict const& ctx, mustache::partial const& v) const {
                auto it = ctx.find(v.value.to_string());
                if (it != ctx.end())
                {
                    // the partial's text is itself a template: parse and expand it
                    static const mustache_grammar<std::string::const_iterator> p;

                    auto const& subtemplate = boost::get<std::string>(it->second);
                    std::string::const_iterator first = subtemplate.begin(), last = subtemplate.end();

                    mustache::sequence dynamic_template;
                    if (qi::parse(first, last, p, dynamic_template))
                        return (*this)(os, Value{ctx}, dynamic_template);
                }
                return os << "#ERROR#";
            }

            std::ostream& operator()(std::ostream& os, Dict const& ctx, mustache::section const& v) const {
                auto it = ctx.find(v.control.to_string());
                if (it != ctx.end())
                    boost::apply_visitor(std::bind(do_section(), std::ref(os), std::placeholders::_1, std::cref(v)), it->second);
                else if (!v.sense)
                    (*this)(os, Value{/*Nil*/}, v.content);

                return os;
            }

            template <typename Ctx, typename T>
            std::ostream& operator()(std::ostream& os, Ctx const&/* ctx*/, T const&/* element*/) const {
                return os << "[TBI:" << __PRETTY_FUNCTION__ << "]";
            }

          private:
            struct do_section : boost::static_visitor<> {
                void operator()(std::ostream& os, Array const& ctx, mustache::section const& v) const {
                    for(auto& item : ctx)
                        expander()(os, item, v.content);
                }
                template <typename Ctx>
                void operator()(std::ostream& os, Ctx const& ctx, mustache::section const& v) const {
                    if (v.sense == truthiness(ctx))
                        expander()(os, Value(ctx), v.content);
                }
              private:
                static bool truthiness(Nil)      { return false; }
                static bool truthiness(double d) { return 0. != d; } // non-zero numbers are truthy
                template <typename T> static bool truthiness(T const& v) { return !v.empty(); }
            };
        };

    }

    int myMain()
    {
        std::cout << std::unitbuf;
        std::string input = "<ul>{{#time}} <li>{{> partial}}</li>{{/time}}</ul> "
            "<i>for all good men</i> to come to the {007} aid of "
            "their</bold> {{country}}. Result: {{^Res2}}(absent){{/Res2}}{{#Res2}}{{Res2}}{{/Res2}}"
            ;
        // Parser setup --------------------------------------------------------
        typedef std::string::const_iterator It;
        static const mustache_grammar<It> p;

        It first = input.begin(), last = input.end();

        try {
            mustache::sequence parsed_template;
            if (qi::parse(first, last, p, parsed_template))
            {
                std::cout << "Parse success\n";
            } else
            {
                std::cout << "Parse failed\n";
            }

            if (first != last)
            {
                std::cout << "Remaining unparsed input: '" << std::string(first, last) << "'\n";
            }

            std::cout << "Input: " << input << "\n";
            std::cout << "Dump: ";
            Dumping::dumper()(std::cout, parsed_template) << "\n";

            std::cout << "Evaluation: ";

            {
                using namespace ContextExpander;
                expander engine;

                Value const ctx = Dict {
                    { "time", Array {
                        Dict { { "partial", "gugus {{zeit}} (a.k.a. <u>{{title}}</u>)"}, { "title", "noon" }, { "zeit", "12:00" } },
                        Dict { { "partial", "gugus {{zeit}} (a.k.a. <u>{{title}}</u>)"}, { "title", "evening" }, { "zeit", "19:30" } },
                        Dict { { "partial", "gugus <u>{{title}}</u> (expected at around {{zeit}})"}, { "title", "dawn" }, { "zeit", "06:00" } },
                    } },
                    { "country", "ESP" },
                    { "Res3", "unused" }
                };

                engine(std::cout, ctx, parsed_template);
            }
        } catch(qi::expectation_failure<It> const& e)
        {
            std::cout << "Unexpected: '" << std::string(e.first, e.last) << "'\n";
        }
        return 0;
    }
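The changed `section` rule stores the opening tag's name in a rule-local variable and then requires the same name in the closing tag. To make that idea easier to see without the Spirit machinery, here is a rough hand-written sketch of the same technique using only the standard library. It is not @sehe's code: the Node type and the parse_sequence function are hypothetical names made up for this illustration, and partials and inverted (`^`) sections are left out.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical node type for this sketch (not the answer's mustache::melement).
struct Node {
    enum Kind { Text, Variable, Section };
    Kind kind;
    std::string value;           // verbatim text, variable name, or section name
    std::vector<Node> children;  // only filled for sections
};

// Parses elements starting at `pos` until the input ends or until the closing
// tag of `open_section` is reached. The name of the currently open section is
// passed down, much like the rule-local section_id in the grammar, and the
// closing tag must repeat it.
std::vector<Node> parse_sequence(const std::string& in, std::size_t& pos,
                                 const std::string& open_section = "") {
    std::vector<Node> out;
    while (pos < in.size()) {
        std::size_t open = in.find("{{", pos);
        if (open == std::string::npos) open = in.size();
        if (open > pos) {  // verbatim text up to the next tag (or end of input)
            out.push_back({Node::Text, in.substr(pos, open - pos), {}});
            pos = open;
            continue;
        }
        std::size_t close = in.find("}}", pos);
        if (close == std::string::npos)
            throw std::runtime_error("unterminated tag");
        std::string tag = in.substr(pos + 2, close - pos - 2);
        pos = close + 2;
        if (tag.empty())
            throw std::runtime_error("empty tag");

        if (tag[0] == '#') {         // section start: recurse for its contents
            std::string name = tag.substr(1);
            out.push_back({Node::Section, name, parse_sequence(in, pos, name)});
        } else if (tag[0] == '/') {  // section end: must match the open section
            if (open_section.empty() || tag.substr(1) != open_section)
                throw std::runtime_error("mismatched closing tag: " + tag);
            return out;
        } else {                     // plain variable
            out.push_back({Node::Variable, tag, {}});
        }
    }
    if (!open_section.empty())
        throw std::runtime_error("missing closing tag for: " + open_section);
    return out;
}
```

Called as `std::size_t pos = 0; auto tree = parse_sequence(input, pos);`, this yields a tree in which each section owns its nested elements, so tag names and the content between tags come out of a single pass. The Spirit grammar does the same job declaratively and additionally keeps references into the source text instead of copying strings.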