Outputting more than a polymorphic text archive

2021-12-24 00:00:00 c++ c++11 boost

I'm using the Shark machine learning library, and it outputs its classifiers to file by using the boost::archive::polymorphic_text_(io)archive classes.

I'm creating a bag-of-words model that I also need to write to file, using my own code.

I would ideally like to output this to the same file as the classifier. Is it possible to write other things to the same file that a polymorphic text archive is written to? Is it enough to just pass the fstream at the point the archive begins?

Just to be slightly clearer: Does Boost support me putting other things in a file alongside these archives?

Recommended answer

First Off: Streams Are Not Archives.

My first reaction would be "have you tried it?", but I was intrigued and couldn't find anything about this in the documentation, so I ran a few tests myself:

  • the answer seems to be "No", it's not supported
  • it seems to work for binary archives
  • it seems to break down once xml/text archives are involved, because they leave trailing 0xa (newline) characters in the input buffer. These pose no problem if the "next" archive to be read is text as well, but they obviously break a binary archive that follows.
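
The trailing-newline effect can be reproduced without Boost. The sketch below is only an analogy, assuming a formatted text field followed by a raw 4-byte value in one stream; `write_mixed` and `read_mixed` are hypothetical helpers, not part of Boost.Serialization, and they do not reflect how the archive classes are implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: a text field terminated by '\n' (0xa), followed by
// a raw 4-byte value, all in one stream.
inline std::string write_mixed(int text_value, std::uint32_t raw_value) {
    std::ostringstream os;
    os << text_value << '\n';
    os.write(reinterpret_cast<const char*>(&raw_value), sizeof raw_value);
    return os.str();
}

// Hypothetical helper: reads the text field, then the raw value.
// Formatted extraction stops *before* the '\n', so unless the caller
// skips it, the raw read starts one byte too early.
inline std::uint32_t read_mixed(const std::string& bytes, bool skip_newline) {
    std::istringstream is(bytes);
    int text_value = 0;
    is >> text_value;                 // leaves '\n' in the input buffer
    assert(is && "text field must parse");
    if (skip_newline)
        is.ignore(1);                 // consume the leftover 0xa
    std::uint32_t raw = 0;
    is.read(reinterpret_cast<char*>(&raw), sizeof raw);
    return raw;
}
```

Calling `read_mixed(write_mixed(42, 0xDEADBEEF), true)` recovers the raw value, while passing `false` yields a different number because the first byte read is the leftover 0xa. A text reader that skips leading whitespace would not notice the difference, which matches the observation above.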

Here's my tester:

Live On Coliru

#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/xml_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/serialization/nvp.hpp> // BOOST_SERIALIZATION_NVP
#include <cassert>                     // assert
#include <iostream>                    // std::cout

int data = 42;

template <typename Ar>
void some_output(std::ostream& os)
{
    std::cout << "Writing archive at " << os.tellp() << "\n";
    Ar ar(os);
    ar << BOOST_SERIALIZATION_NVP(data);
}

template <typename Ar>
void roundtrip(std::istream& is)
{
    data = -1;
    std::cout << "Reading archive at " << is.tellg() << "\n";
    Ar ar(is);
    ar >> BOOST_SERIALIZATION_NVP(data);
    assert(data == 42);
}

#include <sstream>

int main()
{
    std::stringstream ss;

    //some_output<boost::archive::text_oarchive>(ss); // this derails the binary archive that follows
    some_output<boost::archive::binary_oarchive>(ss);
    some_output<boost::archive::xml_oarchive>(ss);
    some_output<boost::archive::text_oarchive>(ss);

    //roundtrip<boost::archive::text_iarchive>(ss);
    roundtrip<boost::archive::binary_iarchive>(ss);
    roundtrip<boost::archive::xml_iarchive>(ss);
    roundtrip<boost::archive::text_iarchive>(ss);

    // just to prove that there's remaining whitespace
    std::cout << "remaining: ";
    char ch;
    while (ss>>std::noskipws>>ch)
        std::cout << " " << std::showbase << std::hex << ((int)(ch));
    std::cout << "\n";

    // of course, anything else will fail:
    try {
        roundtrip<boost::archive::text_iarchive>(ss);
    } catch(boost::archive::archive_exception const& e)
    {
        std::cout << "Can't deserialize from a stream a EOF: " << e.what();
    }
}

Prints:

Writing archive at 0
Writing archive at 44
Writing archive at 242
Reading archive at 0
Reading archive at 44
Reading archive at 240
remaining:  0xa
Reading archive at 0xffffffffffffffff
Can't deserialize from a stream a EOF: input stream error
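
The odd-looking position 0xffffffffffffffff in the last read is most likely just -1 printed in hex: `std::hex` and `std::showbase` stay set on `std::cout` after the whitespace loop, and `tellg()` on a stream that has already failed returns -1. A minimal standard-library sketch of that effect, assuming a 64-bit `long long`; `format_failed_tellg` is a hypothetical name used only for illustration:

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats the position reported by a failed stream
// the same way the tester's std::cout would after the hex loop.
inline std::string format_failed_tellg() {
    std::istringstream in("");   // empty input
    char ch;
    in >> ch;                    // fails: sets eofbit and failbit
    assert(in.fail());

    std::ostringstream out;
    out << std::showbase << std::hex;   // sticky flags, as in the tester
    // tellg() returns pos_type(-1) on a failed stream; in hex, the signed
    // value is shown as its two's-complement bit pattern.
    out << static_cast<long long>(std::streamoff(in.tellg()));
    return out.str();
}
```

Resetting the stream with `std::dec` before the final `roundtrip` call would print the position as -1 instead.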
