Boost serialization of Armadillo sparse matrices

2022-04-13 00:00:00 sparse-matrix serialization c++ boost

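I'm trying to use the sparse matrix functionality in Armadillo, but I'm running into problems when serializing it. The matrices I work with are very large and most of their components are zero, so using sp_mat makes sense. Here is the code: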

#include <iostream>
#include <fstream>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <armadillo>
#include <boost/serialization/split_member.hpp>

BOOST_SERIALIZATION_SPLIT_FREE(arma::sp_mat)

namespace boost { 
namespace serialization {

template<class Archive>
void save(Archive & ar, const arma::sp_mat &t, unsigned int version)
{
    ar & t.n_rows;
    ar & t.n_cols;
    const double *data = t.memptr();
    for(int K=0; K<t.n_elem; ++K)
        ar & data[K];

}

template<class Archive>
void load(Archive & ar, arma::sp_mat &t, unsigned int version)
{
    int rows, cols;
    ar & rows;
    ar & cols;
    t.set_size(rows, cols);
    double *data = t.memptr();
    for(int K=0; K<t.n_elem; ++K)
        ar & data[K];}
}}
int main() {

  arma::mat C(3,3, arma::fill::randu);
  C(1,1) = 0; //example so that a few of the components are u
  C(1,2) = 0;
  C(0,0) = 0;
  C(2,1) = 0;
  C(2,0) = 0;
  arma::sp_mat A = arma::sp_mat(C);

  std::ofstream outputStream;
  outputStream.open("bin.dat");
  std::ostringstream oss;
  boost::archive::binary_oarchive oa(outputStream);
  oa & A;
  outputStream.close();

  arma::sp_mat B;
  std::ifstream inputStream;
  inputStream.open("bin.dat", std::ifstream::in);
  boost::archive::binary_iarchive ia(inputStream);
  ia & B;
  return 0;
}

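The current problem is that sp_mat has no memptr() member, so the parts that do the serialization (e.g., lines 10-12) don't work for sp_mat. Does anyone know a way around this? What I also find odd is that when I print the components of A one by one, the zeros still seem to be in memory, even though a sparse matrix is supposed to leave them out. For example, printing A(1,1) gives 0. Here is what A looks like when printed: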

[matrix size: 3x3; n_nonzero: 4; density: 44.44%]

     (1, 0)         0.2505
     (0, 1)         0.9467
     (0, 2)         0.2513
     (2, 2)         0.5206

Solution

The number of elements in a matrix is always n × m, regardless of the storage strategy (sparse or dense).

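So you shouldn't be surprised that you can read those cells: they may not be stored, but they clearly matter for computation, so you should be able to retrieve their values.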

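Given that, your sketch (using memptr(), which I assume was copy/pasted from code specific to non-sparse matrices) would always store non-sparse data, because it iterates over all n_elem elements. But data cannot point to some contiguous storage: unless the memory layout directly matches the matrix dimensions (dense storage, row-major or column-major), how would the matrix know which cells those values belong to?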
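To make the distinction between logical elements and stored elements concrete, here is a minimal, library-free sketch. ToySparse and its members are hypothetical names invented for this illustration; Armadillo's sp_mat uses a compressed column layout internally, not a map, so this only mirrors the observable behavior (n_elem stays n × m, unstored cells read as 0).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Toy sparse matrix (hypothetical, for illustration only): only nonzero
// cells are stored, keyed by (row, col).
struct ToySparse {
    std::size_t n_rows;
    std::size_t n_cols;
    std::map<std::pair<std::size_t, std::size_t>, double> cells;

    // Logical element count: always rows * cols, whatever is stored.
    std::size_t n_elem() const { return n_rows * n_cols; }

    // Number of values actually held in memory.
    std::size_t n_nonzero() const { return cells.size(); }

    // Writing a zero erases the cell instead of storing it.
    void set(std::size_t r, std::size_t c, double v) {
        assert(r < n_rows && c < n_cols);
        if (v == 0.0) {
            cells.erase(std::make_pair(r, c));
        } else {
            cells[std::make_pair(r, c)] = v;
        }
    }

    // Reading an unstored cell yields 0 without allocating anything.
    double at(std::size_t r, std::size_t c) const {
        auto it = cells.find(std::make_pair(r, c));
        return it == cells.end() ? 0.0 : it->second;
    }
};
```

With a 3x3 ToySparse holding four values, n_elem() is 9, n_nonzero() is 4, and at(1, 1) returns 0, which matches what the question observed for A(1,1). There is no contiguous array of nine doubles anywhere, which is why a memptr()-style loop has nothing to walk over.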

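Based on the information in "Returning locations and values of a sparse matrix in armadillo c++", here is a fixed implementation that:

  • does not try to use undocumented implementation details
  • uses the documented interface (it.col(), it.row()) for sparse serialization
  • works

Full code (tested on my machine):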

#include <armadillo>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/serialization/split_free.hpp> // defines BOOST_SERIALIZATION_SPLIT_FREE
#include <cassert>
#include <fstream>
#include <iostream>

BOOST_SERIALIZATION_SPLIT_FREE(arma::sp_mat)

namespace boost { namespace serialization {

    template<class Archive>
    void save(Archive & ar, const arma::sp_mat &t, unsigned) {
        ar & t.n_rows & t.n_cols & t.n_nonzero;

        for (auto it = t.begin(); it != t.end(); ++it) {
            ar & it.row() & it.col() & *it;
        }
    }

    template<class Archive>
    void load(Archive & ar, arma::sp_mat &t, unsigned) {
        arma::uword r, c, nz; // same type save() writes for n_rows, n_cols, n_nonzero
        ar & r & c & nz;

        t.zeros(r, c);
        while (nz--) {
            double v;
            ar & r & c & v;
            t(r, c) = v;
        }
    }
}} // namespace boost::serialization

int main() {

    arma::mat C(3, 3, arma::fill::randu);
    C(0, 0) = 0;
    C(1, 1) = 0; // example so that a few of the components are u
    C(1, 2) = 0;
    C(2, 0) = 0;
    C(2, 1) = 0;

    {
        arma::sp_mat const A = arma::sp_mat(C);
        assert(A.n_nonzero == 4);

        A.print("A: ");
        std::ofstream outputStream("bin.dat", std::ios::binary);
        boost::archive::binary_oarchive oa(outputStream);
        oa& A;
    }

    {
        std::ifstream inputStream("bin.dat", std::ios::binary);
        boost::archive::binary_iarchive ia(inputStream);

        arma::sp_mat B(3,3);
        B(0,0) = 77; // some old data should be cleared

        ia& B;

        B.print("B: ");
    }
}

Prints:

A:
[matrix size: 3x3; n_nonzero: 4; density: 44.44%]

     (1, 0)         0.2505
     (0, 1)         0.9467
     (0, 2)         0.2513
     (2, 2)         0.5206

B:
[matrix size: 3x3; n_nonzero: 4; density: 44.44%]

     (1, 0)         0.2505
     (0, 1)         0.9467
     (0, 2)         0.2513
     (2, 2)         0.5206
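The record layout that save() and load() agree on above (a header with rows, columns, and nonzero count, followed by one row/column/value record per stored entry) can be sketched without Armadillo or Boost. Everything here is an assumption made for illustration: Triplet, SparseCOO, save_coo and load_coo are hypothetical names, and a text stream stands in for the binary archive.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <istream>
#include <limits>
#include <ostream>
#include <utility>
#include <vector>

// Hypothetical stand-in types, not part of Armadillo or Boost.
struct Triplet {
    std::size_t row;
    std::size_t col;
    double value;
};

struct SparseCOO {
    std::size_t n_rows = 0;
    std::size_t n_cols = 0;
    std::vector<Triplet> entries; // nonzero cells only
};

// Header first (rows, cols, nonzero count), then one record per entry.
void save_coo(std::ostream& os, const SparseCOO& m) {
    // max_digits10 keeps doubles exact across the text round trip.
    os << std::setprecision(std::numeric_limits<double>::max_digits10);
    os << m.n_rows << ' ' << m.n_cols << ' ' << m.entries.size() << '\n';
    for (const Triplet& t : m.entries) {
        assert(t.row < m.n_rows && t.col < m.n_cols);
        os << t.row << ' ' << t.col << ' ' << t.value << '\n';
    }
}

// Reads the header, then exactly the announced number of records.
// The target is only overwritten (old contents discarded) on success.
bool load_coo(std::istream& is, SparseCOO& m) {
    std::size_t rows = 0, cols = 0, nz = 0;
    if (!(is >> rows >> cols >> nz)) return false;

    SparseCOO fresh;
    fresh.n_rows = rows;
    fresh.n_cols = cols;
    for (std::size_t i = 0; i < nz; ++i) {
        Triplet t{};
        if (!(is >> t.row >> t.col >> t.value)) return false;
        fresh.entries.push_back(t);
    }
    m = std::move(fresh);
    return true;
}
```

Writing the nonzero count up front is what lets the loader know how many records to expect, and building into a fresh object plays the same role as t.zeros(r, c) in the Boost version: stale entries such as the B(0,0) = 77 in the test program do not survive loading.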
