SequelizeJS 中的慢关联

2022-01-19 00:00:00 postgresql node.js mysql sequelize.js

我正在尝试诊断使用 SequlizeJS 作为 ORM 的 Express 应用程序出现某些速度下降的原因.我有一个模型有 2x hasMany &与其他 2 个模型的 hasOne 关系:

I am trying to diagnose the cause of some slow downs in my Express app which is using SequlizeJS as the ORM. I have a model that has a 2x hasMany & a hasOne relation with 2 other models:

更新:我已经使用 classMethods#associate 函数在定义调用中建立了关联.

Update: I've made the associations within the define call using the classMethods#associate function.

// Model1
classMethods: {
    associate: function(models) {
        Model1.hasMany(models.Model2);
        Model1.hasMany(models.Model3);
        Model1.hasOne(models.Model2, {as: 'next', foreignKey: 'model2_next'});
    }
}

// Model2
classMethods: {
    associate: function(models) {
        Model2.belongsTo(models.Model1, {foreignKey: 'model2_next'});
    }
}

如果我通过以下方式查询它们:

if I query them in the following manner:

db.Model1.find({
    where: { /* Simple where statement */ },
    include: [
        db.Model2,
        db.Model3,
        { model: db.Model2, as: 'next' },
    ]
}).complete(function(err, data) {
    res.json(data);
});

响应可能需要 8-12 秒.但是,如果我单独查询 Model2 并使用 async & 手动合并它们lodash 库:

It can take between 8-12seconds to respond. However, if I query Model2 separately and merge them manually using the async & lodash libraries:

async.parallel({
    model2: function(callback) {
        db.Model2.findAll({
            where: { /* Simple where statement */ }
        }).complete(callback)
    },
    model1: function(callback) {
        db.Model1.find({
            where: { /* Simple where statement */ },
            include: [
                db.Model3,
                { model: db.Model2, as: 'next' },
            ]
        }).complete(callback);
    }
}, function(err, data) {
    var response = data.model1.values;
    response.Model2 = data.model2.map(function(Model2){ return Model2.values });

    res.json(response);
})

需要 60-100 毫秒.

it takes between 60-100ms.

我已经尝试从 MySQL 切换到 PostgreSQL,虽然 PostgreSQL 稍微快了一点,但只有 2-3%!

I've tried switching from MySQL to PostgreSQL and whilst PostgreSQL was fractionally faster, it was a matter of 2-3%!

是什么导致 Sequelize 花费的时间比拆分查询要长得多,有什么方法可以加快速度吗?

What is causing Sequelize to take so much longer than the split query and is there a way I can speed this up?

推荐答案

将 :M 关系添加到包含时,Sequelize 会变慢.:M 关系会导致您的 sql 结果中出现重复的行,因此我们必须花时间对其进行重复数据删除并将其解析为模型.

Sequelize slows down when you add :M relations to your include. :M relations result in duplicate rows in your sql result, so we have to spend time deduplicating that and parsing it into models.

为了获得最佳性能,您可以将 :1 关系保留在包含中,但在单独的查询中执行 :M.

For optimal performance you could leave your :1 relations in your include but do the :M in seperate queries.

当然查询本身也可能很慢,但很可能是 Sequelize 开销的结果 - 尝试直接在数据库上运行查询.

Of course the query itself could also be slow, but most likely it's the result of Sequelize overhead - Try running the query directly on the database.

(免责声明:Sequelize 核心开发者)

(Disclaimer: Sequelize core developer)

你运行的是什么版本?您报告的初始数字听起来很高,但我们在进行一些优化之前已经听说过这些数字,请尝试针对最新的 git master 进行测试.

What version are you running on? The initial number you reported sounds high, but we have heard of those numbers before we did some optimizations, try testing against the latest git master.

我们一直致力于针对这些场景优化代码,但是将 20.000 行重复数据删除到 5.000 行将始终需要一些 cpu 周期.

We're always working on optimizing the code for these scenarios, but deduplicating 20.000 rows to 5.000 rows will always require some cpu cycles.

相关文章