Proper NoSQL data schema for a web photo gallery

2022-01-13 00:00:00 python sql nosql amazon-dynamodb

Problem description

I'm looking to build an appropriate data structure for NoSQL storage of a photo gallery. In my web application, a photo can be part of 1 or more albums. I have experience with MySQL, but almost none with key-value storage.

With MySQL, I would set up three tables as follows:

photos (photo_id, title, date_uploaded, filename)
albums (album_id, title, photo_id)
album_photo_map (photo_id, album_id)

And then, to retrieve a list of the 5 latest photos (with album data), a query like this:

SELECT *
FROM photos
JOIN album_photo_map ON photos.photo_id = album_photo_map.photo_id
JOIN albums ON albums.album_id = album_photo_map.album_id
ORDER BY photos.date_uploaded DESC
LIMIT 5;

How would I accomplish a similar query using a NoSQL key-value pair database? (Specifically, Amazon's DynamoDB.) What would the storage look like? How would the indexing work?


Solution

In MongoDB terms, your collections could look like this:

photos = [
    {
        _id: ObjectId(...),
        title: "...",
        date_uploaded: Date(...),
        albums: [
            ObjectId(...),
            ...
        ]
    },
    ...
]

albums = [
    {
        _id: ObjectId(...),
        title: "..."
    }
]

Finding the 5 newest photos would be done like this:

> var latest = db.photos.find({}).sort({date_uploaded: -1}).limit(5);
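
To make "newest first" concrete, here is a small in-memory Python sketch. It is not a MongoDB client call: the sample documents and the helper name `latest_photos` are made up for illustration, and the function only mimics a descending sort on `date_uploaded` followed by a limit of 5.

```python
from datetime import datetime

# Hypothetical sample documents shaped like the photos collection above.
photos = [
    {
        "_id": i,
        "title": f"photo-{i}",
        "date_uploaded": datetime(2022, 1, i),
        "albums": [i % 2],
    }
    for i in range(1, 8)
]


def latest_photos(docs, n=5):
    """Return the n most recently uploaded documents, newest first."""
    # reverse=True gives a descending sort, so the newest upload comes first.
    return sorted(docs, key=lambda d: d["date_uploaded"], reverse=True)[:n]


latest = latest_photos(photos)
print([p["_id"] for p in latest])
```

An ascending sort would return the five oldest photos instead, which is why the sort direction matters here.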

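A plain find() does not join collections on the server (newer MongoDB versions add a $lookup aggregation stage, but the approach here works without it), so you'd have to fetch the album ids of the latest photos with a separate projection like this:

> var latest_albums = db.photos.find({}, {albums: 1}).sort({date_uploaded: -1}).limit(5);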

Of course, you then have to boil this down into a set of distinct album ids and fetch the matching album documents.
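
The "boil down into a set" step could be sketched in application code roughly as follows. This is plain Python over hypothetical in-memory data (the ids `a1`, `p1` and the helper `album_ids` are invented for the example); in a real application the final lookup would be a second query against the albums collection.

```python
# Hypothetical album documents, keyed by id for easy lookup.
albums = {
    "a1": {"_id": "a1", "title": "Family"},
    "a2": {"_id": "a2", "title": "Travel"},
    "a3": {"_id": "a3", "title": "Work"},
}

# Hypothetical result of the "latest photos" query: each photo lists album ids.
latest = [
    {"_id": "p1", "albums": ["a1", "a2"]},
    {"_id": "p2", "albums": ["a2"]},
    {"_id": "p3", "albums": []},
]


def album_ids(photo_docs):
    """Collect the distinct album ids referenced by the given photos."""
    return {
        album_id
        for photo in photo_docs
        for album_id in photo.get("albums", [])
    }


ids = album_ids(latest)
# Stand-in for the second query that fetches albums whose _id is in `ids`.
latest_albums = [albums[i] for i in sorted(ids)]
print(latest_albums)
```

The set removes duplicates, so an album shared by several recent photos is fetched only once.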

It's actually easier if you just embed the albums inside the photo documents, since they're small:

photos = [
    {
        _id: ObjectId(...),
        title: "...",
        date_uploaded: Date(...),
        albums: [
            {name: "family-vacation-2011", title: "My family vacation in 2011"},
            ...
        ]
    },
    ...
]

Then querying is the same, but you don't have to join. Finding all photos in an album looks like:

> db.photos.find({albums:{$elemMatch:{name: "family-vacation-2011"}}});
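
As a rough illustration of what that match does, the following Python sketch filters hypothetical in-memory documents with embedded albums. The sample data and the helper `photos_in_album` are assumptions made for the example, not part of any MongoDB driver; the helper keeps a photo when at least one embedded album has the given `name`.

```python
# Hypothetical photo documents with album data embedded in each photo.
photos = [
    {
        "_id": 1,
        "title": "Beach",
        "albums": [{"name": "family-vacation-2011", "title": "Family vacation"}],
    },
    {
        "_id": 2,
        "title": "Office",
        "albums": [{"name": "work", "title": "Work"}],
    },
    {
        "_id": 3,
        "title": "Sunset",
        "albums": [
            {"name": "work", "title": "Work"},
            {"name": "family-vacation-2011", "title": "Family vacation"},
        ],
    },
    {"_id": 4, "title": "Unsorted", "albums": []},
]


def photos_in_album(docs, album_name):
    """Return photos where at least one embedded album has the given name."""
    return [
        doc
        for doc in docs
        if any(album.get("name") == album_name for album in doc.get("albums", []))
    ]


matches = photos_in_album(photos, "family-vacation-2011")
print([p["title"] for p in matches])
```

The trade-off of embedding is that album data is duplicated across photos, so renaming an album means updating every photo that contains it.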
