如何使用 Julia 在矩阵中查找连通分量

2022-01-23 00:00:00 python matrix algorithm julia graph

问题描述

Assume I have the following matrix (defined here in Julia language):

    mat = [1 1 0 0 0 ; 1 1 0 0 0 ; 0 0 0 0 1 ; 0 0 0 1 1]

Considering as a "component" a group of neighbour elements that have value '1', how to identify that this matrix has 2 components and which vertices compose each one?

For the matrix mat above I would like to find the following result:

Component 1 is composed by the following elements of the matrix (row,column):

    (1,1)
    (1,2)
    (2,1)
    (2,2)

Component 2 is composed by the following elements:

    (3,5)
    (4,4)
    (4,5)

I can use Graph algorithms like this to identify components in square matrices. However such algorithms can not be used for non-square matrices like the one I present here.

Any idea will be much appreciated.

I am open if your suggestion involves the use of a Python library + PyCall. Although I would prefer to use a pure Julia solution.

Regards

解决方案

Using Image.jl's label_components is indeed the easiest way to solve the core problem. However, your loop over 1:maximum(labels) may not be efficient: it's O(N*n), where N is the number of elements in labels and n the maximum, because you visit each element of labels n times.

You'd be much better off just visiting each element of labels just twice: once to determine the maximum, and once to assign each non-zero element to its proper group:

using Images

function collect_groups(labels)
    groups = [Int[] for i = 1:maximum(labels)]
    for (i,l) in enumerate(labels)
        if l != 0
            push!(groups[l], i)
        end
    end
    groups
end

mat = [1 1 0 0 0 ; 1 1 0 0 0 ; 0 0 0 0 1 ; 0 0 0 1 1]

labels = label_components(mat)
groups = collect_groups(labels)

Output on your test matrix:

2-element Array{Array{Int64,1},1}:
 [1,2,5,6] 
 [16,19,20]

Calling library functions like find can occasionally be useful, but it's also a habit from slower languages that's worth leaving behind. In julia, you can write your own loops and they will be fast; better yet, often the resulting algorithm is much easier to understand. collect(zip(ind2sub(size(mat),find( x -> x == value, mat))...)) does not exactly roll off the tongue.

相关文章