Detecting differences between two strings with JavaScript

2022-01-25 00:00:00 string compare javascript

With JavaScript, I want to check how many differences there are between two strings.

Something like:

var oldName = "Alec";
var newName = "Alexander";
var differences = getDifference(oldName, newName) // differences = 6

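  • Any letters added to the name should count as one change per letter.
  • Changing a letter should count as one change per letter.
  • Swapping two letters should count as two changes, because you are really changing each letter.
  • However, shifting a letter and inserting another should count as only one change.

For example: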

    Changing "Alex" to "Alexander" would be 5 changes, as 5 letters have been added.

    Changing "Alex" to "Allex" would be only one change, as you added an "l" and shifted the rest over but didn't change them.

    Changing "Alexander" to "Allesander" would be 2 changes (adding the "l" and changing the "x" to an "s").

    I can split each name into an array of letters and compare them easily enough, as in this jsFiddle, with the function below:

    function compareNames(){
        var oldName = $('#old').val().split("");
        var newName = $('#new').val().split("");
        var changeCount = 0;
        var testLength = 0;
        if(oldName.length > newName.length){
            testLength=oldName.length;    
        }
        else testLength=newName.length;
        for(var i=0;i<testLength;i++){
            if(oldName[i]!=newName[i]) {
               changeCount++;           
            }
        }
        alert(changeCount);
    }
    

    But how can I account for shifted letters not counting as changes?
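To make the problem concrete, here is a minimal sketch of the position-by-position comparison above with the jQuery lookups removed, so it can run anywhere. The name `positionalDiff` is made up for this illustration. It shows how a single inserted letter pushes every later letter out of alignment, so each shifted letter is counted as a change.

```javascript
// Hypothetical helper: the same index-by-index comparison as compareNames,
// taking the two strings as arguments instead of reading them from the page.
function positionalDiff(oldName, newName) {
    var testLength = Math.max(oldName.length, newName.length);
    var changeCount = 0;
    for (var i = 0; i < testLength; i++) {
        // Past the end of the shorter string, charAt returns "",
        // which never matches a real letter.
        if (oldName.charAt(i) !== newName.charAt(i)) {
            changeCount++;
        }
    }
    return changeCount;
}

console.log(positionalDiff("Alex", "Alexander")); // 5, as wanted
console.log(positionalDiff("Alex", "Allex"));     // 3, but only 1 is wanted
```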

    Update: here is how I got it working

    Levenshtein distance was exactly what I needed. Thanks to Peter!

    Working jsFiddle

    $(function () {
        $('#compare').click(function () {
            var oldName = $('.compare:eq(0)').val();
            var newName = $('.compare:eq(1)').val();
            var count = levDist(oldName, newName);
            $('#display').html('There are ' + count + ' differences present');
        });
    });
    
    function levDist(s, t) {
        var d = []; //2d matrix
    
        // Step 1
        var n = s.length;
        var m = t.length;
    
        if (n == 0) return m;
        if (m == 0) return n;
    
        //Create an array of arrays in javascript (a descending loop is quicker)
        for (var i = n; i >= 0; i--) d[i] = [];
    
        // Step 2
        for (var i = n; i >= 0; i--) d[i][0] = i;
        for (var j = m; j >= 0; j--) d[0][j] = j;
    
        // Step 3
        for (var i = 1; i <= n; i++) {
            var s_i = s.charAt(i - 1);
    
            // Step 4
            for (var j = 1; j <= m; j++) {
    
                var t_j = t.charAt(j - 1);
                var cost = (s_i == t_j) ? 0 : 1; // Step 5
    
                //Calculate the minimum
                var mi = d[i - 1][j] + 1;
                var b = d[i][j - 1] + 1;
                var c = d[i - 1][j - 1] + cost;
    
                if (b < mi) mi = b;
                if (c < mi) mi = c;
    
                d[i][j] = mi; // Step 6
    
                //Damerau transposition: a swap of two adjacent letters counts as one change, not two
                if (i > 1 && j > 1 && s_i == t.charAt(j - 2) && s.charAt(i - 2) == t_j) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + cost);
                }
            }
        }
        // Step 7
        return d[n][m];
    }

    <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js"></script>
    <input type="button" id="compare" value="Compare" /><br><br>
    <input type="text" id="old" class="compare" value="Alec" />
    <input type="text" id="new" class="compare" value="Alexander" />
    <br>
    <br>
    <span id="display"></span>

    Credit to James Westgate for the function:

    James's post showing the function

    Recommended answer

    I don't have a JavaScript implementation on hand per se, but you're doing something for which well-established algorithms exist. Specifically, I believe you're looking for the "Levenshtein distance" between two strings, i.e. the minimum number of insertions, substitutions, and deletions needed to turn one string into the other (assuming you treat a deletion as a change).

    The Wikipedia page for Levenshtein distance has various pseudocode implementations from which you could start, as well as references that may also help you.
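As a starting point, here is a minimal sketch of the textbook dynamic-programming form of Levenshtein distance in JavaScript. The function name `levenshtein` and the full-matrix layout are choices made for this illustration; they are not taken from the answer or from any particular library.

```javascript
// Sketch of plain Levenshtein distance (insertions, deletions, substitutions only).
// d[i][j] holds the distance between the first i characters of s
// and the first j characters of t.
function levenshtein(s, t) {
    var n = s.length;
    var m = t.length;
    var d = [];
    var i, j;

    // First column and first row: reaching or leaving the empty prefix
    // costs one deletion or one insertion per character.
    for (i = 0; i <= n; i++) {
        d[i] = [i];
    }
    for (j = 1; j <= m; j++) {
        d[0][j] = j;
    }

    for (i = 1; i <= n; i++) {
        for (j = 1; j <= m; j++) {
            var cost = s.charAt(i - 1) === t.charAt(j - 1) ? 0 : 1;
            d[i][j] = Math.min(
                d[i - 1][j] + 1,        // delete a character from s
                d[i][j - 1] + 1,        // insert a character into s
                d[i - 1][j - 1] + cost  // keep or substitute
            );
        }
    }
    return d[n][m];
}

console.log(levenshtein("Alec", "Alexander"));       // 6
console.log(levenshtein("Alex", "Allex"));           // 1
console.log(levenshtein("Alexander", "Allesander")); // 2
console.log(levenshtein("ab", "ba"));                // 2: a swap is two substitutions
```

This plain form counts a swap of two adjacent letters as two changes, which matches the rule stated in the question. The `levDist` function in the question's update includes a Damerau transposition step, so it counts such a swap as one change.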
