Youtube I.D parsing for new URL formats

2022-01-15 00:00:00 regex format youtube php

This question has been asked before, and I found this:

Regex for YouTube links

But I'm looking for something slightly different.

I need to match the Youtube I.D itself, in a way that is compatible with all the possible YouTube link formats, not only those beginning with youtube.com.

For example:

http://www.youtube.com/watch?v=-wtIMTCHWuI

http://www.youtube.com/v/-wtIMTCHWuI?version=3&autohide=1

http://youtu.be/-wtIMTCHWuI

http://www.youtube.com/oembed?url=http%3A//www.youtube.com/watch?v%3D-wtIMTCHWuI&format=json

http://s.ytimg.com/yt/favicon-wtIMTCHWuI.ico

http://i2.ytimg.com/vi/-wtIMTCHWuI/hqdefault.jpg

Is there a clever strategy I can use to match the video I.D -wtIMTCHWuI across all these formats? I'm thinking of character counting and matching the = ? / . & characters.

Recommended answer

I had to deal with this for a PHP class I wrote a few weeks ago and ended up with a regex that matches any kind of string: with or without a URL scheme, with or without a subdomain, youtube.com URL strings, youtu.be URL strings, and any ordering of the query parameters. You can check it out on GitHub or simply copy and paste the code block below:

/**
 *  Check if input string is a valid YouTube URL
 *  and try to extract the YouTube Video ID from it.
 *  @author  Stephan Schmitz <eyecatchup@gmail.com>
 *  @param   $url   string   The string that shall be checked.
 *  @return  mixed           Returns YouTube Video ID, or (boolean) false.
 */
function parse_yturl($url)
{
    $pattern = '#^(?:https?://|//)?(?:www\.|m\.)?(?:youtu\.be/|youtube\.com/(?:embed/|v/|watch\?v=|watch\?.+&v=))([\w-]{11})(?![\w-])#';
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}

Test cases: https://3v4l.org/GEDT0
JavaScript version: https://stackoverflow.com/a/10315969/624466
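
As a quick check, here is a minimal, self-contained sketch that runs the same pattern (with the regex metacharacters escaped) against some of the URLs from the question. The name parse_yturl_demo and the $samples list are only illustrative and are not part of the original class.

```php
<?php
// Illustrative copy of the pattern so this snippet runs on its own.
function parse_yturl_demo(string $url)
{
    $pattern = '#^(?:https?://|//)?(?:www\.|m\.)?'
             . '(?:youtu\.be/|youtube\.com/(?:embed/|v/|watch\?v=|watch\?.+&v=))'
             . '([\w-]{11})(?![\w-])#';
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $url, $matches) === 1 ? $matches[1] : false;
}

// Sample inputs taken from the question.
$samples = [
    'http://www.youtube.com/watch?v=-wtIMTCHWuI',
    'http://www.youtube.com/v/-wtIMTCHWuI?version=3&autohide=1',
    'http://youtu.be/-wtIMTCHWuI',
    'http://i2.ytimg.com/vi/-wtIMTCHWuI/hqdefault.jpg',
];

foreach ($samples as $url) {
    var_export(parse_yturl_demo($url));
    echo "  <- $url\n";
}
```

The first three URLs yield -wtIMTCHWuI. The ytimg.com thumbnail URL yields false, because the pattern is anchored at the start of the string and only accepts youtube.com and youtu.be hosts.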

To explain the regex, here's a split-up version:

/**
 *  Check if input string is a valid YouTube URL
 *  and try to extract the YouTube Video ID from it.
 *  @author  Stephan Schmitz <eyecatchup@gmail.com>
 *  @param   $url   string   The string that shall be checked.
 *  @return  mixed           Returns YouTube Video ID, or (boolean) false.
 */
function parse_yturl($url)
{
    $pattern = '#^(?:https?://|//)?' # Optional URL scheme. Either http, or https, or protocol-relative.
             . '(?:www\.|m\.)?'      #  Optional www or m subdomain.
             . '(?:'                 #  Group host alternatives:
             .   'youtu\.be/'        #    Either youtu.be,
             .   '|youtube\.com/'    #    or youtube.com
             .     '(?:'             #    Group path alternatives:
             .       'embed/'        #      Either /embed/,
             .       '|v/'           #      or /v/,
             .       '|watch\?v='    #      or /watch?v=,
             .       '|watch\?.+&v=' #      or /watch?other_param&v=
             .     ')'               #    End path alternatives.
             . ')'                   #  End host alternatives.
             . '([\w-]{11})'         # 11 characters (Length of Youtube video ids).
             . '(?![\w-])#';         # Rejects if overlong id.
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}
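
The question also lists an oembed URL with a percent-encoded inner URL and an i2.ytimg.com thumbnail URL, which the anchored pattern does not accept. One possible way to cover them, sketched here under my own assumptions and not taken from the answer's class, is to percent-decode the input and search for the ID after a small set of known markers. The function name find_ytid_loose and the marker list are hypothetical. Because the pattern is unanchored, it can also report an ID for non-YouTube URLs that happen to carry an 11-character v= parameter.

```php
<?php
// Hypothetical helper, not part of the original answer: a looser,
// unanchored search for an 11-character ID after known markers.
function find_ytid_loose(string $input)
{
    // Decode percent-escapes so "v%3D" in an oembed URL becomes "v=".
    // rawurldecode leaves "+" untouched, unlike urldecode.
    $decoded = rawurldecode($input);

    $pattern = '#(?:youtu\.be/'                 // short links
             . '|youtube\.com/(?:embed/|v/)'    // /embed/ and /v/ paths
             . '|ytimg\.com/vi/'                // thumbnail paths (assumed marker)
             . '|[?&]v=)'                       // v= query parameter anywhere
             . '([\w-]{11})(?![\w-])#';         // 11-character ID, not overlong

    return preg_match($pattern, $decoded, $m) === 1 ? $m[1] : false;
}

var_export(find_ytid_loose(
    'http://www.youtube.com/oembed?url=http%3A//www.youtube.com/watch?v%3D-wtIMTCHWuI&format=json'
));
echo "\n";
var_export(find_ytid_loose('http://i2.ytimg.com/vi/-wtIMTCHWuI/hqdefault.jpg'));
echo "\n";
```

Both calls print '-wtIMTCHWuI'. The favicon URL from the question is not handled by this sketch and returns false.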
