Validating YouTube URLs with a regular expression
I'm trying to validate YouTube URLs for my application.
So far I have the following:
// Set the YouTube URL
$youtube_url = "www.youtube.com/watch?v=vpfzjcCzdtCk";
if (preg_match("/((http\:\/\/){0,}(www\.){0,}(youtube\.com){1} || (youtu\.be){1}(\/watch\?v\=[^\s]){1})/", $youtube_url) == 1)
{
    echo "Valid";
}
else
{
    echo "Invalid";
}
I wish to validate the following variations of YouTube URLs:
- With and without http://
- With and without www.
- With the URLs youtube.com and youtu.be
- Must have /watch?v=
- Must have the unique video string (in the example above, "vpfzjcCzdtCk")
However, I don't think I've got my logic right, because for some reason it returns true for: www.youtube.co/watch?v=vpfzjcCzdtCk (notice I've written it incorrectly, with .co and not .com).
Recommended answer
There are a lot of redundancies in this regular expression of yours (and also, the leaning toothpick syndrome). This, though, should produce results:
$rx = '~
^(?:https?://)? # Optional protocol
(?:www[.])? # Optional sub-domain
(?:youtube[.]com/watch[?]v=|youtu[.]be/) # Mandatory domain name (w/ query string in .com)
([^&]{11}) # Video id of 11 characters as capture group 1
~x';
$has_match = preg_match($rx, $url, $matches);
// if matching succeeded, $matches[1] would contain the video ID
Some notes:
- Use the tilde character ~ as the delimiter, to avoid LTS.
- Use [.] instead of \. to improve visual legibility and avoid LTS. ("Special" characters, such as the dot ., have no effect in character classes, i.e. within square brackets.)
- To make regular expressions more "readable", you can use the x modifier (which has further implications; see the docs on Pattern modifiers), which also allows for comments in regular expressions.
- Capturing can be suppressed using non-capturing groups: (?: <pattern> ). This makes the expression more efficient.
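To see how the pattern behaves on the variations from the question, here is a minimal sketch that wraps it in a function. The name youtube_id_from_url and the 11-character ID abcdefghijk are made up for illustration and are not part of the original answer. Note that the pattern is not anchored at the end, so it captures only the first 11 characters after the prefix.

```php
<?php
// Hypothetical helper (not from the original answer): returns the captured
// video ID, or null when the URL does not match the pattern.
function youtube_id_from_url(string $url): ?string
{
    $rx = '~
      ^(?:https?://)?                           # Optional protocol
       (?:www[.])?                              # Optional sub-domain
       (?:youtube[.]com/watch[?]v=|youtu[.]be/) # Mandatory domain name
       ([^&]{11})                               # Video ID as capture group 1
        ~x';

    return preg_match($rx, $url, $matches) === 1 ? $matches[1] : null;
}

// "abcdefghijk" is an invented 11-character ID used only for this demo.
var_dump(youtube_id_from_url('http://www.youtube.com/watch?v=abcdefghijk')); // string(11) "abcdefghijk"
var_dump(youtube_id_from_url('youtu.be/abcdefghijk'));                       // string(11) "abcdefghijk"
var_dump(youtube_id_from_url('www.youtube.co/watch?v=abcdefghijk'));         // NULL (.co is rejected)
```

Because the domain is spelled out literally with [.] and the whole alternation sits in one group, the mistyped .co host from the question no longer matches.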
Optionally, to extract values from a (more or less complete) URL, you might want to make use of parse_url():
$url = 'http://youtube.com/watch?v=VIDEOID';
$parts = parse_url($url);
print_r($parts);
Output:
Array
(
[scheme] => http
[host] => youtube.com
[path] => /watch
[query] => v=VIDEOID
)
Validating the domain name and extracting the video ID is left as an exercise to the reader.
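For readers who want a starting point, one possible way to approach that exercise is sketched below. It is only a sketch under some assumptions: the function name youtube_id_via_parse_url is invented, a scheme is prepended when missing (parse_url() does not report a host for a bare www.youtube.com/... string), PHP 7.1 or later is assumed, and unlike the regular expression above it does not check the length of the ID.

```php
<?php
// Hypothetical helper, not part of the original answer.
function youtube_id_via_parse_url(string $url): ?string
{
    // Assumption: inputs without a scheme are bare host/path strings,
    // so prepend one to let parse_url() identify the host.
    if (!preg_match('~^https?://~i', $url)) {
        $url = 'http://' . $url;
    }

    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }

    // Normalise the host: lower-case it and drop an optional "www." prefix.
    $host = strtolower($parts['host']);
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }

    if ($host === 'youtube.com') {
        // Long form: the path must be /watch and the ID is in the v parameter.
        if (($parts['path'] ?? '') !== '/watch') {
            return null;
        }
        parse_str($parts['query'] ?? '', $query);
        $id = $query['v'] ?? null;
        return is_string($id) && $id !== '' ? $id : null;
    }

    if ($host === 'youtu.be') {
        // Short form: the ID is the path itself.
        $id = ltrim($parts['path'] ?? '', '/');
        return $id !== '' ? $id : null;
    }

    return null; // Any other host, including the mistyped youtube.co
}

var_dump(youtube_id_via_parse_url('http://youtube.com/watch?v=VIDEOID')); // string(7) "VIDEOID"
```

The trade-off compared with the single regular expression is more code in exchange for explicit, separately testable checks on host, path, and query string.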
I gave in to the comment war below; thanks to Toni Oriol, the regular expression now works on short (youtu.be) URLs as well.