Converting window.open(hyperlink) JavaScript code to a plain absolute URL with Java

2022-01-09 00:00:00 onclick javascript java jsoup window.open

I am using the Java Jsoup library to extract some hyperlinks from a website:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// For each table, print the onClick value of the first matching input that has one.
Document doc = Jsoup.connect("http://www.saudisale.com/SS_a_mpg.aspx").get();

for (Element table : doc.select("table")) {
    System.out.println(table.select("tbody").select("tr").select("td").select("input").attr("onClick"));
}

Sample output:

window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dyaralez.html ','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dyaralez.html ','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dalel.html','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('http://ads.saudisale.com/dalel.html','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('SS_a_car.aspx?carid=37240','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');
window.open('SS_a_car.aspx?carid=37240','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');

Since Jsoup does not execute JavaScript, I have to write some Java code by hand to convert the window.open(hyperlink) JavaScript call into an absolute hyperlink.

For example, the following JavaScript output has to be converted from

window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode=1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1')

to: http://saudisale.com/arPrivatePage.aspx?id=21871638

window.open('SS_a_car.aspx?carid=37149','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1'); 

to: http://www.saudisale.com/SS_a_car.aspx?carid=37149

Could someone guide me on how to accomplish this task in Java?

Recommended answer

Use a regex. This will do what you want:

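import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "window.open('http://saudisale.com/arPrivatePage.aspx?id=21871638','_blank','channelmode =1,scrollbars=1,status=0,titlebar=0,toolbar=0,resizable=1');";

// Group 1 captures the first argument of window.open(...), without its quotes.
// Backslashes and double quotes must be escaped inside a Java string literal.
String regex = "window\\.open\\(['\"]*(.*?)(\\s*['\"]*,.*?)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    // Read the captured URL directly instead of re-applying the regex.
    String output = matcher.group(1);
    System.out.println(output);
}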

Your last two URLs are relative, so you have to convert them to absolute URLs by resolving them against the URL of the page they came from.
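The resolution step can be sketched with the standard java.net.URI class. This is only a minimal sketch, not part of the original answer: the class name WindowOpenResolver and the method toAbsoluteUrl are hypothetical, the regex is a slightly stricter variant of the one above, and the base URL is assumed to be the page that was fetched with Jsoup.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WindowOpenResolver {
    // Captures the first quoted argument of window.open(...),
    // excluding the quotes and any whitespace just inside them.
    private static final Pattern WINDOW_OPEN =
            Pattern.compile("window\\.open\\(\\s*['\"]\\s*(.*?)\\s*['\"]");

    /** Returns the absolute URL, or null when the text contains no window.open call. */
    public static String toAbsoluteUrl(String baseUrl, String onclick) {
        Matcher m = WINDOW_OPEN.matcher(onclick);
        if (!m.find()) {
            return null;
        }
        // URI.resolve returns absolute URLs unchanged and resolves relative ones
        // against the base. URI.create throws IllegalArgumentException if the
        // captured text is not a syntactically valid URI.
        return URI.create(baseUrl).resolve(m.group(1)).toString();
    }

    public static void main(String[] args) {
        // Assumption: the base is the page the onclick attributes were scraped from.
        String base = "http://www.saudisale.com/SS_a_mpg.aspx";

        // prints http://www.saudisale.com/SS_a_car.aspx?carid=37149
        System.out.println(toAbsoluteUrl(base,
                "window.open('SS_a_car.aspx?carid=37149','_blank','channelmode =1');"));

        // prints http://ads.saudisale.com/dyaralez.html
        System.out.println(toAbsoluteUrl(base,
                "window.open('http://ads.saudisale.com/dyaralez.html ','_blank','channelmode =1');"));
    }
}
```

Returning null for non-matching input keeps the caller in charge of skipping inputs whose onclick attribute is empty or does something other than open a window.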
