168

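I'm trying to find a relatively simple and reliable way to extract the base URL from a string variable using JavaScript (or jQuery).

For example, given something like:

http://www.sitename.com/article/2009/09/14/this-is-an-article/

I'd like to get:

http://www.sitename.com/

Is a regular expression the best bet? If so, what statement could I use to assign the base URL extracted from a given string to a new variable?

I've done some searching on this, but everything I find in the JavaScript world seems to revolve around gathering this information from the actual document URL using location.host or similar.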

— Bungle
source

The answer nowadays should be the one below.

— davidmpaz

205

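Edit: Some people complained that it doesn't take the protocol into account, so I decided to update the code, since it is marked as the answer. For those who like one-line code... sorry, that is why we use code minifiers. Code should be human readable, and this way is better, in my opinion.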

var pathArray = "https://somedomain.com".split( '/' );
var protocol = pathArray[0];
var host = pathArray[2];
var url = protocol + '//' + host;

Or use David's solution from below.
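Since the question is about an arbitrary string rather than the document URL, the same split can be wrapped in a small function. This is only a minimal sketch: the name baseUrlBySplit and the null return for inputs without a "scheme://host" prefix are assumptions made for illustration, not part of the answer above.

```javascript
// Hypothetical helper: split-based base URL extraction for an absolute URL string.
function baseUrlBySplit(urlString) {
  // 'http://host/path' splits into ['http:', '', 'host', 'path', ...]
  const parts = urlString.split('/');
  if (parts.length < 3 || parts[1] !== '' || parts[2] === '') {
    return null; // no "scheme://host" prefix, so index 2 is not the host
  }
  return parts[0] + '//' + parts[2] + '/';
}

console.log(baseUrlBySplit('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// → http://www.sitename.com/
```

The guard matters because, without the protocol prefix, the host would sit at index 0 instead of index 2.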

— 伊察
source

6

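Thanks for the reply, but again, I'm trying to extract the base URL from a string, not from the actual document URL. I don't think this will help me, though please correct me if I'm wrong.

— Bungle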

2

pathArray = String("YourHost.com/url/nic/or/not").split('/'); host = pathArray[2];

4

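Got it, thanks Rafal and daddywoodland! I ended up using: url = 'sitename.com/article/2009/09/14/this-is-an-article'; pathArray = (url).split('/'); host = 'http://' + pathArray[2]; I think Rafal's example just omitted the "http://" that is present in all of the strings I'm processing, in which case pathArray[2] is the one you need. Without the "http://" prefix, pathArray[0] would be the one. Thanks again.

— Bungle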

4

Why all the variable declarations? url = 'sitename.com/article/2009/09/14/this-is-an-article'; newurl = 'http://' + url.split('/')[0];

— ErikE, 2010

1

pathArray = window.location.href.split('/'); protocol = pathArray[0]; host = pathArray[2]; url = protocol + '://' + host; // now url === "http:://stackoverflow.com", check out the ::

154

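WebKit-based browsers, Firefox as of version 21, and current versions of Internet Explorer (IE 10 and 11) implement location.origin.

location.origin includes the protocol, the domain and, optionally, the port of the URL.

For example, location.origin of the URL http://www.sitename.com/article/2009/09/14/this-is-an-article/ is http://www.sitename.com.

To target browsers without support for location.origin, use the following concise polyfill: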

if (typeof location.origin === 'undefined')
    location.origin = location.protocol + '//' + location.host;
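The same concatenation can be applied to a string if it is first parsed into something that exposes protocol and host. The sketch below assumes an environment with the standard URL constructor; the helper name originOf is hypothetical.

```javascript
// Hypothetical helper: the polyfill's concatenation, for any object with protocol and host.
function originOf(loc) {
  return loc.protocol + '//' + loc.host;
}

// A URL object exposes the same protocol/host/hostname properties as location.
const parsed = new URL('http://www.sitename.com:8080/article/2009/09/14/this-is-an-article/');

console.log(originOf(parsed));  // → http://www.sitename.com:8080
console.log(parsed.hostname);   // → www.sitename.com (hostname drops the port)
```

Using host rather than hostname is what keeps a non-default port in the result.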

— David
source

36

window.location.hostname will drop the port number if one is given, so use window.location.host. The complete "base name", including the trailing slash, would then be: window.location.protocol+"//"+window.location.host + "/";

— sroebuck, 2011

4

Actually, window.location.hostname is still useful if, like me, you need to supply a different port number.

— Darrell Brogdon, 2012

44

There is no need to use jQuery, just use

location.hostname

— daddywoodland
source

5

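Thanks, but I can't use that with a string, can I? My understanding is that it only works with the document URL.

— Bungle

2

This will not include the protocol and port.

— David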

32

There is no reason to do splits to get the path, hostname, etc. from a string that is a link. You just need to use a link

//create a new element link with your link
var a = document.createElement("a");
a.href="http://www.sitename.com/article/2009/09/14/this-is-an-article/";

//hide it from view when it is added
a.style.display="none";

//add it
document.body.appendChild(a);

//read the links "features"
alert(a.protocol);
alert(a.hostname)
alert(a.pathname)
alert(a.port);
alert(a.hash);

//remove it
document.body.removeChild(a);

You can easily do the same with jQuery by appending the element and reading its attr.

— epascarello
source

6

Why add 50K of jQuery when you've just shown how to do it without jQuery in a few bytes?

— Tim Down

13

Because the poster says they are using jQuery.

— epascarello

1

Ah yes, fair enough. Although when it's this simple, I see no value in the extra layer of abstraction that jQuery would add.

— Tim Down

2

In that case we're assuming the whole site runs on jQuery, and jQuery would indeed simplify things.

— trusktr, 2011

2

Ewww... this is not the best way to do this. If you are extracting from window.location.href, use window.location. Otherwise, use a regular expression.

— BMiner, 2011

21

var host = location.protocol + '//' + location.host + '/';

— ta
source

2

This should be considered the correct answer, since it keeps the protocol.

— Katai

16

String.prototype.url = function() {
  const a = $('<a />').attr('href', this)[0];
  // or if you are not using jQuery 👇🏻
  // const a = document.createElement('a'); a.setAttribute('href', this);
  let origin = a.protocol + '//' + a.hostname;
  if (a.port.length > 0) {
    origin = `${origin}:${a.port}`;
  }
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  return {origin, host, hostname, pathname, port, protocol, search, hash};

}

Then:

'http://mysite:5050/pke45#23'.url()
 //OUTPUT : {host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050", protocol: "http:",hash:"#23",origin:"http://mysite:5050"}

For your request, you need:

 'http://mysite:5050/pke45#23'.url().origin

Revision 07-2017: it can also be more elegant and have more features

const parseUrl = (string, prop) =>  {
  const a = document.createElement('a'); 
  a.setAttribute('href', string);
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  const origin = `${protocol}//${hostname}${port.length ? `:${port}`:''}`;
  const parts = {origin, host, hostname, pathname, port, protocol, search, hash};
  // look the requested property up directly instead of passing it to eval
  return prop ? parts[prop] : parts;
}

Then

parseUrl('http://mysite:5050/pke45#23')
// {origin: "http://mysite:5050", host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050"…}


parseUrl('http://mysite:5050/pke45#23', 'origin')
// "http://mysite:5050"

Cool!
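For environments that have no DOM, or where creating an anchor element is undesirable, the same shape of function can be sketched on top of the standard URL constructor. This is an assumption-laden variant rather than the code above: the name parseUrlString is hypothetical, and unlike the anchor approach it throws a TypeError on strings that are not absolute URLs.

```javascript
// Hypothetical variant of parseUrl that uses the URL constructor instead of an <a> element.
function parseUrlString(string, prop) {
  // new URL() throws a TypeError if the string is not an absolute URL
  const {origin, host, hostname, pathname, port, protocol, search, hash} = new URL(string);
  const parts = {origin, host, hostname, pathname, port, protocol, search, hash};
  // return one named part if requested, otherwise the whole object
  return prop ? parts[prop] : parts;
}

console.log(parseUrlString('http://mysite:5050/pke45#23', 'origin'));
// → http://mysite:5050
```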

— 阿登努尔·图米
source

12

If you are using jQuery, this is a nice way to manipulate elements in JavaScript without adding them to the DOM:

var myAnchor = $("<a />");

//set href    
myAnchor.attr('href', 'http://example.com/path/to/myfile')

//your link's features
// note: jQuery 1.6 and later need .prop() instead of .attr() to read these DOM properties
var hostname = myAnchor.attr('hostname'); // example.com
var pathname = myAnchor.attr('pathname'); // /path/to/myfile
//...etc

— 韦恩
source

1

I think it should be myAnchor.prop('hostname'). I'm guessing jQuery has changed in the past 5 years... Thanks for the answer!

— 德利, 2015

11

Douglas Crockford's regexp rule is a convenient and complete way to get the base value from the string representation of a URL:

var yourUrl = "http://www.sitename.com/article/2009/09/14/this-is-an-article/";
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
var parts = parse_url.exec( yourUrl );
var result = parts[1]+':'+parts[2]+parts[3]+'/' ;
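The result above drops the port, because only groups 1 to 3 are used. As a rough sketch of how the capture groups map to URL parts, the following variant also uses group 4 (the port) and guards against a failed match. The names PARSE_URL and baseFromRegex are hypothetical.

```javascript
// Same regular expression as above; groups: 1 scheme, 2 slashes, 3 host, 4 port,
// 5 path, 6 query, 7 hash.
const PARSE_URL = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

// Hypothetical helper: rebuild the base URL, keeping the port when one is present.
function baseFromRegex(urlString) {
  const parts = PARSE_URL.exec(urlString);
  if (!parts) return null; // exec returns null when the string does not match
  const [, scheme, slash, host, port] = parts;
  return (scheme ? scheme + ':' : '') + slash + host + (port ? ':' + port : '') + '/';
}

console.log(baseFromRegex('http://www.sitename.com:8080/a/b?x=1#top'));
// → http://www.sitename.com:8080/
```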

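If you are looking for a more powerful URL manipulation toolkit, try URI.js. It supports getters, setters, URL normalization and more, all with a nice chainable API.

If you are looking for a jQuery plugin, then jquery.url.js should help you.

A simpler way is to use an anchor element, as @epascarello suggested. This has the disadvantage that you have to create a DOM element. However, the element can be cached in a closure and reused for multiple URLs: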

var parseUrl = (function () {
  var a = document.createElement('a');
  return function (url) {
    a.href = url;
    return {
      host: a.host,
      hostname: a.hostname,
      pathname: a.pathname,
      port: a.port,
      protocol: a.protocol,
      search: a.search,
      hash: a.hash
    };
  }
})();

Use it like this:

parseUrl('http://google.com');

— 亚历山大·托普利采努
source

10

Well, the URL API object avoids splitting and constructing the URL manually.

 let url = new URL('https://stackoverflow.com/questions/1420881');
 alert(url.origin);
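One thing worth knowing about the URL constructor is that it throws a TypeError for strings that are not absolute URLs, unless a base is supplied as the second argument. A minimal sketch of a tolerant wrapper follows; the name originOrNull and the choice to return null on failure are assumptions for illustration.

```javascript
// Hypothetical wrapper: returns the origin, or null if the input cannot be parsed.
function originOrNull(input, base) {
  try {
    // base is optional; it is only needed to resolve relative inputs
    return new URL(input, base).origin;
  } catch (err) {
    return null; // relative input without a base, or otherwise invalid
  }
}

console.log(originOrNull('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// → http://www.sitename.com
console.log(originOrNull('/my/path'));                        // → null
console.log(originOrNull('/my/path', 'https://example.com')); // → https://example.com
```

Note that origin has no trailing slash, so append "/" if the form shown in the question is needed.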

— devansvd
source

8

If you are extracting information from window.location.href (the address bar), use this code to get http://www.sitename.com/:

var loc = location;
var url = loc.protocol + "//" + loc.host + "/";

If you have a string, str, that is an arbitrary URL (not window.location.href), use a regular expression:

var url = str.match(/^(([a-z]+:)?(\/\/)?[^\/]+\/).*$/)[1];

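I, like everyone in the universe, hate reading regular expressions, so I'll break it down in English:

Find one or more lowercase letters followed by a colon (the protocol, which can be omitted)
Followed by // (can also be omitted)
Followed by any characters except / (the hostname and port)
Followed by /
Followed by whatever (the path, less the beginning /).

No need to create DOM elements or do anything crazy.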
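The breakdown above can be tried on a few inputs. The sketch below wraps the same regular expression in a function with a null check, since match returns null when nothing matches; the names BASE_RE and baseUrlByRegex are hypothetical.

```javascript
// Same pattern as in the answer: optional protocol, optional //, host (and port), then /.
const BASE_RE = /^(([a-z]+:)?(\/\/)?[^\/]+\/).*$/;

// Hypothetical helper: returns capture group 1, or null when the pattern does not match.
function baseUrlByRegex(str) {
  const m = str.match(BASE_RE);
  return m ? m[1] : null;
}

console.log(baseUrlByRegex('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// → http://www.sitename.com/
console.log(baseUrlByRegex('//www.sitename.com/article/')); // → //www.sitename.com/
console.log(baseUrlByRegex('www.sitename.com/article/'));   // → www.sitename.com/
console.log(baseUrlByRegex('http://www.sitename.com'));     // → http:/
```

The last call shows an edge case: because both the protocol and the // are optional, a URL with no slash after the host still matches, but the captured "host" is the protocol and the result is http:/. Inputs should therefore carry a slash after the host, or be checked separately.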

— BMiner
source

7

I use a simple regex that extracts the host from the URL:

function get_host(url){
    return url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1');
}

Use it like this:

var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/'
var host = get_host(url);

Note that if the url does not end with a /, the host will not end with a / either.

Here are some tests:

describe('get_host', function(){
    it('should return the host', function(){
        var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://www.sitename.com/');
    });
    it('should not have a / if the url has no /', function(){
        var url = 'http://www.sitename.com';
        assert.equal(get_host(url),'http://www.sitename.com');
    });
    it('should deal with https', function(){
        var url = 'https://www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'https://www.sitename.com/');
    });
    it('should deal with no protocol urls', function(){
        var url = '//www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'//www.sitename.com/');
    });
    it('should deal with ports', function(){
        var url = 'http://www.sitename.com:8080/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://www.sitename.com:8080/');
    });
    it('should deal with localhost', function(){
        var url = 'http://localhost/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://localhost/');
    });
    it('should deal with numeric ip', function(){
        var url = 'http://192.168.18.1/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://192.168.18.1/');
    });
});

— 迈克尔·沙夫
source

6

You can use the code below to get the different parts of the current URL:

alert("document.URL : "+document.URL);
alert("document.location.href : "+document.location.href);
alert("document.location.origin : "+document.location.origin);
alert("document.location.hostname : "+document.location.hostname);
alert("document.location.host : "+document.location.host);
alert("document.location.pathname : "+document.location.pathname);

— Nimesh07
source

4

function getBaseURL() {
    var url = location.href;  // entire url including querystring - also: window.location.href;
    var baseURL = url.substring(0, url.indexOf('/', 14));


    if (baseURL.indexOf('http://localhost') != -1) {
        // Base Url for localhost
        var url = location.href;  // window.location.href;
        var pathname = location.pathname;  // window.location.pathname;
        var index1 = url.indexOf(pathname);
        var index2 = url.indexOf("/", index1 + 1);
        var baseLocalUrl = url.substr(0, index2);

        return baseLocalUrl + "/";
    }
    else {
        // Root Url for domain name
        return baseURL + "/";
    }

}

Then you can use it like this...

var str = 'http://en.wikipedia.org/wiki/Knopf?q=1&t=2';
var url = str.toUrl();

The value of url will be...

{
"original":"http://en.wikipedia.org/wiki/Knopf?q=1&t=2",
"protocol":"http:",
"domain":"wikipedia.org",
"host":"en.wikipedia.org",
"relativePath":"wiki"
}

"var url" also contains two methods.

var paramQ = url.getParameter('q');

In this case, the value of paramQ will be 1.

var allParameters = url.getParameters();

The value of allParameters will be the parameter names only.

["q","t"]

Tested on IE, Chrome and Firefox.
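The toUrl method used above is not defined anywhere in this answer. As a rough stand-in, and only as a sketch, the same parameter lookups can be done with the standard URL and URLSearchParams APIs. This assumes an environment that provides them, which excludes Internet Explorer.

```javascript
// Sketch using the standard URL API instead of the undefined toUrl() helper.
const url = new URL('http://en.wikipedia.org/wiki/Knopf?q=1&t=2');

// Value of a single query parameter (always returned as a string)
const paramQ = url.searchParams.get('q');

// Names of all query parameters, in order of appearance
const allParameters = [...url.searchParams.keys()];

console.log(url.host);      // → en.wikipedia.org
console.log(paramQ);        // → 1
console.log(allParameters); // → [ 'q', 't' ]
```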

— 赛克
source

1

I think I'm missing something... where does toUrl come from?

— thomasf1

3

Instead of having to account for window.location.protocol and window.location.origin, a possibly missing port number and so on, just grab everything up to the third "/":

// get nth occurrence of a character c in the calling string
String.prototype.nthIndex = function (n, c) {
    var index = -1;
    while (n-- > 0) {
        index++;
        if (this.substring(index) == "") return -1; // don't run off the end
        index += this.substring(index).indexOf(c);
    }
    return index;
}

// get the base URL of the current page by taking everything up to the third "/" in the URL
function getBaseURL() {
    return document.URL.substring(0, document.URL.nthIndex(3,"/") + 1);
}
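The same idea can be applied to an arbitrary string without extending String.prototype. The following is a minimal sketch; the name baseUpToThirdSlash and the null return for strings with fewer than three slashes are assumptions.

```javascript
// Hypothetical helper: everything up to and including the third "/" of a URL string.
function baseUpToThirdSlash(urlString) {
  let index = -1;
  for (let i = 0; i < 3; i++) {
    // search for the next "/" after the previous one
    index = urlString.indexOf('/', index + 1);
    if (index === -1) return null; // fewer than three slashes
  }
  return urlString.substring(0, index + 1);
}

console.log(baseUpToThirdSlash('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// → http://www.sitename.com/
```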

— 索瓦
source

2

This works:

location.href.split(location.pathname)[0];

— Alain Beauvois
source

1

Fails in the case where location.pathname = '/'
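The failure is easy to reproduce with plain strings standing in for location.href and location.pathname. In this sketch the function name baseBySplittingOnPath is hypothetical.

```javascript
// Hypothetical stand-in for location.href.split(location.pathname)[0]
function baseBySplittingOnPath(href, pathname) {
  return href.split(pathname)[0];
}

console.log(baseBySplittingOnPath('http://example.com/a/b', '/a/b')); // → http://example.com
console.log(baseBySplittingOnPath('http://example.com/', '/'));       // → http:
```

With a pathname of "/", the split happens at the first slash of "//", so only "http:" is left.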

— Mido

1

You can do it using a regex:

/(http:\/\/)?(www)[^\/]+\//i

Does it fit?

— Clement Herreman
source

1

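Hmm, from my limited regex skills, it looks like that's at least close. I'll add some more information to the question to see if I can help narrow down the best regex.

— Bungle

1

I ended up using .split('/') on the string, just because it was an easier solution for me. Thanks for your help, though!

— Bungle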

2

What about https URLs? Host names that don't start with www? And why capture the www anyway?
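One way to address those objections is to accept both http and https and to stop requiring www. This is a sketch under those assumptions only, not the answerer's regex; the name baseUrlAnyHost is hypothetical.

```javascript
// Hypothetical helper: scheme (http or https) plus host, with a trailing slash appended.
function baseUrlAnyHost(str) {
  // host = everything after "//" up to the first "/", "?" or "#"
  const match = /^https?:\/\/[^\/?#]+/i.exec(str);
  return match ? match[0] + '/' : null;
}

console.log(baseUrlAnyHost('https://example.org/page'));                // → https://example.org/
console.log(baseUrlAnyHost('http://www.sitename.com/article/2009/'));   // → http://www.sitename.com/
console.log(baseUrlAnyHost('ftp://example.org/'));                      // → null
```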

— Tim Down

1

I don't know, the OP asked how to capture the URL, and his example has http and www.

— Clement Herreman

1

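To get the origin of any URL, whether it is a path within the website (/my/path), schemeless (//example.com/my/path) or full (http://example.com/my/path), I put together a quick function.

In the snippet below, all three calls should log https://stacksnippets.net.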

function getOrigin(url)
{
  if(/^\/\//.test(url))
  { // no scheme, use current scheme, extract domain
    url = window.location.protocol + url;
  }
  else if(/^\//.test(url))
  { // just path, use whole origin
    url = window.location.origin + url;
  }
  return url.match(/^([^/]+\/\/[^/]+)/)[0];
}

console.log(getOrigin('https://stacksnippets.net/my/path'));
console.log(getOrigin('//stacksnippets.net/my/path'));
console.log(getOrigin('/my/path'));


— 汤姆·凯
source

0

This worked for me:

var getBaseUrl = function (url) {
  if (url) {
    var parts = url.split('://');
    
    if (parts.length > 1) {
      return parts[0] + '://' + parts[1].split('/')[0] + '/';
    } else {
      return parts[0].split('/')[0] + '/';
    }
  }
};


— 阿贝拉贝斯纳比
source

0

var tilllastbackslashregex = new RegExp(/^.*\//);
baseUrl = tilllastbackslashregex.exec(window.location.href);

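window.location.href gives the current URL from the browser's address bar.

It can be anything like https://stackoverflow.com/abc/xyz or https://www.google.com/search?q=abc. tilllastbackslashregex.exec() runs the regex and returns the matched string up to the last slash (a forward slash, despite the variable name), i.e. https://stackoverflow.com/abc/ or https://www.google.com/ respectively.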
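The behaviour described above can be checked on plain strings. Note that exec returns an array (or null), so the matched text is at index 0. The function name upToLastSlash in this sketch is hypothetical.

```javascript
// Hypothetical helper: the part of a URL string up to and including the last "/".
function upToLastSlash(href) {
  const match = /^.*\//.exec(href);
  return match ? match[0] : null; // exec returns an array, or null if there is no "/"
}

console.log(upToLastSlash('https://stackoverflow.com/abc/xyz'));     // → https://stackoverflow.com/abc/
console.log(upToLastSlash('https://www.google.com/search?q=abc'));   // → https://www.google.com/
console.log(upToLastSlash('https://www.google.com/search?q=a/b'));   // → https://www.google.com/search?q=a/
```

As the first and last calls show, this returns the directory of the current path rather than the site root, and a slash inside the query string extends the match.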

— 哈西卜·乌拉·汗
source

5

Please add a short explanation.

— Preet

6

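From the review queue: may I ask you to add some context around your source code? Code-only answers are difficult to understand. It will help both the asker and future readers if you can add more information to your post.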

— RBT

0

A good way is to use the native JavaScript URL API object. It provides many useful parts of the URL.

For example:

const url = 'https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript'

const urlObject = new URL(url);

console.log(urlObject);


// RESULT: 
//________________________________
// hash: "",
// host: "stackoverflow.com",
// hostname: "stackoverflow.com",
// href: "https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
// origin: "https://stackoverflow.com",
// password: "",
// pathname: "/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
// port: "",
// protocol: "https:",
// search: "",
// searchParams: [object URLSearchParams]
// ... + some other methods

As you can see, you can access whatever you need.

For example: console.log(urlObject.host); // "stackoverflow.com"

URL documentation

— 五桑伯
source

How to extract the base URL from a string in JavaScript?
