URL
Added in: v0.10.0
Source Code: lib/url.js
The node:url module provides utilities for URL resolution and parsing. It can
be accessed using:
MJS
CJS
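As a minimal sketch of the CommonJS form (the ES module equivalent would be `import url from 'node:url';`), the module can be loaded and a couple of its exports inspected:

```js
// CommonJS: load the built-in module through its node: specifier.
const url = require('node:url');

// The WHATWG URL class is exported by the module and is also a global.
console.log(url.URL === globalThis.URL); // true
console.log(typeof url.fileURLToPath);   // function
```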
URL strings and URL objects
A URL string is a structured string containing multiple meaningful components. When parsed, a URL object is returned containing properties for each of these components.
The node:url module provides two APIs for working with URLs: a legacy API that
is Node.js specific, and a newer API that implements the same
WHATWG URL Standard used by web browsers.
A comparison between the WHATWG and Legacy APIs is provided below. Above the URL
'https://user:pass@sub.example.com:8080/p/a/t/h?query=string#hash', properties
of an object returned by the legacy url.parse() are shown. Below it are
properties of a WHATWG URL object.
WHATWG URL's origin property includes protocol and host, but not
username or password.
TEXT
Parsing the URL string using the WHATWG API:
JS
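A short sketch of WHATWG parsing, using the example URL from the comparison above; the variable name `myURL` is only illustrative:

```js
const myURL = new URL(
  'https://user:pass@sub.example.com:8080/p/a/t/h?query=string#hash');

console.log(myURL.origin);   // https://sub.example.com:8080 (no credentials)
console.log(myURL.username); // user
console.log(myURL.host);     // sub.example.com:8080
console.log(myURL.hostname); // sub.example.com
console.log(myURL.pathname); // /p/a/t/h
console.log(myURL.search);   // ?query=string
console.log(myURL.hash);     // #hash
```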
Parsing the URL string using the Legacy API:
MJS
CJS
Constructing a URL from component parts and getting the constructed string
It is possible to construct a WHATWG URL from component parts using either the property setters or a template literal string:
JS
JS
To get the constructed URL string, use the href property accessor:
JS
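The two construction styles and the `href` accessor can be sketched together as follows; the host and component values are placeholders chosen for illustration:

```js
// Style 1: start from a base URL and assign components through setters.
const viaSetters = new URL('https://example.org');
viaSetters.pathname = '/a/b/c';
viaSetters.search = '?d=e';
viaSetters.hash = '#fgh';

// Style 2: assemble the string with a template literal, then parse it.
const pathname = '/a/b/c';
const search = '?d=e';
const hash = '#fgh';
const viaTemplate = new URL(`https://example.org${pathname}${search}${hash}`);

// Both objects serialize to the same string.
console.log(viaSetters.href);  // https://example.org/a/b/c?d=e#fgh
console.log(viaTemplate.href); // https://example.org/a/b/c?d=e#fgh
```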
The WHATWG URL API
C URL
History
| Version | Changes |
|---|---|
| v10.0.0 | The class is now available on the global object. |
| v7.0.0, v6.13.0 | Added in: v7.0.0, v6.13.0 |
Browser-compatible URL class, implemented by following the WHATWG URL
Standard. Examples of parsed URLs may be found in the Standard itself.
The URL class is also available on the global object.
In accordance with browser conventions, all properties of URL objects
are implemented as getters and setters on the class prototype, rather than as
data properties on the object itself. Thus, unlike legacy urlObjects,
using the delete keyword on any properties of URL objects (e.g. delete
myURL.protocol, delete myURL.pathname, etc) has no effect but will still
return true.
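A small sketch of that behavior, assuming an arbitrary example URL:

```js
const myURL = new URL('https://example.org/path');

// The accessor lives on URL.prototype, so there is no own property to remove.
const deleted = delete myURL.pathname;

console.log(deleted);                       // true
console.log(myURL.pathname);                // /path (unchanged)
console.log('pathname' in URL.prototype);   // true
```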
M new URL(input[, base])
- input (string): The absolute or relative input URL to parse. If input is relative, then base is required. If input is absolute, the base is ignored. If input is not a string, it is converted to a string first.
- base (string): The base URL to resolve against if the input is not absolute. If base is not a string, it is converted to a string first.
Creates a new URL object by parsing the input relative to the base. If
base is passed as a string, it will be parsed equivalently to new URL(base).
JS
The URL constructor is accessible as a property on the global object. It can also be imported from the built-in url module:
MJS
CJS
A TypeError will be thrown if the input or base are not valid URLs. Note
that an effort will be made to coerce the given values into strings. For
instance:
JS
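A minimal sketch of both behaviors: an object with a `toString()` method is coerced to a string before parsing, and an unparsable input raises a `TypeError`. The sample inputs are illustrative only.

```js
// Coercion: the object's toString() result is what gets parsed.
const fromObject = new URL({ toString: () => 'https://example.org/' });
console.log(fromObject.href); // https://example.org/

// Invalid input with no base: the constructor throws a TypeError.
let caught;
try {
  new URL('not a url');
} catch (err) {
  caught = err;
}
console.log(caught instanceof TypeError); // true
```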
Unicode characters appearing within the host name of input will be
automatically converted to ASCII using the Punycode algorithm.
JS
This feature is only available if the node executable was compiled with
ICU enabled. If not, the domain names are passed through unchanged.
In cases where it is not known in advance if input is an absolute URL
and a base is provided, it is advised to validate that the origin of
the URL object is what is expected.
JS
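One way to apply that advice is a small guard around the constructor. The helper name `resolveSameOrigin` and the sample inputs are hypothetical; this is a sketch, not part of the `node:url` API.

```js
// Hypothetical helper: resolve `input` against `base` and reject results
// whose origin differs from the origin of `base`.
function resolveSameOrigin(input, base) {
  const resolved = new URL(input, base);
  if (resolved.origin !== new URL(base).origin) {
    throw new Error(`Unexpected origin: ${resolved.origin}`);
  }
  return resolved;
}

// A relative input stays on the base origin.
console.log(resolveSameOrigin('/docs', 'https://example.org/').href);
// https://example.org/docs

// An absolute input silently replaces the base, so the guard rejects it.
try {
  resolveSameOrigin('http://Example.com/', 'https://example.org/');
} catch (err) {
  console.log(err.message); // Unexpected origin: http://example.com
}
```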
M url.hash
Gets and sets the fragment portion of the URL.
JS
Invalid URL characters included in the value assigned to the hash property
are percent-encoded. The selection of which characters to
percent-encode may vary somewhat from what the url.parse() and
url.format() methods would produce.
M url.host
Gets and sets the host portion of the URL.
JS
Invalid host values assigned to the host property are ignored.
M url.hostname
Gets and sets the host name portion of the URL. The key difference between
url.host and url.hostname is that url.hostname does not include the
port.
JS
Invalid host name values assigned to the hostname property are ignored.
M url.href
Gets and sets the serialized URL.
JS
Getting the value of the href property is equivalent to calling
url.toString().
Setting the value of this property to a new value is equivalent to creating a
new URL object using new URL(value). Each of the URL
object's properties will be modified.
If the value assigned to the href property is not a valid URL, a TypeError
will be thrown.
M url.origin
Gets the read-only serialization of the URL's origin.
JS
JS
M url.password
Gets and sets the password portion of the URL.
JS
Invalid URL characters included in the value assigned to the password property
are percent-encoded. The selection of which characters to
percent-encode may vary somewhat from what the url.parse() and
url.format() methods would produce.
M url.pathname
Gets and sets the path portion of the URL.
JS
Invalid URL characters included in the value assigned to the pathname
property are percent-encoded. The selection of which characters
to percent-encode may vary somewhat from what the url.parse() and
url.format() methods would produce.
M url.port
Gets and sets the port portion of the URL.
The port value may be a number or a string containing a number in the range
0 to 65535 (inclusive). Setting the value to the default port of the
URL object's given protocol will result in the port value becoming
the empty string ('').
The port value can be an empty string in which case the port depends on the protocol/scheme:
| protocol | port |
|---|---|
| "ftp" | 21 |
| "file" | |
| "http" | 80 |
| "https" | 443 |
| "ws" | 80 |
| "wss" | 443 |
Upon assigning a value to the port, the value will first be converted to a
string using .toString().
If that string is invalid but it begins with a number, the leading number is
assigned to port.
If the number lies outside the range denoted above, it is ignored.
JS
Numbers which contain a decimal point, such as floating-point numbers or numbers in scientific notation, are not an exception to this rule. Leading numbers up to the decimal point will be set as the URL's port, assuming they are valid:
JS
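The port rules above can be sketched as a sequence of assignments; the starting URL is an arbitrary example:

```js
const myURL = new URL('https://example.org:8888');

myURL.port = '443';      // default port for https: becomes the empty string
console.log(JSON.stringify(myURL.port)); // ""

myURL.port = 1234;       // numbers are converted to strings
console.log(myURL.port); // 1234

myURL.port = 'abcd';     // invalid string: ignored
console.log(myURL.port); // 1234

myURL.port = '5678abcd'; // leading number is kept
console.log(myURL.port); // 5678

myURL.port = 1234.5678;  // digits up to the decimal point are kept
console.log(myURL.port); // 1234

myURL.port = 1e10;       // "10000000000" is out of range: ignored
console.log(myURL.port); // 1234
```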
M url.protocol
Gets and sets the protocol portion of the URL.
JS
Invalid URL protocol values assigned to the protocol property are ignored.
Special schemes
The WHATWG URL Standard considers a handful of URL protocol schemes to be
special in terms of how they are parsed and serialized. When a URL is
parsed using one of these special protocols, the url.protocol property
may be changed to another special protocol but cannot be changed to a
non-special protocol, and vice versa.
For instance, changing from http to https works:
JS
However, changing from http to a hypothetical fish protocol does not
because the new protocol is not special.
JS
Likewise, changing from a non-special protocol to a special protocol is also not permitted:
JS
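The three cases can be sketched side by side; `fish` is the hypothetical non-special scheme used in the text above:

```js
const special = new URL('http://example.org');

special.protocol = 'https';    // special to special: accepted
console.log(special.href);     // https://example.org/

special.protocol = 'fish';     // special to non-special: ignored
console.log(special.href);     // https://example.org/

const nonSpecial = new URL('fish://example.org');
nonSpecial.protocol = 'http';  // non-special to special: ignored
console.log(nonSpecial.protocol); // fish:
```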
According to the WHATWG URL Standard, special protocol schemes are ftp,
file, http, https, ws, and wss.
M url.search
Gets and sets the serialized query portion of the URL.
JS
Any invalid URL characters appearing in the value assigned to the search
property will be percent-encoded. The selection of which
characters to percent-encode may vary somewhat from what the url.parse()
and url.format() methods would produce.
M url.searchParams
Gets the URLSearchParams object representing the query parameters of the
URL. This property is read-only but the URLSearchParams object it provides
can be used to mutate the URL instance; to replace the entirety of query
parameters of the URL, use the url.search setter. See
URLSearchParams documentation for details.
Use care when using .searchParams to modify the URL because,
per the WHATWG specification, the URLSearchParams object uses
different rules to determine which characters to percent-encode. For
instance, the URL object will not percent encode the ASCII tilde (~)
character, while URLSearchParams will always encode it:
JS
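A short sketch of the tilde difference; calling `sort()` is used here only because it forces the query to be re-serialized through `URLSearchParams`:

```js
const myURL = new URL('https://example.org/abc?foo=~bar');

// Parsed by the URL object: the tilde is left as-is.
console.log(myURL.search); // ?foo=~bar

// Any mutation through searchParams re-serializes the whole query.
myURL.searchParams.sort();
console.log(myURL.search); // ?foo=%7Ebar
```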
M url.username
Gets and sets the username portion of the URL.
JS
Any invalid URL characters appearing in the value assigned to the username
property will be percent-encoded. The selection of which
characters to percent-encode may vary somewhat from what the url.parse()
and url.format() methods would produce.
M url.toString()
- Returns: string
The toString() method on the URL object returns the serialized URL. The
value returned is equivalent to that of url.href and url.toJSON().
M url.toJSON()
- Returns: string
The toJSON() method on the URL object returns the serialized URL. The
value returned is equivalent to that of url.href and
url.toString().
This method is automatically called when a URL object is serialized
with JSON.stringify().
JS
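A minimal sketch of that automatic call, with two placeholder URLs:

```js
const myURLs = [
  new URL('https://www.example.com'),
  new URL('https://test.example.org'),
];

// JSON.stringify() invokes toJSON() on each URL object.
const serialized = JSON.stringify(myURLs);
console.log(serialized);
// ["https://www.example.com/","https://test.example.org/"]
```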
M URL.createObjectURL(blob)
Added in: v16.7.0
Creates a 'blob:nodedata:...' URL string that represents the given Blob
object and can be used to retrieve the Blob later.
JS
The data stored by the registered Blob will be retained in memory until
URL.revokeObjectURL() is called to remove it.
Blob objects are registered within the current thread. If using Worker
Threads, Blob objects registered within one Worker will not be available
to other workers or the main thread.
M URL.revokeObjectURL(id)
Added in: v16.7.0
- id (string): A 'blob:nodedata:...' URL string returned by a prior call to URL.createObjectURL().
Removes the stored Blob identified by the given ID. Attempting to revoke an
ID that isn't registered will silently fail.
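The register-then-revoke lifecycle can be sketched as below. The Blob contents are placeholder data, and `Blob` is taken from `node:buffer` here as an assumption about the runtime.

```js
const { Blob } = require('node:buffer');

// Register a Blob and obtain its object URL string.
const blob = new Blob(['hello']);
const id = URL.createObjectURL(blob);
console.log(id.startsWith('blob:nodedata:')); // true

// Release the stored Blob once it is no longer needed.
URL.revokeObjectURL(id);

// Revoking an ID that is no longer registered does not throw.
URL.revokeObjectURL(id);
```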
C URLSearchParams
History
| Version | Changes |
|---|---|
| v10.0.0 | The class is now available on the global object. |
| v7.5.0, v6.13.0 | Added in: v7.5.0, v6.13.0 |
The URLSearchParams API provides read and write access to the query of a
URL. The URLSearchParams class can also be used standalone with one of the
four following constructors.
The URLSearchParams class is also available on the global object.
The WHATWG URLSearchParams interface and the querystring module have
a similar purpose, but the purpose of the querystring module is more
general, as it allows the customization of delimiter characters (& and =).
On the other hand, this API is designed purely for URL query strings.
JS
M new URLSearchParams()
Instantiate a new empty URLSearchParams object.
M new URLSearchParams(string)
- string (string): A query string
Parse the string as a query string, and use it to instantiate a new
URLSearchParams object. A leading '?', if present, is ignored.
JS
M new URLSearchParams(obj)
Added in: v7.10.0, v6.13.0
- obj (Object): An object representing a collection of key-value pairs
Instantiate a new URLSearchParams object with a query hash map. The key and
value of each property of obj are always coerced to strings.
Unlike the querystring module, duplicate keys in the form of array values are
not allowed. Arrays are stringified using array.toString(), which simply
joins all array elements with commas.
JS
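A brief sketch of the array handling described above; the keys and values are placeholders:

```js
const params = new URLSearchParams({
  user: 'abc',
  query: ['first', 'second'], // stringified as "first,second"
});

// The array became a single value, not two entries.
console.log(params.getAll('query')); // [ 'first,second' ]
console.log(params.toString());      // user=abc&query=first%2Csecond
```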
M new URLSearchParams(iterable)
Added in: v7.10.0, v6.13.0
- iterable (Iterable): An iterable object whose elements are key-value pairs
Instantiate a new URLSearchParams object with an iterable map in a way that
is similar to Map's constructor. iterable can be an Array or any
iterable object. That means iterable can be another URLSearchParams, in
which case the constructor will simply create a clone of the provided
URLSearchParams. Elements of iterable are key-value pairs, and can
themselves be any iterable object.
Duplicate keys are allowed.
JS
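The iterable forms can be sketched as follows, using an array of pairs, a `Map`, and another `URLSearchParams` as sources; all names and values are illustrative:

```js
// Array of [name, value] pairs; duplicate names are preserved.
const fromPairs = new URLSearchParams([
  ['user', 'abc'],
  ['query', 'first'],
  ['query', 'second'],
]);
console.log(fromPairs.toString()); // user=abc&query=first&query=second

// A Map is iterable over [key, value] pairs as well.
const fromMap = new URLSearchParams(new Map([['user', 'abc'], ['query', 'xyz']]));
console.log(fromMap.toString()); // user=abc&query=xyz

// Passing a URLSearchParams creates an independent clone.
const clone = new URLSearchParams(fromPairs);
clone.append('extra', '1');
console.log(fromPairs.has('extra')); // false
```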
M urlSearchParams.append(name, value)
Append a new name-value pair to the query string.
M urlSearchParams.delete(name)
- name (string)
Remove all name-value pairs whose name is name.
M urlSearchParams.entries()
- Returns: Iterator
Returns an ES6 Iterator over each of the name-value pairs in the query.
Each item of the iterator is a JavaScript Array. The first item of the Array
is the name, the second item of the Array is the value.
Alias for urlSearchParams[@@iterator]().
M urlSearchParams.forEach(fn[, thisArg])
- fn (Function): Invoked for each name-value pair in the query
- thisArg (Object): To be used as the this value when fn is called
Iterates over each name-value pair in the query and invokes the given function.
JS
M urlSearchParams.get(name)
Returns the value of the first name-value pair whose name is name. If there
are no such pairs, null is returned.
M urlSearchParams.getAll(name)
- name (string)
- Returns: string[]
Returns the values of all name-value pairs whose name is name. If there are
no such pairs, an empty array is returned.
M urlSearchParams.has(name)
Returns true if there is at least one name-value pair whose name is name.
M urlSearchParams.keys()
- Returns: Iterator
Returns an ES6 Iterator over the names of each name-value pair.
JS
M urlSearchParams.set(name, value)
Sets the value in the URLSearchParams object associated with name to
value. If there are any pre-existing name-value pairs whose names are name,
set the first such pair's value to value and remove all others. If not,
append the name-value pair to the query string.
JS
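A sketch of the difference between `append()` and `set()` when a name already has several values; the names are placeholders:

```js
const params = new URLSearchParams();
params.append('foo', 'bar');
params.append('foo', 'baz');
params.append('abc', 'def');
console.log(params.toString()); // foo=bar&foo=baz&abc=def

// set() overwrites the first "foo" and removes the other one.
params.set('foo', 'def');
// set() on a missing name appends a new pair.
params.set('xyz', 'opq');
console.log(params.toString()); // foo=def&abc=def&xyz=opq
```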
M urlSearchParams.sort()
Added in: v7.7.0, v6.13.0
Sort all existing name-value pairs in-place by their names. Sorting is done with a stable sorting algorithm, so relative order between name-value pairs with the same name is preserved.
This method can be used, in particular, to increase cache hits.
JS
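A minimal sketch of the stable sort, using a query with a repeated name so that the preserved relative order is visible:

```js
const params = new URLSearchParams('query[]=abc&type=search&query[]=123');

params.sort();

// Both "query[]" pairs now precede "type", and "abc" still comes before "123".
console.log(params.toString());
// query%5B%5D=abc&query%5B%5D=123&type=search
```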
M urlSearchParams.toString()
- Returns: string
Returns the search parameters serialized as a string, with characters percent-encoded where necessary.
M urlSearchParams.values()
- Returns: Iterator
Returns an ES6 Iterator over the values of each name-value pair.
M urlSearchParams[Symbol.iterator]()
- Returns: Iterator
Returns an ES6 Iterator over each of the name-value pairs in the query string.
Each item of the iterator is a JavaScript Array. The first item of the Array
is the name, the second item of the Array is the value.
Alias for urlSearchParams.entries().
JS
M url.domainToASCII(domain)
Added in: v7.4.0, v6.13.0
Returns the Punycode ASCII serialization of the domain. If domain is an
invalid domain, the empty string is returned.
It performs the inverse operation to url.domainToUnicode().
This feature is only available if the node executable was compiled with
ICU enabled. If not, the domain names are passed through unchanged.
MJS
CJS
M url.domainToUnicode(domain)
Added in: v7.4.0, v6.13.0
Returns the Unicode serialization of the domain. If domain is an invalid
domain, the empty string is returned.
It performs the inverse operation to url.domainToASCII().
This feature is only available if the node executable was compiled with
ICU enabled. If not, the domain names are passed through unchanged.
MJS
CJS
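The two functions can be sketched as a round trip. The sample domains are illustrative, and the outputs assume a build with ICU enabled, as noted above.

```js
const { domainToASCII, domainToUnicode } = require('node:url');

const ascii = domainToASCII('español.com');
console.log(ascii);                           // xn--espaol-zwa.com
console.log(domainToASCII('中文.com'));        // xn--fiq228c.com

// domainToUnicode() reverses the conversion.
console.log(domainToUnicode(ascii));          // español.com

// An invalid domain; the text above describes the empty-string result.
console.log(JSON.stringify(domainToASCII('xn--iñvalid.com')));
```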
M url.fileURLToPath(url)
Added in: v10.12.0
- url (URL | string): The file URL string or URL object to convert to a path.
- Returns: string The fully-resolved platform-specific Node.js file path.
This function ensures that percent-encoded characters are decoded correctly and that the result is a valid absolute path string on every platform.
MJS
CJS
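A sketch of the decoding behavior that avoids hard-coding a platform-specific path: it builds a file URL with `url.pathToFileURL()` and converts it back. The file name is a placeholder.

```js
const { fileURLToPath, pathToFileURL } = require('node:url');
const path = require('node:path');

// Absolute on POSIX; resolved against the current drive on Windows.
const original = path.resolve('/tmp/hello world.txt');
const fileUrl = pathToFileURL(original);

// The URL's pathname keeps the space percent-encoded...
console.log(fileUrl.pathname.endsWith('/tmp/hello%20world.txt')); // true

// ...while fileURLToPath() decodes it and returns a platform path.
const roundTripped = fileURLToPath(fileUrl);
console.log(roundTripped === original); // true
```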
M url.format(URL[, options])
Added in: v7.6.0
- URL (URL): A WHATWG URL object
- options (Object)
  - auth (boolean): true if the serialized URL string should include the username and password, false otherwise. Default: true.
  - fragment (boolean): true if the serialized URL string should include the fragment, false otherwise. Default: true.
  - search (boolean): true if the serialized URL string should include the search query, false otherwise. Default: true.
  - unicode (boolean): true if Unicode characters appearing in the host component of the URL string should be encoded directly as opposed to being Punycode encoded. Default: false.
- Returns: string
Returns a customizable serialization of a URL String representation of a
WHATWG URL object.
The URL object has both a toString() method and href property that return
string serializations of the URL. These are not, however, customizable in
any way. The url.format(URL[, options]) method allows for basic customization
of the output.
MJS
CJS
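A sketch of the options in use, assuming a build with ICU enabled; the URL is an arbitrary example with credentials, a Unicode host, a query, and a fragment:

```js
const url = require('node:url');

const myURL = new URL('https://a:b@測試?abc#foo');

// The non-customizable serializations Punycode-encode the host.
console.log(myURL.href); // https://a:b@xn--g6w251d/?abc#foo

// Drop credentials and fragment, and keep the host in Unicode.
const formatted = url.format(myURL, {
  fragment: false,
  unicode: true,
  auth: false,
});
console.log(formatted); // https://測試/?abc
```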
M url.pathToFileURL(path)
Added in: v10.12.0
This function ensures that path is resolved absolutely, and that the URL
control characters are correctly encoded when converting into a file URL.
MJS
CJS
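A sketch of why the function matters for paths containing URL control characters; the path `/foo#1` is a made-up example:

```js
const { pathToFileURL } = require('node:url');

// Naive construction treats "#1" as a fragment, not as part of the path.
const naive = new URL('/foo#1', 'file:');
console.log(naive.hash); // #1

// pathToFileURL() percent-encodes the "#" so it stays in the path.
const safe = pathToFileURL('/foo#1');
console.log(safe.href.endsWith('/foo%231')); // true
console.log(safe.hash === '');               // true
```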
M url.urlToHttpOptions(url)
Added in: v15.7.0, v14.18.0
- url (URL): The WHATWG URL object to convert to an options object.
- Returns: Object Options object
  - protocol (string): Protocol to use.
  - hostname (string): A domain name or IP address of the server to issue the request to.
  - hash (string): The fragment portion of the URL.
  - search (string): The serialized query portion of the URL.
  - pathname (string): The path portion of the URL.
  - path (string): Request path. Should include query string if any. E.g. '/index.html?page=12'. An exception is thrown when the request path contains illegal characters. Currently, only spaces are rejected but that may change in the future.
  - href (string): The serialized URL.
  - port (number): Port of remote server.
  - auth (string): Basic authentication i.e. 'user:password' to compute an Authorization header.
This utility function converts a URL object into an ordinary options object as
expected by the http.request() and https.request() APIs.
MJS
CJS
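A sketch of the conversion, assuming a build with ICU enabled; the URL is an arbitrary example, and only a few of the returned fields are printed:

```js
const { urlToHttpOptions } = require('node:url');

const myURL = new URL('https://a:b@測試?abc#foo');
const options = urlToHttpOptions(myURL);

console.log(options.protocol); // https:
console.log(options.hostname); // xn--g6w251d
console.log(options.path);     // /?abc (pathname plus search)
console.log(options.auth);     // a:b
```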
Legacy URL API
Legacy urlObject
The legacy urlObject (require('node:url').Url or
import { Url } from 'node:url') is
created and returned by the url.parse() function.
M urlObject.auth
The auth property is the username and password portion of the URL, also
referred to as userinfo. This string subset follows the protocol and
double slashes (if present) and precedes the host component, delimited by @.
The string is either the username, or it is the username and password separated
by :.
For example: 'user:pass'.
M urlObject.hash
The hash property is the fragment identifier portion of the URL including the
leading # character.
For example: '#hash'.
M urlObject.host
The host property is the full lower-cased host portion of the URL, including
the port if specified.
For example: 'sub.example.com:8080'.
M urlObject.hostname
The hostname property is the lower-cased host name portion of the host
component without the port included.
For example: 'sub.example.com'.
M urlObject.href
The href property is the full URL string that was parsed with both the
protocol and host components converted to lower-case.
For example: 'http://user:pass@sub.example.com:8080/p/a/t/h?query=string#hash'.
M urlObject.path
The path property is a concatenation of the pathname and search
components.
For example: '/p/a/t/h?query=string'.
No decoding of the path is performed.
M urlObject.pathname
The pathname property consists of the entire path section of the URL. This
is everything following the host (including the port) and before the start
of the query or hash components, delimited by either the ASCII question
mark (?) or hash (#) characters.
For example: '/p/a/t/h'.
No decoding of the path string is performed.
M urlObject.port
The port property is the numeric port portion of the host component.
For example: '8080'.
M urlObject.protocol
The protocol property identifies the URL's lower-cased protocol scheme.
For example: 'http:'.
M urlObject.query
The query property is either the query string without the leading ASCII
question mark (?), or an object returned by the querystring module's
parse() method. Whether the query property is a string or object is
determined by the parseQueryString argument passed to url.parse().
For example: 'query=string' or {'query': 'string'}.
If returned as a string, no decoding of the query string is performed. If returned as an object, both keys and values are decoded.
M urlObject.search
The search property consists of the entire "query string" portion of the
URL, including the leading ASCII question mark (?) character.
For example: '?query=string'.
No decoding of the query string is performed.
M urlObject.slashes
The slashes property is a boolean with a value of true if two ASCII
forward-slash characters (/) are required following the colon in the
protocol.
M url.format(urlObject)
History
| Version | Changes |
|---|---|
| v17.0.0 | Now throws an `ERR_INVALID_URL` exception when Punycode conversion of a hostname introduces changes that could cause the URL to be re-parsed differently. |
| v15.13.0, v14.17.0 | Deprecation revoked. Status changed to "Legacy". |
| v11.0.0 | The Legacy URL API is deprecated. Use the WHATWG URL API. |
| v7.0.0 | URLs with a `file:` scheme will now always use the correct number of slashes regardless of `slashes` option. A falsy `slashes` option with no protocol is now also respected at all times. |
| v0.1.25 | Added in: v0.1.25 |
- urlObject (Object | string): A URL object (as returned by url.parse() or constructed otherwise). If a string, it is converted to an object by passing it to url.parse().
The url.format() method returns a formatted URL string derived from
urlObject.
JS
If urlObject is not an object or a string, url.format() will throw a
TypeError.
The formatting process operates as follows:
- A new empty string result is created.
- If urlObject.protocol is a string, it is appended as-is to result.
- Otherwise, if urlObject.protocol is not undefined and is not a string, an Error is thrown.
- For all string values of urlObject.protocol that do not end with an ASCII colon (:) character, the literal string ':' will be appended to result.
- If either of the following conditions is true, then the literal string '//' will be appended to result:
  - urlObject.slashes property is true;
  - urlObject.protocol begins with http, https, ftp, gopher, or file;
- If the value of the urlObject.auth property is truthy, and either urlObject.host or urlObject.hostname are not undefined, the value of urlObject.auth will be coerced into a string and appended to result followed by the literal string '@'.
- If the urlObject.host property is undefined then:
  - If the urlObject.hostname is a string, it is appended to result.
  - Otherwise, if urlObject.hostname is not undefined and is not a string, an Error is thrown.
  - If the urlObject.port property value is truthy, and urlObject.hostname is not undefined:
    - The literal string ':' is appended to result, and
    - The value of urlObject.port is coerced to a string and appended to result.
- Otherwise, if the urlObject.host property value is truthy, the value of urlObject.host is coerced to a string and appended to result.
- If the urlObject.pathname property is a string that is not an empty string:
  - If the urlObject.pathname does not start with an ASCII forward slash (/), then the literal string '/' is appended to result.
  - The value of urlObject.pathname is appended to result.
- Otherwise, if urlObject.pathname is not undefined and is not a string, an Error is thrown.
- If the urlObject.search property is undefined and if the urlObject.query property is an Object, the literal string '?' is appended to result followed by the output of calling the querystring module's stringify() method passing the value of urlObject.query.
- Otherwise, if urlObject.search is a string:
  - If the value of urlObject.search does not start with the ASCII question mark (?) character, the literal string '?' is appended to result.
  - The value of urlObject.search is appended to result.
- Otherwise, if urlObject.search is not undefined and is not a string, an Error is thrown.
- If the urlObject.hash property is a string:
  - If the value of urlObject.hash does not start with the ASCII hash (#) character, the literal string '#' is appended to result.
  - The value of urlObject.hash is appended to result.
- Otherwise, if the urlObject.hash property is not undefined and is not a string, an Error is thrown.
- result is returned.
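The formatting steps can be sketched with two hand-built objects. The property values are placeholders; the second object deliberately omits the leading ':', '/', '?', and '#' characters to show them being supplied.

```js
const url = require('node:url');

// protocol without ":", hostname instead of host, query object instead of search.
const fromQuery = url.format({
  protocol: 'https',
  hostname: 'example.com',
  pathname: '/some/path',
  query: { page: 1, format: 'json' },
});
console.log(fromQuery); // https://example.com/some/path?page=1&format=json

// hostname plus port, and pathname/search/hash without their leading characters.
const fromParts = url.format({
  protocol: 'http',
  hostname: 'example.com',
  port: 8080,
  pathname: 'p',
  search: 'a=1',
  hash: 'top',
});
console.log(fromParts); // http://example.com:8080/p?a=1#top
```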
M url.parse(urlString[, parseQueryString[, slashesDenoteHost]])
History
| Version | Changes |
|---|---|
| v15.13.0, v14.17.0 | Deprecation revoked. Status changed to "Legacy". |
| v11.14.0 | The `pathname` property on the returned URL object is now `/` when there is no path and the protocol scheme is `ws:` or `wss:`. |
| v11.0.0 | The Legacy URL API is deprecated. Use the WHATWG URL API. |
| v9.0.0 | The `search` property on the returned URL object is now `null` when no query string is present. |
| v0.1.25 | Added in: v0.1.25 |
- urlString (string): The URL string to parse.
- parseQueryString (boolean): If true, the query property will always be set to an object returned by the querystring module's parse() method. If false, the query property on the returned URL object will be an unparsed, undecoded string. Default: false.
- slashesDenoteHost (boolean): If true, the first token after the literal string // and preceding the next / will be interpreted as the host. For instance, given //foo/bar, the result would be {host: 'foo', pathname: '/bar'} rather than {pathname: '//foo/bar'}. Default: false.
The url.parse() method takes a URL string, parses it, and returns a URL
object.
A TypeError is thrown if urlString is not a string.
A URIError is thrown if the auth property is present but cannot be decoded.
url.parse() uses a lenient, non-standard algorithm for parsing URL
strings. It is prone to security issues such as host name spoofing
and incorrect handling of usernames and passwords.
url.parse() is an exception to most of the legacy APIs. Despite its security
concerns, it is legacy and not deprecated because it is:
- Faster than the alternative WHATWG URL parser.
- Easier to use with regards to relative URLs than the alternative WHATWG URL API.
- Widely relied upon within the npm ecosystem.
Use with caution.
M url.resolve(from, to)
History
| Version | Changes |
|---|---|
| v15.13.0, v14.17.0 | Deprecation revoked. Status changed to "Legacy". |
| v11.0.0 | The Legacy URL API is deprecated. Use the WHATWG URL API. |
| v6.6.0 | The `auth` fields are now kept intact when `from` and `to` refer to the same host. |
| v6.5.0, v4.6.2 | The `port` field is copied correctly now. |
| v6.0.0 | The `auth` fields are now cleared when the `to` parameter contains a hostname. |
| v0.1.25 | Added in: v0.1.25 |
The url.resolve() method resolves a target URL relative to a base URL in a
manner similar to that of a web browser resolving an anchor tag.
JS
To achieve the same result using the WHATWG URL API:
JS
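One possible WHATWG-based equivalent is sketched below. The function name `resolveWithWhatwg` and the placeholder base `resolve://` are illustrative choices, not part of the `node:url` API.

```js
// Hypothetical helper: resolve `to` against `from` using only the URL class.
function resolveWithWhatwg(from, to) {
  // 'resolve://' is a throwaway base so that a relative `from` can be parsed.
  const resolvedUrl = new URL(to, new URL(from, 'resolve://'));
  if (resolvedUrl.protocol === 'resolve:') {
    // `from` was relative, so return only the path, query, and fragment.
    const { pathname, search, hash } = resolvedUrl;
    return pathname + search + hash;
  }
  return resolvedUrl.toString();
}

console.log(resolveWithWhatwg('/one/two/three', 'four'));       // /one/two/four
console.log(resolveWithWhatwg('http://example.com/', '/one'));  // http://example.com/one
console.log(resolveWithWhatwg('http://example.com/one', '/two')); // http://example.com/two
```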
Percent-encoding in URLs
URLs are permitted to contain only a certain range of characters. Any character falling outside of that range must be encoded. How such characters are encoded, and which characters are encoded, depends entirely on where the character is located within the structure of the URL.
Legacy API
Within the Legacy API, spaces (' ') and the following characters will be
automatically escaped in the properties of URL objects:
TEXT
For example, the ASCII space character (' ') is encoded as %20. The ASCII
less-than sign (<) is encoded as %3C.
WHATWG API
The WHATWG URL Standard uses a more selective and fine-grained approach to selecting encoded characters than that used by the Legacy API.
The WHATWG algorithm defines four "percent-encode sets" that describe ranges of characters that must be percent-encoded:
- The C0 control percent-encode set includes code points in range U+0000 to U+001F (inclusive) and all code points greater than U+007E.
- The fragment percent-encode set includes the C0 control percent-encode set and code points U+0020, U+0022, U+003C, U+003E, and U+0060.
- The path percent-encode set includes the C0 control percent-encode set and code points U+0020, U+0022, U+0023, U+003C, U+003E, U+003F, U+0060, U+007B, and U+007D.
- The userinfo percent-encode set includes the path percent-encode set and code points U+002F, U+003A, U+003B, U+003D, U+0040, U+005B, U+005C, U+005D, U+005E, and U+007C.
The userinfo percent-encode set is used exclusively for usernames and passwords encoded within the URL. The path percent-encode set is used for the path of most URLs. The fragment percent-encode set is used for URL fragments. The C0 control percent-encode set is used for host and path under certain specific conditions, in addition to all other cases.
When non-ASCII characters appear within a host name, the host name is encoded using the Punycode algorithm. Note, however, that a host name may contain both Punycode encoded and percent-encoded characters:
JS
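A small sketch of host handling, assuming a build with ICU enabled. The input host starts with the percent-encoded bytes of the Greek letter pi, which are decoded and then Punycode-encoded during parsing.

```js
const myURL = new URL('https://%CF%80.example.com/foo');

// The percent-encoded label is serialized in Punycode form.
console.log(myURL.href);   // https://xn--1xa.example.com/foo
console.log(myURL.origin); // https://xn--1xa.example.com
```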