Uniform Resource Identifiers (URI): Generic Syntax (3)

   The URI syntax is dependent upon the scheme.  In general, absolute
   URI are written as follows:

       <scheme>:<scheme-specific-part>

The syntax of a URI depends on its scheme. In general, an absolute URI is written as shown above.

   An absolute URI contains the name of the scheme being used (<scheme>)
   followed by a colon (":") and then a string (the <scheme-specific-
   part>) whose interpretation depends on the scheme.

An absolute URI begins with the name of the scheme in use (<scheme>), followed by a colon (":"), followed by a string (<scheme-specific-part>) whose interpretation depends on the scheme.
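The split described above can be sketched in a few lines. This is a minimal illustration, not a full parser: the function name split_absolute_uri and the sample URIs are assumptions made for this example, and the scheme characters are not validated here.

```python
from typing import Tuple


def split_absolute_uri(uri: str) -> Tuple[str, str]:
    """Split an absolute URI at its first colon.

    Returns (scheme, scheme_specific_part). Hypothetical helper for
    illustration; it does not check which characters the scheme uses.
    """
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme:
        raise ValueError(f"not an absolute URI: {uri!r}")
    return scheme, rest


if __name__ == "__main__":
    for sample in ("mailto:user@example.com", "http://example.com:8080/a"):
        print(split_absolute_uri(sample))
```

Only the first colon is significant for this split; any later colons belong to the scheme-specific part.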

   The URI syntax does not require that the scheme-specific-part have
   any general structure or set of semantics which is common among all
   URI.  However, a subset of URI do share a common syntax for
   representing hierarchical relationships within the namespace.  This
   "generic URI" syntax consists of a sequence of four main components:

      <scheme>://<authority><path>?<query>

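The URI syntax does not require the scheme-specific-part to have any general structure or semantics common to all URI. A subset of URI, however, shares a common syntax for expressing hierarchical relationships within the namespace. This generic URI syntax consists of four main components.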

   each of which, except <scheme>, may be absent from a particular URI.
   For example, some URI schemes do not allow an <authority> component,
   and others do not use a <query> component.

      absoluteURI   = scheme ":" ( hier_part | opaque_part )

Every part except <scheme> may be absent from a particular URI. For example, some URI schemes do not allow an <authority> component, and others do not use a <query> component.
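As a rough sketch of the four-component layout <scheme>://<authority><path>?<query>, the regular expression below pulls the parts out of a hierarchical URI. The names GenericURI and split_generic are hypothetical, the pattern is an approximation written for this example, and it is not meant for opaque URI, where "?" has no delimiter role.

```python
import re
from typing import NamedTuple, Optional


class GenericURI(NamedTuple):
    scheme: str
    authority: Optional[str]  # None when the URI has no "//" part
    path: str                 # may be the empty string
    query: Optional[str]      # None when the URI has no "?" part


# Everything except the scheme is optional, mirroring the text above.
_GENERIC = re.compile(
    r"(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*):"
    r"(?://(?P<authority>[^/?]*))?"
    r"(?P<path>[^?]*)"
    r"(?:\?(?P<query>.*))?"
)


def split_generic(uri: str) -> GenericURI:
    """Split a hierarchical URI into scheme, authority, path and query."""
    match = _GENERIC.fullmatch(uri)
    if match is None:
        raise ValueError(f"cannot split: {uri!r}")
    return GenericURI(
        match.group("scheme"),
        match.group("authority"),
        match.group("path"),
        match.group("query"),
    )


if __name__ == "__main__":
    print(split_generic("http://www.example.com/docs/index.html?lang=en"))
    print(split_generic("file:/etc/hosts"))
```

Absent components are reported as None rather than as empty strings, so a caller can tell "http://host?" (empty query) from "http://host" (no query).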

   URI that are hierarchical in nature use the slash "/" character for
   separating hierarchical components.  For some file systems, a "/"
   character (used to denote the hierarchical structure of a URI) is the
   delimiter used to construct a file name hierarchy, and thus the URI
   path will look similar to a file pathname.  This does NOT imply that
   the resource is a file or that the URI maps to an actual filesystem
   pathname.

      hier_part     = ( net_path | abs_path ) [ "?" query ]

      net_path      = "//" authority [ abs_path ]

      abs_path      = "/"  path_segments

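Hierarchical URI use the slash "/" character to separate hierarchical components. On some file systems, "/" (used here to denote the hierarchical structure of a URI) is also the delimiter that builds the file name hierarchy, so a URI path looks similar to a file pathname. This does not mean that the resource is a file, nor that the URI maps to an actual filesystem pathname.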

   URI that do not make use of the slash "/" character for separating
   hierarchical components are considered opaque by the generic URI
   parser.

      opaque_part   = uric_no_slash *uric

      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
                      "&" | "=" | "+" | "$" | ","

URI that do not use the slash "/" character to separate hierarchical components are treated as opaque by the generic URI parser.
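The grammar makes the hierarchical/opaque decision depend on the first character after the scheme's colon: net_path starts with "//", abs_path starts with "/", and opaque_part starts with anything other than a slash. The sketch below shows only that decision. The function name classify_absolute_uri is an assumption, and the remaining characters are not checked against the uric set.

```python
def classify_absolute_uri(uri: str) -> str:
    """Return which production follows the scheme: 'net_path',
    'abs_path' or 'opaque_part'.

    Hypothetical helper: it inspects only the leading characters of the
    scheme-specific part and does not validate the rest of the string.
    """
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme:
        raise ValueError(f"not an absolute URI: {uri!r}")
    if rest.startswith("//"):
        return "net_path"
    if rest.startswith("/"):
        return "abs_path"
    if rest:
        return "opaque_part"  # first character is not a slash
    raise ValueError(f"empty scheme-specific part: {uri!r}")


if __name__ == "__main__":
    for sample in ("http://example.com/a", "file:/etc/hosts", "mailto:a@example.com"):
        print(sample, "->", classify_absolute_uri(sample))
```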

   We use the term <path> to refer to both the <abs_path> and
   <opaque_part> constructs, since they are mutually exclusive for any
   given URI and can be parsed as a single component.

This document uses the term <path> for both the <abs_path> and <opaque_part> constructs, because the two are mutually exclusive within any given URI and can be parsed as a single component.

3.1. Scheme Component

   Just as there are many different methods of access to resources,
   there are a variety of schemes for identifying such resources.  The
   URI syntax consists of a sequence of components separated by reserved
   characters, with the first component defining the semantics for the
   remainder of the URI string.

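Just as there are many ways to access resources, there are many schemes for identifying them. The URI syntax consists of a sequence of components separated by reserved characters, and the first component defines the semantics of the remainder of the URI string.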

   Scheme names consist of a sequence of characters beginning with a
   lower case letter and followed by any combination of lower case
   letters, digits, plus ("+"), period ("."), or hyphen ("-").  For
   resiliency, programs interpreting URI should treat upper case letters
   as equivalent to lower case in scheme names (e.g., allow "HTTP" as
   well as "http").

      scheme        = alpha *( alpha | digit | "+" | "-" | "." )

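A scheme name begins with a lower case letter, followed by any combination of lower case letters, digits, plus signs ("+"), periods ("."), and hyphens ("-"). For resiliency, programs that interpret URI should treat upper case letters in scheme names as equivalent to lower case; for example, "HTTP" should be accepted as the same scheme name as "http".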
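The scheme rule and the case-insensitivity advice can be combined in one small check. This is a sketch under the assumption that accepting upper case on input and folding it to lower case is the desired behaviour. The names SCHEME_RE and normalize_scheme are made up for this example.

```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." )
# Upper case letters are accepted here and folded to lower case afterwards.
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def normalize_scheme(name: str) -> str:
    """Validate a scheme name and return it in lower case."""
    if SCHEME_RE.fullmatch(name) is None:
        raise ValueError(f"invalid scheme name: {name!r}")
    return name.lower()


if __name__ == "__main__":
    print(normalize_scheme("HTTP"))
    print(normalize_scheme("svn+ssh"))
```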

   Relative URI references are distinguished from absolute URI in that
   they do not begin with a scheme name.  Instead, the scheme is
   inherited from the base URI, as described in Section 5.2.

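A URI reference that does not begin with a scheme name is a relative URI reference, which distinguishes it from an absolute URI. A relative URI reference instead inherits its scheme from the base URI, as described in Section 5.2 (Resolving Relative References to Absolute Form).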

3.2. Authority Component

   Many URI schemes include a top hierarchical element for a naming
   authority, such that the namespace defined by the remainder of the
   URI is governed by that authority.  This authority component is
   typically defined by an Internet-based server or a scheme-specific
   registry of naming authorities.

      authority     = server | reg_name

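Many URI schemes include a top-level hierarchical element for a naming authority, and the namespace defined by the remainder of the URI is governed by that authority. The authority component is typically defined by an Internet-based server or by a scheme-specific registry of naming authorities.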

   The authority component is preceded by a double slash "//" and is
   terminated by the next slash "/", question-mark "?", or by the end of
   the URI.  Within the authority component, the characters ";", ":",
   "@", "?", and "/" are reserved.

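The authority component begins after a double slash "//" and ends at the next slash "/", at a question mark "?", or at the end of the URI. Within the authority component, the characters ";", ":", "@", "?", and "/" are reserved.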
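The delimiting rule above translates directly into a scan for the first "/" or "?" after the leading "//". The sketch below assumes its input is the scheme-specific part (everything after the scheme's colon). The function name extract_authority is hypothetical.

```python
from typing import Optional, Tuple


def extract_authority(scheme_specific_part: str) -> Tuple[Optional[str], str]:
    """Return (authority, remainder).

    authority is None when the input does not start with "//".
    The authority ends at the next "/" or "?", or at the end of the string.
    """
    if not scheme_specific_part.startswith("//"):
        return None, scheme_specific_part
    rest = scheme_specific_part[2:]
    end = len(rest)
    for index, char in enumerate(rest):
        if char in "/?":
            end = index
            break
    return rest[:end], rest[end:]


if __name__ == "__main__":
    print(extract_authority("//example.com/a/b?x=1"))
    print(extract_authority("//example.com?x=1"))
    print(extract_authority("/no/authority"))
```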

   An authority component is not required for a URI scheme to make use
   of relative references.  A base URI without an authority component
   implies that any relative reference will also be without an authority
   component.

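A URI scheme does not need an authority component in order to use relative references. If the base URI has no authority component, any relative reference resolved against it will have no authority component either.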

3.2.1. Registry-based Naming Authority

   The structure of a registry-based naming authority is specific to the
   URI scheme, but constrained to the allowed characters for an
   authority component.

      reg_name      = 1*( unreserved | escaped | "$" | "," |
                          ";" | ":" | "@" | "&" | "=" | "+" )

The structure of a registry-based naming authority is specific to the URI scheme, but it is restricted to the characters allowed in an authority component.

3.2.2. Server-based Naming Authority

   URL schemes that involve the direct use of an IP-based protocol to a
   specified server on the Internet use a common syntax for the server
   component of the URI's scheme-specific data:

      <userinfo>@<host>:<port>

   where <userinfo> may consist of a user name and, optionally, scheme-
   specific information about how to gain authorization to access the
   server.  The parts "<userinfo>@" and ":<port>" may be omitted.

      server        = [ [ userinfo "@" ] hostport ]

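URL schemes that directly use an IP-based protocol to reach a specified server on the Internet share a common syntax for the server component of the URI's scheme-specific data.

Here, <userinfo> may contain a user name and, optionally, scheme-specific information about how to gain authorization to access the server. The "<userinfo>@" and ":<port>" parts may be omitted.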
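A minimal sketch of splitting the <userinfo>@<host>:<port> form follows. It relies on two properties of the grammar in this section: neither userinfo nor host may contain an unescaped "@", and host may not contain ":" (literal IPv6 addresses are not supported). The function name split_server and the sample values are assumptions made for this example.

```python
from typing import Optional, Tuple


def split_server(server: str) -> Tuple[Optional[str], str, Optional[str]]:
    """Split a server component into (userinfo, host, port).

    userinfo and port are None when their optional parts are absent.
    The pieces are returned as strings and are not validated further.
    """
    userinfo, at_sign, hostport = server.rpartition("@")
    host, colon, port = hostport.partition(":")
    return (
        userinfo if at_sign else None,
        host,
        port if colon else None,
    )


if __name__ == "__main__":
    print(split_server("anonymous@ftp.example.com:21"))
    print(split_server("www.example.com"))
```

Splitting at "@" before splitting at ":" matters, because userinfo is allowed to contain a colon while host is not.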

   The user information, if present, is followed by a commercial at-sign
   "@".

      userinfo      = *( unreserved | escaped |
                         ";" | ":" | "&" | "=" | "+" | "$" | "," )

The user information, if present, is followed by an at-sign "@".

   Some URL schemes use the format "user:password" in the userinfo
   field. This practice is NOT RECOMMENDED, because the passing of
   authentication information in clear text (such as URI) has proven to
   be a security risk in almost every case where it has been used.

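Some URL schemes use the form "user:password" in the userinfo field. This practice is not recommended, because passing authentication information in clear text (such as in a URI) has proven to be a security risk in almost every case where it has been used.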

   The host is a domain name of a network host, or its IPv4 address as a
   set of four decimal digit groups separated by ".".  Literal IPv6
   addresses are not supported.

      hostport      = host [ ":" port ]
      host          = hostname | IPv4address
      hostname      = *( domainlabel "." ) toplabel [ "." ]
      domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
      toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
      IPv4address   = 1*digit "." 1*digit "." 1*digit "." 1*digit
      port          = *digit

The host is either the domain name of a network host or an IPv4 address written as four groups of decimal digits separated by ".". Literal IPv6 addresses are not supported.

   Hostnames take the form described in Section 3 of [RFC1034] and
   Section 2.1 of [RFC1123]: a sequence of domain labels separated by
   ".", each domain label starting and ending with an alphanumeric
   character and possibly also containing "-" characters.  The rightmost
   domain label of a fully qualified domain name will never start with a
   digit, thus syntactically distinguishing domain names from IPv4
   addresses, and may be followed by a single "." if it is necessary to
   distinguish between the complete domain name and any local domain.
   To actually be "Uniform" as a resource locator, a URL hostname should
   be a fully qualified domain name.  In practice, however, the host
   component may be a local domain literal.

      Note: A suitable representation for including a literal IPv6
      address as the host part of a URL is desired, but has not yet been
      determined or implemented in practice.

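Hostnames take the form described in Section 3 of [RFC1034] and Section 2.1 of [RFC1123]: a sequence of domain labels separated by ".", where each label starts and ends with an alphanumeric character and may contain "-" in between. The rightmost label of a fully qualified domain name never starts with a digit, which syntactically distinguishes domain names from IPv4 addresses. When the complete domain name must be distinguished from a local domain, a single "." may follow the rightmost label to mark the name as complete. To be truly "Uniform" as a resource locator, a URL hostname should be a fully qualified domain name; in practice, however, the host component may be a local domain literal.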

Note: A suitable notation for including a literal IPv6 address in the host part of a URL is desired, but none has yet been determined or implemented in practice.
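The host productions can be transcribed into regular expressions to show how the leading character of the rightmost label separates domain names from IPv4 addresses. This is an illustrative sketch: the names HOSTNAME, IPV4ADDRESS and classify_host are assumptions, and, like the grammar itself, the IPv4 pattern does not check that each group fits in the range 0 to 255.

```python
import re

_ALNUM = "[A-Za-z0-9]"
# domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
_DOMAINLABEL = rf"{_ALNUM}(?:[A-Za-z0-9-]*{_ALNUM})?"
# toplabel    = alpha | alpha *( alphanum | "-" ) alphanum
_TOPLABEL = rf"[A-Za-z](?:[A-Za-z0-9-]*{_ALNUM})?"
# hostname    = *( domainlabel "." ) toplabel [ "." ]
HOSTNAME = re.compile(rf"(?:{_DOMAINLABEL}\.)*{_TOPLABEL}\.?")
# IPv4address = 1*digit "." 1*digit "." 1*digit "." 1*digit
IPV4ADDRESS = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+")


def classify_host(host: str) -> str:
    """Return 'hostname' or 'IPv4address', or raise ValueError."""
    if HOSTNAME.fullmatch(host):
        return "hostname"
    if IPV4ADDRESS.fullmatch(host):
        return "IPv4address"
    raise ValueError(f"not a valid host: {host!r}")


if __name__ == "__main__":
    for sample in ("www.example.com", "example.com.", "192.0.2.1"):
        print(sample, "->", classify_host(sample))
```

Because toplabel must begin with a letter, a dotted string whose last group is numeric can never match the hostname pattern, so the two alternatives do not overlap.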

   The port is the network port number for the server.  Most schemes
   designate protocols that have a default port number.  Another port
   number may optionally be supplied, in decimal, separated from the
   host by a colon.  If the port is omitted, the default port number is
   assumed.

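The port is the network port number of the server. Most schemes designate protocols that have a default port number, but a different port number may optionally be supplied. It is written in decimal and separated from the host by a colon. If the port is omitted, the default port number is assumed.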
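The fallback to a default port might look like the sketch below. The DEFAULT_PORTS table is a small illustrative assumption (the section itself lists no defaults), and effective_port is a hypothetical helper. Since the grammar allows port to be an empty digit string, an empty port is treated here the same way as an omitted one.

```python
from typing import Optional

# Illustrative table only; each scheme's own specification defines its default.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "gopher": 70}


def effective_port(scheme: str, port: Optional[str]) -> Optional[int]:
    """Return the port to connect to.

    port is the decimal string from the URI, or None/"" when omitted.
    Returns None when no port is given and no default is known.
    """
    if port:
        return int(port)
    return DEFAULT_PORTS.get(scheme.lower())


if __name__ == "__main__":
    print(effective_port("http", None))
    print(effective_port("HTTP", "8080"))
```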

3.3. Path Component

   The path component contains data, specific to the authority (or the
   scheme if there is no authority component), identifying the resource
   within the scope of that scheme and authority.

      path          = [ abs_path | opaque_part ]

      path_segments = segment *( "/" segment )
      segment       = *pchar *( ";" param )
      param         = *pchar

      pchar         = unreserved | escaped |
                      ":" | "@" | "&" | "=" | "+" | "$" | ","

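The path component contains data, specific to the authority (or to the scheme when there is no authority component), that identifies the resource within the scope of that scheme and authority.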

   The path may consist of a sequence of path segments separated by a
   single slash "/" character.  Within a path segment, the characters
   "/", ";", "=", and "?" are reserved.  Each path segment may include a
   sequence of parameters, indicated by the semicolon ";" character.
   The parameters are not significant to the parsing of relative
   references.

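The path may consist of a sequence of path segments separated by a single slash "/". Within a path segment, the characters "/", ";", "=", and "?" are reserved. Each path segment may include a sequence of parameters, introduced by the semicolon ";". The parameters are not significant when parsing relative references.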
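Following the abs_path, path_segments and segment productions, an absolute path can be broken into segments and per-segment parameters as sketched below. The function name split_abs_path and the sample path are assumptions for this example. Escaped characters are left untouched.

```python
from typing import List, Tuple


def split_abs_path(abs_path: str) -> List[Tuple[str, List[str]]]:
    """Split an abs_path into (segment_name, [param, ...]) pairs.

    abs_path = "/" path_segments, segments are separated by "/", and
    parameters inside a segment are introduced by ";".
    """
    if not abs_path.startswith("/"):
        raise ValueError(f"abs_path must start with '/': {abs_path!r}")
    result = []
    for segment in abs_path[1:].split("/"):
        name, *params = segment.split(";")
        result.append((name, params))
    return result


if __name__ == "__main__":
    for name, params in split_abs_path("/docs/report;version=2;lang=en/index"):
        print(repr(name), params)
```

An empty segment is legal under the grammar, so "/" yields a single segment with an empty name rather than an error.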
