Uniform Resource Identifiers (URI): Generic Syntax (5)

5. Relative URI References

   It is often the case that a group or "tree" of documents has been
   constructed to serve a common purpose; the vast majority of URI in
   these documents point to resources within the tree rather than
   outside of it.  Similarly, documents located at a particular site are
   much more likely to refer to other resources at that site than to
   resources at remote sites.

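It is common for several documents to form a group, or tree, that serves a shared purpose. Most URI in such documents point to resources inside the same tree rather than outside it. Likewise, a document at a given site refers to other resources at that site more often than to resources at external sites.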

   Relative addressing of URI allows document trees to be partially
   independent of their location and access scheme.  For instance, it is
   possible for a single set of hypertext documents to be simultaneously
   accessible and traversable via each of the "file", "http", and "ftp"
   schemes if the documents refer to each other using relative URI.
   Furthermore, such document trees can be moved, as a whole, without
   changing any of the relative references.  Experience within the WWW
   has demonstrated that the ability to perform relative referencing is
   necessary for the long-term usability of embedded URI.

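Making URI relative loosens, to some degree, the ties between a document tree and its location and access scheme. For example, if a set of hypertext documents refer to one another with relative URI, they can reach one another whether the scheme is "file", "http", or "ftp". Such a document tree can also be moved as a whole without changing its relative references. Experience with the WWW has shown that relative referencing is indispensable if URI embedded in documents are to remain valid over the long term.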

   The syntax for relative URI takes advantage of the <hier_part> syntax
   of <absoluteURI> (Section 3) in order to express a reference that is
   relative to the namespace of another hierarchical URI.

      relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]

   A relative reference beginning with two slash characters is termed a
   network-path reference, as defined by <net_path> in Section 3.  Such
   references are rarely used.

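A relative reference that begins with two slashes is defined as <net_path> in Section 3 (URI Syntactic Components) and is called a network-path reference. Such references are seldom used, however.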

   A relative reference beginning with a single slash character is
   termed an absolute-path reference, as defined by <abs_path> in
   Section 3.

   A relative reference that does not begin with a scheme name or a
   slash character is termed a relative-path reference.

      rel_path      = rel_segment [ abs_path ]

      rel_segment   = 1*( unreserved | escaped |
                          ";" | "@" | "&" | "=" | "+" | "$" | "," )

   Within a relative-path reference, the complete path segments "." and
   ".." have special meanings: "the current hierarchy level" and "the
   level above this hierarchy level", respectively.  Although this is
   very similar to their use within Unix-based filesystems to indicate
   directory levels, these path components are only considered special
   when resolving a relative-path reference to its absolute form
   (Section 5.2).

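Within a relative-path reference, the complete path segments "." and ".." carry special meaning: "." denotes the current hierarchy level and ".." the level directly above it. The notation closely resembles the one Unix-based filesystems use for directory levels, but these path components are treated specially only when a relative-path reference is resolved to its absolute form (Section 5.2).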

   Authors should be aware that a path segment which contains a colon
   character cannot be used as the first segment of a relative URI path
   (e.g., "this:that"), because it would be mistaken for a scheme name.

Authors of URI should remember that a path segment containing a colon (for example, "this:that") cannot serve as the first segment of a relative URI path, because it would be misread as a scheme name.

   It is therefore necessary to precede such segments with other
   segments (e.g., "./this:that") in order for them to be referenced as
   a relative path.
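The kinds of reference described above, and the colon pitfall, can be illustrated with a short Python sketch. The function name `classify_reference` and the sample references are illustrative assumptions, not part of the specification. The scheme pattern follows the scheme grammar of Section 3, and empty references are not handled.

```python
import re
from urllib.parse import urlsplit

# Scheme grammar: a letter followed by letters, digits, "+", "-", or ".".
_SCHEME_PREFIX = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def classify_reference(ref: str) -> str:
    """Name the kind of a non-empty URI reference (hypothetical helper)."""
    if _SCHEME_PREFIX.match(ref):
        return "absolute URI"            # starts with a scheme name
    if ref.startswith("//"):
        return "network-path reference"  # <net_path>
    if ref.startswith("/"):
        return "absolute-path reference" # <abs_path>
    return "relative-path reference"     # <rel_path>


for sample in ("//example.org/x", "/x/y", "x/y", "this:that", "./this:that"):
    print(f"{sample!r}: {classify_reference(sample)}")

# The standard library parser makes the same distinction for the colon case.
print(urlsplit("this:that").scheme)    # the first segment is read as a scheme
print(urlsplit("./this:that").path)    # the whole reference stays a path
```

Prefixing the segment with "./" changes nothing about the resource the path names once resolved; it only keeps the parser from seeing a scheme.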

   It is not necessary for all URI within a given scheme to be
   restricted to the <hier_part> syntax, since the hierarchical
   properties of that syntax are only necessary when relative URI are
   used within a particular document.  Documents can only make use of
   relative URI when their base URI fits within the <hier_part> syntax.
   It is assumed that any document which contains a relative reference
   will also have a base URI that obeys the syntax.  In other words,
   relative URI cannot be used within a document that has an unsuitable
   base URI.

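The hierarchical properties of the <hier_part> syntax are needed only when relative URI are used within a particular document, so not every URI in a given scheme has to be restricted to that syntax. Only a document whose base URI fits the <hier_part> syntax can use relative URI. A document that contains a relative reference is therefore assumed to have a base URI that obeys this syntax. Put another way, relative URI cannot be used in a document that lacks a suitable base URI.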

   Some URI schemes do not allow a hierarchical syntax matching the
   <hier_part> syntax, and thus cannot use relative references.

5.1. Establishing a Base URI

   The term "relative URI" implies that there exists some absolute "base
   URI" against which the relative reference is applied.  Indeed, the
   base URI is necessary to define the semantics of any relative URI
   reference; without it, a relative reference is meaningless.  In order
   for relative URI to be usable within a document, the base URI of that
   document must be known to the parser.

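The term "relative URI" implies that an absolute base URI exists against which the relative reference is applied. The base URI is in fact required to define the meaning of any relative URI reference; without it, a relative reference means nothing. For relative URI to be usable within a document, the parser must know that document's base URI.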

   The base URI of a document can be established in one of four ways,
   listed below in order of precedence.  The order of precedence can be
   thought of in terms of layers, where the innermost defined base URI
   has the highest precedence.  This can be visualized graphically as:

      .----------------------------------------------------------.
      |  .----------------------------------------------------.  |
      |  |  .----------------------------------------------.  |  |
      |  |  |  .----------------------------------------.  |  |  |
      |  |  |  |  .----------------------------------.  |  |  |  |
      |  |  |  |  |       <relative_reference>       |  |  |  |  |
      |  |  |  |  `----------------------------------'  |  |  |  |
      |  |  |  | (5.1.1) Base URI embedded in the       |  |  |  |
      |  |  |  |         document's content             |  |  |  |
      |  |  |  `----------------------------------------'  |  |  |
      |  |  | (5.1.2) Base URI of the encapsulating entity |  |  |
      |  |  |         (message, document, or none).        |  |  |
      |  |  `----------------------------------------------'  |  |
      |  | (5.1.3) URI used to retrieve the entity            |  |
      |  `----------------------------------------------------'  |
      | (5.1.4) Default Base URI is application-dependent        |
      `----------------------------------------------------------'

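The base URI of a document is established in one of four ways, given here in order of precedence. The precedence can be pictured as layers, with the innermost defined base URI ranking highest. From highest to lowest precedence, the four sources are:

5.1.1. Base URI embedded in the document's content
5.1.2. Base URI of the encapsulating entity
5.1.3. Base URI taken from the retrieval URI
5.1.4. Default base URI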
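The order of precedence could be expressed as a simple fallback chain. The following Python sketch is only an illustration: the function name `establish_base_uri`, its parameter names, and the example URI are assumptions, and `None` stands for a source that is not available.

```python
from typing import Optional


def establish_base_uri(
    embedded: Optional[str],
    encapsulating: Optional[str],
    retrieval: Optional[str],
    application_default: str,
) -> str:
    """Pick the base URI from the innermost available layer (hypothetical helper)."""
    # 5.1.1, then 5.1.2, then 5.1.3: the first source that is present wins.
    for candidate in (embedded, encapsulating, retrieval):
        if candidate is not None:
            return candidate
    # 5.1.4: nothing else applies, so the application decides.
    return application_default


# No embedded base and no encapsulating entity: the retrieval URI is used.
print(establish_base_uri(None, None, "http://example.org/docs/index.html",
                         "file:///"))
```

After a redirected request, the value passed as `retrieval` would be the last URI used, as Section 5.1.3 states.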

5.1.1. Base URI within Document Content

   Within certain document media types, the base URI of the document can
   be embedded within the content itself such that it can be readily
   obtained by a parser.  This can be useful for descriptive documents,
   such as tables of content, which may be transmitted to others through
   protocols other than their usual retrieval context (e.g., E-Mail or
   USENET news).

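For certain media types, the base URI can be embedded in the document itself so that a parser can obtain it readily. This helps when a descriptive document such as a table of contents is sent over a protocol other than its usual means of retrieval (for example, e-mail or USENET news).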

   It is beyond the scope of this document to specify how, for each
   media type, the base URI can be embedded.  It is assumed that user
   agents manipulating such media types will be able to obtain the
   appropriate syntax from that media type's specification.  An example
   of how the base URI can be embedded in the Hypertext Markup Language
   (HTML) [RFC1866] is provided in Appendix D.

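How the base URI is embedded for each media type is beyond the scope of this document. User agents that handle such media types are expected to obtain the appropriate syntax from the media type's specification. Appendix D (Embedding the Base URI in HTML Documents) gives an example of embedding the base URI in HTML [RFC1866].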

   A mechanism for embedding the base URI within MIME container types
   (e.g., the message and multipart types) is defined by MHTML
   [RFC2110].  Protocols that do not use the MIME message header syntax,
   but which do allow some form of tagged metainformation to be included
   within messages, may define their own syntax for defining the base
   URI as part of a message.

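MHTML [RFC2110] defines a mechanism for embedding the base URI in MIME container types (for example, the message and multipart types). A protocol that does not use the MIME message header syntax but does allow some form of tagged metainformation within messages may define its own syntax for specifying the base URI as part of a message.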

5.1.2. Base URI from the Encapsulating Entity

   If no base URI is embedded, the base URI of a document is defined by
   the document's retrieval context.  For a document that is enclosed
   within another entity (such as a message or another document), the
   retrieval context is that entity; thus, the default base URI of the
   document is the base URI of the entity in which the document is
   encapsulated.

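When no base URI is embedded, the document's base URI is defined by the context in which the document was retrieved. If the document is enclosed in another entity (such as a message or another document), that entity is the retrieval context. The document's default base URI is therefore the base URI of the entity that encapsulates it.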

5.1.3. Base URI from the Retrieval URI

   If no base URI is embedded and the document is not encapsulated
   within some other entity (e.g., the top level of a composite entity),
   then, if a URI was used to retrieve the base document, that URI shall
   be considered the base URI.  Note that if the retrieval was the
   result of a redirected request, the last URI used (i.e., that which
   resulted in the actual retrieval of the document) is the base URI.

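When no base URI is embedded and the document is not encapsulated in another entity (for example, when it is the top level of a composite entity), any URI that was used to retrieve the base document is taken as the base URI. Note that if the retrieval followed a redirected request, the base URI is the last URI used, that is, the one that actually retrieved the document.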

5.1.4. Default Base URI

   If none of the conditions described in Sections 5.1.1--5.1.3 apply,
   then the base URI is defined by the context of the application.
   Since this definition is necessarily application-dependent, failing
   to define the base URI using one of the other methods may result in
   the same content being interpreted differently by different types of
   application.

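When none of the conditions in Sections 5.1.1 through 5.1.3 applies, the base URI is defined by the application. Because this is necessarily application-dependent, a document whose base URI is not established by one of the other methods may have the same content interpreted differently by different kinds of application.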

   It is the responsibility of the distributor(s) of a document
   containing relative URI to ensure that the base URI for that document
   can be established.  It must be emphasized that relative URI cannot
   be used reliably in situations where the document's base URI is not
   well-defined.

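Whoever distributes a document containing relative URI is responsible for ensuring that its base URI can be established. It must be stressed that relative URI cannot be used reliably when the document's base URI is not well-defined.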

5.2. Resolving Relative References to Absolute Form

   This section describes an example algorithm for resolving URI
   references that might be relative to a given base URI.

   The base URI is established according to the rules of Section 5.1 and
   parsed into the four main components as described in Section 3.  Note
   that only the scheme component is required to be present in the base
   URI; the other components may be empty or undefined.  A component is
   undefined if its preceding separator does not appear in the URI
   reference; the path component is never undefined, though it may be
   empty.  The base URI's query component is not used by the resolution
   algorithm and may be discarded.

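The base URI is established by the rules of Section 5.1 (Establishing a Base URI) and parsed into the four main components described in Section 3 (URI Syntactic Components). Only the scheme component is required in the base URI; the others may be empty or undefined. A component is undefined when the separator that precedes it does not appear in the URI reference. The path component is never undefined, although it may be empty. The resolution algorithm does not use the base URI's query component, so it may be discarded.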

   For each URI reference, the following steps are performed in order:

   1) The URI reference is parsed into the potential four components and
      fragment identifier, as described in Section 4.3.

   2) If the path component is empty and the scheme, authority, and
      query components are undefined, then it is a reference to the
      current document and we are done.  Otherwise, the reference URI's
      query and fragment components are defined as found (or not found)
      within the URI reference and not inherited from the base URI.

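2) If the path component is empty and the scheme, authority, and query components are undefined, the reference points to the current document and resolution ends. Otherwise, the reference URI's query component and fragment identifier are taken exactly as they appear (or do not appear) in the URI reference and are not inherited from the base URI.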

   3) If the scheme component is defined, indicating that the reference
      starts with a scheme name, then the reference is interpreted as an
      absolute URI and we are done.  Otherwise, the reference URI's
      scheme is inherited from the base URI's scheme component.

      Due to a loophole in prior specifications [RFC1630], some parsers
      allow the scheme name to be present in a relative URI if it is the
      same as the base URI scheme.  Unfortunately, this can conflict
      with the correct parsing of non-hierarchical URI.  For backwards
      compatibility, an implementation may work around such references
      by removing the scheme if it matches that of the base URI and the
      scheme is known to always use the <hier_part> syntax.  The parser
      can then continue with the steps below for the remainder of the
      reference components.  Validating parsers should mark such a
      misformed relative reference as an error.

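3) If the scheme component is defined, meaning the reference begins with a scheme name, the reference is interpreted as an absolute URI and resolution ends. Otherwise, the reference URI inherits its scheme from the base URI's scheme component.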

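Because of a loophole in an earlier specification [RFC1630], some parsers accept a scheme name in a relative URI when it matches the base URI's scheme. Unfortunately, this can conflict with correct parsing of non-hierarchical URI. For backward compatibility, an implementation may work around such a reference by removing the scheme when it matches that of the base URI and is known always to use the <hier_part> syntax. The parser can then continue with the steps below for the remaining components of the reference. A validating parser should flag such a malformed relative reference as an error.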

   4) If the authority component is defined, then the reference is a
      network-path and we skip to step 7.  Otherwise, the reference
      URI's authority is inherited from the base URI's authority
      component, which will also be undefined if the URI scheme does not
      use an authority component.

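4) If the authority component is defined, the reference is a network-path and processing skips to step 7. Otherwise, the reference URI inherits its authority from the base URI's authority component, which is itself undefined when the URI scheme does not use an authority component.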

   5) If the path component begins with a slash character ("/"), then
      the reference is an absolute-path and we skip to step 7.

   6) If this step is reached, then we are resolving a relative-path
      reference.  The relative path needs to be merged with the base
      URI's path.  Although there are many ways to do this, we will
      describe a simple method using a separate string buffer.

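6) This step is reached only when a relative-path reference is being resolved. The relative path has to be merged with the base URI's path. There are many ways to do so; the one described here is a simple method that uses a separate string buffer.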

      a) All but the last segment of the base URI's path component is
         copied to the buffer.  In other words, any characters after the
         last (right-most) slash character, if any, are excluded.

a) Copy everything except the last segment of the base URI's path into the buffer. In other words, any characters after the last (right-most) slash are left out.

      b) The reference's path component is appended to the buffer
         string.

      c) All occurrences of "./", where "." is a complete path segment,
         are removed from the buffer string.

      d) If the buffer string ends with "." as a complete path segment,
         that "." is removed.

      e) All occurrences of "<segment>/../", where <segment> is a
         complete path segment not equal to "..", are removed from the
         buffer string.  Removal of these path segments is performed
         iteratively, removing the leftmost matching pattern on each
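         iteration, until no matching pattern remains.

e) Remove from the buffer string every occurrence of "<segment>/../", where <segment> is a complete path segment other than "..". Removal is repeated, taking the leftmost matching pattern each time, until no matching pattern remains.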

      f) If the buffer string ends with "<segment>/..", where <segment>
         is a complete path segment not equal to "..", that
         "<segment>/.." is removed.

      g) If the resulting buffer string still begins with one or more
         complete path segments of "..", then the reference is
         considered to be in error.  Implementations may handle this
         error by retaining these components in the resolved path (i.e.,
         treating them as part of the final URI), by removing them from
         the resolved path (i.e., discarding relative levels above the
         root), or by avoiding traversal of the reference.

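g) If the resulting buffer string still begins with one or more complete path segments of "..", the reference is considered to be in error. An implementation may handle the error by retaining these components in the resolved path (treating them as part of the final URI), by removing them from the resolved path (discarding relative levels above the root), or by declining to traverse the reference.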

      h) The remaining buffer string is the reference URI's new path
         component.
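Sub-steps a) through h) can be followed almost literally with a string buffer and a few regular expressions. The Python sketch below is one possible reading, not a reference implementation. The name `merge_paths` and the sample paths are assumptions, only non-empty segments are treated as removable, and for step g) it takes the first permitted option (leftover ".." segments are retained).

```python
import re

# "./" where "." is a complete segment (at the start or right after a slash).
_DOT_SLASH = re.compile(r"(?<![^/])\./")
# "<segment>/../" where <segment> is a complete, non-empty segment other than "..".
_SEG_UP_SLASH = re.compile(r"(?<![^/])(?!\.\./)[^/]+/\.\./")
# "<segment>/.." at the very end of the buffer.
_SEG_UP_END = re.compile(r"(?<![^/])(?!\.\./)[^/]+/\.\.\Z")


def merge_paths(base_path: str, ref_path: str) -> str:
    """Merge a relative path with a base path, following steps 6a to 6h."""
    # a) keep everything up to and including the last slash of the base path
    buf = base_path[: base_path.rfind("/") + 1]
    # b) append the reference's path
    buf += ref_path
    # c) drop every "./" whose "." is a complete segment
    buf = _DOT_SLASH.sub("", buf)
    # d) drop a trailing "." that is a complete segment
    if buf == "." or buf.endswith("/."):
        buf = buf[:-1]
    # e) repeatedly drop the leftmost "<segment>/../"
    while True:
        buf, removed = _SEG_UP_SLASH.subn("", buf, count=1)
        if removed == 0:
            break
    # f) drop a trailing "<segment>/.."
    buf = _SEG_UP_END.sub("", buf)
    # g) any ".." still at the front is an error; this sketch retains it
    # h) the buffer is the new path component
    return buf


for ref in ("g", "./g", "..", "../../g", "../../../g"):
    print(f"{ref!r} -> {merge_paths('/b/c/d;p', ref)!r}")
```

An implementation that prefers to reject or strip leading ".." segments would add its check where the step g) comment sits.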

   7) The resulting URI components, including any inherited from the
      base URI, are recombined to give the absolute form of the URI
      reference.  Using pseudocode, this would be

         result = ""

         if scheme is defined then
             append scheme to result
             append ":" to result

         if authority is defined then
             append "//" to result
             append authority to result

         append path to result

         if query is defined then
             append "?" to result
             append query to result

         if fragment is defined then
             append "#" to result
             append fragment to result

         return result

      Note that we must be careful to preserve the distinction between a
      component that is undefined, meaning that its separator was not
      present in the reference, and a component that is empty, meaning
      that the separator was present and was immediately followed by the
      next component separator or the end of the reference.

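Take care to preserve the distinction between an undefined component and an empty one. An undefined component is one whose separator is absent from the reference; an empty component is one whose separator is present and is followed immediately by the next component's separator or by the end of the reference.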
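One way to keep the undefined/empty distinction is to represent an undefined component as `None` and an empty one as `""`. The Python sketch below parses a reference with the regular expression given in Appendix B of the same specification and recombines it as in step 7. The function names `parse_reference` and `recompose` and the sample URI are assumptions made for illustration.

```python
import re

# Regular expression from Appendix B of the specification; optional groups
# that do not take part in the match are reported as None.
_URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def parse_reference(ref: str) -> dict:
    """Split a URI reference; None means undefined, "" means empty."""
    match = _URI_RE.match(ref)  # every part is optional, so this always matches
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),      # never None, though it may be ""
        "query": match.group(7),
        "fragment": match.group(9),
    }


def recompose(parts: dict) -> str:
    """Step 7: rebuild the URI, emitting separators only for defined parts."""
    result = ""
    if parts["scheme"] is not None:
        result += parts["scheme"] + ":"
    if parts["authority"] is not None:
        result += "//" + parts["authority"]
    result += parts["path"]
    if parts["query"] is not None:
        result += "?" + parts["query"]
    if parts["fragment"] is not None:
        result += "#" + parts["fragment"]
    return result


print(parse_reference("http://a/b?")["query"])   # empty string: "?" was present
print(parse_reference("http://a/b")["query"])    # None: no "?" at all
print(recompose(parse_reference("http://a/b?")))
```

Because `recompose` tests for `None` rather than truthiness, an empty query or fragment keeps its separator when the reference is rebuilt.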

   The above algorithm is intended to provide an example by which the
   output of implementations can be tested -- implementation of the
   algorithm itself is not required.  For example, some systems may find
   it more efficient to implement step 6 as a pair of segment stacks
   being merged, rather than as a series of string pattern replacements.

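The algorithm above is meant as an example against which the output of implementations can be tested; implementing the algorithm itself is not required. For example, some systems may find it more efficient to implement step 6 by merging a pair of segment stacks rather than by repeated string replacement.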
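A segment-stack variant of step 6 might look like the following Python sketch. It is an assumption about what such an implementation could be, not something the specification prescribes. The name `merge_paths_stack` and the sample paths are hypothetical, leftover ".." segments are retained, and reference paths containing empty segments (consecutive slashes) are not considered.

```python
def merge_paths_stack(base_path: str, ref_path: str) -> str:
    """Merge a relative path with a base path using lists as segment stacks."""
    # Base stack: all segments except the last one. For an absolute base
    # path the first element is "" and stands for the root.
    out = base_path.split("/")[:-1]
    segments = ref_path.split("/")
    last_index = len(segments) - 1

    for index, segment in enumerate(segments):
        is_last = index == last_index
        if segment == ".":
            # "." adds nothing; as the final segment it leaves a trailing slash.
            if is_last:
                out.append("")
        elif segment == "..":
            at_root = out == [""]
            if out and not at_root and out[-1] != "..":
                out.pop()            # go up one level
                if is_last:
                    out.append("")   # keep the trailing slash
            else:
                out.append("..")     # nothing left to pop: retain ".."
        else:
            out.append(segment)

    return "/".join(out)


for ref in ("g", "../g", "../..", "../../../g", "g;x=1/../y"):
    print(f"{ref!r} -> {merge_paths_stack('/b/c/d;p', ref)!r}")
```

Each segment is visited once, so the work grows linearly with the path length, whereas repeated leftmost pattern removal rescans the buffer after every deletion.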

      Note: Some WWW client applications will fail to separate the
      reference's query component from its path component before merging
      the base and reference paths in step 6 above.  This may result in
      a loss of information if the query component contains the strings
      "/../" or "/./".

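Note: Some WWW client applications fail to separate the reference's query component from its path before merging the base and reference paths in step 6 above. Information may then be lost when the query component contains "/../" or "/./".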
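The loss of information described in this note can be reproduced in a few lines. The Python sketch below contrasts a naive merge that normalizes path and query together with the standard library's `urljoin`, which splits off the query first. The base URI and the reference are illustrative assumptions, and `posixpath.normpath` merely stands in for a client that normalizes the whole string.

```python
import posixpath
from urllib.parse import urljoin

base = "http://a/b/c/d;p?q"   # hypothetical base URI
ref = "g?y/../x"              # the query component contains "/../"

# Faulty approach: the query is still attached when dot segments are removed,
# so "g?y" is treated as a path segment and cancelled by the "..".
naive_path = posixpath.normpath("/b/c/" + ref)
print(naive_path)

# Separating the query before merging keeps it intact.
resolved = urljoin(base, ref)
print(resolved)
```

In the faulty result both the segment "g" and the query have disappeared, which is the failure the note warns about.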
