Uniform Resource Identifiers (URI): Generic Syntax (4)

4. URI References

   The term "URI-reference" is used here to denote the common usage of a
   resource identifier.  A URI reference may be absolute or relative,
   and may have additional information attached in the form of a
   fragment identifier.  However, "the URI" that results from such a
   reference includes only the absolute URI after the fragment
   identifier (if any) is removed and after any relative URI is resolved
   to its absolute form.  Although it is possible to limit the
   discussion of URI syntax and semantics to that of the absolute
   result, most usage of URI is within general URI references, and it is
   impossible to obtain the URI from such a reference without also
   parsing the fragment and resolving the relative form.

      URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]

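In this document, the term "URI reference" denotes the common usage of a resource identifier. A URI reference may be absolute or relative, and may carry additional information in the form of a fragment identifier. However, the URI that results from such a reference is only the absolute URI obtained after removing the fragment identifier (if any) and resolving any relative URI to its absolute form. The discussion of URI syntax and semantics could be limited to absolute URIs, but most URIs appear inside general URI references, so the URI cannot be obtained from such a reference without parsing the fragment identifier and resolving the relative form.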
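The two steps described above (strip the fragment, then resolve the relative form) can be sketched with Python's standard library. This is only an illustration: `resolve_reference` and the example.com addresses are hypothetical, and `urllib.parse` follows the later RFC 3986 rules, which agree with this section for simple cases like the one shown.

```python
from urllib.parse import urldefrag, urljoin


def resolve_reference(base_uri: str, reference: str) -> tuple:
    """Return (absolute URI, fragment) for a URI reference.

    base_uri is assumed to be the absolute URI of the document
    that contains the reference.
    """
    # Step 1: split off the optional "#" fragment identifier.
    without_fragment, fragment = urldefrag(reference)
    # Step 2: resolve a relative URI against the base; an absolute
    # URI passes through unchanged.
    absolute = urljoin(base_uri, without_fragment)
    return absolute, fragment


if __name__ == "__main__":
    base = "http://example.com/docs/index.html"
    print(resolve_reference(base, "chapter2.html#intro"))
```

The fragment is returned separately because, as the text says, it is not part of the resulting URI.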

   The syntax for relative URI is a shortened form of that for absolute
   URI, where some prefix of the URI is missing and certain path
   components ("." and "..") have a special meaning when, and only when,
   interpreting a relative path.  The relative URI syntax is defined in
   Section 5.

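The relative URI syntax is a shortened form of the absolute URI syntax: a leading part of the absolute URI is omitted, and the path components "." and ".." take on a special meaning only when a relative path is being interpreted. The relative URI syntax is defined in Section 5, Relative URI References.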
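As a small illustration of how "." and ".." behave during resolution, the sketch below resolves a few relative paths against a made-up base URI. The base `http://a/b/c/d;p?q` and the name `resolved` are chosen for the example only, and `urljoin` implements the RFC 3986 algorithm rather than Section 5 of this document, though the two agree on these inputs.

```python
from urllib.parse import urljoin

# Hypothetical base URI of the document containing the references.
BASE = "http://a/b/c/d;p?q"

# Map each relative reference to its resolved absolute form.
resolved = {ref: urljoin(BASE, ref) for ref in ("g", "./g", "../g", "../../g")}

if __name__ == "__main__":
    for ref, absolute in resolved.items():
        print(f"{ref:8} -> {absolute}")
```

Each ".." removes one directory level from the base path, while "." stands for the base's own directory.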

4.1. Fragment Identifier

   When a URI reference is used to perform a retrieval action on the
   identified resource, the optional fragment identifier, separated from
   the URI by a crosshatch ("#") character, consists of additional
   reference information to be interpreted by the user agent after the
   retrieval action has been successfully completed.  As such, it is not
   part of a URI, but is often used in conjunction with a URI.

      fragment      = *uric

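When a URI reference is used to retrieve the identified resource, an optional fragment identifier may be appended, separated from the URI by "#". The fragment identifier is additional reference information that the user agent interprets after the retrieval has completed successfully. For this reason the fragment identifier is not part of the URI, although it is often used together with one.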

   The semantics of a fragment identifier is a property of the data
   resulting from a retrieval action, regardless of the type of URI used
   in the reference.  Therefore, the format and interpretation of
   fragment identifiers is dependent on the media type [RFC2046] of the
   retrieval result.  The character restrictions described in Section 2
   for URI also apply to the fragment in a URI-reference.  Individual
   media types may define additional restrictions or structure within
   the fragment for specifying different types of "partial views" that
   can be identified within that media type.

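The meaning of a fragment identifier is a property of the data obtained by the retrieval, regardless of the type of URI used in the reference. The format and interpretation of fragment identifiers therefore depend on the media type [RFC2046] of the retrieved result. The character restrictions on URIs described in Section 2 also apply to the fragment in a URI reference. Each media type may define additional restrictions or structure within the fragment in order to specify the different kinds of "partial views" that can be identified within that media type.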
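The rule `fragment = *uric` can be checked mechanically. The sketch below assumes the Section 2 definition of `uric` (reserved, unreserved, or a percent-escaped octet); the names `FRAGMENT_RE` and `is_valid_fragment` are illustrative. It checks only the generic character restrictions, not any structure a particular media type might add.

```python
import re

# uric = reserved | unreserved | escaped, per Section 2 of the RFC:
#   reserved   = ; / ? : @ & = + $ ,
#   unreserved = alphanumerics and the marks - _ . ! ~ * ' ( )
#   escaped    = "%" followed by two hex digits
FRAGMENT_RE = re.compile(
    r"(?:[A-Za-z0-9;/?:@&=+$,\-_.!~*'()]|%[0-9A-Fa-f]{2})*"
)


def is_valid_fragment(fragment: str) -> bool:
    """Return True if the string matches `*uric` (may be empty)."""
    return FRAGMENT_RE.fullmatch(fragment) is not None


if __name__ == "__main__":
    for candidate in ("section2", "", "a b", "caf%C3%A9", "50%"):
        print(repr(candidate), is_valid_fragment(candidate))
```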

   A fragment identifier is only meaningful when a URI reference is
   intended for retrieval and the result of that retrieval is a document
   for which the identified fragment is consistently defined.

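A fragment identifier is meaningful only when the URI reference is intended for retrieval and the retrieved result is a document in which the identified fragment is consistently defined.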

4.2. Same-document References

   A URI reference that does not contain a URI is a reference to the
   current document.  In other words, an empty URI reference within a
   document is interpreted as a reference to the start of that document,
   and a reference containing only a fragment identifier is a reference
   to the identified fragment of that document.  Traversal of such a
   reference should not result in an additional retrieval action.
   However, if the URI reference occurs in a context that is always
   intended to result in a new request, as in the case of HTML's FORM
   element, then an empty URI reference represents the base URI of the
   current document and should be replaced by that URI when transformed
   into a request.

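A URI reference that contains no URI refers to the current document. That is, an empty URI reference within a document is interpreted as a reference to the start of that document, and a reference containing only a fragment identifier is interpreted as a reference to the identified fragment of the same document. Following such a reference should not cause the already retrieved document to be fetched again. However, where the URI reference occurs in a context that always implies a new request, as with HTML's FORM element, an empty URI reference stands for the base URI of the current document and should be replaced by that base URI when the request is made.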
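One way a user agent might apply these rules is sketched below. The function `classify_reference`, its `always_new_request` flag, and the returned labels are all hypothetical and chosen only to mirror the cases in the paragraph above; a real user agent would need considerably more context.

```python
from urllib.parse import urldefrag


def classify_reference(reference: str, always_new_request: bool = False) -> str:
    """Decide how to treat a URI reference found in the current document.

    always_new_request models contexts such as HTML's FORM element,
    where following the reference is always meant to issue a request.
    """
    uri_part, fragment = urldefrag(reference)
    if uri_part:
        # The reference contains a URI: resolve it and retrieve as usual.
        return "retrieve"
    if always_new_request:
        # Empty URI in a request context: substitute the base URI.
        return "request-base-uri"
    if fragment:
        # Fragment only: jump within the document already retrieved.
        return "same-document-fragment"
    # Completely empty reference: the start of the current document.
    return "same-document-start"


if __name__ == "__main__":
    for ref in ("", "#notes", "other.html#notes"):
        print(repr(ref), "->", classify_reference(ref))
```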

4.3. Parsing a URI Reference

   A URI reference is typically parsed according to the four main
   components and fragment identifier in order to determine what
   components are present and whether the reference is relative or
   absolute.  The individual components are then parsed for their
   subparts and, if not opaque, to verify their validity.

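A URI reference is normally parsed into the four main components and the fragment identifier in order to determine which components are present and whether the reference is relative or absolute. Each component is then parsed into its subparts and, unless it is opaque, checked for validity.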
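The first parsing pass can be approximated with `urllib.parse.urlsplit`, which splits a reference into scheme, authority (called `netloc` in Python), path, query, and fragment. The helper `describe_reference` is a hypothetical name, and this sketch treats an empty component as absent, which is a simplification.

```python
from urllib.parse import urlsplit

COMPONENTS = ("scheme", "netloc", "path", "query", "fragment")


def describe_reference(reference: str) -> tuple:
    """Return ("absolute" | "relative", list of non-empty components)."""
    parts = urlsplit(reference)
    # Simplification: an empty component counts as "not present".
    present = [name for name in COMPONENTS if getattr(parts, name)]
    # A reference with a scheme is absolute; otherwise it is relative.
    kind = "absolute" if parts.scheme else "relative"
    return kind, present


if __name__ == "__main__":
    print(describe_reference("http://example.com/a/b?x=1#top"))
    print(describe_reference("../b#top"))
```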

   Although the BNF defines what is allowed in each component, it is
   ambiguous in terms of differentiating between an authority component
   and a path component that begins with two slash characters.  The
   greedy algorithm is used for disambiguation: the left-most matching
   rule soaks up as much of the URI reference string as it is capable of
   matching.  In other words, the authority component wins.

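The BNF defines what is allowed in each component, but it is ambiguous about distinguishing an authority component from a path component that begins with two slashes. A greedy algorithm resolves the ambiguity: the left-most matching rule consumes as much of the URI reference string as it can match. In other words, the authority component takes precedence.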
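The greedy rule is easy to observe with the regular expression given in Appendix B of the RFC. In the sketch below, only the pattern comes from the RFC; the names `URI_REFERENCE` and `split_reference` and the sample inputs are assumptions made for the example.

```python
import re

# Regular expression from Appendix B of the RFC. The authority group
# (3 and 4) is tried before the path group (5), so a leading "//" is
# always consumed as an authority.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_reference(reference: str) -> dict:
    """Split a URI reference into its components (None when absent)."""
    match = URI_REFERENCE.match(reference)  # every group is optional
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


if __name__ == "__main__":
    # Leading "//": the authority component wins over the path.
    print(split_reference("//example.com/path"))
    # "//" later in the path is not at the start, so it stays in the path.
    print(split_reference("/a//b"))
```

For the first input the result has authority `example.com` and path `/path`; for the second there is no authority and the whole string is the path.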

   Readers familiar with regular expressions should see Appendix B for a
   concrete parsing example and test oracle.

Readers comfortable with regular expressions should see Appendix B, Parsing a URI Reference with a Regular Expression, for a concrete parsing example and test oracle.
