General

Scope

This document describes a set of rules for creating and maintaining documentation and other kinds of materials. The rules are not project-specific. This document itself conforms to these rules.

Notation

Specific formats may be used. The visual output may depend on the method of rendering.

Texts intended to be handled differently (for example, portions of program source code) may be in some specific format. Other texts are considered normal.

Hyperlinks may be used in normal texts pointing to external references, local pages, or anchors in specific documents.

Normal text emphasized in general is in a specific format, usually (visually) bold.

Local terms in the normal text are emphasized at their first appearance in a specific format, usually (visually) italic.

For terms used globally in this document and any other derivations, see below.

Terms and definitions

Resources

Contents of materials are split into resources (e.g. files) in possibly nested namespaces (e.g. directories). For convenience, a namespace is also considered a resource.

Paths and identifiers

A path is used to identify or locate a resource. It can be in various forms (e.g. a filesystem path or a URL).

A path may have several components denoting different levels of namespace or the last level non-namespace resource.

An empty path is a path without any components.

A path with more than one component shall have syntactic separators (e.g. a slash (/) or whitespace) to split different components.

An identifier is a path with exactly one component and without any separators. It can be used to differentiate resources in the same namespace or to collectively name some sets of resources in various namespaces.
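As an illustration of the definitions above, the following minimal sketch splits a path into components and tests whether a path is an identifier. It assumes a slash as the only separator and ignores empty components; the function names are hypothetical and not part of these rules.

```python
from typing import List


def split_path(path: str, separator: str = "/") -> List[str]:
    """Return the components of a path; an empty path has no components."""
    # Assumption: empty components (e.g. from a leading or doubled
    # separator) are ignored rather than treated as namespaces.
    return [component for component in path.split(separator) if component]


def is_identifier(path: str, separator: str = "/") -> bool:
    """An identifier has exactly one component and no separators."""
    return separator not in path and len(split_path(path, separator)) == 1


print(split_path("doc/guide/Intro.md"))
print(is_identifier("Intro.md"), is_identifier("doc/Intro.md"))
```

Other path forms (e.g. URLs or whitespace-separated paths) would need a different separator or parser.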

A resource may be denoted by an identifier or path that is not necessarily unique. However, all resources discussed below in this document are named.

Languages

Rules of natural languages are specified in this subclause. They have effects on normal texts.

Normal text of noun phrases may have embedded translations for different natural languages or more detailed descriptions following their first occurrence, in parentheses (( and )).

Different letter cases (if appropriate) may be used for sentences, acronyms and words in the titles of clauses.

Editions in languages

A set of documentation may be in one (natural) language. In this case, an IETF language tag with at least one subtag, prefixed by an additional dot (.), shall be placed at the end of the identifier of the resource, before the dot and the extension name (if any). Otherwise, the documentation shall be in multiple languages or without text contents (e.g. containing only ideographic images), and no language code shall be in the identifier of the resource.

When the additional dot and tag are removed, all different resources with the same name shall refer to the same set of contents, differing only in language; otherwise, at least one of them shall be incomplete, which means it is to be completed as in the former case. Each such resource is one edition of the documentation in the specific language.
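The naming rule above can be sketched as follows. This is only an illustration: it assumes the identifier has an extension, and it recognizes only a simplified subset of IETF language tags (language, optional script, optional region). The pattern and the name `split_edition` are assumptions for this example.

```python
import re
from typing import Optional, Tuple

# Simplified pattern (an assumption): language[-Script][-REGION].
# It does not cover the full IETF language tag grammar.
_TAG = re.compile(r"[a-z]{2,3}(?:-[A-Z][a-z]{3})?(?:-(?:[A-Z]{2}|[0-9]{3}))?")


def split_edition(identifier: str) -> Tuple[str, Optional[str]]:
    """Split 'Name.tag.ext' into ('Name.ext', 'tag'); tag is None if absent."""
    stem, dot, extension = identifier.rpartition(".")
    if not dot:
        return identifier, None  # no extension, nothing to inspect
    base, tag_dot, tag = stem.rpartition(".")
    if tag_dot and base and _TAG.fullmatch(tag):
        return f"{base}.{extension}", tag
    return identifier, None


print(split_edition("Guide.en-US.md"))
print(split_edition("Guide.md"))
```

Editions of the same documentation then share the first element of the result. Note that such a pattern cannot distinguish a language tag from an unrelated short suffix (e.g. the `tar` in `archive.tar.gz`), so a real tool would need a stricter check.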

Unless explicitly specified otherwise, when meanings conflict among multiple editions in different languages, the complete one shall prevail over the others. If there is more than one complete edition, precedence is determined in the following order:

  • en-US
  • en
  • zh-CN
  • zh

If no edition in the above languages is complete, the documentation is defective.
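The precedence order above can be sketched as a simple lookup. This sketch assumes the caller already knows which editions are complete, and it reads the last rule as "defective" whenever no complete edition is in one of the listed languages; the names `PRECEDENCE` and `prevailing_edition` are hypothetical.

```python
from typing import Iterable, Optional

# Order taken from the list above.
PRECEDENCE = ("en-US", "en", "zh-CN", "zh")


def prevailing_edition(complete_tags: Iterable[str]) -> Optional[str]:
    """Return the tag of the edition that prevails, or None if defective."""
    complete = set(complete_tags)
    for tag in PRECEDENCE:
        if tag in complete:
            return tag
    # No complete edition in the listed languages: documentation is defective.
    return None


print(prevailing_edition(["zh-CN", "en"]))
```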

A language tag may be used to annotate one or more words in text. An annotation of such use is a language tag annotation, which consists of a tag combined with one pair of enclosing parentheses (namely, ( and )).

Hyperlinks in pages should preferably link to localized contents corresponding to the language, or one of the major languages, used in the page (if any) when suitable. If the contents of the linked target are in other languages (esp. when there is more than one semantically identical edition in multiple languages), at least one language tag for the majority of the contents should be noted immediately after the hyperlink; otherwise, the tag should be omitted.

For compatibility with client programs, each URI link should be encoded in the normalized percent-encoding form specified in RFC 3986.
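As a minimal sketch of what such normalization may involve, the following function applies two steps described in RFC 3986 (section 6.2.2) to an already encoded URI: hexadecimal digits in percent-encoded triplets are made uppercase, and triplets that encode unreserved characters are decoded. The function name is hypothetical, and other normalization steps (e.g. case of the scheme and host) are intentionally left out.

```python
import re
import string

# Unreserved characters per RFC 3986: ALPHA / DIGIT / "-" / "." / "_" / "~".
_UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def normalize_percent_encoding(uri: str) -> str:
    """Uppercase percent-encoded triplets and decode unreserved characters."""

    def fix(match: "re.Match") -> str:
        char = chr(int(match.group(1), 16))
        if char in _UNRESERVED:
            return char  # unreserved characters need no encoding
        return "%" + match.group(1).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, uri)


print(normalize_percent_encoding("https://example.org/%7euser/a%2fb%28c%29"))
```

The URI in the usage line is a made-up example.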

Additionally, several hyperlinks are normalized to the same form for a specific language. Currently the rule consists of the following cases:

In English

Stylistic usage of letter cases shall be respected in the following order of precedence:

  1. All uppercase should not be used normally.
  2. Acronyms and other proper nouns (or noun phrases) shall be in the appropriate styles.
  3. The title case style shall be used for page or document titles.
  4. Either the title case or the sentence case shall be used in the titles in a page. This shall be consistent within a document.
  5. Either the title case or the sentence case shall be used in the detailed descriptions for acronyms in parentheses. This may vary in the same page.
  6. Detailed descriptions for acronyms in parentheses may use title case or sentence case.
  7. All lowercase style shall be used for words in the embedded translations or detailed descriptions in parentheses in other cases.
  8. Sentence case should be used otherwise.

The wording of English documentation is intended to conform to the ISO/IEC Directives, Part 3. Note that the use of modal verbs is distinct from that in RFC 2119. For RFC documents, RFC 2119 is preferred, but the case clarification (i.e. RFC 8174) is not required for documents published earlier than RFC 8174, due to compatibility issues.

The following grammatical forms of English (with en or en-US tags) are considered idiomatic, and application of such forms may be preferred:

  • answer ellipsis to elide the subject in the summary of commit messages where a question for the topic of the log message is assumed
  • bare passive clause omitting the auxiliary verb for short descriptive notes (e.g. commit messages in repositories and assertion messages in programs)
  • null subject and pronoun dropping in imperative forms
  • zero article for singular form of a countable noun denoting a specialized term being referenced, usually used in a terse-style title or in a list term (like this line)

Informative notes: The tense and mood used in the logs in version control systems are opinion-based. However, the implied rules are chosen here to avoid imperative forms by default, because:

  • First, as in any information processing system, it should be made clear whom the messages in the logs serve.
    • Version control systems are capable of both reading and writing operations on the version history, with asymmetric operational frequency in general.
      • For most stakeholders of a repository in most cases, read-only accesses of the version history are more frequent than changing operations.
      • This is also consistent with the idiom pattern used in programming: do not abuse imperative updates with side effects.
    • For most users, commit logs are entries in a journal of the version history.
      • They do not and should not care about imperative changes from a logical perspective.
  • Unconstrained changes in the version history as effectful operations can make messes easily.
    • They are usually only well-behaved enough within some local context (e.g. in a single branch of a reliable instance of the version history).
    • They often cause trouble in other cases (e.g. when extracted as patches that are possibly reordered).
  • Messages in the logs may need to work together with other instances of the version history.
    • Essentially, the imperative mood cannot assume that the changes described will always be applied in exactly the same way.
    • As mentioned above, out-of-order changes make messes. If the messages are precise, they also make messes like other changed contents.
  • In general, messages in the logs work for distributed repositories.
    • There is simply no standpoint for the global view of the universe of the version history by default.
    • Messages should be ready to be audited by random accesses, in addition to being applied sequentially in some replays.
    • These facts further undermine the necessity of imperative changes.

Format-specific rules

Text files

Unless otherwise specified, all text files should be encoded as UTF-8 with BOM enabled.

Any use of an encoding which may not be converted verbatim and losslessly in binary form to UTF-8 shall be explicitly specified in the documentation.

The BOM should be omitted for text files dedicated to tools that are not capable of handling it properly. Otherwise, the BOM shall be used whenever possible when it can clarify the encoding being used.

Unless definitely intended and explicitly specified in the documentation, newlines shall be consistent. The default newline is CR+LF.

Two consecutive newlines logically indicate an EOF. Consecutive newlines outside verbatim quoted text (including source code) should only be used at EOF.

Except at the beginning of a line, each word consisting of alphanumeric characters should be separated from other words by a single space character (U+0020).

No space characters should be at EOL.

To keep the semantic rules clear, use the horizontal tab character (U+0009) instead of other whitespace (i.e. U+0020) to indent, unless the text is quoted verbatim.
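Some of the text file rules above can be checked mechanically. The following is a minimal sketch, not a complete checker: it assumes the whole file is available as bytes, it covers only the BOM, CR+LF, EOL whitespace and indentation rules, and it does not recognize verbatim quoted text. The function name and the message strings are hypothetical.

```python
import codecs
import re
from typing import List


def check_text_file(data: bytes) -> List[str]:
    """Return a list of rule violations found in the raw file contents."""
    problems = []
    if not data.startswith(codecs.BOM_UTF8):
        problems.append("missing UTF-8 BOM")
    # 'utf-8-sig' strips the BOM if present; raises UnicodeDecodeError
    # when the data is not valid UTF-8.
    text = data.decode("utf-8-sig")
    # After removing CR+LF pairs, no bare CR or LF should remain.
    remainder = text.replace("\r\n", "")
    if "\r" in remainder or "\n" in remainder:
        problems.append("newline is not CR+LF")
    for number, line in enumerate(re.split(r"\r\n|\r|\n", text), start=1):
        if line.endswith((" ", "\t")):
            problems.append(f"line {number}: whitespace at EOL")
        if line.startswith(" "):
            problems.append(f"line {number}: indented with U+0020")
    return problems


sample = codecs.BOM_UTF8 + "Title\r\n\tindented line\r\n".encode("utf-8")
print(check_text_file(sample))
```

A file dedicated to a tool that cannot handle the BOM would legitimately fail the first check, so the result is advisory only.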

Markdown

Names of Markdown files should have the .md extension.

Dialects

Unless explicitly specified elsewhere, only common dialects are to be used. Currently this should be GFM. If the content may be presented on a Bitbucket wiki, stricter rules apply; notably, there is currently no inline HTML support.

Syntactic restrictions

As text files, markdown files shall obey the same rules above. The indentation rule is necessary to avoid some compatibility issues, e.g. this.

As specified, reserved characters defined by RFC 3986 should be percent-encoded. Notably, the parentheses (( and )) in hyperlinks shall be encoded to make the links more fault-tolerant for some editors.
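Since Markdown inline links are themselves delimited by parentheses, encoding them in the target avoids ambiguity. A minimal sketch, assuming the target is a single relative path segment and using `urllib.parse.quote` from the Python standard library; the helper name `markdown_link` and the file name are made up for illustration:

```python
from urllib.parse import quote


def markdown_link(text: str, segment: str) -> str:
    """Build an inline Markdown link to a single path segment."""
    # safe="" makes quote() encode every reserved character,
    # including "(", ")" and "/"; unreserved characters are kept.
    return f"[{text}]({quote(segment, safe='')})"


print(markdown_link("Guide", "Guide (draft).md"))
```

A target with several path components would need each segment encoded separately so that the separating slashes are preserved.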

Headers should be prefixed by #s.

For the sake of compact annotation representation, no redundant characters (esp. whitespace characters) should appear between the annotated words and the annotation, even if there is whitespace in the words. The annotation in this rule includes any language tag annotation defined in the previous subclause.

The whitespace rules in the language annotation is also

Cross references

This document is used by the YSLib project. It may also be referenced by other repositories.

Except for the following list, do not edit unless absolutely necessary.

Known to be referenced by: