Minimal Requirements on EPUB for Japanese Text Layout

Japan Electronic Publishing Association

Date: 2010-04-01

Status: The first working draft

Editor: MURATA Makoto (FAMILY Given)

Authors: Hiroshi Takase and Masayuki Inoguchi

1. Introduction

This document summarizes minimal requirements on future versions of EPUB for Japanese text layout. Such requirements include vertical writing and ruby among others. The focus is on electronic books. Publications such as magazines and pamphlets are kept outside the scope.

Some requirements listed as minimal in this draft may turn out to be non-minimal and may be removed from the next draft. These are R5 (stylesheets for both vertical and horizontal writing), R9 (tate-chu-yoko), and R14 (emphasis dots). Conversely, jukugo-ruby and group-ruby may be added to R13 (mono-ruby). Feedback on these requirements is most welcome.

This document borrows heavily from Requirements for Japanese Text Layout.

This document uses 'RX' (where X is a number) as a marker for indicating specific requirements.

2. Vertical writing and horizontal writing

As described in 2.3.1 of Requirements for Japanese Text Layout, Japanese composition has two text directions: vertical (vertical writing mode) and horizontal (horizontal writing mode). In vertical writing mode, reading systems should arrange characters from top to bottom, and arrange lines from right to left.

2.1 Principal text direction

The principal text direction (b in the itemized list in 2.2.3 Elements of Page Formats, Requirements for Japanese Text Layout) of a Japanese book is either vertical or horizontal. When it is vertical (resp. horizontal), main text in the book is in vertical (resp. horizontal) writing mode.

R1: It should be possible to specify "vertical" as the principal text direction of a book. If the principal text direction is vertical, main text in this book should be in vertical writing mode.

When the principal text direction is vertical, page progression is from right to left. See [Fig 21] Progression of pages for a vertically set book. Moreover, columns are arranged from top to bottom. See [Fig 20] Direction of arrangement of characters in vertical writing mode. This is different from Western books in horizontal writing mode, where page progression is from left to right and columns are arranged from left to right.

R2: The principal text direction "vertical" should imply that columns are arranged from top to bottom and page progression is from right to left.
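As a minimal sketch of R1 and R2, the implications of the principal text direction can be written as a simple lookup. The names BookLayout and layout_for, and the direction strings, are hypothetical and are not taken from any EPUB specification. The horizontal values follow the description of Western books above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookLayout:
    """Layout defaults implied by a principal text direction (hypothetical type)."""
    writing_mode: str       # writing mode of the main text
    character_flow: str     # direction in which characters are arranged
    line_flow: str          # direction in which lines are arranged
    column_flow: str        # direction in which columns are arranged
    page_progression: str   # direction in which pages progress


def layout_for(principal_text_direction: str) -> BookLayout:
    """Return the layout defaults implied by R1 and R2."""
    if principal_text_direction == "vertical":
        return BookLayout(
            writing_mode="vertical",
            character_flow="top-to-bottom",
            line_flow="right-to-left",
            column_flow="top-to-bottom",
            page_progression="right-to-left",
        )
    if principal_text_direction == "horizontal":
        return BookLayout(
            writing_mode="horizontal",
            character_flow="left-to-right",
            line_flow="top-to-bottom",
            column_flow="left-to-right",
            page_progression="left-to-right",
        )
    raise ValueError(f"unknown principal text direction: {principal_text_direction!r}")
```

A reading system following this sketch would call layout_for("vertical") once per book and use the result for pagination and column placement.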

2.2 Horizontal writing when the principal text direction is vertical

Some text in the book does not follow the principal text direction. Specifically, even when the principal text direction is vertical, running heads, page numbers, captions, and table entries are typically in horizontal writing mode. See [Fig.19] Example of horizontal writing mode in parts of vertically set books.

R3: Running heads, page numbers, captions, and table entries should be in horizontal writing mode even when the principal text direction is vertical.

Note: Paragraphs in boxed articles within a book may be in horizontal writing mode even when the principal text direction is vertical. However, support for such boxed articles is not required.

Note: Non-minimal requirements include vertical writing for page headers/footers, page numbers, figure captions, table captions, and table entries.
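The rule in R3 can be sketched as follows. The part names and the function writing_mode_for are hypothetical labels chosen for illustration. They are not markup defined by EPUB.

```python
# Parts of a book that R3 keeps in horizontal writing mode
# regardless of the principal text direction (illustrative names).
HORIZONTAL_ONLY_PARTS = frozenset(
    {"running-head", "page-number", "caption", "table-entry"}
)


def writing_mode_for(part: str, principal_text_direction: str) -> str:
    """Return the writing mode for a part of the book under R3."""
    if principal_text_direction not in ("vertical", "horizontal"):
        raise ValueError(
            f"unknown principal text direction: {principal_text_direction!r}"
        )
    if part in HORIZONTAL_ONLY_PARTS:
        return "horizontal"
    # Everything else, including main text, follows the principal direction.
    return principal_text_direction
```

Under this sketch, main text in a vertical book stays vertical, while a caption in the same book is set horizontally.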

2.3 Vertical writing mode for the principal text direction "horizontal"

When the principal text direction is horizontal, all text, including page headers/footers, page numbers, figure captions, table captions, and table entries, is in horizontal writing mode.

Note: Non-minimal requirements include page headers in vertical writing mode.

2.4 Overriding the principal text direction

Even if the principal text direction "horizontal" is specified, users might want to view the document using the principal text direction "vertical", and vice versa. Many e-book viewers in Japan already allow users to choose the principal text direction.

Moreover, existing EPUB viewers and early implementations of EPUB 2.X viewers are unlikely to support vertical writing. If books whose principal text direction is "vertical" cannot be rendered by such viewers, publishers will hesitate to create books using vertical writing.

R4: It should be possible to render a document of the principal text direction "vertical" using the principal text direction "horizontal", and vice versa.

2.5 Providing stylesheets for both principal text directions

Even when the principal text direction is overridden, users will expect the viewer to provide a reasonable layout using the stylesheets provided by the publisher.

R5: It should be possible to provide stylesheets for both principal text directions. This requirement does not necessarily mean that the same stylesheet can be used for both directions.

Unfortunately, the design of CSS makes it difficult to meet this requirement. This is because "top", "bottom", "right", and "left" in CSS2 are not relative to the principal text direction (see the section Interaction with Other CSS Attributes in Using Vertical Layout in Internet Explorer 5.5). As a result, a stylesheet written for vertical writing produces poor results when applied to horizontal writing.

Note: The HTML page at http://www.asahi-net.or.jp/%7Eeb2m-mrt/epub/tategakiTest.html renders well in browsers that support vertical writing, but poorly in those that do not.
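To illustrate the difficulty, the sketch below resolves direction-relative sides to the physical sides used by CSS2. The relative side names ("before", "after", "start", "end") and the function names are assumptions made for this example. They are not CSS2 properties, and this is not a proposal for how EPUB should solve the problem.

```python
from typing import Dict

# Physical side for each direction-relative side (illustrative names).
# "before"/"after" follow the order of lines; "start"/"end" follow the
# order of characters within a line.
PHYSICAL_SIDE: Dict[str, Dict[str, str]] = {
    "horizontal": {
        "before": "top", "after": "bottom", "start": "left", "end": "right",
    },
    "vertical": {
        "before": "right", "after": "left", "start": "top", "end": "bottom",
    },
}


def resolve_side(relative_side: str, principal_text_direction: str) -> str:
    """Map a direction-relative side to a physical side."""
    try:
        return PHYSICAL_SIDE[principal_text_direction][relative_side]
    except KeyError:
        raise ValueError(
            f"cannot resolve {relative_side!r} for {principal_text_direction!r}"
        ) from None


def resolve_margins(relative_margins: Dict[str, str],
                    principal_text_direction: str) -> Dict[str, str]:
    """Turn direction-relative margins into physical CSS2 margin properties."""
    return {
        f"margin-{resolve_side(side, principal_text_direction)}": value
        for side, value in relative_margins.items()
    }
```

For example, a margin of 2em before a paragraph resolves to margin-top in horizontal writing but to margin-right in vertical writing, which is why a stylesheet that hard-codes one physical side cannot serve both directions.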

2.6 Japanese and Western Mixed Text Composition

As described in 3.2.3 of Requirements for Japanese Text Layout, there are three different styles for setting Latin letters or European numerals in Japanese vertical writing mode. The three styles are shown below:

  1. one by one in inline direction,
  2. 90 degrees clockwise rotation, and
  3. tate-chu-yoko

R6: There should be a mechanism for selecting one of the three styles.

Note: Font specification or code points may be used as a mechanism for implicitly selecting one of the three styles. Alternatively, explicit markup may be used.

2.6.1 One by one in inline direction

This style is depicted by [Fig.94] Example of Latin letters in normal orientation.

R7: In the one-by-one-in-inline-direction style, each Latin letter or European numeral should be set one by one (i.e., no rotation) in inline direction with Japanese characters.

2.6.2 90-degrees clockwise rotation

This style is depicted by [Fig.95] Example of Latin letters rotated 90 degrees clockwise.

R8: In the 90-degrees-clockwise-rotation style, a sequence of Latin letters or European numerals should first be arranged from left to right, and the whole string should then be rotated 90 degrees clockwise.

2.6.3 Tate-chu-yoko

This style is depicted by [Fig.96] Example of European numerals in tate-chu-yoko (horizontal-in-vertical setting) and [Fig.101] Example of tate-chu-yoko (horizontal-in-vertical text setting).

R9: In the tate-chu-yoko style, a sequence of Latin letters or European numerals should be arranged from left to right, and the whole string should then be aligned to the center of the vertical line.
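One possible selection mechanism for R6 is sketched below. It combines the two options mentioned in the note to R6: explicit markup, modeled here as an optional argument, and implicit selection by code point. The code-point rule (fullwidth letters and digits are set one by one, everything else is rotated) is an assumption made for this example and is not prescribed by this document. The names MixedStyle and select_style are hypothetical.

```python
from enum import Enum
from typing import Optional


class MixedStyle(Enum):
    ONE_BY_ONE = "one-by-one-in-inline-direction"   # R7
    ROTATED = "90-degrees-clockwise-rotation"       # R8
    TATE_CHU_YOKO = "tate-chu-yoko"                 # R9


def is_fullwidth_alnum(ch: str) -> bool:
    """True for fullwidth digits and fullwidth Latin letters."""
    cp = ord(ch)
    return (
        0xFF10 <= cp <= 0xFF19      # FULLWIDTH DIGIT ZERO..NINE
        or 0xFF21 <= cp <= 0xFF3A   # FULLWIDTH LATIN CAPITAL LETTER A..Z
        or 0xFF41 <= cp <= 0xFF5A   # FULLWIDTH LATIN SMALL LETTER A..Z
    )


def select_style(run: str, explicit: Optional[MixedStyle] = None) -> MixedStyle:
    """Choose a style for a run of Latin letters or European numerals."""
    if not run:
        raise ValueError("run must not be empty")
    if explicit is not None:
        # Explicit markup always wins over the implicit rule.
        return explicit
    if all(is_fullwidth_alnum(ch) for ch in run):
        # Assumed implicit rule: fullwidth forms are set upright, one by one.
        return MixedStyle.ONE_BY_ONE
    return MixedStyle.ROTATED
```

In this sketch tate-chu-yoko is only ever selected explicitly, since the text above gives no implicit criterion for it.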

3. Line Breaking Rules

3.1.7 Characters Not Starting a Line, 3.1.8 Characters Not Ending a Line, and 3.1.10 Unbreakable Character Sequences of Requirements for Japanese Text Layout show line breaking rules for Japanese text. These rules are based on character classes in Appendix A of Requirements for Japanese Text Layout.

R10 (characters not starting a line): A line should not begin with the characters shown below:

Note: Some printed publications adopt a less strict rule, which allows iteration marks, prolonged sound marks, and small kana at the beginning of a line. Even KATAKANA MIDDLE DOT and dividing punctuation marks (cl-04) are sometimes allowed, for example in newspapers.

R11 (characters not ending a line): A line should not end with the characters shown below:

R12 (unbreakable character sequences): A line should not be broken within the following character sequences. In other words, such sequences should be handled as one unit.

Note: The "word-break" property in a Working Draft of CSS Text Level 3 allows fine control of line breaking. However, such fine control is not a minimal requirement.
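The following is a minimal sketch of how R10, R11, and R12 constrain a line breaker. It assumes a fixed number of characters per line and, when a break is forbidden, moves the break point earlier. The character sets are a small illustrative subset (a few closing brackets, commas, and full stops; a few opening brackets; a doubled ellipsis) and are not the full lists of this document or of Requirements for Japanese Text Layout. The names break_allowed and break_lines are hypothetical.

```python
from typing import List

# Illustrative subsets only; real implementations need the full character classes.
NOT_STARTING = set("、。」）")   # R10: must not begin a line
NOT_ENDING = set("「（")         # R11: must not end a line
UNBREAKABLE = {"……"}            # R12: must not be split


def break_allowed(text: str, i: int) -> bool:
    """Return True if a line break is allowed between text[i-1] and text[i]."""
    before, after = text[i - 1], text[i]
    if after in NOT_STARTING:
        return False
    if before in NOT_ENDING:
        return False
    if before + after in UNBREAKABLE:
        return False
    return True


def break_lines(text: str, width: int) -> List[str]:
    """Greedily split text into lines of at most width characters."""
    if width < 1:
        raise ValueError("width must be at least 1")
    lines: List[str] = []
    pos = 0
    while len(text) - pos > width:
        end = pos + width
        # Move the break point earlier until the rules allow a break there.
        while end > pos and not break_allowed(text, end):
            end -= 1
        if end == pos:
            # No permitted break on this line: fall back to a forced break.
            end = pos + width
        lines.append(text[pos:end])
        pos = end
    lines.append(text[pos:])
    return lines
```

With a width of 3, the text "あいう。" is split after "あい" so that the full stop does not begin the second line. Other adjustment strategies, such as letting punctuation hang past the line end, are outside this sketch.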

4. Ruby and Emphasis Dots

4.1 Ruby

The definition of "Ruby" in JIS Z 8125 is shown below:

Supplementary small characters indicating pronunciation, meaning, etc. for the character or the block of characters they annotate.

The character or the block of characters annotated by ruby text is called base characters. In vertical writing mode, ruby text is typically attached to the right of the base characters. Meanwhile, in horizontal writing mode, it is immediately above the base characters. Ruby and base characters are depicted by [Fig.105] Ruby and base characters. Further information about ruby is shown in 3.3.1 Usage of Ruby of Requirements for Japanese Text Layout.

Ruby annotation has several methods including mono-ruby, jukugo-ruby, and group-ruby.

Mono-ruby
Attaching hiragana or katakana characters to indicate the reading of a single base ideographic character. See [Fig.106] Example of ruby annotation per Kanji Character.
Jukugo-ruby
Attaching hiragana or katakana characters to indicate the reading of a compound word (jukugo), which is represented by a sequence of ideographic characters. See [Fig.108] Example of jukugo-ruby method. Line breaks within the compound word are intended to be allowed. Jukugo-ruby should not be confused with a sequence of base characters each of which carries its own mono-ruby. For the latter, see [Fig.107] Example of mono-ruby method. Ruby letters are attached to each base kanji character in a compound word.
Group-ruby
Attaching hiragana or katakana characters to indicate the meaning of a word, which is represented by a sequence of ideographic, hiragana, or katakana characters. See [Fig.111] Examples of ruby for compound kanji words to indicate corresponding words in katakana. Line breaks within this word are intended to be disallowed.

Although support for ruby is preferable, it is widely believed that implementations should not be required to support ruby but should be allowed to rely on a fallback, typically inserting the ruby text in parentheses after the base characters. It is thus necessary to provide fallback information as well as ruby.

R13: It should be possible to specify mono-ruby as well as fallback information.

Note: It is still not clear whether the support of jukugo-ruby and group-ruby should be listed as a minimal requirement.
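The typical fallback described above, inserting the ruby text in parentheses after the base, can be sketched as follows. The segment representation and the function name render_with_fallback are hypothetical, and the choice of fullwidth parentheses as the default is an assumption. This document does not specify how fallback information is to be encoded.

```python
from typing import List, Optional, Tuple

# A segment is a pair of (base text, ruby text); ruby is None for plain text.
Segment = Tuple[str, Optional[str]]


def render_with_fallback(segments: List[Segment],
                         open_paren: str = "（",
                         close_paren: str = "）") -> str:
    """Render ruby-annotated text for a reading system without ruby support."""
    parts: List[str] = []
    for base, ruby in segments:
        parts.append(base)
        if ruby:
            # Fallback: ruby text follows its base, wrapped in parentheses.
            parts.append(f"{open_paren}{ruby}{close_paren}")
    return "".join(parts)
```

For a mono-ruby sequence such as [("漢", "かん"), ("字", "じ"), ("を読む", None)], this sketch yields "漢（かん）字（じ）を読む".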

The character size of ruby text is intended to be, in principle, half the size of the base characters (see [Fig.34] Inserting ruby or other items between lines). The presence or absence of ruby is not expected to alter the distance between two adjacent lines. In other words, ruby text should fit in the line gaps.

Sometimes, multiple ruby texts are attached to a single ruby base. See [Fig.117] An example of ruby attached to both sides of the base characters. However, minimal requirements do not include the support of such multiple ruby texts.

4.2 Emphasis Dots

Emphasis dots (also known as bouten or side dots) are symbols placed alongside a run of kanji or kana characters to emphasize the text. Emphasis dots are depicted by [Fig.142] Composition of emphasis dots. Further information is available at 3.3.9 Composition of Emphasis Dots in Requirements for Japanese Text Layout.

Emphasis dots are attached to the right of the base characters in vertical writing mode, or above them in horizontal writing mode. The center of emphasis dots is aligned with that of the base characters.

Although several symbols are used as emphasis dots, SESAME DOT (U+FE45) in vertical writing mode and BULLET (U+2022) in horizontal writing mode are most frequently used.

R14: It should be possible to attach emphasis dots to text runs. At least two characters (SESAME DOT and BULLET) should be available as emphasis dots.

The character size of emphasis dots is intended to be, in principle, half the size of the base characters (see [Fig.34] Inserting ruby or other items between lines). The presence or absence of emphasis dots is not expected to alter the distance between two adjacent lines. In other words, emphasis dots should fit in the line gaps.
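R14 and the placement and size rules above can be sketched as follows. The function names are hypothetical, and pairing one dot with each base character is an assumption based on the statement that the center of each dot is aligned with that of its base character.

```python
from typing import List, Optional, Tuple

SESAME_DOT = "\uFE45"  # SESAME DOT, most frequent in vertical writing mode
BULLET = "\u2022"      # BULLET, most frequent in horizontal writing mode


def default_emphasis_dot(writing_mode: str) -> str:
    """Return the most frequently used emphasis dot for a writing mode."""
    if writing_mode == "vertical":
        return SESAME_DOT
    if writing_mode == "horizontal":
        return BULLET
    raise ValueError(f"unknown writing mode: {writing_mode!r}")


def emphasis_dot_side(writing_mode: str) -> str:
    """Return the side of the base characters on which dots are placed."""
    default_emphasis_dot(writing_mode)  # validates the writing mode
    return "right" if writing_mode == "vertical" else "above"


def attach_emphasis_dots(run: str, writing_mode: str,
                         dot: Optional[str] = None) -> List[Tuple[str, str]]:
    """Pair each base character in a run with its emphasis dot."""
    default = default_emphasis_dot(writing_mode)
    mark = dot if dot is not None else default
    return [(ch, mark) for ch in run]


def emphasis_dot_size(base_size: float) -> float:
    """Emphasis dots are, in principle, half the size of the base characters."""
    return base_size / 2
```

A renderer following this sketch would draw each returned dot at emphasis_dot_size(base_size), centered on its base character, on the side given by emphasis_dot_side.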

The following is a list of characters typically used as emphasis dots.

References

  1. Requirements for Japanese Text Layout, W3C Working Group Note, 4 June 2009, available at http://www.w3.org/TR/jlreq/.
  2. Using Vertical Layout in Internet Explorer 5.5, Microsoft Corporation, Mark Grinols, October 2000, available at http://msdn.microsoft.com/en-us/library/bb250415(VS.85).aspx.
  3. Editor's Draft of CSS Text Layout Module Level 3, W3C, 8 October 2008, available at http://dev.w3.org/csswg/css3-text-layout/.
  4. Working Draft of CSS Text Level 3, W3C, 6 March 2007, available at http://www.w3.org/TR/css3-text/.
  5. Candidate Recommendation of CSS3 Ruby Module, W3C, 14 May 2003, available at http://www.w3.org/TR/css3-ruby.
  6. Recommendation of Ruby Annotation, W3C, 31 May 2001, available at http://www.w3.org/TR/ruby.
  7. Working Draft of HTML5, W3C, 4 March 2010, available at http://www.w3.org/TR/html5/.
  8. Working Draft of CSS3 module: line, W3C, 15 May 2002, available at http://www.w3.org/TR/css3-linebox.
  9. Recommended Specification of Open Publication Structure (OPS) 2.0 v1.0, IDPF, 11 September 2007, available at http://www.idpf.org/2007/ops/OPS_2.0_final_spec.html.
  10. JIS Z 8125:2004, "Graphic arts — Glossary — Digital printing", Japan Standards Association