Overall HTML Generation Process Flow

Describes the overall processing flow for the HTML2 transformation.

The root transformation file is map2html2Impl.xsl. This transform defines all the runtime parameters, sets up global variables and handles the initial map processing. It contains a simple passthrough template for "/" and the default-mode template for map.

The default-mode map template does the following tasks:
  • Constructs a list of the graphics used within the publication
  • Constructs an intermediate file that maps the graphic locations as authored to their output locations relative to the generated HTML files. This allows the graphics to have different filenames and locations in the output (for example, all graphics can be put into a single output directory regardless of how they are stored for authoring).
  • Constructs an intermediate data set for the back-of-the-book index that can then be used to generate various renditions of the index.
  • Processes the map in the mode "generate-root-pages" in order to generate the root page or pages, such as the "index.html" file that contains navigation structures, a frameset if appropriate, and so on. This mode in turn generates static and dynamic tables of contents as selected by the runtime parameters. By default a dynamic ToC is generated and a static ToC is not.
  • Processes the map in the mode "generate-content", which manages the generation of HTML from topics as driven by the map structure.
  • Processes the map in the mode "generate-index", which manages the generation of a back-of-the-book index as a separate HTML page or pages.
  • Processes the map in the mode "generate-graphic-copy-ant-script", which generates an Ant script that copies the graphics from their source locations to their output locations.
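
The mode-driven passes in the list above can be sketched roughly as follows. This is a minimal illustration, not the actual content of map2html2Impl.xsl: the graphic-list and index-data steps are omitted, the placeholder template stands in for the real per-mode templates, and the match patterns assume that DITA @class attributes are present on the input.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Simple passthrough for the document node. -->
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Default-mode template for the map: one pass over the same map
       per kind of output. Intermediate data steps are left out. -->
  <xsl:template match="*[contains(@class, ' map/map ')]">
    <xsl:apply-templates select="." mode="generate-root-pages"/>
    <xsl:apply-templates select="." mode="generate-content"/>
    <xsl:apply-templates select="." mode="generate-index"/>
    <xsl:apply-templates select="." mode="generate-graphic-copy-ant-script"/>
  </xsl:template>

  <!-- Placeholder (an assumption for this sketch): the real transform has
       a separate implementation for each of these modes. -->
  <xsl:template match="*[contains(@class, ' map/map ')]"
      mode="generate-root-pages generate-content generate-index generate-graphic-copy-ant-script">
    <xsl:message>Placeholder pass over <xsl:value-of select="name()"/></xsl:message>
  </xsl:template>

</xsl:stylesheet>
```

Because each pass is a separate mode applied to the same map element, an extension can replace one pass without touching the others.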

Most of this processing is straightforward: it simply processes topicref-type elements in various contexts to produce different outputs.

The non-obvious part is the processing of topics in the context of their references within the map. To allow the base XSLT transforms to be used for topics while still providing map-context-aware processing that can be overridden easily, the code uses a somewhat convoluted "double dispatch" method.

The XSLT file map2html2Context.xsl iterates over the list of unique topic document references, as defined by the utility function df:getUniqueTopicrefs(), which takes into account chunking behavior. For each unique topic document reference, processing is applied in the mode "generate-content".

The template for topic references (as opposed to topic heads or topic groups) resolves the topicref to a topic document. It first processes the topic in the mode "href-fixup", which is implemented by the common module "topicHrefFixup.xsl" in the "org.dita4publishers.common.xslt" Toolkit plugin. That mode is an identity transform that rewrites all @href attribute values to reflect the eventual output locations of the output files. The result is an in-memory temporary topic document, which is then processed in the mode "generate-content", with the topicref element and result URI passed as tunnel parameters on the apply-templates call.
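
The general shape of an identity transform that rewrites only @href can be sketched as follows. This is not the code of topicHrefFixup.xsl: the function local:rewrite-href is hypothetical and merely swaps the file extension, whereas the real module computes output locations from the map context. The "/" template is a stand-in entry point so that the sketch can be run on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    exclude-result-prefixes="xs local">

  <!-- Hypothetical stand-in for the real location logic: replaces a
       .dita or .xml extension with .html and keeps any fragment. -->
  <xsl:function name="local:rewrite-href" as="xs:string">
    <xsl:param name="href" as="xs:string"/>
    <xsl:sequence select="replace($href, '\.(dita|xml)(#.*)?$', '.html$2')"/>
  </xsl:function>

  <!-- Stand-in entry point for running the sketch directly. -->
  <xsl:template match="/">
    <xsl:apply-templates mode="href-fixup"/>
  </xsl:template>

  <!-- Identity rule: copy every node and attribute unchanged. -->
  <xsl:template match="@* | node()" mode="href-fixup">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="href-fixup"/>
    </xsl:copy>
  </xsl:template>

  <!-- The one exception: @href values are rewritten. -->
  <xsl:template match="@href" mode="href-fixup">
    <xsl:attribute name="href" select="local:rewrite-href(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

The @href rule wins over the identity rule because its default priority is higher, so no explicit priority attribute is needed.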

The generate-content template for topics then applies templates in the mode "map-driven-content-processing" to produce the initial no-namespace HTML, as generated by the base HTML1 transforms, and captures it in a temporary variable. The no-namespace HTML is then processed in the mode "no-namespace-html-post-process" to produce the result file; by default this mode simply copies the no-namespace HTML to the output. You can override this mode to further process the HTML, for example to generate XHTML or HTML5 (as is done by the EPUB and HTML5 transforms, respectively).
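
An override of the post-processing mode might look something like the sketch below, which moves every element into the XHTML namespace. It assumes that the mode is applied to the nodes of the no-namespace HTML; how the EPUB and HTML5 transforms actually implement their overrides is not shown here. The "/" template is a stand-in entry point for running the sketch against a no-namespace HTML file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Stand-in entry point; in the real transform this mode is invoked
       by the generate-content template for topics. -->
  <xsl:template match="/">
    <xsl:apply-templates mode="no-namespace-html-post-process"/>
  </xsl:template>

  <!-- Recreate each element with the same local name in the XHTML
       namespace, keeping its attributes. -->
  <xsl:template match="*" mode="no-namespace-html-post-process">
    <xsl:element name="{local-name()}"
        namespace="http://www.w3.org/1999/xhtml">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="#current"/>
    </xsl:element>
  </xsl:template>

  <!-- Copy text, comments and processing instructions unchanged. -->
  <xsl:template match="text() | comment() | processing-instruction()"
      mode="no-namespace-html-post-process">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```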

The "map-driven-content-processing" template for topics checks whether the topicref parameter has a value. If it does, it applies templates in the mode "topicref-driven-content" to that topicref, passing the topic as a parameter on the apply-templates call. If the topicref parameter has no value, it applies templates to the topic in the default mode, which has the effect of applying the normal HTML1 processing to the topic to generate non-namespaced HTML.

The "topicref-driven-content" template for topicref then applies templates to the topic in the default mode, passing the topicref as a tunnel parameter. This has no effect on the (current) base HTML1 transforms but makes the topicref available to any extensions that choose to grab it.
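
The double dispatch described in the last two paragraphs can be sketched as follows. The parameter name "topic" and the stand-in default-mode topic template are assumptions made for illustration; in the real transform the default-mode processing is supplied by the base HTML1 transforms. The "/" template only exists so that the sketch can be run against a single topic, in which case no topicref is supplied and the fallback branch is taken.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Stand-in entry point: no topicref is passed. -->
  <xsl:template match="/">
    <xsl:apply-templates mode="map-driven-content-processing"/>
  </xsl:template>

  <!-- First dispatch: on the topic, choose a path depending on whether
       a topicref was supplied as a tunnel parameter. -->
  <xsl:template match="*[contains(@class, ' topic/topic ')]"
      mode="map-driven-content-processing">
    <xsl:param name="topicref" as="element()?" tunnel="yes"/>
    <xsl:choose>
      <xsl:when test="exists($topicref)">
        <xsl:apply-templates select="$topicref"
            mode="topicref-driven-content">
          <!-- "topic" is an assumed parameter name. -->
          <xsl:with-param name="topic" select="."/>
        </xsl:apply-templates>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Second dispatch: on the topicref, hand the topic back to the
       default mode with the topicref available as a tunnel parameter. -->
  <xsl:template match="*[contains(@class, ' map/topicref ')]"
      mode="topicref-driven-content">
    <xsl:param name="topic" as="element()"/>
    <xsl:apply-templates select="$topic">
      <xsl:with-param name="topicref" select="." tunnel="yes"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Stand-in for the base HTML1 topic processing. An extension could
       declare the tunnel parameter here to read the map context. -->
  <xsl:template match="*[contains(@class, ' topic/topic ')]">
    <xsl:param name="topicref" as="element()?" tunnel="yes"/>
    <html>
      <body>
        <h1>
          <xsl:value-of select="*[contains(@class, ' topic/title ')]"/>
        </h1>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

An extension that matches a specific kind of topicref in the "topicref-driven-content" mode can change how the referenced topic is rendered without overriding any topic template.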

The result URLs for files generated from topics (the HTML files) are determined by the function htmlutil:getTopicResultUrl(). If the global file organization strategy is "single-dir", the function calls the named template "get-topic-result-url-single-dir"; otherwise it applies templates to the map in the mode "get-topic-result-url", which by default applies the "as-authored" strategy. To implement your own output file organization strategy, you can either implement a template that matches map/map in the "get-topic-result-url" mode or override the "get-topic-result-url-single-dir" template.
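
A custom strategy in the "get-topic-result-url" mode might be sketched as below. The parameters that the real mode receives are not documented here, so "topicUri" and "outdir" are hypothetical names with sample default values, and the returned string format is likewise an assumption. The "/" template is a stand-in entry point for running the sketch against a map.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Stand-in entry point; the real caller is the result URL function. -->
  <xsl:template match="/">
    <xsl:apply-templates select="*" mode="get-topic-result-url"/>
  </xsl:template>

  <!-- Custom strategy: put every topic under a "topics" directory,
       named after the source file. Parameter names are hypothetical. -->
  <xsl:template match="*[contains(@class, ' map/map ')]"
      mode="get-topic-result-url">
    <xsl:param name="topicUri" as="xs:string"
        select="'chapters/intro.dita'"/>
    <xsl:param name="outdir" as="xs:string" select="'out'"/>
    <!-- Last path segment with its extension removed. -->
    <xsl:variable name="baseName" as="xs:string"
        select="replace(tokenize($topicUri, '/')[last()], '\.[^.]+$', '')"/>
    <xsl:sequence
        select="concat($outdir, '/topics/', $baseName, '.html')"/>
  </xsl:template>

</xsl:stylesheet>
```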

This rather convoluted processing chain is necessary in order to enable direct use of the HTML1 transforms with the option of map awareness. If the HTML1 transforms were reimplemented as XSLT2 transforms then at least some of this mode switching would not be necessary.

This profusion of XSLT modes also enables override and extension at several points. It enables both pre-processing of the topics before they are sent to the HTML processing (by hooking the href-fixup mode) and post-processing of the initially generated HTML (by overriding the no-namespace-html-post-process mode).