HTML2IPF recognises a limited number of HTML tags; basically, this is a subset of
HTML 2.0. However, keep in mind that I have never seen a complete HTML 2.0 specification,
so I may be wrong somewhere :-)
If you think HTML2IPF is ignoring some tags, you can enable debug logging (-D+); in this
state HTML2IPF records all unrecognised tags in a file named HTML2IPF.LOG.
So, what features of HTML documents are kept (basically) intact in the resulting INF book?
Here is a list of HTML tags supported by HTML2IPF (however, keep in mind that some of them
are simply ignored):
- <HTML> & </HTML>
- This tag is not required by HTML2IPF although it is a good practice to put it in every document.
Note that, like any normal browser ;-), HTML2IPF has some HTML extensions, notably for this tag:
- <HTML HIDDEN> - if you specify this attribute, the title of this document will
not appear in the book contents. You cannot hide the index file; the keyword is ignored in that
case. This document also contains such a section; if you're reading it with VIEW.EXE, try finding it ;-)
- <HTML SUBLINKS="..."> - this attribute specifies the leading characters of the
sublinks (A HREF=...'s) found in this HTML file that will be compiled as subheadings, i.e.
their titles will appear when you expand the [+] sign before the current HTML title in the book
contents tree. Processing of all links which do not match at least one of the given masks (you can
specify any number of SUBLINKS= attributes) is delayed until a document with a suitable
SUBLINKS= attribute (or without any such attribute) is encountered; they are then placed as its
sublinks. This can be especially useful when you have to split a large HTML file into
sections (see the tips'n'tricks section). If some links are
still unresolved after all files have been processed, they are unconditionally resolved at heading level 1.
Note that the HIDDEN keyword automatically sets SUBLINKS to '*' (an impossible filename), so
that any sublinks will be placed as subheadings after the next suitable chapter.
- <HTML NOSUBLINKS="..."> - this is exactly the inverse of the previous attribute:
it defines the leading characters of links which must NOT be included as subheadings of the
current file. NOSUBLINKS has priority over SUBLINKS. For example, the
 <HTML SUBLINKS="java.awt." NOSUBLINKS="java.awt.image." NOSUBLINKS="java.awt.peer.">
 tag tells HTML2IPF to compile the current file and then to include as subheadings those links
referenced in this file which begin with "java.awt." but not with either "java.awt.image."
or "java.awt.peer.".
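The SUBLINKS/NOSUBLINKS rule described above can be sketched roughly as follows. This is Python, not the REXX that HTML2IPF itself is written in, and it is only an illustration: the function name is_sublink, the list arguments and the case-sensitive comparison are assumptions, not taken from the script.

```python
def is_sublink(href, sublinks, nosublinks):
    """Decide whether href becomes a subheading of the current file.

    sublinks / nosublinks are lists of leading-character masks, one per
    SUBLINKS= / NOSUBLINKS= attribute (hypothetical representation).
    """
    # NOSUBLINKS has priority: any matching exclusion mask rejects the link.
    if any(href.startswith(mask) for mask in nosublinks):
        return False
    # A file without any SUBLINKS= attribute accepts every remaining link.
    if not sublinks:
        return True
    # Otherwise the link must start with at least one of the given masks.
    return any(href.startswith(mask) for mask in sublinks)


# The java.awt example from the text.
include = ["java.awt."]
exclude = ["java.awt.image.", "java.awt.peer."]
for link in ("java.awt.Button.html", "java.awt.image.ColorModel.html"):
    print(link, is_sublink(link, include, exclude))
```

Links for which this returns False would be the ones whose processing is delayed until a later file accepts them.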
 
- <META ...>
- This tag is simply ignored
- <HEAD> & </HEAD>
- <TITLE> & </TITLE>
- The text marked as the title of the first (INDEX) HTML file is taken as the overall document name.
- <BODY ...> & </BODY>
- This pair of tags marks the body of the document; any 'advanced' things like
background colors and bitmaps are ignored since IPF doesn't support them.
- <H1> & </H1>
- <H2> & </H2>
- <H3> & </H3>
- <H4> & </H4>
- <H5> & </H5>
- <H6> & </H6>
- The H1 through H6 headings are emulated using 'big' fonts; you can change the fonts
used for headings by changing the initial values of Global.Header1Font,
Global.Header2Font etc. at the start of the REXX script. In the same place you can change
the default font for the book (by default HTML2IPF uses the default system font, which in most
cases is the System Proportional font unless you have added a PM_SystemFonts ->
DefaultFont key to the OS2.INI file). You can change it to WarpSans Bold for nicer-looking
books; however, this font was not supplied with OS/2 versions prior to 4.0.
- <I> & </I>
- <B> & </B>
- <U> & </U>
- <EM> & </EM>
- <CITE> & </CITE>
- Italic, Bold, Underlined, Emphasis and
Citations are supported using IPF's :hp#. & :ehp#. tags.
Citation is equivalent to Italicized text.
- <TT> & </TT>
- <CODE> & </CODE>
- The <CODE> tag does the same as the <TT> tag; the typewriter font is
emulated with the System VIO font. You can change this by replacing the initial value of
Global.ProportFont at the start of the REXX script.
- <P> & </P>
- <BLOCKQUOTE> & </BLOCKQUOTE>
- The <BLOCKQUOTE> tag is treated like the new-paragraph tag
- <BR>
- The break-line tag is supported
- <HR>
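- The horizontal rule is emulated with a row of 80 0xC4 characters (the horizontal line-drawing character in OS/2 code pages such as 437 and 850; it shows up as 'Ä' when viewed in Latin-1).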
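How the 0xC4 byte looks depends on the code page used to display it. The small Python illustration below assumes the book is viewed under code page 437 or 850 (both common on OS/2); the variable names are made up for the example and nothing here comes from the script itself.

```python
# Byte 0xC4 repeated 80 times, as described for the <HR> emulation.
rule = bytes([0xC4]) * 80

# Under OS/2 code pages 437 and 850 the byte is a horizontal line-drawing character.
as_cp437 = rule.decode("cp437")
as_cp850 = rule.decode("cp850")

# Decoded as Latin-1 (e.g. when the file is viewed on another system) it becomes 'Ä'.
as_latin1 = rule.decode("latin-1")

print(as_cp850)
print(as_latin1)
```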
- <OL> & </OL>
- ... <LI>
- Ordered lists are fully supported by HTML2IPF
- <UL> & </UL>
- <MENU> & </MENU>
- ... <LI>
- Unordered lists are fully supported by HTML2IPF.
The <MENU> tag is treated in the same manner as <UL>.
- <DL> & </DL>
- ... <DT> & <DD>
- Definition lists are fully supported by HTML2IPF. Note that HTML
allows lists with <DD> tags only and IPF doesn't, so empty :dt.
tags are inserted in these cases.
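The empty-:dt. rule can be pictured with the following Python sketch. It assumes the list has already been parsed into ("dt", text) and ("dd", text) pairs; that representation and the function name definition_list_to_ipf are hypothetical, and the real script may well handle this differently.

```python
def definition_list_to_ipf(items):
    """Turn ("dt"|"dd", text) pairs into IPF definition-list lines."""
    out = [":dl."]
    have_term = False  # True when a :dt. is waiting for its :dd.
    for kind, text in items:
        if kind == "dt":
            out.append(":dt." + text)
            have_term = True
        elif kind == "dd":
            if not have_term:
                # IPF expects a term before every description,
                # so an empty one is inserted.
                out.append(":dt.")
            out.append(":dd." + text)
            have_term = False
        else:
            raise ValueError("unexpected item kind: " + kind)
    out.append(":edl.")
    return out


# A list that starts with a <DD> that has no <DT> of its own.
for line in definition_list_to_ipf([("dd", "only a description"),
                                    ("dt", "Term"),
                                    ("dd", "Its description")]):
    print(line)
```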
- <PRE> & </PRE>
- Preformatted text is supported via the :cgraphic. & :ecgraphic. tags of IPF
- <A> & </A>
- The <A HREF=> tag is supported only for local files; if the tag references a remote file
(i.e. the reference starts with something like ###://), HTML2IPF adds it to a chapter
called 'Internet links' at the end of the book. Every link in this chapter launches
Web Explorer (if you want to use a different browser, change the Global.WWWbrowser variable
at the beginning of the script).
 The <A NAME=> tag is ignored since IPF cannot reference locations inside the
same section.
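The '###://' test for remote references could look something like the Python sketch below. The exact pattern HTML2IPF uses is not documented here, so the regular expression (a URL scheme followed by "://") and the name is_remote are assumptions made for illustration.

```python
import re

# Assumption: "remote" means the reference begins with a scheme name
# followed by "://", e.g. http://, ftp:// or gopher://.
_REMOTE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def is_remote(href):
    """Return True if href looks like a remote (Internet) reference."""
    return _REMOTE.match(href) is not None


for href in ("http://hobbes.nmsu.edu/", "chapter2.html"):
    kind = "Internet links chapter" if is_remote(href) else "local link"
    print(href, "->", kind)
```

Note that with this pattern a reference without "://" (a plain file name, for instance) is always treated as local.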
- <IMG>
- The IMG tag is partially supported; it works only for embedded pictures and simple links
(i.e. when the IMG tag is surrounded by an <A HREF=...> ... </A> pair). Image maps
are not supported by HTML2IPF, although IPFC supports them in its own fashion.
 Another limitation is that IPFC accepts only OS/2 BMP files as images; however, most
images on the net are kept in GIF or JPEG format. Because of this, HTML2IPF uses an
external image converter (I used the demo version of Image Alchemy for OS/2; it should be
available somewhere in the hobbes.nmsu.edu archive).
If you know of another (preferably free) image conversion tool that can be run automatically
from the command line, please mail me.
 If you don't have Image Alchemy, see the tips'n'tricks
section for a work-around.
- <STRONG> & </STRONG>
- <ADDRESS> & </ADDRESS>
- This is just ignored
- <CENTER> & </CENTER>
- This tag is emulated using the IPF tag :lines align=center., i.e. the text is treated
as pre-formatted. If you don't like this behaviour, you can disable centering altogether
using the -CENTER- command-line switch.
- <TABLE> & </TABLE>
- ... <TR> & </TR>
- ... <TH> & </TH>
- ... <TD> & </TD>
- IPF tables are much more limited, so HTML2IPF strips most 'extra' things such
as images (IPF does not support them), ordered, unordered and definition lists (same), line breaks
and some others. Table headers (<TH>) are imitated with an underlined font (at first I
tried a bold font, but IPFC displays tables that mix fonts of different widths
distorted). The only neat thing supported in tables is links. Centered tables are not
supported; if HTML2IPF encounters a table in a <CENTER> context, it disables centering
during table processing. IPFC may display an 'Out of memory' message when processing
really big tables - sorry, it's not my fault.
Here is an example of a not-too-complex table which converts reasonably well into IPF
language:
| Heading 1-1 | Heading 2-1 | Heading 3-1 | Heading 4-1 | Heading 5-1 | 
| Heading 1-2 | Cell 2-2 | Cell 3-2 | Cell 4-2 | Cell 5-2 | 
| Heading 1-3 | Example link to title page | picture example (as I said, IPF does not support pictures inside tables) |
| Heading 1-4 | Bold text in an IPF table will cause distorted tables | so avoid such things if you're planning to convert HTML files into INF. | Theoretically, this can be fixed automatically (i.e. HTML2IPF will ignore such tags in tables), but for now I don't see why I should do it | If you want such a feature, please mail me. |
Another example of an HTML page which converts reasonably well with HTML2IPF can be found,
say, on the unofficial Netscape Navigator for OS/2 Warp homepage (I found it while looking for test cases :-). Try converting it; you have never
seen such a pretty INF file :-) Actually, it contains some tags unrecognised by HTML2IPF
(you can see this in the DEBUG+ state), but they're not too important for the overall look.