Home of html2text

html2text is a command line utility, written in C++, that converts HTML documents into plain text.

Each HTML document is loaded from a location given by a URI or read from standard input, and is formatted into a stream of plain-text characters that is written to standard output or to an output file. The input URI may specify a remote site, from which the documents are loaded via the Hypertext Transfer Protocol (HTTP).

The program can preserve the original positions of table fields, lets you set the screen width (as a number of output characters), and also accepts syntactically incorrect input (attempting to interpret it "reasonably"). Boldface and underlined text is rendered by default with backspace sequences (which is particularly useful when piping the program's output into "less" or another pager). Most rendering properties can be customised through an RC file.
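The backspace sequences mentioned above can be illustrated with a minimal Python sketch. It assumes the conventional overstrike encoding that pagers such as "less" interpret: a bold character is written as the character, a backspace, and the same character again; an underlined character is written as an underscore, a backspace, and the character. The function names are hypothetical and are not taken from html2text's source code, and html2text's exact output may differ in detail.

```python
import re

BACKSPACE = "\x08"  # ASCII BS, the character written as "\b" in string literals


def bold(text: str) -> str:
    """Overstrike every character with itself: "a" becomes "a", BS, "a"."""
    return "".join(ch + BACKSPACE + ch for ch in text)


def underline(text: str) -> str:
    """Overstrike an underscore with every character: "a" becomes "_", BS, "a"."""
    return "".join("_" + BACKSPACE + ch for ch in text)


def strip_overstrike(text: str) -> str:
    """Drop each character that is immediately followed by a backspace,
    together with that backspace, leaving only the visible text."""
    return re.sub(r".\x08", "", text)


if __name__ == "__main__":
    line = bold("Hi") + " " + underline("there")
    # Written to a pager that understands overstrike, "Hi" would appear bold
    # and "there" underlined; stripped, the line is ordinary plain text.
    print(strip_overstrike(line))
```

A helper like strip_overstrike is useful when such output ends up in a file or editor rather than a pager, where the raw backspace characters would otherwise show up as control characters.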

html2text is developed and tested under Linux, but should work on most other UNIX platforms as well, including AIX, FreeBSD, IRIX, NetBSD, and SINIX. It is also reported to run on Cygwin and Fink.

html2text was written, up to version 1.2.2, by Arno Unkrig for GMRS. GMRS no longer supports the program or provides its source code, but has agreed to change its licence terms, so the program is now published here under the terms of the GNU General Public License.

Version 1.3.2 is distributed in two "flavours": 1.3.2a includes changes for modern compilers, such as recent versions of gcc; 1.3.2 is for users of older compilers.

For downloads, please refer to html2text's download page, where you will find links to both the source code and contributed binary packages. Alternatively, go directly to Ibiblio to fetch the source code of the most recent version.

You may wish to read html2text's documentation as provided in its "readme" file, or see the program's change log. If you encounter problems not described on the program's known bugs page, please refer first to the collection of questions and answers about html2text, and make sure you have read the html2text(1) and html2textrc(5) manual pages. There is also a page with samples for an html2textrc configuration file.

This is free software. Please help to keep it free from software patents by supporting the FFII.