HTML Cleaner is a powerful open source HTML parser written in Java. HTML found in web pages is often dirty, ill-formed, and unsuitable for further processing. Before such documents can be put to serious use, the tags, attributes, and ordinary text have to be cleaned up and put in order. Given an HTML document, the program reorders its individual elements and produces a well-formed XML document. By default, it follows rules similar to those most modern web browsers use when building a document object model.
HTML Cleaner can be used from Java code, as a command-line tool, or as an Ant task. It was designed to be small, independent of other packages (except the JRE), and easy to integrate. The developers' main goal was to prepare ordinary HTML for further processing with XQuery, XPath, and XSLT.
- HTML documents are processed and converted automatically and quickly;
- The type of the output file can be specified;
- A wide range of parameters can be set;
- Multiple instances of the program can be run simultaneously;
- It can be called directly from Java code;
- The only dependency is the JRE (version 1.5 or later).
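
To illustrate why well-formed output matters for the XPath processing mentioned above, here is a minimal sketch that queries an already-cleaned document using only the XML classes bundled with the JDK. It does not call HTML Cleaner itself: the `cleaned` string is a hand-written stand-in for the kind of well-formed XML the program is described as producing, and the class name `CleanedXmlQuery` and the method `extractLinks` are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CleanedXmlQuery {

    // Hypothetical helper: returns the href value of every <a> element
    // found in a well-formed XML document passed in as a string.
    public static List<String> extractLinks(String wellFormedXml) throws Exception {
        // The JDK XML parser rejects ill-formed input, which is why raw
        // HTML from the web usually has to be cleaned before this step.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wellFormedXml)));

        // Select all href attributes of <a> elements anywhere in the tree.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hrefs = (NodeList) xpath.evaluate("//a/@href", doc, XPathConstants.NODESET);

        List<String> links = new ArrayList<>();
        for (int i = 0; i < hrefs.getLength(); i++) {
            links.add(hrefs.item(i).getNodeValue());
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for cleaned output: every tag is closed and every
        // attribute value is quoted (an assumption made for this sketch).
        String cleaned = "<html><head><title>Demo</title></head><body>"
                + "<p>First <a href=\"/one\">link</a></p>"
                + "<p>Second <a href=\"/two\">link</a></p>"
                + "</body></html>";
        System.out.println(extractLinks(cleaned));
    }
}
```

Feeding the same method typical web markup, with unclosed tags or unquoted attribute values, would fail at the parse step, which is the gap a cleaning pass is meant to close.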