public class Html
extends java.lang.Object
Modifier and Type | Method and Description |
---|---|
org.dom4j.Document | load(org.apache.commons.httpclient.HttpClient client, java.lang.String url) Loads an HTML page from a URL and parses it as a DOM document. |
org.dom4j.Document | load(java.lang.String url) Loads an HTML page from a URL and parses it as a DOM document. The NekoHTML parser is responsible for parsing the HTML document into a DOM tree. |
org.dom4j.Document | parse(java.lang.String html) Parses HTML text and returns it as a DOM document object. |
org.dom4j.Document | parseFragment(java.lang.String html) Parses HTML text and returns it as a DOM document object. |
public org.dom4j.Document load(java.lang.String url) throws java.io.UnsupportedEncodingException, de.innovationgate.webgate.api.WGException, java.io.IOException, org.apache.commons.httpclient.HttpException, org.xml.sax.SAXException
Parameters:
url - The URL to load the HTML document from
Throws:
java.io.UnsupportedEncodingException
de.innovationgate.webgate.api.WGAPIException
java.io.IOException
org.apache.commons.httpclient.HttpException
org.xml.sax.SAXException
de.innovationgate.webgate.api.WGException
public org.dom4j.Document load(org.apache.commons.httpclient.HttpClient client, java.lang.String url) throws java.io.UnsupportedEncodingException, de.innovationgate.webgate.api.WGException, java.io.IOException, org.apache.commons.httpclient.HttpException, org.xml.sax.SAXException
Parameters:
client - HTTP client object used to load the resource at the URL. It can be used to specify various settings for this operation. If no client object is passed, a default one is used.
url - The URL to load the HTML document from
Throws:
java.io.UnsupportedEncodingException
de.innovationgate.webgate.api.WGAPIException
java.io.IOException
org.apache.commons.httpclient.HttpException
org.xml.sax.SAXException
de.innovationgate.webgate.api.WGException
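The "default client if none is passed" behaviour described for the client parameter can be pictured with a small JDK-only sketch. This is an illustration under stated assumptions, not the library's implementation: it uses java.net.http.HttpClient (Java 11+) as a stand-in for org.apache.commons.httpclient.HttpClient, it assumes that "no client object" means a null argument, and the class name ClientFallbackSketch and method resolveClient are hypothetical.

```java
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical sketch: ClientFallbackSketch and resolveClient are not part of the Html class.
class ClientFallbackSketch {

    // Returns the caller's client when one is passed, otherwise a client with default settings.
    // Assumption: "no client object is passed" is modelled here as a null argument.
    public static HttpClient resolveClient(HttpClient client) {
        return client != null ? client : HttpClient.newHttpClient();
    }

    public static void main(String[] args) {
        // A caller-configured client carrying a setting for the operation (a connect timeout).
        HttpClient custom = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();

        // The custom client is used as-is, so its timeout setting is preserved.
        System.out.println(resolveClient(custom).connectTimeout());

        // Without a client, the fallback has no explicit connect timeout configured.
        System.out.println(resolveClient(null).connectTimeout());
    }
}
```

Passing a pre-configured client is what lets the caller control settings such as timeouts for the request. The one-argument load(java.lang.String url) variant leaves those choices to the default.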
public org.dom4j.Document parse(java.lang.String html) throws de.innovationgate.webgate.api.WGException, org.xml.sax.SAXException, java.io.IOException
Parameters:
html - The HTML text to parse
Throws:
org.xml.sax.SAXException
java.io.IOException
de.innovationgate.webgate.api.WGException
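To make "parses HTML text and returns it as a DOM document" concrete, here is a minimal JDK-only sketch of turning markup text into a DOM tree and reading a node from it. It is a stand-in under stated assumptions, not the Html class itself: it uses the JDK's XML parser and org.w3c.dom.Document instead of org.dom4j.Document, so it only accepts well-formed markup, whereas an HTML parser such as NekoHTML (named in the load description) is built for real-world HTML. The class and method names (DomParseSketch, parseMarkup, textOfFirst, firstTextIn) are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch: none of these names belong to the documented Html class.
class DomParseSketch {

    // Parses well-formed markup text into a W3C DOM tree with the JDK's XML parser.
    // Like the documented parse method, it can fail with SAXException or IOException.
    public static Document parseMarkup(String markup)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(markup)));
    }

    // Returns the text content of the first element with the given tag name, or null if none exists.
    public static String textOfFirst(Document doc, String tagName) {
        NodeList nodes = doc.getElementsByTagName(tagName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    // Convenience wrapper: parses the markup and converts checked parse failures to unchecked ones.
    public static String firstTextIn(String markup, String tagName) {
        try {
            return textOfFirst(parseMarkup(markup), tagName);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Markup could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String markup = "<html><body><h1>Title</h1><p>Hello</p></body></html>";
        System.out.println(firstTextIn(markup, "p")); // prints "Hello"
    }
}
```

With the documented API, the returned object would be an org.dom4j.Document, which is navigated through dom4j's own interfaces rather than the org.w3c.dom calls shown here.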
public org.dom4j.Document parseFragment(java.lang.String html) throws de.innovationgate.webgate.api.WGException, org.xml.sax.SAXException, java.io.IOException
Parameters:
html - The HTML text to parse
Throws:
org.xml.sax.SAXException
java.io.IOException
de.innovationgate.webgate.api.WGException