Extract attributes, text, and HTML from elements

Problem

After parsing a document and finding some elements, you'll want to get at the data inside those elements.

Solution

For example:

  import org.jsoup.Jsoup;
  import org.jsoup.nodes.Document;
  import org.jsoup.nodes.Element;

  String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
  Document doc = Jsoup.parse(html);
  Element link = doc.select("a").first(); // first <a> element in the document

  String text = doc.body().text(); // "An example link."
  String linkHref = link.attr("href"); // "http://example.com/"
  String linkText = link.text(); // "example"
  String linkOuterH = link.outerHtml();
      // "<a href="http://example.com/"><b>example</b></a>"
  String linkInnerH = link.html(); // "<b>example</b>"

Description

The methods above are the core of the element data access methods. Other accessors include Element.id(), Element.tagName(), Element.className(), and Element.hasClass(String className).
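
As a rough illustration of accessors beyond attr(), text(), and html(), the sketch below reads an element's tag name, id, and class information. The class name OtherAccessors, the helper describe(), and the sample markup are made up for this example; only the jsoup calls themselves come from the library.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class OtherAccessors {
    // Hypothetical helper: summarizes the first <div> found in the given HTML.
    static String describe(String html) {
        Document doc = Jsoup.parse(html);
        Element div = doc.select("div").first();
        return div.tagName()                      // element name, e.g. "div"
                + "#" + div.id()                  // value of the id attribute
                + " [" + div.className() + "]"    // raw class attribute value
                + " note=" + div.hasClass("note"); // true if "note" is among the classes
    }

    public static void main(String[] args) {
        // Sample markup invented for this illustration
        String html = "<div id='intro' class='note wide'>Hello</div>";
        System.out.println(describe(html)); // div#intro [note wide] note=true
    }
}
```

Note that className() returns the whole class attribute as one string, while hasClass() tests for a single class name.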

All of these accessor methods have corresponding setter methods to change the data.
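
A minimal sketch of the setter side, assuming the same sample link as in the Solution: attr(String, String) changes an attribute and text(String) replaces the element's content with plain text. The class name SetterSketch and the helper rewriteLink() are hypothetical names used only for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SetterSketch {
    // Hypothetical helper: rewrites the first link in the HTML and returns its outer HTML.
    static String rewriteLink(String html) {
        Document doc = Jsoup.parse(html);
        Element link = doc.select("a").first();
        link.attr("href", "http://example.org/"); // set (overwrite) the href attribute
        link.text("sample");                      // replace all child nodes with this text
        return link.outerHtml();
    }

    public static void main(String[] args) {
        String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
        System.out.println(rewriteLink(html)); // <a href="http://example.org/">sample</a>
    }
}
```

The setters return the element they were called on, so the two calls could also be chained as link.attr(...).text(...). To set markup rather than plain text, Element.html(String) plays the same role for inner HTML.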

See also