jsoup is a Java library for working with real-world HTML. It provides a convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.
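As a rough illustration of that API, the sketch below parses a small HTML string, extracts data with a CSS selector, and modifies the matched elements in a jQuery-like way. The class name `JsoupSketch`, the sample markup, and the `http://example.com/` base URI are assumptions made up for this example; only `Jsoup.parse`, `Document.title`, `select`, `text`, and `attr` come from the library itself.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupSketch {
    // Hypothetical sample markup, used only for illustration.
    static final String HTML =
        "<html><head><title>Sample page</title></head>"
        + "<body><p>Read the <a href=\"/docs\">docs</a> and the "
        + "<a href=\"http://example.com/news\">news</a>.</p></body></html>";

    // Parse the string into a DOM; the base URI (an assumption here)
    // lets jsoup resolve relative links to absolute ones.
    static Document parse() {
        return Jsoup.parse(HTML, "http://example.com/");
    }

    public static void main(String[] args) {
        Document doc = parse();

        // DOM-style access: read the page title.
        System.out.println(doc.title());

        // CSS-style extraction: every <a> element that has an href attribute.
        Elements links = doc.select("a[href]");
        for (Element link : links) {
            // The "abs:" prefix asks for the URL resolved against the base URI.
            System.out.println(link.text() + " -> " + link.attr("abs:href"));
        }

        // jQuery-like manipulation: set an attribute on all matches at once.
        links.attr("rel", "nofollow");
        System.out.println(doc.body().html());
    }
}
```

To compile and run this, the jsoup jar has to be on the classpath, for example through one of the dependency declarations described in the following sections.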
Maven
If you use Maven to manage the dependencies in your Java project, you do not need to download the jar; just place the following into your POM’s <dependencies> section:
<!-- jsoup HTML parser library @ http://jsoup.org/ -->
<dependency>
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.10.1</version>
</dependency>
Gradle
// jsoup HTML parser library @ http://jsoup.org/
compile 'org.jsoup:jsoup:1.10.1'
Dependencies
jsoup is entirely self-contained and has no dependencies. jsoup runs on Java 1.5 and up, Scala, Android, OSGi, Lambda, and Google App Engine.