HTML parsing using Jsoup and combining all the data into one output without repeats.
Is that possible?
Below is my program, which works.
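import java.util.StringTokenizer;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// The class name is arbitrary; it is only here so the snippet compiles.
public class JsoupBoldExample {
    public static void main(String[] args) {
        String html = "<p>I am making the letter <span "
                + "style=\"font-weight:bold\">BOLD</span></p>";
        Document document = Jsoup.parse(html);
        Elements textNodes = document.select("p");
        for (Element element : textNodes) {
            // text() returns the combined text of the <p> and all of its children
            System.out.println("Data in P : " + element.text());
            for (Element span : element.select("span")) {
                System.out.println("Data In Span : " + span.text());
                // The style attribute looks like "font-weight:bold"
                String att = span.attr("style");
                // Tokens alternate between property name and value; print only the values
                boolean isName = true;
                StringTokenizer st2 = new StringTokenizer(att, ":");
                while (st2.hasMoreTokens()) {
                    String att2 = st2.nextToken();
                    if (!isName) {
                        System.out.println("Attribute : " + att2);
                    }
                    isName = !isName;
                }
            }
        }
    }
}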
The output is:
Data in P : I am making the letter BOLD
Data In Span : BOLD
Attribute : bold
The actual HTML page looks like this:
I am making the letter BOLD
I need my program to reproduce the same output as the HTML page.
I need the output to be "I am making the letter BOLD", printed once, without the span text repeated.
The word "BOLD" in that sentence should come from "Data In Span",
so that I can check whether the style attribute of that span is bold and, if it is,
print the word in bold.
So my output will be:
I am making the letter BOLD
Can you help me with this?
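One possible direction, offered as a rough sketch rather than a definitive answer: instead of calling text() on the <p> (which already includes the span text, hence the repeat), walk the direct child nodes of the <p> and append each piece exactly once. The class name BoldMerger, the helper names isBold and render, and the choice of <b>...</b> or ANSI escape codes as the "bold" markers are all assumptions made for illustration. The sketch also assumes the span is a direct child of the <p> and that bold is always written as font-weight:bold in the style attribute.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

// Hypothetical helper class; the names are illustrative, not part of Jsoup.
public class BoldMerger {
    // ANSI escape codes; assumption: the console understands them.
    static final String ANSI_BOLD_ON = "\u001B[1m";
    static final String ANSI_BOLD_OFF = "\u001B[0m";

    // True if the element's style attribute declares font-weight:bold.
    // Assumption: numeric weights such as 700 are not handled here.
    static boolean isBold(Element el) {
        String style = el.attr("style").replace(" ", "").toLowerCase();
        return style.contains("font-weight:bold");
    }

    // Builds the paragraph text once, wrapping bold spans in the given markers.
    static String render(Element paragraph, String boldOn, String boldOff) {
        StringBuilder out = new StringBuilder();
        for (Node child : paragraph.childNodes()) {
            if (child instanceof TextNode) {
                // Plain text that sits directly inside the <p>
                out.append(((TextNode) child).text());
            } else if (child instanceof Element) {
                Element el = (Element) child;
                if (el.tagName().equals("span") && isBold(el)) {
                    out.append(boldOn).append(el.text()).append(boldOff);
                } else {
                    out.append(el.text());
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<p>I am making the letter <span "
                + "style=\"font-weight:bold\">BOLD</span></p>";
        Document document = Jsoup.parse(html);
        for (Element p : document.select("p")) {
            // Prints: I am making the letter <b>BOLD</b>
            System.out.println(render(p, "<b>", "</b>"));
            // Same sentence, with BOLD shown in bold on ANSI-capable consoles
            System.out.println(render(p, ANSI_BOLD_ON, ANSI_BOLD_OFF));
        }
    }
}
```

Passing the markers as parameters keeps the traversal separate from how "bold" is displayed, so the same method could feed a console, an HTML string, or some other output.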