Answer to Question #264200 in Java | JSP | JSF for Sougandhika

Question #264200

Write a program to read the content of the website below and all of its sub pages, and perform the following actions:

- Parse all the pages and sub pages of the News, Sports and Business sections
- Extract the content, images and links
- Dump the content, images and links into their respective MongoDB collections

Website: https://timesofindia.indiatimes.com/

Expert's answer
2021-11-10T20:42:42-0500
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import javax.net.ssl.HttpsURLConnection;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        String pageUrl = "https://timesofindia.indiatimes.com/news";
        try {
            HttpsURLConnection connection = (HttpsURLConnection) new URL(pageUrl).openConnection();
            StringBuilder builder = new StringBuilder();
            // try-with-resources closes the stream even if reading fails.
            // The charset is an assumption: the page is expected to be served as UTF-8.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                String data;
                while ((data = reader.readLine()) != null) {
                    builder.append(data).append('\n');
                }
            }
            // Passing the page URL as the base URI lets jsoup resolve relative URLs
            Document document = Jsoup.parse(builder.toString(), pageUrl);

            // Image URLs: lazy-loaded images keep the address in data-src, others in src
            List<String> imgURLs = new ArrayList<>();
            for (Element element : document.getElementsByTag("img")) {
                String imgURL = element.absUrl(element.hasAttr("data-src") ? "data-src" : "src");
                if (!imgURL.isEmpty()) {
                    imgURLs.add(imgURL);
                }
            }
            // imgURLs.forEach(System.out::println);

            // Link targets, resolved to absolute URLs; unresolvable ones are skipped
            List<String> hrefURLs = new ArrayList<>();
            for (Element element : document.getElementsByAttribute("href")) {
                String hrefURL = element.absUrl("href");
                if (!hrefURL.isEmpty()) {
                    hrefURLs.add(hrefURL);
                }
            }
            // hrefURLs.forEach(System.out::println);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
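
The answer above reads a single page, while the question also asks for the sub pages of the News, Sports and Business sections. One possible way to cover them is a breadth-first crawl that only follows links inside those sections. The sketch below is not part of the original answer: the class name SectionCrawler, the path prefixes /news, /sports and /business, and the maxPages limit are all assumptions made for illustration. The page fetching is passed in as a function so that the traversal logic stays independent of the HTTP and parsing code.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Breadth-first crawl limited to a few site sections (hypothetical helper). */
final class SectionCrawler {
    // Assumed path prefixes for the News, Sports and Business sections.
    static final List<String> SECTIONS = List.of("/news", "/sports", "/business");

    /** True when the URL is on the given host and its path lies inside one of SECTIONS. */
    static boolean inScope(String url, String host) {
        try {
            URI uri = URI.create(url);
            String path = uri.getPath();
            if (path == null || !host.equalsIgnoreCase(uri.getHost())) {
                return false;
            }
            for (String section : SECTIONS) {
                if (path.equals(section) || path.startsWith(section + "/")) {
                    return true;
                }
            }
            return false;
        } catch (IllegalArgumentException e) {
            return false; // malformed link, skip it
        }
    }

    /**
     * Visits in-scope pages breadth-first, starting from the seeds.
     * linkExtractor returns the absolute links found on a page.
     * At most maxPages pages are visited; the result keeps visiting order.
     */
    static Set<String> crawl(List<String> seeds, String host, int maxPages,
                             Function<String, List<String>> linkExtractor) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(seeds);
        while (!queue.isEmpty() && visited.size() < maxPages) {
            String url = queue.poll();
            // Skip out-of-scope URLs and pages that were already visited
            if (!inScope(url, host) || !visited.add(url)) {
                continue;
            }
            for (String link : linkExtractor.apply(url)) {
                if (!visited.contains(link)) {
                    queue.add(link);
                }
            }
        }
        return visited;
    }
}
```

In a real run, linkExtractor would fetch each page, for instance with jsoup's Jsoup.connect(url).get(), and return element.absUrl("href") for every link on it. A page limit and a delay between requests are advisable when crawling a live site.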


Library used for HTML parsing: jsoup, https://jsoup.org/
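
The question also asks for the content, images and links to be dumped into separate mongo collections, a step the answer does not cover. As a rough sketch of one way to prepare that step, the code below groups the values extracted from a page into one list of records per collection. The class name PageRecords, the collection names content, images and links, and the field names page, text and url are all hypothetical choices, not anything prescribed by the question or by MongoDB.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Shapes extracted page data into per-collection records (hypothetical helper). */
final class PageRecords {

    /** Builds one record per non-blank value, tagged with the page it came from. */
    static List<Map<String, Object>> toRecords(String pageUrl, String field, List<String> values) {
        List<Map<String, Object>> records = new ArrayList<>();
        for (String value : values) {
            if (value == null || value.trim().isEmpty()) {
                continue; // nothing useful to store
            }
            Map<String, Object> record = new LinkedHashMap<>();
            record.put("page", pageUrl);
            record.put(field, value);
            records.add(record);
        }
        return records;
    }

    /** Returns the records keyed by the (assumed) target collection name. */
    static Map<String, List<Map<String, Object>>> byCollection(
            String pageUrl, String text, List<String> images, List<String> links) {
        Map<String, List<Map<String, Object>>> grouped = new LinkedHashMap<>();
        grouped.put("content", toRecords(pageUrl, "text", Collections.singletonList(text)));
        grouped.put("images", toRecords(pageUrl, "url", images));
        grouped.put("links", toRecords(pageUrl, "url", links));
        return grouped;
    }
}
```

With the MongoDB Java driver on the classpath, each record could then be wrapped with new org.bson.Document(record) and each list written to its collection with MongoCollection.insertMany, leaving out any list that ended up empty. The connection string and database name depend on the deployment and are not shown here.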

