Java Web Scraping

Today I want to introduce a term that may be new to you: web scraping. Web scraping means extracting data from a website programmatically. With this technique we pull a large amount of data from a site and store it in a local file or in a database for our own use. The technique is now used all around the world, because in the information age data is everything: the more data you have, the more power you have. Most programming languages have a package for web scraping; in Python, for example, we use BeautifulSoup.
BeautifulSoup is a fantastic Python tool for web scraping.
I don't know much about Python, but I am good at Java, so here I discuss how to scrape data from a website using Java. For that, I present a small Java library for web scraping.

Jsoup is an open source Java HTML parser.
It is used for scraping in Java and provides DOM, CSS, and jQuery-like methods for extracting and manipulating data.
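
To make the DOM, CSS, and jQuery-like styles concrete, here is a minimal sketch that parses a small inline HTML string instead of a live page. The HTML snippet, the class name `book`, and the id `top` are made up for illustration; the calls used (`Jsoup.parse`, `getElementById`, `getElementsByClass`, `select`, `attr`) are part of Jsoup's public API.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupSelectorSketch {

    // Hypothetical HTML, used only so the example runs without a network call
    static final String HTML =
        "<div id='top'><h1>Books</h1></div>"
      + "<ul>"
      + "<li class='book'><a href='/b/1'>Think and Grow Rich</a></li>"
      + "<li class='book'><a href='/b/2'>Wings of Fire</a></li>"
      + "</ul>";

    // DOM-style lookup by id
    static String heading(Document doc) {
        return doc.getElementById("top").text();
    }

    // DOM-style lookup by class name
    static int bookCount(Document doc) {
        return doc.getElementsByClass("book").size();
    }

    // CSS (jQuery-like) selector: links that are direct children of li.book
    static String firstLinkHref(Document doc) {
        Elements links = doc.select("li.book > a");
        Element first = links.first();
        return first == null ? "" : first.attr("href");
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(heading(doc));
        System.out.println(bookCount(doc));
        System.out.println(firstLinkHref(doc));
    }
}
```

The same methods work on a `Document` fetched with `Jsoup.connect(url).get()`, so selectors can be tried out on a saved or inline snippet before pointing them at a real site.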

Installing Jsoup

There are two ways to install Jsoup in your project for web scraping:
  • Through Maven, by adding it to pom.xml
  • By adding the Jsoup .jar file
Here I use the second one: download the Jsoup jar file and add it to your project's libraries.

To keep your interest in Jsoup going, here is a simple and interesting scraping example.

Scrape details of your favorite item from amazon.in

How interesting would it be to pull information from Amazon without clicking through multiple buttons or tabs? It is great fun, and beginners can also show it off to their friends like a magic trick. Now, without wasting your time, let's write the code.
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

/**
 * @author Dheeraj Kumar
 */
public class JsoupAmazonScrapExample {
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        try {
            // Fetch the search results page and parse it into a DOM
            Document doc = Jsoup.connect("http://www.amazon.in/s/ref=nb_sb_noss_2?url=search-alias%3Daps&field-keywords=books").get();
            // Each search result sits in an element with the class "s-item-container"
            Elements links = doc.getElementsByClass("s-item-container");
            for (Element link : links) {
                // Print all the text inside one result container
                System.out.println("\ntext : " + link.text());
            }
        } catch (IOException e) {
            // Report network or HTTP errors instead of silently ignoring them
            System.out.println(e);
        }
    }
}
Output:
text : The Power of your Subconscious MindDecember 2015 by Joseph Murphy Paperback   99  199 You Save:   100 (50%) prime More Buying Choices   98offer(154 offers) Kindle Edition   49  150 You Save:   101 (67%) Other Formats:Hardcover, Paperback, Paperback, MP3 CD 4.5 out of 5 stars 2,466 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : General Knowledge 20182017 by Manohar Pandey Paperback   26  30 You Save:   4 (13%) prime More Buying Choices   11offer(43 offers) 4.2 out of 5 stars 764 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : Think and Grow Rich1 January 2014 by Napoleon Hill Paperback   75  150 You Save:   75 (50%) prime More Buying Choices   59offer(109 offers) Kindle Edition   29  49 You Save:   20 (40%) Other Formats:Hardcover, Paperback, Mass Market Paperback, Audio CD, Audio Cassette, CD-ROM 4.5 out of 5 stars 1,956
text : This Is Not Your Story14 February 2017 by Savi Sharma Paperback   88  175 You Save:   87 (49%) prime More Buying Choices   65offer(119 offers) Kindle Edition   49  139 You Save:   90 (64%) 4.6 out of 5 stars 1,625 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : Attitude Is Everything: Change Your Attitude ... Change Your Life!15 May 2015 by Jeff Keller Paperback   104  199 You Save:   95 (47%) prime More Buying Choices   104offer(46 offers) Kindle Edition   98.80  199 You Save:   100.20 (50%) 4.5 out of 5 stars 143 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : Black Holes: The Reith Lectures11 July 2016 by Stephen Hawking Paperback   80  99 You Save:   19 (19%) prime More Buying Choices   80offer(42 offers) Kindle Edition   49  84 You Save:   35 (41%) 4.3 out of 5 stars 74 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : The Monk Who Sold His Ferrari25 September 2003 by Robin Sharma Paperback   98  199 You Save:   101 (50%) prime More Buying Choices   70offer(202 offers) Kindle Edition   49  199 Other Formats:Hardcover, Mass Market Paperback, Audio CD 4.5 out of 5 stars 2,282
text : Three Thousand Stitches: Ordinary People, Extraordinary Lives14 July 2017 by Sudha Murty Paperback   125  250 You Save:   125 (50%) prime More Buying Choices   105offer(58 offers) Kindle Edition   99  212 You Save:   113 (53%) 4.3 out of 5 stars 109 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : I Do What I Do4 September 2017 by Raghuram G. Rajan Hardcover   349  699 You Save:   350 (50%) prime More Buying Choices   245offer(103 offers) Kindle Edition   199  699 You Save:   500 (71%) 4.3 out of 5 stars 163
text : Diary of a Wimpy Kid: The Getaway (book 12)7 November 2017 by Jeff Kinney Hardcover   184  399 You Save:   215 (53%) prime More Buying Choices   86.21offer(67 offers) Kindle Edition   174.80  339 You Save:   164.20 (48%) Other Formats:Audio CD 4.9 out of 5 stars 98 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : Inner Engineering: A Yogi’s Guide to Joy12 December 2016 by Sadhguru Paperback   156  299 You Save:   143 (47%) prime More Buying Choices   156offer(88 offers) Kindle Edition   99  299 You Save:   200 (66%) Other Formats:Hardcover 4.7 out of 5 stars 723 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : Wings of Fire: An Autobiography of Abdul Kalam1999 by Tiwari Paperback   188  350 You Save:   162 (46%) prime More Buying Choices   100offer(445 offers) 4.6 out of 5 stars 1,756
text : The Theory of Everything1 January 2007 | Special Edition by Stephen Hawking Paperback   169  199 You Save:   30 (15%) prime More Buying Choices   90offer(53 offers) Kindle Edition   78.55  199 Other Formats:Hardcover, Audio CD, Audio Cassette 4.5 out of 5 stars 352
text : How to Win Friends and Influence People26 September 2016 by Dale Carnegie Paperback   95  199 You Save:   104 (52%) prime More Buying Choices   95offer(69 offers) Kindle Edition   15.72 Other Formats:Hardcover, Paperback, Mass Market Paperback, Audio CD, Audio Cassette 4.5 out of 5 stars 1,609 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
text : Sita - Warrior of Mithila (Book 2- Ram Chandra Series): An adventure thriller that follows Lady Sita’s journey, set in mythological times29 May 2017 by Amish Tripathi Paperback   195  350 You Save:   155 (44%) prime More Buying Choices   109offer(179 offers) Kindle Edition   49  350 You Save:   301 (86%) 4.2 out of 5 stars 1,279
text : Our Story Needs No Filter26 July 2017 by Sudeep Nagarkar Paperback   95  199 You Save:   104 (52%) prime More Buying Choices   95offer(80 offers) Kindle Edition   49  169 You Save:   120 (71%) 4.8 out of 5 stars 734 See DetailsOffer: Rs.75 back with Amazon Pay balance See Details
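
Each line of the output above is the whole text of one result container run together, so the title, date, and prices are hard to separate. A possible next step is to select individual fields inside each container and emit one CSV line per result, which is also a convenient shape for saving to a local file. The sketch below is only an illustration: `s-item-container` comes from the example above, but the class names `title` and `price` and the inline HTML are assumptions made up for this sketch, not Amazon's real markup, and `selectFirst` needs Jsoup 1.11.1 or newer.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ResultFieldsSketch {

    // Quote a value for CSV: wrap it in double quotes and double any embedded quotes
    static String csv(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Turn each result container into one "title,price" CSV line.
    // "title" and "price" are hypothetical class names chosen for this sketch.
    static List<String> toCsvLines(String html) {
        Document doc = Jsoup.parse(html);
        List<String> lines = new ArrayList<>();
        for (Element item : doc.getElementsByClass("s-item-container")) {
            Element title = item.selectFirst(".title");
            Element price = item.selectFirst(".price");
            if (title == null || price == null) {
                continue; // skip containers that lack either field
            }
            lines.add(csv(title.text()) + "," + csv(price.text()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up HTML so the example runs without a network call
        String html =
            "<div class='s-item-container'><h2 class='title'>Wings of Fire</h2>"
          + "<span class='price'>188</span></div>"
          + "<div class='s-item-container'><h2 class='title'>No price here</h2></div>";
        for (String line : toCsvLines(html)) {
            System.out.println(line);
        }
    }
}
```

To keep the result, the returned list can be passed to `java.nio.file.Files.write` with a target path. On a real site, the selectors have to be read from the page's current HTML, which may change at any time.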
