What is web scraping in data science with example programs

WHAT IS WEB SCRAPING?

  1. The web contains a large amount of data, which can be structured or unstructured.
  2. Each website has its own layout, style and markup.
  3. Web scraping is a technique for extracting information from websites automatically.

PYTHON LIBRARIES USED

requests

  1. The requests library fetches the web page.
  2. Pass the URL of the web page to the requests.get() function.
  3. If the page is downloaded successfully, the status_code of the response is 200.
  4. Then use page.text, where page is the response returned by requests.get(), to get the content of the web page along with its HTML tags.
  5. The raw content is not in a readable format.
  6. It is difficult to read because it has no consistent alignment, spacing or indentation.
  7. Therefore, the BeautifulSoup library is used.
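
The fetching steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper name fetch_page, the 10-second timeout and the example URL are assumptions made for this sketch, while requests.get(), status_code and text are part of the requests library.

```python
import requests


def fetch_page(url, timeout=10):
    """Download a web page and return its raw HTML as a string.

    fetch_page is a hypothetical helper written for this sketch.
    """
    # requests.get() sends an HTTP GET request to the given URL.
    page = requests.get(url, timeout=timeout)

    # A status code of 200 means the page was downloaded successfully.
    if page.status_code != 200:
        raise RuntimeError(f"Request failed with status code {page.status_code}")

    # page.text holds the page content, HTML tags included.
    return page.text


# Example call (placeholder URL, needs network access):
# html = fetch_page("https://example.com")
# print(html[:200])
```

Printing the returned string usually shows HTML with little helpful indentation, which is why a parser is applied next.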

BeautifulSoup

  1. BeautifulSoup pulls data out of HTML and XML files.
  2. It prettifies the content by giving it proper spacing, alignment and indentation.
  3. The prettify() method in BeautifulSoup is used for this task.
  4. Finally, extract the tags you need using the BeautifulSoup find_all() method, which returns every tag that matches.
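
A small sketch of these steps, run on an inline HTML string so that no network access is needed. The sample HTML and the variable names are made up for illustration; BeautifulSoup(), prettify(), find_all() and get_text() are part of the bs4 package, and "html.parser" is the parser bundled with Python.

```python
from bs4 import BeautifulSoup

# Sample HTML invented for this sketch; in practice this would be page.text.
html = (
    "<html><body><h1>Fruits</h1>"
    "<ul><li>Apple</li><li>Mango</li></ul>"
    "<a href='https://example.com/more'>More</a>"
    "</body></html>"
)

# Parse the HTML with Python's built-in parser.
soup = BeautifulSoup(html, "html.parser")

# prettify() returns the same markup with one tag per line and indentation.
pretty = soup.prettify()
print(pretty)

# find_all() returns a list of every matching tag.
items = [li.get_text() for li in soup.find_all("li")]
link_urls = [a["href"] for a in soup.find_all("a")]

print(items)      # ['Apple', 'Mango']
print(link_urls)  # ['https://example.com/more']
```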

PROGRAMS

Here are some easy programs on web scraping to get you started.
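
As one starting point, a minimal end-to-end sketch might look like the following. It is an illustration under stated assumptions rather than a definitive program: the function name scrape_tag_text, the timeout value and the example URL are hypothetical, and real sites may need extra handling such as request headers or rate limiting.

```python
import requests
from bs4 import BeautifulSoup


def scrape_tag_text(url, tag, timeout=10):
    """Fetch a page and return the text of every <tag> element on it.

    scrape_tag_text is a hypothetical helper written for this sketch.
    """
    # Download the page; raise_for_status() raises an error on 4xx/5xx responses.
    page = requests.get(url, timeout=timeout)
    page.raise_for_status()

    # Parse the HTML and collect the stripped text of each matching tag.
    soup = BeautifulSoup(page.text, "html.parser")
    return [element.get_text(strip=True) for element in soup.find_all(tag)]


# Example call (placeholder URL, needs network access):
# for heading in scrape_tag_text("https://example.com", "h1"):
#     print(heading)
```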