Scraping News v3

This project is a news scraper built with Selenium, a framework for automating web applications through a browser.

The code starts from the search page of a news site, applies a filter for a period of time, and fills the search field with the keywords. As a result, the site lists the news relevant to those keywords within that period. For each news item, the algorithm collects the title, description, date, and full news URL. After collecting all this information, the algorithm visits each stored URL and collects the news content.
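The second pass described above (visiting each stored URL to collect the content) could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `NewsItem`, `collect_contents`, and `selenium_fetcher` are hypothetical names, and reading the article text from the `<body>` element is an assumption that a real site would replace with a more specific locator.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class NewsItem:
    """One search result; field names are illustrative, not the project's."""
    title: str
    description: str
    date: str
    url: str
    content: str = ""


def collect_contents(items: Iterable[NewsItem],
                     fetch_text: Callable[[str], str]) -> List[NewsItem]:
    """Second pass: visit each stored URL and attach the article text."""
    collected = []
    for item in items:
        try:
            item.content = fetch_text(item.url)
        except Exception as exc:  # keep going if one page fails to load
            item.content = ""
            print(f"skipped {item.url}: {exc}")
        collected.append(item)
    return collected


def selenium_fetcher(driver) -> Callable[[str], str]:
    """Build a fetch_text function backed by a Selenium WebDriver.

    Assumption: the article text is readable from the <body> element.
    """
    from selenium.webdriver.common.by import By  # imported lazily

    def fetch_text(url: str) -> str:
        driver.get(url)  # load the news page in the browser
        return driver.find_element(By.TAG_NAME, "body").text

    return fetch_text
```

With Selenium and Chrome installed, one might create a driver with `webdriver.Chrome()` and call `collect_contents(items, selenium_fetcher(driver))`. Passing the fetch function as a parameter keeps the loop easy to try out without opening a browser.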

Selenium is a portable framework for testing web applications. It provides a record-and-playback tool for creating functional tests without the need to learn a test scripting language.

Disclaimer: Misuse of this library/software is the sole responsibility of the user. This code was developed for academic projects, and the data collection was approved by the sites involved.

Current development status

All methods are still under development at the time of writing.

Installation

The repo is structured as a package, so it can be installed with pip from the GitHub clone URL. From the command line, type:

pip install git+https://github.com/luizeduardomr/ScrapingNews.v3

To upgrade the package if you have already installed it:

pip install git+https://github.com/luizeduardomr/ScrapingNews.v3 --upgrade

Please note that you should also install the Google Chrome browser to get the most out of this software.
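As a quick check before running the scraper, something like the sketch below could look for a Chrome executable on the PATH. The function name `find_chrome` and the list of executable names are assumptions for illustration; the names vary by operating system and install method.

```python
import shutil
from typing import Optional, Sequence

# Common executable names; assumptions that vary by OS and install method.
DEFAULT_NAMES = (
    "google-chrome",
    "google-chrome-stable",
    "chrome",
    "chromium",
    "chromium-browser",
)


def find_chrome(names: Sequence[str] = DEFAULT_NAMES) -> Optional[str]:
    """Return the path of the first browser executable found on PATH, else None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None
```

A `None` result does not prove Chrome is missing: on macOS and Windows the browser is often installed outside the PATH.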

About the repository

This repository is intended to help developers understand the scraping process. The purpose is NOT to release the code for malicious use or to distribute the software for anyone to use.
