higorpo/google-news-scraper-api

Google News Scraper API

This repository contains the source code of an API that scrapes the Google News page. It is written in TypeScript (Node.js) using the Puppeteer library together with Express.

You can call the API endpoint with:

curl -X GET "https://google-news-scraper-api.herokuapp.com/news?lang=<lang-code>"

For example:

curl -X GET "https://google-news-scraper-api.herokuapp.com/news?lang=en-US"

You can also include the parameter q to search for specific news items. Quote the URL so the shell does not interpret the & character:

curl -X GET "https://google-news-scraper-api.herokuapp.com/news?lang=en-US&q=covid19"
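
The same requests can be made from TypeScript. The following is a minimal sketch, not code from this repository: the helper names buildNewsUrl and fetchNews are hypothetical, and it assumes a runtime that provides the global URL and fetch APIs (for example Node.js 18 or later, or a browser).

```typescript
// Base endpoint taken from the curl examples in this README.
const NEWS_ENDPOINT = "https://google-news-scraper-api.herokuapp.com/news";

// Hypothetical helper: builds the request URL from the lang and optional q parameters.
function buildNewsUrl(lang: string, query?: string): string {
  const url = new URL(NEWS_ENDPOINT);
  url.searchParams.set("lang", lang);
  if (query !== undefined && query !== "") {
    // searchParams takes care of percent-encoding the search term.
    url.searchParams.set("q", query);
  }
  return url.toString();
}

// Hypothetical helper: sends the GET request and returns the parsed JSON body.
async function fetchNews(lang: string, query?: string): Promise<unknown> {
  const response = await fetch(buildNewsUrl(lang, query));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Building the URL through searchParams avoids hand-escaping search terms that contain spaces or other reserved characters.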

The API returns a JSON array of objects with the following structure:

{
    "news_title": string,
    "news_url": string,
    "newspaper_name": string,
    "datetime": string | null,
}
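
A client may want to check the response against this structure before using it. The sketch below is one possible way to do that. The names NewsItem, isNewsItem and parseNewsResponse are illustrative and do not come from the repository.

```typescript
// Shape of one item, mirroring the structure documented in this README.
interface NewsItem {
  news_title: string;
  news_url: string;
  newspaper_name: string;
  datetime: string | null;
}

// Hypothetical type guard: checks every documented field at runtime.
function isNewsItem(value: unknown): value is NewsItem {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const item = value as Record<string, unknown>;
  return (
    typeof item["news_title"] === "string" &&
    typeof item["news_url"] === "string" &&
    typeof item["newspaper_name"] === "string" &&
    (typeof item["datetime"] === "string" || item["datetime"] === null)
  );
}

// Hypothetical helper: accepts a parsed JSON body and returns it as NewsItem[],
// throwing if the body is not an array of well-formed items.
function parseNewsResponse(body: unknown): NewsItem[] {
  if (!Array.isArray(body) || !body.every(isNewsItem)) {
    throw new Error("Unexpected response shape");
  }
  return body as NewsItem[];
}
```

Note that datetime may be null, so callers should handle a missing timestamp explicitly rather than passing it straight to a date parser.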

Important: The API caches results for 20 minutes, so the data is refreshed at most once every 20 minutes.
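
To illustrate how a 20-minute cache of this kind can behave, here is a minimal time-to-live cache sketch. It is an assumption about the general technique, not the repository's actual implementation: the class TtlCache, the constant CACHE_TTL_MS, and the idea of keying entries by language code are all hypothetical.

```typescript
// 20 minutes in milliseconds, matching the cache duration stated in this README.
const CACHE_TTL_MS = 20 * 60 * 1000;

interface CacheEntry<T> {
  value: T;
  expiresAt: number; // timestamp (ms) after which the entry is stale
}

// Hypothetical in-memory cache whose entries expire after a fixed time-to-live.
class TtlCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      return undefined;
    }
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that stays valid for ttlMs from the current time.
  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

With a cache like this, a request handler would first call get, and only scrape and call set when the result is undefined. For clients, the practical consequence is that polling the API more often than every 20 minutes is unlikely to return newer data.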

--

Created for study purposes.