🏴☠️ Web scrape Meetup.com now that they have locked down their API. 😤
In August 2019, Meetup.com shut down open access to their API. To gain access to the API, you now need to apply and get approval. And in order to apply, you need to pay for a 💸PRO account💸. This change hurt individual groups, like the Coffee Club of Seattle, which used the API to help organizers schedule new events based on historical data.
Without access to the API, one now needs to scrape the website to get event details. Let's get started! 💪🏻
Every public Meetup group has a page listing its last 10 events.
Example: [https://www.meetup.com/seattle-coffee-club/events/past/]
getEventHistory.js scrapes that page and pulls the eventID out of each a.eventCard--link element. My sample code adds each eventID to a MySQL table for processing in the next step.
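To make the eventID step concrete, here is a minimal sketch of pulling the ID out of a link's href. It is not the code from getEventHistory.js. It assumes the href of each a.eventCard--link element looks like the event URLs in this README (/group-name/events/NUMBER/), and the function names extractEventId and extractEventIds are made up for illustration.

```javascript
// Hypothetical helper, not part of getEventHistory.js.
// Assumes event links contain /events/<numeric id>/ as in the README examples.
function extractEventId(href) {
  const match = /\/events\/(\d+)/.exec(href);
  return match ? match[1] : null;
}

// Takes the href values collected from the a.eventCard--link elements
// and returns the unique event IDs, skipping links that are not event pages.
function extractEventIds(hrefs) {
  const ids = hrefs.map(extractEventId).filter((id) => id !== null);
  return [...new Set(ids)];
}

const sampleHrefs = [
  'https://www.meetup.com/seattle-coffee-club/events/265684295/',
  'https://www.meetup.com/seattle-coffee-club/events/past/',
];
console.log(extractEventIds(sampleHrefs));
```

Links without a numeric ID, like the past events page itself, are skipped rather than treated as errors.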
With the eventID, you can build a direct link to an event page.
Example: [https://www.meetup.com/seattle-coffee-club/events/265684295/]
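Building that link is just string assembly. A small sketch, assuming the URL pattern shown in the example above holds for other groups too (buildEventUrl and groupSlug are names I made up for illustration):

```javascript
// Hypothetical helper: builds the direct event page URL from the group's
// URL name (the part after meetup.com/) and an eventID.
function buildEventUrl(groupSlug, eventId) {
  return `https://www.meetup.com/${groupSlug}/events/${eventId}/`;
}

console.log(buildEventUrl('seattle-coffee-club', '265684295'));
```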
On this page, the scraper pulls all the details related to the event and the venue. The only piece of data that was available in the API but is missing from the scrape is the venueID. If you already have a table of prior events saved, you can write a query that matches on the latitude and longitude of the venue to find it. 😎
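If your prior venues are already loaded into memory, the same latitude and longitude match can be sketched in JavaScript instead of SQL. This is only an illustration: the field names (venueID, lat, lon), the tolerance value, and the sample venue are all assumptions, so adjust them to whatever your own table uses.

```javascript
// Hypothetical helper: looks up a venueID by comparing coordinates.
// A small tolerance is used because scraped coordinates may not match
// the stored values to the last decimal place (the default is a guess).
function findVenueId(knownVenues, lat, lon, tolerance = 0.0001) {
  const match = knownVenues.find(
    (venue) =>
      Math.abs(venue.lat - lat) <= tolerance &&
      Math.abs(venue.lon - lon) <= tolerance
  );
  return match ? match.venueID : null;
}

// Made-up sample data for illustration only.
const knownVenues = [{ venueID: 101, lat: 47.6205, lon: -122.3493 }];
console.log(findVenueId(knownVenues, 47.62051, -122.34929));
```

When nothing is within the tolerance, the function returns null, which is a reasonable signal that the venue is new.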
Once the data is pulled, this code will save a JSON file locally. You can process your JSON in whatever way is best for your situation.
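Saving the file could look something like the sketch below. It is not the repo's actual code. It assumes the scraped event is a plain object with an eventID property and that one file per event, named after the eventID, is what you want. saveEventJson is a hypothetical name.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: writes one scraped event to <outputDir>/<eventID>.json
// and returns the path of the file it wrote.
function saveEventJson(event, outputDir) {
  // Create the output folder if it does not exist yet.
  fs.mkdirSync(outputDir, { recursive: true });
  const filePath = path.join(outputDir, `${event.eventID}.json`);
  // Pretty-print so the files are easy to inspect by hand.
  fs.writeFileSync(filePath, JSON.stringify(event, null, 2));
  return filePath;
}
```

Usage would be along the lines of saveEventJson(eventDetails, './json'), where eventDetails is whatever object your scrape produced.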
getEventHistoryAndJSON.js combines both steps.
On the first 20 scrapes, the code was able to pull full event and venue details for 19. One was returned as a partial with missing data. If I can improve the scraper, I'll update this repo.
The risk with any scraper is that it only works until the website gets redesigned, at which point the code will need to be modified to work with the new layout.
processEventJSON.js takes the event JSON files in the /json/ folder and FTPs them up to a web server. It then moves each uploaded file into a sent folder.
I used the Last 10 page for my group, because we have all the legacy data saved for over 1,300 events going back to 2006. If I didn't have that data, I'd look into scraping the monthly pages.
Example: [https://www.meetup.com/seattle-coffee-club/events/calendar/2019-09/]
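If you go the monthly route, the first step would be generating the list of calendar URLs to visit. Here is a rough sketch, assuming every month follows the YYYY-MM pattern from the example above. I have not scraped these pages myself, and the function names are made up for illustration.

```javascript
// Hypothetical helper: builds the calendar page URL for one month.
function buildCalendarUrl(groupSlug, year, month) {
  const mm = String(month).padStart(2, '0');
  return `https://www.meetup.com/${groupSlug}/events/calendar/${year}-${mm}/`;
}

// Hypothetical helper: builds one URL per month from start to end, inclusive.
// start and end are objects like { year: 2019, month: 9 }.
function buildCalendarUrls(groupSlug, start, end) {
  const urls = [];
  let { year, month } = start;
  while (year < end.year || (year === end.year && month <= end.month)) {
    urls.push(buildCalendarUrl(groupSlug, year, month));
    month += 1;
    if (month > 12) {
      // Roll over from December to January of the next year.
      month = 1;
      year += 1;
    }
  }
  return urls;
}

console.log(buildCalendarUrl('seattle-coffee-club', 2019, 9));
```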