This repository contains datasets scraped from Instagram pages of popular fashion bloggers, based on more than 400 fashion review articles.

jinzhenfan/instagramBlogger

instagramBlogger

This repository contains datasets scraped from the Instagram pages of popular fashion bloggers, based on 391 fashion review articles. First, I ran a Google search for "Instagram fashion Blogger". I then used a Chrome add-on, Linkclump, to collect the links of 391 fashion review articles from the search results and saved them in googlesearchlinks.csv. A guide to using Linkclump is available here: https://www.linkedin.com/pulse/how-scrape-1000-google-search-result-links-5-minutes-graham-onak

Afterwards, I wrote a Python script, soup.py, that uses BeautifulSoup to scrape the accounts of all the recommended bloggers from the 391 links. The results are saved in bloggerList.csv.
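As a rough illustration of the account-extraction step, here is a minimal sketch of pulling Instagram handles out of an article's HTML with BeautifulSoup. It is not the code in soup.py: the function name, the URL pattern, and the list of non-profile path segments are all assumptions made for this example.

```python
import re

from bs4 import BeautifulSoup

# Assumed pattern for profile links such as https://www.instagram.com/some_user/
INSTAGRAM_PROFILE = re.compile(
    r"^https?://(?:www\.)?instagram\.com/([A-Za-z0-9._]+)/?(?:\?.*)?$"
)

# Assumed list of path segments that are site pages rather than user profiles
NON_PROFILE = {"p", "explore", "accounts", "about", "developer"}


def extract_instagram_handles(html):
    """Return the unique Instagram handles linked from a page, in order of appearance."""
    soup = BeautifulSoup(html, "html.parser")
    handles = []
    for anchor in soup.find_all("a", href=True):
        match = INSTAGRAM_PROFILE.match(anchor["href"].strip())
        if not match:
            continue  # not a link to an Instagram profile
        handle = match.group(1).lower()
        if handle not in NON_PROFILE and handle not in handles:
            handles.append(handle)
    return handles
```

Running a function like this over the HTML of each link in googlesearchlinks.csv and merging the results would give a blogger list similar in spirit to bloggerList.csv.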

Then I wrote blogger_info_collection.py to scrape the raw HTML of each blogger's Instagram page and extract statistics on followers, posts, tags, likes, comments, icons, and emoji. The python-instagram API was no longer maintained, so I wrote my own scripts for this job.
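The tag and emoji statistics could be gathered along the following lines once the caption text has been pulled out of the page. This is only a sketch using the standard library, not the logic in blogger_info_collection.py: the function names are hypothetical, and the emoji check is a rough code-point heuristic rather than a complete emoji definition.

```python
import re
from collections import Counter

# A hashtag is taken to be "#" followed by word characters
HASHTAG = re.compile(r"#(\w+)")

# Rough heuristic (assumption): common emoji and symbol code-point blocks
EMOJI_RANGES = ((0x1F300, 0x1FAFF), (0x2600, 0x27BF))


def is_emoji(char):
    """Return True if the character falls inside one of the assumed emoji ranges."""
    code = ord(char)
    return any(low <= code <= high for low, high in EMOJI_RANGES)


def summarize_captions(captions):
    """Count hashtags (lowercased) and emoji across a list of caption strings."""
    tags = Counter()
    emoji = Counter()
    for caption in captions:
        tags.update(tag.lower() for tag in HASHTAG.findall(caption))
        emoji.update(ch for ch in caption if is_emoji(ch))
    return tags, emoji
```

Calling `summarize_captions` on all captions of one account returns two `Counter` objects, so `tags.most_common(10)` would list that blogger's ten most used hashtags.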

A sample of the statistical metrics is saved in Follower_Posts_Nums_Comments_Likes.csv. The columns represent account id, number of followers, number of posts, icon image link, max_comments, min_comments, mean_comments, max_likes, min_likes, and mean_likes.
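A file with this column layout could be loaded as sketched below. The sketch assumes the CSV has no header row and that every column except the account id and icon link is numeric. The names `account_id`, `followers`, `posts`, and `icon_url` are labels chosen for this example, and `likes_per_follower` is just one illustrative derived metric, not something the repository computes.

```python
import csv

# Column order as described above; the first four names are assumed labels
COLUMNS = [
    "account_id", "followers", "posts", "icon_url",
    "max_comments", "min_comments", "mean_comments",
    "max_likes", "min_likes", "mean_likes",
]
TEXT_COLUMNS = {"account_id", "icon_url"}


def load_metrics(lines):
    """Parse CSV lines (e.g. an open file) into dicts with numeric fields as floats.

    Assumes there is no header row; a header would fail the float conversion.
    """
    rows = []
    for record in csv.DictReader(lines, fieldnames=COLUMNS):
        row = {}
        for name in COLUMNS:
            value = record[name]
            row[name] = value if name in TEXT_COLUMNS else float(value)
        rows.append(row)
    return rows


def likes_per_follower(row):
    """Illustrative engagement ratio: mean likes divided by follower count."""
    return row["mean_likes"] / row["followers"] if row["followers"] else 0.0
```

With the real file, one would pass `open("Follower_Posts_Nums_Comments_Likes.csv", newline="")` to `load_metrics` and then, for example, sort the rows by `likes_per_follower`.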

One can further use these scripts and datasets to perform analysis on these popular Instagram fashion bloggers.
