This repository contains datasets scraped from the Instagram pages of popular fashion bloggers, based on more than 391 fashion review articles. First, I ran a Google search for "Instagram fashion Blogger". I then used a Chrome add-on, Linkclump, to collect the links to 391 fashion review articles from the search results and saved them in googlesearchlinks.csv. Here is a guide on how to use Linkclump: https://www.linkedin.com/pulse/how-scrape-1000-google-search-result-links-5-minutes-graham-onak Afterwards, I wrote a Python script, soup.py, that uses BeautifulSoup to scrape the accounts of all the recommended bloggers from the 391 links. The results were saved in bloggerList.csv.
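As a rough illustration of the account-extraction step, here is a minimal sketch of pulling Instagram account names out of an article's HTML. It is not the code in soup.py: it uses the standard-library html.parser instead of BeautifulSoup so that it runs without extra dependencies, and the regular expression, the NON_ACCOUNT_PATHS set, the extract_accounts helper, and the sample HTML are all assumptions made for this example.

```python
import re
from html.parser import HTMLParser

# Assumed pattern for profile links such as https://www.instagram.com/some_account/
INSTAGRAM_PROFILE = re.compile(
    r"^https?://(?:www\.)?instagram\.com/([A-Za-z0-9._]+)/?(?:\?.*)?$"
)

# Assumed list of first path segments that are site pages, not user accounts
NON_ACCOUNT_PATHS = {"p", "explore", "accounts", "about", "developer"}


class InstagramLinkParser(HTMLParser):
    """Collects Instagram account names found in <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.accounts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; value can be None
        href = dict(attrs).get("href") or ""
        match = INSTAGRAM_PROFILE.match(href.strip())
        if match:
            name = match.group(1).lower()
            # Keep first-seen order and skip duplicates
            if name not in NON_ACCOUNT_PATHS and name not in self.accounts:
                self.accounts.append(name)


def extract_accounts(html_text):
    """Return the account names linked from one article's HTML (hypothetical helper)."""
    parser = InstagramLinkParser()
    parser.feed(html_text)
    return parser.accounts


# Made-up article fragment, for illustration only
sample_html = """
<p>Follow <a href="https://www.instagram.com/example_blogger/">@example_blogger</a>
and <a href="http://instagram.com/Another.Blogger">Another</a>.
See <a href="https://www.instagram.com/p/abc123/">this post</a> or
<a href="https://example.com/about">our site</a>.</p>
"""
print(extract_accounts(sample_html))
```

Links to individual posts and to other sites are ignored, so only profile links contribute account names. Real articles may also mention bloggers as plain @handles without a link, which this sketch does not handle.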
Then I wrote blogger_info_collection.py to scrape the raw HTML of each blogger's Instagram page and extract statistics on followers, posts, tags, likes, comments, icons, and emoji. The python-instagram API was no longer maintained, so I wrote my own scripts for this job.
A sample of the resulting statistics is saved in Follower_Posts_Nums_Comments_Likes.csv. The columns represent account id, number of followers, number of posts, icon image link, max_comments, min_comments, mean_comments, max_likes, min_likes, and mean_likes.
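To make the column layout concrete, here is a minimal sketch of how one such row could be built from per-post comment and like counts. It is not taken from blogger_info_collection.py: the summarize_account helper, the header names in COLUMNS, and the sample numbers are assumptions for illustration, and the actual CSV may use different headers or none at all.

```python
import csv
import io
from statistics import mean

# Assumed header names, following the column order described above
COLUMNS = [
    "account_id", "followers", "posts", "icon_url",
    "max_comments", "min_comments", "mean_comments",
    "max_likes", "min_likes", "mean_likes",
]


def summarize_account(account_id, followers, posts, icon_url, comments, likes):
    """Build one summary row from per-post comment and like counts (hypothetical helper)."""
    if not comments or not likes:
        raise ValueError("need at least one post to summarize")
    return {
        "account_id": account_id,
        "followers": followers,
        "posts": posts,
        "icon_url": icon_url,
        "max_comments": max(comments),
        "min_comments": min(comments),
        "mean_comments": mean(comments),
        "max_likes": max(likes),
        "min_likes": min(likes),
        "mean_likes": mean(likes),
    }


# Made-up numbers for a made-up account
row = summarize_account(
    "example_blogger", 12000, 3, "https://example.com/icon.jpg",
    comments=[10, 20, 30], likes=[100, 250, 400],
)

# Write to an in-memory buffer; swap in open("out.csv", "w", newline="") for a real file
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```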
One can further use these scripts and datasets to perform analysis on these popular Instagram fashion bloggers.