kaustubhgupta/blogathon-analysis

Analytics Vidhya Blogathon Data Analysis 📈📉📊

blogathon-analysis

Blogathons are competitions that run for a month or so where, instead of coding, participants create technical content on any relevant topic. In this analysis, relevant data was extracted from the blogathons and used to draw conclusions about article and blogathon trends, including the most popular and the least explored topics.

Step-by-Step Extraction and Analysis Article Link: Guide For Data Analysis: From Data Extraction to Dashboard

Final Dashboard 📊

Conclusions 🔥

  • The views went from 280k in the 6th edition to 481k in the 7th edition. This happened because the team introduced a base price for all the articles published.
  • I don’t have category data for the blogathon 8 articles, but I believe that, compared to blogathon 8, blogathon 9 saw a huge surge of articles in advanced categories such as NLP, data engineering, and computer vision, because these categories carried higher prizes than normal articles. As many as 108 advanced articles were published in blogathon 9.
  • In blogathon 9, a record 300 blogs were published, the highest of any edition, with a total of 628k views. At the same time, the least-viewed article got only 13 views! That’s why blogathon 10 set a threshold of 500 views: until then, prizes had to be given even for articles with that few views. The threshold didn’t reduce the number of articles; in fact, the second-highest count, 284 articles, was posted. The shocking thing was that this edition, blogathon 10, recorded a total of 1 million views even with the threshold condition.
  • The 11th edition had a hard time, with only 222k views and 123 articles. I think that’s the reason a new category, guide, was introduced in blogathon 12.
  • May 2021 had the highest number of views of all time, peaking at 0.79M.
  • Python, Data, beginner, learning, and project are some of the most popular categories of all time.
  • Docker, R, Julia, Excel, and Deployment are some of the least explored categories.
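The edition-level figures quoted above can be compared with some quick arithmetic. The snippet below is only a sketch: it uses the rounded numbers from the conclusions (so the results are approximate), and the variable and function names are made up for illustration.

```python
# Rounded figures taken from the conclusions above.
views_6th = 280_000
views_7th = 481_000

def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

growth_6_to_7 = round(pct_change(views_6th, views_7th), 1)
print(f"6th -> 7th edition: {growth_6_to_7}% more views")

# edition -> (articles published, total views), also rounded figures from above.
editions = {
    9: (300, 628_000),
    10: (284, 1_000_000),
    11: (123, 222_000),
}

# Average views per article for each edition.
views_per_article = {
    edition: round(views / articles)
    for edition, (articles, views) in editions.items()
}
print(views_per_article)
```

On these rounded figures, blogathon 10 has the highest average views per article of the three editions, even though it had fewer articles than blogathon 9.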

Project Tech Stack 🏟

  • Python (Language)
  • Libraries
    • urllib.request (Making requests to the website)
    • pandas (Data manipulation)
    • re (Rules for data extraction)
    • time (Handling requests)
    • numpy (Data manipulation)
    • tqdm (Progress bars for extraction process)
  • PowerBI (Data wrangling, visualization and dashboard creation)
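To give a rough idea of how the extraction libraries listed above could fit together, here is a minimal sketch using only `urllib.request`, `re`, and `time`. It is not the repository's actual code: the HTML markup, the regular expression, the helper names `fetch_html` and `parse_articles`, and the sample titles are all assumptions made for illustration. The real page structure is described in the linked article.

```python
import re
import time
import urllib.request

# Hypothetical markup: the real page structure is not shown in this README.
ARTICLE_PATTERN = re.compile(
    r'<h2 class="title">(?P<title>.*?)</h2>\s*'
    r'<span class="views">(?P<views>[\d,]+)</span>',
    re.DOTALL,
)

def fetch_html(url, delay_seconds=1.0):
    """Download one page, then pause so that requests are spaced out."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request) as response:
        html = response.read().decode("utf-8", errors="replace")
    time.sleep(delay_seconds)
    return html

def parse_articles(html):
    """Extract (title, views) records from a page using the regex rules."""
    rows = []
    for match in ARTICLE_PATTERN.finditer(html):
        rows.append({
            "title": match.group("title").strip(),
            "views": int(match.group("views").replace(",", "")),
        })
    return rows

# Made-up sample page, so the parsing step can be tried without a network call.
sample_html = """
<h2 class="title">Intro to Pandas</h2> <span class="views">1,234</span>
<h2 class="title">Docker Basics</h2>
<span class="views">567</span>
"""
rows = parse_articles(sample_html)
print(rows)
```

In a full run, the list of page URLs could be wrapped in `tqdm(...)` to show a progress bar, and the collected rows could be passed to `pandas.DataFrame(rows)` before being exported for PowerBI.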