Neloy-Barman/Daraz-11.11-Top-Selling-Product-Data-Analysis

Daraz 11.11 Top Selling Product Data Analysis

Project Development Journal

Problem Statement

During Daraz's 11.11 sale, discounts were offered on most products. For each category, we visit its sub-category pages, fetch the top-selling product data from the first 3 pages, and prepare a dataset. We then apply data transformations and analyze the data using Tableau.

Data Collection

I collected all the data by web scraping with Selenium. You can find the scraper files and the scraped data in the "scrapers" folder. First, I collected category-wise product data in separate CSV files and then merged them into one file. During this pass I missed the sub-category names and kept their URLs instead, so I collected the names in separate CSV files to replace the URLs. The following table gives an overview of the initially collected data.
| File Name | Dataset Type | File Extension | Rows | Columns |
| --- | --- | --- | --- | --- |
| merged_data.csv | Tabular | csv | 12907 | 14 |
| subcategories.csv | Tabular | csv | 119 | 3 |
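
The merge and the URL-to-name replacement described above can be sketched with pandas. This is a minimal sketch, not the repository's actual code: the column names `sub_category`, `url`, and `name`, the example.com URLs, and the inline sample rows are all assumptions standing in for the real CSV files.

```python
import pandas as pd

# Hypothetical stand-ins for two category-wise CSV files; in practice each
# frame would be loaded with pd.read_csv from the "scrapers" folder.
groceries = pd.DataFrame({
    "product_name": ["Sample onion 1 kg", "Sample potato 1 kg"],
    "sub_category": ["https://example.com/vegetables", "https://example.com/vegetables"],
})
fashion = pd.DataFrame({
    "product_name": ["Sample leather belt"],
    "sub_category": ["https://example.com/accessories"],
})

# Step 1: stack the category-wise frames into one dataset.
merged = pd.concat([groceries, fashion], ignore_index=True)

# Hypothetical sub-category lookup table (URL -> readable name).
subcategories = pd.DataFrame({
    "url": ["https://example.com/vegetables", "https://example.com/accessories"],
    "name": ["Vegetables", "Accessories"],
})

# Step 2: replace each URL with its name; URLs without a match stay unchanged.
url_to_name = subcategories.set_index("url")["name"]
merged["sub_category"] = (
    merged["sub_category"].map(url_to_name).fillna(merged["sub_category"])
)
```

Falling back to the original URL for unmatched rows keeps missing lookups visible instead of silently turning them into empty values.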

Data Cleaning & EDA

I performed the necessary data cleaning, including removing redundant data. Check out the final dataset at Top Selling Product Data. The table below gives an overview of the final dataset.
| File Name | Dataset Type | File Extension | Rows | Columns |
| --- | --- | --- | --- | --- |
| Top_Selling_Product_Data.csv | Tabular | csv | 12907 | 15 |
Some parts of the data analysis on the final dataset were done using pandas. You can find the data analysis notebook in the "notebooks" folder.

Analysis Requirements Blueprint

  • How many sub-categories does each category have?
  • Among all the products, how many are offered with standard delivery and how many with free delivery?
  • How many Flagship and General Stores are there? Which type is more common?
  • Which categories are offered with the highest and the lowest discounts?
  • Which product has the highest discount? What were its original and discounted prices? How far do the discounted prices deviate from the original prices?
  • Counting all types of ratings, which product has the highest number of ratings across all categories and sub-categories?
  • How many unique products does each seller sell? Which seller sells the most products in each category and sub-category?
  • Does a single seller sell products in different categories?
  • Do these 3 factors affect whether a seller is a flagship store?
    • Chat Response Rate(%)
    • Positive Seller Ratings(%)
    • Ship On Time(%)
  • Within each category, which sub-categories are offered with the highest and the lowest discounts?
  • The higher the discount, the more products a seller has to sell to reach a break-even point equal to the original price of 50 products. Is this true?

Dashboard

You can find all the analysis in this Tableau Dashboard.

Analysis and Observations

i. Overall Data Analysis

  • The number of sub-categories per category ranges from 7 to 12.
  • The categories with the most sub-categories are:
    • Mother & Baby
    • TV & Home Appliances
    • Watches, Bags & Jewellery
  • Daraz has more non-flagship stores than flagship stores.
  • The site offered more products with free delivery than with standard delivery.
  • By category, the highest discount was offered in Men's & Boy's Fashion and the lowest in Groceries.
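
The observations above map onto a few pandas aggregations. The following is a minimal sketch under assumptions: the column names (`category`, `sub_category`, `delivery_type`, `discount_pct`) and the sample rows are hypothetical, and using the mean discount to rank categories is one possible choice rather than a confirmed detail of the analysis.

```python
import pandas as pd

# Hypothetical sample; the real data lives in Top_Selling_Product_Data.csv.
df = pd.DataFrame({
    "category": ["Groceries", "Groceries", "Groceries",
                 "Health & Beauty", "Health & Beauty"],
    "sub_category": ["Beverages", "Snacks", "Snacks",
                     "Beauty Tools", "Beauty Tools"],
    "delivery_type": ["Free", "Standard", "Free", "Free", "Standard"],
    "discount_pct": [10.0, 20.0, 30.0, 40.0, 60.0],
})

# Number of distinct sub-categories in each category.
subcats_per_category = df.groupby("category")["sub_category"].nunique()

# Product counts by delivery type.
delivery_counts = df["delivery_type"].value_counts()

# Average discount per category, and the categories at both extremes.
avg_discount = df.groupby("category")["discount_pct"].mean()
most_discounted = avg_discount.idxmax()
least_discounted = avg_discount.idxmin()
```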

ii. Product Data Analysis

  • Top 5 most discounted products within all categories -

    | Product Name | Discount(%) |
    | --- | --- |
    | (11 Taka Deal) Dancing cactus talking cactus Stuffed Plush Toy Electronic toy with song plush cactus potted toy Early Education Toy For kids | 98 |
    | New Style Leather feragamo Belt For Men | 92 |
    | Black Leather Formal Belt For Men | 92 |
    | Men's Pu Leather Wallet High Quality Men Long Wallet Male Business Pu Leather Purse | 91 |
    | Black High quality Leather Long Wallet For Men | 91 |
    | Je-ep Chocolate Artificial Leather Long Wallet for Men | 90 |
    | Grey Regular Fit China Cotton Golf Cap For Men - Cap For Men - Cap - Winter Cap | 90 |
    | Furdani Stylish High quality Artificial Leather wallet for men | 90 |
    | Canvas Wild Polyester Belts For Men - Belt For Men | 90 |
    | Superb Indispensable -Upscale Living -Black Color Cotton DJ Cap for Men- Inventive Choice Remarkable - Disclose Styles & Luxe | 89 |
    | Men Wallets Men Jeep Wallet with Coin Bag Small Money Purses New Design Dollar Slim Purse Money Clip Wallet | 89 |
    | Jeep Long High quality Artificial Leather wallet For Men | 89 |
    | Jeep Chocolate High quality High Capacity Artificial Leather Long Wallet for Man | 89 |
    | Jeep Black Long Artificial Leather Wallet for men | 89 |
    | High quality Artificial Leather belt for men | 89 |
    | 11 Taka Deal Joya Ultra Comfort Wings - 8 Pads Pack - pad | 89 |
  • There were many products where no discount was offered.

  • Top 5 products with the most ratings among all categories:

| Product Name | Number of Ratings |
| --- | --- |
| Diamond potato 1kg ± 25 gm | 11914 |
| S8 Ultra Smart Watch Series 8 Ultra Men Women Bluetooth Call Wireless Charging Fitness Bracelet 1.95/1.44 Inch HD Screen | 8908 |
| Local Onion 1 kg (± 25 gm) - onion | 6280 |
| QKZ DM10 Zinc Alloy HiFi Earphones | 5430 |
| Imported Onion 1 kg | 4868 |
  • Although the categories contain many types of products, potatoes and onions are among the products with the most ratings.
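
The product rankings above can be reproduced along these lines. This is a hedged sketch: the column names (`original_price`, `discount_price`, `num_ratings`) and the sample rows are assumptions, and the discount percentage is computed here from the two prices, which may differ from how the dataset stores it.

```python
import pandas as pd

# Hypothetical sample rows; names and prices are illustrative only.
df = pd.DataFrame({
    "product_name": ["Toy A", "Belt B", "Wallet C", "Rice D"],
    "original_price": [550.0, 1000.0, 800.0, 120.0],
    "discount_price": [11.0, 80.0, 400.0, 120.0],
    "num_ratings": [40, 15, 300, 9000],
})

# Discount as a percentage of the original price.
df["discount_pct"] = (1 - df["discount_price"] / df["original_price"]) * 100

# Most discounted products first.
top_discounted = df.nlargest(3, "discount_pct")[["product_name", "discount_pct"]]

# Products offered with no discount at all.
no_discount = df[df["discount_price"] >= df["original_price"]]

# Products with the most ratings.
top_rated = df.nlargest(2, "num_ratings")[["product_name", "num_ratings"]]
```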

iii. Seller Data Analysis

  • Top 5 sellers selling the most distinct products:

| Seller Name | Product Count |
| --- | --- |
| Daraz Fresh | 114 |
| SWAP | 101 |
| Well-being Distribution Ltd. | 66 |
| FOGG Bangladesh | 62 |
| Unilever | 61 |
  • Many sellers sell products in different categories and in various sub-categories.

  • Whether a seller is a flagship seller is not affected by its average:

    • Chat Response Rate(%)
    • Positive Seller Ratings(%)
    • Ship On Time(%)
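
The seller questions above could be checked with a few group-wise aggregations. This is a minimal sketch under assumptions: the column names (`seller_name`, `store_type`, `ship_on_time_pct`) and the sample rows are hypothetical, and only one of the three seller metrics is shown.

```python
import pandas as pd

# Hypothetical sample: one row per listed product.
df = pd.DataFrame({
    "seller_name": ["S1", "S1", "S1", "S2", "S2", "S3"],
    "product_name": ["a", "b", "b", "c", "d", "e"],
    "category": ["Groceries", "Health & Beauty", "Health & Beauty",
                 "Groceries", "Groceries", "Groceries"],
    "store_type": ["Flagship", "Flagship", "Flagship",
                   "General", "General", "General"],
    "ship_on_time_pct": [95.0, 95.0, 95.0, 96.0, 96.0, 90.0],
})

# Distinct products per seller, largest first.
products_per_seller = (
    df.groupby("seller_name")["product_name"].nunique().sort_values(ascending=False)
)

# Sellers that list products in more than one category.
categories_per_seller = df.groupby("seller_name")["category"].nunique()
multi_category_sellers = categories_per_seller[categories_per_seller > 1].index.tolist()

# Keep one row per seller, then compare the metric across store types.
sellers = df.drop_duplicates("seller_name")
metric_by_store_type = sellers.groupby("store_type")["ship_on_time_pct"].mean()
```

Similar averages for flagship and general stores would be consistent with the observation that these metrics do not separate the two groups, though a comparison of means alone is not a formal test.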

iv. Break-even Point & Sub-categorical Average Discount Analysis

  • The offered discount is related to the number of products that must be sold to break even: the higher the discount, the more products must be sold to reach a break-even point equal to the original price of 50 products.

  • Most discounted sub-category in each category:

| Category | Sub-category | Discount(%) |
| --- | --- | --- |
| Men's & Boy's Fashion | Accessories | 67.52 |
| Watches, Bags, Jewellery | Shop Men's Bags Online in Bangladesh | 61.55 |
| Electronic Device | Mobile Accessories | 51.18 |
| Electronic Accessories | Audio | 49.89 |
| Mother & Baby | Sports & Outdoor Play | 46.18 |
| Sports & Outdoors | Shoes & Clothing | 46.36 |
| Automotive & Motorbike | Interior Accessories | 44.84 |
| TV & Home Appliances | TV & Video Accessories | 41.11 |
| Health & Beauty | Beauty Tools | 38.64 |
| Groceries | Cooking Ingredients | 33.64 |
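
One way to read the break-even observation in this section: if every unit sells at a discount, revenue per unit is the original price times (1 - discount), so matching the revenue of 50 units at the original price requires n = 50 / (1 - discount) units. The sketch below assumes this interpretation; the function name `units_to_break_even` is hypothetical and the formula is not taken from the Tableau workbook.

```python
import math


def units_to_break_even(discount_pct: float, target_units: int = 50) -> int:
    """Units to sell at a discount so revenue matches `target_units`
    sold at the original price.

    Assumes revenue per unit is original_price * (1 - discount), so the
    original price cancels out: n = target_units / (1 - discount).
    """
    if not 0 <= discount_pct < 100:
        raise ValueError("discount_pct must be in [0, 100)")
    # Round up, since a partial unit cannot be sold.
    return math.ceil(target_units * 100 / (100 - discount_pct))
```

Under this reading, the 33.64% average discount listed for Groceries would require 76 units and the 67.52% listed for Men's & Boy's Fashion would require 154, so the required volume grows faster than the discount itself.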

Short Video Demonstration

I prepared a short video demonstration and shared it as a LinkedIn post. Check it out here.

How to Use Scraper Files

  1. Clone the repo
git clone https://github.com/Neloy-Barman/Daraz-11.11-Top-Selling-Product-Data-Analysis.git
  2. Create & activate a virtual environment
virtualenv venv
source venv/bin/activate
  3. Install dependencies
pip install -r requirements.txt
  4. Run the product data scraper
python product_data_scraper.py
  5. Run the subcategories data scraper
python subcategory_scraper.py

Challenges Faced

  • In my case, the scrapers stopped after running for hours. I then had to find the last scraped index and restart from there.
  • While the scraping loop runs, some data may be missed. On inspection, the affected pages contain the same elements as the others, yet their data can still be skipped.
  • Because there are many categories and sub-categories, scraping the data took a long time.
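
The first two challenges could be eased by saving progress after every item and retrying failures. The following is a minimal sketch, not the repository's scraper: `scrape_with_resume` and its `scrape_one` callback are hypothetical names, with `scrape_one` standing in for the Selenium logic that returns the fields for one URL.

```python
import csv
import os


def scrape_with_resume(urls, scrape_one, out_path, max_retries=3):
    """Scrape `urls` in order, appending one CSV row per URL to `out_path`.

    `scrape_one` is a hypothetical callable (url -> list of field values).
    Rows already present in `out_path` are counted, so a restarted run
    continues right after the last saved index.
    """
    done = 0
    if os.path.exists(out_path):
        with open(out_path, newline="", encoding="utf-8") as f:
            done = sum(1 for _ in csv.reader(f))

    failed = []
    with open(out_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for index in range(done, len(urls)):
            row = None
            for _ in range(max_retries):
                try:
                    row = scrape_one(urls[index])
                    break
                except Exception:
                    continue  # retry the same URL
            if row is None:
                failed.append(index)
                row = []  # placeholder keeps the row count aligned with the index
            writer.writerow([index, urls[index], *row])
            f.flush()  # persist progress in case the run stops here
    return failed
```

The returned list of failed indices makes missed items explicit, so they can be scraped again later instead of going unnoticed.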