vaima75/AmExpert

Amex Challenge: AmExpert

AmExpert 2018 was a machine learning hackathon organized by American Express and Analytics Vidhya as an opportunity for participants to showcase their analytical abilities and talent.

Problem Statement

Recent years have witnessed a surge in the number of internet-savvy users. Companies in the financial services domain leverage this huge internet traffic arriving at their interface by strategically placing ads/promotions for cross-selling various financial products on a plethora of web pages. The digital analytics unit of Best Cards Company uses cutting-edge data science and machine learning to promote its valuable card products. They believe that a predictive model that forecasts whether a session involves a click on the ad/promotion would help them extract the maximum value from the huge clickstream data they have collected. You are hired as a consultant to build an efficient model to predict whether a user will click on an ad, given the following data and features:

  • Clickstream data/train data for duration: (2nd July 2017 – 7th July 2017)
  • Test data for duration: (8th July 2017 – 9th July 2017)
  • User features (demographics, user behaviour/activity, buying power etc.)
  • Historical transactional data of the previous month with timestamp info (28th May 2017 – 1st July 2017), recording user views and registered interest
  • Ad features (product category, webpage, campaign for ad etc.)
  • Date time features (exact timestamp of the user session)
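
Because the test window (8th to 9th July) follows the training window (2nd to 7th July), a validation split that also respects time order is one reasonable way to mimic the competition setup. The sketch below is only an illustration: the function name `time_based_split`, the timestamp format, the cutoff date, and the sample rows are all assumptions, not part of the original data description.

```python
from datetime import datetime

# Assumed timestamp layout; the source does not state the actual format.
TS_FORMAT = "%Y-%m-%d %H:%M"


def time_based_split(rows, cutoff, ts_key="DateTime", ts_format=TS_FORMAT):
    """Split rows into (fit, validation) lists by timestamp.

    Rows strictly before `cutoff` go to the fit set; all others go to validation.
    """
    fit, valid = [], []
    for row in rows:
        ts = datetime.strptime(row[ts_key], ts_format)
        (fit if ts < cutoff else valid).append(row)
    return fit, valid


# Hypothetical sessions inside the 2nd-7th July 2017 training window.
sessions = [
    {"session_id": "1", "DateTime": "2017-07-02 00:00", "is_click": "0"},
    {"session_id": "2", "DateTime": "2017-07-05 13:45", "is_click": "1"},
    {"session_id": "3", "DateTime": "2017-07-06 23:59", "is_click": "0"},
    {"session_id": "4", "DateTime": "2017-07-07 00:00", "is_click": "0"},
    {"session_id": "5", "DateTime": "2017-07-07 18:30", "is_click": "1"},
]

# Hold out the last training day (7th July) as a validation set.
fit_rows, valid_rows = time_based_split(sessions, cutoff=datetime(2017, 7, 7))
print(len(fit_rows), len(valid_rows))  # 3 2
```

Holding out the final day is just one choice; a longer holdout trades training data for a more stable validation estimate.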

Data

Train

| Variable | Definition |
| --- | --- |
| session_id | Unique ID for a session |
| DateTime | Timestamp |
| user_id | Unique ID for user |
| product | Product ID |
| campaign_id | Unique ID for ad campaign |
| webpage_id | Webpage ID at which the ad is displayed |
| product_category_1 | Product category 1 (Ordered) |
| product_category_2 | Product category 2 |
| user_group_id | Customer segmentation ID |
| gender | Gender of the user |
| age_level | Age level of the user |
| user_depth | Interaction level of user with the web platform (1 - low, 2 - medium, 3 - high) |
| city_development_index | Scaled development index of the residence city |
| var_1 | Anonymised session feature |
| is_click | 0 - no click, 1 - click |
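
As a rough illustration of how the train variables listed above might be read and turned into the date-time features mentioned in the problem statement, here is a minimal standard-library sketch. The CSV layout, the timestamp format, the treatment of empty fields as missing, the helper name `load_sessions`, and every sample value are assumptions made for the example.

```python
import csv
import io
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M"  # assumed; not stated in the source

# Hypothetical sample rows using the column names from the train description.
SAMPLE_CSV = """session_id,DateTime,user_id,product,campaign_id,webpage_id,product_category_1,product_category_2,user_group_id,gender,age_level,user_depth,city_development_index,var_1,is_click
101,2017-07-02 00:00,u1,C,c1,w1,4,,10,Female,4,3,3,0,0
102,2017-07-02 09:30,u2,H,c2,w2,1,,3,Male,3,3,,1,1
"""


def load_sessions(text):
    """Parse CSV text into dicts, adding `hour` and `weekday` features."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        # Treat empty fields as missing values (an assumption about the data).
        row = {key: (value if value != "" else None) for key, value in raw.items()}
        ts = datetime.strptime(row["DateTime"], TS_FORMAT)
        row["DateTime"] = ts
        row["hour"] = ts.hour
        row["weekday"] = ts.weekday()  # Monday is 0, Sunday is 6
        row["is_click"] = int(row["is_click"])
        rows.append(row)
    return rows


sessions = load_sessions(SAMPLE_CSV)
click_rate = sum(row["is_click"] for row in sessions) / len(sessions)
print(sessions[0]["weekday"], sessions[1]["hour"], click_rate)  # 6 9 0.5
```

The identifier columns are left as strings here on purpose, since the description does not say whether they are numeric.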

Historical User Logs

| Variable | Definition |
| --- | --- |
| DateTime | Timestamp |
| user_id | Unique ID for user |
| product | Product ID |
| Action | view/interest (view - viewed the product page, interest - registered interest for the product) |
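
One possible use of the historical logs is to count, per user and product, how often each action occurred, and attach those counts to a session as features. Since the logs end on 1st July and the training sessions start on 2nd July, such counts only use information from before each session. The sketch below assumes the logs are available as dicts keyed by the variable names above; the function names, feature names, and sample rows are hypothetical.

```python
from collections import Counter


def action_counts(log_rows):
    """Count log entries per (user_id, product, Action) triple."""
    counts = Counter()
    for row in log_rows:
        counts[(row["user_id"], row["product"], row["Action"])] += 1
    return counts


def history_features(counts, user_id, product):
    """Look up view/interest counts for one user-product pair (0 if unseen)."""
    return {
        "hist_views": counts[(user_id, product, "view")],
        "hist_interest": counts[(user_id, product, "interest")],
    }


# Hypothetical log rows from the 28th May - 1st July 2017 window.
logs = [
    {"DateTime": "2017-05-28 10:00", "user_id": "u1", "product": "A", "Action": "view"},
    {"DateTime": "2017-06-03 18:20", "user_id": "u1", "product": "A", "Action": "view"},
    {"DateTime": "2017-06-03 18:25", "user_id": "u1", "product": "A", "Action": "interest"},
    {"DateTime": "2017-06-30 08:05", "user_id": "u2", "product": "B", "Action": "view"},
]

counts = action_counts(logs)
print(history_features(counts, "u1", "A"))  # {'hist_views': 2, 'hist_interest': 1}
print(history_features(counts, "u9", "A"))  # {'hist_views': 0, 'hist_interest': 0}
```

A `Counter` returns 0 for unseen keys, so users with no history get zero-valued features without special handling.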

Evaluation Metric

The evaluation metric for this competition is the AUC-ROC score (area under the receiver operating characteristic curve).
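
AUC-ROC can be read as the probability that a randomly chosen clicked session receives a higher predicted score than a randomly chosen non-clicked one, with ties counted as half. The sketch below computes it with the rank-sum (Mann-Whitney U) formulation. It is meant to illustrate the metric, not to reproduce the competition's scoring code; the function name `roc_auc` and the sample inputs are assumptions.

```python
def roc_auc(labels, scores):
    """AUC-ROC via the rank-sum formulation, using average ranks for ties.

    `labels` holds 0/1 values (is_click); `scores` holds predicted scores.
    """
    n = len(labels)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    # Sort indices by score, then assign 1-based ranks, averaging within ties.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic for the positive class, normalised by the number of pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(roc_auc([0, 1], [0.5, 0.5]))  # 0.5
```

Because the metric depends only on how sessions are ranked, any monotonic rescaling of the predicted scores leaves it unchanged.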