chllrisll/Amazon_Reviews_Analysis

Amazon_Vine_Analysis

Pick one of the 50 Amazon review datasets. Use PySpark to perform the ETL process, connect to an AWS RDS instance, and load the transformed data into the database, which is managed through pgAdmin.

Results

- How many Vine reviews and non-Vine reviews were there?

Vine_ReviewsvsNonVine

- How many Vine reviews were 5 stars? How many non-Vine reviews were 5 stars?

5Star_Review-paid-vs-nonpaid

- What percentage of Vine reviews were 5 stars? What percentage of non-Vine reviews were 5 stars?

Perc_5Star_Rev
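
The three questions above come down to a grouped count and a ratio. The following is a minimal sketch in plain Python rather than the PySpark code used in this repository. It assumes each review carries a `vine` flag ("Y" or "N") and an integer `star_rating`; those field names, the helper `five_star_summary`, and the sample rows are illustrative assumptions, not the project's actual data or results.

```python
from collections import Counter


def five_star_summary(reviews):
    """Count total and 5-star reviews per Vine status and compute the 5-star share."""
    totals = Counter()
    five_star = Counter()
    for review in reviews:
        # Assumed convention: "Y" marks a paid (Vine) review, anything else is non-Vine.
        group = "vine" if review["vine"] == "Y" else "non_vine"
        totals[group] += 1
        if review["star_rating"] == 5:
            five_star[group] += 1

    summary = {}
    for group in ("vine", "non_vine"):
        total = totals[group]
        # Guard against an empty group to avoid dividing by zero.
        pct = 100.0 * five_star[group] / total if total else 0.0
        summary[group] = {
            "total": total,
            "five_star": five_star[group],
            "pct_five_star": round(pct, 1),
        }
    return summary


# Made-up rows for illustration only.
sample_reviews = [
    {"vine": "Y", "star_rating": 5},
    {"vine": "Y", "star_rating": 3},
    {"vine": "N", "star_rating": 5},
    {"vine": "N", "star_rating": 5},
    {"vine": "N", "star_rating": 1},
    {"vine": "N", "star_rating": 4},
]

summary = five_star_summary(sample_reviews)
print(summary)
```

In PySpark the same idea would typically be expressed as a group-by on the Vine flag followed by an aggregation, but the arithmetic is identical.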

Summary

1.) The vast majority of reviews (99.9%) are non-Vine reviews.

2a.) Percentage of 5-star app reviews among paid (Vine) reviews in the 'helpful' dataset: 25%

2b.) Percentage of 5-star app reviews among non-paid (non-Vine) reviews in the 'helpful' dataset: 49%

3.) Additionally, we recommend testing a larger sample, because the 'helpful' parameters reduced the review count considerably.
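
To illustrate why a helpfulness filter can shrink the sample so sharply, here is a small plain-Python sketch. The thresholds (`min_total_votes=20`, `min_helpful_ratio=0.5`), the field names `total_votes` and `helpful_votes`, and the helper `filter_helpful` are hypothetical choices made for this example; the source does not state the exact 'helpful' parameters that were used.

```python
def filter_helpful(reviews, min_total_votes=20, min_helpful_ratio=0.5):
    """Keep reviews with enough votes and a high enough share of helpful votes.

    The default thresholds are hypothetical, chosen only for illustration.
    """
    kept = []
    for review in reviews:
        total = review["total_votes"]
        # Skip reviews with no votes so the ratio below is always defined.
        if total <= 0 or total < min_total_votes:
            continue
        if review["helpful_votes"] / total >= min_helpful_ratio:
            kept.append(review)
    return kept


# Made-up rows for illustration only.
sample_reviews = [
    {"total_votes": 30, "helpful_votes": 25},  # kept: enough votes, mostly helpful
    {"total_votes": 25, "helpful_votes": 5},   # dropped: helpful ratio too low
    {"total_votes": 3, "helpful_votes": 3},    # dropped: too few votes
    {"total_votes": 0, "helpful_votes": 0},    # dropped: no votes at all
]

helpful_reviews = filter_helpful(sample_reviews)
retained_pct = 100.0 * len(helpful_reviews) / len(sample_reviews)
print(len(helpful_reviews), retained_pct)
```

Because most reviews receive few or no votes, a filter of this kind tends to discard the bulk of the data. Rerunning the analysis with looser thresholds, or on a larger dataset, is one way to check whether the 5-star percentages hold up.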