Project 4: A/B Testing & Statistics on Olist Data
Context and Objectives
This one-day project explored a subset of the public Olist dataset (Brazilian e-commerce) to formulate business hypotheses, extract descriptive insights, and test those hypotheses with statistical and A/B testing methods. The analysis was done in Python (pandas, NumPy, Plotly Express), and the results were delivered in an oral presentation.
Data Used
- Olist dataset (reduced version: orders, customers, products, payments, reviews)
- Data processed in a DataFrame after exploration and cleaning
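The cleaning step (data types, missing values, duplicates) can be illustrated with a minimal sketch. The notebook's actual code is not reproduced here: the tiny table, the column names (`order_id`, `price`, `purchase_date`, `delivered_date`, `review_score`) and the helper `clean_orders` are all hypothetical and only stand in for the merged Olist tables.

```python
import pandas as pd

# Hypothetical stand-in for a merged Olist table; values and column names are invented.
raw = pd.DataFrame({
    "order_id": ["a1", "a2", "a2", "a3", "a4"],
    "price": ["59.90", "120.00", "120.00", None, "35.50"],
    "purchase_date": ["2018-01-03", "2018-01-05", "2018-01-05", "2018-01-09", "2018-01-10"],
    "delivered_date": ["2018-01-10", "2018-01-20", "2018-01-20", "2018-01-15", None],
    "review_score": [5, 2, 2, 4, 3],
})


def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop duplicates, fix dtypes, handle missing prices."""
    out = df.drop_duplicates().copy()
    # Prices arrive as strings; unparseable or missing values become NaN.
    out["price"] = pd.to_numeric(out["price"], errors="coerce")
    # Parse dates; missing values become NaT.
    for col in ("purchase_date", "delivered_date"):
        out[col] = pd.to_datetime(out[col], errors="coerce")
    # Rows without a price cannot be used in price statistics.
    out = out.dropna(subset=["price"])
    # Delivery time in days; stays NaN for orders with no delivery date.
    out["delivery_days"] = (out["delivered_date"] - out["purchase_date"]).dt.days
    return out.reset_index(drop=True)


orders = clean_orders(raw)
print(raw.isna().sum())   # missing values per column before cleaning
print(orders.dtypes)      # column types after cleaning
```

Whether rows with a missing price should be dropped or imputed depends on the analysis; dropping them here is just the simplest choice for the sketch.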
Methodology
- Exploration & cleaning: data types, missing values, duplicates
- Descriptive statistics:
  - Average price / standard deviation / box plots (orders)
  - Distribution by product category, payment method, customer satisfaction
  - Analysis of delivery times (mean, median, std deviation)
- Exploratory correlations:
  - Price vs satisfaction
  - Delivery time vs satisfaction
  - Product category vs rating
- A/B Testing (examples):
  - On-time delivery vs average satisfaction (t-test)
  - Payment method vs order completion rate (chi²)
  - Product categories vs order value (ANOVA)
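The three tests listed above can be sketched with `scipy.stats`. This is an illustration on synthetic data, not the notebook's code. The column names (`on_time`, `payment_type`, `category`, `review_score`, `completed`, `order_value`) are assumptions, and the choice of Welch's variant of the t-test (`equal_var=False`) is also an assumption, since the project only says "t-test".

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(42)
n = 600

# Synthetic data with invented column names; it does NOT reproduce Olist values.
df = pd.DataFrame({
    "on_time": rng.random(n) < 0.8,
    "payment_type": rng.choice(["credit_card", "boleto", "voucher"], size=n),
    "category": rng.choice(["furniture", "toys", "electronics"], size=n),
})
# A gap between on-time and late orders is built in on purpose for the demo.
base = np.where(df["on_time"], 4.2, 2.8)
df["review_score"] = np.clip(np.round(base + rng.normal(0, 1, n)), 1, 5)
df["completed"] = rng.random(n) < 0.95              # independent of payment type
df["order_value"] = rng.gamma(shape=2.0, scale=50.0, size=n)

# 1) t-test: mean review score, on-time vs late deliveries (Welch's variant).
on_time_scores = df.loc[df["on_time"], "review_score"]
late_scores = df.loc[~df["on_time"], "review_score"]
t_stat, p_t = stats.ttest_ind(on_time_scores, late_scores, equal_var=False)

# 2) Chi-square test of independence: payment method vs order completion.
contingency = pd.crosstab(df["payment_type"], df["completed"])
chi2, p_chi2, dof, expected = stats.chi2_contingency(contingency)

# 3) One-way ANOVA: order value across product categories.
groups = [g["order_value"].to_numpy() for _, g in df.groupby("category")]
f_stat, p_anova = stats.f_oneway(*groups)

alpha = 0.05
for name, p in [("t-test", p_t), ("chi-square", p_chi2), ("ANOVA", p_anova)]:
    print(f"{name}: p = {p:.4f}, reject H0 at alpha={alpha}: {p < alpha}")
```

Because the synthetic scores are generated with a built-in gap between on-time and late orders, the t-test outcome here says nothing about the real dataset. The sketch only shows how each test maps onto a pair of columns.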
Results
- Delivery time is significantly associated with customer satisfaction (p-value < 0.05)
- The average rating given by customers differs across some product categories
- No significant difference in order completion rate was found between payment methods
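As a complement to the first result, the link between delivery time and satisfaction could also be checked as a rank correlation. This is a hedged sketch on synthetic data. The variable names and the use of Spearman's coefficient are assumptions, chosen because review scores are ordinal (1 to 5), and it is not necessarily what the notebook did.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic delivery times (days) and review scores; values are invented.
delivery_days = rng.integers(2, 40, size=300)
noise = rng.normal(0, 1, size=300)
# Scores are generated to fall as delivery takes longer, for demonstration only.
review_score = np.clip(np.round(5 - 0.08 * delivery_days + noise), 1, 5)

# Spearman's rank correlation works on ranks, so it suits ordinal scores.
rho, p_value = stats.spearmanr(delivery_days, review_score)

alpha = 0.05
significant = p_value < alpha
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3g}, significant at {alpha}: {significant}")
```

The sign and size of `rho` give the direction and strength of the association, which a p-value alone does not convey.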
Tools
Python (pandas, numpy, plotly, scipy.stats), Jupyter
Presentation
Summary slides with visualizations and an explanation of the tests
View the notebook on Google Colab