
VIEW PROJECTS


SQL/MACHINE LEARNING

Paystack Payment Analytics - B2B Fintech Product Analytics

Project Background

Project Overview:

This project analyzes payment processing data modeled after the Nigerian fintech ecosystem (Paystack, Flutterwave) to understand merchant behavior, optimize activation funnels, predict churn, and detect fraud. By combining SQL-based product analytics with machine learning models, I built a comprehensive analytical framework that provides actionable insights for improving platform health and revenue growth.

Business Context: Payment processing platforms like Paystack earn revenue through transaction fees (typically 1.5-2% per successful payment). Their business success depends on three critical factors:

  1. Activation: Getting merchants to complete setup and process their first live transaction

  2. Retention: Keeping merchants actively processing payments month-over-month

  3. Fraud Prevention: Blocking fraudulent transactions that result in chargebacks and revenue loss

The Challenge: With thousands of merchants spanning different business types (SME, Enterprise, Individual), payment methods (Card, Bank Transfer, USSD, Mobile Money), and engagement levels, how do you identify which merchants to prioritize for retention efforts? How do you detect fraud patterns in real-time? How do you optimize the activation funnel?

The Solution: Built an end-to-end analytics pipeline covering 20 SQL queries (activation, engagement, retention, churn, revenue, payment analysis) and 3 machine learning models (churn prediction, fraud detection, merchant segmentation) to transform raw payment data into strategic business recommendations.

Objectives:

Primary Goal

Develop data-driven strategies to improve merchant activation rates, reduce churn, and prevent fraud in a B2B fintech payment platform.

Specific Goals

1. Activation Analysis

  • Calculate merchant activation rate and identify drop-off points in onboarding funnel

  • Determine average time-to-activate and factors that accelerate/slow activation

  • Compare activation performance across merchant segments (SME vs Enterprise)
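
As a minimal sketch of these activation metrics — on a made-up merchant table whose column names (`signup_date`, `first_txn_date`) are illustrative, not the project's actual schema — activation rate and time-to-activate reduce to a few pandas operations:

```python
import pandas as pd

# Hypothetical merchant table (columns are illustrative stand-ins)
merchants = pd.DataFrame({
    "merchant_id": [1, 2, 3, 4],
    "segment": ["SME", "SME", "Enterprise", "Individual"],
    "signup_date": pd.to_datetime(["2024-01-01", "2024-01-05", "2024-01-03", "2024-01-10"]),
    "first_txn_date": pd.to_datetime(["2024-01-04", pd.NaT, "2024-01-08", "2024-01-11"]),
})

# A merchant counts as "activated" once it processes a first live transaction
merchants["activated"] = merchants["first_txn_date"].notna()
activation_rate = merchants["activated"].mean()

# Time-to-activate in days, over activated merchants only
merchants["days_to_activate"] = (
    merchants["first_txn_date"] - merchants["signup_date"]
).dt.days
avg_days = merchants.loc[merchants["activated"], "days_to_activate"].mean()

# Segment comparison (SME vs Enterprise vs Individual)
by_segment = merchants.groupby("segment")["activated"].mean()
print(activation_rate, avg_days)
```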

2. Engagement & Retention Analysis

  • Track monthly active merchants (MAM) and growth trends

  • Segment merchants by engagement levels (Low, Medium, High)

  • Calculate M1, M3 retention rates and build cohort retention tables

  • Identify D7, D14, D30, D60, D90 retention benchmarks
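
A cohort retention table of this kind can be sketched in pandas; the activity log below is a toy stand-in for the project's transactions table, with one row per merchant per active month:

```python
import pandas as pd

# Toy activity log (illustrative, not the real schema)
activity = pd.DataFrame({
    "merchant_id": [1, 1, 1, 2, 2, 3],
    "month": pd.PeriodIndex(
        ["2024-01", "2024-02", "2024-03", "2024-01", "2024-03", "2024-02"], freq="M"
    ),
})

# Cohort = month of first activity; age = months since the cohort month
first = activity.groupby("merchant_id")["month"].min().rename("cohort")
activity = activity.join(first, on="merchant_id")
activity["age"] = (activity["month"] - activity["cohort"]).apply(lambda d: d.n)

# Retention table: share of each cohort still active at age 0, 1, 2, ...
cohort_sizes = activity[activity["age"] == 0].groupby("cohort")["merchant_id"].nunique()
active = activity.groupby(["cohort", "age"])["merchant_id"].nunique()
retention = active.unstack("age").div(cohort_sizes, axis=0)
print(retention)
```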

3. Churn Analysis

  • Calculate monthly churn rate and identify churn drivers

  • Compare logo churn vs revenue churn (merchant count vs revenue impact)

  • Build predictive model to flag high-risk merchants before they churn

4. Revenue Analysis

  • Track Monthly Recurring Revenue (MRR) growth

  • Calculate Net Revenue Retention (NRR) to measure expansion from existing merchants

  • Analyze revenue distribution by merchant segment

  • Calculate Cohort Lifetime Value (LTV)
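
Net Revenue Retention reduces to simple arithmetic: revenue collected this month from merchants who were already paying last month, divided by last month's revenue from that same group. A toy example with made-up figures:

```python
# MRR per merchant; figures are invented for illustration
last_month = {"m1": 1000.0, "m2": 500.0, "m3": 250.0}
this_month = {"m1": 1200.0, "m2": 400.0}   # m3 churned, m1 expanded

base = sum(last_month.values())
retained = sum(this_month.get(m, 0.0) for m in last_month)
nrr = retained / base   # > 1.0 means expansion outweighs churn + contraction
print(f"NRR = {nrr:.1%}")
```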

5. Payment Performance Analysis

  • Compare success rates across payment methods (Card, Bank, USSD, Mobile Money)

  • Identify failure reasons and their business impact

  • Analyze payment method preferences by merchant segment

Tools and Technologies:

Data Generation

Because real Paystack data is confidential, I generated a realistic synthetic dataset using Python:

  • 800 merchants

  • 23,857 transactions

  • Payment methods: Card, Bank Transfer, USSD, Mobile Money

  • Time-based activity patterns

  • Fraud-like anomalies embedded intentionally

Machine Learning

  • scikit-learn: Random Forest (churn), Isolation Forest (fraud), K-Means (segmentation)

  • pandas, numpy: Data manipulation and feature engineering

  • matplotlib, seaborn: Model performance visualization
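
As an illustration of the fraud-detection model, here is a minimal Isolation Forest sketch on synthetic transactions with a few injected anomalies; the features and the `contamination` value are assumptions for the sketch, not the project's tuned settings:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic features: transaction amount and hour of day,
# plus a few injected anomalies (huge amounts at odd hours)
normal = np.column_stack([
    rng.normal(50, 15, 500),   # typical transaction amounts
    rng.normal(14, 3, 500),    # daytime activity
])
anomalies = np.array([[900.0, 3.0], [1200.0, 2.0], [800.0, 4.0]])
X = np.vstack([normal, anomalies])

# contamination is a prior guess at the fraud share — a tunable assumption
model = IsolationForest(contamination=0.01, random_state=42).fit(X)
labels = model.predict(X)            # -1 = flagged as anomalous
flagged = np.where(labels == -1)[0]
print(len(flagged))
```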

Key Insights:

  • Churn is driven by inactivity, not merchant size

  • Revenue is highly concentrated among power merchants

  • High logo churn does not necessarily imply business failure

  • Fraud follows predictable behavioral patterns

  • Retention strategies must differ by merchant segment


SQL/MACHINE LEARNING/POWER BI

Open Food Facts Sales Analytics

Project Background

Project Overview:

This project demonstrates a complete data science workflow for an e-commerce business, tackling real-world challenges in customer retention, revenue optimization, and personalized marketing. Using a dataset of 3,900 customers, 10,002 products, and 855 transactions spanning one year, I built an end-to-end analytics solution featuring SQL database queries, three machine learning models, and interactive Power BI dashboards.

The Business Problem

E-commerce companies face three critical challenges:

  1. Customer Churn - 27% of customers are at risk of leaving, resulting in significant revenue loss

  2. Ineffective Marketing - Generic campaigns fail to engage different customer segments

  3. Missed Cross-Selling Opportunities - Products are recommended without understanding customer preferences

My Solution

I created a comprehensive analytics system that:

  •  Predicts customer churn with 83% accuracy, enabling proactive retention campaigns

  •  Segments customers into 5 distinct groups for targeted marketing

  •  Recommends products based on similarity algorithms to increase cross-selling

  •  Analyzes 20+ business metrics through complex SQL queries across multiple tables

  •  Delivers insights through interactive dashboards for stakeholder decision-making
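
The content-based recommendation step can be sketched with scikit-learn's cosine similarity on a toy product-feature matrix; the products and features below are invented for illustration (in the project they would come from encoded product attributes):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy feature matrix (rows = products, cols = encoded attributes)
products = ["granola", "muesli", "cola", "oat bar"]
features = np.array([
    [1, 1, 0, 0],   # granola: cereal, breakfast
    [1, 1, 0, 0],   # muesli:  cereal, breakfast
    [0, 0, 1, 1],   # cola:    drink, sugary
    [1, 0, 0, 1],   # oat bar: cereal, sugary
], dtype=float)

sim = cosine_similarity(features)      # pairwise similarity matrix

def recommend(idx: int, k: int = 2) -> list[str]:
    """Return the k most similar products, excluding the product itself."""
    order = np.argsort(sim[idx])[::-1]  # most similar first
    return [products[j] for j in order if j != idx][:k]

print(recommend(0))
```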

Objectives:

Primary Goals

1. Build Production-Ready SQL Database

  • Design relational database structure with multiple tables

  • Write complex queries using JOINs, CTEs, and window functions

  • Demonstrate real-world database querying skills beyond CSV analysis

2. Develop Predictive Machine Learning Models

  • Customer Segmentation - Identify distinct customer groups using unsupervised learning

  • Churn Prediction - Forecast which customers will leave using classification

  • Product Recommendation - Suggest relevant products using content-based filtering

3. Create Actionable Business Insights

  • Translate model outputs into clear business recommendations

  • Calculate ROI for retention campaigns

  • Prioritize high-value customers for marketing spend

4. Visualize Complex Data Effectively

  • Build interactive Power BI dashboards with drill-down capabilities

  • Present model performance metrics (confusion matrix, feature importance)

  • Enable stakeholders to make data-driven decisions

Technical Learning Objectives:

  • Master SQL for multi-table analysis (not just single CSV files)

  • Implement supervised learning (classification) and unsupervised learning (clustering)

  • Evaluate models properly (train/test split, ROC-AUC, confusion matrix)

  • Build end-to-end pipeline: Database → Analysis → Modeling → Visualization
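
The evaluation workflow named above (train/test split, ROC-AUC, confusion matrix) looks roughly like this; the data is a synthetic stand-in from `make_classification`, not the project's engineered customer features:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Stand-in data for the sketch
X, y = make_classification(n_samples=1000, n_features=8, random_state=42)

# Hold out 30% for unbiased evaluation (the split used in the project)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
cm = confusion_matrix(y_test, clf.predict(X_test))
print(f"ROC-AUC: {auc:.3f}")
print(cm)
```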

Tools and Technologies:

  • Programming & Data Analysis

  • Python 3.8: Data manipulation, ML modeling

  • Pandas: Cleaning, transformation, feature engineering

  • NumPy: Array operations, mathematical functions

  • Jupyter Notebook: Exploratory analysis, model experimentation

  • Machine Learning & Statistics

  • Scikit-learn: Random Forest, K-Means, preprocessing

  • OneHotEncoder: Categorical variable transformation

  • Train-Test Split: Model validation (70-30 split for unbiased evaluation)

  • Cosine Similarity: Product similarity calculations for content-based recommendations

  • Specific Algorithms:

  • K-Means Clustering - Unsupervised segmentation (5 customer groups)

  • Random Forest Classifier - Churn prediction (100 trees, max_depth=10)

  • Content-Based Filtering - Product recommendations (similarity matrix)

  • SQL Techniques Applied:

  • INNER/LEFT/RIGHT JOINs - Multi-table relationships

  •  Window Functions - RANK(), ROW_NUMBER(), cumulative calculations

  •  Subqueries - Nested SELECT statements

  •  Date/Time Functions - YEAR(), MONTH(), FORMAT()

  •  Aggregate Functions - SUM(), AVG(), COUNT(), GROUP BY, HAVING

  • Data Visualization

  • Power BI Desktop Interactive Dashboard

  • Matplotlib

  • Seaborn

  • Power BI Features:

  • DAX measures for calculated metrics

  • Drill-through functionality

  • Conditional formatting

  • Custom tooltips

  • Matrix visuals with color scales


SQL/MACHINE LEARNING/VISUALIZATION

League of Legends Player Retention Analytics

Project Background

Project Overview:

This project analyzes player engagement and retention patterns in League of Legends using real-time data from the Riot Games API. By combining SQL analytics, machine learning (K-Means clustering and Logistic Regression), and interactive Power BI dashboards, I identified the key drivers of player churn and built predictive models to flag at-risk players before they leave.

The Core Problem: Out of 12,760 new players, only 378 (2.96%) returned for a second match—a staggering 97% early drop-off rate that represents massive lost revenue and community growth potential.

The Solution: A comprehensive analytics pipeline that segments players by behavior, predicts 14-day retention with 84% accuracy, and provides actionable recommendations projected to reduce churn by 30-40%.

Project Duration: 4 weeks
Data Volume: 100,000+ raw match records collected, 12,760+ players, 1,944 matches analyzed
Key Deliverables: SQL database, 2 ML models, interactive Power BI dashboard, business recommendations

Objectives:

Primary Objective

Identify why players churn early and develop data-driven strategies to improve retention rates.

Specific Goals

  1. Understand Player Behavior: Analyze engagement patterns, match frequency, and performance metrics

  2. Segment Players: Use machine learning to identify distinct player types with different retention profiles

  3. Predict Churn: Build a model to identify at-risk players before they quit (D14 retention prediction)

  4. Quantify Impact: Calculate the business value of retention improvements

  5. Provide Recommendations: Deliver specific, actionable product changes based on data insights


Tools and Technologies Used:

Data Collection

  • Riot Games API: Real-time match data, player profiles, in-game events

  • Python (Requests library): API integration with rate limiting and error handling

Data Storage & Processing

  • SQL Server: Relational database design for players, matches, events

  • SQL: Complex queries for cohort analysis, retention calculations, funnel metrics

  • Python (pandas, numpy): Data cleaning, transformation, feature engineering

Machine Learning

  • scikit-learn: K-Means clustering (player segmentation), Logistic Regression (churn prediction)

  • Python: Model training, evaluation, hyperparameter tuning

  • Jupyter Notebooks: Exploratory data analysis and model development
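
The K-Means segmentation step can be sketched as follows — on synthetic player features, not Riot data. Scaling first matters because match counts and win rates live on very different numeric scales:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic per-player features: matches played and win rate,
# with two deliberately separated behavior groups as stand-ins
casual = np.column_stack([rng.normal(3, 1, 200), rng.normal(0.45, 0.05, 200)])
grinders = np.column_stack([rng.normal(40, 8, 200), rng.normal(0.52, 0.05, 200)])
X = np.vstack([casual, grinders])

# Standardize so neither feature dominates the distance metric
X_scaled = StandardScaler().fit_transform(X)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)

labels = km.labels_
# Each synthetic group should land almost entirely in one cluster
print(np.bincount(labels))
```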

Visualization & Reporting

  • Power BI: Interactive dashboards with DAX measures and Power Query transformations

  • DAX: Custom retention metrics, time intelligence calculations

  • Power Query (M language): Advanced data transformations for player-level aggregations

Methodology

Phase 1: Data Collection & Database Design

1. Registered for Riot Games API access and obtained development key

2. Built Python scripts to extract match history, player profiles, and in-game events

3. Designed normalized SQL database schema with relationships:

    • players table (demographics, join dates)

    • matches table (match metadata, duration, game mode)

    • match_participants table (player performance per match)

    • events table (in-game actions: kills, deaths, objectives)

4. Implemented automated daily data refresh pipeline

Phase 2: Exploratory Analysis

  1. Cohort Retention Analysis: Calculated D1, D7, D14, D30 retention rates by signup week

  2. Engagement Funnel: Mapped player progression from first match → second match → activation (3+ matches)

  3. Performance Impact: Analyzed correlation between kills, win rate, match duration, and retention

  4. Segment Hypothesis: Identified patterns suggesting different player types exist
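
The engagement funnel in step 2 can be sketched in pandas. The match log below is illustrative; in the project, the per-player match sequence number would come from ranking match timestamps in SQL:

```python
import pandas as pd

# Toy match log: one row per (player, match)
matches = pd.DataFrame({
    "player_id": [1, 1, 1, 2, 3, 3, 4],
    "match_no":  [1, 2, 3, 1, 1, 2, 1],   # per-player sequence number
})

counts = matches.groupby("player_id")["match_no"].max()
funnel = {
    "first_match": int((counts >= 1).sum()),
    "second_match": int((counts >= 2).sum()),
    "activated_3plus": int((counts >= 3).sum()),
}
# Conversion between consecutive funnel stages
second_rate = funnel["second_match"] / funnel["first_match"]
print(funnel, f"{second_rate:.0%}")
```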


VISUALIZATION WITH POWER BI

Fintech Project Management Analysis

Project Background

Project Overview:

This project was built for FP20 Analytics Live Challenge 32. It analyzes real-world fintech project management operations using a multi-table dataset covering projects, tasks, employees, milestones, and budgets. The aim was to understand how digital payment solutions are developed across teams and to uncover patterns in delays, resource use, and cost efficiency. The final Power BI report highlights key delivery metrics, performance gaps, and financial insights needed to improve on-time delivery and optimize workforce utilization.

Objectives:

  • Evaluate project performance: timeline accuracy, delivery progress, and completion rates.

  • Measure cost performance against budget and highlight overspending or underspending risks.

  • Track task-level efficiency to identify blockers such as long review cycles or tasks on hold.

  • Compare departments, teams, and cities based on consistency and delivery reliability.

  • Analyze workforce impact using experience level, hourly rates, and actual hours logged.

  • Support decision-making through interactive drill-through analysis and department-level insights.

Key Insights:

  • 48% of budget allocated remained unused, signaling potential planning inefficiencies.

  • Projects with low completion rates shared common blockers like “On Hold” or “Review Required.”

  • High-experience employees consistently delivered faster with fewer hours variance.

  • Certain departments completed tasks with better cost efficiency, indicating better workload management.

  • Projects with early milestone delays had a much higher likelihood of overspending later.

  • Cities with stable workforce distribution demonstrated more predictable delivery patterns.

Tools and Techniques:

  • Power BI Desktop: Data modeling, DAX measures, interactive dashboard creation

  • Power Query: Type corrections, locale date fixes, merging & cleaning

  • DAX: KPIs, variance metrics, resource cost calculations, experience-efficiency modeling

  • UI/UX principles: sidebar navigation, section layout, KPI placement, consistent typography

  • Drill-through analytics: Employee-level and project-level investigation

Deliverables:

  • 3-page interactive Power BI dashboard

  • Clean PDF export for portfolio

  • Insight summary for stakeholders


MACHINE LEARNING/POWER BI

Marketing Campaign Success Predictor

Project Background

Project Overview:

Built an end-to-end machine learning system that predicts whether digital marketing campaigns will achieve high ROI (>1000%) before launch, enabling data-driven budget allocation decisions and preventing wasted ad spend.

 

The Challenge

Marketing teams struggle to predict campaign success before investing significant budgets. Most decisions are based on intuition rather than data, resulting in:
- £90,000+ annual waste on underperforming campaigns
- Missed opportunities on high-potential campaigns
- Lack of actionable insights on what drives success

 

The Solution

Developed a machine learning classification system that:
- Analyzes 9,900 historical campaigns across Facebook, Instagram, and Pinterest
- Predicts campaign success probability with 79% accuracy on unseen data
- Identifies the key factors driving campaign performance
- Provides interactive dashboards for decision-making
- Offers a "What-If" simulator for testing new campaign ideas

Objectives:

Primary Objectives

1. Build a Predictive Model
   - Develop ML classifier to predict campaign success (High ROI vs Low ROI)
   - Achieve minimum 75% accuracy on unseen test data
   - Ensure model is interpretable for business stakeholders

2. Identify Success Drivers
   - Determine which factors matter most for campaign performance
   - Quantify the importance of budget, channel, timing, and engagement
   - Provide actionable insights for marketing strategy

3. Create Decision-Support Tools
   - Build interactive dashboards for campaign analysis
   - Develop "What-If" simulator for testing new campaigns
   - Enable non-technical users to leverage ML predictions

4. Demonstrate Business Value
   - Calculate ROI and cost savings from model implementation
   - Show clear recommendations for budget allocation
   - Prove the model's real-world applicability

Tools and Technologies:

Programming & Machine Learning: Python, RandomForest, pandas, numpy
Data Processing & Feature Engineering: OneHotEncoder, ColumnTransformer, MinMaxScaler
Visualization & Reporting: Power BI, matplotlib, seaborn
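
A minimal sketch of how these pieces compose — OneHotEncoder and MinMaxScaler inside a ColumnTransformer feeding a RandomForest — on a tiny invented campaign table (the column names and values are hypothetical, not the project's dataset):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Tiny illustrative campaign table
df = pd.DataFrame({
    "channel": ["Facebook", "Instagram", "Pinterest", "Facebook"] * 25,
    "budget": [500, 1500, 800, 2500] * 25,
    "duration_days": [7, 14, 10, 30] * 25,
    "high_roi": [0, 1, 0, 1] * 25,   # target: ROI above the chosen threshold
})

pre = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["channel"]),
    ("num", MinMaxScaler(), ["budget", "duration_days"]),
])
model = Pipeline([("pre", pre), ("rf", RandomForestClassifier(random_state=0))])
model.fit(df.drop(columns="high_roi"), df["high_roi"])

# Score a prospective campaign before launch ("What-If" style)
new = pd.DataFrame({"channel": ["Instagram"], "budget": [1500], "duration_days": [14]})
print(model.predict_proba(new)[0, 1])
```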


BIG QUERY/SQL SERVER/POWER BI

Google Merchandise Store: Full Stack Product Analytics

Project Background

Project Overview:

A comprehensive e-commerce analytics project analyzing 900,000+ customer events from Google's official Merchandise Store. Using the full product analytics stack—from BigQuery data extraction to Power BI dashboards—I identified critical conversion bottlenecks and developed data-driven recommendations projected to increase revenue by 15% ($58K annually).

This project demonstrates end-to-end product analytics capabilities: data engineering, advanced SQL analysis, statistical reasoning, visualization design, and business strategy development.

Objectives:

Primary Goal

  • Conduct a complete conversion funnel analysis to identify where Google Merchandise Store loses customers and provide actionable recommendations to increase revenue

Specific Objectives

  • Map the customer journey - Build product-level funnel from view to purchase

  • Quantify drop-off points - Calculate conversion rates between each funnel stage

  • Segment user behavior - Analyze by traffic source, device, and temporal patterns

  • Identify product opportunities - Determine best and worst-performing items

  • Calculate business impact - Estimate revenue potential of optimization efforts
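
Quantifying drop-off between funnel stages reduces to counting distinct users per stage. A toy sketch on a simplified GA4-style event log — the event names follow the GA4 schema, but the data is invented:

```python
import pandas as pd

# Simplified GA4-style event log (illustrative)
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 4],
    "event":   ["view_item", "add_to_cart", "purchase",
                "view_item", "add_to_cart",
                "view_item", "view_item"],
})

stages = ["view_item", "add_to_cart", "purchase"]
users_per_stage = {s: events.loc[events["event"] == s, "user_id"].nunique()
                   for s in stages}

# Stage-to-stage conversion and cart abandonment
cart_rate = users_per_stage["add_to_cart"] / users_per_stage["view_item"]
purchase_rate = users_per_stage["purchase"] / users_per_stage["view_item"]
abandonment = 1 - users_per_stage["purchase"] / users_per_stage["add_to_cart"]
print(users_per_stage, f"cart: {cart_rate:.0%}")
```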

Success Metrics

  • Overall conversion rate analysis

  • Cart abandonment diagnosis

  • Traffic source ROI evaluation

  • Product-level conversion insights

  • Revenue opportunity quantification


Tools and Technologies Used:

Data Infrastructure

  • Google BigQuery - Extracted 900K events from GA4 public dataset

  • SQL Server - Data warehouse for transformation and analysis

  • T-SQL - 20+ complex analytical queries

Analysis & Visualization

  • Power BI Desktop - Interactive 2-page dashboard

  • DAX - Custom measures for conversion and abandonment metrics

Technical Skills Applied

  • SQL: CTEs, window functions, self-joins, date manipulation, aggregations

  • Data Modeling: Dimensional modeling with fact and dimension tables

  • Statistics: Conversion rate analysis, cohort retention, A/B test framework

  • Dashboard Design: User experience principles, visual hierarchy, storytelling

  • Business Analysis: Revenue impact modeling, prioritization frameworks

Skills Demonstrated:

Technical Proficiency

  • Advanced SQL: Complex CTEs, window functions, self-joins for 900K+ row datasets

  • Data Engineering: End-to-end pipeline from BigQuery to SQL Server to Power BI

  • Statistical Analysis: Conversion rates, cohort analysis, A/B test design

  • Data Visualization: Dashboard UX design following best practices

Business Acumen

  • Product Thinking: Translated metrics into user behavior insights

  • Impact Quantification: Tied every finding to revenue opportunity

  • Prioritization: Ranked recommendations by effort vs. impact

  • Stakeholder Communication: Crafted narrative for technical and non-technical audiences

Product Analytics Mindset

  • Asked "why" behind every metric (not just "what")

  • Segmented data multiple ways to find hidden patterns

  • Connected user behavior to business outcomes

  • Provided actionable next steps, not just observations


PYTHON/MACHINE LEARNING

Hull Tactical Analysis

Project Background

Project Overview:

This project analyzes the Hull Tactical Fund using a dataset obtained from kaggle.com, applying Python to explore market data, identify patterns, and evaluate risk and performance metrics. It covers data preprocessing, visualization, and predictive modeling techniques to draw insights into market behavior and investment trends.

Objectives:

  • Perform exploratory data analysis (EDA) on the provided datasets (train.csv, test.csv).

  • Identify correlations and trends in historical financial data.

  • Build and evaluate predictive models to forecast fund performance.

  • Visualize insights using Python libraries for better interpretability.


Tools and Technologies:

  • Python

  • Jupyter Notebook

  • pandas, numpy, matplotlib, seaborn, scikit-learn, Linear Regression
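
A minimal sketch of the modeling step: fit a Linear Regression and score it on held-out data. Synthetic stand-in data is used here, since the actual Kaggle columns are not reproduced in this summary:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic stand-in for train.csv: a few market features and a noisy
# linear target (the real columns differ)
X = rng.normal(size=(400, 3))
y = 0.5 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
model = LinearRegression().fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))
print(f"R^2 on held-out data: {r2:.2f}")
```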


VISUALIZATION WITH POWER BI

West Africa Development Dashboard

Project Background

Project Overview:

This Power BI project explores economic, infrastructure, and environmental development trends across West African countries using World Bank (WDI) data.
It provides a multi-dimensional view of growth—connecting GDP, fiscal health, labor participation, poverty reduction, and sustainability performance—to reveal how the region balances progress with environmental responsibility.

Objectives:

  • Evaluate economic performance through GDP per capita, PPP, and government debt (% of GDP).

  • Examine welfare trends—poverty ratio and labor force participation.

  • Assess infrastructure and environmental balance via electricity access, CO₂ emissions, forest coverage, and land area.

  • Identify top-performing countries and opportunities for sustainable development.


Tools and Concepts Used:

  • Power BI Desktop: Data modeling & interactive visualization

  • Data Source: World Bank DataBank (WDI) – Africa Development Indicators 2020–2024

  • DAX Measures: Dynamic averages, ratios (e.g., Sustainability Index, PPP/GDP comparisons)

  • Data Cleaning: Power Query (removed nulls, filtered years, normalized country names)

  • Design: Clean theme with KPIs, conditional formatting, and iconography for readability

Analytic Takeaways:

  • Steady economic progress coexists with slow poverty reduction → policy focus should target inclusive growth.

  • Energy access expansion drives development but raises CO₂ emissions → green energy transition is key.

  • Countries like Ghana and Cabo Verde show that high purchasing power can coexist with moderate emissions—models for regional sustainability.

  • Integrating economic and environmental metrics offers a richer picture of Africa’s development story beyond GDP alone.


SQL/POWER BI/MACHINE LEARNING

Telco Customer Churn Analysis

Project Background

Project Overview:

This project analyzes customer churn patterns for a telecommunications company using the Telco Customer Churn dataset from Kaggle.
It demonstrates how SQL, Power BI, and Python can be combined to uncover key drivers of churn and build a predictive model that helps businesses retain customers.

The project integrates:

  • Data extraction and cleaning in MySQL

  • Visual analytics and KPIs in Power BI

  • Predictive modeling in Python using Logistic Regression

Objectives:

  • Identify factors contributing most to customer churn.

  • Quantify churn rates across service types, contracts, and demographics.

  • Develop a machine-learning model to predict customers likely to leave.

  • Present insights visually for clear business communication.

Tools and Concepts Used:

  • Database: MySQL

  • Visualization: Power BI

  • Machine Learning: Python, Pandas, NumPy, Scikit-Learn

  • Preprocessing: OneHotEncoder, MinMaxScaler, ColumnTransformer

  • Evaluation Metrics: Accuracy, AUC, ROC Curve
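
These pieces compose naturally into a single scikit-learn Pipeline. The miniature table below mirrors a few of the Kaggle dataset's column names, but the rows are invented for the sketch:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Miniature stand-in for the Telco dataset
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Two year", "Month-to-month", "One year"] * 30,
    "tenure": [2, 60, 5, 24] * 30,
    "MonthlyCharges": [85.0, 20.0, 90.0, 55.0] * 30,
    "Churn": [1, 0, 1, 0] * 30,
})

pre = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
    ("num", MinMaxScaler(), ["tenure", "MonthlyCharges"]),
])
clf = Pipeline([("pre", pre), ("lr", LogisticRegression(max_iter=1000))])

X, y = df.drop(columns="Churn"), df["Churn"]
clf.fit(X, y)
auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
print(f"Training AUC: {auc:.2f}")
```

In practice the AUC would be measured on a held-out split rather than the training rows, as noted in the evaluation metrics above.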

Analytic Takeaways:

  • Customers on month-to-month contracts and fiber optic plans have the highest churn rates.

  • Longer tenure significantly reduces churn probability.

  • High monthly charges correlate strongly with churn risk.

  • Payment method and internet service type are major churn drivers.

Business Impact:

This analysis provides a data-driven foundation for:

  • Targeted customer retention campaigns.

  • Pricing or contract adjustments to minimize churn.

  • Predictive alerts to flag high-risk customers early.


Let's create data-driven success together
