SQL Challenge Problem #2 – Calculating Average Revenue Per User (ARPU)

Welcome back to another SQL challenge! Today, we will solve a problem using both Presto SQL and MySQL to calculate Average Revenue Per User (ARPU) for each product. This challenge will highlight differences between the two SQL dialects and their respective approaches.

The Problem

We have two tables:

users: Contains user-level data:
- user_id: Unique identifier for each user
- region: Region of the user
transactions: Contains transaction-level data:
- transaction_id: Unique identifier for each transaction
- user_id: Identifier of the user who made the transaction
- product_id: Product associated with the transaction
- amount: Revenue generated from the transaction

Goal: Calculate the Average Revenue Per User (ARPU) for each product

Presto SQL Solution

WITH product_metrics AS (
    SELECT
        product_id,
        SUM(amount) AS total_revenue,
        CARDINALITY(ARRAY_AGG(DISTINCT user_id)) AS unique_users
    FROM transactions
)
SELECT
    product_id,
    total_revenue,
    unique_users,
    (total_revenue / unique_users) AS average_revenue_per_user
FROM product_metrics

Explanation:

ARRAY_AGG(DISTINCT user_id)creates an array of unique user IDs for each product
CARDINALITY calculates the number of unique users from the array
The query calculates total revenue and ARPU directly in a single CTE

MySQL Solution

SELECT
    product_id,
    SUM(amount) AS total_revenue,
    COUNT(DISTINCT user_id) AS unique_users,
    (SUM(amount) / COUNT(DISTINCT user_id)) AS average_revenue_per_user
FROM transactions
GROUP BY product_id;

Explanation:

COUNT(DISTINCT user_id)directly computes the unique user count per product.
total revenue and ARPU are calculated in a single query without additional CTEs.

Key Differences

Handling Unique Users:
- Presto SQL uses ARRAY_AGG and CARDINALITY for unique user counts.
- MySQL uses COUNT(DISTINCT) directly for unique user counts.
Query Structure:
- Presto SQL leverages advanced array functions for flexibility in aggregation.
- MySQL simplifies the aggregation without requiring additional constructs.
Performance:
- Presto SQL is designed for distributed querying and handles large datasets more efficiently.
- MySQL, while simpler, may struggle with performance on massive datasets.

Key Takeaways

Presto SQL excels in handling complex operations using functions like ARRAY_AGG and CARDINALITY.
MySQL provides straightforward solutions for simpler aggregations using COUNT(DISTINCT).
Choose the right tool based on your data size and complexity requirements.

What’s Next

In next post, we will be solving questions based on best approach instead of Presto or MySQL comparision, as both seems to have some advantages and disadvantages. Therefore, it is better to focus on 1 instead of comparing apple and oranges down the road.

Stay tuned many more challenges ahead!

Spread the love

SQL Challenge Problem #2 – Calculating Average Revenue Per User (ARPU)

The Problem

Presto SQL Solution

Explanation:

MySQL Solution

Explanation:

Key Differences

Key Takeaways

What’s Next

Related Posts

Leave a Comment Cancel Reply