Get the highest answer rate question from a table survey_log
with these columns: uid, action, question_id, answer_id, q_num, timestamp.
uid means user id; action has these kind of values: "show", "answer", "skip"; answer_id is not null when action column is "answer", while is null for "show" and "skip"; q_num is the numeral order of the question in current session.
Write a sql query to identify the question which has the highest answer rate.
Example:
Input: +------+-----------+--------------+------------+-----------+------------+ | uid | action | question_id | answer_id | q_num | timestamp | +------+-----------+--------------+------------+-----------+------------+ | 5 | show | 285 | null | 1 | 123 | | 5 | answer | 285 | 124124 | 1 | 124 | | 5 | show | 369 | null | 2 | 125 | | 5 | skip | 369 | null | 2 | 126 | +------+-----------+--------------+------------+-----------+------------+ Output: +-------------+ | survey_log | +-------------+ | 285 | +-------------+ Explanation: question 285 has answer rate 1/1, while question 369 has 0/1 answer rate, so output 285.
Note: The highest answer rate meaning is: answer number's ratio in show number in the same question.
Solution
Approach I: Using sub-query and SUM()
function [Accepted]
Intuition
Calculate the answered times / show times for each question.
Algorithm
First, we can use SUM()
to get the total number of answered times as well as the show times for each question using a sub-query as below.
SELECT question_id, SUM(CASE WHEN action = 'answer' THEN 1 ELSE 0 END) AS num_answer, SUM(CASE WHEN action = 'show' THEN 1 ELSE 0 END) AS num_show FROM survey_log GROUP BY question_id ;
| question_id | num_answer | num_show | |-------------|------------|----------| | 285 | 1 | 1 | | 369 | 0 | 1 |
Then we can calculate the answer rate by its definition.
MySQL
SELECT question_id as survey_log FROM ( SELECT question_id, SUM(case when action="answer" THEN 1 ELSE 0 END) as num_answer, SUM(case when action="show" THEN 1 ELSE 0 END) as num_show, FROM survey_log GROUP BY question_id ) as tbl ORDER BY (num_answer / num_show) DESC LIMIT 1
Approach II: Using sub-query and COUNT(IF...)
function [Accepted]
Algorithm
This solution is very straight forward: use the COUNT()
function to sum the answer and show time combining with the IF()
function.
MySQL
SELECT question_id AS 'survey_log' FROM survey_log GROUP BY question_id ORDER BY COUNT(answer_id) / COUNT(IF(action = 'show', 1, 0)) DESC LIMIT 1;