题目:现在运营想要查看用户在某天刷题后第二天还会再来刷题的平均概率。请你取出相应数据。 示例:question_practice_detail id device_id quest_id result date 1 2138 111 wrong 2021-05-03 2 3214 112 wrong 2021-05-09 3 3214 113 wrong 2021-06-15 4 6543 111 right 2021-08-13 5 2315 115 right 2021-08-13 6 2315 116 right 2021-08-14 7 2315 117 wrong 2021-08-15 …… 根据示例,你的查询应返回以下结果: avg_ret 0.3000
区块链毕设网qklbishe.com为您提供问题的解答
id | device_id | quest_id | result | date |
1 | 2138 | 111 | wrong | 2021-05-03 |
2 | 3214 | 112 | wrong | 2021-05-09 |
3 | 3214 | 113 | wrong | 2021-06-15 |
4 | 6543 | 111 | right | 2021-08-13 |
5 | 2315 | 115 | right | 2021-08-13 |
6 | 2315 | 116 | right | 2021-08-14 |
7 | 2315 | 117 | wrong | 2021-08-15 |
…… |
avg_ret |
0.3000 |
SELECT COUNT(distinct q2.device_id,q2.date)/count(DISTINCT q1.device_id,q1.date) as avg_ret from question_practice_detail as q1 left outer join question_practice_detail as q2 on q1.device_id=q2.device_id and DATEDIFF(q2.date,q1.date)=1
思路是:
select avg(if(b.device_id is not null,1,0)) as avg_ret from (select distinct device_id,date from question_practice_detail )a left join ( select distinct device_id,date_sub(date,interval 1 day) as date from question_practice_detail )b on a.device_id = b.device_id and a.date = b.date
DATE_SUB() 函数从日期减去指定的时间间隔。
代码中就是减去一天
date_sub(date,interval 1 day)
select count(t2.date)/count(t1.date) from (select distinct device_id,date from question_practice_detail) t1 left join (select distinct device_id,date from question_practice_detail) t2 on t1.device_id = t2.device_id and t2.date = date_add(t1.date,interval 1 day)
思路:表的自联结
SELECT COUNT(b.device_id)/ COUNT(a.device_id) FROM( SELECT DISTINCT device_id,date FROM question_practice_detail ) as a LEFT JOIN ( SELECT DISTINCT device_id,DATE_SUB(date,interval 1 day) as date FROM question_practice_detail ) as b ON a.date=b.date AND a.device_id=b.device_id
select count(id) / (select count(distinct device_id,date) from question_practice_detail) as avg_ret from (SELECT a.device_id as id ,a.date as date1 ,lead(a.date,1) over(partition by device_id order by a.date) as date2 from (select distinct device_id,date from question_practice_detail) a ) b where DATEDIFF(date2,date1) =1
SELECT COUNT(q2.device_id) / COUNT(q1.device_id) AS avg_ret FROM (SELECT DISTINCT device_id, date FROM question_practice_detail)as q1 LEFT JOIN (SELECT DISTINCT device_id, date FROM question_practice_detail) AS q2 ON q1.device_id = q2.device_id AND q2.date = DATE_ADD(q1.date, interval 1 day)
select count(distinct b.device_id,b.date)/count(distinct a.device_id,a.date) as avg_ret from question_practice_detail a left JOIN question_practice_detail b on a.date = b.date-1 and a.device_id = b.device_id
先用左连接,可以得到第一天回答但是第二天没有回答的记录,然后通过count,并且通过和device_id和日期进行去重
select count(date2)/count(date) as avg_ret from
#第一天用户
(select distinct device_id,date from question_practice_detail) a left join
(select distinct device_id,date_sub(date,interval 1 day) date2 from question_practice_detail) b on a.device_id=b.device_id and a.date=b.date2;
#只有在满足两个限定条件的情况下,date2才能匹配出数值,不能匹配的情况下,即第二天没有刷题,date2 is null
#计算用户的平均次日留存率
这道题确实有些难,吐槽一下真的要实现这个功能我宁愿写两个SQL查出来再进行计算,性能差也比花半天时间来写强
思路
- 首先要搞明白这个概率的计算,不然无从下手
概率=(去重后的用户有连续两天刷题记录次数)/(去重日期用户后刷题记录次数) - 最好先写成两句SQL,分别查出分母和分子的记录数,然后用子查询结合起来就好了
SQL1
1.1 先来找有连续两天刷题的记录,显然单靠本表查询是做不到的,这里应该用到自连接
1.2 自关联后要进行根据日期和device_id进行进一步筛选,然后求count得到分子
(这里建议将案例sql复制到MySQL数据库一点点查着写,就清楚了,语言描述还是苍白无力)select count(distinct q3.date,q3.device_id) from (select q1.device_id,q1.date from question_practice_detail q1 left join question_practice_detail q2 on q1.device_id = q2.device_id where date_add(q1.date,interval 1 day)=q2.date) q3;
sql2
2.1 对原表进行查询,采用相同的手段进行去重,最后再求count得到分母
select count(DISTINCT date,device_id) from question_practice_detail;
子查询
这题之所以用子查询是因为非要一个SQL计算出结果的无奈之举。目的是为了把上面两句SQL结果在一句SQL用起来进行计算。我是将SQL1作为子SQL的。
ps:子查询最好多结合别名来使用select q5.times/count(DISTINCT q4.date,q4.device_id) avg_ret from question_practice_detail q4, (select count(distinct q3.date,q3.device_id) times from (select q1.device_id,q1.date from question_practice_detail q1 left join question_practice_detail q2 on q1.device_id = q2.device_id where date_add(q1.date,interval 1 day)=q2.date) q3) q5;
from (select distinct device_id,date from question_practice_detail) t1
inner join (select distinct device_id,date from question_practice_detail) t2
on t1.device_id=t2.device_id
where datediff(t2.date,t1.date)=1)/(select count(*) from (select distinct device_id,date from question_practice_detail) t3) avg_ret
avg(if(b.device_id is not null,1,0)) as avg_ret
可写为公式2:(count:已知拆分为两个表,如果b表中的device_id不为空值,记为1/总数)
count(if(b.device_id is not null,1,null)) / count(a.device_id) as avg_ret
框架1中已解释:b表中含日期的为连续刷题用户,没有连续刷题用户为null,在根据公式2进行理解。 公式3:count(b.device_id)/ count(a.device_id)
select distinct device_id, date n_date, dense_rank()over(partition by device_id order by date) t_r from question_practice_detail) a) rt
再用偏移窗口函数把上面的日期顺序向下偏移一格,组合起来就是
select a.device_id,a.n_date, lead(a.n_date,1)over(partition by device_id order by a.n_date) t_date from (select distinct device_id, date n_date, dense_rank()over(partition by device_id order by date) t_r from question_practice_detail) a
最后进行判断即可,完整公式如下:
select sum(if(rt.n_date=date_sub(rt.t_date,interval 1 day),1,0))/count(*) from (select a.device_id,a.n_date, lead(a.n_date,1)over(partition by device_id order by a.n_date) t_date from (select distinct device_id, date n_date, dense_rank()over(partition by device_id order by date) t_r from question_practice_detail) a) rt
第一步我们把同一天的同一用户去重,
第二步我们做有条件的sum求和,
从这个新的表里拿出每一天的数据,看它之后的一天是否也在原表中有数据
有则代表他次日留存了,记1,没有则记0,
第三步我们把留存数除以访问总数count(*)即可
select avg(if(date_add(date, interval 1 day) in (select date from question_practice_detail t2 where t1.device_id = t2.device_id),1,0)) as avg_ret from (select distinct device_id, date from question_practice_detail) t1
以上就是关于问题题目:现在运营想要查看用户在某天刷题后第二天还会再来刷题的平均概率。请你取出相应数据。
示例:question_practice_detail id device_id quest_id result date 1 2138 111 wrong 2021-05-03 2 3214 112 wrong
2021-05-09 3 3214 113 wrong
2021-06-15 4 6543 111 right 2021-08-13 5 2315 115 right
2021-08-13 6 2315 116 right
2021-08-14 7 2315 117 wrong
2021-08-15 ……
根据示例,你的查询应返回以下结果: avg_ret 0.3000的答案
欢迎关注区块链毕设网-
专业区块链毕业设计成品源码,定制。
区块链NFT链游项目方科学家脚本开发培训
从业7年-专注一级市场
微信:btc9767
TELEGRAM :https://t.me/btcok9
具体资料介绍
web3的一级市场千万收益的逻辑
进群点我
qklbishe.com区块链毕设代做网专注|以太坊fabric-计算机|java|毕业设计|代做平台-javagopython毕设 » 题目:现在运营想要查看用户在某天刷题后第二天还会再来刷题的平均概率。请你取出相应数据。
示例:question_practice_detail id device_id quest_id result date 1 2138 111 wrong 2021-05-03 2 3214 112 wrong
2021-05-09 3 3214 113 wrong
2021-06-15 4 6543 111 right 2021-08-13 5 2315 115 right
2021-08-13 6 2315 116 right
2021-08-14 7 2315 117 wrong
2021-08-15 ……
根据示例,你的查询应返回以下结果: avg_ret 0.3000
微信:btc9767
TELEGRAM :https://t.me/btcok9
具体资料介绍
web3的一级市场千万收益的逻辑
进群点我
qklbishe.com区块链毕设代做网专注|以太坊fabric-计算机|java|毕业设计|代做平台-javagopython毕设 » 题目:现在运营想要查看用户在某天刷题后第二天还会再来刷题的平均概率。请你取出相应数据。
示例:question_practice_detail id device_id quest_id result date 1 2138 111 wrong 2021-05-03 2 3214 112 wrong
2021-05-09 3 3214 113 wrong
2021-06-15 4 6543 111 right 2021-08-13 5 2315 115 right
2021-08-13 6 2315 116 right
2021-08-14 7 2315 117 wrong
2021-08-15 ……
根据示例,你的查询应返回以下结果: avg_ret 0.3000
进群点我
qklbishe.com区块链毕设代做网专注|以太坊fabric-计算机|java|毕业设计|代做平台-javagopython毕设 » 题目:现在运营想要查看用户在某天刷题后第二天还会再来刷题的平均概率。请你取出相应数据。
示例:question_practice_detail id device_id quest_id result date 1 2138 111 wrong 2021-05-03 2 3214 112 wrong
2021-05-09 3 3214 113 wrong
2021-06-15 4 6543 111 right 2021-08-13 5 2315 115 right
2021-08-13 6 2315 116 right
2021-08-14 7 2315 117 wrong
2021-08-15 ……
根据示例,你的查询应返回以下结果: avg_ret 0.3000
qklbishe.com区块链毕设代做网专注|以太坊fabric-计算机|java|毕业设计|代做平台-javagopython毕设 » 题目:现在运营想要查看用户在某天刷题后第二天还会再来刷题的平均概率。请你取出相应数据。 示例:question_practice_detail id device_id quest_id result date 1 2138 111 wrong 2021-05-03 2 3214 112 wrong 2021-05-09 3 3214 113 wrong 2021-06-15 4 6543 111 right 2021-08-13 5 2315 115 right 2021-08-13 6 2315 116 right 2021-08-14 7 2315 117 wrong 2021-08-15 …… 根据示例,你的查询应返回以下结果: avg_ret 0.3000