급하게 답을 구해야하는데 해결 방법을 찾지 못하여 여기에 글올립니다.
ㅜㅜ
아래와 같은 데이터가 있다고 가정할때
click은 항상 view와 동시 또는 항상 나중에 일어나야 합니다.
view한번에 click은 한번으로 매핑합니다.
UUID LOG_TIME EVENT
A 12:00:00 VIEW
A 12:00:02 CLICK
A 12:00:03 VIEW
A 12:00:04 VIEW
A 12:00:06 CLICK
A 12:00:07 CLICK
A 12:00:11 VIEW
A 12:00:12 VIEW
A 12:00:13 VIEW
A 12:00:14 VIEW
A 12:00:15 VIEW
A 12:00:15 CLICK
A 12:00:17 VIEW
A 12:00:18 VIEW
A 12:00:19 VIEW
A 12:00:20 VIEW
A 12:00:26 CLICK
A 12:00:27 CLICK
A 12:00:28 CLICK
A 12:00:29 VIEW
A 12:00:30 CLICK
A 12:00:34 VIEW
A 12:00:35 VIEW
A 12:00:36 VIEW
A 12:00:37 VIEW
A 12:00:41 CLICK
위테이블을 아래와 같이 변경합니다
UUID LOG_TIME EVENT CLICK
A 12:00:00 VIEW CLICK
A 12:00:03 VIEW null
A 12:00:04 VIEW CLICK
A 12:00:11 VIEW null
A 12:00:12 VIEW null
A 12:00:13 VIEW null
A 12:00:14 VIEW null
A 12:00:15 VIEW CLICK
A 12:00:17 VIEW null
A 12:00:18 VIEW null
A 12:00:19 VIEW null
A 12:00:20 VIEW CLICK
A 12:00:29 VIEW CLICK
A 12:00:34 VIEW null
A 12:00:35 VIEW null
A 12:00:36 VIEW null
A 12:00:37 VIEW CLICK
view와 view 사이에 클릭이 있을경우 항상 최상위 뷰에 맞게 클릭이 일어났다는걸 테이블로 표현 하려고 합니다.
시스템은 hive를사용중이고 건수는 200만건입니다
view와 click를 템프테이블에 각각 담아도 조인키가 없어서 조인도 되지 않고 난감합니다
도와주세요 ㅜㅜ
원본테이블
변환 테이블
하기 쿼리 참고하세요~
---쿼리시작---
;with tblA(UUID,LOG_TIME,EVENT) As
(
Select 'A','12:00:00','VIEW ' Union All
Select 'A','12:00:02','CLICK' Union All
Select 'A','12:00:03','VIEW ' Union All
Select 'A','12:00:04','VIEW ' Union All
Select 'A','12:00:06','CLICK' Union All
Select 'A','12:00:07','CLICK' Union All
Select 'A','12:00:11','VIEW ' Union All
Select 'A','12:00:12','VIEW ' Union All
Select 'A','12:00:13','VIEW ' Union All
Select 'A','12:00:14','VIEW ' Union All
Select 'A','12:00:15','VIEW ' Union All
Select 'A','12:00:15','CLICK' Union All
Select 'A','12:00:17','VIEW ' Union All
Select 'A','12:00:18','VIEW ' Union All
Select 'A','12:00:19','VIEW ' Union All
Select 'A','12:00:20','VIEW ' Union All
Select 'A','12:00:26','CLICK' Union All
Select 'A','12:00:27','CLICK' Union All
Select 'A','12:00:28','CLICK' Union All
Select 'A','12:00:29','VIEW ' Union All
Select 'A','12:00:30','CLICK' Union All
Select 'A','12:00:34','VIEW ' Union All
Select 'A','12:00:35','VIEW ' Union All
Select 'A','12:00:36','VIEW ' Union All
Select 'A','12:00:37','VIEW ' Union All
Select 'A','12:00:41','CLICK'
)
,tblB(UUID,LOG_TIME,EVENT,seq) As
(
Select a.*
,Row_Number() Over(Partition By a.UUID Order By a.LOG_TIME Asc,a.EVENT Desc)
From tblA a
)
Select a.UUID
,a.LOG_TIME
,a.EVENT
,b.EVENT
From tblB a
Left Outer Join
tblB b
On a.UUID = b.UUID
And a.seq = b.seq - 1
And b.EVENT = 'CLICK'
Where a.EVENT = 'VIEW'
---쿼리끝---