This article walks through an ad-hoc fix for a hang that occurred while deduplicating roughly 800,000 rows; readers facing the same problem may find it useful.
Recently, a deduplication run against a business table holding a couple of million rows simply hung. Here is a short recap of what happened.
1. Checked the total row count of the business table: just over 2.55 million rows.
- SQL> select count(*) from tb_bj_banker_etl;
- 2552381
2. Checked how many duplicate rows needed to be removed: just over 830,000.
- SQL> select count(*) from tb_bj_banker_etl where (id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1);
- 830099
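As a side note (not part of the original run), the same duplicate count can also be computed in a single pass with an analytic function instead of two GROUP BY subqueries; a minimal sketch, assuming id is the deduplication key:
- -- Hedged alternative (not from the article): count duplicates with an analytic function.
- -- rn = 1 marks the row that would be kept (the max rowid within each id group).
- select count(*)
-   from (select row_number() over (partition by id order by rowid desc) as rn
-           from tb_bj_banker_etl)
-  where rn > 1;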
3. So, before leaving work that evening, ran the following script to remove the duplicates:
- SQL> delete from tb_bj_banker_etl where(id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1);
- SQL> commit;
4. The next day, back on site, the statement started in PL/SQL Developer the previous evening was still running.
First reaction: a delete of just over 800,000 duplicate rows ran all night without finishing? Something had to be wrong.
Suspected the table was locked.
So checked whether any user was holding a lock on the table.
- SELECT
- A.OWNER,             -- owner of the locked object
- A.OBJECT_NAME,       -- name of the locked object
- B.XIDUSN,
- B.XIDSLOT,
- B.XIDSQN,
- B.SESSION_ID,        -- session holding the lock
- B.ORACLE_USERNAME,   -- Oracle user holding the lock
- B.OS_USER_NAME,      -- OS login name of the locking user
- B.PROCESS,
- B.LOCKED_MODE,
- C.MACHINE,           -- client machine of the locking session
- C.STATUS,            -- session status
- C.SERVER,
- C.SID,
- C.SERIAL#,
- C.PROGRAM            -- client tool used by the locking session
- FROM
- ALL_OBJECTS A,
- V$LOCKED_OBJECT B,
- SYS.GV_$SESSION C
- WHERE
- A.OBJECT_ID = B.OBJECT_ID
- AND B.PROCESS = C.PROCESS
- ORDER BY 1, 2;
The results below show that the only lock on the table is held by the session that issued the dedup statement itself; no other user is locking the table. Does that mean the statement was still executing? With that question in mind, I started working on a fix.
# OWNER OBJECT_NAME XIDUSN XIDSLOT XIDSQN SESSION_ID ORACLE_USERNAME OS_USER_NAME PROCESS LOCKED_MODE MACHINE STATUS SERVER SID SERIAL# PROGRAM
1 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP/BACKDB ACTIVE DEDICATED 913 3381 plsqldev.exe
2 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP/BACKDB INACTIVE DEDICATED 649 41791 plsqldev.exe
3 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP/BACKDB INACTIVE DEDICATED 817 27777 plsqldev.exe
4 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP/BACKDB INACTIVE DEDICATED 841 1981 plsqldev.exe
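One way to answer that question (this check was not part of the original troubleshooting) is to watch the undo usage of the deleting session: while a DELETE, or its rollback, is making progress, its undo record count keeps changing. A hedged sketch, assuming the SID 913 shown above:
- -- Hedged check: is the dedup session still doing work?
- -- used_urec / used_ublk keep changing while the DELETE (or a rollback) is in progress.
- select s.sid,
-        s.serial#,
-        s.status,
-        t.used_ublk,   -- undo blocks consumed by the transaction
-        t.used_urec    -- undo records consumed by the transaction
-   from v$session s
-   join v$transaction t on s.taddr = t.addr
-  where s.sid = 913;   -- SID of the dedup session, taken from the listing above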
5. Worked around the hang with batched deletes.
Since the direct delete would not go through, tried removing the duplicates in batches instead (a loop-based sketch of the same idea follows the list of batches below).
- Batch 1:
- delete from tb_bj_banker_etl where(id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1) and rownum<=100000;
- commit;
- Batch 2:
- delete from tb_bj_banker_etl where(id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1) and rownum<=100000;
- commit;
- ...... (batches 3 through 7: same statement and commit as above)
- Batch 8 (no ROWNUM limit, removes the remaining duplicates):
- delete from tb_bj_banker_etl where(id) in (select id from tb_bj_banker_etl group by id having count(*)>1) and rowid not in(select max(rowid) from tb_bj_banker_etl group by id having count(*)>1);
- commit;
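For reference, the eight manual runs could also be wrapped in a small PL/SQL loop; this is only a sketch of the same idea (it is not the script actually used) and it keeps the 100000-row batch size from above:
- -- Sketch only: automate the batched delete until no duplicates remain.
- begin
-   loop
-     delete from tb_bj_banker_etl
-      where id in (select id from tb_bj_banker_etl group by id having count(*) > 1)
-        and rowid not in (select max(rowid) from tb_bj_banker_etl group by id having count(*) > 1)
-        and rownum <= 100000;
-     exit when sql%rowcount = 0;  -- nothing deleted: all duplicates are gone
-     commit;                      -- commit each batch to keep the transaction small
-   end loop;
-   commit;
- end;
- /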
Result: by splitting the 830,000-plus duplicate rows into deletes of 100,000 rows each, the whole cleanup finished in a little over 140 seconds and the 800,000-odd duplicates were removed. Why the single direct delete hung still needs follow-up analysis.
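As a starting point for that follow-up, one option is to look at how Oracle plans the original statement, since the NOT IN against a full-table GROUP BY subquery is a plausible suspect; a hedged sketch:
- -- Hedged sketch: inspect the execution plan of the original delete.
- explain plan for
- delete from tb_bj_banker_etl
-  where id in (select id from tb_bj_banker_etl group by id having count(*) > 1)
-    and rowid not in (select max(rowid) from tb_bj_banker_etl group by id having count(*) > 1);
-
- select * from table(dbms_xplan.display);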
That is the whole story of this ad-hoc fix for the hang while deduplicating 800,000 rows; hopefully it helps.