数据库

本类阅读TOP10

·SQL语句导入导出大全
·SQL Server日期计算
·SQL语句导入导出大全
·SQL to Excel 的应用
·Oracle中password file的作用及说明
·MS SQLServer OLEDB分布式事务无法启动的一般解决方案
·sqlserver2000数据库置疑的解决方法
·一个比较实用的大数据量分页存储过程
·如何在正运行 SQL Server 7.0 的服务器之间传输登录和密码
·SQL中两台服务器间使用连接服务器

分类导航
VC语言Delphi
VB语言ASP
PerlJava
Script数据库
其他语言游戏开发
文件格式网站制作
软件工程.NET开发
数据库性能优化分析案例---解决SQL语句过度消耗CPU问题

作者:未知 来源:月光软件站 加入时间:2005-2-28 月光软件站

问题描述:
10月25日上午滨州网通的工程师报告OSS应用系统运行缓慢,具体操作是通过OSS系统查询话单,很长时间才能返回结果,严重影响了客户的正常使用。

问题处理:
1.登陆数据库主机,用sar命令看到idle的值持续为0,CPU的资源已经耗尽:

bz_db1# sar 2 4

SunOS kest 5.8 Generic_108528-19 sun4u    10/26/04

10:56:46    %usr    %sys    %wio   %idle
10:56:48       1       4      95       0
10:56:50       1       5      94       0
10:56:52       0       6      93       0
10:56:54       1       6      93       0

Average        1       5      94      0


2.使用TOP命令看到有两个明显占用CPU利用率过高的进程,以下是top命令的结果:

bz_db1# top

last pid:  1664;load averages:  3.26,  3.24,  3.69                       
159 processes: 152 sleeping, 2 running, 2 zombie, 1 stopped, 2 on cpu
CPU states:  1.5% idle, 72.5% user, 17.9% kernel,  8.0% iowait,  0.0% swap
Memory: 2.0G real, 233M free, 2.0G swap in use, 3.4G swap free

   PID USERNAME THR PR NCE  SIZE   RES STATE   TIME FLTS    CPU COMMAND
 27420 oracle     1 10   0  1.3G  1.2G cpu01  22.9H    2 31.94% oracle
 27418 oracle     1 10   0  1.3G  1.2G run    23.0H    6 26.86% oracle
  5943 oracle     1 59   0  1.3G  1.2G sleep  25:26   37  4.92% oracle
  6295 oracle     1 55   0  1.3G  1.2G run    25:14   74  4.90% oracle
  7778 oracle     1 43   0  1.3G  1.2G sleep  11:43  110  4.86% oracle
 13270 oracle     1 59   0  1.3G  1.2G sleep 210.6H    0  0.96% oracle
 13056 oracle     1 48   0  1.3G  1.2G sleep 303:30    0  0.37% oracle
 10653 root       1 58   0 2560K 1624K cpu00   0:00    0  0.32% top
 18827 oracle     1 58   0  1.3G  1.2G sleep  18.4H    0  0.31% oracle
 12748 oracle   258 58   0  1.3G  1.2G sleep 555:14    0  0.21% oracle
 10634 oracle     1 59   0  1.3G  1.2G sleep   0:01    0  0.21% oracle
 28458 oracle     1 58   0  1.3G  1.2G sleep 535:02    0  0.18% oracle
 13075 oracle     1 59   0  1.3G  1.2G sleep 326:33    0  0.15% oracle
 13173 oracle     1 58   0  1.3G  1.2G sleep 593:07    0  0.13% oracle
  4927 oracle     1 59   0  1.3G  1.2G sleep  33.4H    0  0.11% oracle

可以看到这两个进程号分别是27420和27418.

3.捕获占用CPU利用率过高的SQL语句:

以下用到了我总结的SQL语句:

SQL>set line 240
SQL>set verify off
SQL>column sid format 999
SQL>column pid format 999
SQL>column S_# format 999
SQL>column username format A9 heading "ORA User"
SQL>column program  format a29
SQL>column SQL format a60
SQL>COLUMN OSname format a9 Heading "OS User"
SQL>SELECT P.pid pid,S.sid sid,P.spid spid,S.username username,
S.osuser osname,P.serial# S_#,P.terminal,P.program  program,
P.background,S.status,RTRIM(SUBSTR(a.sql_text, 1, 80))  SQL
FROM v$process P, v$session S,v$sqlarea A WHERE P.addr = s.paddr
AND S.sql_address = a.address (+)  AND P.spid LIKE '%&1%';

Enter value for 1: 27420(注意这里应输入占用CPU最高的进程对应的PID)

得到以下SQL语句:

Select nvl(sum(LOCALCHARGE),0),nvl(sum(usage),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20041016
and LOCALCHARGE>0 and caller like '0543886%';

27418进程对应的SQL语句如下:
select nvl(sum(LOCALCHARGE),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20041016 and caller like '0543888%';


4.使用相关用户连接到数据库,检查其执行计划:
SQL>connect wacos/oss
Connected.

SQL>@?/rdbms/admin/utlxplan.sql
Table created.

SQL>set autotrace on

SQL>set timing on

SQL>Select nvl(sum(LOCALCHARGE),0),nvl(sum(usage),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20041016
 and LOCALCHARGE>0 and caller like '0543886%';

NVL(SUM(LOCALCHARGE),0) NVL(SUM(USAGE),0)
----------------------- -----------------
                      0                 0

Elapsed: 00:02:56.37

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=13435 Card=1 Bytes=5
          3)

   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (FULL) OF 'LOCALUSAGE' (Cost=13435 Card=1
          81 Bytes=9593)

Statistics
----------------------------------------------------------
        258  recursive calls
          0  db block gets
      88739  consistent gets
      15705  physical reads
          0  redo size
        580  bytes sent via SQL*Net to client
        651  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          8  sorts (memory)
          0  sorts (disk)
          1  rows processed

发现对localusage表做了全表扫描,什么记录也没有返回居然用了2分多钟。

SQL> select nvl(sum(LOCALCHARGE),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20040816 and caller like '0543888%';

NVL(SUM(LOCALCHARGE),0)
-----------------------
                   27.6

Elapsed: 00:03:56.46

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=13435 Card=1 Bytes=4
          0)

   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (FULL) OF 'LOCALUSAGE' (Cost=13435 Card=3
          615 Bytes=144600)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      88588  consistent gets
      15615  physical reads
          0  redo size
        507  bytes sent via SQL*Net to client
        651  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

这个SQL语句有结果返回,发现也是对localusage表做了全表扫描,但速度也很慢,用了3分多钟。

SQL> select count(*) from localusage;

  COUNT(*)
----------
   5793776

该表有579万多条记录,数据量很大,全表扫描已经不再适合。

5.检查该表的类型:

SQL> SELECT INDEX_NAME, TABLE_NAME, STATUS, PARTITIONED FROM USER_INDEXES WHERE TABLE_NAME='LOCALUSAGE';

INDEX_NAME                     TABLE_NAME                     STATUS   PAR
------------------------------ ------------------------------ -------- ---
I_LOCALUSAGE_SID               LOCALUSAGE                     N/A      YES
UI_LOCALUSAGE_ST_SEQ           LOCALUSAGE                     N/A      YES

SQL> SELECT index_name,table_name,locality FROM user_part_indexes where table_name='LOCALUSAGE';

INDEX_NAME                     TABLE_NAME                     LOCALI
------------------------------ ------------------------------ ------
I_LOCALUSAGE_SID               LOCALUSAGE                     LOCAL
UI_LOCALUSAGE_ST_SEQ           LOCALUSAGE                     LOCAL

发现该表是分区表,并在SERVICEID,STARTIME和CDRSEQUENCE列上建立了分区索引,索引类型是local索引。

6.查看分区索引的索引键值:

SQL> select INDEX_NAME,COLUMN_NAME,INDEX_OWNER from dba_ind_columns where TABLE_NAME='LOCALUSAGE';

INDEX_NAME           COLUMN_NAME          INDEX_OWNER
-------------------- -------------------- ------------------------------
I_LOCALUSAGE_SID     SERVICEID            WACOS
UI_LOCALUSAGE_ST_SEQ STARTTIME            WACOS
UI_LOCALUSAGE_ST_SEQ CDRSEQUENCE          WACOS

发现在endtime和caller列上都没有建立索引,这也是导致SQL语句做全表扫描的最终原因。

7.决定创建新的分区索引以消除全表扫描:

(1).首先查看localusage表分区情况:

SQL> select PARTITION_NAME,tablespace_name from user_tab_partitions where table_name='LOCALUSAGE';

PARTITION_NAME                 TABLESPACE_NAME
------------------------------ ------------------------------
LOCALUSAGE_200312              WACOS
LOCALUSAGE_200401              WACOS
LOCALUSAGE_200402              WACOS
LOCALUSAGE_200404              WACOS
LOCALUSAGE_200405              WACOS
LOCALUSAGE_200406              WACOS
LOCALUSAGE_200407              WACOS
LOCALUSAGE_200409              WACOS
LOCALUSAGE_200410              WACOS
LOCALUSAGE_200411              WACOS
LOCALUSAGE_200403              WACOS
LOCALUSAGE_200408              WACOS
LOCALUSAGE_200412              WACOS

13 rows selected.

(2).在caller列上创建local分区索引:
SQL>set timing on
SQL>create index I_LOCALUSAGE_CALLER on localusage(caller)
LOCAL
(      
        PARTITION LOCALUSAGE_200312,
 PARTITION LOCALUSAGE_200401,
 PARTITION LOCALUSAGE_200402,
 PARTITION LOCALUSAGE_200404,
 PARTITION LOCALUSAGE_200405,
 PARTITION LOCALUSAGE_200406,
 PARTITION LOCALUSAGE_200407,
 PARTITION LOCALUSAGE_200409,
 PARTITION LOCALUSAGE_200410,
 PARTITION LOCALUSAGE_200411,
 PARTITION LOCALUSAGE_200403,
 PARTITION LOCALUSAGE_200408,
 PARTITION LOCALUSAGE_200412
)
TABLESPACE wacos
STORAGE(
 INITIAL 6553600
 NEXT 6553600
 MAXEXTENTS unlimited
 PCTINCREASE 0)
 PCTFREE 5
 NOLOGGING;

Index created.

Elapsed: 00:06:27.90  (由于数据量比较大,耗时6分钟)

8.再次查看执行计划:
SQL>Select nvl(sum(LOCALCHARGE),0),nvl(sum(usage),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20041016
and LOCALCHARGE>0  and caller like '0543886%';

NVL(SUM(LOCALCHARGE),0) NVL(SUM(USAGE),0)
----------------------- -----------------
                      0                 0

Elapsed: 00:00:03.00

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=22 Card=1 Bytes=53)
   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'LOCALUSAGE' (Cost=22 Card=181 Bytes=9593)
   4    3         INDEX (RANGE SCAN) OF 'I_LOCALUSAGE_CALLER' (NON-UNIQUE) (Cost=14 Card=65063)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      16813  consistent gets
        569  physical reads
          0  redo size
        580  bytes sent via SQL*Net to client
        651  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

这次走了索引后速度明显快多了,用了3秒钟就返回了结果。

SQL>select nvl(sum(LOCALCHARGE),0) from LOCALUSAGE where to_char(ENDTIME,'YYYYMMDD')=20040816 and caller like '0543888%';

NVL(SUM(LOCALCHARGE),0)
-----------------------
                   27.6

Elapsed: 00:00:24.73

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=22 Card=1 Bytes=40)
   1    0   SORT (AGGREGATE)
   2    1     PARTITION RANGE (ALL)
   3    2       TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'LOCALUSAGE' (Cost=22 Card=3615 Bytes=144600)
   4    3         INDEX (RANGE SCAN) OF 'I_LOCALUSAGE_CALLER' (NON-UNIQUE) (Cost=14 Card=65063)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     129336  consistent gets
       7241  physical reads
          0  redo size
        507  bytes sent via SQL*Net to client
        651  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

这个SQL语句走了索引,用了24秒钟返回结果,性能明显提高了很多。




相关文章

相关软件