Archive

Archive for the ‘Oracle’ Category

SQL优化之警惕死循环

July 22nd, 2009 No comments

昨天接到一个任务:一个报表相关sql语句需要优化,原来执行时间几分钟,现在半小时也出不来。

登录系统后,查看sql语句执行计划未发现问题;

会话A中执行语句,A会话的状态为inactive,相关等待事件为SQL*Net message from client;

上面的现象比较怪异,值得怀疑;

进一步查看sql语句,发现select后面有部分列使用了自定义函数,试图去掉自定义函数相关字段,查询,很快正常返回结果;

到这里可以断定是函数导致的问题,经过进一步针对自定义函数以及表中相关列数据的排查,最后确定问题的原因:

对于表中一条特殊数据,自定义函数进入死循环,故sql语句长时间未返回。

问题函数,看看死循环在哪呢:


create or replace function P_StringReplace(Src in varchar2) return varchar2 as
Result varchar2(2000) := '' ;
tempstr varchar2(2000) := '';
tempstr2 varchar2(2000) := '';
begin
tempstr := '';
tempstr2 := '';
if (Src is not null) then
begin
tempstr := substr(Src,instr(Src,'=')+1);
loop
if instr(tempstr,';')<=0 then
exit;
end if;
tempstr2 := substr(tempstr,instr(tempstr,';'),instr(tempstr,'=')-instr(tempstr,';')+1);
tempstr := replace(tempstr, tempstr2, '/');
end loop;
Result := tempstr;
end;
else
begin
Result := '无';
end;
end if;

return(Result);
end P_StringReplace;
Categories: Oracle Tags: ,

debian i686 oracledb install tips

July 21st, 2009 No comments

apt-get install make gcc g++ libc6-dev libstdc++5 libaio1 unzip
mkdir /etc/rc.d
for i in 0 1 2 3 4 5 6 S ; do ln -s /etc/rc$i.d /etc/rc.d/rc$i.d ; done
ln -s /usr/bin/awk /bin/awk
ln -s /usr/bin/basename /bin/basename

Categories: Linux, Oracle Tags: , ,

ORACLE 10G install steps on ubuntu 8.10 x64 server

July 20th, 2009 No comments

安装过程中遇到不少问题,记录一下:

wayne@wayne-laptop:~/.ssh$ ssh -X holly@10.8.66.201

holly@10.8.66.201′s password:

Linux ubuntu108 2.6.24-19-server #1 SMP Wed Jun 18 14:44:47 UTC 2008 x86_64

The programs included with the Ubuntu system are free software;

the exact distribution terms for each program are described in the

individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by

applicable law.

To access official Ubuntu documentation, please visit:

http://help.ubuntu.com/

Last login: Mon Jul 20 09:34:34 2009 from 10.8.10.154

holly@ubuntu108:~$ sudo su -

root@ubuntu108:~#

root@ubuntu108:~# hostname

ubuntu108

root@ubuntu108:~# apt-get update

……

root@ubuntu108:~# apt-get install build-essential libaio1 gawk ksh libmotif3 alien libtool lsb-rpm

……

root@ubuntu108:~# cd /bin

root@ubuntu108:/bin# ls -l /bin/sh

lrwxrwxrwx 1 root root 4 Jul 17 22:33 /bin/sh -> dash

root@ubuntu108:/bin# ln -sf bash /bin/sh

root@ubuntu108:/bin# ls -l /bin/sh

lrwxrwxrwx 1 root root 4 Jul 20 09:45 /bin/sh -> bash

root@ubuntu108:/bin# addgroup oinstall

Adding group `oinstall’ (GID 1001) …

Done.

root@ubuntu108:/bin# addgroup dba

Adding group `dba’ (GID 1002) …

Done.

root@ubuntu108:/bin# addgroup nobody

Adding group `nobody’ (GID 1003) …

Done.

root@ubuntu108:/bin# usermod -g nobody nobody

root@ubuntu108:/bin# useradd -g oinstall -G dba -p oracle -d /home/oracle -s /bin/bash oracle

root@ubuntu108:/bin# passwd -l oracle

Password changed.

root@ubuntu108:/bin# mkdir /home/oracle

root@ubuntu108:/bin# chown -R oracle:dba /home/oracle

root@ubuntu108:/bin# ln -s /usr/bin/awk /bin/awk

root@ubuntu108:/bin# ln -s /usr/bin/rpm /bin/rpm

root@ubuntu108:/bin# ln -s /usr/bin/basename /bin/basename

root@ubuntu108:/bin# ln -s /lib/libgcc_s.so.1 /lib/libgcc_s.so

root@ubuntu108:/bin# mkdir /etc/rc.d

root@ubuntu108:/bin# for i in 0 1 2 3 4 5 6 S ; do ln -s /etc/rc$i.d /etc/rc.d/rc$i.d ; done

root@ubuntu108:/bin# mkdir -p /oracle

root@ubuntu108:/bin# chown -R oracle:dba /oracle

root@ubuntu108:/bin# echo “fs.file-max = 65536″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “kernel.shmall = 2097152″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “kernel.shmmax = 2147483648″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “kernel.shmmni = 4096″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “kernel.sem = 250 32000 100 128″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “net.ipv4.ip_local_port_range = 1024 65535″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “net.core.rmem_default = 262144″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “net.core.rmem_max = 262144″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “net.core.wmem_max = 262144″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “net.core.wmem_default = 262144″ >> /etc/sysctl.conf

root@ubuntu108:/bin# echo “oracle soft nproc 2047″ >> /etc/security/limits.conf

root@ubuntu108:/bin# echo “oracle hard nproc 16384″ >> /etc/security/limits.conf

root@ubuntu108:/bin# echo “oracle soft nfile 1024″ >> /etc/security/limits.conf

root@ubuntu108:/bin# echo “oracle hard nofile 65530″ >> /etc/security/limits.conf

root@ubuntu108:/bin# sysctl -p

kernel.printk = 4 4 1 7

kernel.maps_protect = 1

fs.inotify.max_user_watches = 524288

vm.mmap_min_addr = 65536

net.ipv4.conf.default.rp_filter = 1

net.ipv4.conf.all.rp_filter = 1

fs.file-max = 65536

kernel.shmall = 2097152

kernel.shmmax = 2147483648

kernel.shmmni = 4096

kernel.sem = 250 32000 100 128

net.ipv4.ip_local_port_range = 1024 65535

net.core.rmem_default = 262144

net.core.rmem_max = 262144

net.core.wmem_max = 262144

net.core.wmem_default = 262144

root@ubuntu108:/bin# su – oracle

Your account has expired; please contact your system administrator

su: User account has expired

(Ignored)

oracle@ubuntu108:~$ pwd

/home/oracle

oracle@ubuntu108:~$ wget ftp://usr:pwd@10.8.2.70/Oracle/10g/10201_database_linux_x86_64.cpio.gz–10:06:07– ftp://hollycrm:*password*@10.8.2.70/

……

oracle@ubuntu108:~$ ls

10201_database_linux_x86_64.cpio.gz

oracle@ubuntu108:~$ gunzip 10201_database_linux_x86_64.cpio.gz

oracle@ubuntu108:~$ ls

10201_database_linux_x86_64.cpio

oracle@ubuntu108:~$ cpio -idmv < 10201_database_linux_x86_64.cpio

……

oracle@ubuntu108:~$ ls

10201_database_linux_x86_64.cpio database

oracle@ubuntu108:~$ cd database/

oracle@ubuntu108:~/database$ ./runInstaller -help

./runInstaller: line 52: /home/oracle/database/install/.oui: No such file or directory

oracle@ubuntu108:~/database$ cd install

oracle@ubuntu108:~/database/install$ ls -al

total 364

drwxr-xr-x 5 oracle oinstall 4096 2009-07-20 10:11 .

drwxr-xr-x 6 oracle oinstall 4096 2009-07-20 10:11 ..

-rwxr-xr-x 1 oracle oinstall 28 2005-10-23 15:49 addLangs.sh

-rwxr-xr-x 1 oracle oinstall 76 2005-10-23 15:49 addNode.sh

drwxr-xr-x 2 oracle oinstall 4096 2009-07-20 10:11 images

-rwxr-xr-x 1 oracle oinstall 46448 2005-10-23 15:49 lsnodes

-rwxr-xr-x 1 oracle oinstall 2268 2005-10-23 15:49 oneclick.properties

-rwxr-xr-x 1 oracle oinstall 2387 2005-10-23 15:49 oraparam.ini

-rwxr-xr-x 1 oracle oinstall 6428 2005-10-23 15:49 oraparamsilent.ini

-rwxr-xr-x 1 oracle oinstall 163185 2005-10-23 15:49 .oui

drwxr-xr-x 2 oracle oinstall 4096 2009-07-20 10:11 resource

drwxr-xr-x 2 oracle oinstall 4096 2009-07-20 10:11 response

-rwxr-xr-x 1 oracle oinstall 102612 2005-10-23 15:49 unzip

–这里遇到这个问题,具体原因是由于Oracle数据库64位安装介质使用了32位的安装程序,而系统中没有32位包。

root@ubuntu108:/usr/X11R6# apt-get install ia32-libs

Reading package lists… Done

……

ldconfig deferred processing now taking place

oracle@ubuntu108:/oracle/database$ export DISPLAY=10.8.10.154:0.0

oracle@ubuntu108:/oracle/database$ export LANG=us_EN.UTF-8

oracle@ubuntu108:/oracle/database$ ./runInstaller -ignoreSysPrereqs -record -destinationFile /oracle/orainstall.rsp

……

oracle@ubuntu108:/oracle/database$ ./runInstaller -ignoreSysPrereqs -silent -responseFile /oracle/orainstall.rsp

……

WARNING:A new inventory has been created in this session. However, it has not yet been registered as the central inventory of this system.

To register the new inventory please run the script ‘/home/oracle/oraInventory/orainstRoot.sh’ with root privileges.

If you do not register the inventory, you may not be able to update or patch the products you installed.

The following configuration scripts

/home/oracle/oracle/product/10.2.0/db_1/root.sh

need to be executed as root for configuring the system. If you skip the execution of the configuration tools, the configuration will not be complete and the product wont function properly. In order to get the product to function properly, you will be required to execute the scripts and the configuration tools after exiting the OUI.

The installation of Oracle Database 10g was successful.

Please check ‘/home/oracle/oraInventory/logs/silentInstall2009-07-20_02-41-14PM.log‘ for more details.

这里应该查看/home/oracle/oraInventory/logs/silentInstall2009-07-20_02-41-14PM.log,看是否有报错信息;我安装过程中没有注意,结果后面打补丁报出跟这里一样的错误。

****************************************************************************

root@ubuntu108:~# /home/oracle/oraInventory/orainstRoot.sh

Changing permissions of /home/oracle/oraInventory to 770.

Changing groupname of /home/oracle/oraInventory to oinstall.

The execution of the script is complete

root@ubuntu108:~# /home/oracle/oracle/product/10.2.0/db_1/root.sh

Running Oracle10 root.sh script…

The following environment variables are set as:

ORACLE_OWNER= oracle

ORACLE_HOME= /home/oracle/oracle/product/10.2.0/db_1

Enter the full pathname of the local bin directory: [/usr/local/bin]:

Copying dbhome to /usr/local/bin …

Copying oraenv to /usr/local/bin …

Copying coraenv to /usr/local/bin …

Creating /etc/oratab file…

Entries will be added to the /etc/oratab file as needed by

Database Configuration Assistant when a database is created

Finished running generic part of root.sh script.

Now product-specific root actions will be performed.

****************************************************************************


–打补丁:

–第一次图形安装报错,日志如下(查看日志的过程中,发现这个错误在之前的静默安装日志中已经报错):

INFO: /usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-linux-gnu/4.2.4/../../../libpthread.so when searching for -lpthread

/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-linux-gnu/4.2.4/../../../libpthread.a when searching for -lpthread

/usr/bin/ld: skipping incompatible /usr/bin/../lib/libpthread.so when searching for -lpthread

/usr/bin/ld: skipping incompatible /usr/bin/../lib/libpthread.a when searching for -lpthread

/usr/bin/ld: skipping incompatible /usr/lib/libpthread.so when searching for -lp

INFO: thread

/usr/bin/ld: skipping incompatible /usr/lib/libpthread.a when searching for -lpthread

/usr/bin/ld: cannot find -lpthread

collect2: ld returned 1 exit status

INFO: make[1]: *** [/home/oracle/oracle/product/10.2.0/db_1/sysman/lib/libnmemso.so] Error 1

INFO: make[1]: Leaving directory `/home/oracle/oracle/product/10.2.0/db_1/sysman/lib’

INFO: make: *** [libnmemso] Error 2

INFO: End output from spawned process.

INFO: ———————————-

INFO: Exception thrown from action: make

Exception Name: MakefileException

Exception String: Error in invoking target ‘agent’ of makefile ‘/home/oracle/oracle/product/10.2.0/db_1/sysman/lib/ins_sysman.mk’. See ‘/home/oracle/oraInventory/logs/installActions2009-07-20_02-54-28PM.log’ for details.

Exception Severity: 1

–继续安装包

root@ubuntu108:/home/oracle/oraInventory/logs# sudo apt-get install gcc libaio1 lesstif2 lesstif2-dev make libc6 libc6-i386 libc6-dev-i386 libstdc++5 lib32stdc++6 lib32z1 ia32-libs

……

–报错:

INFO: /usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-linux-gnu/4.2.4/libgcc.a when searching for -lgcc

/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-linux-gnu/4.2.4/libgcc.a when searching for -lgcc

/usr/bin/ld: cannot find -lgcc

INFO: collect2: ld returned 1 exit status

make[1]: *** [/home/oracle/oracle/product/10.2.0/db_1/sysman/lib/libnmemso.so] Error 1

INFO: make[1]: Leaving directory `/home/oracle/oracle/product/10.2.0/db_1/sysman/lib’

INFO: make: *** [libnmemso] Error 2

INFO: End output from spawned process.

INFO: ———————————-

INFO: Exception thrown from action: make

Exception Name: MakefileException

Exception String: Error in invoking target ‘agent’ of makefile ‘/home/oracle/oracle/product/10.2.0/db_1/sysman/lib/ins_sysman.mk’. See ‘/home/oracle/oraInventory/logs/installActions2009-07-20_03-10-03PM.log’ for details.

Exception Severity: 1

–继续安装包

root@ubuntu108:/home/oracle/oraInventory/logs# apt-get install gcc-multilib

……

oracle@ubuntu108:/oracle/Disk1$ ./runInstaller -ignoreSysPrereqs -record -destinationFile /oracle/oraupdate.rsp

……

oracle@ubuntu108:/oracle/Disk1$ ./runInstaller -ignoreSysPrereqs -silent -responseFile /oracle/oraupdate.rsp

……

WARNING:
The following configuration scripts need to be executed as the “root” user.
#!/bin/sh
#Root script to run
/home/oracle/oracle/product/10.2.0/db_1/root.sh
To execute the configuration scripts:
1. Open a terminal window
2. Log in as “root”
3. Run the scripts

The installation of Oracle Database 10g Release 2 Patch Set 3 was successful.
Please check ‘/home/oracle/oraInventory/logs/silentInstall2009-07-20_03-19-54PM.log’ for more details.

****************************************************************************
root@ubuntu108:/home/oracle/oraInventory/logs# /home/oracle/oracle/product/10.2.0/db_1/root.sh
Running Oracle10 root.sh script…

The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME=  /home/oracle/oracle/product/10.2.0/db_1

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file “dbhome” already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]: y
Copying dbhome to /usr/local/bin …
The file “oraenv” already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]: y
Copying oraenv to /usr/local/bin …
The file “coraenv” already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]: y
Copying coraenv to /usr/local/bin …

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
****************************************************************************

vi .bash_profile

umask 022

export ORACLE_HOME=/home/oracle/oracle/product/10.2.0/db_1

export PATH=$PATH:$ORACLE_HOME/bin

export ORACLE_SID=hollyc5

export ORACLE_BASE=/home/oracle

export LD_LIBRARY_PATH=$ORACLE_HOME/lib

export NLS_LANG=AMERICAN_AMERICA.UTF8

export LANG=us_EN.UTF-8

dbca

……

netca

……

最后,ubuntu应该安装的包:

sudo apt-get install gcc libaio1 lesstif2 lesstif2-dev make libc6 libc6-i386 libc6-dev-i386 libstdc++5 lib32stdc++6 lib32z1 ia32-libs gcc-multilib

Categories: Linux, Oracle Tags: , ,

字符型日期字段惹的祸

July 16th, 2009 No comments

今天项目中定时任务报错,影响报表数据;日志表看到错误

ORA-01839: date not valid for month specified

起初怀疑是Oracle的bug;……省略一千字,最后发现是在表中掏出两条’2009-02-29′这样的字符型日期数据……

出于应用可以跨数据库的角度,表中日期型数据都使用varchar2存储,留下了隐患。

这是一个教训:日期型数据就要用日期类型字段存储,记得在tom的书中有类似的描述!
否则可能导致的后果:

  • 可能进入脏数据
  • 也可能导致CBO选择错误的执行计划
Categories: Oracle Tags:

Oracle安装-Linux内核参数-共享内存信号量设置

July 10th, 2009 No comments

Oracle数据库在linux系统上的安装过程中,涉及到设置linux内核参数;

这些内核参数究竟代表什么含义,如何调整这些参数?

共享内存

可以通过ipcs -lm命令查看目前系统共享内存的参数限制:

# ipcs -lm

—— Shared Memory Limits ——–
max number of segments = 4096
max seg size (kbytes) = 1048576
max total shared memory (pages) = 2097152
min seg size (bytes) = 1

这里涉及到3个于共享内存相关的参数:SHMMAX,SHMMNI,SHMALL

SHMMAX

  • 含义:单个共享内存段最大字节数
  • 设置:比SGA略大
  • 查看:cat /proc/sys/kernel/shmmax
  • $ cat /proc/sys/kernel/shmmax
    1073741824

  • 修改:
  • sysctl -w kernel.shmmax=1073741824
    echo “kernel.shmmax=1073741824″ >> /etc/sysctl.conf

SHMMNI

  • 含义:共享内存段最大个数
  • 设置:至少4096
  • 查看:cat /proc/sys/kernel/shmmni
  • # cat /proc/sys/kernel/shmmni
    4096

  • 修改:

# sysctl -w kernel.shmmni=4096
# echo “kernel.shmmni=4096″ >> /etc/sysctl.conf

SHMALL

  • 含义:系统中共享内存页总数
  • 设置:至少ceil(shmmax/PAGE_SIZE);ORACLE DOC 默认值:2097152*4096=8GB
  • 查看:cat /proc/sys/kernel/shmall

$ getconf PAGE_SIZE
4096

# cat /proc/sys/kernel/shmall
2097152

  • 修改:

# sysctl -w kernel.shmall=2097152
# echo “kernel.shmall=2097152″ >> /etc/sysctl.conf

信号量:

当前系统信号量限制:

oracle@lab-rd-01:~$ ipcs -ls

—— Semaphore Limits ——–
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 100

SEMMSL

  • 含义:每个信号量set中信号量最大个数
  • 设置:最小250;对于processes参数设置较大的系统建议设置为processes+10

SEMMNI

  • 含义:linux系统信号量set最大个数
  • 设置:最少128

SEMMNS

  • 含义:linux系统中信号量最大个数
  • 设置:至少32000;SEMMSL * SEMMNI

SEMOPM

  • 含义:semop系统调用允许的信号量最大个数
  • 设置:至少100;或者等于SEMMSL

查看信号量设置:cat /proc/sys/kernel/sem

order:SEMMSL, SEMMNS, SEMOPM, SEMMNI

oracle@lab-rd-01:~$ cat /proc/sys/kernel/sem
250    32000    100    128

修改:

sysctl -w kernel.sem=”250 32000 100 128″
echo “kernel.sem=250 32000 100 128″ >> /etc/sysctl.conf

ORACLE提供了sysresv工具管理共享内存以及信号量,操作系统级别可以通过ipcs管理。

Categories: Linux, Oracle Tags: , , , ,

分页问题

June 18th, 2009 No comments

表T
索引T(id)
select * from (select * from t order by id desc) where rownum<=10;
搜集统计信息,不走索引;
用hint,不走索引;
在zh提醒下,终于想起来是not null约束搞鬼,以前也遇到过,笨死。

Categories: Oracle Tags:

查看数据文件在裸设备中的高水位

June 16th, 2009 No comments

问题来源于这样一个案例:需要把数据库从裸设备中迁移到文件系统上,裸设备大小为10000M一个,但实际占用非常小;

考虑到上面的情况,想到找到裸设备的高水位,然后把高水位下的块dd到文件系统中,从而缩短迁移时间。

如下:


SQL> l
1  select file_id,sum(blocks),min(block_id),max(block_id) from dba_extents
2  where tablespace_name='USERS'
3* group by  file_id
SQL> /

FILE_ID SUM(BLOCKS) MIN(BLOCK_ID) MAX(BLOCK_ID)
---------- ----------- ------------- -------------
4      7080           9          7689

users表空间只有file#为4的一个文件,最大的block_id是7689


SQL> select 7689*8/1024 Mb from dual;

MB
----------
60.0703125

SQL> alter database datafile '/dev/raw/raw4' resize 60m;
alter database datafile '/dev/raw/raw4' resize 60m
*
ERROR at line 1:
ORA-03297: file contains used data beyond requested RESIZE value

SQL> alter database datafile '/dev/raw/raw4' resize 62m;

Database altered.

经过上面确认,可以dd拷贝裸设备的数据到块7689即可。

Categories: Oracle Tags:

sqlplus hang

June 16th, 2009 No comments

环境:ubuntu,10.2.0.1 oracle db
现象:客户端连接数据库时长时间无响应
处理:
以oracle用户登录ubuntu主机,ps -ef | grep pmon查看后台进程,没有问题;
尝试用sqlplus /  as sysdba方式登录数据库,失败,同样无响应;
top查看系统资源占用情况,cpu user20%,sys70%,oracle用户进程平均占4%左右cpu;
怀疑由于cpu繁忙导致无响应;尝试杀掉所有oracle进程,重启数据库;
在杀掉oracle进程后;cpu资源得到释放,恢复正常,再次尝试sqlplus / as sysdba,仍然无响应;
strace sqlplus后,发现进入死循环:

times(NULL)                             = -2144584792
times(NULL)                             = -2144584792
……

跟bug:338461.1非常相像

$ strace /oracle/home/bin/sqlplus -V 2>&1 |less
……
old_mmap(NULL, 385024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0×41794000
gettimeofday({1122996561, 411035}, NULL) = 0
access(“/usr/local/UD/conf/sqlnet.ora”, F_OK) = -1 ENOENT (No such file or directory)
access(“/usr/local/UD/lib/oracle/network/admin/sqlnet.ora”, F_OK) = -1 ENOENT (No such file or directory)
access(“/usr/local/UD/conf/sqlnet.ora”, F_OK) = -1 ENOENT (No such file or directory)
access(“/usr/local/UD/lib/oracle/network/admin/sqlnet.ora”, F_OK) = -1 ENOENT (No such file or directory)
fcntl64(-1218313656, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)
times(NULL) = -1825782405
times(NULL) = -1825782405
times(NULL) = -1825782405
times(NULL) = -1825782405
times(NULL) = -1825782405
times(NULL) = -1825782405
times(NULL) = -1825782405

It is looping on the times() function.

解决:1. 重启主机 or  2. 打10.2.0.2补丁。

Categories: Oracle Tags:

ORA-07445 opidsa 错误处理

June 16th, 2009 No comments

Alert日志中出现少量如下错误:

Thu Apr  9 14:07:26 2009
Errors in file /oracle/admin/hollyc5/udump/hollyc5_ora_14616.trc:
ORA-07445: exception encountered: core dump [opidsa()+386] [SIGSEGV] [Address not mapped to object] [0x000000000] [] []
……

查看了所有trc文件:

[oracle@db-01-001 ~]$ head -20 /oracle/admin/hollyc5/udump/hollyc5_ora_14616.trc
/oracle/admin/hollyc5/udump/hollyc5_ora_14616.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 – 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /oracle/product/10.2.0
System name:    Linux
Node name:    db-01-001.hollycrm.com
Release:    2.6.9-67.ELlargesmp
Version:    #1 SMP Wed Nov 7 14:07:22 EST 2007
Machine:    x86_64
Instance name: hollyc5
Redo thread mounted by this instance: 1
Oracle process number: 51
Unix process pid: 14616, image: oraclehollyc5@db-01-001.hollycrm.com

*** 2009-04-09 14:07:26.294
*** ACTION NAME:(Main session) 2009-04-09 14:07:26.257
*** MODULE NAME:(PL/SQL Developer) 2009-04-09 14:07:26.257
*** SERVICE NAME:(hollyc5) 2009-04-09 14:07:26.257
*** SESSION ID:(486.41443) 2009-04-09 14:07:26.257

发现问题有一个共同点:MODULE NAME:(PL/SQL Developer)  ;似乎对jdbc并无影响;
建议如果再遇到类似问题,考虑尝试升级PL/SQL到最新版本。

Metalink Alert:418531.1记录了这一问题:

Description
After having applied the 10.2.0.3 patch set on any platform, processes from user sessions can core dump (ORA-7445) in opidsa().
The problem is also known to affect 10.2.0.2, but only for Windows platforms when running with 10.2.0.2 Patch 6 upwards.

Workaround or Resolution
No real workaround exists for the problem, but flushing the shared pool after receiving the error may clear the situation to prevent further occurrences for a while. Restarting the instance will also clear the problem for a period. Implementing regular flushes of the shared pool can prevent the problem from occurring (this could eg. be executed via DBMS_JOB outside business hours to minimize the impact).

The problem is fixed in 11g via non-published bug:5648872 and the fix is included in 10.2.0.4.

Categories: Oracle Tags:

LTOM - Oracle数据库故障诊断工具介绍

June 3rd, 2009 No comments

LTOM 是Oracle官方提供的嵌入式在线的系统数据采集和诊断工具,有三个功能:

  • Automatic Hang Detection
  • System Profiler
  • Automatic Session Tracing

官方介绍:

The Lite Onboard Monitor (LTOM) is a java program designed as a real-time diagnostic platform for deployment to a customer site. LTOM differs from other support tools, as it is proactive rather than reactive. LTOM provides real-time automatic problem detection and data collection. LTOM runs on the customer’s UNIX server, is tightly integrated with the host operating system and provides an integrated solution for detecting and collecting trace files for system performance issues. The ability to detect problems and collect data in real-time will hopefully reduce the amount of time it takes to solve problems and reduce customer downtime.

可以从metalink下载ltom412.tar

tar xvf ltom412.tar

参考README进行安装

oracle@wayne-laptop:~/tools/ltom/tom_base/install$ ./autoinstall.sh
TOM_HOME=/home/oracle/tools/ltom/tom_base/install/../tom
rm: cannot remove `autoinstall.log’: No such file or directory

Preparing to install LTOM v4.0.0…

Checking your Oracle Environment…

ORACLE_HOME=/home/oracle/app/10.2.0/db_1
ORACLE_SID=ora10g
TNS_ADMIN=
TOM_HOME=/home/oracle/tools/ltom/tom_base/install/../tom

Is this the correct Oracle environment you wish to use for installing LTOM?
Hit Return to Accept Default:
============================> \c

Continuing the install with this environment

Checking for Java…

./autoinstall.sh: line 176: /tmp/jstout.txt: Permission denied
Java found.

Checking for top utility…
Found top utility.

Checking for vmstat utility…
Found vmstat utility.

Need to run some sql now to create TOM user
Hit Return to allow setup to do this for you, enter other to do this manually:
============================> \c

Tom user created successfully.
Warning…Please change tom password immediately!

Grant user Tom successful.
Create package dbmssupp successful.
Grant execute on dbmssupp successful.

Checking for statspack…
Found statspack.

Enter your email id to get notification of hangs via email…
Hit Return to ignore hang notification via email:
============================> \c

Configuring tom_deploy.properties file…

TOM/LTOM successfully installed.

Please review any messages you received above.

REMEMBER: You must have TOM_HOME defined and java in your PATH
before you can run TOM/LTOM.

安装后需要设置TOM_HOME,然后就可以启动了

oracle@wayne-laptop:~/tools/ltom/tom_base/tom$ export TOM_HOME=`pwd`
oracle@wayne-laptop:~/tools/ltom/tom_base/tom$ echo $TOM_HOME
/home/oracle/tools/ltom/tom_base/tom
oracle@wayne-laptop:~/tools/ltom/tom_base/tom$ ./startltom.sh
Enter username:tom
Enter your password:****

Starting LTOM V4.1.2
LTOM written by Oracle Center of Expertise
Copyright (c)  2008 by Oracle Corporation

Enter 1  to Start Auto Hang Detection
Enter 2  to Stop  Auto Hang Detection
Enter 3  to Start System Profiling
Enter 4  to Stop  System Profiling
Enter 7  to Start Session Tracing
Enter 71 to Display Sessions Traced
Enter 72 to Dump All Trace Buffers
Enter 73 to Dump Specific Trace Buffer
Enter 74 to Stop Specific Session Tracing
Enter 8  to Stop All Session Tracing

Enter S  to Update status
Enter Q  to End Program
CURRENT STATUS: HangDetection=OFF Profiling=OFF AutoRec=OFF SessionRec=OFF

下面,使用1进入Auto Hang Detection,在数据库中模拟library cache pin事件;下面就可以看到相应的挂起信息

Please Select an Option:1

Enter polling frequency in seconds
(Hit return to accept default value of 20 seconds): 5

Enter 1  to Start Auto Hang Detection
Enter 2  to Stop  Auto Hang Detection
Enter 3  to Start System Profiling
Enter 4  to Stop  System Profiling
Enter 7  to Start Session Tracing
Enter 71 to Display Sessions Traced
Enter 72 to Dump All Trace Buffers
Enter 73 to Dump Specific Trace Buffer
Enter 74 to Stop Specific Session Tracing
Enter 8  to Stop All Session Tracing

Enter S  to Update status
Enter Q  to End Program
CURRENT STATUS: HangDetection=ON Profiling=OFF AutoRec=OFF SessionRec=OFF
Please Select an Option:
>>> LTOM has detected a hang at Tue Jun  2 18:36:31 CST 2009
exit value was 1
cmd was rm /home/oracle/tools/ltom/tom_base/tom/tmp/ltomholdfilename.out
HANGANALYZE Level 3 started Tue Jun  2 18:36:31 CST 2009
exit value was 1
cmd was rm /home/oracle/tools/ltom/tom_base/tom/tmp/hangmail.tmp
exit value was 127
cmd was /home/oracle/tools/ltom/tom_base/tom/src/ltommail.sh
HANGANALYZE completed Tue Jun  2 18:36:31 CST 2009
exit value was 1
cmd was rm /home/oracle/tools/ltom/tom_base/tom/tmp/fname.out
exit value was 1
cmd was cat UNKNOWN
>ERROR PROCESSING FILE. MOST LIKELY CAUSE IS A DELETED FILE: UNKNOWN
This cmd failed: cat UNKNOWN
exit value was 1
cmd was rm /home/oracle/tools/ltom/tom_base/tom/tmp/holdfilename.out
SYSTEMSTATE DUMP Level 266 started Tue Jun  2 18:36:31 CST 2009
SYSTEMSTATE DUMP completed Tue Jun  2 18:36:33 CST 2009
exit value was 1
cmd was rm /home/oracle/tools/ltom/tom_base/tom/tmp/fname.out
SYSTEMSTATE DUMP in file UNKNOWN
HANGANALYZE Level 3 started Tue Jun  2 18:37:03 CST 2009
HANGANALYZE completed Tue Jun  2 18:37:03 CST 2009
exit value was 1
cmd was rm /home/oracle/tools/ltom/tom_base/tom/tmp/fname.out
exit value was 1
cmd was cat UNKNOWN
>ERROR PROCESSING FILE. MOST LIKELY CAUSE IS A DELETED FILE: UNKNOWN
This cmd failed: cat UNKNOWN
SYSTEMSTATE DUMP Level 266 started Tue Jun  2 18:37:03 CST 2009
SYSTEMSTATE DUMP completed Tue Jun  2 18:37:04 CST 2009
exit value was 1
cmd was rm /home/oracle/tools/ltom/tom_base/tom/tmp/fname.out
SYSTEMSTATE DUMP in file UNKNOWN

>>> Exiting autohangdetect thread…

……
CURRENT STATUS: HangDetection=OFF Profiling=OFF AutoRec=OFF SessionRec=OFF
Please Select an Option:q

>>> Beginning shutdown. Waiting for all threads to be notified…

>>> Thread 0 is main still waiting to be notified of shutdown…

>>> LTOM successfully shut down.

The following output was produced from running LTOM…

Auto Hang Detection Logs located in $TOM_HOME/hanglog
Other hang trace files may be located in the udump/bdump

进入$TOM_HOME/hanglog查看

>>>WARNING. Hang detected!
>>>Hang Detected while running query. SID= 144
Event= library cache pin  Secs_In_Wait= 48
Tue Jun  2 18:36:31 CST 2009
>>>Info Only…
Total sessions waiting on significant events = 1
Total sessions waiting on rule specified event = 1

进一步可以通过hanganalyze log 和 ubump目录下的system state dump文件获取详细信息。

其他两项功能也比较实用……

Categories: Oracle Tags: