When a SCSI disk error occurs, collect at least the following data:
/var/log/messages*
/sbin/fdisk -l
/bin/cat /proc/scsi/scsi
/bin/dmesg
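The collection step can be scripted. The following is a minimal Python sketch, assuming Python 3 is available on the affected host and the script is run as root. The function name collect_scsi_diagnostics, the output file names, and the output directory are hypothetical choices made for illustration.

```python
import glob
import shutil
import subprocess
from pathlib import Path

# Commands from the list above, keyed by a (hypothetical) output file name.
COMMANDS = {
    "fdisk-l.txt": ["/sbin/fdisk", "-l"],
    "proc-scsi.txt": ["/bin/cat", "/proc/scsi/scsi"],
    "dmesg.txt": ["/bin/dmesg"],
}


def collect_scsi_diagnostics(out_dir, commands=COMMANDS,
                             log_glob="/var/log/messages*"):
    """Save each command's output and copy matching log files into out_dir.

    Returns the list of file names written, in the order they were saved.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for name, argv in commands.items():
        try:
            result = subprocess.run(argv, stdout=subprocess.PIPE,
                                    stderr=subprocess.STDOUT, check=False)
            data = result.stdout
        except OSError as exc:
            # Binary missing or not executable: record that instead of aborting.
            data = f"could not run {argv}: {exc}\n".encode()
        (out / name).write_bytes(data)
        saved.append(name)
    # Copy the current and rotated syslog files.
    for path in sorted(glob.glob(log_glob)):
        shutil.copy(path, out)
        saved.append(Path(path).name)
    return saved
```

Calling collect_scsi_diagnostics("/tmp/scsi-diag") would leave one file per command plus copies of the messages logs in that directory, ready to be reviewed or attached to a support request.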
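Once you have collected the data above, check the logs for records related to SCSI errors.
If the errors keep occurring on one particular device, consider replacing that device. If the errors occur on different devices on the same bus, further diagnosis is needed.
In the examples below, id is the device's SCSI target ID.
In the following example we see channel 0, id=1, and lun=0, and every error line points to the same id. We can also see errors reported for different sectors of that disk.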
/var/log/messages
Dec 20 10:33:23 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
Dec 20 10:33:23 localhost kernel: I/O error: dev 08:03, sector 0
Dec 20 10:33:23 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
Dec 20 10:33:23 localhost kernel: I/O error: dev 08:03, sector 13631520
Dec 20 10:33:23 localhost kernel: EXT3-fs error (device sd(8,3)): ext3_get_inode_loc: unable to read inode block - inode=852064, block=1703940
Dec 20 10:33:23 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
Dec 20 10:33:23 localhost kernel: I/O error: dev 08:03, sector 0
Dec 20 10:33:23 localhost kernel: EXT3-fs error (device sd(8,3)) in ext3_reserve_inode_write: IO failure
Dec 20 10:33:24 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
Dec 20 10:33:24 localhost kernel: I/O error: dev 08:03, sector 0
Dec 20 10:33:24 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
Dec 20 10:33:24 localhost kernel: I/O error: dev 08:03, sector 13631520
Dec 20 10:33:24 localhost kernel: EXT3-fs error (device sd(8,3)): ext3_get_inode_loc: unable to read inode block - inode=852064, block=1703940
Dec 20 10:33:24 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
Dec 20 10:33:24 localhost kernel: I/O error: dev 08:03, sector 0
Dec 20 10:33:24 localhost kernel: EXT3-fs error (device sd(8,3)) in ext3_reserve_inode_write: IO failure
Dec 20 10:33:24 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
Dec 20 10:33:24 localhost kernel: I/O error: dev 08:03, sector 0
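Checking whether every error points at the same device can be automated by tallying the errors per SCSI address. The following Python sketch assumes the 2.4-era message format shown in the excerpt above; the function name count_scsi_errors is hypothetical.

```python
import re
from collections import Counter

# Matches the "SCSI disk error" message format seen in the log excerpt above.
SCSI_ERROR = re.compile(
    r"SCSI disk error : host (\d+) channel (\d+) id (\d+) lun (\d+) "
    r"return code = ([0-9a-fA-F]+)"
)


def count_scsi_errors(lines):
    """Return a Counter keyed by (host, channel, id, lun)."""
    counts = Counter()
    for line in lines:
        match = SCSI_ERROR.search(line)
        if match:
            host, channel, target, lun = (int(g) for g in match.groups()[:4])
            counts[(host, channel, target, lun)] += 1
    return counts


sample = [
    "Dec 20 10:33:23 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000",
    "Dec 20 10:33:23 localhost kernel: I/O error: dev 08:03, sector 0",
    "Dec 20 10:33:23 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000",
]
print(count_scsi_errors(sample))  # Counter({(0, 0, 1, 0): 2})
```

If all counts land on a single key, as here, that device is the replacement candidate. Counts spread over several ids on the same host and channel correspond to the case that needs further diagnosis.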
Another example:
Sep 21 23:35:41 localhost kernel: klogd 1.4.1, log source = /proc/kmsg started.
Sep 21 23:35:41 localhost kernel: Inspecting /boot/System.map-2.4.18-17.7.x.4smp
Sep 21 23:35:41 localhost kernel: Loaded 17857 symbols from /boot/System.map-2.4.18-17.7.x.4smp.
Sep 21 23:35:41 localhost kernel: Symbols match kernel version 2.4.18.
Sep 21 23:35:41 localhost kernel: Loaded 256 symbols from 11 modules.
Sep 21 23:35:41 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
Sep 21 23:35:41 localhost kernel: I/O error: dev 08:17, sector 66453508
Sep 21 23:35:41 localhost kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 27010000
:
:
Sep 21 23:35:49 localhost kernel: scsi : aborting command due to timeout : pid 43891492, scsi0, channel 0, id 1, lun 0 Write (10) 00 00 4b ae 5b 00 00 02 00
Sep 21 23:35:49 localhost kernel: mptscsih: OldAbort scheduling ABORT SCSI IO (sc=c2db7200)
Sep 21 23:35:49 localhost kernel: IOs outstanding = 5
Sep 21 23:35:49 localhost kernel: scsi : aborting command due to timeout : pid 43891493, scsi0, channel 0, id 1, lun 0 Write (10) 00 00 43 2e 5d 00 00 02 00
:
:
Sep 21 23:35:49 localhost kernel: mptscsih: ioc0: Issue of TaskMgmt Successful!
Sep 21 23:35:49 localhost kernel: SCSI host 0 abort (pid 43891492) timed out - resetting
Sep 21 23:35:49 localhost kernel: SCSI bus is being reset for host 0 channel 0.
Sep 21 23:35:50 localhost kernel: mptscsih: OldReset scheduling BUS_RESET (sc=c2db7200)
Sep 21 23:35:50 localhost kernel: IOs outstanding = 6
Sep 21 23:35:50 localhost kernel: SCSI host 0 abort (pid 43891493) timed out - resetting
:
:
Sep 21 23:35:51 localhost kernel: SCSI host 0 reset (pid 43891492) timed out again -
Sep 21 23:35:51 localhost kernel: probably an unrecoverable SCSI bus or device hang.
Sep 21 23:35:51 localhost kernel: SCSI host 0 reset (pid 43891493) timed out again -
Sep 21 23:35:51 localhost kernel: SCSI Error Report =-=-= (0:0:0)
Sep 21 23:35:51 localhost kernel: SCSI_Status=02h (CHECK CONDITION)
Sep 21 23:35:51 localhost kernel: Original_CDB[]: 28 00 02 B1 4E 62 00 00 04 00
Sep 21 23:35:51 localhost kernel: SenseData[12h]: 70 00 06 00 00 00 00 0A 00 00 00 00 29 02 02 00 00 00
Sep 21 23:35:51 localhost kernel: SenseKey=6h (UNIT ATTENTION); FRU=02h
Sep 21 23:35:51 localhost kernel: ASC/ASCQ=29h/02h "SCSI BUS RESET OCCURRED"
Sep 21 23:35:51 localhost kernel: SCSI Error Report =-=-= (0:1:0)
Sep 21 23:35:51 localhost kernel: SCSI_Status=02h (CHECK CONDITION)
Sep 21 23:35:51 localhost kernel: Original_CDB[]: 2A 00 00 45 EE 5F 00 00 02 00
Sep 21 23:35:51 localhost kernel: SenseData[12h]: 70 00 06 00 00 00 00 0A 00 00 00 00 29 02 02 00 00 00
Sep 21 23:35:51 localhost kernel: SenseKey=6h (UNIT ATTENTION); FRU=02h
Sep 21 23:35:51 localhost kernel: ASC/ASCQ=29h/02h "SCSI BUS RESET OCCURRED"
Sep 21 23:35:51 localhost kernel: md3: no spare disk to reconstruct array! -- continuing in degraded mode
Sep 21 23:35:51 localhost kernel: md: updating md2 RAID superblock on device
Sep 21 23:35:52 localhost kernel: md: (skipping faulty sdb5 )
Sep 21 23:35:52 localhost kernel: md: sda5 [events: 00000012](write) sda5's sb offset: 4192832
Sep 21 23:35:52 localhost kernel: raid1: sda7: redirecting sector 30736424 to another mirror
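The SenseData bytes in the error report above carry the same sense key and ASC/ASCQ values that the driver prints on the following lines. As a sketch of how they relate, the Python function below decodes fixed-format SCSI sense data (response code 70h or 71h), where the sense key is the low nibble of byte 2 and the ASC and ASCQ are bytes 12 and 13. The function name decode_fixed_sense and the abbreviated name table are illustrative assumptions, not part of any driver.

```python
# Subset of the standard SCSI sense key names; other keys are reported as "UNKNOWN".
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}


def decode_fixed_sense(hex_bytes):
    """Decode sense key, ASC and ASCQ from a string of hex bytes."""
    data = bytes.fromhex(hex_bytes)
    if len(data) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    if data[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = data[2] & 0x0F
    return {
        "sense_key": key,
        "sense_key_name": SENSE_KEYS.get(key, "UNKNOWN"),
        "asc": data[12],
        "ascq": data[13],
    }


info = decode_fixed_sense("70 00 06 00 00 00 00 0A 00 00 00 00 29 02 02 00 00 00")
print("SenseKey=%Xh (%s) ASC/ASCQ=%02Xh/%02Xh"
      % (info["sense_key"], info["sense_key_name"], info["asc"], info["ascq"]))
```

For the bytes logged above this yields sense key 6h (UNIT ATTENTION) with ASC/ASCQ 29h/02h, matching the driver's own "SCSI BUS RESET OCCURRED" line.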
More examples:
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Write (10) 00 06 1d 3a 0d 00 00 08 00
Info fld=0x61d3a0d, Deferred sd08:02: sense key Medium Error
Additional sense indicates Write error
I/O error: dev 08:02, sector 102498376
SCSI Error: (0:0:0) Status=02h (CHECK CONDITION)
Key=3h (MEDIUM ERROR); FRU=0Ch
ASC/ASCQ=0Ch/02h ""
CDB: 2A 00 07 F9 3A 2D 00 00 08 00
scsi0: ERROR on channel 0, id 0, lun 0, CDB: Write (10) 00 07 f9 3a 2d 00 00 08 00
Info fld=0x8153a0d, Deferred sd08:02: sense key Medium Error
Additional sense indicates Write error - auto reallocation failed
I/O error: dev 08:02, sector 133693544
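The "dev 08:02" and "dev 08:17" fields in these I/O error lines are a major:minor device number pair. The sketch below maps such a pair to a disk or partition name under two assumptions that should be checked against the affected system: that the kernel prints both numbers in hexadecimal (as 2.4 kernels do), and that the device lives on block major 8, the first SCSI disk major, where each disk owns 16 consecutive minors. The function name sd_name_from_dev is hypothetical.

```python
import string


def sd_name_from_dev(dev):
    """Map a 'MM:mm' hex pair such as '08:17' to a name such as 'sdb7'.

    Only block major 8 (sda through sdp) is handled in this sketch.
    """
    major_text, minor_text = dev.split(":")
    major = int(major_text, 16)
    minor = int(minor_text, 16)
    if major != 8:
        raise ValueError("only SCSI disk major 8 is handled here")
    # Each disk owns 16 minors: minor 0 of the group is the whole disk,
    # minors 1 to 15 are its partitions.
    disk, partition = divmod(minor, 16)
    name = "sd" + string.ascii_lowercase[disk]
    return name if partition == 0 else f"{name}{partition}"


for dev in ("08:02", "08:03", "08:17"):
    print(dev, "->", sd_name_from_dev(dev))
```

Under those assumptions 08:02 and 08:03 are partitions of the first disk (sda2, sda3), while 08:17 is hex minor 0x17 = 23, that is, partition 7 of the second disk (sdb7).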
Identify the device:
cat /proc/scsi/scsi
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SEAGATE  Model: ST373307LC  Rev: 0007
  Type:   Direct-Access  ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: SEAGATE  Model: ST373307LC  Rev: 0007
  Type:   Direct-Access
Sector 3228343
scsi0 (1:0): rejecting I/O to offline device
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:0, o:1, dev:sda1
 disk 1, wo:1, o:0, dev:sdb1
RAID1 conf printout:
 --- wd:1 rd:2
 disk 0, wo:0, o:1, dev:sda1
scsi0 (1:0): rejecting I/O to offline device
md: write_disk_sb failed for device sdb2
md: errors occurred during superblock update, repeating
scsi0 (1:0): rejecting I/O to offline device
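To tie the host, channel, id, and lun from an error message back to a physical drive, the /proc/scsi/scsi listing can be parsed into a lookup table. This is a minimal Python sketch; the function name parse_proc_scsi is hypothetical, and the pattern assumes the Host/Vendor/Model/Rev layout shown in the listing earlier in this section.

```python
import re

# One entry: a "Host:" line followed by a "Vendor: ... Model: ... Rev: ..." line.
ENTRY = re.compile(
    r"Host:\s*scsi(\d+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)\s+"
    r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(\S+)"
)


def parse_proc_scsi(text):
    """Return {(host, channel, id, lun): {'vendor', 'model', 'rev'}}."""
    devices = {}
    for m in ENTRY.finditer(text):
        address = tuple(int(m.group(i)) for i in range(1, 5))
        devices[address] = {
            "vendor": m.group(5),
            "model": m.group(6),
            "rev": m.group(7),
        }
    return devices


listing = """\
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SEAGATE  Model: ST373307LC  Rev: 0007
  Type:   Direct-Access  ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: SEAGATE  Model: ST373307LC  Rev: 0007
  Type:   Direct-Access
"""
devices = parse_proc_scsi(listing)
print(devices[(0, 0, 1, 0)])
```

On a live system the text would come from reading /proc/scsi/scsi; the key (0, 0, 1, 0) corresponds to "host 0 channel 0 id 1 lun 0" in the kernel messages.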
Use smartmontools to collect information about the disk drive. The following is an example of smartctl output:
[root@xxx-a log]# smartctl -a /dev/sda1
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: SEAGATE ST360057SSUN600G Version: 0B25
Serial number: 001112223333
Device type: disk
Transport protocol: SAS
Local Time is: Wed Oct 10 09:07:02 2012 PDT
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
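The health line at the end of that output is the part most worth checking automatically. The sketch below, in Python, extracts it from smartctl text and optionally runs smartctl -H to obtain it. It assumes smartctl is installed and on the PATH; the function names parse_health and smart_health are hypothetical. SCSI and SAS drives report "SMART Health Status", while ATA drives report "SMART overall-health self-assessment test result", so the pattern accepts both.

```python
import re
import subprocess

HEALTH = re.compile(
    r"^(?:SMART Health Status"
    r"|SMART overall-health self-assessment test result):\s*(.+)$",
    re.MULTILINE,
)


def parse_health(output):
    """Return the health string from smartctl output, or None if absent."""
    match = HEALTH.search(output)
    return match.group(1).strip() if match else None


def smart_health(device):
    """Run `smartctl -H` on the given device and return its health string."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )
    return parse_health(result.stdout)


sample_output = """\
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
"""
print(parse_health(sample_output))  # OK
```

A result other than OK (SCSI/SAS) or PASSED (ATA), or a result of None, is a reason to inspect the full smartctl -a output for that drive.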