[Oracle Data Recovery] An ORA-00600 [6711] Error Case

On a 10.2.0.4 database running on Linux, the ORA-00600 [6711] internal error kept appearing in the alert log:

 

If you cannot resolve the issue yourself, members of the ASKMACLEAN professional Oracle database recovery team can help you recover!

 

Wed Sep  1 21:24:30 2010
Errors in file /s01/10gdb/admin/YOUYUS/bdump/youyus_smon_5622.trc:
ORA-00600: internal error code, arguments: [6711], [4256248], [1], [4256242], [0], [], [], []
Wed Sep  1 21:24:31 2010
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.

 

 

There is a very brief Note on MOS about the 6711 internal error. It claims that this error most likely indicates potential corruption in some of the cluster-type data dictionary tables, but it does not even explain what the error's arguments mean.
We can, however, work them out ourselves. Since the error is corruption-related, the arguments most likely refer to an obj#, file# or block#. The two numbers 4256248 and 4256242 look very much like Data Block Addresses; treated as DBAs, they point to blocks 61944 and 61938 of datafile 1 respectively. Let us see which object these blocks belong to:

SQL> set linesize 200;
SQL> select segment_name, segment_type
  2    from dba_extents
  3   where relative_fno = 1
  4     and (61938 between block_id and block_id + blocks or
  5         61944 between block_id and block_id + blocks);

SEGMENT_NAME                                                                      SEGMENT_TYPE
--------------------------------------------------------------------------------- ------------------
SMON_SCN_TO_TIME                                                                  CLUSTER
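
For reference, the DBA decoding itself can be cross-checked with the DBMS_UTILITY package (a quick sketch; DATA_BLOCK_ADDRESS_FILE and DATA_BLOCK_ADDRESS_BLOCK simply split a DBA into its file and block components):

-- 4256248 -> file 1, block 61944; running the same with 4256242 gives block 61938
select dbms_utility.data_block_address_file(4256248)  as file#,
       dbms_utility.data_block_address_block(4256248) as block#
  from dual;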

As expected, it is a cluster. SMON_SCN_TO_TIME is the base cluster of the SMON_SCN_TIME table, which records the mapping between SCNs and timestamps in the database. We can look directly at sql.bsq, the script used to create the data dictionary, to learn more about their structure:

cat $ORACLE_HOME/rdbms/admin/sql.bsq|grep -A 24 "create cluster smon_scn_to_time"
create cluster smon_scn_to_time (
  thread number                         /* thread, compatibility */
)
/
create index smon_scn_to_time_idx on cluster smon_scn_to_time
/
create table smon_scn_time (
  thread number,                         /* thread, compatibility */
  time_mp number,                        /* time this recent scn represents */
  time_dp date,                          /* time as date, compatibility */
  scn_wrp number,                        /* scn.wrp, compatibility */
  scn_bas number,                        /* scn.bas, compatibility */
  num_mappings number,
  tim_scn_map raw(1200),
  scn number default 0,                  /* scn */
  orig_thread number default 0           /* for downgrade */
) cluster smon_scn_to_time (thread)
/

create unique index smon_scn_time_tim_idx on smon_scn_time(time_mp)
/

create unique index smon_scn_time_scn_idx on smon_scn_time(scn)
/
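
For reference, the mapping rows themselves can be inspected with a simple query (a sketch; the column names come from the DDL above, and the SCN-to-time conversions in 10g are built on this same mapping):

-- newest mappings first; each row covers a short window of SCN/time pairs
select thread, time_dp, scn, num_mappings
  from sys.smon_scn_time
 order by time_dp desc;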

The script above shows that there are several indexes on this cluster and its table, so we need to go further and validate all of these objects:

SQL> analyze table SMON_SCN_TIME validate structure;
Table analyzed.

SQL> analyze table SMON_SCN_TIME validate structure cascade;
Table analyzed.

SQL> analyze cluster SMON_SCN_TO_TIME validate structure;
Cluster analyzed.

SQL> analyze cluster SMON_SCN_TO_TIME validate structure cascade;
analyze cluster SMON_SCN_TO_TIME validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

At this point the problem is clear: it lies with smon_scn_to_time_idx, the index on the SMON_SCN_TO_TIME cluster, which most likely contains a logical corruption. Fortunately only the index is affected; now that the root cause has been found, the fix is much simpler:

SQL> alter index smon_scn_to_time_idx rebuild ;

Index altered.

/* When an index is corrupted, merely rebuilding it is often ineffective: while we were rebuilding, the ORA-00600 [6711] error appeared in the alert log again !!! */

/* We need to drop the problematic index completely and create it again !!! */

SQL> drop index smon_scn_to_time_idx ;

Index dropped.

SQL> create index smon_scn_to_time_idx on cluster smon_scn_to_time;

Index created.

/* With that, the problem is solved and the error no longer appears in the alert log! */

/* That's great! */
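
/* As a final check (a suggested step rather than part of the original session), the cascade validation that failed earlier can be repeated and should now complete without ORA-01499: */

analyze cluster SMON_SCN_TO_TIME validate structure cascade;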

Comments

  1. Hdr: 304020 7.1.6.2.0 RDBMS 7.1.6.2.0 PRODID-5 PORTID-2 ORA-600
    Abstract: ORA-600 [6711][67112778][1][67123600] AND DB GOES DOWN ON ANY DB ACCESS

    Customer upgraded to version 7.1.6 three weeks ago.
    He created a new tablespace and started to receive error :
    ORA-600 internal arguments: [6711][67112778][67123600]
    during any access to his database.
    He cannot select any data from any table, export tables or drop tablespaces,
    etc.; he receives this error in the alert.log file and the database goes down.
    He can start up the db OK, but he cannot use it.
    Minutes before this he received error ORA-600 [12235] in the alert.log file, but
    this error did not appear anymore.
    *** 08/25/95 11:59 am ***
    *** 08/25/95 12:01 pm ***
    *** 08/25/95 12:23 pm ***
    Alert.log information :
    Thu Aug 24 22:27:16 1995
    alter database open
    Thu Aug 24 22:27:21 1995
    Thread 1 opened at log sequence 187654
    Current log# 2 seq# 187654 mem# 0: /users/oracle/dbs/log05orac.log
    Thu Aug 24 22:27:21 1995
    SMON: enabling cache recovery
    SMON: enabling tx recovery
    Thu Aug 24 22:27:25 1995
    Completed: alter database open
    Thu Aug 24 22:29:51 1995
    Thread 1 advanced to log sequence 187655
    Current log# 8 seq# 187655 mem# 0: /users/oracle/dbs/log02orac.log
    Thu Aug 24 22:42:16 1995
    Errors in file /users/oracle/rdbms/log/ora_641.trc
    ORA-600: internal error code, arguments: [12235], [], [], [], [], [], [], []
    Thu Aug 24 22:46:30 1995
    create tablespace temp2 datafile
    '/mnt1/oracle/dbs/temp2.dbf' size 600 M
    Thu Aug 24 23:03:26 1995
    Completed: create tablespace temp2 datafile
    '/mnt1/oracle/dbs…
    Thu Aug 24 23:03:26
    alter tablespace temp2 online
    Completed: alter tablespace temp2 online
    Thu Aug 24 23:05:48 1995
    Thread 1 advanced to log sequence 187656
    Current log# 9 seq# 187656 mem# 0: /users/oracle/dbs/log03orac.log
    Thu Aug 24 23:07:40 1995
    Errors in file /users/oracle/rdbms/log/smon_579.trc
    ORA-600: internal error code, arguments: [6711], [67112778], [1], [67123600]
    >> This caused the db to go down. It restarts without problem but goes down
    >> on any access to tables
    *** 08/25/95 01:36 pm *** (CHG: Sta->30 Asg->NEW OWNER OWNER)
    *** 08/25/95 01:36 pm ***
    We have a cluster corruption of c_ts#, which contains key tables for space
    management. This will require reloading the database from backup. In order
    to bring up the database long enough to do an export, you can try turning off
    smon space coalescing (event 10269).

    So that we can trouble-shoot this problem, please have the customer send us
    (1) all data files in the SYSTEM tablespace; (2) the controlfiles;
    (3) all redo logs. Note: there may not be enough redo log information since
    the customer is not archiving and the attempted restarts will have caused log
    switches, but we would still like to see what type of corruption occurred in
    the cluster.

    The event was set and they seem to be able to do the export now. The tape with
    the data, control and redo files should be here early next week for debugging.

    If they are able to recreate their database correctly, can you please change the
    priority to 2.

    Customer was able to rebuild their database. Awaiting tape containing the files

    Block 0x4000f4a points to 0x4003990.0, but that row has both next and back
    links pointing to itself.

    It is possible that this was caused by bug 246393. Although that bug has been
    fixed in 7.1.6, it was possible for it to damage the cluster in a way which
    was not detected until later. If the customer has a backup of the database
    prior to the upgrade to 7.1.6, we could see if there was already damage at that
    time. Or, if the customer recreated the database and imported his data after
    upgrading to 7.1.6 but before this occurrence, then we could be sure that the
    corruption occurred after the upgrade. Otherwise, there isn't much to go on
    without archived logs.

    ORACLE BRAZIL replied :
    1)customer has no backup of the database prior to the upgrade to 7.1.6.
    2)customer did not recreate the database and import his data after
    upgrading to 7.1.6 but before this occurrence.
    3)customer has a backup of the database after the upgrade to 7.1.6 and
    before this error. It is a full export.
    If it helps us to continue the analysis,give me an answer as soon as possible
    Customer is not sure about database integrity and he needs our final answer.

    The customer can indeed use "analyze cluster xxx validate structure cascade"
    to find these errors. Since we do not have the information to verify our
    theory about this being 246393, I am changing to status 31. I would
    recommend for all customers to do a validate on c_ts# as part of the process
    of upgrading to 7.1.6. so we have a better chance of seeing whether there is
    still an error in the cluster code.
    *** 09/22/95 07:57 am *** (CHG: Sta->11)
    The customer ran the 'analyze cluster xxx validate structure cascade' on
    their clusters. Got an error on another cluster called 'C_FILE#_BLOCK#' owned
    by SYS. Have the trace file created during 'analyze' as a 2 page hardcopy. Will
    forward to your mailbox (JARNETT). This is after the customer rebuilt the
    database on 7.1.6.2
    Is this similar to bug# 281501? Is it just an 'analyze cluster' bug (277271)?

  2. Hdr: 341524 7.2.3 RDBMS 7.2.3 PRODID-5 PORTID-358 ORA-600
    Abstract: ORA-600[6711][16777283][1][16779447] – DATABASE KEEPS CRASHING WITH THIS ERROR

    This customer is running OPS on an ncr smp machine with 2 instances.
    Last Friday they were not able to shutdown the database normal, so they did
    a shutdown abort instead. Now when they start up either instance, after
    issuing 1 or 2 queries the database gives the above ora-600 error and then
    crashes. The database is about 170gb and they do not have any good back ups.
    The ora-600 arguments are consistent each time:
    ORA-600: internal error code, arguments: [6711], [16777283], [1], [16779447],
    [0], [], [], []
    Current SQL statement for this session:
    select f.file#, f.block#, f.ts#, f.length from fet$ f, ts$ t where t.ts#=f.ts#
    and t.dflextpct!=0
    Call Stack Trace is as follows:
    ----- Call Stack Trace -----
    calling              call     entry                argument values in hex
    location             type     point                (? means dubious value)
    -------------------- -------- -------------------- ----------------------------
    ksdout+0x6d ? ksdwrf 7365676B 302B7669 313678
    85797EC 804602E
    ksfini+0x32 ? ksdout 8045FCC 83B4398 80464F8 1
    83AB92F
    kgedes+0xca ? ksfini 83AB92F 7365736B 2B346369
    61337830 0
    rtbsta+0x1aa ? kgedes 840660C 6373746B 302B626C
    37623578 85F0E00
    ktrexc+0x280 ? rtbsta 858EFF8 858EFF8 858EFF8
    858EFF8 858EFF8
    ksedmp+0x88 ? ksedst 1 FFFFFFFF 8046544 85E6964
    85AD610
    ksfini+0x142 ? ksedmp 3 83B4460 8046580 80D62D7
    85AD5A8
    kgeriv+0xc2 ? ksfini 85AD5A8 3 258 0 1A37
    kgesiv+0x61 ? kgeriv AF430000 0 1A37 4 80465CC
    ksesic4+0x3a ? kgesiv 85AD5A8 85E6964 1A37 4
    80465CC
    kdscgr+0x551 ? ksesic4 1A37 0 1000043 0 1
    rtbfch+0x65c ? kdscgr 85F4BC0 0 4 1 0
    kkscls+0x3ad ? kgidel 85AD5A8 85E742C 85F4748
    80467C0 0
    opiodr+0x987 ? kkscls 5 2 8046CC4 8B136154 0
    smcstk+0x78 ? opiodr 5 2 8046CC4 1 8046C48
    rpiswu2+0x6c7 ? smcstk 80469C4 80CA110 F618 5 2
    rpiswu2+0x4ad ? rpiswu2 8046C48 0 8B30F508 8046C80 0
    rpidrv+0x475 ? rpiswu2 8B136C6C 8040000 8046C80 2
    8046C74
    rpifch+0x28 ? rpidrv 1 8046CCC 8046CC4 0 1
    ktsclb+0x5b7 ? rpifch 1 8046D1C 8046D28 5 89C746CC
    ktmmon+0xa41 ? ktsclb 5 7FFFFFFF 8580BE0 12C 0
    lmmgmalloc+0x93 ? ktmmon 85C27EC 1 85AD600 85AD610 0
    ksbrdp+0x36f ? lmmgmalloc 88007838 1 85C2B74 0 80470D4
    lmmgmalloc+0x93 ? ksbrdp 85C27EC 85C4744 14 0 1
    lxinitc+0x12 ? lxinitc 0 85AD610 85AD610 0 0
    opirip+0x2fb ? ksbrdp 8B31DC5C 85AF090 0 0 0
    lmmgmalloc+0x93 ? opirip 85C27EC 85C2B74 2E 0 85BC094
    lmmgmalloc+0x93 ? lmmgmalloc 85C27EC 85C2B74 10 0 0
    slfimp+0x150 ? lmmgmalloc 85C27EC 85C2B74 10 0 859107C
    lmmstmalloc+0x14 ? lmmstmalloc 84F1669 0 0 80006290 8001193
    lmmstmalloc+0x14 ? lmmstmalloc 0 0 80006290 800119A3
    8000BC1B
    lmmgmalloc+0x93 ? lmmstmalloc 85C27EC 85C2B74 37 0 0
    slfign+0x199 ? _end 85C4694 8047460 37 0 8591088
    lmmstcutlrg+0x14 ? lmmstcutlrg 0 85C8D98 80474A4 85BC094 40
    lmmstlrg+0x9a ? lmmstcutlrg 85C27EC 85C2DF4 85C8D98 400
    80474D0
    lmmstmalloc+0xf4 ? lmmstlrg 85C27EC 85C2B74 85C2DF4
    84F1669 400
    lmmstmalloc+0x14 ? lmmstmalloc 400 0 85C46D4 8002103F
    8003A650
    _end+0x77a4ef96 ? lmmstmalloc 8003AE34 8003AE34 8003A650
    8003ACD4 80011594
    _end+0x77a4ef87 ? _end 8048F10 0 0 0 0
    _end+0x77a4f688 ? _end 8003AC50 3C0 8002ACFA 28
    8003B2F8
    ----- Argument/Register Address Dump -----
    ----- End of Call Stack Trace -----
    I am placing the alert.log generated from when the database originally
    signalled the error and then the alert.log file from this morning when the
    database kept crashing with the ora-600 error, as well as 2 smon trace files
    signalling the ora-600 error. The database is not running in archive log mode
    so we will not be able to take previous logfile dumps.
    At the time when the database would not shutdown normal, the customer had
    been running a create index parallel (degree 8), and also issuing several
    update statements on another table to set col to null. It seemed like there
    was a problem with the create index and update so they killed some of the
    processes at the os level and also did an alter system kill session in Oracle.
    Then they did the shutdown abort, restarted the database, created 2 ts and
    then the 3rd create ts failed with the ora-600[6711] error. Then the database
    crashed and has continued to crash within 1 or 2 statements after bringing
    it back up again.
    The customer would like to get the db up and running as soon as possible but
    they do not have any good backups to restore the db with. Can this problem
    be fixed without having to recreate the db?
    thanks.

    This looks very similar to bug 329698 but I can’t say for sure so far. The
    oracle consultants out on site are very concerned about this situation. The
    customer would like to save this db if at all possible. Please call syamaguc
    for any questions. Also please let me know if there is anything else needed.
    thanks.
    (fyi – I also have dial up access to the customer’s site if needed.)

    select f1.file#, f1.block#, f1.length, f2.block# from fet$ f1, fet$ f2 \
    where f1.file# = f2.file# and f1.rowid != f2.rowid and \
    f1.block# <= f2.block# and f2.block# < f1.block#+f1.length

    select u1.file#, u1.block#, u1.length, u2.block# from uet$ u1, uet$ u2 \
    where u1.file# = u2.file# and u1.rowid != u2.rowid and \
    u1.block# <= u2.block# and u2.block# < u1.block#+u1.length

    select f1.file#, f1.block#, f1.length, f2.block# from fet$ f1, fet$ f2
    2> where f1.file# = f2.file# and f1.rowid != f2.rowid and
    3> f1.block# <= f2.block# and f2.block# < f1.block#+f1.length

    select u1.file#, u1.block#, u1.length, u2.block# from uet$ u1, uet$ u2
    2> where u1.file# = u2.file# and u1.rowid != u2.rowid and
    3> u1.block# <= u2.block# and u2.block# < u1.block#+u1.length

    I am emailing the dial-up info to rpark as requested. Let me know if you
    need anything else. thanks.

    analyze table fet$ validate structure cascade and
    analyze table ts$ validate structure cascade both came back with statement
    processed. I did not try uet$.
    Thanks for the quick attention to this issue.

    I’ve logged out of the dial up session but I had the customer try
    analyze table uet$ validate structure cascade; and this came back with
    statement processed as well.

    1. are there any trace files from the analyze runs?
    2. without redo logs, we most likely will not be able to determine the source
    of the corruption. there appears to be a loop in the cluster chain for
    the fet$/ts$ cluster – you could get block dumps for the 2 dbas listed
    above (i.e. 16777283 & 16779447).

  3. admin says

    Hdr: 350174 7.1.6 RDBMS 7.1.6 RAM DATA PRODID-5 PORTID-319 ORA-600
    Abstract: ORA-600 [6711] BECAUSE OF CORRUPTED C_TS#

    Customer was recently adding some tablespaces to their 3.5G database which had
    a total of about 75 tablespaces and 120 datafiles.
    Later in the week, they began getting ORA-600 [6711] errors when trying to
    grant users privileges on some tablespaces or issue any type of command which
    would affect the tablespace itself (although they could still access objects
    contained in the tablespaces). This problem affected about 1/3 of their
    tablespaces…most of which had been recently added. They also got this error
    when selecting from dba_free_space.
    The ORA-600 errors always pointed to the c_ts# cluster. It seems that one of
    the blocks points to itself and so this error is signaled.

    I have recommended to the customer that they rebuild their database from their
    recent export. Customer is doing that. However, they are VERY concerned
    because this is their third major recovery in the past three months. Earlier
    they had to rebuild due to ORA-600[5199]/[4142] errors and more recently, they
    had an ORA-1578 on a large table (not in SYSTEM) and had to rebuild that table.

    I have asked the customer for the alert log, trace files, block dumps, redo
    log dumps and analyze on c_ts#. They have provided all but the analyze
    output as they have already trashed this db to rebuild. They do have a backup
    though, and if needed, may be able to restore it partially to a test machine
    if there is more information needed.

    Customer would like an explanation for why this corruption occurred. I am
    putting what information we have in bug$.

    Customer says that when he dumps redo for 3e000179, he only gets redo header
    info and no actual redo info. The same is true for 01000820. The only dba
    which produced redo info was 3e000174, and these two files have already been
    provided. The redo they dumped went back to March 12th which the customer
    says was BEFORE they started adding a group of new tablespaces.
    As far as when the database was created: it was created under 7.0.13 and
    migrated up to 7.1.3 and then to 7.1.6. The redo is all from the database
    after it was migrated to 7.1.6.

    Just want to clarify: the redo that they dumped was from logs from March 12th
    forward. The customer does not have ALL the archived logs from the time the
    db was upgraded to 7.1.6; they only have redo logs from March 12th and
    following.

    Ct did analyze cluster c_ts# validate structure cascade to find out if the
    corruption had reoccurred. 3 of their databases had errors on the i_ts# index;
    2 of these dbs were their most critical production dbs.
    One is the same aix db that was rebuilt from the earlier instance in this bug.
    The other database is on an alpha osf platform – also created under 7.1.6.
    I have created 2 sub directories under this bug aix and osf.
    The trace files from the analyze and the block dumps of the i_ts# indexes
    were copied to these directories.
    Ct’s db is currently functioning, but ct is very concerned that the ora-600
    not occur during critical processing tomorrow. The ct needs to know what
    tablespace is at risk from this bug. Also is there any alternative to
    rebuilding the db. They are a critical 24×7 shop. Only 3 hrs per month are
    allotted for system down time. They have already had to rebuild 2 dbs due to
    this bug. Ct has archive logs for the last month for the aix db. They are
    on site ready to provide any further information we require.
    Ct also ran an analyze on cluster containing the uet$ table on the osf system.
    This is the analyze_uet file. It shows several blocks that aren't on the
    free list but should be.

    Have looked at the files.
    The osf analyze output is a known bug in analyze. (bug #277271).
    The check for blocks on the freelist was invalid. The fix was to remove
    this check.
    As for the aix output,
    Please provide block dumps for blocks 0x01000202 and 0x01001d9f.
    The leaf block (0x1000045) indicates that the chain is 2 blocks long, and
    the current start is block 0x1001da4. Block 0x1001da4 shows
    0x01000202 as its previous block, and 0x01001d9f as the next block – indicating
    a chain of at least 3.
    Please also get the redo on these four blocks.

    Lois, we have the files on a test machine as we do not have enough space on
    wrvms. I have put them on tcpyr1.
    Please let me know if you want us to dump these archive logs too.

    The block dumps are in file f7, and the rest of the files contain redo log
    dumps.

    Have looked at the information provided on 3/29. There is redo missing.
    Please provide redo for transaction 06.26.859.
    Please get redo for the three data blocks and the index block, but be
    sure that we have complete redo for blocks 0x1001d9f and 0x1000045.
    Check for the oldest scn on 0x1001d9f, and get redo for the remaining two
    blocks from that time.

    The customer takes hot backup every 12 hrs. They noticed that the backup taken
    on the 12th of march at 8 am was fine as they opened the database and analyzed
    the c_ts# (validate structure cascade), but the one taken on the 12th of march
    at 8 pm failed to analyze the cluster. The log seq no. on the 12th of march
    started around 650, and we are getting dumps of all the four blocks mentioned
    in the bug 0x01000202, 0x01001d9f, 0x1001da4 and 0x1000045.

    The problem was narrowed down to a 15-minute window and the corresponding log
    sequence numbers; there are also two files, readme and finfo. All the files are
    named according to the range of sequence numbers, and so are the alert.logs.

    Please get the redo for txn 06.26.859
    RBA:0x000292:0x00005c71:0x003b to RBA:0x000292:0x00005c85:0x0144

    It appears that during txn 06.26.859, we are generating invalid
    redo for the leaf block. This may be due to a rollback on one of
    the data blocks that is being added to the cluster chain.

    Lois we have the dump of that period. There are 3 new files 2 of them are
    the outputs from analyze and the other is the dump from the redo logs
    (fcorruption, fcorruption1 and f658).

    Have determined the problem in the code. Fix to bug #246393 appears to be
    incomplete, so it is still possible for the chain count to be wrong.
    Am working on a fix.

    Have determined a possible fix. I suggest we take the following approach:
    1. I’ll provide a patch for 7.1.6 that will be minimally tested (eg short
    regress).
    2. After the customer has installed the patch, they should restore the
    backup and rollforward through the transaction that caused the corruption.
    3. If the corruption does not occur, then that will confirm the patch.
    Then, I can work on creating a test case for regression testing.

    Patch in tcpatch. Dir is /u01/patch/IBM_RS6000/7.1.6/bug350174.

    Have rcv’d the results of the analyze cluster validate structure from the
    osf database. The files have been copied to bug$:[.bug350174.osf2]
    please let me know if the index has the corruption – thx

    Unfortunately, analyze shows corruption.
    If they have redo for the index block (this would be the leaf block storing
    keydata for tablespace number 37) and the blocks on the chain that would
    be useful to confirm whether we are seeing the same scenario.
    *** 04/10/96 01:59 pm ***
    Ct has corruption in 2 dbs, one on aix, one on osf (Both critical prod dbs).
    The corruption occurs on ts# 2 on the aix, on ts# 37 on osf.
    Both of these ts contain rollback segments.
    Ct runs with optimal set on their rollback segments.
    Their rollback segments grow and shrink frequently.
    One odd thing – ct says that the rollback segments in ts# 2 on the aix have
    all been offline since the db was rebuilt. The rollback segments in that
    ts are named r01, r02, r03 and r04. We can verify this from the alert logs
    that the ct provided us. Ct will also double check on this.

    We are requesting the experimental patch for the alpha osf platform
    for the same version (7.1.6).

    The ct has a 2nd db on an aix platform that also has the corruption.
    Had ct check what the tablespace is for this db – it is a data tablespace,
    not a rollback tablespace as the other 2 dbs.
    ..
    More interesting experiments the ct has performed:
    Ct restored the corrupted db with the ts#2 datafile offline, rolled forward
    past the time when the corruption occurred and did the analyze.
    The analyze was clean.
    Ct then reloaded the db, restored with the datafile online, rolled forward
    past the time the corruption occurred, then dropped ts#2, did the analyze and
    it still showed the corruption!
    ..
    Ct will be offlining the rollback segments on the critical production osf db.
    Will also offline that tablespace. They are not going to change anything
    with any of the other rollback segments.
    *** 04/11/96 01:02 pm ***
    ..
    have copied more redo for the transaction to bug$:[.bug350174.redo]

    about 1 1/2 hr after ct offlined rollback segments and the assc'd tablespace
    smon generated a trace file with the following errors:
    ora-604
    ora-376
    ora-1110 on this tablespace
    this occurred twice within two minutes (that was 20 min ago)
    application is generating these errors on some activities
    asked ct to check if they are doing set transaction use rollback segment
    for any of the rollback segments in this tablespace.
    *** 04/12/96 08:53 am ***
    ct spoke with the application developer. They checked the code for set
    transaction or references to the tablespace in create statements – there were
    none. curiouser and curiouser. The ct and I have not identified any reason
    why these transactions would require this tablespace to be online!
    ct has also verified that no users have quota on this tablespace and only
    the dba accounts have the unlimited tablespace privilege. Ct has also checked
    the default and temp tablespace for the users and it is not this tablespace.
    ct is wondering if the corruption in the index could be causing something to
    try to assign free extents from this tablespace rather than the intended
    tablespace???

    Current status is:
    Customer has installed the patch on both the aix and osf machines.
    The osf db was patched to workaround the corruption (the count in the index
    was modified to be in agreement with the cluster chain)
    The aix db will be patched also.
    Customer is running analyze every day to determine whether patch is resolving
    the problem.
    At this point, Customer has run 24 hours with no problem.
    In development, we continue to analyze the redo and code, so that we can
    generate a testcase.

    Further analysis of the redo and the code has confirmed that the customer
    encountered the problem that the patch addresses.

    Good news. Was able to devise a testcase that reproduces the problem –
    corruption of c_ts# and the same redo as provided by the customer.
    With this testcase, verified that the fix works on 7.1.6

    Testcase also reproduces in 7.2.2, but not in 7.3.2

    ]] Fixed problem that would result in a corrupted C_TS# cluster.

    I successfully applied the new patch. Please document when the patch can
    be successfully supplied.

    Please DO NOT reopen closed bugs.

  4. wqangliang says

    "The two numbers 4256248 and 4256242 look very much like Data Block Addresses; treated as DBAs, they point to blocks 61938 and 61944 of datafile 1"
    How is this conversion done?
    dbms_utility.data_block_address_file
    dbms_utility.data_block_address_block
    Converting with these functions, I don't get the same result as yours?

    • admin says

      4256248 in binary is 10000001111000111111000; padded out to 32 bits that is
      0000000001 0000001111000111111000.
      Take the first 10 bits and convert to decimal: 1, i.e. datafile 1.
      The remaining 22 bits convert to decimal 61944. That is one way to do it.
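
      Equivalently, the same split can be done with plain arithmetic in SQL (a small sketch of the bit layout described above):

      -- top 10 bits of the DBA = file#, low 22 bits = block#
      select trunc(4256248 / power(2, 22)) as file#,   -- = 1
             mod(4256248, power(2, 22))    as block#   -- = 61944
        from dual;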
