文章来源: http://blog.sina.com.cn/s/blog_544a710b0101a6xv.html
上一篇测试了通过pg_resetxlog来清理WAL,我们还可以通过它来恢复pg_control。
pg_control在$PGDATA/global下,很小,很不起眼。但一旦这个文件被损坏,PG就启不来。 cd $PGDATA/global mv pg_control ./..
看日志,可以看到以下信息:
PANIC: could not open control file "global/pg_control": No such file or directory LOG: checkpointer process (PID 5989) was terminated by signal 6: Aborted LOG: terminating any other active server processes WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. LOG: archiver process (PID 5993) exited with exit code 1 LOG: all server processes terminated; reinitializing pg_controldata pg_controldata: could not open file "/database/pgdata/global/pg_control" for reading: No such file or directory
这时我们只能去恢复pg_control了。
命令不难,难的是如何设置参数值!
-x: A safe value for the next transaction ID (-x) can be determined by looking for the numerically largest file name in the directory pg_clog under the data directory, adding one, and then multiplying by 1048576. Note that the file names are in hexadecimal. It is usually easiest to specify the option value in hexadecimal too. For example, if 0011 is the largest entry in pg_clog, -x 0x1200000 will work (five trailing zeroes provide the proper multiplier). cd $PGDATA/pg_clog ls -l -rw------- 1 postgres postgres 8.0K Mar 7 14:58 0000 根据上述规则,-x = 0000+1跟5个0=0x000100000 (2进制换算成16进制,*1048576 = 加5个0) ################################################################################################# -m: A safe value for the next multitransaction ID (-m) can be determined by looking for the numerically largest file name in the directory pg_multixact/offsets under the data directory, adding one,and then multiplying by 65536. As above, the file names are in hexadecimal, so the easiest way to do this is to specify the option value in hexadecimal and add four zeroes cd $PGDATA/pg_multixact/offsets ls -l -rw------- 1 postgres postgres 8.0K Mar 7 14:58 0000 所以:-m = 0000+1 跟4个0 = 0x00010000 (2进制换算成16进制,*65536 = 加4个0) ################################################################################################# -O: A safe value for the next multitransaction offset (-O) can be determined by looking for the numerically largest file name in the directory pg_multixact/members under the data directory, adding one, and then multiplying by 65536. As above, the file names are in hexadecimal, so the easiest way to do this is to specify the option value in hexadecimal and add four zeroes cd $PGDATA/pg_multixact/members ls -l -rw------- 1 postgres postgres 8.0K Feb 28 09:04 0000 所以:-O = 0000+1 跟4个0 = 0x00010000 (2进制换算成16进制,*65536 = 加4个0) ################################################################################################# -l: TheWAL starting address (-l) should be larger than anyWAL segment file name currently existing in the directory pg_xlog under the data directory. These names are also in hexadecimal and have three parts. The first part is the “timeline ID” and should usually be kept the same. Do not choose a value larger than 255 (0xFF) for the third part; instead increment the second part and reset the third part to 0. For example, if 00000001000000320000004A is the largest entry in pg_xlog, -l 0x1,0x32,0x4B will work; but if the largest entry is 000000010000003A000000FF, choose -l 0x1,0x3B,0x0 or more. cd $PGDATA/pg_xlog ls -l total 129M -rw------- 1 postgres postgres 56 Feb 28 10:40 00000002.history -rw------- 1 postgres postgres 64M Mar 7 15:03 000000030000000D0000002D -rw------- 1 postgres postgres 64M Mar 7 15:21 000000030000000D0000002E drwx------ 2 postgres postgres 32K Mar 7 15:21 archive_status 这里的算法,根据“larger than anyWAL segment file name currently existing in the directory pg_xlog under the data directory”, 所以: TLI (timeline ID)=0x00000003(同第一段) FILE = 0x0000000D (同第二段) SEG = 0x0000002E + 1 = 0x0000002F -l = 0x3,0xD,0x2F (这里前置的0都去处了) ###########################################################################################
-e和-o不清楚怎么设置,官方文档也没有介绍,幸亏提到is not critical,所以就不管了。
接下来就开始恢复了。
1,
cd $PGDATA/global
touch pg_control
2,
pg_resetxlog -x 0x100000 -m 0x10000 -O 0x10000 -l 0x3,0xD,0x2F -f $PGDATA
pg_resetxlog: pg_control exists but is broken or unknown version; ignoring it
Transaction log reset
3,
看下control内容
pg_controldata
pg_control version number: 922
Catalog version number: 201204301
Database system identifier: 5852524061837064188
Database cluster state: shut down
pg_control last modified: Thu 07 Mar 2013 05:02:24 PM CST
Latest checkpoint location: D/BC000020
Prior checkpoint location: 0/0
Latest checkpoint’s REDO location: D/BC000020
Latest checkpoint’s TimeLineID: 3
Latest checkpoint’s full_page_writes: off
Latest checkpoint’s NextXID: 0/1048576
Latest checkpoint’s NextOID: 10000
Latest checkpoint’s NextMultiXactId: 65536
Latest checkpoint’s NextMultiOffset: 65536
Latest checkpoint’s oldestXID: 2296015872
Latest checkpoint’s oldestXID’s DB: 0
Latest checkpoint’s oldestActiveXID: 0
Time of latest checkpoint: Thu 07 Mar 2013 05:02:24 PM CST
Minimum recovery ending location: 0/0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: minimal
Current max_connections setting: 100
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 67108864
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
红色标注部分已改变。
4,pg_ctl start -D $PGDATA
OK
Comment