最近准备写点Histogram和density相关的文章,先把术语给大家理一理:
cardinality (CDN)
Legend
CBQT – cost-based query transformation
JPPD – join predicate push-down
OJPPD – old-style (non-cost-based) JPPD
FPD – filter push-down
PM – predicate move-around
CVM – complex view merging
SPJ – select-project-join
SJC – set join conversion
SU – subquery unnesting
OBYE – order by elimination
OST – old style star transformation
ST – new (cbqt) star transformation
CNT – count(col) to count(*) transformation
JE – Join Elimination
JF – join factorization
SLP – select list pruning
DP – distinct placement
qb – query block
LB – leaf blocks
DK – distinct keys
LB/K – average number of leaf blocks per key
DB/K – average number of data blocks per key
CLUF – clustering factor
NDV – number of distinct values
Resp – response cost
Card – cardinality
Resc – resource cost
NL – nested loops (join)
SM – sort merge (join)
HA – hash (join)
CPUSPEED – CPU Speed
IOTFRSPEED – I/O transfer speed
IOSEEKTIM – I/O seek time
SREADTIM – average single block read time
MREADTIM – average multiblock read time
MBRC – average multiblock read count
MAXTHR – maximum I/O system throughput
SLAVETHR – average slave I/O throughput
dmeth – distribution method
1: no partitioning required
2: value partitioned
4: right is random (round-robin)
128: left is random (round-robin)
8: broadcast right and partition left
16: broadcast left and partition right
32: partition left using partitioning of right
64: partition right using partitioning of left
256: run the join in serial
0: invalid distribution method
sel – selectivity
ptn – partition
adop Automatic degree of parallelism
TABLE: Table Name
ALIAS: Table Alias
QBS: Query Block Signature
#ROWS: Number of Rows
#BLKS: Number of Blocks
ARL: Average Row Length
COR: Cardinality Original
CRD: Cardinality Rounded
CCM: Cardinality Computed
CNA: Cardinality Non Adjusted
AVGLEN: Average Column Length
NDV: Number of Distinct Values
NULLS: Number of Nulls in Column
DEN: Column Density
MIN: Minimum Column Value
MAX: Maximum Column Value
TYPE: Histogram Type
#BKTS: Histogram Buckets
UNCOMPBKTS: Histogram Uncompressed Buckets
ENDPTVALS: Histogram End Point Values
OOR: Out-of-Range Predicate
TABLE: Table Name
ALIAS: Table Alias
INDEX: Index Name
QBS: Query Block Signature
LVLS: Index Levels
#LB: Number of Leaf Blocks
#DK: Number of Distinct Keys
LB/K: Average Number of Leaf Blocks Per Key
DB/K: Average Number of Data Blocks Per Key
CLUF: Clustering Factor
INDEX_COLS: Index Column Numbers
COST: Cost of the Join
CARD: Cardinality of the Join
BC: Best Cost
LINE#: Line Number in the 10053 Trace File Where Cost Value is Located
JOIN#: Join Number in the 10053 Trace File Associated With Key
STATUS: If Permutation was Computed for all Table Joins the Status = COMPL. If Not, status = ABORT
*: In ANY Column Indicates Value Not Found in File
Freq 频率直方图
HtBal 高度平衡直方图
关于 UNCOMPBKTS和ENDPTVALS
当直方图类型为frequency histograms( Histogram: Freq)时UncompBkts 等于统计信息中采样的总行数-NULLS(Card: Original- NULLS,因为dbms_stats默认是auto_sample_size采样,所以这栏其实是采样到的原始Card-NULLS), 而EndPtVals 等于bucket总数,或者说NDV,因为frequency histograms中 NDV=number of buckets
当直方图类型为height balanced histograms (Histogram: HtBal) UncompBkts 等于bucket的数目(其实也等于10053 trace中#Bkts的数目),而EndPtVals 等于已经被压缩的Histogram的大小,其实是等于: select count(*) from dba_tab_histograms where table_name=’YOUR_TABLE_NAME’ and column_name=’YOUR_COLUMN_NAME’的实际总和。 通过这2个值对比,可以了解到popular值的多少以及数据的倾斜度, 是有多个大量重复的值(popular value)还是仅有一个巨大的重复值。
很牛x