02

05

ArcSDE底层研究[转]

最近研究ArcSDE的后台,使用的是ArcSDE9.2 + oracle9.2.0.4 情况如下:
通过ArcSDE打开一张Feature表DLTB_H_320982100,所对应的F表是F93,所对应的S表是S93 看如下情况:
1.在ArcMap中通过ArcSDE加载该图层的全部数据,以下SQL是通过Oracle后台抓的
Select /*+ FULL (F_) */ F_.fid,F_.numofpts,F_.entity,F_.points,F_.rowid
FROM DF.F93 F_
说明:OK,没有问题,全表扫描

2.打开图层的属性表
Select  OBJECTID,  BSM,  YSDM,  TBYBH,  TBBH,  DLBM,  DLMC,  QSXZ,  ZLDWDM,  ZLDWMC,  QSDWDM,  QSDWMC,  GDLX,  KCLX,  KCDLBM,  TKXS,  TBMJ,  XZDWMJ,  LXDWMJ,  TKMJ,  TBDLMJ,  PZWH,  BGJLH,  BGRQ,  SSQY,  FLDM,  XZQDM,  QSZDH,  SYZDH,  SZTFH,  PCMJ,  JLRQ,  GXRQ,  BZ,  SJYT,  PDJB,  YDLBM,  SHAPE.AREA,  SHAPE.LEN
FROM  DF.DLTB_H_320982100 , DF.F93 SHAPE
Where (OBJECTID in (22677,22679,22680,22708,22719,22732,22742,22783,22810,18630,23281,9202,9205,9221,24210,24227,24259,24276,24302,24309,5128,9,10,12,16,39,67,73,79,98,118,122,150,159,187,219,234,237,256,305,343,345,348,351,388,457,1118,403,405,416,1659,1121,2439,2449,3424,4089,4779,5108,5116,5125,317,320,324,338,1665,3176,3183,3196,3519,4133,4138,466,469,481,495,23825,23833,23844,23848,23849,23861,23887,23901,23926,23953,23958,17354,20015,20017,20027,20036,20039,20063,20104,20111,20127,20146,16464,22517,22535,22580,22602,22613,22616,22639,22641,22655,22673,10329,10336,10337,10365,10369,10379,10384,10401,10420,10450,10471,5983,5988,5996,6018,6412,6415,7593,7600,12950,12967,12973,13236,13245,13278,13296,13350,13424,13426,13436,13478,13485,23122,23135,23185,23198,23199,23202,23232,23256,23271,23274)) and SHAPE.FID(+) = DF.DLTB_H_320982100.SHAPE
说明:ArcSDE比较聪明,只查了大概150的数据,在我们滚动滚动条时,再继续查后面或前面的数据,只有滚动到一定程序才会查!

3.通过一个属性过滤一部分数据:如查询属性SSQY='320982100211'的所有记录(注:SDE后台只查看图形,不涉及到属性)
Select  SHAPE,   SHAPE.fid,SHAPE.numofpts,SHAPE.entity,SHAPE.points,SHAPE.rowid
FROM  DF.DLTB_H_320982100 , DF.F93 SHAPE
Where (( SSQY = '320982100211' )) and SHAPE.FID(+) = DF.DLTB_H_320982100.SHAPE

收集统计信息,查看执行计划:
Description                             研究对象     对象名称                耗费        基数        字节        IO耗费
------------------------------------------------------------------------------------------------------------
Select STATEMENT, GOAL = CHOOSE                                                147        907        47164        147        
HASH JOIN OUTER                                                        147        907        47164        147        
  TABLE ACCESS FULL                        DF        DLTB_H_320982100        89        907        27210        89        
  TABLE ACCESS FULL                        DF        F93                        55        25385        558470        55

我觉得很有问题,关于这条语句,因为如果当这个图层数据量非常庞大的时侯,需要对business表进行全表扫描,对相应的F表也进行全表扫描,然后再 HASH连接,这样就搞得Oracle I/O非常大,速度也很慢。如果一开始打开多张图层的话,比如10个图层,而这10个图层数据量比较大的话,那就恐怖了!

4.在打开的图形中平移操作
Select /*+ LEADING INDEX(S_ S93_IX1) INDEX(SHAPE F93_UK1) INDEX(DLTB_H_320982100 A93_IX1) */  SHAPE  ,S_.eminx,S_.eminy,S_.emaxx,S_.emaxy ,SHAPE.fid,SHAPE.numofpts,SHAPE.entity,SHAPE.points,SHAPE.rowid
FROM (Select  /*+ INDEX(SP_ S93_IX1) */ DISTINCT sp_fid, eminx, eminy, emaxx, emaxy
        FROM DF.S93 SP_
        Where SP_.gx >= :1
        AND SP_.gx <= :2
        AND SP_.gy >= :3
        AND SP_.gy <= :4
        AND SP_.eminx <= :5
        AND SP_.eminy <= :6
        AND SP_.emaxx >= :7
        AND SP_.emaxy >= :8) S_ ,  DF.DLTB_H_320982100 , DF.F93 SHAPE
Where S_.sp_fid = SHAPE.fid
AND S_.sp_fid = DF.DLTB_H_320982100.SHAPE
AND  (( SSQY = '320982100211' ))

说明:没有这几个参数的值,不知道怎么算的.

关于索引和全表扫描的问题,大家可以参考Oracle的论坛,这里就不多说了!
不过,如果通过某一字段访问大一点的数据量,就很容易引起全表扫描。

打个比方说吧:表里的数据行数是100W行,而只想通过某个字段去过滤访问2000行记录,如果使用B*Tree索引,很有可能是全表扫描。除非表中数据行按这个索引字段,规则排布!
而查看过ArcSDE在Oracle中的架构,基本上F表上的索引和S表上的索引都是B*Tree索引,感觉很不行啊!

有什么方法,可以不进行全表扫描吗?
难不成,从这条语句就把ArcSDE给定死了,图层中的数据越少越好?
Select  SHAPE,   SHAPE.fid,SHAPE.numofpts,SHAPE.entity,SHAPE.points,SHAPE.rowid
FROM  DF.DLTB_H_320982100 , DF.F93 SHAPE
Where (( SSQY = '320982100211' )) and SHAPE.FID(+) = DF.DLTB_H_320982100.SHAPE
Description                                                           对象所有者     对象名称                          耗费        基数        字节        IO耗费
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Select STATEMENT, GOAL = CHOOSE                                                                        147        907        47164        147      
HASH JOIN OUTER                                                                                                              147        907        47164        147      
  TABLE ACCESS FULL                                          DF        DLTB_H_320982100         89        907        27210           89      
  TABLE ACCESS FULL                                          DF        F93                                          55        25385        558470    55

ArcSDE没这么落后吧!

接我自已昨天的研究,昨天可能忘了在SSQY上建索引,今天换了一张表作测试。
数据量如下:

SQL> select count(*) from D15_DLZJ_H_320412;

  COUNT(*)
----------
    691728

D15_DLZJ_H_320412        : 368M
F233                        : 72M
S233                        : 62M

SQL> show parameter db_file_multiblock_read_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_multiblock_read_count        integer     16

SQL> show parameter db_block_size;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_size                        integer     16384


Select  SHAPE,  Element,   SHAPE.fid,SHAPE.numofpts,SHAPE.entity,SHAPE.points,SHAPE.rowid
        FROM  WJ.D15_DLZJ_H_320412 , WJ.F233 SHAPE
        Where (( ( (NOT Status = 1 or Status is NULL) ) AND ( SSQY = '320412100215' ) ))
        and SHAPE.FID(+) = WJ.D15_DLZJ_H_320412.SHAPE

未建索引:

Description                            对象所有者     对象名称                耗费        基数        字节        IO耗费
------------------------------------------------------------------------------------------------------------
Select STATEMENT, GOAL = CHOOSE                                                2733        1651        320294        2733        
HASH JOIN OUTER                                                        2733        1651        320294        2733        
  TABLE ACCESS FULL                        WJ        D15_DLZJ_H_320412        2228        1651        285623        2228        
  TABLE ACCESS FULL                        WJ        F233                        414        691728        14526288        414        


建B*Tree索引

Description                            对象所有者     对象名称                耗费        基数        字节        IO耗费
------------------------------------------------------------------------------------------------------------
Select STATEMENT, GOAL = CHOOSE                                                1027        1651        320294        1027        
HASH JOIN OUTER                                                        1027        1651        320294        1027        
  TABLE ACCESS BY INDEX ROWID                WJ        D15_DLZJ_H_320412        522        1651        285623        522        
   INDEX RANGE SCAN                        WJ        IDX_DLZJ_SSQY                7        1651                7        
  TABLE ACCESS FULL                        WJ        F233                        414        691728        14526288 414

建Bitmap索引 
Description                            对象所有者     对象名称                耗费        基数        字节        IO耗费
------------------------------------------------------------------------------------------------------------
Select STATEMENT, GOAL = CHOOSE                                                917        1651        320294        917        
HASH JOIN OUTER                                                        917        1651        320294        917        
  TABLE ACCESS BY INDEX ROWID                WJ        D15_DLZJ_H_320412        412        1651        285623        412        
   BITMAP CONVERSION TO ROWIDS                                                        
    BITMAP INDEX SINGLE VALUE                WJ        IDXBMP_DLZJ_SSQY                                        
  TABLE ACCESS FULL                        WJ        F233                        414        691728        14526288 414        

建了位图索引,查询效率是最优,是未建索引的1/3,建B*Tree索引比建位图索引大概高了100个IO

查看数据分布情况:
Select DISTINCT SSQY, COUNT(*) FROM D15_DLZJ_H_320412 GROUP BY SSQY;

SQL> Select DISTINCT SSQY, COUNT(*) FROM D15_DLZJ_H_320412 GROUP BY SSQY;

SSQY                                                                               COUNT(*)
-------------------------------------------------------------------------------- ----------
320412100200                                                                           1256
320412100201                                                                            409
320412100202                                                                            188
320412100203                                                                           2437
320412100204                                                                           3198
320412100205                                                                            779
320412100206                                                                           2141
320412100207                                                                           1582
320412100208                                                                           1506
320412100209                                                                            681
320412100210                                                                            710
320412100211                                                                            499
320412100212                                                                            769
320412100213                                                                           1265
320412100214                                                                           1536
320412100215                                                                           1838
320412100216                                                                             38
320412100217                                                                            552
320412100218                                                                            579
320412100219                                                                             85

SSQY                                                                               COUNT(*)
-------------------------------------------------------------------------------- ----------
320412100220                                                                            850
320412100221                                                                           1681
320412100222                                                                           1466
320412100223                                                                            435
320412100224                                                                            284
320412100225                                                                            610
320412100226                                                                            748
320412100227                                                                           2153
320412100228                                                                            383
320412100229                                                                            383
320412100230                                                                            357
320412100231                                                                            127
320412100232                                                                            709
320412100233                                                                            123
320412100234                                                                            639
320412100235                                                                            284
320412100236                                                                            148
320412100237                                                                            333
320412100238                                                                           1015
320412100239                                                                            736
320412100240                                                                            120

SSQY                                                                               COUNT(*)
-------------------------------------------------------------------------------- ----------
320412100241                                                                            246
320412100242                                                                            623
320412100243                                                                            199
320412100244                                                                            174
320412100245                                                                            381
320412100246                                                                             91
320412100247                                                                            256
320412100248                                                                            123
320412101200                                                                           1507
320412101201                                                                           1666
320412101202                                                                           1285
320412101203                                                                           4103
320412101204                                                                           2958
320412101205                                                                           2557
320412101206                                                                           3375
320412101207                                                                           3571
320412101208                                                                           1698
320412101209                                                                           1396
320412101210                                                                           3405
320412101211                                                                           2154
320412101212                                                                           3434

.....还有很多

所以,如果数据分布得比较散,建索引还是有效果的,如果数据比较集中,建索引可能就没有效果了

总结来说:
在60多W条记录中,访问2000条记录左右,IO使用次数可以控制在900多次,个人感觉,还是IO高

下面是小表的数据分布情况:

SQL> select count(*) from DLTB_H_320982100;

  COUNT(*)
----------
     25385

SQL> Select SSQY, COUNT(*) FROM DLTB_H_320982100 GROUP BY SSQY;

SSQY                                                                               COUNT(*)
-------------------------------------------------------------------------------- ----------
320982100001                                                                            280
320982100018                                                                             80
320982100019                                                                            813
320982100200                                                                           1311
320982100201                                                                           1492
320982100202                                                                           1151
320982100203                                                                           1018
320982100204                                                                           1110
320982100205                                                                            752
320982100206                                                                           1253
320982100207                                                                           1122
320982100208                                                                            930
320982100209                                                                            821
320982100210                                                                           1203
320982100211                                                                           1201
320982100212                                                                            464
320982100213                                                                            595
320982100214                                                                           1553
320982100215                                                                            964
320982100216                                                                            877

SSQY                                                                               COUNT(*)
-------------------------------------------------------------------------------- ----------
320982100217                                                                            644
320982100218                                                                           1192
320982100219                                                                           1193
320982100220                                                                            687
320982100221                                                                            968
320982100222                                                                            779
320982100223                                                                            526
320982100224                                                                            406

接上一部分研究:

关于浏览数据:

Select /*+ LEADING INDEX(S_ S233_IX1) INDEX(SHAPE F233_UK1) INDEX(DLTB_H_320982100 A93_IX1) */  SHAPE  ,S_.eminx,S_.eminy,S_.emaxx,S_.emaxy ,SHAPE.fid,SHAPE.numofpts,SHAPE.entity,SHAPE.points,SHAPE.rowid
FROM
(Select  /*+ INDEX(SP_ S93_IX1) */ DISTINCT sp_fid, eminx, eminy, emaxx, emaxy
        FROM S233 SP_
        Where SP_.gx >= 931017
        AND SP_.gx <= 931020
        AND SP_.gy >= 275724
        AND SP_.gy <= 275726
        AND SP_.eminx <= 291967531713
        AND SP_.eminy <= 96467438080
        AND SP_.emaxx >= 291967636415
        AND SP_.emaxy >= 86467443711)S_ ,  WJ.d15_dlzj_h_320412 , WJ.F233 SHAPE
Where S_.sp_fid = SHAPE.fid
AND S_.sp_fid = WJ.d15_dlzj_h_320412.SHAPE
AND  (( SSQY = '320982100211' ))




# 空间索引搜索
Select  /*+ INDEX(SP_ S93_IX1) */ DISTINCT sp_fid, eminx, eminy, emaxx, emaxy
        FROM S233 SP_
        Where SP_.gx >= 931017
        AND SP_.gx <= 931020
        AND SP_.gy >= 275724
        AND SP_.gy <= 275726
        AND SP_.eminx <= 291967531713
        AND SP_.eminy <= 96467438080
        AND SP_.emaxx >= 291967636415
        AND SP_.emaxy >= 86467443711;

                                 SP_FID                                   EMINX                                   EMINY                                   EMAXX                                   EMAXY
--------------------------------------- --------------------------------------- --------------------------------------- --------------------------------------- ---------------------------------------
                                   1689                            291967531712                             86467438080                            291967636416                             86467443712
                                  42290                            291967340608                             86467503616                            291967637312                             86467604928



查看执行计划:
Description                            对象所有者     对象名称                耗费        基数        字节        IO耗费
------------------------------------------------------------------------------------------------------------
Select STATEMENT, GOAL = CHOOSE                                                4        1        46        4        
SORT UNIQUE                                                                4        1        46        4        
  INDEX RANGE SCAN                        WJ                S233_IX1        2        2        92        2        

为什么会这么快?
原因:ArcSDE把空间数据全部编入索引

Select b.INDEX_NAME, b.TABLE_NAME, b.COLUMN_NAME FROM user_indexes a,user_ind_columns b  
  where a.index_name=b.index_name
  AND a.table_name = 'S233';

INDEX_NAME                     TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ --------------------------------------------------------------------------------
S233_IX1                       S233                           GX
S233_IX1                       S233                           GY
S233_IX1                       S233                           EMAXX
S233_IX1                       S233                           EMAXY
S233_IX1                       S233                           EMINX
S233_IX1                       S233                           EMINY
S233_IX1                       S233                           SP_FID
S233_IX2                       S233                           SP_FID

8 rows selected




Select /*+ LEADING INDEX(S_ S233_IX1) INDEX(SHAPE F233_UK1) INDEX(DLTB_H_320982100 A93_IX1) */  SHAPE  ,S_.eminx,S_.eminy,S_.emaxx,S_.emaxy ,SHAPE.fid,SHAPE.numofpts,SHAPE.entity,SHAPE.points,SHAPE.rowid
FROM
(Select  /*+ INDEX(SP_ S93_IX1) */ DISTINCT sp_fid, eminx, eminy, emaxx, emaxy
        FROM S233 SP_
        Where SP_.gx >= 931017
        AND SP_.gx <= 931020
        AND SP_.gy >= 275724
        AND SP_.gy <= 275726
        AND SP_.eminx <= 291967531713
        AND SP_.eminy <= 96467438080
        AND SP_.emaxx >= 291967636415
        AND SP_.emaxy >= 86467443711)S_ ,  WJ.d15_dlzj_h_320412 , WJ.F233 SHAPE
Where S_.sp_fid = SHAPE.fid
AND S_.sp_fid = WJ.d15_dlzj_h_320412.SHAPE
AND  (( SSQY = '320982100211' ))

查看执行计划:
Description                            对象所有者     对象名称                耗费        基数        字节        IO耗费
------------------------------------------------------------------------------------------------------------
Select STATEMENT, GOAL = CHOOSE                                                37        1        124        37        
NESTED LOOPS                                                                37        1        124        37        
  NESTED LOOPS                                                                35        1        103        35        
   VIEW        WJ                                                                33        1        73                
    SORT UNIQUE                                                                33        1        40        33        
     INDEX RANGE SCAN                        WJ        S233_IX1                31        2        80        31        
   TABLE ACCESS BY INDEX ROWID                WJ        D15_DLZJ_H_320412        2        1        30        2        
    INDEX UNIQUE SCAN                        WJ        A233_IX1                1        419                1        
  TABLE ACCESS BY INDEX ROWID                WJ        F233                        2        1        21        2        
   INDEX UNIQUE SCAN                        WJ        F233_UK1                1        1                1        

该处有DISTINCT,如果数据量比较大,引起的排序,也是比较可观的



极端情况如下:
Select  /*+ INDEX(SP_ S233_IX1) */ DISTINCT sp_fid, eminx, eminy, emaxx, emaxy
        FROM S233 SP_
        Where SP_.gx >= 830117
        AND SP_.gx <= 1131020
        AND SP_.gy >= 175724
        AND SP_.gy <= 375726;

Description                            对象所有者     对象名称                耗费        基数        字节        IO耗费
------------------------------------------------------------------------------------------------------------
Select STATEMENT, GOAL = CHOOSE                                                15048        1204609        48184360        15048        
SORT UNIQUE                                                                15048        1204609        48184360        15048        
  INDEX RANGE SCAN                        WJ        S233_IX1                8640        1204609        48184360        8640

终于通过跟踪事件,从SQL追踪文件中查询出空间索引的值:

Select /*+ LEADING INDEX(S_ S12_IX1) INDEX(SHAPE F12_UK1) INDEX(D11_MZDL_H_320412 A12_IX1) */
SHAPE,
S_.EMINX,
S_.EMINY,
S_.EMAXX,
S_.EMAXY,
SHAPE.FID,
SHAPE.NUMOFPTS,
SHAPE.ENTITY,
SHAPE.POINTS,
SHAPE.ROWID
        FROM (Select /*+ INDEX(SP_ S12_IX1) */
                                DISTINCT SP_FID, EMINX, EMINY, EMAXX, EMAXY
                                        FROM WJ.S12 SP_
                                 Where ((SP_.GX >= 111271 AND SP_.GX <= 111282 AND SP_.GY >= 32955 AND SP_.GY <= 32963)
                or
                                                          (SP_.GX >= 16784634 AND SP_.GX <= 16784634 AND SP_.GY >= 16779413 AND SP_.GY <= 16779413))
                                         AND SP_.EMINX <= 292006298550
                                         AND SP_.EMINY <= 86496938779
                                         AND SP_.EMAXX >= 291977117786
                                         AND SP_.EMAXY >= 86474245629) S_,
                         WJ.D11_MZDL_H_320412,
                         WJ.F12 SHAPE
Where S_.SP_FID = SHAPE.FID
         AND S_.SP_FID = WJ.D11_MZDL_H_320412.SHAPE
         AND ((SSQY = '320412100200'))

查询的结果,与ArcMap相同,不知道ArcMap怎么算的

AND SP_.EMINX <= 292006298550
AND SP_.EMINY <= 86496938779
AND SP_.EMAXX >= 291977117786
AND SP_.EMAXY >= 86474245629
这些值好算,肯定是ArcMap的视图范围,只是以内部System unit来存的

GX, GY,格网值,就不知道怎么算的了!



[本日志由 独木舟 于 2009-02-05 00:38 AM 编辑]
文章来自: 本站原创
引用通告: 查看所有引用 | 我要引用此文章
Tags: GIS
相关日志:
评论: 0 | 引用: 0 | 查看次数: -
发表评论
昵 称:
密 码: 游客发言不需要密码.
内 容:
验证码: 验证码
选 项:
虽然发表评论不用注册,但是为了保护您的发言权,建议您注册帐号.