Performance/load performance analysis
A group of test data
Comparison data: SAX parser processing
ODF handler | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 12759 | 11805 | 13233 | 11312 | 13279 |
Start | 6230 | 5245 | 6758 | 5378 | 6541 |
End - Start | 6529 | 6560 | 6475 | 5934 | 6738 | 6447.2
|
End | 31305 | 36262 | 30083 | 28452 | 28584 |
Start | 24852 | 30694 | 23611 | 22018 | 22148 |
End - Start | 6453 | 5568 | 6472 | 6434 | 6436 | 6272.6
Empty handler | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 354 | 362 | 355 | 370 | 312 |
Start | 30 | 30 | 29 | 30 | 24 |
End - Start | 324 | 332 | 326 | 340 | 288 | 322
|
End | 635 | 644 | 633 | 651 | 588 |
Start | 354 | 362 | 355 | 370 | 312 |
End - Start | 281 | 282 | 278 | 281 | 276 | 279.6
The first Start/End group includes the cost of loading libraries; the second does not.
All test results are from warm runs.
The results show that OOo's ODF processing costs far more than an empty XDocumentHandler.
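The comparison can be checked with a few lines of arithmetic. The sketch below copies the first Start/End group of each table and recomputes the per-run cost and the averages; the source does not state the time unit, so the numbers are treated as plain ticks, and the helper name `elapsed` is only illustrative.

```python
from statistics import mean

# Values copied from the first Start/End group of each table above.
odf_end = [12759, 11805, 13233, 11312, 13279]
odf_start = [6230, 5245, 6758, 5378, 6541]
empty_end = [354, 362, 355, 370, 312]
empty_start = [30, 30, 29, 30, 24]


def elapsed(end, start):
    """Per-run cost: End timestamp minus Start timestamp."""
    return [e - s for e, s in zip(end, start)]


odf_cost = elapsed(odf_end, odf_start)
empty_cost = elapsed(empty_end, empty_start)

odf_avg = mean(odf_cost)
empty_avg = mean(empty_cost)
ratio = odf_avg / empty_avg

print(odf_cost, odf_avg)
print(empty_cost, empty_avg)
print(f"ODF handler / empty handler = {ratio:.1f}")
```

With these figures the ODF handler takes roughly twenty times as long as the empty handler on the same input.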
Comparison data: dual-core processing
The test uses 912 XML files (total size 44.2 MB).
Single thread | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 003604 | 003549 | 003500 | 003538 | 003492 |
Start | 000029 | 000029 | 000029 | 000023 | 000029 |
End - Start | 3575 | 3520 | 3471 | 3515 | 3463 | 3508.8
|
End | 007076 | 006957 | 006924 | 007025 | 006897 |
Start | 003604 | 003550 | 003500 | 003538 | 003492 |
End - Start | 3472 | 3407 | 3424 | 3487 | 3405 | 3439
One task one thread | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 002696 | 002763 | 002560 | 002827 | 002133 |
Start | 000018 | 000026 | 000028 | 000030 | 000033 |
End - Start | 2678 | 2737 | 2532 | 2797 | 2100 | 2568.8
|
End | 005318 | 005471 | 005224 | 005174 | 004816 |
Start | 002696 | 002764 | 002560 | 002827 | 002133 |
End - Start | 2622 | 2707 | 2664 | 2347 | 2683 | 2604.6
Two threads | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 002314 | 002332 | 002356 | 002377 | 002411 |
Start | 000027 | 000029 | 000029 | 000030 | 000030 |
End - Start | 2287 | 2303 | 2327 | 2347 | 2381 | 2329
|
End | 004568 | 004611 | 004622 | 004639 | 004731 |
Start | 002314 | 002332 | 002356 | 002377 | 002411 |
End - Start | 2254 | 2279 | 2266 | 2262 | 2320 | 2276.2
Comparison data: single-core processing
Single thread | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 005062 | 005531 | 005516 | 005062 | 005110 |
Start | 000062 | 000062 | 000047 | 000062 | 000047 |
End - Start | 5000 | 5469 | 5469 | 5000 | 5063 | 5200.2
|
End | 010016 | 010703 | 010469 | 010062 | 010078 |
Start | 005062 | 005531 | 005516 | 005062 | 005110 |
End - Start | 4954 | 5172 | 4953 | 5000 | 4968 | 5009.4
One task one thread | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 005391 | 005391 | 005422 | 005797 | 005516 |
Start | 000063 | 000047 | 000047 | 000063 | 000063 |
End - Start | 5328 | 5344 | 5375 | 5734 | 5453 | 5446.8
|
End | 010719 | 010672 | 010703 | 011453 | 010797 |
Start | 005391 | 005391 | 005422 | 005797 | 005516 |
End - Start | 5328 | 5381 | 5281 | 5656 | 5281 | 5385.4
Two threads | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 005110 | 005062 | 005516 | 005531 | 005062 |
Start | 000047 | 000062 | 000047 | 000062 | 000062 |
End - Start | 5063 | 5000 | 5469 | 5469 | 5000 | 5200
|
End | 010078 | 010062 | 010469 | 010703 | 010016 |
Start | 005110 | 005062 | 005516 | 005531 | 005062 |
End - Start | 4968 | 5000 | 4953 | 5172 | 4954 | 5009.4
Overview
From the last two sets of data above we can draw the following conclusions:
On a single core, when a task is divided into multiple small tasks, the cost is about the same whether they are processed with a single thread, with one thread per task, or with two threads (one thread per task is slightly slower).
On a dual core, when a task is divided into multiple small tasks, one thread per task takes about 0.7-0.8 of the single-thread time and two threads take about 0.66 of it, with single-thread processing as the baseline.
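These ratios follow directly from the table averages. The sketch below recomputes them from the first Start/End group of each table; it takes 2568.8 (dual core) and 5446.8 (single core) as the one-task-one-thread averages, which is an assumption based on the ratios quoted above, and `relative_cost` is an illustrative helper name.

```python
# Averages (End - Start) from the first Start/End group of each table.
dual_core = {
    "single thread": 3508.8,
    "one task one thread": 2568.8,  # assumed to be the one-task-one-thread runs
    "two threads": 2329.0,
}
single_core = {
    "single thread": 5200.2,
    "one task one thread": 5446.8,  # assumed to be the one-task-one-thread runs
    "two threads": 5200.0,
}


def relative_cost(averages):
    """Express every average as a fraction of the single-thread baseline."""
    baseline = averages["single thread"]
    return {name: value / baseline for name, value in averages.items()}


dual_ratios = relative_cost(dual_core)
single_ratios = relative_cost(single_core)

for name, value in dual_ratios.items():
    print(f"dual core, {name}: {value:.2f}")
for name, value in single_ratios.items():
    print(f"single core, {name}: {value:.2f}")
```

On the dual core this gives about 0.73 for one thread per task and about 0.66 for two threads; on the single core every variant stays within about 5% of the baseline.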
From the first group of data we can conclude:
OOo's handling of the ODF content accounts for a large proportion of the cost.
What to do
How can we improve ODF loading performance?
On a multi-core CPU, can we split the processing of “content.xml” into small tasks, process them with multiple threads, and merge the results?
On a single core, can we split the processing of “content.xml” into small tasks and load the document asynchronously?
* Of course, improving the ODF parser itself might also improve performance. (That is not discussed here.)
How can we split the processing of “content.xml” into small tasks?
a) Split the content of “content.xml”
b) Collect the SAX parser output, then split it
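The split, process, and merge idea can be sketched as follows. This is a minimal illustration in Python, not OOo code: `CountingHandler` is a hypothetical stand-in for the real document handler, the fragments are made-up XML, and because of CPython's global interpreter lock the thread pool shows only the structure of the approach, not the speed-up measured above.

```python
import xml.sax
from concurrent.futures import ThreadPoolExecutor


class CountingHandler(xml.sax.ContentHandler):
    """Stand-in for real document processing: counts start tags."""

    def __init__(self):
        super().__init__()
        self.elements = 0

    def startElement(self, name, attrs):
        self.elements += 1


def process_fragment(xml_bytes):
    """Parse one small task with its own parser and handler."""
    handler = CountingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.elements


def process_all(fragments, workers=2):
    """Process the fragments in a thread pool and merge the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the merge is deterministic.
        partial = list(pool.map(process_fragment, fragments))
    return sum(partial)


fragments = [
    b"<table><row><cell/><cell/></row></table>",
    b"<table><row><cell/></row><row><cell/></row></table>",
    b"<p><span/></p>",
]
total = process_all(fragments)
print(total)
```

Each task owns its parser and handler, so the only shared step is the final merge.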
Description
Split “content.xml” content
Change the document structure: define each complex node as a reference link to a separate file, so that the main file contains only the simple structure.
This produces the following ODF layout:
content.xml complex/fragment1.xml /fragment2.xml /fragment3.xml /fragment4.xml …
and “content.xml” becomes:
… <table:table xlink:href="complex/fragment1.xml" /> …
I think this approach would give the same performance benefit when saving documents.
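The restructuring described above could be prototyped outside OOo along these lines. The sketch uses Python's `xml.etree.ElementTree`; the function name `split_tables` is hypothetical, the sample document is made up, and placing `xlink:href` on `table:table` is the layout proposed here, not something assumed to exist in the ODF schema.

```python
import xml.etree.ElementTree as ET

TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"
TABLE_TAG = f"{{{TABLE_NS}}}table"
ET.register_namespace("table", TABLE_NS)
ET.register_namespace("xlink", XLINK_NS)


def _walk(parent, fragments):
    """Replace each table:table child with a stub; do not descend into tables."""
    for index, child in enumerate(list(parent)):
        if child.tag == TABLE_TAG:
            path = f"complex/fragment{len(fragments) + 1}.xml"
            stub = ET.Element(TABLE_TAG, {f"{{{XLINK_NS}}}href": path})
            stub.tail, child.tail = child.tail, None
            fragments[path] = ET.tostring(child, encoding="unicode")
            parent[index] = stub
        else:
            _walk(child, fragments)


def split_tables(content_xml):
    """Return (main document, {fragment path: fragment XML})."""
    root = ET.fromstring(content_xml)
    fragments = {}
    _walk(root, fragments)
    return ET.tostring(root, encoding="unicode"), fragments


doc = (
    f'<doc xmlns:table="{TABLE_NS}">'
    '<table:table table:name="Sheet1"><table:table-row/></table:table>'
    '<table:table table:name="Sheet2"/>'
    "</doc>"
)
main_xml, fragments = split_tables(doc)
print(main_xml)
print(sorted(fragments))
```

Nested tables stay inside their parent fragment, so every fragment is a complete subtree that can be parsed on its own.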
Collect the SAX parser output, then split it
The test data above shows that SAX parsing itself costs relatively little. We can collect the SAX parser output and split it into multiple parts.
| Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 005706 | 005510 | 005200 | 005824 | 005527 |
Start | 000029 | 000019 | 000019 | 000012 | 000011 |
End - Start | 5677 | 5491 | 5181 | 5812 | 5516 | 5695.4
|
End | 010814 | 010685 | 010297 | 010950 | 010794 |
Start | 005706 | 005510 | 005200 | 005824 | 005527 |
End - Start | 5108 | 5175 | 5097 | 5126 | 5267 | 5154.6
Empty handler | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 003788 | 003763 | 003908 | 004044 | 003702 |
Start | 000015 | 000011 | 000013 | 000014 | 000016 |
End - Start | 3773 | 3752 | 3895 | 4030 | 3686 | 3821.2
|
End | 008330 | 007444 | 007836 | 007776 | 007465 |
Start | 003788 | 003764 | 003908 | 004044 | 003703 |
End - Start | 3542 | 3680 | 3924 | 3732 | 3762 | 3728
We can see that collecting the SAX parser output costs relatively little, so we can split the collected output according to some rules and process the parts with multiple threads.
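One possible shape for this collect-then-split step is sketched below. It is an assumption-laden illustration, not the segmentation algorithm itself: `RecordingHandler`, `split_events` and `process_chunk` are hypothetical names, the splitting rule (one chunk per child of the root element) is just the simplest rule available, and counting start tags stands in for the real per-chunk work.

```python
import xml.sax
from concurrent.futures import ThreadPoolExecutor


class RecordingHandler(xml.sax.ContentHandler):
    """Cheap handler: records SAX events and does no document work."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name, None))

    def characters(self, content):
        self.events.append(("text", content, None))


def split_events(events):
    """Split the recording into one chunk per child of the root element."""
    chunks, current, depth = [], [], 0
    for event in events:
        kind = event[0]
        if kind == "start":
            depth += 1
        if depth >= 2:  # skip the root element's own events
            current.append(event)
        if kind == "end":
            if depth == 2:  # a child of the root has just been closed
                chunks.append(current)
                current = []
            depth -= 1
    return chunks


def process_chunk(chunk):
    """Stand-in for the expensive work: count elements in the chunk."""
    return sum(1 for kind, _, _ in chunk if kind == "start")


recorder = RecordingHandler()
xml.sax.parseString(b"<doc><a><x/></a><b>t</b><c/></doc>", recorder)
chunks = split_events(recorder.events)
with ThreadPoolExecutor(max_workers=2) as pool:
    counts = list(pool.map(process_chunk, chunks))
print(len(chunks), counts)
```

Because every chunk is a balanced run of start and end events, it can be replayed into a separate handler without knowledge of the other chunks.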
Key points
The segmentation algorithm for the collected information
How OOo would process the split parts is not yet clear, so we do not know whether the result will be as expected. Further research is needed.
Appendix
Here is a simple test, with no other considerations, just to check the feasibility of the technique.
| Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 005560 | 005563 | 005551 | 005791 | 005558 |
Start | 000033 | 000029 | 000030 | 000030 | 000023 |
End - Start | 5527 | 5534 | 5521 | 5761 | 5535 | 5575.6
|
End | 010952 | 011093 | 011069 | 011268 | 010972 |
Start | 005560 | 005563 | 005551 | 005791 | 005558 |
End - Start | 5392 | 5530 | 5518 | 5477 | 5414 | 5466.2
| Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 028326 | 028438 | 028115 | 27835 | 28054 |
Start | 000035 | 000035 | 000030 | 30 | 20 |
End - Start | 28291 | 28403 | 28085 | 27805 | 28034 | 28123.6
|
End | 055763 | 056048 | 056294 | 056474 | 056538 |
Start | 028326 | 028438 | 28115 | 027835 | 028054 |
End - Start | 27437 | 27610 | 28179 | 28639 | 28484 | 28069.8
A group of tests with large data
OOo calc “content.xml” 35.1 MB / 3.8M
ODF handler | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 38220 | 37340 | 37140 | 66316 | 35378 |
Start | 6559 | 6292 | 7014 | 38431 | 6513 |
End - Start | 31661 | 31048 | 30126 | 27885 | 28865 | 29917
|
End | 79549 | 82462 | 81397 | 107576 | 873537 |
Start | 47461 | 49710 | 50479 | 78099 | 844944 |
End - Start | 32088 | 32752 | 30918 | 29477 | 28593 | 30765.6
Empty handler | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 2082 | 2178 | 2118 | 2106 | 2082 |
Start | 16 | 16 | 13 | 15 | 14 |
End - Start | 2066 | 2162 | 2105 | 2091 | 2068 | 2098.4
|
End | 4070 | 4752 | 4106 | 4088 | 4294 |
Start | 2082 | 2178 | 2118 | 2106 | 2082 |
End - Start | 1988 | 2574 | 1988 | 1982 | 2212 | 2148.8
OOo writer “content.xml” 62.6 MB / 2.0M
ODF handler | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 108876 | 108729 | 108997 | 108901 | 110140 |
Start | 6201 | 6060 | 6240 | 7029 | 8415 |
End - Start | 102675 | 102669 | 102757 | 101872 | 101725 | 102339.6
|
End | 229982 | 228416 | 235011 | 234573 | 233124 |
Start | 126660 | 125567 | 132174 | 132292 | 130810 |
End - Start | 103322 | 102849 | 102837 | 102281 | 102314 | 102720.6
Empty handler | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average
End | 4825 | 5595 | 5546 | 5658 | 6072 |
Start | 20 | 59 | 12 | 23 | 629 |
End - Start | 4805 | 5536 | 5534 | 5635 | 5443 | 5390.6
|
End | 9899 | 11815 | 11280 | 11300 | 11394 |
Start | 4825 | 5595 | 5546 | 5658 | 6072 |
End - Start | 5074 | 6220 | 5734 | 5642 | 5322 | 5598.4