Performance/load performance implement
From Apache OpenOffice Wiki
Revision as of 08:59, 7 December 2009 by Zengliangjun (talk | contribs)
For the analysis, see Performance/Odf_document_load_performance_increase_feasibility_analysis. This document explains the implementation and records some discussion.
Implementation
Collect data through SAX:
- The first step is simple and poses no real problem.
- The SAX parser collects the valid data into a data structure whose elements can be looked up quickly by position. For now I have chosen a vector.
- Each element has these base fields:
- sal_uInt16 m_prefix;
- rtl_uString* mp_localName;
- sal_uInt32 m_local;
- sal_uInt32 m_distance;
- sal_uInt32 m_count;
- and other fields such as the namespace, the full name, a cached token enum ...
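The record layout above can be sketched roughly as follows. This is only an illustration, not the actual implementation: standard types stand in for sal_uInt16, sal_uInt32 and rtl_uString*, and the meaning given to "m_local", "m_distance" and "m_count" is an assumption, since this page does not define them. The helpers parentIndex() and childIndices() are hypothetical names.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for one collected element.
struct ElementRecord {
    std::uint16_t m_prefix;     // namespace prefix id
    std::string   m_localName;  // stands in for rtl_uString* mp_localName
    std::uint32_t m_local;      // ASSUMPTION: index of this record in the vector
    std::uint32_t m_distance;   // ASSUMPTION: m_local minus the parent's index, 0 for the root
    std::uint32_t m_count;      // ASSUMPTION: number of direct subelements
};

// Records are assumed to be stored in document order.
using ElementTable = std::vector<ElementRecord>;

// Index of the parent record, or nothing for the root.
std::optional<std::uint32_t> parentIndex(const ElementRecord& e) {
    if (e.m_distance == 0) return std::nullopt;
    return e.m_local - e.m_distance;
}

// Indices of the direct subelements of the record at position `parent`.
std::vector<std::uint32_t> childIndices(const ElementTable& table, std::uint32_t parent) {
    std::vector<std::uint32_t> out;
    const std::uint32_t wanted = table[parent].m_count;
    for (std::uint32_t i = parent + 1; i < table.size() && out.size() < wanted; ++i) {
        const std::optional<std::uint32_t> p = parentIndex(table[i]);
        if (!p || *p < parent) break;        // we have left the parent's subtree
        if (*p == parent) out.push_back(i);  // direct child; deeper descendants are skipped
    }
    return out;
}
```

Under these assumptions the parent lookup is a single subtraction, and the child lookup only scans the parent's own subtree.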
Processing method
- We can walk the result element by element, using each element's "m_local", "m_distance" and "m_count" to reach its parent element and its subelements.
- Processing an element takes two steps:
- Process the element itself: its attributes and its subelements.
- Commit the result information to the parent element.
- processElement() : Process the element
- _startElement() : Start the element
- processSubContexts() : If this element has subelements, process them; each subelement is handled in three steps.
- _createChildContext(): Create the child context
- _processSubContext(): Process the subelement
- _collectSubContext(): Collect the subelement's result data
- _characters(): Process the element's content
- _endElement(): End the element
- commit(): Commit the result data to the parent element
- An element can be handled in one of three ways: serial, parallel or delayed processing.
- Serial processing: every element is processed the same way. When the current element is finished, control returns to its parent; the parent moves on to its next child if there is one, otherwise it returns to its own parent.
- Parallel processing: for an element with many subelements, each subelement's "_processSubContext()" is handed to a different worker thread. Once all subelements have finished, "_collectSubContext()" runs serially.
- Delayed processing: for an element with many subelements, the first subelement is processed immediately and the others are deferred until the end of document processing.
Processing
- Taken as a whole, processing is serial.
- An ODF document has four basic parts: "meta", "settings", "styles" and "content".
- The dependency relation is: "content" -> "styles" -> "settings" -> "meta" (I think this is not a problem :-) )
- So the order stays the same as now: "meta" -> "settings" -> "styles" -> "content".
- Parallel processing within each part is possible.
- The meta, settings and styles parts can each use parallel processing.
- I think the content part can use parallel or delayed processing, because no other part depends on it.
Difficulty
Meta part
- The subelements of "<office:meta>" look like "<meta:*>". I think they are independent of one another, so they can be processed in parallel.
- Currently this part is processed as a DOM object; I don't know why. So for now this part is processed serially.
Settings part
- Every subelement of "<office:settings>" yields a "beans::PropertyValue", so they can be processed in parallel.
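As a rough illustration of that idea, the sketch below processes each subelement on its own thread and then collects the results serially in document order. It is not the real implementation: PropertyValue here is a plain name/value struct standing in for beans::PropertyValue, ConfigItem and processSettingsParallel() are hypothetical names, and starting one std::async task per subelement is a simplification (a real implementation would more likely use a bounded set of worker threads).

```cpp
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for beans::PropertyValue (name/value pair only).
struct PropertyValue {
    std::string Name;
    std::string Value;
};

// Hypothetical stand-in for one subelement of <office:settings>.
struct ConfigItem {
    std::string name;
    std::string rawText;
};

// Work done per subelement; it touches no shared state, so any thread may run it.
PropertyValue processSubContext(const ConfigItem& item) {
    return PropertyValue{item.name, item.rawText};
}

std::vector<PropertyValue> processSettingsParallel(const std::vector<ConfigItem>& items) {
    // Parallel step: start one task per subelement.
    std::vector<std::future<PropertyValue>> pending;
    pending.reserve(items.size());
    for (const ConfigItem& item : items) {
        pending.push_back(std::async(std::launch::async,
                                     [&item] { return processSubContext(item); }));
    }

    // Serial step: collect the results in document order.
    std::vector<PropertyValue> result;
    result.reserve(items.size());
    for (std::future<PropertyValue>& f : pending) {
        result.push_back(f.get());  // blocks until that subelement is done
    }
    return result;
}
```

Because the results are collected in the order the tasks were started, the output order does not depend on which thread finishes first.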
Styles part and Content part
- The objects come from sfx2, sd, sc and sw; this is complex.
Plan
- Implement most of the sd source code according to the plan.
- Debug until serial processing works correctly.
- Try to test parallel processing.
- Try to analyze whether delayed processing is feasible.