Difference between revisions of "Calc/To-Dos/Statistics/Miscellaneous Data Analysis"
Revision as of 02:38, 21 August 2006
Front matter....
Goal
One of the most important tasks in data analysis is to describe the data optimally. Other important tasks include extracting all the useful information from the original data set and describing complex relationships and variability. Some of these are accomplished using statistical techniques (especially in the biomedical sciences, see Statistical Data Analysis Tool), yet most rely on different techniques, which are described here.
Unfortunately, I lost all contact with mathematics more than 10 years ago, so my comments will be very brief. I hope that people with interest and knowledge will develop this page further.
...
Specific Techniques
Summarizing Data
Methods that summarize the information in a limited number of components, e.g. by linear dimension reduction:
- Principal Component Analysis:
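- the first few components capture most of the variability in the original data;
- the resulting variables (components) are uncorrelated;
- it is the optimal linear transformation in the least-squares sense: no other linear projection onto the same number of components retains more variance;
- disadvantage: the resulting variables can be difficult to interpret, since each is a weighted mix of the original variables and need not have a logical meaning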
- see http://en.wikipedia.org/wiki/Principal_components_analysis
- Varimax: an orthogonal rotation of the components that makes the loadings easier to interpret
- see http://de.wikipedia.org/wiki/Varimax (in German)
- see also: http://sekhon.berkeley.edu/stats/html/varimax.html (an R-implementation: package stats)
- Simple Component Analysis:
- the components are not necessarily uncorrelated,
- but they are easier to understand and interpret
- see Rousson V, Gasser T. Simple component analysis. Appl. Statist. 2004; 53:539–555, http://www.unizh.ch/biostat/Manuscripts/simpcomp.pdf
- R-implementation: http://www.maths.lth.se/help/R/.R/library/sca/html/00Index.html (package sca)
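The Principal Component Analysis item above can be illustrated with a minimal sketch in plain Python (standard library only). It assumes the data are rows of numeric observations. It finds the components by power iteration with deflation, which is only one of several possible approaches; a real implementation would more likely call a tested eigenvalue or SVD routine. The function names and the sample values are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical sample: 6 observations of 3 variables (illustrative values only).
SAMPLE_ROWS = [
    [2.0, 4.1, 1.0],
    [3.0, 6.2, 0.5],
    [4.0, 7.9, 1.5],
    [5.0, 10.1, 0.7],
    [6.0, 12.0, 1.2],
    [7.0, 13.8, 0.9],
]


def covariance_matrix(rows):
    """Return (sample covariance matrix, mean-centered rows)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, centered


def dominant_eigenpair(mat, iters=1000):
    """Power iteration on a symmetric positive semi-definite matrix.

    Returns (largest eigenvalue, unit eigenvector). The start vector is
    deliberately asymmetric so that it is unlikely to be orthogonal to
    the dominant eigenvector, a known weakness of this method.
    """
    d = len(mat)
    v = [1.0 / (j + 1) for j in range(d)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:  # no variance left to extract
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    eigval = sum(
        v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigval, v


def pca(rows, n_components):
    """Return (components, variances, scores) for the leading components."""
    cov, centered = covariance_matrix(rows)
    d = len(cov)
    components, variances = [], []
    for _ in range(n_components):
        eigval, v = dominant_eigenpair(cov)
        components.append(v)
        variances.append(eigval)
        # Deflation: remove the variance already explained by v.
        cov = [
            [cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
            for i in range(d)
        ]
    # Project the centered data onto each component.
    scores = [
        [sum(c[j] * comp[j] for j in range(d)) for comp in components]
        for c in centered
    ]
    return components, variances, scores


if __name__ == "__main__":
    components, variances, scores = pca(SAMPLE_ROWS, 2)
    total = sum(variances)
    for comp, var in zip(components, variances):
        print([round(x, 3) for x in comp], round(var / total, 3))
```

For this sample the first component accounts for more than 90% of the total variance, because the first two variables are almost proportional to each other. The scores of different components are uncorrelated, which matches the property listed above.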
Energy-Frequency Analysis
- Fourier Transform (limited to stationary and linear data)
- wavelet analysis
- Wigner-Ville distribution
- Empirical Mode Decomposition: intended to be more robust for nonlinear and non-stationary data (a detailed description is freely available online; see Huang et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. A 1998; 454:903-995)
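As an illustration of the Fourier transform entry in the list above, here is a minimal sketch in plain Python of an energy spectrum computed with a naive discrete Fourier transform. It assumes a uniformly sampled, stationary signal, which is the case the Fourier transform handles well. The function names and the test signal are hypothetical. The O(n^2) loop is for clarity only; a practical implementation would use a fast Fourier transform.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence of samples."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def energy_spectrum(signal):
    """Squared magnitude |X_k|^2 for the frequency bins 0 .. n//2."""
    spectrum = dft(signal)
    return [abs(x) ** 2 for x in spectrum[: len(signal) // 2 + 1]]


def dominant_frequency(signal, sample_rate):
    """Frequency (in the units of sample_rate) of the strongest non-constant bin."""
    energies = energy_spectrum(signal)
    # Bin 0 only holds the mean of the signal, so it is skipped.
    k = max(range(1, len(energies)), key=lambda i: energies[i])
    return k * sample_rate / len(signal)


if __name__ == "__main__":
    # Hypothetical test signal: a pure sine with 5 cycles in 64 samples.
    n = 64
    signal = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
    print(dominant_frequency(signal, sample_rate=float(n)))
```

With a sample rate of 64 samples per unit time, the sine above falls exactly on frequency bin 5, so the energy concentrates in a single bin. A signal whose frequency changes over time would smear its energy across many bins, which is the limitation that motivates the wavelet, Wigner-Ville and Empirical Mode Decomposition entries.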
Resources
Links
- ...