Introduction to turbulence/Statistical analysis/Generalization to the estimator of any quantity

{{Turbulence}}
Similar relations can be formed for the estimator of any function of the random variable, say <math>f(x)</math>. For example, an estimator for the average of <math>f</math> based on <math>N</math> realizations is given by:
:<math>F_{N}\equiv\frac{1}{N}\sum^{N}_{n=1}f_{n}</math>
where <math>f_{n}\equiv f(x_{n})</math>. It is straightforward to show that this estimator is unbiased, and its variability (squared) is given by:
:<math>\epsilon^{2}_{F_{N}}= \frac{1}{N} \frac{var \left\{f \left( x \right) \right\}}{\left\langle f \left( x \right) \right\rangle^{2} }</math>
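As a quick numerical check of this result (an added illustration, not part of the original notes; the choice <math>f(x)=x^{2}</math> with <math>x</math> uniformly distributed is assumed purely for demonstration), the following Python sketch builds many independent <math>N</math>-sample estimators <math>F_{N}</math> and compares their observed relative variance with <math>var\left\{f\right\}/\left(N\left\langle f\right\rangle^{2}\right)</math>:

<pre>
import numpy as np

rng = np.random.default_rng(0)

# Assumed example: x uniform on [1, 2] and f(x) = x**2; any f with nonzero mean works
def f(x):
    return x**2

N = 100      # realizations per estimator
M = 20000    # number of independent estimators F_N

x = rng.uniform(1.0, 2.0, size=(M, N))
F_N = f(x).mean(axis=1)                  # one estimate of the mean of f per row

# Observed squared variability of the estimator
eps2_observed = F_N.var() / F_N.mean()**2

# Predicted by the formula: (1/N) * var{f} / (mean of f)**2, with the true moments
# approximated here from one very large sample
x_big = rng.uniform(1.0, 2.0, size=10**6)
eps2_predicted = f(x_big).var() / (N * f(x_big).mean()**2)

print(eps2_observed, eps2_predicted)     # the two values should agree closely
</pre>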
 
'''Example:''' Suppose it is desired to estimate the variability of an estimator for the variance based on a finite number of samples as:
:<math>var_{N} \left\{x \right\} \equiv \frac{1}{N} \sum^{N}_{n=1} \left( x_{n} - X \right)^{2}</math>
 
(Note that this estimator is not really very useful since it presumes that the mean value, <math>X</math>, is known, whereas in fact usually only <math>X_{N}</math> is obtainable).
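To make this parenthetical point concrete, here is a minimal Python sketch (an added illustration with an assumed Gaussian signal) comparing the estimator above, which uses the true mean <math>X</math>, with the version one can actually compute from the sample mean <math>X_{N}</math>:

<pre>
import numpy as np

rng = np.random.default_rng(1)

X, true_var = 2.0, 4.0     # assumed true mean and variance of x
N, M = 10, 200000          # samples per estimate, number of independent trials

x = rng.normal(X, np.sqrt(true_var), size=(M, N))

# Estimator as defined above: it requires the (usually unknown) true mean X
var_N_true_mean = ((x - X)**2).mean(axis=1)

# What can actually be computed in practice: use the sample mean X_N instead
X_N = x.mean(axis=1, keepdims=True)
var_N_sample_mean = ((x - X_N)**2).mean(axis=1)

print(var_N_true_mean.mean())    # close to true_var: unbiased
print(var_N_sample_mean.mean())  # close to true_var*(N-1)/N: biased low
</pre>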
'''Answer'''
Let <math>f=(x-X)^2</math> in the equation for <math>\epsilon^{2}_{F_{N}}</math> above, so that <math>F_{N}= var_{N}\left\{ x \right\}</math>, <math>\left\langle f \right\rangle = var \left\{ x \right\} </math> and <math>var \left\{f \right\} = var \left\{ \left( x-X \right)^{2} - var \left[ x-X \right] \right\}</math>. Then:
 
:<math>\epsilon^{2}_{F_{N}}= \frac{1}{N} \frac{var \left\{ \left( x-X \right)^{2} - var \left[x \right] \right\} }{ \left( var \left\{ x \right\} \right)^{2} }</math>
 
This is easiest to understand if we first expand only the numerator to obtain:
:<math>var \left\{ \left( x- X \right)^{2} - var\left[x \right] \right\} = \left\langle \left( x- X \right)^{4} \right\rangle  - \left[ var \left\{ x \right\} \right]^2</math>
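Written out for completeness (this intermediate algebra is not in the original text): since <math>var\left\{x\right\}</math> is a constant, subtracting it does not change the variance, and <math>\left\langle \left(x-X\right)^{2}\right\rangle = var\left\{x\right\}</math>, so

:<math>var \left\{ \left( x-X \right)^{2} - var\left[x\right] \right\} = var \left\{ \left( x-X \right)^{2} \right\} = \left\langle \left( x-X \right)^{4} \right\rangle - \left\langle \left( x-X \right)^{2} \right\rangle^{2} = \left\langle \left( x-X \right)^{4} \right\rangle - \left[ var \left\{ x \right\} \right]^{2}</math>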
 
Thus
:<math>\epsilon^{2}_{var_{N}} = \frac{1}{N}\left[ \frac{\left\langle \left( x- X \right)^4 \right\rangle}{\left[ var \left\{ x \right\} \right]^2 } - 1 \right]</math>
 
Obviously, to proceed further we need to know how the fourth central moment relates to the second central moment. As noted earlier, in general this is ''not'' known. If, however, it is reasonable to assume that <math>x</math> is a Gaussian distributed random variable, we know from the previous section on [[Probability in turbulence#Skewness and kurtosis|skewness and kurtosis]] that the kurtosis is 3. Then for Gaussian distributed random variables,
:<math>\epsilon^{2}_{var_{N}} = \frac{2}{N}</math>
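A quick numerical check of this <math>2/N</math> behaviour (an added sketch; the mean, standard deviation and sample sizes below are arbitrary assumptions):

<pre>
import numpy as np

rng = np.random.default_rng(2)

X, sigma = 0.0, 1.5      # assumed true mean and standard deviation
N, M = 50, 100000        # samples per variance estimate, number of estimates

x = rng.normal(X, sigma, size=(M, N))
var_N = ((x - X)**2).mean(axis=1)        # variance estimator with known mean

eps2_observed = var_N.var() / sigma**4   # squared relative variability
print(eps2_observed, 2.0 / N)            # should agree for Gaussian x
</pre>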
 
Thus the number of independent data required to produce the same level of convergence for an estimate of the variance of a Gaussian distributed random variable is <math>\sqrt{2}</math> times that of the mean. It is easy to show that the higher the moment, the more data are required.
As noted earlier, turbulence problems are not usually Gaussian, and in fact values of the kurtosis substantially greater than 3 are commonly encountered, especially for the moments of differentiated quantities. Clearly the non-Gaussian nature of random variables can affect the planning of experiments, since substantially greater amounts of data can be required to achieve the necessary statistical accuracy.
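Since the result above can be written as <math>\epsilon^{2}_{var_{N}} = \left(K-1\right)/N</math>, where <math>K = \left\langle\left(x-X\right)^{4}\right\rangle/\left[var\left\{x\right\}\right]^{2}</math> is the kurtosis, the effect on experiment planning can be illustrated with a short sketch (the kurtosis values and the 1% error target are assumed for illustration only):

<pre>
# Required number of independent samples from eps^2_varN = (K - 1)/N,
# i.e. N = (K - 1)/eps^2, for a target relative rms error eps of var_N.
def samples_needed(kurtosis, eps=0.01):
    return (kurtosis - 1.0) / eps**2

for K in (3.0, 6.0, 12.0):   # Gaussian, plus two assumed non-Gaussian values
    print(K, int(samples_needed(K)))
# K = 3 needs 20000 samples for 1% rms error; larger K needs (K - 1)/2 times more.
</pre>

For the Gaussian value <math>K=3</math> this reproduces the <math>2/\epsilon^{2}</math> requirement implied above; larger kurtosis values increase the data requirement in direct proportion to <math>K-1</math>.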
{| class="toccolours" style="margin: 2em auto; clear: both; text-align:center;"
|-
| [[Statistical analysis in turbulence|Up to statistical analysis]] | [[Estimation from a finite number of realizations|Back to estimation from a finite number of realizations]]
|}

{{Turbulence credit wkgeorge}}

[[Category: Turbulence]]
