How to compute the cumulative distribution functions and the percent point functions of various commonly used distributions in Excel, R and Python.
I use Excel (in conjunction with Tanagra or Sipina), R and Python for the practical classes of my courses about data mining and statistics at the University. Often, I ask students to perform hypothesis tests or to calculate confidence intervals, etc.
We work on computers, it is obviously out of the question to use the statistical tables to obtain the quantile or p-value of the commonly used distribution functions. In this tutorial, I present the main functions for normal distribution, Student's t-distribution, chi-squared distribution and Fisher-Snedecor distribution. I realized that students sometimes find it difficult to match the reading of statistical tables with the functions they have difficulty identifying in software. It is also an opportunity for us to verify the equivalences between the functions proposed by Excel, R (stats package) and Python (scipy package). Whew! At least on the few illustrative examples given in our document, the results are consistent.
Keywords: excel, r, stats package, python, scipy package, p-value, quantile, cdf, cumulative distribution function, ppf, percent point function, quantile function
Tutorial: CDF and PPF
Subscribe to:
Post Comments (Atom)
Find us on Facebook
Find us on Google Plus
Labels
- Association rules (8)
- Clustering (14)
- Data file handling (17)
- Decision tree (21)
- Exploratory Data Analysis (17)
- Feature Construction (6)
- Feature Selection (8)
- PLS Regression (5)
- Python (11)
- Regression analysis (13)
- Sipina (23)
- Software Comparison (49)
- Statistical methods (3)
- Supervised Learning (67)
- Tanagra (13)
- Text Mining (2)



No comments:
Post a Comment