1. 1.1. In a petrol-producing company, an analysis of the octane number was made in 5 tanks, with results 90.0, 89.8, 89.6, 90.1, 90.0. Compute the 95 % confidence interval for the mean octane number. Further, test the hypothesis (at the 5 % level) that the mean octane number is 90, against the alternative hypothesis that it differs from 90.
1.2. In order to estimate this year's inventory (excess inventory), a tire company sampled 6 of its dealers, getting the following data (X = inventory last year, Y = inventory this year):
X:  70  260  150  100  20  60
Y:  60  320  230  120  50  60
In order to show the relation between this year's and last year's inventory, a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Assume that, from complete data for all 20 company dealers, the total inventory last year was 3500 tires. Estimate the mean and variance of this year's total inventory (use the estimated a, b, s as true parameters). Remark, not important for the solution: from the central limit theorem (C.L.Th.) it follows that the distribution of the total inventory is approximately normal.
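For the straight-line exercises such as 1.2, parts a), b) and d) reduce to a few sums of squares. The following is a minimal sketch in plain Python, not a prescribed solution method. The function name `fit_line` is hypothetical, the residual variance is assumed to use the divisor n - 2, and part e) additionally assumes independent errors across dealers.

```python
import math

def fit_line(x, y):
    """Least-squares line y = a + b*x; returns (a, b, r, s2)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = my - b * mx                    # intercept
    r = sxy / math.sqrt(sxx * syy)     # correlation coefficient
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                 # assumption: n - 2 degrees of freedom
    return a, b, r, s2

# Data of exercise 1.2
x = [70, 260, 150, 100, 20, 60]
y = [60, 320, 230, 120, 50, 60]
a, b, r, s2 = fit_line(x, y)
print(f"a = {a:.3f}, b = {b:.4f}, r = {r:.4f}, s^2 = {s2:.1f}")

# Part e), taking a, b, s as true parameters and errors as independent:
# the total over 20 dealers is the sum of 20 terms a + b*x_i + e_i.
n_dealers, total_x = 20, 3500
mean_total = n_dealers * a + b * total_x
var_total = n_dealers * s2
print(f"mean of total = {mean_total:.1f}, variance of total = {var_total:.1f}")
```

Plotting (part c) is left out here because it needs a plotting library outside the standard library.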
2. 2.1. The content of SiO2 in a material is measured by two different techniques. 5 analytic measurements were made, with results 20.1, 19.6, 20.0, 19.7, 20.1 (units), while 6 photometric measurements gave the following results: 20.6, 20.1, 20.6, 20.5, 20.7, 20.5. Perform tests of the hypotheses (at the 10 % level) that both methods yield: a) the same mean values, b) the same variances (this is, in fact, a necessary condition for the validity of the test in a)).
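A possible way to organise the two tests of exercise 2.1 is sketched below in plain Python. It assumes normally distributed measurements, a pooled-variance t test for a), and an F test with the larger sample variance in the numerator for b). The critical values are copied from standard tables (they are not computed here) and correspond to the two-sided 10 % level.

```python
import math
from statistics import mean, variance

analytic = [20.1, 19.6, 20.0, 19.7, 20.1]
photometric = [20.6, 20.1, 20.6, 20.5, 20.7, 20.5]
n1, n2 = len(analytic), len(photometric)
v1, v2 = variance(analytic), variance(photometric)   # sample variances

# b) F statistic; here v1 > v2, so the degrees of freedom are (4, 5)
f_stat = max(v1, v2) / min(v1, v2)
F_CRIT = 5.19      # table value: upper 5 % point of F(4, 5)

# a) pooled-variance two-sample t statistic with n1 + n2 - 2 = 9 d.f.
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_stat = (mean(analytic) - mean(photometric)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
T_CRIT = 1.833     # table value: t_{0.95}(9)

reject_equal_variances = f_stat > F_CRIT
reject_equal_means = abs(t_stat) > T_CRIT
print(f"F = {f_stat:.3f}, reject equal variances: {reject_equal_variances}")
print(f"t = {t_stat:.3f}, reject equal means: {reject_equal_means}")
```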
2.2. To see how productivity was related to the level of maintenance, a firm randomly selected 5 of its machines. Each of them was randomly assigned a different level of maintenance X, and its average number of stoppages Y was observed:
Hours of maintenance per week, X:  4    6    8    10   12
Stoppages per week, Y:             1.6  1.1  0.8  0.6  0.5
a) Find the correlation coefficient r (x,y). b) Find the parameters of the parabolic regression y = a + b*x + c*x^2. c) Plot the regression curve and the data. d) Compute also the residual variance s^2. e) If one hour of maintenance costs about 50 Euro and one stoppage costs on average 600 Euro, what seems to be an optimal level of maintenance per week?
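The parabolic fit in exercise 2.2 can be obtained from the 3x3 normal equations. The sketch below is one way to do it in plain Python. The helper names `solve3` and `fit_parabola` are hypothetical, the residual variance is assumed to use the divisor n - 3, and part e) assumes the weekly cost is 50*x + 600*y(x) with the fitted parabola taken as true.

```python
def solve3(m, v):
    """Solve the 3x3 linear system m*z = v by Gaussian elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]   # augmented matrix
    n = 3
    for col in range(n):
        # partial pivoting: bring the largest entry of the column up
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):                   # back substitution
        s = sum(a[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (a[i][n] - s) / a[i][i]
    return z

def fit_parabola(x, y):
    """Least-squares fit of y = a + b*x + c*x^2; returns (a, b, c, s2)."""
    sx = [sum(xi ** k for xi in x) for k in range(5)]            # sums of x^0..x^4
    sxy = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[sx[0], sx[1], sx[2]],
         [sx[1], sx[2], sx[3]],
         [sx[2], sx[3], sx[4]]]
    a, b, c = solve3(m, sxy)
    sse = sum((yi - (a + b * xi + c * xi ** 2)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (len(x) - 3)      # assumption: n - 3 degrees of freedom
    return a, b, c, s2

# Data of exercise 2.2
x = [4, 6, 8, 10, 12]
y = [1.6, 1.1, 0.8, 0.6, 0.5]
a, b, c, s2 = fit_parabola(x, y)
print(f"a = {a:.4f}, b = {b:.4f}, c = {c:.5f}, s^2 = {s2:.6f}")

# Part e): minimise cost(x) = 50*x + 600*(a + b*x + c*x^2).
# Setting the derivative 50 + 600*(b + 2*c*x) to zero gives:
x_opt = -(50 / 600 + b) / (2 * c)
print(f"cost-minimising maintenance: {x_opt:.2f} hours per week")
```

The stationary point is a minimum only when the fitted c is positive, which is worth checking before reading off `x_opt`.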
3. 3.1. Let Xi: 6.5, 6.8, 6.7, 6.7, 7.0, 6.9, 7.1 be 7 measurements of the weights of certain objects. a) Compute the 95 % confidence interval for the mean value of the weight. b) Using a), test the hypothesis that the theoretical mean (expectation) is 7, against the alternative that it differs from 7.
3.2. During the 1950s, radioactive waste leaked from the storage area near Hanford, WA, into the Columbia River. For 9 counties downstream in Oregon, an index of exposure, X, and the cancer mortality, Y (per 100 000 person-years, over 1959-1964), were calculated:
Exposure index, X:  8.3  6.4  3.4  3.8  2.6  11.6  1.2  2.5  1.6
Mortality, Y:       210  180  130  170  130  210   120  150  140
In order to show the relation between X and Y, a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Estimate the 'natural' cancer mortality if X=0, i.e. its mean level; what is its variance? (take the estimated a, b, s as true parameters)
4. 4.1. From a high school, a set of 8 students has been selected for examination. Their scores (in points, with 100 being the maximum) were 65, 60, 54, 49, 71, 62, 57, 70. a) Compute the 95 % confidence interval for the mean score (assuming that the score has a normal distribution). b) A good standard score is 70. Are the students significantly worse? That is, test the hypothesis H0: µx = 70 against the alternative µx < 70.
4.2. Consider a machine park with 5 machines; let variable X denote the (preventive) service (hours per week) and variable Y the number of failures per week (averaged):
X:  2    4    6    8    10   12
Y:  1.8  1.3  0.9  0.7  0.6  0.5
Analyse the dependence of Y on X: a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Assume that x hours of preventive service cost x^2 times 50 CzK, while the repair of one failure costs (on average) 15000 CzK. Estimate the optimal length of service time per week (in the sense of minimal expected expenses).
5. 5.1. From a high school, a set of 8 students has been selected for examination. Their scores (in points, with 100 being the maximum) were Xi = 65, 60, 54, 49, 71, 62, 57, 70. The same 8 students were then intensively trained. After one month, their results were the following: Yi = 70, 69, 71, 72, 68, 75, 85, 82. Is the improvement of the score significant? That is, test the hypothesis H0: µx = µy against the alternative µx < µy (at the 5 % level of significance).
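Because the same 8 students are measured twice in exercise 5.1, one natural approach is a paired t test on the differences Yi - Xi. The following plain-Python sketch assumes the differences are approximately normal. The critical value is taken from a standard t table rather than computed.

```python
import math
from statistics import mean, stdev

before = [65, 60, 54, 49, 71, 62, 57, 70]   # Xi
after = [70, 69, 71, 72, 68, 75, 85, 82]    # Yi, same students in the same order

diffs = [y - x for x, y in zip(before, after)]
n = len(diffs)

# One-sample t statistic applied to the differences (n - 1 = 7 d.f.)
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))

T_CRIT = 1.895   # table value: t_{0.95}(7), one-sided 5 % level
improvement_significant = t_stat > T_CRIT
print(f"mean difference = {mean(diffs):.1f}, t = {t_stat:.3f}, "
      f"significant: {improvement_significant}")
```

Treating the two samples as independent would ignore the pairing and generally gives a different, less powerful test.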
5.2. Assume that the total repair time (Y) is (more or less) proportional to the number of repaired items (X). From 9 repair shops the following data are available:
Number of repaired items, X:  7   6   5   1   5   4   7    3   4
Elapsed time (hours), Y:      97  86  78  10  75  52  101  39  53
a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Estimate the time needed for the repair of 10 items, i.e. its mean; what is its variance? (take the estimated a, b, s as true parameters)
6. 6.1. Two types of glass were examined for their resistance to high temperature. 6 specimens of the 1st type broke at temperatures Xi = 395, 410, 390, 370, 415, 390 (°C), while 5 specimens of the 2nd type broke at Yi = 400, 420, 415, 425, 420 (°C). Decide (by statistical tests) whether the mean breaking temperatures of the 1st and 2nd type differ significantly; test also the equality of variances in both groups.
6.2. In a survey, incomes and savings of families were investigated. Let us have the data from 5 families (with the same number of children, say), in thousands of USD per year:
Income, X:  30  70  20  30  50
Saving, Y:  3   6   2   1   3
In order to show the relation between Y and X, a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Assume that there are 100 families of this kind, with a mean yearly income of 40 000 USD. Estimate the mean and variance of the sum of their yearly savings (use the estimated a, b, s as true parameters). Remark, not important for the solution: from the central limit theorem (C.L.Th.) it follows that the distribution of the total saving is approximately normal.
7. 7.1. In Liberec, last year, 960 newborn children were registered, of whom 550 were boys. Are these numbers in contradiction with the known general proportion of 53 boys per 100 children? (Check it by a statistical test of the hypothesis H0: p = 0.53, in the framework of the binomial distribution.)
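With n = 960 the binomial count in exercise 7.1 is commonly handled through its normal approximation. The sketch below takes that route, which is an assumption of this example (an exact binomial test is also possible), and computes a two-sided p-value with the standard library only.

```python
import math

n, boys, p0 = 960, 550, 0.53

p_hat = boys / n
se0 = math.sqrt(p0 * (1 - p0) / n)     # standard error under H0: p = p0
z = (p_hat - p0) / se0

# Two-sided p-value of the standard normal: 2 * (1 - Phi(|z|))
p_value = math.erfc(abs(z) / math.sqrt(2))

reject_h0 = p_value < 0.05             # assumed significance level of 5 %
print(f"p_hat = {p_hat:.4f}, z = {z:.3f}, p-value = {p_value:.4f}, "
      f"reject H0: {reject_h0}")
```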
7.2. In a pilot study of a new fertilizer, 4 levels were randomly assigned to 4 experimental fields, resulting in the following yields of corn:
Fertilizer (kg/ha), X:  10  20  40  50
Yield (100 kg/ha), Y:   30  37  45  50
a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Estimate the yield corresponding to 30 kg of fertilizer per hectare, i.e. its mean; what is its variance? f) Predict the increase of yield for one additional 10 kg of fertilizer. (for e) and f) take the estimated a, b, s as true parameters)
8. 8.1. In a petrol-producing company, an analysis of the octane number was made in 6 tanks, with results 95.0, 94.7, 94.8, 95.1, 95.1, 94.6. Compute the 95 % confidence interval for the mean octane number. Further, test the hypothesis (at the 5 % level) that the mean octane number is 95, against the alternative hypothesis that it differs from 95.
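The confidence-interval exercises for a mean (such as 8.1) follow one pattern: sample mean plus or minus a t quantile times the standard error. A minimal plain-Python sketch for the data of 8.1 is given below. It assumes normally distributed measurements, and the t quantile is a table value, not something computed by the code.

```python
import math
from statistics import mean, stdev

octane = [95.0, 94.7, 94.8, 95.1, 95.1, 94.6]
mu0 = 95.0                       # hypothesised mean

n = len(octane)
m = mean(octane)
se = stdev(octane) / math.sqrt(n)   # standard error of the mean

T_CRIT = 2.571                   # table value: t_{0.975}(5), two-sided 5 % level
ci = (m - T_CRIT * se, m + T_CRIT * se)

t_stat = (m - mu0) / se
reject_h0 = abs(t_stat) > T_CRIT
print(f"mean = {m:.3f}, 95 % CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

The two-sided test at the 5 % level rejects H0 exactly when mu0 lies outside the 95 % interval, so the two parts of the exercise are consistent by construction.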
8.2. In a study of how wheat yield depends on fertilizer, suppose that data are available from only 7 observations:
Fertilizer (kg/ha), X:  10  20  30  40  50  60  70
Yield (100 kg/ha), Y:   40  50  50  70  65  65  80
a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Estimate the yield corresponding to 35 kg of fertilizer per hectare, i.e. its expectation; what is its variance? f) Predict the increase of yield for one additional 10 kg of fertilizer. (for e) and f) take the estimated a, b, s as true parameters)
9. 9.1. In an American university, 10 men and 5 women professors were sampled randomly (in 1969), yielding annual salaries (in thousand dollars); Men: 13, 11, 19, 15, 22, 20, 14, 17, 14, 15. Women: 9, 12, 8, 10, 16. Decide (by statistical tests) whether the mean salaries of men and women differ significantly, test also the equality of variances in both groups.
9.2. The number of drinks consumed and the corresponding blood alcohol concentrations (in %) are listed for several people (of approximately the same body weight):
Number of drinks, X:             2     2     4     5     8
Blood alcohol concentration, Y:  0.05  0.06  0.11  0.13  0.18
Analyze the relationship: a) Find the correlation coefficient r (x,y). b) Find the equation of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Estimate the level of blood alcohol after 3 such drinks, i.e. its expectation; what is its variance? (take the estimated a, b, s as true parameters)
10. 10.1. From the following data (heights of 10 randomly selected students, in cm), compute the 90 % confidence interval for the mean height: 174, 178, 183, 168, 163, 175, 178, 177, 169, 182. Further, test the hypothesis that the mean height is 177 cm against the alternative that it is significantly different.
10.2. The table below lists the values (in 10^9 USD) of U.S. exports and incomes on foreign investments for various sample years.
Exports, X:  16  20  27  39  56  63
Incomes, Y:  2   3   4   7   11  11
Analyze the relationship: a) Find the correlation coefficient r (x,y). b) Find the equation of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Predict the increase of income for an additional 5*10^9 USD of exports. (take the estimated a, b, s as true parameters)
11. 11.1. In a sample survey of 1000 voters, 276 voters of the Liberal party (LP) were observed. Compute the 95 % confidence interval for the proportion p of LP voters in the whole population. Then decide whether the optimistic assumption of the LP leader, that the LP will obtain 30 % of the votes in the elections, is realistic, i.e. test the hypothesis that p = 0.3.
11.2. The following table lists midterm and final exam scores for randomly selected students in a math course:
Midterm, X:  82  65  93  70  80
Final, Y:    94  77  94  79  91
a) Find the correlation coefficient r (x,y). b) Find the equation of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance. e) Predict the final score for a student having midterm score 75, i.e. its expectation; what is its variance? (take the estimated a, b, s as true parameters)
12. 12.1. To measure the effect of a fitness campaign, a sports club randomly selected 5 members and compared their weights before and after the campaign: Before: 84, 92, 78, 90, 82; After: 79, 85, 80, 86, 78. Decide, by a statistical test, whether the mean weight loss was significant.
12.2. A psychologist suspects that there is a relationship between the scores on a test for motivation and the scores on a personality test. The following sample data were collected:
Motivation test, x:   49  36  62  50  51  42
Personality test, y:  16  14  18  20  17  12
a) Find the correlation coefficient r (x,y). b) Find the equation of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance. e) Estimate the personality test score for a person having motivation test score 40, i.e. its expectation; what is its variance? (take the estimated a, b, s as true parameters)
13. 13.1. Let the following data represent times to failure (in hours of work) of 10 identical devices: 16, 43, 13, 7, 12, 14, 6, 26, 10, 54. Compute the 95 % confidence interval for the variance σ^2 and then test the hypothesis σ = 10 against the alternative that σ differs significantly from 10. In what direction?
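For exercise 13.1, the usual interval for a variance rests on the chi-square distribution of (n-1)*s^2/σ^2, which assumes normally distributed data; that assumption is taken for granted in this sketch. The chi-square quantiles below are table values for 9 degrees of freedom, not computed by the code.

```python
from statistics import variance

times = [16, 43, 13, 7, 12, 14, 6, 26, 10, 54]   # hours to failure
n = len(times)
s2 = variance(times)                              # sample variance, divisor n - 1

# Table values of the chi-square distribution with n - 1 = 9 d.f.
CHI2_LOW = 2.700     # 2.5 % quantile
CHI2_HIGH = 19.023   # 97.5 % quantile

# 95 % confidence interval for sigma^2
ci_var = ((n - 1) * s2 / CHI2_HIGH, (n - 1) * s2 / CHI2_LOW)

sigma0_sq = 10 ** 2
reject_h0 = not (ci_var[0] <= sigma0_sq <= ci_var[1])
direction = "larger" if sigma0_sq < ci_var[0] else "smaller" if sigma0_sq > ci_var[1] else "none"
print(f"s^2 = {s2:.1f}, 95 % CI for sigma^2 = ({ci_var[0]:.1f}, {ci_var[1]:.1f})")
print(f"reject sigma = 10: {reject_h0}, direction: {direction}")
```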
13.2. The following table shows the dependence of a car's mean gas consumption (l/100 km) on its speed (km/hour):
Speed, X:        70  80   90   100  110  120
Consumption, Y:  5   5.1  5.4  6.1  7.1  8.3
a) Find the correlation coefficient r (x,y). b) Find the parameters of the parabolic regression y = a + b*x + c*x^2. c) Plot the regression curve and the data. d) Compute also the residual variance s^2. e) Predict the consumption for the speed 130 km/hour, i.e. its expectation; what is the variance of such a prediction? (take the estimated a, b, c, and s as true parameters)
14. 14.1. For the following data (reaction times, in seconds, of 12 drivers to the sudden appearance of an animal on the road), compute the 95 % confidence interval for the mean reaction time. Then test the hypothesis that the mean reaction time (the parameter µ of the underlying probability distribution) is shorter than 0.7 second. Data: 0.74 0.71 0.41 0.82 0.74 0.85 0.99 0.71 0.57 0.85 0.57 0.55.
14.2. The table contains measurements of the dependence of water density d (in kg/dm^3) on its temperature t (in °C):
t:  0      10     20     30     40     50     60     70     80     90     100
d:  1.000  0.999  0.997  0.996  0.993  0.987  0.983  0.978  0.973  0.964  0.958
a) Find the correlation coefficient r (x,y). b) Find the equation of the parabolic regression d = a + b*t + c*t^2. c) Plot the regression curve and the data. d) Compute also the residual variance.
15. 15.1. From a high school, a set of 8 students has been selected for examination. Their scores (in points, with 100 being the maximum) were Xi = 65, 60, 54, 49, 71, 62, 57, 70. From another school, 10 students were selected and examined. Their results were the following: Yi = 70, 69, 71, 72, 68, 75, 85, 82, 69, 75. A) Is the difference of the scores significant? That is, test the hypothesis H0: µx = µy against the alternative µx < µy (at the 5 % level of significance). B) Estimate also the variances in both samples and test the hypothesis σX = σY (also at the 5 % level).
15.2. To see how productivity was related to the level of maintenance, a firm randomly selected 5 of its machines. Each of them was randomly assigned a different level of maintenance X, and its average number of stoppages Y was observed:
Hours of maintenance per week, X:  4    6    8    10   12
Stoppages per week, Y:             1.6  1.1  0.8  0.6  0.5
a) Find the correlation coefficient r (x,y). b) Find the parameters of the parabolic regression y = a + b*x + c*x^2. c) Plot the regression curve and the data. d) Compute also the residual variance s^2. e) If one hour of maintenance costs about 50 Euro and one stoppage costs on average 600 Euro, what seems to be an optimal level of maintenance per week?
16. 16.1. Let Xi: 65, 68, 67, 67, 70, 69, 71, 67 be 8 measurements of the weights of certain objects. a) Compute the 95 % confidence interval for the mean value of the weight. b) Using a), test the hypothesis that the theoretical mean (expectation) is 70, against the alternative that it differs from 70.
16.2. During the 1950s, radioactive waste leaked from the storage area near Hanford, WA, into the Columbia River. For 9 counties downstream in Oregon, an index of exposure, X, and the cancer mortality, Y (per 100 000 person-years, over 1959-1964), were calculated:
Exposure index, X:  8.3  6.4  3.4  3.8  2.6  11.6  1.2  2.5  1.6
Mortality, Y:       210  180  130  170  130  210   120  150  140
In order to show the relation between X and Y, a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Estimate the 'natural' cancer mortality if X=0, i.e. its mean level; what is its variance? (take the estimated a, b, s as true parameters)
17. 17.1. The content of SiO2 in a material is measured by two different techniques. 5 analytic measurements were made, with results 20.1, 19.6, 20.0, 19.7, 20.1 (units), while 6 photometric measurements gave the following results: 20.6, 20.1, 20.6, 20.5, 20.7, 20.5. Perform tests of the hypotheses (at the 5 % level) that both methods yield: a) the same mean values, b) the same variances (this is, in fact, a necessary condition for the validity of the test in a)).
17.2. Consider a machine park with 5 machines; let variable X denote the (preventive) service (hours per week) and variable Y the number of failures per week (averaged):
X:  2    4    6    8    10   12
Y:  1.8  1.3  0.9  0.7  0.6  0.5
Analyse the dependence of Y on X: a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Assume that x hours of preventive service cost x^2 times 50 CzK, while the repair of one failure costs (on average) 15000 CzK. Estimate the optimal length of service time per week (in the sense of minimal expected expenses).
18. 18.1. In a petrol-producing company, an analysis of the octane number was made in 5 tanks, with results 90.0, 89.8, 89.6, 90.1, 90.0. Compute the 95 % confidence interval for the mean octane number. Further, test the hypothesis (at the 5 % level) that the mean octane number is 90, against the alternative hypothesis that it differs from 90.
18.2. Assume that the total repair time (Y) is (more or less) proportional to the number of repaired items (X). From 9 repair shops the following data are available:
Number of repaired items, X:  7   6   5   1   5   4   7    3   4
Elapsed time (hours), Y:      97  86  78  10  75  52  101  39  53
a) Find the correlation coefficient r (x,y). b) Find the parameters of the regression line y = a + b*x. c) Plot the regression line and the data. d) Compute also the residual variance s^2. e) Estimate the time needed for the repair of 10 items, i.e. its mean; what is its variance? (take the estimated a, b, s as true parameters)