Source: UNIVERSITY OF CALIFORNIA submitted to
STATISTICAL RESEARCH METHODS: MODEL FORMULATION: THE ANALYSIS OF LINEAR AND NONLINEAR STATISTICAL METHODS
Sponsoring Institution
State Agricultural Experiment Station
Project Status
TERMINATED
Funding Source
Reporting Frequency
Annual
Accession No.
0078861
Grant No.
(N/A)
Project No.
CA-R*-STA-3711
Proposal No.
(N/A)
Multistate No.
(N/A)
Program Code
(N/A)
Project Start Date
Oct 1, 2004
Project End Date
Sep 30, 2005
Grant Year
(N/A)
Project Director
Beaver, R. J.
Recipient Organization
UNIVERSITY OF CALIFORNIA
(N/A)
RIVERSIDE,CA 92521
Performing Department
STATISTICS
Non Technical Summary
Analysis of data is often limited by the lack of a model that adequately characterizes the data set. This project examines models that display skewness and their applicability to the analysis of agricultural data.
Animal Health Component
50%
Research Effort Categories
Basic
50%
Applied
50%
Developmental
(N/A)
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
901 | 7310 | 2090 | 100%
Goals / Objectives
1. To extend the work on parameter estimation and testing using the family of skew-distributions as models. 2. To introduce these and other new methodologies to researchers dealing with data analysis. 3. To determine when and how skew-normal (as well as other) models can be used in place of the normal distribution in analyses of data involving linear/loglinear models.
Project Methods
Statistical models and procedures will be developed on theoretical grounds and then evaluated using both simulations and actual data when available. Most of these procedures can be carried out on desktop PCs; some may require computers with more memory and faster CPUs, available through the Department of Statistics or its outside computing resources.

Progress 10/01/04 to 09/30/05

Outputs
Administrative termination, no longer at UCR

Impacts
Administrative termination, no longer at UCR

Publications

  • No publications reported this period


Progress 01/01/04 to 12/31/04

Outputs
Research conducted under this project for 2004 was concerned with parameter estimation for both skewed normal models and a variant of this family, called the skew-Normal-Cauchy model. The usual maximum likelihood approach to estimation in both of these families does not provide accurate estimates of the underlying parameters, especially the skewing parameter (lambda), because of the relatively flat likelihood surface. Techniques have been developed to ensure that maximum likelihood estimates exist with high probability (e.g., > 0.99) by appropriate changes in the estimating equations. We have been very successful in the area of estimation in the case of the skew-Normal-Cauchy model. Based upon this same approach, we have been able to make some progress in estimation in the case of the skew-Normal models. This work is still in progress and results should be available this summer.
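
The difficulty with the skewing parameter can be illustrated with a minimal sketch. It assumes the standard skew-normal density 2*phi(x)*Phi(lambda*x) with location 0 and scale 1; the report does not state its parameterization, so this form, the function names and the sample values are assumptions. When every observation in a small sample happens to be positive, the log-likelihood keeps rising as lambda grows, so no finite maximum likelihood estimate exists. The sketch does not implement the modified estimating equations mentioned above.

```python
import math


def normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal distribution function Phi(x), computed via erfc."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def skew_normal_loglik(lam, data):
    """Log-likelihood of lam under the assumed density 2*phi(x)*Phi(lam*x)."""
    return sum(
        math.log(2.0) + math.log(normal_pdf(x)) + math.log(normal_cdf(lam * x))
        for x in data
    )


# Hypothetical small sample in which every observation is positive.
all_positive = [0.2, 0.5, 1.1, 1.7]
for lam in (0.0, 1.0, 5.0, 20.0):
    # The printed log-likelihood increases with lam for this sample.
    print(lam, round(skew_normal_loglik(lam, all_positive), 4))
```

For samples that contain both positive and negative values the likelihood does have a finite maximum, but it can still be nearly flat in lambda, which is the situation the report describes.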

Impacts
The development of new models appropriate for describing data exhibiting various degrees of skewing allows the applied statistician more latitude in the ways that the data can be analyzed. Accurate methods of parameter estimation are required when using model-based analyses. The estimation methods we have developed allow a statistician to assess the utility of these models for data whose distributions range from an ordinary normal distribution to distributions exhibiting anywhere from little to extreme skewing.

Publications

  • Arnold, B.C. and R.J. Beaver. 2004. Some additive component to skewness models. Journal of Probability and Statistical Science 2, 139-147.
  • Arnold, B.C. and R.J. Beaver. 2004. Alternative constructions of skewed multivariate distributions. Acta et Commentationes Universitatis Tartuensis de Mathematica, 8, to appear.


Progress 01/01/03 to 12/31/03

Outputs
Research conducted under this project for the year 2003 was concerned with several models that are useful when the data at hand exhibit various degrees of skewing. The family of densities is in fact very rich and can be generated using various mathematical approaches. The first approach was to look at skewed models that involved multiple constraints, that is, the vector of random variables, X, will be observed only if X satisfies a system of linear constraints such as a + bX > Z. This approach produces a family of densities that includes a new class described as the family of closed skewed normals, and of course, a totally new class when the basic densities are not normal. In addition, an additive construction approach whereby a skewed variable Xo was added to each of the variables in the vector X produced an equivalent family of densities that includes the skewed normal, skewed Cauchy and skewed Laplace densities. Another extension of the family of skewed normal (and nonnormal) densities to the class of elliptical densities was a further topic of research. Again it was assumed that the random variable X was observed only if it satisfied a system of probabilistic linear constraints. For general elliptically contoured models, the various construction scenarios lead to different families of joint densities, in contrast to the normal case in which they all lead to the same model. Professor S. R. Spindler (Department of Biochemistry) and his research group found that in the comparison of the survival rates for animals that are prone to cancer, the survival of a group subjected to caloric restriction was significantly increased by as much as 4-to-5 months when compared with the control group. This analysis employed a linear-spline model to describe the relevant data comprised of survival times of mice commencing at time zero and continuing for 22 months, at which time all the animals had died.
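
The constraint mechanism described above can be sketched by simulation. This is a minimal illustration, assuming the simplest univariate case: X and Z are independent standard normal variables and X is kept only when a + bX > Z. With a = 0 the retained values follow the standard skew-normal density 2*phi(x)*Phi(b*x), whose mean is sqrt(2/pi) * b / sqrt(1 + b^2). The function name, the seed and the choice of standard normal components are assumptions made for illustration, not the constructions in the technical reports.

```python
import math
import random


def hidden_truncation_sample(size, a, b, rng):
    """Draw `size` values of X, keeping X only when a + b*X > Z.

    X and Z are independent standard normal draws (an assumption of this sketch).
    """
    kept = []
    while len(kept) < size:
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        if a + b * x > z:
            kept.append(x)
    return kept


rng = random.Random(2003)  # arbitrary seed, for reproducibility only
b = 2.0
sample = hidden_truncation_sample(20000, a=0.0, b=b, rng=rng)

sample_mean = sum(sample) / len(sample)
# Mean of the standard skew-normal with skewing parameter b.
theoretical_mean = math.sqrt(2.0 / math.pi) * b / math.sqrt(1.0 + b * b)
print(round(sample_mean, 3), round(theoretical_mean, 3))
```

Replacing the standard normal draw for X with another symmetric density (for example Cauchy or Laplace) gives skewed versions of that density by the same selection mechanism.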

Impacts
The development of new models appropriate for describing data exhibiting various degrees of skewing allows the applied statistician more latitude in the ways that the data can be analyzed. A statistician can assess the utility of these models only by using them in analyzing appropriate data described by these densities. We are at that stage whereby we are looking into the use of these models in analysis. The results of the Spindler finding will have various impacts in cancer treatment and management.

Publications

  • Arnold, B.C. and R.J. Beaver (2003). Some additive component skewness models. To appear in the Journal of Probability and Statistical Sciences.
  • Arnold, B.C. and R.J. Beaver (2003). Skew models involving multiple constraints. Technical Report 287, Department of Statistics, University of California, Riverside.
  • Arnold, B.C. and R.J. Beaver (2003). Elliptical models subject to hidden truncation or selective sampling. To appear as Chapter 8 in the volume Skewed-Elliptical Distributions: A Journey Beyond Normality, edited by Marc Genton, NCSU.
  • Spindler, S.R., Spindler Research Group, and R.J. Beaver (2003). Temporal linkage between the phenotypic and genomic responses to caloric restriction. To appear in the Proceedings of the National Academy of Sciences.


Progress 01/01/02 to 12/31/02

Outputs
Research conducted under this project for the year 2002 was concerned with further work in parametric inference using generalized ranked set data, a useful approach when ranking individuals in a sample can be easily accomplished through visual inspection or a simple measuring technique. Rank set sampling proceeds as follows: A sample of size n is selected and the smallest observation in the set is identified and the corresponding measurement recorded. A second sample of size n is selected and this time the second smallest observation in the set is identified and recorded. This continues until the last sample of size n is selected, and the largest observation in this set is identified and measured. Parametric inference in this situation was based upon the use of the E-M Algorithm and the use of the Gibbs sampler. Tables of simulated critical values of two suggested test statistics and their power in differentiating among competing models using different probability distributions are reported in Technical Report 268, Department of Statistics, UCR. Further work investigated multivariate survival models with hidden truncation in which the observations are viewed as survival times of components in a system. When the components had exponential distributions, it was not possible to determine if there had been any hidden truncation, and, in fact, the resulting distribution behaves as if there had been a simple scale change in the variables. A second part of my project as the current Director of the Statistics Consulting Laboratory is to serve as a statistical resource person for students, faculty and staff within the College of Natural and Agricultural Sciences and across campus. During the year 2002 I have consulted with researchers from the Departments of Biology, Botany and Plant Sciences, Entomology, Environmental Toxicology, and Plant Pathology, the Anderson School of Management, Eden Bioscience Corporation, and the Lanterman Development Center, Pomona.

Impacts
The methods developed using generalized rank set sampling provide a useful tool in gathering information quickly and easily. Analysis of rank set data can utilize the tables that we have created through simulation. The impact of our research involving exponential survival models is that we are not able to determine if the data collected has been selected using truncation based on other variables.

Publications

  • Arnold, B. C., R. J. Beaver, E. Castillo and Jose Maria Sarabia. 2002. Percentiles and power of goodness of fit tests based on generalized ranked set data. Technical Report No. 268, Department of Statistics, University of California, Riverside, 22 pages.
  • Arnold, B. C. and R. J. Beaver. 2002. "Multivariate survival models involving hidden truncation." In Distributions with Given Marginals and Statistical Modelling, 9-19, Kluwer Academic Publishers, Netherlands.
  • Arnold, B. C., R. J. Beaver. 2002. Alternative construction of skewed multivariate distributions. Technical Report No. 270, Department of Statistics, University of California, Riverside, 20 pages.


Progress 01/01/01 to 12/31/01

Outputs
Research conducted under this project for the year 2001 was concerned with parametric inference using generalized ranked set data, which is useful when ranking of individuals in a sample can be easily accomplished through visual inspection or a simple measuring technique. Rank set sampling proceeds as follows: A sample of size n is selected and the smallest observation in the set is identified and the corresponding measurement recorded. A second sample of size n is selected and this time the second smallest observation in the set is identified and recorded. This continues until the last sample of size n is selected, and the largest observation in this set is identified and measured. The resulting n observations consist of independent order statistics from the sampled population F, which may depend upon unknown parameters. Parametric inference in this situation was based upon the use of the E-M Algorithm and the use of the Gibbs sampler. Both techniques produce parameter estimates by treating the problem as one with missing units, whereby only n measurements are recorded from the n^2 observations taken. Tables were produced of simulated critical values of two suggested test statistics and of their power in differentiating among competing models using different probability distributions. A second part of my project as the current director of the Statistics Consulting Laboratory is to serve as a statistical resource person for students, faculty and staff within the College of Natural and Agricultural Sciences and across campus. During the year 2001 I have consulted with researchers from the Departments of Biochemistry, Botany and Plant Sciences, Entomology, Geology and Plant Pathology, the Anderson School of Management, and the Jerry Pettis Veteran's Hospital at Loma Linda.
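
The sampling scheme described above can be sketched in a few lines. This is a minimal illustration only: sorting stands in for the visual ranking step (that is, ranking is assumed to be perfect), and the function name, the seed and the exponential population are assumptions made for the example.

```python
import random


def ranked_set_sample(draw, n):
    """Return n measurements from n independent samples of size n.

    `draw` is a zero-argument callable returning one observation. From the
    i-th sample only the i-th smallest value is recorded, so n of the n*n
    observations are kept.
    """
    recorded = []
    for i in range(n):
        ranked = sorted(draw() for _ in range(n))
        recorded.append(ranked[i])
    return recorded


rng = random.Random(2001)  # arbitrary seed, for reproducibility only
# Hypothetical population: exponential with rate 1.
print(ranked_set_sample(lambda: rng.expovariate(1.0), n=4))
```

Each recorded value comes from a different sample, which is why the n measurements are independent order statistics rather than the order statistics of a single sample.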

Impacts
The research dealing with skewed variants of both normal and nonnormal models opens the way for alternative distributions available for modeling data that are unimodal, but skewed. The research on the analysis of intercropping experiments demonstrates that the use of correlation structure in the analysis increases the precision of parameter estimates.

Publications

  • Arnold, B.C. and Beaver, R.J. 2002. Parametric Inference with Generalized Rank Set Data. (26 manuscript pages). To appear in the Golden Jubilee Volume: Emerging Areas in Probability, Statistics and Operations Research. Indian Institute of Technology, Kharagpur, India.


Progress 01/02/00 to 12/31/00

Outputs
Research conducted under this project for the year 2000 was concerned with (1) a family of distributions within a class of skewed distributions based upon hidden truncation, (2) multivariate survival models incorporating hidden truncation, and (3) an assessment of goodness-of-fit (GOF) based upon record data and rank set sampling data. The distributions in (1) can be used as alternatives to the normal distribution, the distribution upon which many statistical procedures are based. The specific distribution included in this class was the Cauchy distribution. The family of multivariate survival models (2) conditioned on a linear combination of the variables being less than or equal to the value of an independent exponential random variable produced interesting results in that if the original variables are mutually independent, so are the new variables after conditioning. The results of (3) assessing the GOF based upon either record data or rank set sampling data showed that the expected χ² approximation was actually poor in approximating the actual distribution of three different GOF statistics. A second part of my project as the current director of the Statistics Consulting Laboratory is to serve as a statistical resource person for students, faculty and staff within the College of Natural and Agricultural Sciences and across campus. I have consulted with researchers from the School of Education, the Graduate School of Management, and the Departments of Anthropology, Biochemistry, Entomology, Geology and Sociology during the year 2000.
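
The independence result in (2) can be checked with a short calculation. The sketch below assumes nonnegative survival times X_1, ..., X_k with densities f_i, nonnegative coefficients a_i, and an independent exponential variable Z with rate mu; this notation is introduced here for illustration and is not taken from the papers.

```latex
% Assumptions (notation introduced for this sketch): X_1,...,X_k independent,
% X_i >= 0 with density f_i; a_i >= 0; Z ~ Exp(rate \mu), independent of the X_i.
\begin{align*}
P\Big(Z \ge \sum_{i=1}^{k} a_i x_i\Big)
  &= \exp\Big(-\mu \sum_{i=1}^{k} a_i x_i\Big)
   = \prod_{i=1}^{k} e^{-\mu a_i x_i}, \\
f\Big(x_1,\dots,x_k \,\Big|\, \sum_{i=1}^{k} a_i X_i \le Z\Big)
  &\propto \prod_{i=1}^{k} f_i(x_i)\, e^{-\mu a_i x_i}.
\end{align*}
% The conditional density factorizes, so the X_i remain independent.
% Special case f_i(x) = \lambda_i e^{-\lambda_i x}:
\[
f_i(x_i)\, e^{-\mu a_i x_i} \propto e^{-(\lambda_i + \mu a_i) x_i},
\]
% i.e. X_i is again exponential, now with rate \lambda_i + \mu a_i.
```

In the exponential special case the conditioned variables differ from the originals only in scale, so the truncation leaves no trace in the shape of the distribution.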

Impacts
The research dealing with skewed variants of both normal and nonnormal models opens the way for alternative distributions available for modeling data that are unimodal, but skewed. The research on the analysis of intercropping experiments demonstrates that the use of correlation structure in the analysis increases the precision of parameter estimates.

Publications

  • Arnold, B. C. and Beaver, R. J. Multivariate survival models incorporating hidden truncation. 14 msp pages. (To appear in the Proceedings of the Conference on Distributions with Given Marginals, July, 2000, Barcelona, Spain.)
  • Arnold, B. C. and Beaver, R. J. et al. Goodness-of-fit tests based on record data and generalized rank set sampling. 13 msp pages. (To appear in the Proceedings of the Conference GOF2001, honoring Karl Pearson: June, 2000, Paris, France.)


Progress 01/01/99 to 12/31/99

Outputs
Research conducted under this project for 1999 was concerned with work on a class of skewed distributions. The distributions in this class can be used as alternatives to the normal distribution, the distribution upon which many statistical procedures are based. We have generalized the univariate and the multivariate normal distributions to a class of skewed distributions. One specific distribution included in this generalization was the Cauchy distribution. Three papers on skewed distributions have been accepted for publication and will appear in 2000. Further work continues in this vein whereby we look at multivariate distributions for which a linear combination of the variables is less than or equal to the value of an independent exponential random variable. The interesting result here is that if the original variables are mutually independent, so are the new variables after conditioning. A paper dealing with the analysis of intercropping data was accepted and appeared in 1999. A second part of my project is to serve as a statistical resource person for students, faculty and staff within the College of Natural and Agricultural Sciences. I have consulted with researchers from the Departments of Biology, Entomology, and Plant Pathology during 1999.

Impacts
The research dealing with skewed variants of both normal and nonnormal models opens the way for alternative distributions available for modeling data that are unimodal, but skewed. The research on the analysis of intercropping experiments demonstrates that the use of correlation structure in the analysis increases the precision of parameter estimates.

Publications

  • Beaver, R. J. and Melgar, M. 1999. Analysis of Yield-Density Models for Intercropping Experiments. Biometrical Journal 41(8): 995-1011.


Progress 01/01/98 to 12/01/98

Outputs
Research conducted under this project for 1998 was concerned with work on a class of skewed distributions. The distributions in this class can be used as alternatives to the normal distribution, upon which many statistical procedures are based. We have generalized the univariate skewed normal distribution based on hidden truncation, and have also produced a multivariate generalization of the skewed normal in the same way. In addition, we have continued to examine skewed Cauchy distributions and have produced results analogous to those for the normal distribution. Another area of further work consists of designs for intercropping experiments. One paper in this area (currently listed as Technical Report 235) is accepted by the Biometrical Journal pending minor revision. A second aspect of my project is to serve as a statistical resource person for students, faculty and staff within the College of Natural and Agricultural Sciences. I have consulted with persons from the Departments of Entomology, Neurosciences, Plant Pathology, Nematology and Soils & Environmental Sciences. These consultations ranged over the statistical areas of sampling designs, design of experiments, and correspondence analysis, as well as analysis of data gathered in scientific experiments.

Impacts
(N/A)

Publications

  • ARNOLD, B.C. and BEAVER, R. J. 1998. Hidden Truncation Models. Technical Report No. 259, Department of Statistics, University of
  • ARNOLD, B.C. and BEAVER, R. J. 1998. The Multivariate Skew-Cauchy Distribution. Technical Report 250, Department of Statistics, University of California, Riverside.


Progress 01/01/97 to 12/01/97

Outputs
Research conducted under this project for 1997 was concerned with work on a class of skewed distributions. The distributions in this class can be used as alternatives to the normal distribution, upon which many statistical procedures are based. We have generalized the univariate skewed normal distribution based on hidden truncation, and have also produced a multivariate generalization of the skewed normal in the same way. In addition, we have examined skewed Cauchy distributions and produced results analogous to those for the normal distribution. A second aspect of my project is to serve as a statistical resource person for students, faculty and staff within the College of Natural and Agricultural Sciences. I have consulted with persons from the Departments of Entomology, Neurosciences, Plant Pathology and Soils and Environmental Sciences.

Impacts
(N/A)

Publications

  • ARNOLD, B.C. and BEAVER, R.J. 1997. Hidden truncation models. Proceedings of the Third International Triennial Calcutta Symposium on Probability and Statistics.
  • ARNOLD, B.C. and BEAVER, R.J. 1997. Some skewed multivariate models. Technical Report No. 249, Department of Statistics, University of
  • ARNOLD, B.C. and BEAVER, R.J. 1997. The multivariate skew-Cauchy distribution. Technical Report No. 250, Department of Statistics, University of California, Riverside.


Progress 01/01/96 to 12/30/96

Outputs
Research conducted under this project for 1996 was concerned with (1) the study of designs for yield-density intercropping experiments and (2) initial work on a class of skewed distributions obtained from conditional distributions with hidden censoring. Under (1) optimal designs were investigated using the criteria of rotatability and D-optimality within replacement series designs for the yield of one or more crops and densities of planting. Research conducted under (2) concerned the investigation of properties of univariate and multivariate distributions that exhibit skewness as a function of conditioning on one or more variables or a linear combination of hidden variables.

Impacts
(N/A)

Publications

  • BEAVER, R. J. and MELGAR, M. 199(5. Analysis of Density-Yield Models for Intercropping Experiments. Technical Report 235, Department of Statistics, University of California, Riverside, CA.
  • BEAVER, R. J. and MELGAR, M. 1996. Optimal Designs for Yield-Density Intercropping Experiments. Technical Report 237, Department of Statistics, University of California, Riverside, CA.


Progress 01/01/95 to 12/30/95

Outputs
Research conducted under this project for 1995 was concerned with (1) the study of yield-density models for intercropping experiments and (2) semiparametric estimation of a density function. Under (1) yield-density models with correlated error structures were explored in describing the relationship between the yield of one or more crops and densities of planting.

Impacts
(N/A)

Publications


Progress 01/01/94 to 12/30/94

Outputs
Research under this project for 1994 deals with studies of the Lorenz curve associated with the generalized logistic-Burr system of distributions. Specifically, a major portion of this work was concerned with Bayesian estimation of the parameters of the generalized logistic distribution. Dependent priors as well as independent priors were considered, using a step-loss function. With this loss function, the estimates were found to be the mode of the posterior distribution. A Monte Carlo study showed that the estimation procedure worked very well with both independent priors and dependent priors. Further, the procedure worked well even if the prior was mis-specified, with the accuracy increasing as the sample size increased.
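
Why a step-loss function leads to the posterior mode can be seen from a short argument. The sketch assumes a scalar parameter theta and a step loss of half-width epsilon; the report does not give the exact form of its loss function, so this form and the notation are assumptions.

```latex
% Assumed step loss (notation introduced for this sketch):
\[
L_\varepsilon(\theta, a) =
\begin{cases}
0, & |\theta - a| \le \varepsilon, \\
1, & |\theta - a| > \varepsilon.
\end{cases}
\]
% Posterior expected loss of the estimate a, given data x:
\[
E\big[L_\varepsilon(\theta, a) \mid x\big]
  = 1 - \int_{a-\varepsilon}^{a+\varepsilon} \pi(\theta \mid x)\, d\theta .
\]
% Minimizing over a maximizes the posterior mass of [a - eps, a + eps].
% For small eps that mass is approximately 2 eps * pi(a | x), so
\[
\hat{a}_\varepsilon \;\longrightarrow\; \arg\max_{a}\, \pi(a \mid x)
\quad \text{as } \varepsilon \to 0 ,
\]
% the posterior mode (assuming a continuous posterior density with a unique mode).
```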

Impacts
(N/A)

Publications


Progress 01/01/93 to 12/30/93

Outputs
The most important scientific results produced by this research project in the last 5 years are those associated with the research of Mario Melgar (1993), who devised new indices for use in comparing yields for intercropping and monocropping regimes. That research involved the joint analysis of inter- and mono-cropping yields using a nontrivial correlation structure. He addressed D-optimal, orthogonal, rotatable and uniform precision designs for these kinds of analysis. The second most important research problem addressed during this 5-year period dealt with parametric and nonparametric estimation of survival functions (M. Lee, 1993). The form of the survival function incorporated unknown parameters as well as one or more covariates. A novel method of estimating unknown parameters used the Gibbs sampler for the Bayesian estimation of posterior means. These problems are related to objective 3, development of statistical methods for nonstandard problems in agricultural research. Several other research problems were related to objectives 1 and 2: the development of models applicable to paired comparison experiments (K. de Ruiz, 1990) together with designs for paired comparison experiments involving mixtures (Ghani and Beaver, 1993). Proportions rather than absolute amounts of various components of the mixture are used in the analysis.

Impacts
(N/A)

Publications


Progress 01/01/91 to 12/30/91

Outputs
Research supported by this project during the current reporting period is listed as follows. 1. The work with B. C. Arnold and R. A. Groeneveld regarding skewed normal distributions has been rewritten and submitted to the Journal of Mathematical Psychology. 2. A manuscript is in preparation regarding the work on generalized paired comparison models with Kaye DeRuiz. 3. A revised manuscript concerning the use of ridge regression in the analysis of paired comparisons with mixtures by I. Ghani and R. J. Beaver is available and will be submitted to Technometrics. 4. Implementation of Bayesian estimation procedures for survival distributions for Dirichlet or gamma process priors is the focus of the work with B. C. Arnold and M. Lee. The implementation will use the Gibbs sampler and importance sampling.

Impacts
(N/A)

Publications


Progress 01/01/90 to 12/30/90

Outputs
Research supported by this project during the current reporting period is listed as follows: 1. Work with B. C. Arnold and R. A. Groeneveld concerning the properties of both moment and maximum likelihood estimators of the parameters in the class of "skewed normal" distributions is completed. This work is reported in a manuscript submitted to the JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION. 2. Work with Kaye de Ruiz concerning models for a continuous response variable defined on the interval (0,1) for paired comparison using a generalization of the Bradley-Terry model resulted in her Ph.D. dissertation, submitted December, 1990. 3. Results of the work with I. A. Ghani concerning ridge regression and ridge analysis in paired comparison experiments with mixtures are being re-examined to clarify which of several model formulations and analyses should be used under various circumstances. 4. A new area of research is concerned with Bayesian analysis of survival functions using Dirichlet or gamma processes. This work, with M. Lee and B.C. Arnold, is in its formative stages.

Impacts
(N/A)

Publications


Progress 01/01/89 to 12/30/89

Outputs
Research supported by this project during the current reporting year involved the following topics: Paired comparison experiments with mixtures have been analyzed by transforming p mixture components x(1),x(2),...,x(p) (sigma(i)x(i) = 1) to (p-1) linearly independent components w(1),w(2),...,w(p-1). This research reports the results of an analysis involving the original mixture components through the use of ridge regression and ridge analysis. This paper deals with the analysis of water from 151 irrigation and stock wells in the southern coast range bounded by Alameda and San Joaquin counties in the north and Ventura and Santa Barbara counties in the south. Selenium concentrations were found to be associated with nearby Pliocene and Miocene marine rocks, indicating that either or both Pliocene and Miocene rocks are selenium sources, or that Pliocene rocks are positional markers locating recently uplifted, selenium-bearing Miocene rocks. Work has begun on a project (with Kaye De Ruiz) to investigate how a respondent processes information in arriving at preference ratings in the binary comparison of a set of items or treatments. This research is being conducted within the framework of the Thurstone-Mosteller and Bradley-Terry paired comparison models. The work (with Barry Arnold) involving estimation of the parameters in a family of "skewed normal" distributions is not yet complete. It is expected that results in this area will be available during the 1/90 - 12/90 reporting period.

Impacts
(N/A)

Publications


Progress 01/01/87 to 12/30/87

Outputs
Research supported by this project during the current reporting period involved the following topics: Paired comparison experiments with mixtures (with I. Ghani): Mixture experiments refer to situations in which q components are blended in proportions x(1), x(2),...,x(q) with 0 less than or equal to x(i) and x(i) less than or equal to 1, sigma(i)x(i) = 1. This latter constraint produces a loglinear regression model of less than full rank. Our work has centered on the use of ridge regression as an alternative method of analysis in this context. Skewed normal distributions (with B. Arnold and D. Anderson): A class of "skewed normal" distributions results when investigating the conditional distribution of one variable, given the second lies within a given interval, say (alpha, beta) and their joint distribution is bivariate normal with p = 0. We have developed parameter estimators based on the method of moments and the method of maximum likelihood. A computer simulation to compare the properties of these estimators is under way. Analysis of selenium in well waters (Coastal Range of California) (with J. Tracey and J. Oster): This work involved the analysis of major and minor elements in 107 irrigation wells and 44 livestock wells in the California Coastal Range. Selenium was shown to be significantly associated with nearby surface Pliocene, Miocene and Eocene marine rocks as measured by Pearson's chi-square analysis.

Impacts
(N/A)

Publications


Progress 01/01/86 to 12/30/86

Outputs
Paired comparison experiments with mixtures (with R. Charnet): Mixture experiments refer to situations in which q components are blended in proportions x(1), x(2), ..., x(q) with 0 less than or equal to x(i) less than or equal to 1, sigma(i)x(i) = 1. We have produced models for paired comparison experiments in which the various treatments or items under comparison correspond to various mixtures of the same q components. Paired comparison designs which are optimal in the sense of being rotatable and/or minimum bias designs are given for paired comparisons with mixtures. Estimating the number of classes in a population (with B. Arnold): This is a continuation of the work reported earlier, with a broader application of the theory that has been developed. Determining the number of instar stages in navel orangeworm (with J. Sanderson): The E-M algorithm was used to determine whether a field-collected set of observations on the head-capsule width of navel orangeworm was best fitted by a mixture of 5 versus 6 normal subpopulations. These results were then used to determine the thermal summation during individual stadia for the navel orangeworm. Effects of mineral nutrition on components of reproduction in Clarkia unguiculata (with F. Vasek and V.

Impacts
(N/A)

Publications


Progress 01/01/85 to 12/30/85

Outputs
Research supported by this project during the current reporting period has centered on the following areas: Paired comparison models and designs for mixture experiments (with R. Charnet): Mixture experiments refer to situations in which q components are blended in proportions x(1),x(2),..., x(q) with 0 less than or equal to x(i) less than or equal to 1, sigma(i)x(i) = 1. We have produced models for paired comparison experiments in which the treatments under test correspond to mixtures of q components. We are presently investigating designs for paired comparison experiments which give rise to optimal designs (i.e., D-, E-, or A-optimality). Estimating the number of dies that are used in minting coins (with B. Arnold): This work is concerned with point and interval estimation of the number of dies used in minting coins based on a sample of coins which are classified according to the die used. Unlike earlier work on this problem, no restrictive assumptions are used in the development; large sample exact and approximate interval estimates appear to be better than those given by earlier authors. Determining the number of instars present in navel orangeworm on two different substrates (with J. Sanderson): The EM algorithm is an iterative technique for fitting two or more normal populations to a set of data arising as a mixture of normals. This technique was used to determine the number of instars of the navel orangeworm, as well as cut-off points between populations, based on head-capsule width measurements.
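
The EM iteration for a mixture of normals can be sketched as follows. This is a minimal illustration restricted to two components, whereas the instar analysis involved more; the function names, starting values, seed and simulated data are all assumptions made for the example and do not reproduce the head-capsule width analysis.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_normals(data, iterations=200):
    """Fit a two-component normal mixture by EM.

    Returns (weight of component 1, (mean1, sd1), (mean2, sd2)).
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in data) / n)

    # Crude starting values: one component at each end of the data.
    weight = 0.5
    mean1, mean2 = min(data), max(data)
    sd1 = sd2 = overall_sd

    for _ in range(iterations):
        # E-step: probability that each observation belongs to component 1.
        resp = []
        for x in data:
            p1 = weight * normal_pdf(x, mean1, sd1)
            p2 = (1.0 - weight) * normal_pdf(x, mean2, sd2)
            resp.append(p1 / (p1 + p2))

        # M-step: weighted updates of the mixing weight, means and sds.
        n1 = sum(resp)
        n2 = n - n1
        weight = n1 / n
        mean1 = sum(r * x for r, x in zip(resp, data)) / n1
        mean2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        var1 = sum(r * (x - mean1) ** 2 for r, x in zip(resp, data)) / n1
        var2 = sum((1.0 - r) * (x - mean2) ** 2 for r, x in zip(resp, data)) / n2
        # Floor the sds so a component cannot collapse onto a single point.
        sd1 = max(math.sqrt(var1), 1e-6)
        sd2 = max(math.sqrt(var2), 1e-6)

    return weight, (mean1, sd1), (mean2, sd2)


# Simulated data from two well-separated normal populations (hypothetical).
rng = random.Random(1985)  # arbitrary seed, for reproducibility only
data = [rng.gauss(0.0, 1.0) for _ in range(300)]
data += [rng.gauss(6.0, 1.0) for _ in range(300)]
weight, (mean1, sd1), (mean2, sd2) = em_two_normals(data)
print(round(weight, 2), round(mean1, 2), round(mean2, 2))
```

Once the components are fitted, a cut-off point between two populations can be taken where the two weighted densities are equal; comparing fits with different numbers of components requires an additional model-comparison criterion, which this sketch does not cover.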

                  Impacts
                  (N/A)

                  Publications


                    Progress 01/01/84 to 12/30/84

                    Outputs
                    Research supported by this project has expanded to include not only paired comparison modelling and design, but also some applications in the area of statistical genetics. Work with B. Sirotnik on the inclusion or exclusion of pairs of identical items in experiments with order effects showed that including pairs of identical items increased precision in detecting and eliminating order effects but decreased precision in differentiating among the item worth parameters. The application of paired comparisons within the framework of mixture experiments is under investigation, and it is hoped that some optimal designs can be developed in this context. Modelling of multiple matings in Drosophila pseudoobscura (with M. Morrison) using binomial mixtures has shown that the usual estimators of allele frequencies under random mating are biased when multiple matings occur. Numerical methods indicate that estimators of allele frequencies under the mixture model are unique, a result that we have not been able to establish mathematically. The use of weighted least squares in generation means analysis, although widespread, is a point of controversy. We (Beaver and Mosjidis) have presented a statistically sound approach to this analysis for cases where the technique is appropriate.

                    Impacts
                    (N/A)

                    Publications


                      Progress 01/01/83 to 12/30/83

                      Outputs
                      During this reporting period, work on this project focused on the design of paired comparison experiments and on the more general problem of recovery of interblock information in nested multidimensional block designs. In the work with B. W. Sirotnik concerning the inclusion or exclusion of pairs of identical items in paired comparison experiments when order effects are present, the inclusion of identical pairs increases precision in detecting order effects but decreases precision in differentiating among item worth parameters, while just the opposite is true if the pairs of identical items are not included. Groundwork has been laid for investigating paired comparison experiments within the context of mixture experiments, that is, experiments in which the items to be compared are actually mixtures of several components. This research synthesizes the approaches and results of paired comparison and mixture experiments. We have shown that the nested multidimensional block designs introduced by J.N. Srivastava in 1981 are superior to the classical incomplete block designs in reducing the variance of estimable linear contrasts among the treatment means. Work with K.-S. Lii on finding a consistent estimator of the rate parameter in an exponential decay model when the observations are not identically and independently distributed is nearing completion.

                      Impacts
                      (N/A)

                      Publications


                        Progress 01/01/82 to 12/30/82

                        Outputs
                        During this reporting period, work on this project centered on multivariate paired comparison models, the use of minimum discrimination information (MDI) in the analysis of paired comparison data, and the application of paired comparison techniques in establishing threshold levels in taste-testing experiments and in problems involving mixture experiments. One paper (with J. Davoodzadeh) dealing with multivariate paired comparison models having additive association structure has been accepted for publication in the Journal of Mathematical Psychology. A second paper dealing with multivariate paired comparison models having loglinear association structure has been accepted for publication in the Journal of Statistical Planning and Inference. A third paper (with D.V. Gokhale and B.W. Sirotnik) dealing with model-robust MDI analysis of paired comparison experiments has been accepted for publication in Communications in Statistics A: Theory and Methods. A fourth paper (with B.W. Sirotnik), dealing with the efficiency of designs for isolating within-pair order effects, has been submitted to the British Journal of Mathematical and Statistical Psychology. The research involving paired comparison techniques as applied to threshold levels in taste-testing experiments and to mixture experiments was pursued at the University of Florida during the fall quarter of my sabbatical leave and is still in preliminary stages.

                        Impacts
                        (N/A)

                        Publications


                          Progress 01/01/81 to 12/30/81

                          Outputs
                          The focus of our research in multivariate paired comparison experiments with ties has been in the area of model development and model-based analyses. For models employing log linear association structure, iterative scaling (IS) techniques, and in particular minimum discrimination information (MDI) techniques, are especially appropriate. We have successfully used the MDI approach in analyzing models with log linear association structure. For models with additive association structure, the usual analyses involving maximum likelihood and likelihood ratio tests are applicable, but present some difficult analytical problems because of the implicit nature of the likelihood equations. We have programmed some approximate techniques for obtaining parameter estimates in this case. These approximate solutions work fairly well in practice and provide a reasonable fit to the observed data. Using the MDI approach, we have analyzed univariate paired comparison experiments using a model-free approach. A manuscript reporting this research will be available shortly.
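As a rough illustration of what iterative scaling does, the sketch below applies classical iterative proportional fitting to a two-way table, alternately rescaling rows and columns until the fitted table matches target margins. It is an assumed, simplified stand-in: the function name and the counts are hypothetical, and the MDI procedures used in the project may differ in detail.

```python
def iterative_proportional_fitting(seed, row_targets, col_targets, n_iter=100, tol=1e-12):
    """Rescale a positive two-way table so its margins match the targets.

    seed        : list of rows giving the starting table (its odds ratios
                  are preserved by the scaling steps)
    row_targets : desired row totals
    col_targets : desired column totals (same grand total as row_targets)
    """
    table = [list(row) for row in seed]
    for _ in range(n_iter):
        # Row step: scale each row to its target total.
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [v * target / s for v in table[i]]
        # Column step: scale each column to its target total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Stop once the rows still match after the column step.
        gap = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if gap < tol:
            break
    return table


# Hypothetical example: a uniform seed gives the independence fit.
fitted = iterative_proportional_fitting([[1.0, 1.0], [1.0, 1.0]], [30.0, 70.0], [40.0, 60.0])
for row in fitted:
    print(row)
```

Starting from a uniform seed yields the fit of the independence model; starting from a seed that carries association keeps that association while adjusting the margins, which is the property that makes the method a natural match for log linear structure.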

                          Impacts
                          (N/A)

                          Publications


                            Progress 01/01/80 to 12/30/80

                            Outputs
                            Our research in the area of paired comparison experiments has focused on the development of models for multivariate paired comparison experiments with ties, and the efficiency of paired comparison designs in detecting order effects when they are present. We have produced four different models for multivariate paired comparisons with ties. Three of these models have additive association structure while the fourth has log-linear association structure. The analysis of multivariate paired comparison data with ties has shown that testing and estimation results are comparable for the model with additive structure. Further computer work is in progress to evaluate results using the log-linear association structure. Our research into design efficiency has shown that one achieves asymptotic relative efficiency greater than one in detecting order effects when the pairs consisting of identical items are included in the design. Work is continuing on model-free analysis of paired comparison experiments with or without ties and/or order effects. Computer programs have been developed for analyzing multivariate paired comparison data with ties, and for analyzing extended paired comparison designs using several methods of analysis which include maximum likelihood, conditional maximum likelihood, weighted least squares for categorical data, and iterative scaling. Work is continuing on the analysis of multivariate paired comparison data using iterative scaling.
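To make the maximum likelihood option concrete, here is a minimal sketch for the simplest case: a univariate paired comparison experiment with no ties and no order effects. It assumes a Bradley-Terry form, in which item i is preferred to item j with probability worth[i] / (worth[i] + worth[j]). The report does not name this model, so the model choice, the function name, and the counts are all assumptions of this example.

```python
def fit_bradley_terry(wins, n_iter=500, tol=1e-10):
    """Maximum likelihood worth parameters for a Bradley-Terry model.

    wins[i][j] is the number of times item i was preferred to item j.
    Uses the classical fixed-point iteration and normalises the worths
    to sum to one. Assumes every item wins and loses at least once.
    """
    t = len(wins)
    worth = [1.0 / t] * t
    for _ in range(n_iter):
        new = []
        for i in range(t):
            total_wins = sum(wins[i][j] for j in range(t) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (worth[i] + worth[j])
                for j in range(t)
                if j != i
            )
            new.append(total_wins / denom)
        scale = sum(new)
        new = [w / scale for w in new]
        change = max(abs(a - b) for a, b in zip(new, worth))
        worth = new
        if change < tol:
            break
    return worth


# Hypothetical preference counts among three items.
counts = [
    [0, 14, 17],
    [6, 0, 12],
    [3, 8, 0],
]
print(fit_bradley_terry(counts))
```

The multivariate models with ties and order effects discussed in the report add further parameters to this basic structure; the sketch only shows the core estimation step for the item worths.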

                            Impacts
                            (N/A)

                            Publications


                              Progress 01/01/79 to 12/30/79

                              Outputs
                              The thrust of our research into paired comparison ordered preference structures has been in the following three areas: first, the development of multivariate paired comparison tie models having additive association structure; second, the development of multivariate paired comparison tie models having loglinear association structure; and third, the study of designs and models for paired comparison experiments with order effects. A fairly extensive FORTRAN computer program was developed for use in the analysis of multivariate paired comparison experiments with ties; further programs are being developed to accommodate the analysis of data based on models and procedures we have developed. We are presently interested in investigating a model-free approach to the analysis of paired comparison data with or without the presence of ties and/or order effects. This work resulted in one technical report and, in addition, is the basis for two doctoral dissertations to be completed in early 1980.

                              Impacts
                              (N/A)

                              Publications