The study uses jackknife and generalized-variance techniques to estimate standard errors. While I am not familiar with these techniques, I understand there are subtleties but that they are generally acceptable. However, what I want to understand is whether the resulting confidence intervals would be symmetric or not. It seems to me that with such low prevalence (on the order of 1% of the sample), the confidence intervals would be asymmetric, with more upside range than downside. Is there any basis for this?
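For intuition, here is a small sketch contrasting the symmetric Wald interval with the Wilson score interval, which is asymmetric around the point estimate at low prevalence. The counts (20 events out of 2,000) are made up purely for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical counts: prevalence ~1% in a sample of n = 2000
n, x = 2000, 20
p_hat = x / n
z = stats.norm.ppf(0.975)

# Wald interval: symmetric around p_hat by construction
se = np.sqrt(p_hat * (1 - p_hat) / n)
wald = (p_hat - z * se, p_hat + z * se)

# Wilson score interval: its centre is pulled above p_hat, so it has
# more upside range than downside at low prevalence
centre = (p_hat + z**2 / (2 * n)) / (1 + z**2 / n)
half = (z / (1 + z**2 / n)) * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
wilson = (centre - half, centre + half)

print(f"Wald:   ({wald[0]:.5f}, {wald[1]:.5f})")
print(f"Wilson: ({wilson[0]:.5f}, {wilson[1]:.5f})")
```

Whether the study's jackknife intervals are built symmetrically or on a transformed scale would determine which behaviour they show.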

Thanks.


I am wondering how to calculate a confidence interval for a pooled variance/SD. I know that if a variable follows a normal distribution, then its variance/SD follows a chi-square distribution. Let me explain the experiment I am dealing with.

There are five different runs, each having 50 observations, and I want to construct a one-sided confidence bound for the SD as an acceptance criterion. I would highly appreciate any detailed help or a reference for this question.
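For concreteness, a minimal sketch of the chi-square approach under the usual assumptions (independent, normally distributed runs sharing a common variance); the data here are simulated for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical data: five runs of 50 observations each
rng = np.random.default_rng(1)
runs = [rng.normal(loc=100, scale=2.0, size=50) for _ in range(5)]

# Pooled variance: weight each run's sample variance by its df
dfs = np.array([len(r) - 1 for r in runs])           # 49 each, 245 total
variances = np.array([np.var(r, ddof=1) for r in runs])
df_pool = dfs.sum()
s2_pool = (dfs * variances).sum() / df_pool

# df * s^2 / sigma^2 ~ chi-square(df), so a one-sided upper 95%
# confidence bound for sigma is sqrt(df * s^2 / chi2.ppf(alpha, df))
alpha = 0.05
sd_upper = np.sqrt(df_pool * s2_pool / stats.chi2.ppf(alpha, df_pool))
print(f"pooled SD = {np.sqrt(s2_pool):.3f}, upper 95% bound = {sd_upper:.3f}")
```

The bound is valid only if the runs really do share one variance; a preliminary homogeneity check (e.g. Levene's test) would be prudent.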

Thanks in advance!

DJ

A related question: the data is this skewed because it is really a combination of two things: a frequency distribution (mostly zeros) and a severity distribution that is itself skewed. It has been analyzed (not by me) using the CLT to produce a confidence interval for the mean. My preference is to model the frequency and severity separately. Can you offer any guidance on the relative merits of these two approaches to the data?
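For illustration, a small simulation contrasting the CLT interval with a bootstrap percentile interval on hypothetical zero-inflated, skewed data (the event rate and lognormal severity below are invented, not from the study):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical compound data: ~5% of observations have an event drawn
# from a skewed (lognormal) severity distribution; the rest are zero.
n = 500
has_event = rng.random(n) < 0.05
data = np.where(has_event, rng.lognormal(1.0, 1.0, n), 0.0)

# (1) CLT interval for the mean: symmetric by construction
m, se = data.mean(), data.std(ddof=1) / np.sqrt(n)
clt = (m - 1.96 * se, m + 1.96 * se)

# (2) bootstrap percentile interval: respects the skew
boots = np.array([rng.choice(data, n).mean() for _ in range(2000)])
boot = tuple(np.percentile(boots, [2.5, 97.5]))
print(f"CLT: {clt}, bootstrap: {boot}")
```

With heavy skew and rare events, the bootstrap (or a separate frequency/severity model) typically gives an asymmetric interval, while the CLT interval can even extend below zero.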

Thanks.


I have been searching for the calculation steps for the slope, the intercept, and their 95% CIs in Passing-Bablok regression for a method comparison study.

Can you suggest an easily available link or ready reference for this?
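For orientation, here is an unvalidated sketch of the published procedure (Passing & Bablok, 1983): the slope is a shifted median of all pairwise slopes, and its CI comes from rank positions. Indexing conventions and tie handling should be checked against a validated implementation before any real method comparison:

```python
import numpy as np
from scipy import stats

def passing_bablok(x, y, alpha=0.05):
    """Sketch of Passing-Bablok regression: slope, intercept, and CIs."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    # all pairwise slopes, excluding undefined (dx == 0) and exactly -1
    S = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            if dx != 0:
                s = (y[j] - y[i]) / dx
                if s != -1:
                    S.append(s)
    S = np.sort(S)
    N = len(S)
    K = int(np.sum(S < -1))            # offset that makes the estimator unbiased
    # shifted median (1-indexed formulas converted to 0-indexed)
    if N % 2:
        b = S[(N + 1) // 2 + K - 1]
    else:
        b = 0.5 * (S[N // 2 + K - 1] + S[N // 2 + K])
    a = np.median(y - b * x)
    # rank-based confidence interval for the slope
    z = stats.norm.ppf(1 - alpha / 2)
    C = z * np.sqrt(n * (n - 1) * (2 * n + 5) / 18.0)
    M1 = int(round((N - C) / 2.0))
    M2 = N - M1 + 1
    b_lo, b_hi = S[M1 + K - 1], S[min(M2 + K, N) - 1]
    # intercept CI follows from the slope bounds
    a_lo, a_hi = np.median(y - b_hi * x), np.median(y - b_lo * x)
    return b, (b_lo, b_hi), a, (a_lo, a_hi)
```

On perfectly linear data, e.g. `passing_bablok(range(1, 9), [2*v + 1 for v in range(1, 9)])`, the slope and intercept come out as 2 and 1 with degenerate intervals.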

Thanks in advance

DJ

Hi. My name's Grant Bostrom, and I graduated from UCLA in 2003. I was hoping you could help me with a probability question. I'm working with a non-profit on some marketing materials, and they've always done a 50/50 raffle at their gala event. One suggestion was to do a contest such as guessing how many jelly beans are in a jar. Then the question came up: which game gives people better odds of winning?

If 30,000 raffle tickets are sold and you bought one, your odds are 1 in 30,000.

But what if there are 30,000 people who guess and there are 30,000 jelly beans in the jar? Is there a way to compare those two probabilities? What are the odds that you could guess the right number if a jar had 30,000 jelly beans in it?

I know this is a very random question... :)

From the marketing side I was trying to prove our tagline, "It's better odds than a raffle?"
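One way to make the comparison concrete: the raffle odds are exact, while the jelly-bean odds depend entirely on how guesses scatter around the truth. A sketch, assuming (purely hypothetically) that guesses are roughly normal around the true count with a standard deviation of 3,000 beans:

```python
from scipy import stats

# Raffle: one ticket out of 30,000 sold
p_raffle = 1 / 30_000

# Jelly-bean contest: ASSUME guesses scatter normally around the true
# count of 30,000 with an SD of 3,000 beans (made-up spread).
true_count, sigma = 30_000, 3_000
dist = stats.norm(true_count, sigma)

# Probability that one guess lands exactly on the true integer count:
# the mass the guess distribution puts on that single integer.
p_guess = dist.cdf(true_count + 0.5) - dist.cdf(true_count - 0.5)

print(f"raffle: {p_raffle:.2e}, exact jelly-bean guess: {p_guess:.2e}")
```

Under this particular assumption a well-centred guess does beat the raffle (roughly 1.3e-4 vs 3.3e-5), but a wider guess spread, or a guess far from the centre, reverses the comparison, so the tagline can only be defended under a stated model of how people guess.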

In other words, if there are many data points, is area under the curve the best method to quantify the sensitivity of my measurement?

Thank you,

Danny Gossett


Quote from Princeton DSS:

In regression with multiple independent variables, the coefficient tells you how much the dependent variable is expected to increase when that independent variable increases by one, holding all the other independent variables constant.

When one says, "holding all the other independent variables constant," does that mean holding them constant at their means?
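For a strictly linear model (no interactions or polynomial terms), the fixed values don't matter: the coefficient is the slope at any constant values of the other predictors, not just their means. A quick numerical check with simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 3.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.1, size=n)

# Ordinary least squares via the normal equations
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# The predicted effect of a one-unit change in x1 is beta[1] no matter
# what fixed value x2 is held at -- the means play no special role:
for x2_fixed in (-2.0, 0.0, 2.0):
    lo = beta[0] + beta[1] * 0.0 + beta[2] * x2_fixed
    hi = beta[0] + beta[1] * 1.0 + beta[2] * x2_fixed
    assert abs((hi - lo) - beta[1]) < 1e-10

print(f"estimated slope on x1: {beta[1]:.3f}")
```

Holding at the means only becomes relevant in nonlinear models (e.g. logit/probit), where marginal effects do depend on where the other variables sit.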

Can anyone provide the sample-size formula for a one-sided McNemar test of two correlated proportions? Here is the story:

There are two situations, BEFORE and AFTER, giving proportions p1 and p2 respectively, measured on the same subjects.

We need to test H0: p2 - p1 = 0.05 vs. H1: p2 - p1 > 0.05 at the 5% significance level with 80% power.

Please let me know how to calculate the sample size in this case.
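As a starting point, a hedged sketch of the common approximation (often attributed to Connor, 1987) for the zero-difference null. Two caveats: the shifted null H0: p2 - p1 = 0.05 requires modifying this formula, and the two discordant cell probabilities below have to be guessed in advance from pilot data:

```python
import math
from scipy import stats

def mcnemar_n(p01, p10, alpha=0.05, power=0.80):
    """Approximate sample size for a one-sided McNemar test of
    H0: p01 = p10 vs H1: p01 > p10 (zero-difference null only).
    p01, p10 are the anticipated discordant cell probabilities;
    their difference equals p2 - p1."""
    d = p01 - p10            # anticipated difference p2 - p1
    pd = p01 + p10           # total discordant probability
    za = stats.norm.ppf(1 - alpha)   # one-sided
    zb = stats.norm.ppf(power)
    n = (za * math.sqrt(pd) + zb * math.sqrt(pd - d * d)) ** 2 / d ** 2
    return math.ceil(n)

# Example with invented discordant probabilities 0.20 and 0.10
print(mcnemar_n(0.20, 0.10))
```

For the non-zero null, one common route is to recentre the difference at 0.05 in the test statistic and adjust the formula accordingly, or simply to use simulation to size the study.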

Thanks in advance.

DJ

During my graduate stats class, we learned about the different models for calculating sums of squares (SSTYPE1, SSTYPE2, and SSTYPE3). Oddly, the actual models are the inverse, where SSTYPE1 = Model 3, SSTYPE2 = Model 2, and SSTYPE3 = Model 1. The professor was unaware of why the programmers did this.

Does anyone know why SPSS uses this reverse coding? Does it have a logical or special meaning?

Cheers,

The problem I am facing, however, is that 95% of my observations are between 0 and 1, and theoretically they should all be between 0 and 1. The other 5% of the observations are above 1 (the maximum is 1.1). The observations are approximately normally distributed.

Since the assumption is that these observations should be between 0 and 1, should I do a probit regression with some type of correction (and what would that correction be)? Or should I resort to a simple linear regression? Or should I just throw out that 5% of the data?

She is using three Likert-type scales, and each participant responds to all of them: two are 5-point scales and the last is a 3-point scale.

In psychology research in general, which analysis would one prefer: treating this data as ordinal or as interval, and hence a non-parametric or a parametric test?

Thanks in advance,

Diana

I have frequencies for 16 subjects with the repeated-measures factors rep (same/different) and kat (a, b, c). I would like to know whether rep depends on kat. Is a log-linear model with Subjects (1 16), feature (1 3), rep (1 2) right, or do I need to run a GEE?


Secondly, I don't want to overfit my model by throwing in too many unnecessary independent variables. But how can you remove potential variables (even after a univariate test has found no correlation with the dependent response variable) without knowing about potential interaction effects among the variables being tested? One variable might be dropped after a basic univariate test as having no correlation with my response variable, but that variable together with another one might have been important!

Thank you so much for your help...

-MP

I've attempted to read some documentation on Rasch tests, but the jargon is overwhelming and I cannot afford the time to become a psychometrician myself (nor do I care to, from what I have seen). I am looking for an intuitive explanation of how confidence intervals for individuals' test scores are derived and what such confidence intervals mean. Given that repeated tests are not administered to any sample of students (which would be a straightforward way of obtaining a confidence interval for the variation in scores from one test to another), how is a confidence interval for an individual obtained from a single test administration?

The interpretation that they offer for the confidence interval is that "If [student name] were to take a similar test multiple times, the range of these scores would fall between xxx and yyy 80% of the time."

Thanks for any assistance you can provide.


We believe using imputation is incorrect, because we have a small sample size, and choosing a distribution seems impractical with so few observations compared to our baseline NHANES sample, which has 600 or so observations. Do you have any suggestions for the analysis?

The data:

All variables are dichotomous including the criterion.

The criterion mean is estimated to be around 0.0157.

Expectations:

We are looking to detect a change in that mean as small as 5%, with a sample size of 1,200,000, and to be able to calculate each factor's contribution to the change, identify the optimum configuration of factors and levels, and estimate the results at specified confidence levels.

This is my question:

What would be the best type of analysis for the experiment results?

a) ANOVA

b) General Multivariate Linear Regression

c) Logistic Regression

Should I be considering any other type of analysis?

Thanks

32 participants (16 in each of two groups) answered a question. After answering, participants indicated whether their answer was based on a guess, a feeling, or a memory, so a categorical variable with 3 levels. Each participant provided 22 of these categorical ratings. I initially thought I could analyse the data by converting it into relative proportions and using a mixed analysis of variance with one between-subjects and one within-subjects factor. However, of course, the proportions are not independent of one another, and they average out to 33.33% across the within-subjects levels. A chi-square test doesn't seem like a good idea either, as the 22 ratings from each subject are more related to one another than to any of the other observations. What analysis can I use, or is it only possible to explore this data?

In one case, the rates are cost/effectiveness ratios such as REVENUE over EXPENSE, and these are typically much greater than 1, perhaps even in the thousands.

To get to my point, the most general case I have is FOLLOWERS over FOLLOWING which, as a rate, follows more of a power law: lots of values between 0 and 1, few greater than that.

Is there a general method (probably non-parametric) I can use to compare two rates? In the cases where the rates are computed from samples, the samples are typically pretty small (10-20).