How to create an effective Index with this query?

Survey research: How can I create a meaningful index from a diverse set of items?

  • SurveyResearchFilter: Creating an index when response items are on different (and somewhat nonsensical) scales. I am working with a large survey data set. Several questions are of interest to me, and they hang together quite well in a principal-components factor analysis; I'd therefore love to build an index from the responses to these questions and use it as a single dependent variable. Unfortunately, the questions are on different scales, and some of the scales don't make sense. Two of the questions are dichotomous (agree/disagree). Two are on a semi-Likert-style scale (1: Strongly agree to 4: Strongly disagree; I know, it is weird not to have a neutral midpoint). One question has the scale 1: Agree, 2: Disagree, 3: Depends. Obviously, I can't just add all of the items together, because the Likert-style questions would be weighted more heavily than the dichotomous or trichotomous ones. Further, I can't really tell what I should do with the trichotomous scale to make it sensible. Interestingly, a good deal of published research uses these exact questions and, puzzlingly, this methodological problem does not seem to have occurred to previous authors. How can I create an index from these items while keeping their impact proportional and, further, how can I make that trichotomous scale make sense?

  • Answer:

    Recode the odd trichotomous item from (agree, disagree, depends) to the ordered (agree, depends, disagree), run principal components (which you already have working), and use the first component's score as the index.

proj at Ask.Metafilter.Com
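The recoding and first-component approach suggested in the answer could look roughly like the sketch below. It is only an illustration under stated assumptions: the numeric codes (1 = Agree, 2 = Disagree, 3 = Depends), the helper names `recode_trichotomous` and `first_component_scores`, and the six-respondent example data are all hypothetical, and every item is assumed to be coded in the same direction (low = agree).

```python
import numpy as np

# Assumed original codes: 1 = Agree, 2 = Disagree, 3 = Depends.
# Recode so that "Depends" sits in the middle: 1 = Agree, 2 = Depends, 3 = Disagree.
RECODE = {1: 1, 2: 3, 3: 2}


def recode_trichotomous(values):
    """Map the trichotomous item onto an ordered agree/depends/disagree scale."""
    return np.array([RECODE[int(v)] for v in values], dtype=float)


def first_component_scores(items):
    """Return (scores, weights) for the first principal component.

    items: (n_respondents, n_items) array. Items are standardized first,
    so the PCA is effectively run on the correlation matrix.
    """
    X = np.asarray(items, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    w = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue
    if w.sum() < 0:                          # the sign is arbitrary; fix it for readability
        w = -w
    return Z @ w, w


# Hypothetical data: one dichotomous, one 4-point, one trichotomous item.
dichotomous = [1, 1, 2, 2, 1, 2]
likert4 = [1, 2, 3, 4, 2, 4]
trichotomous = recode_trichotomous([1, 3, 2, 2, 1, 2])
scores, weights = first_component_scores(
    np.column_stack([dichotomous, likert4, trichotomous])
)
print(weights)
print(scores)
```

Whether "Depends" really belongs between agree and disagree is a substantive judgment about the question wording, not something the code can settle.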


Other answers

Weighted averages?

k8t

Oh, PCA is just a weighted average (it's a linear transformation) with statistical respectability. The Wikipedia article on PCA is fine for this topic, and it's correct that factor analysis could do a similar job for you.

a robot made out of meat

Yeah, I'm not sure why the first principal component wouldn't work for you. Many PCA routines work from the correlation matrix by default, which normalizes each variable's variance to 1, so the scale an item happens to be measured on does not determine its weight; if your routine does not do this, standardize the items first. PCA is just a variance-maximizing weighted average, so it's not fundamentally different from any other index you would build in a more ad hoc manner, and the two will have similar measurement properties. Factor scores, 99 times out of 100, will not be substantially different, but bear in mind that those are estimates (with error) of a score on an underlying latent variable, not a direct transformation of the data.

shadow vector
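To illustrate the point about normalizing variances, here is a minimal sketch of the ad hoc alternative: standardize each item, then take a plain average. The function names `standardize` and `equal_weight_index` and the small data array are hypothetical and only for illustration.

```python
import numpy as np


def standardize(items):
    """Rescale each column to mean 0 and sample standard deviation 1."""
    X = np.asarray(items, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def equal_weight_index(items):
    """Average the standardized items so each contributes on the same scale."""
    return standardize(items).mean(axis=1)


# Hypothetical responses: column 0 is a 1-2 item, column 1 is a 1-4 item.
responses = np.array([[1, 1], [2, 4], [1, 2], [2, 3], [1, 1], [2, 4]])
print(equal_weight_index(responses))
```

After standardization the 1-4 item no longer dominates the 1-2 item simply because its range is wider, which is the same effect a correlation-matrix PCA relies on.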

Also let me add that it's unlikely that you need to be too worried about finding the absolute best index. If you're confident that all the individual items measure some common underlying variable t, then any reasonable index will be a more reliable measure of t than any one item would be, through the magic of the sampling distribution.

shadow vector
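The reliability claim can be checked with a small simulation. This is a sketch under strong assumptions that are not in the thread: every item equals the latent variable t plus independent normal noise of the same size, and the function name `simulate_reliability` and its defaults are made up for illustration.

```python
import numpy as np


def simulate_reliability(n_respondents=5000, n_items=5, noise_sd=1.0, seed=0):
    """Compare how well one item and an averaged index track the latent t."""
    rng = np.random.default_rng(seed)
    t = rng.normal(size=n_respondents)  # latent variable
    noise = rng.normal(scale=noise_sd, size=(n_respondents, n_items))
    items = t[:, None] + noise          # each item = t + its own noise
    index = items.mean(axis=1)          # simple unweighted index
    r_single = np.corrcoef(items[:, 0], t)[0, 1]
    r_index = np.corrcoef(index, t)[0, 1]
    return r_single, r_index


r_single, r_index = simulate_reliability()
print(r_single, r_index)
```

Under these assumptions the theoretical correlations are about 0.71 for a single item and about 0.91 for the five-item average, so simulated values should land near those figures. Real survey items with unequal noise or coarse response categories will behave less cleanly.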

Great answers. Thanks!

proj
