I conducted a survey of members of a professional group, using an online survey service. The data from the survey was provided in a spreadsheet, along with simple breakdowns of the results (for example, numbers of males and females, or numbers of people in each age group).
In addition, I need to make tables or graphs showing combinations of the data (for example, number of males in each age group or number of females in each salary group).
Here is a sample of the data from the survey. Each row contains one person’s responses to the questions.
Below are some examples of the data I want to extract.
I’ve had very little experience using spreadsheets. (The last time I used one was around 8 years ago, and then I wasn’t doing anything very complicated.)
But I wasn’t concerned, because lots of people do this sort of basic analysis, so I assumed that the information would be readily available. Surely I could find an appropriate spreadsheet function or formula and apply it to my data.
No such luck. I’m still sure the information is out there, but I haven’t been able to find it. I wasn’t surprised to not find much detail about OpenOffice.org Calc usage, but I did expect to find a clue in Excel literature. I waded through the descriptions of functions in the OpenOffice.org online help, but many of the descriptions didn’t make much sense to me and even when they did make sense, the examples didn’t seem to fit my situation.
The closest match I could find was DCOUNT. I could get it to work on any one combination of cells (for example, females in age group 3), but not in a way that would generate a table of results without doing a lot of manual work. Perhaps I’ve missed some trick to make this function do what I want.
COUNTIF also seemed like a good candidate, but I couldn’t find a way to get that to work on two columns of data instead of just one.
SUMPRODUCT does the job
Finally I stumbled upon a great article by John Walkenbach titled “Count and Sum Your Data in Excel 2002″ (http://www.microsoft.com/office/previous/xp/columns/column10.asp), which gave me the answer.
Scroll down a ways in the article until you get to “Conditional Counting and Summing Using Multiple Conditions.” This does exactly what I want, but there were still some tricks to learn before I could get it to work for me.
Here’s the basic formula for calculating the number of females (gender group 1) in age group 1:
To make that formula work on my data, I had to make it an array formula. To do this, place the cursor in the cell where you want the results to appear, then type the formula into the Formula Bar, and finally press CTRL+SHIFT+ENTER.
The answer appears in the results cell, and the formula is shown in the Formula Bar. Note that the formula bar shows braces (curly brackets) around the formula you entered. These braces indicate that this is an array formula.
If you copy or change this formula, remember to press CTRL+SHIFT+ENTER when saving the changes.
A different (longer) way to input the formula is to use the AutoPilot: Functions icon on the Formula Bar:
- In the AutoPilot: Functions dialog, choose Array in the Category list on the Functions tab.
- In the Function list, select SUMPRODUCT.
- Put the cursor between the parentheses in the Formula box and type the rest of the formula.
- Select the Array checkbox. This step is essential, but you can do it at any time while the AutoPilot: Functions dialog is open. Figure 1 shows what the dialog looks like now.
- Click OK. The formula will be inserted into your spreadsheet.
Figure 1. The completed AutoPilot: Functions dialog
Replicating the formula into other cells
Now that I had the basic formula to do the calculation for one cell of my results table, I needed to replicate the formula in the other results cells.
Here are the formulas for the other cells in the first row of the Gender-Age Group table:
And here are the formulas for the second row of that table:
These were easy to input by copying and pasting the basic formula and changing the parameters as needed (remembering to press CTRL+SHIFT+ENTER), but it’s still a tedious and time-consuming process when you have lots of survey questions. I’m still hoping to find a trick to doing it with less manual fiddling about. When I do, I’ll write it up and put it on this website.