Back in the 1930s, it was established that smoking cigarettes is in some way correlated with lung cancer. But tobacco companies had a ready reply – “OK, while that MIGHT be true, a mere correlation doesn’t prove that cigarettes CAUSE lung cancer.”
This example shows both the pros and cons of correlational research. While it can be very useful in preliminary research, allowing us to determine that there’s some type of relationship between two variables, it’s not suitable for proving a cause-and-effect relationship between those variables.
Researchers and companies often rely on correlational research to avoid wasting money. How come? Well, you first want to demonstrate that there’s some kind of correlation between variables before pouring money and time into more substantial research.
But before we go into more detail, let’s start by defining correlational research and the types of correlation that can exist between variables.
What is correlational research?
Correlational research is a non-experimental type of research where we measure two variables to understand and assess the statistical relationship (correlation) between them without attempting to influence, control, or change the variables.
When two variables are correlated, seeing one of them change gives you a good idea of how the other is expected to change. There are three kinds of correlation between variables:
- Positive correlation – When both variables change in the same direction (as height increases, weight also increases)
- Negative correlation – When the two variables change in opposite directions (as coffee consumption increases, tiredness decreases)
- Zero correlation – When the two variables are not related (coffee consumption is not correlated with height)
Why and when to use correlational research (examples)
There are two main cases when you would prefer to use correlational research rather than experimental:
1. When you are looking to determine whether there is some kind of a relationship between two variables, but you don’t expect it to be causal.
Let’s say that you want to find out whether people with lower income are more likely to smoke cigarettes. While you don’t assume that income causes smoking (or the other way around), finding a relationship between the two could help better understand factors that influence people’s nicotine intake choices.
Another example – let’s say you’re looking to learn whether there’s any correlation between people’s marital status and the political option they favor. Your premise is not that being married causes people to vote in a certain way. Rather, it’s highly probable that both of these variables are influenced by other factors like age, ethnicity, ideology, religion, socioeconomic situation, and more. So if there’s a correlation between marital status and political views, you can explore some of these related factors as well to try to predict voting patterns.
2. When you think that there’s a causal relationship between the variables, but it would be unethical or impractical to conduct experiments that manipulate them.
For example, if you want to determine whether passive smoking can lead to asthma in children, it would of course be unethical to run an experiment that involves exposing children to cigarette smoke on purpose.
A workaround would be to do correlational research to discover whether children whose parents are smokers are more likely to be diagnosed with asthma than those whose parents don’t consume cigarettes.
How to collect data for correlational research
With correlational research, you are not allowed to manipulate any of the variables. But you’re free to collect the data in any way you find most purposeful. Here, we’re going to discuss some of the most common (and most reliable) data collection methods for correlational research – surveys, naturalistic observation, and secondary (archival) data.
Surveys
The quickest and simplest way to obtain data for correlational research is by using surveys and questionnaires. You can distribute surveys by mail, in person, or, most efficiently, online. You simply ask respondents various questions related to the variables you’re studying. After you’ve gathered the responses, you can analyze them statistically and determine whether there’s a correlation between the variables or not.
For example, if you’re looking to research the correlation between age and physical activity, you could send out a survey that asks about your respondents’ training routines and habits.
Make sure to send out the survey to a sample of people from different age groups. After you’ve collected your correlational research data, you need to statistically analyze the responses to find out whether people of a certain age tend to be more physically active.
As you can see from the examples above, one variable in correlational research is often a piece of demographic information. Ideally, you would already have this data available (from research someone else has done), but if you need to collect it yourself, a short demographic survey will do the job.
Naturalistic observation
I’d like to emphasize the word “naturalistic” here. It means that you collect data about a certain phenomenon in its natural environment without manipulating the behavior or intervening in any way.
Naturalistic observation includes recording, counting, describing, and categorizing what you hear and see. Even though it entails both quantitative and qualitative material, for correlational research, you need to focus on quantitative data (such as amounts, durations, frequencies, and more).
Compared to surveys, naturalistic observation can be more reliable (depending on the proficiency of the researcher), but it’s also more time-consuming and more difficult to predict and control.
Secondary (archival) data
One way to approach correlational research is by collecting original data. Another, faster and equally valid way is to use data that has already been gathered for another purpose (this could be official records, previous research, various public polls, and so on).
This method of obtaining correlational data works best in cases where you’re looking to research a gradual change or progression over a longer period of time. In other words, it’s much faster and easier to analyze data for the previous 3 years than to start gathering data now and wait for 3 years to complete your study. Of course, this isn’t always applicable.
How to calculate correlation? The Pearson Correlation Coefficient
In statistics, there are several types of correlation coefficients, but the most commonly used and most widely accepted is the Pearson correlation coefficient.
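Also known as the Pearson product-moment correlation coefficient, it measures the strength and direction of the linear relationship between two variables. Its value ranges from −1 to +1: values close to +1 indicate a strong positive correlation, values close to −1 a strong negative correlation, and values close to 0 little or no linear correlation.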
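For readers who want to see the arithmetic, the coefficient is the sum of the products of each variable's deviations from its mean, divided by the square root of the product of the two sums of squared deviations. Below is a minimal from-scratch sketch in Python. The function name `pearson_r` and the sample numbers are illustrative assumptions, and in real projects you would more likely use an established statistics library.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys):
        raise ValueError("both variables need the same number of observations")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are required")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (numerator) and sums of squared deviations.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)

    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("the coefficient is undefined when a variable is constant")

    return sum_xy / sqrt(sum_xx * sum_yy)


# Made-up example data: r = 6 / sqrt(10 * 6), which is roughly 0.775.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(f"r = {pearson_r(x, y):.3f}")
```

Note that the function refuses constant input: if one variable never changes, the denominator is zero and the coefficient is not defined.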
To learn more about how to calculate and interpret the correlation coefficient, check out this guide on the Pearson correlation coefficient formula. And start doing your own correlational research right away!