In order to diagnose and treat patients, psychologists use assessments. But how well do psychological tests work? In this lesson, we’ll look more closely at three areas that inform a good psychological assessment: reliability, validity, and bias.

Psychological Assessment

Imagine that you are a psychologist and Kevin comes to see you. He’s been having trouble concentrating recently. It’s gotten so bad that even in the middle of a conversation, he sometimes forgets who he’s talking to and what they are talking about. He thinks something might be very wrong with him.

You think Kevin might be right that something’s wrong. But what? There are many psychological disorders that could cause the symptoms that Kevin has. In order to diagnose him, you need to assess him. That is, you need to gather more information about Kevin and his symptoms.

Psychological assessments come in many different forms. Interviewing, brain imaging, genetic testing and intelligence testing are just a few types of psychological assessments. Each of these assessments has its strengths and weaknesses, and a single psychologist might use several assessments with a patient.

So, let’s say that you choose which assessments to use on Kevin. How do you know if the assessments you choose will actually help you? What if they don’t give you a good answer? What if they say he has a disorder, but he really doesn’t? In order to make sure a test is a good assessment, it must have reliability and validity. In addition, it should be free of bias.

Let’s look at each of these things in a little more depth.


Reliability

Remember Kevin? He comes to you with a problem, and you choose an assessment to help you diagnose him. For example, let’s say that you want to do an MRI, a type of brain scan, to see if Kevin has a tumor. So you put Kevin in the MRI machine and the computer says that nope, Kevin doesn’t have a tumor. But wait! Last week, Kevin had the same test done. That time, it showed that he does have a tumor.

What’s going on? In this case, the assessment (the MRI) lacks reliability. Reliability is when an assessment consistently delivers the same results. Of course, MRI machines usually have high reliability, but Kevin’s case is an exception.

Let’s look at another example of reliability. Imagine that you have a bathroom scale. You get on it one morning and it says that you weigh 157 pounds. A minute later, you get on it and it says that you weigh 208 pounds. A third time, it tells you that you weigh 81 pounds. You’re not changing your weight from minute to minute, so the scale is not reliable. If, however, you get on the scale three times in a row and each time the scale reads 150 pounds, it is reliable.

Of course, there’s usually some variance, so even if the scale reads 155, 157 and 156, you could say that it still has a high reliability.

There are two major types of reliability that are important in psychological assessments: inter-rater reliability and test-retest reliability.

Inter-rater reliability is when two people will come up with the same answer when using the same assessment. For example, let’s say that you have a questionnaire that will help you diagnose Kevin. You ask him the questions and he answers. From his answers, you diagnose him with attention deficit disorder.

But then Kevin goes down the street to another psychologist. That psychologist has the same questionnaire. She asks the same questions and gets the same responses from Kevin. But she diagnoses him with obsessive-compulsive disorder. In this example, the questionnaire has a low inter-rater reliability.
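One simple way to put a number on inter-rater reliability is percent agreement: the share of cases where two raters reach the same diagnosis. The sketch below is only an illustration. The function name `percent_agreement` and the diagnosis lists are made up for this example, and real studies often prefer chance-corrected statistics such as Cohen’s kappa.

```python
def percent_agreement(rater_a, rater_b):
    """Return the fraction of cases on which two raters gave the same diagnosis."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of cases.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical diagnoses for the same four patients (illustrative data only).
you = ["ADD", "OCD", "ADD", "none"]
colleague_low = ["OCD", "ADD", "ADD", "OCD"]    # agrees with you on 1 of 4 cases
colleague_high = ["ADD", "OCD", "ADD", "none"]  # agrees with you on all 4 cases

low = percent_agreement(you, colleague_low)
high = percent_agreement(you, colleague_high)
print(low, high)  # 0.25 1.0
```

A value near 1.0 suggests that the questionnaire leads different psychologists to the same conclusion, while a value near 0 suggests that the diagnosis depends heavily on who is doing the rating.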

If it had a high inter-rater reliability, each time a different psychologist used the questionnaire and got the same answers from Kevin, he or she would come up with the same diagnosis.

The other type of reliability is test-retest reliability, which is when an assessment yields the same answer over and over. Remember your bathroom scale? If the readings on the scale are 157, 208 and 81 pounds, it has a low test-retest reliability because every time you get on the scale, the number is very different. On the other hand, if it reads 150 every time, it has a high test-retest reliability.
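The bathroom scale comparison can be sketched in a few lines of Python. This is a rough illustration rather than a formal reliability statistic: the helper names `reading_spread` and `is_consistent` are invented here, and the 5-pound tolerance is an assumption chosen only to show that small variance can still count as consistent.

```python
def reading_spread(readings):
    """Return the gap between the highest and lowest reading."""
    if not readings:
        raise ValueError("Need at least one reading.")
    return max(readings) - min(readings)


def is_consistent(readings, tolerance=5):
    """Treat readings as consistent if they differ by no more than `tolerance`.

    The default tolerance of 5 pounds is an assumption for this example.
    """
    return reading_spread(readings) <= tolerance


erratic = [157, 208, 81]  # readings from the unreliable scale
steady = [150, 150, 150]  # readings from the reliable scale
close = [155, 157, 156]   # a little variance, still highly reliable

for label, readings in [("erratic", erratic), ("steady", steady), ("close", close)]:
    print(label, reading_spread(readings), is_consistent(readings))
# erratic 127 False
# steady 0 True
# close 2 True
```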


Validity

Obviously, it’s important for an assessment to have high reliability. But just because an assessment has reliability doesn’t mean it’s a good measure of what’s going on. Let’s go back to the bathroom scale example.

If your bathroom scale reads 150 pounds every time you step on it, it has a high reliability. But what if you actually weigh 200 pounds? In this case, your scale is reliable, but it’s not valid.

Validity is when an assessment accurately measures what it’s supposed to measure. Your bathroom scale has low validity because it is not accurately measuring your weight. Notice that the definition for validity has two parts: an assessment has to measure accurately and it has to measure what it is supposed to measure. Your bathroom scale may be measuring weight (what it’s supposed to measure) but not measuring it accurately.

On the other hand, what if your scale was accurately measuring something else? Let’s say that instead of measuring your weight, like it’s supposed to, your scale is actually measuring the weight of only the right half of your body. You might get an accurate reading, but it’s not the accurate reading you’re looking for.
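To see how a scale can be reliable without being valid, it helps to look at two separate numbers: how much the readings vary, and how far they sit from the true value. The sketch below uses the figures from the scale example. The function names `consistency_gap` and `average_error` are hypothetical, and in real psychological testing the "true value" is rarely known this directly.

```python
from statistics import mean


def consistency_gap(readings):
    """Gap between the highest and lowest reading (small gap = consistent)."""
    return max(readings) - min(readings)


def average_error(readings, true_value):
    """Average reading minus the true value (near zero = accurate)."""
    return mean(readings) - true_value


readings = [150, 150, 150]  # the scale reads 150 pounds every time
true_weight = 200           # but you actually weigh 200 pounds

spread = consistency_gap(readings)
error = average_error(readings, true_weight)
print(spread, error)  # 0 -50
```

The spread of 0 says the scale is perfectly consistent, while the error of -50 pounds says it is consistently wrong. Consistency alone tells you nothing about accuracy.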

Let’s go back to Kevin for a moment. Again, let’s say that you have a questionnaire to help you diagnose whether Kevin has attention deficit disorder or not. There are two ways that this questionnaire could fail to have validity. Maybe it’s not accurate and says that everyone has ADD. Or, maybe it’s not really measuring ADD at all.

The question of construct validity, or whether an assessment is measuring what it is supposed to measure, is a common one in psychology. For example, do IQ tests really measure intelligence? Is a test that’s meant to diagnose depression actually measuring pessimism? These are questions of validity.


Bias

As we’ve already seen, a good psychological assessment has both reliability and validity. But it should also be free of bias. Assessment bias is when different groups of people consistently have different outcomes on a test for reasons unrelated to what the test is meant to measure.

Remember that questionnaire you have to see if Kevin has ADD? What if the way the questions were worded meant that blonde people are more likely to be diagnosed with ADD than brunettes?

Though this might seem like a silly example, assessment bias is a serious issue in psychological testing. Studies have shown race and gender bias, as well as other biases, in many types of assessments. Intelligence tests, psychological evaluations and many other assessments might have a bias.
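A first, rough check for possible bias is to compare how often each group receives a diagnosis. The sketch below does this for the blonde and brunette example. The records are invented purely for illustration, the function name `diagnosis_rates` is hypothetical, and a gap between groups is only a flag worth investigating, not proof that the questionnaire itself is biased.

```python
from collections import defaultdict


def diagnosis_rates(records):
    """Return the share of people diagnosed within each group."""
    totals = defaultdict(int)
    diagnosed = defaultdict(int)
    for group, was_diagnosed in records:
        totals[group] += 1
        diagnosed[group] += int(was_diagnosed)
    return {group: diagnosed[group] / totals[group] for group in totals}


# Made-up results: (hair color, diagnosed with ADD?). Illustrative data only.
records = [
    ("blonde", True), ("blonde", True), ("blonde", True), ("blonde", False),
    ("brunette", True), ("brunette", False), ("brunette", False), ("brunette", False),
]

rates = diagnosis_rates(records)
gap = rates["blonde"] - rates["brunette"]
print(rates, gap)  # {'blonde': 0.75, 'brunette': 0.25} 0.5
```

If hair color has nothing to do with attention problems, a gap this large would be a reason to take a closer look at how the questions are worded.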

However, the best psychological assessments are free of bias, valid and reliable.

Lesson Summary

Psychological assessments help psychologists diagnose and treat patients. There are three major issues in psychological testing: reliability, validity and bias.

Reliability is when a test consistently delivers the same results, either over time or across psychologists. Validity is when a test accurately measures what it’s supposed to measure. Finally, tests should be free of bias, which is when different groups of people consistently have different scores on an assessment for reasons unrelated to what the assessment is meant to measure.

Learning Outcomes

After finishing this lesson, you may find that you can:

  • Understand reliability in a psychological assessment
  • Discern the importance of validity in measurement
  • Impart the necessity of a bias-free assessment

