Grouping Variable (Coding Variable)

Types of Variables > Grouping Variable

What is a Grouping Variable?

A grouping variable (also called a coding variable, group variable or by variable) sorts data within data files into categories or groups. It tells a computer system how you’ve sorted data into groups. Grouping variables can be:

Usually, you can name a group anything, as long as it makes sense to you (and that you tell the software about your naming convention). For example, if you have an experimental group and a control group you could name the groups:

  • 0 1(binary).
  • EXPERIMENTAL CONTROL(categorical). You could also name them EXPER. and CONTR. or E and C.
  • 1,2.(numerical).

You could even categorize your groups as X And Y, although it makes more sense to keep the category names meaningful. If you look back at your data in 10 years, you’ll be glad you created names that jog your memory.

Note: Sometimes, an author might use “grouping variable” as a synonym for the independent variable in tests like MANOVA.

Use in Software

Grouping variables are typically used in software. Each software has its own quirks and requirements when it comes to naming variables. For example:

grouping variable
Data grouped into categories under “Method.”

  • In Statistica, usually the group is identified by a number (i.e. Group 1, 2 or 3) or by a categorical label, like MALE or FEMALE. These values are called codes and you can specify up to 1,000 of them.
  • In SPSS, grouping variables are defined on the worksheet and specified within a test window (for example, the Independent Samples T Test or Tests for Several Independent Samples window). For example, let’s say your worksheet has 1 for male and 2 for female. In the T Test window, move the independent variable down to the Grouping Variable box. Click “Define Groups” and enter your labels (i.e. 1,2).
  • In MATLAB, you’ve got many options in addition to numeric or categorical variables. For example, character arrays can store multiple characters, while cell arrays can store multiple strings in the same variable (“cell array of strings”).

Comments? Need to post a correction? Please Contact Us.