What are Identifier Variables?
Identifier variables are categorical variables that have a single individual per category. For example:
- A Social Security Number.
- Interviewer ID number.
- Employee ID number.
Because each identifier value occurs only once, there is nothing meaningful to analyze in the identifiers themselves (the average of a set of employee ID numbers, for instance, tells you nothing). Instead, they are used to identify results. For example, if you perform a series of 12 tests, you get 12 results. Plug those results into a program (e.g. SPSS or Excel) and the program doesn’t know the first piece of data from the last. Adding a unique identifier to each of the 12 results lets you keep track of which result is which when you run your data analysis.
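As a minimal sketch of this idea in Python (the scores and the name `test_id` are invented for illustration), attaching an identifier to each of the 12 results means a result can still be traced after the data is reordered:

```python
# Twelve hypothetical test results. Without identifiers, only their
# position in the list tells them apart.
results = [71, 64, 88, 92, 55, 79, 83, 60, 95, 68, 74, 81]

# Attach a unique identifier (test_id, a hypothetical name) to each result.
labelled = {test_id: score for test_id, score in enumerate(results, start=1)}

# Sorting by score changes the order, but each score keeps its identifier.
ranked = sorted(labelled.items(), key=lambda item: item[1], reverse=True)
best_test_id, best_score = ranked[0]

print(f"Highest score {best_score} came from test {best_test_id}")
```

The identifier itself is never averaged or summed; it only labels the values that are analyzed.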
Identifier Variables and Unique Identifiers
Identifier variables are commonly used as unique identifiers for data collection purposes. A unique identifier is a variable that identifies a particular data collection time, a single interview, or another “moment in time” occurrence. Multiple identifier variables can be combined to make up a unique identifier. For example, a series of interviews could be:
- Employee 657, Interviewer 1
- Employee 658, Interviewer 1
- Employee 657, Interviewer 2
- Employee 658, Interviewer 2
You should use as many identifier variables as necessary to uniquely identify the situation. In the simplest situation, such as a single person interviewing a series of employees, only one variable may be needed for the unique identifier (i.e. just the employee numbers).
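The interview example above can be checked with a short Python sketch. The field names `employee_id` and `interviewer_id` and the helper `is_unique_key` are hypothetical, chosen only for this illustration:

```python
from collections import Counter

# The four interviews listed above, as hypothetical records.
interviews = [
    {"employee_id": 657, "interviewer_id": 1},
    {"employee_id": 658, "interviewer_id": 1},
    {"employee_id": 657, "interviewer_id": 2},
    {"employee_id": 658, "interviewer_id": 2},
]


def is_unique_key(records, fields):
    """Return True if the combination of `fields` differs for every record."""
    counts = Counter(tuple(record[f] for f in fields) for record in records)
    return all(n == 1 for n in counts.values())


# The employee number alone repeats, so it is not a unique identifier here.
employee_only = is_unique_key(interviews, ["employee_id"])

# Employee number plus interviewer number identifies each interview.
both = is_unique_key(interviews, ["employee_id", "interviewer_id"])

print(employee_only, both)
```

If only one interviewer were involved, the employee number alone would pass the same check.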
Use in SDTM
Identifier variables are one type of variable used in the Study Data Tabulation Model (SDTM), a standard structure for organizing data collected from clinical trials. The format is needed for submission of product applications to the FDA and other authorities. In SDTM, identifier variables identify:
- The study.
- The subject of the observation.
- The domain.
- The sequence number of the record.
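As a rough sketch, the four items above correspond to the SDTM variables STUDYID, USUBJID, DOMAIN and a domain-prefixed sequence variable (VSSEQ in a Vital Signs domain, for example). The record values and the helper `record_key` below are invented for illustration and do not come from any real study:

```python
# Hypothetical Vital Signs (VS) records; every value here is made up.
records = [
    {"STUDYID": "ABC-001", "DOMAIN": "VS", "USUBJID": "ABC-001-0001", "VSSEQ": 1},
    {"STUDYID": "ABC-001", "DOMAIN": "VS", "USUBJID": "ABC-001-0001", "VSSEQ": 2},
    {"STUDYID": "ABC-001", "DOMAIN": "VS", "USUBJID": "ABC-001-0002", "VSSEQ": 1},
]


def record_key(record):
    """Combine the identifier variables into one key for the record."""
    # The sequence variable is named after the domain, e.g. "VS" + "SEQ".
    seq_name = record["DOMAIN"] + "SEQ"
    return (record["STUDYID"], record["DOMAIN"], record["USUBJID"], record[seq_name])


keys = [record_key(r) for r in records]

# Together, the identifiers should single out every record.
all_unique = len(keys) == len(set(keys))
print(all_unique)
```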
Identifier Variables in Programming
Identifier variables are used in programming (e.g. in Stata) in the same way they are used in data analysis: they are needed to identify unique occurrences of something. For example, “DAY”, “DAY_OF_WEEK”, “MONTH”, “MONTH_NAME”, and “YEAR” could all be used to identify a particular date. How these variables are used in various computer programs is outside of the scope of this site, but you can find more examples of how they are used on the Novell Linux website.
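For instance, a minimal Python sketch (the row values are invented) shows how DAY, MONTH and YEAR together pin down a single date, from which DAY_OF_WEEK and MONTH_NAME can then be looked up:

```python
import calendar
from datetime import date

# A hypothetical row holding the three numeric date identifiers.
row = {"DAY": 14, "MONTH": 3, "YEAR": 2021}

# Combined, the three values identify exactly one calendar date.
full_date = date(row["YEAR"], row["MONTH"], row["DAY"])

# The descriptive variables follow from that date.
row["DAY_OF_WEEK"] = calendar.day_name[full_date.weekday()]
row["MONTH_NAME"] = calendar.month_name[full_date.month]

print(full_date.isoformat(), row["DAY_OF_WEEK"], row["MONTH_NAME"])
```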