The Zygocity questionnaire was developed by the Norwegian Institute of Public Health (FHI; Folkehelseinstituttet) for its twin registry studies. It is a series of questions probing the similarities between twins, used to determine whether they are mono- or dizygotic.
zygo_compute(
  data,
  twin_col,
  cols,
  recode = TRUE,
  prefix = "zygo_",
  keep_all = FALSE
)
data: data.frame with the relevant data
twin_col: column that codes for twin pairs. Both twins in a pair should have the same identifier here.
cols: columns that contain the zygocity data. Use tidy-selectors.
recode: logical indicating if data should be recoded from 1-5(7) to -1, 0, 1.
prefix: string to prefix column names of computed values
keep_all: logical indicating whether to append the computed values to the input data.frame
Returns a data.frame with computed values.
This note contains a brief description of the algorithm used to determine zygocity in recruitment in the 2000s.
Name | Answer questions about... | Used for |
Drop | You and your twin were like two drops of water in childhood | Pairs and singles |
Stranger | Strangers had trouble telling the difference when you were children | Pairs and singles |
Eye | Similarity in terms of eye color | Pairs |
Voice | Similarity in terms of voice | Singles |
Dexter | Similarity in dexterity | Pairs and singles |
Belief | What you believe yourself | Pairs and singles |
"Single" twins here means those who have responded alone, i.e. there is no data available for both in the pair. The similarity questions that are not found in the table above, e.g. whether or not family members had problems distinguishing the twins is not used in the classification.
During calculations of the entire zygocity score, weights are applied to the different categories, depending on whether one or both twins have responded to the questionnaire.
Name | Answer questions about... | Factor single | Factor pair |
Drop | You and your twin were like two drops of water | 1.494 | 2.111 |
Stranger | Strangers had trouble seeing the difference | 0.647 | 0.691 |
Eye | Similarity in terms of eye color | | 0.394 |
Voice | Similarity in terms of voice | 0.347 | |
Dexter | Similarity in dexterity | 0.458 | 0.366 |
Belief | What you believe yourself | 0.417 | 0.481 |
Constant | Constant term in the formula | 0.007 | -0.087 |
"Form value" is the value the answer option has in the data file. "Score value" is the value used in the algorithm when zygocity is calculated.
Variable | Answer option | Form value | Score value |
Drop | Like two drops of water | 1 | 1 |
Drop | Like most siblings | 2 | -1 |
Drop | Don't know | 3 | 0 |
Stranger | Often | 1 | 1 |
Stranger | Occasionally | 2 | 0 |
Stranger | Never | 3 | -1 |
Stranger | Don't know | 4 | 0 |
Belief | Monozygotic | 1 | 1 |
Belief | Dizygotic | 2 | -1 |
Belief | Don't know | 3 | 0 |
Eye, Voice & Dexter | Exactly the same | 1 | 1 |
Eye, Voice & Dexter | Almost the same | 2 | 0 |
Eye, Voice & Dexter | Different | 3 | -1 |
Eye, Voice & Dexter | Don't know | 4 | 0 |
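The mapping from form values to score values can be sketched in base R as a set of lookup vectors. This is only an illustration of the coding table above: `score_maps` and `recode_form()` are hypothetical names, not functions from the package, and the recoding done internally by `zygo_compute()` may differ in its details.

```r
# Lookup vectors: position = form value, element = score value.
# These mirror the coding table; the names are illustrative only.
score_maps <- list(
  drop       = c(1, -1, 0),     # 1 = two drops of water, 2 = most siblings, 3 = don't know
  stranger   = c(1, 0, -1, 0),  # 1 = often, 2 = occasionally, 3 = never, 4 = don't know
  belief     = c(1, -1, 0),     # 1 = monozygotic, 2 = dizygotic, 3 = don't know
  similarity = c(1, 0, -1, 0)   # Eye, Voice, Dexter: 1 = same, 2 = almost, 3 = different, 4 = don't know
)

# Hypothetical helper: translate form values into score values.
# Missing or out-of-range form values are returned as NA.
recode_form <- function(x, map) {
  idx <- as.integer(x)
  idx[!is.na(idx) & (idx < 1L | idx > length(map))] <- NA_integer_
  map[idx]
}

recode_form(c(1, 2, 3), score_maps$drop)
recode_form(c(1, 2, 3, 4, NA), score_maps$stranger)
```

Keeping the mapping in plain vectors makes it easy to compare against the table row by row.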
The form values are never used directly in the calculations; only the score values are. In the following, it is these score values (-1, 0 or 1) that enter the algorithms. For example, Drop takes the value 1 in the formula when a twin answers that the pair were like two drops of water.
The higher the absolute value of the final score, the more certain / clearer the classification. For answers that reveal greater uncertainty about the similarity (e.g. a greater proportion of "almost" and "don't know"), the value will be closer to zero.
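For pairs where both twins have answered, the pair's average is first calculated for each score value. That is, Drop = (Drop1 + Drop2) / 2, etc., where Drop1 is the score value of the response from twin 1 and Drop2 is the score value of the response from twin 2 in the same pair. The averages are then combined using the pair factors and the pair constant from the weights table:
Pair score = 2.111 * Drop + 0.691 * Stranger + 0.394 * Eye + 0.366 * Dexter + 0.481 * Belief - 0.087
The sign of this "pair score" is then used to determine zygocity in the same way as for single responders: a negative value means dizygotic, a positive value means monozygotic.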
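As a rough illustration of the pair calculation, the sketch below averages the two twins' score values per item and applies the "Factor pair" weights from the weights table. It assumes that Eye (0.394) is the pair-only item, as the "Used for" table indicates. `pair_score()` and `classify()` are hypothetical helpers, and the toy answers are made up. The text does not say how a score of exactly zero is classified, so the sketch returns NA in that case.

```r
# Pair weights taken from the "Factor pair" column of the weights table.
# Assumption: Eye (0.394) is used for pairs only, per the "Used for" table.
pair_weights  <- c(drop = 2.111, stranger = 0.691, eye = 0.394,
                   dexter = 0.366, belief = 0.481)
pair_constant <- -0.087

# Hypothetical helper: twin1 and twin2 are named numeric vectors of
# score values (-1, 0, 1), one element per item.
pair_score <- function(twin1, twin2) {
  items <- names(pair_weights)
  pair_mean <- (twin1[items] + twin2[items]) / 2  # e.g. Drop = (Drop1 + Drop2) / 2
  sum(pair_weights * pair_mean) + pair_constant
}

# Negative = dizygotic, positive = monozygotic; exactly zero is left as NA.
classify <- function(score) {
  ifelse(score < 0, "dizygotic",
         ifelse(score > 0, "monozygotic", NA_character_))
}

# Toy score values for one pair (illustrative only).
twin1 <- c(drop = 1, stranger = 1, eye = 1, dexter = 0, belief = 1)
twin2 <- c(drop = 1, stranger = 0, eye = 1, dexter = 1, belief = 0)

score <- pair_score(twin1, twin2)
score
classify(score)
```

Because the items are matched by name, the order of the elements in `twin1` and `twin2` does not matter.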
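If only one twin in the pair has responded, the following is calculated from that twin's score values, using the single factors and the single constant from the weights table:
Single score = 1.494 * Drop + 0.647 * Stranger + 0.347 * Voice + 0.458 * Dexter + 0.417 * Belief + 0.007
The sign of this "single score" is then used to determine the zygocity: a negative value means dizygotic (two eggs), a positive value means monozygotic (one egg).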
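The single-responder calculation can be sketched for several respondents at once as a weighted row sum. The weights come from the "Factor single" column of the weights table, assuming Voice (0.347) is the single-only item as the "Used for" table indicates. The `singles` data frame and its column names are made-up illustration data, not the package's format.

```r
# Single weights taken from the "Factor single" column of the weights table.
single_weights  <- c(drop = 1.494, stranger = 0.647, voice = 0.347,
                     dexter = 0.458, belief = 0.417)
single_constant <- 0.007

# Toy score values (-1, 0, 1) for three single responders (illustrative only).
singles <- data.frame(
  drop     = c(1, -1, 0),
  stranger = c(1, -1, 0),
  voice    = c(0,  0, 0),
  dexter   = c(1,  0, 0),
  belief   = c(1, -1, 0)
)

# Weighted sum per row plus the constant term.
singles$score <- as.vector(
  as.matrix(singles[names(single_weights)]) %*% single_weights
) + single_constant

# Map the sign to a label: -1 = dizygotic, 0 = NA (not covered by the text), 1 = monozygotic.
singles$zygocity <- c("dizygotic", NA, "monozygotic")[sign(singles$score) + 2]
singles
```

The third respondent answered "don't know" or "almost" throughout, so the score is just the constant term (0.007). This shows how uncertain answers give a score close to zero.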
By default, the functions assume that columns are named zygocity_XX, where XX is the zero-padded question number of the inventory (i.e. numbers below 10 get a leading zero, e.g. 09).
You may have column names in another format, but in that case you will need to supply to the functions the names of those columns using tidy-selectors (see the tidyverse packages for this).
The columns should adhere to some naming logic that is easy to specify.
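To make the naming convention concrete, the sketch below builds default-style names with `sprintf()` and selects columns that follow a different but regular pattern. The number of items (10), the data frame `dat` and the `zyg_q` prefix are all made up for illustration. The pattern match is done in base R so the example runs on its own. With tidy-selectors, a helper such as `starts_with("zyg_q")` or `matches("^zyg_q[0-9]+$")` would play the same role.

```r
# Default-style, zero-padded names for a hypothetical 10-item inventory.
default_cols <- sprintf("zygocity_%02d", 1:10)
default_cols

# Toy data with another, but regular, naming scheme (names are made up).
dat <- data.frame(
  pair_id = c(1, 1),
  zyg_q1  = c(1, 1),
  zyg_q2  = c(2, 1),
  age     = c(30, 30)
)

# Pick out the questionnaire columns by pattern.
zyg_cols <- grep("^zyg_q[0-9]+$", names(dat), value = TRUE)
zyg_cols
```

A consistent prefix plus a number is usually enough for one selector to capture all questionnaire columns.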