Education variables are one of the more complicated variables in the NOAS, as they have been asked in several different ways and can be coded in a multitude of ways. This is even more complicated by the lifespan aspect of our data, where not all participants have been attending school for the same number of years despite completing mandatory schooling in Norway. While the education variables in the NOAS attempt to reflect the real educational levels of the participants, the changes in the Norwegian school system means the variables need to be interpreted with care.

The categorical variables are nice because they treat all participants the same not matter which shcool system they were under. They are also gross simplifications. The numerical columns might be more precise, but participants interpret these differently, and as they are them selves asked to count their years of education. So while more fine grained, these also pose difficulties.

Norwegian school history

In Norway, a law for 9 years of mandatory schooling was implemented in 1969. Before this, there was a (largely) 7 year primary school (folkeskole) in most districts. After 1969 mandatory school (grunnskole) was implemented which was split into two parts primary school (barneskolen, years 1-6) and secondary school (ungdsomsskolen, year 7-9).

M-74

In 1974, a school reform called “Mønsterplanen” (M74) was implemented as a regulation of education. This was an effort to get the schooling in the various districts to become more equal, but still had large focus on the teachers choosing what materials in schools were to be taught. Certain subjects were made mandatory, while the curriculum was decided more by the teachers. English was made mandatory for all, and arts and crafts were introduced.

Reforms in 1987 og 1997

In 1987 a new plan was implemented, which more precisely dictated what curriculum should be used in the schools. One of the primary reasons for the change was the view that it had become more difficult to grow up and that the schools should compensate for some of this. Schools should now represent a protective environment and culture against the new waves in society. Important elements in M87 were care, work on attitudes and class environments.

Only 10 years later a new reform in primary school was implemented, L97, where the mandatory school was extended to 10 years, and children started school 1 year earlier (at 6 years of age rather than 10).

Sources

Tønnessen (1995): Norsk utdanningshistorie. En innføring. Høigård & Ruge (1971): Den norske skolens historie. En oversikt.

NOAS data

NOAS variable Explanation
edu_years Education Years to highest completed
edu_total Education Years of total education rounded down to closest integer
edu_coded_unknown Education Coded - unknown
edu_coded4 Education Coded 4 categories
edu_coded10 Education Coded 10 categories
edu_desc Education Free text description

The functions created in this package is to make it transparent and easier to work with NOAS functions, particularly the categorical ones. All the functions start with edu, and should have names to indicate what they do.

Assessing factor levels

There are several functions to help assess the different coding schemes we have (currently two, 4 and 9). To inspect the coding schemas, you can call the edu_levels() function, with either 4 or 9 as argument.

library(questionnaires)
edu_levels(4)
#>                  Primary school (9 years) 
#>                                         9 
#>                               High school 
#>                                        12 
#> University/University college (< 4 years) 
#>                                        16 
#> University/University college (> 4 years) 
#>                                        19
edu_levels(9)
#>                                     Pre-school/No schooling 
#>                                                           0 
#>                                    Primary school (6 years) 
#>                                                           6 
#>                                  Secondary school (9 years) 
#>                                                           9 
#>                                      High school (12 years) 
#>                                                          12 
#>                              High school diploma (13 years) 
#>                                                          13 
#>                             High school addition (14 years) 
#>                                                          14 
#> Lower level University/University college degree (16 years) 
#>                                                          16 
#>        Upper level University/University college (19 years) 
#>                                                          19 
#>                                            Ph.D. (21 years) 
#>                                                          21

These are the schemas that the remaining edu_functions require to work properly and aplpy the correct schema. The levels scheme is adaptable, and should we adopt another we do not use now, a new schemas needs to be specified and hopefully should work farily well with the other functions. edu_levels returns a names numeric vector, meaning that the vector it self contains numbers. This means you can do computations with the values returned. To access the names you need to explicitly ask for them through the names function.

names(edu_levels(4))
#> [1] "Primary school (9 years)"                 
#> [2] "High school"                              
#> [3] "University/University college (< 4 years)"
#> [4] "University/University college (> 4 years)"

edu_levels(9) %>% 
  names()
#> [1] "Pre-school/No schooling"                                    
#> [2] "Primary school (6 years)"                                   
#> [3] "Secondary school (9 years)"                                 
#> [4] "High school (12 years)"                                     
#> [5] "High school diploma (13 years)"                             
#> [6] "High school addition (14 years)"                            
#> [7] "Lower level University/University college degree (16 years)"
#> [8] "Upper level University/University college (19 years)"       
#> [9] "Ph.D. (21 years)"

Recoding vector into factor

While the NOAS is pre-curated, these functions are also applied to the NOAS-data to ensure consistency across projects. The edu_factorise functions help evaluate the data and create correct factors. These functions require the data inputed to be either numeric or in character, and can handle is data has been mixed (i.e. numeric mixed with character).

edu_factorise(c("3", "High school", "University/University college (> 4 years)", 
"University/University college (> 4 years)", "University/University college (> 4 years)", 
"University/University college (> 4 years)"), levels = 4)
#> [1] University/University college (< 4 years)
#> [2] High school                              
#> [3] University/University college (> 4 years)
#> [4] University/University college (> 4 years)
#> [5] University/University college (> 4 years)
#> [6] University/University college (> 4 years)
#> 4 Levels: Primary school (9 years) ... University/University college (> 4 years)
#> Numeric levels: 9 12 16 19

edu_factorise(c(7,7,8,4,2,5), levels = 9)
#> [1] Lower level University/University college degree (16 years)
#> [2] Lower level University/University college degree (16 years)
#> [3] Upper level University/University college (19 years)       
#> [4] High school (12 years)                                     
#> [5] Primary school (6 years)                                   
#> [6] High school diploma (13 years)                             
#> 9 Levels: Pre-school/No schooling ... Ph.D. (21 years)
#> Numeric levels: 0 6 9 12 13 14 16 19 21

The main edu_factorise function takes two arguments, the vector (x) and levels of the coding scheme to apply (4 or 9). The function it will return will have the correct labels, and their levels will correspond to the number of years generally needed to complete said education. There are specific functions for the two schema, if you want to call them directly. You will find many of the edu-functions have this system, a main function needing a levels argument, or specialized functions for the two coding schema. The specialized functions always call the main one, with pre-set values to levels.

edu4_factorise(c("3", "High school", "University/University college (> 4 years)", 
"University/University college (> 4 years)", "University/University college (> 4 years)", 
"University/University college (> 4 years)"))
#> [1] University/University college (< 4 years)
#> [2] High school                              
#> [3] University/University college (> 4 years)
#> [4] University/University college (> 4 years)
#> [5] University/University college (> 4 years)
#> [6] University/University college (> 4 years)
#> 4 Levels: Primary school (9 years) ... University/University college (> 4 years)
#> Numeric levels: 9 12 16 19

edu9_factorise(c(7,7,8,4,2,5))
#> [1] Lower level University/University college degree (16 years)
#> [2] Lower level University/University college degree (16 years)
#> [3] Upper level University/University college (19 years)       
#> [4] High school (12 years)                                     
#> [5] Primary school (6 years)                                   
#> [6] High school diploma (13 years)                             
#> 9 Levels: Pre-school/No schooling ... Ph.D. (21 years)
#> Numeric levels: 0 6 9 12 13 14 16 19 21

Alter factor to year

Data from coding schema, punched as levels of the factors (as seen in edu_levels) can be transformed into year equivalents with the edu_to_year functions. While these functions do not require data to already be in factors to work, but double checking that the expected values are returned when turning data into factors, before converting the factors to numbers is recommended.

# Check that factors are as expected
c("3", "High school", "University/University college (> 4 years)", 
"University/University college (> 4 years)", "University/University college (> 4 years)", 
"University/University college (> 4 years)") %>% 
  edu4_factorise()
#> [1] University/University college (< 4 years)
#> [2] High school                              
#> [3] University/University college (> 4 years)
#> [4] University/University college (> 4 years)
#> [5] University/University college (> 4 years)
#> [6] University/University college (> 4 years)
#> 4 Levels: Primary school (9 years) ... University/University college (> 4 years)
#> Numeric levels: 9 12 16 19

# or call the main function directly, specifying the levels of the schema
c("3", "High school", "University/University college (> 4 years)", 
"University/University college (> 4 years)", "University/University college (> 4 years)", 
"University/University college (> 4 years)") %>% 
  edu_factorise( levels = 4)
#> [1] University/University college (< 4 years)
#> [2] High school                              
#> [3] University/University college (> 4 years)
#> [4] University/University college (> 4 years)
#> [5] University/University college (> 4 years)
#> [6] University/University college (> 4 years)
#> 4 Levels: Primary school (9 years) ... University/University college (> 4 years)
#> Numeric levels: 9 12 16 19

# convert to factor then to year
c("3", "High school", "University/University college (> 4 years)", 
"University/University college (> 4 years)", "University/University college (> 4 years)", 
"University/University college (> 4 years)") %>% 
  edu4_factorise() %>% 
  edu4_to_years()
#> [1] 16 12 19 19 19 19

# you can skip factorizing as a middle step, 
# if you are certain of the schema being applied correctly
c("3", "High school", "University/University college (> 4 years)", 
"University/University college (> 4 years)", "University/University college (> 4 years)", 
"University/University college (> 4 years)") %>% 
  edu4_to_years()
#> [1] 16 12 19 19 19 19

Converting between schemas

Since edu_Coded10 has many more categories than edu_Coded4 and they have not been collected simultaneously, you can also reduce the 10 categories to 4 by the function edu10_reduce. The categories are reduced by the following heuristic:

edu_map(from = 9, to =4)
#> # A tibble: 8 × 2
#>    from    to
#>   <dbl> <dbl>
#> 1     0    NA
#> 2     9     9
#> 3    12    12
#> 4    13    12
#> 5    14    12
#> 6    16    16
#> 7    19    19
#> 8    21    19

edu_map_chr(from = 9, to =4)
#> # A tibble: 8 × 2
#>   from                                                        to                
#>   <chr>                                                       <chr>             
#> 1 Pre-school/No schooling                                     NA                
#> 2 Secondary school (9 years)                                  Primary school (9…
#> 3 High school (12 years)                                      High school       
#> 4 High school diploma (13 years)                              High school       
#> 5 High school addition (14 years)                             High school       
#> 6 Lower level University/University college degree (16 years) University/Univer…
#> 7 Upper level University/University college (19 years)        University/Univer…
#> 8 Ph.D. (21 years)                                            University/Univer…
c(7,7,8,4,2,5) %>% 
  edu9_reduce(to = 4)
#> [1] University/University college (< 4 years)
#> [2] University/University college (< 4 years)
#> [3] University/University college (> 4 years)
#> [4] High school                              
#> [5] <NA>                                     
#> [6] High school                              
#> 4 Levels: Primary school (9 years) ... University/University college (> 4 years)
#> Numeric levels: 9 12 16 19

c(7,7,8,4,2,5) %>% 
  edu_reduce(from = 9, to = 4)
#> [1] University/University college (< 4 years)
#> [2] University/University college (< 4 years)
#> [3] University/University college (> 4 years)
#> [4] High school                              
#> [5] <NA>                                     
#> [6] High school                              
#> 4 Levels: Primary school (9 years) ... University/University college (> 4 years)
#> Numeric levels: 9 12 16 19

Compute education years from all available schemas

Some participants will have provided education information to us several times, likely using different coding schema. This means we might have better, finer, information regarding participants education at later points of data collection. Particularly when it comes to estimating actual years of education, the more levels, the better precision we have. Also, in later data collection we directly ask participants to calculate the number of total years they have been in full-time education. As there are currently three main sources: edu4, edu10 and edu_years, we use this to get to the best estimate we can. The edu_compute function requires a full dataset with at least three columns that include this information, they do not have to have the same column-names as in the NOAS, as you will need to specify this your self. The function will do data checks in the following order:

  1. if edu_years is not NA, use this data
  2. if edu10 is not NA, turn factor to years
  3. if edu4 is not NA, turn factor to years
edu_data <- data.frame(
  edu4 = c("3", "High school", 1, NA,
           "University/University college (> 4 years)", NA, 
           "University/University college (< 4 years)"),
  edu9 = c(7,7,8,NA,"Primary school (6 years)",5, 9),
  edu_years = c(NA, 12, 9, NA, 19, 19, NA),
  mother = c("3", "High school", 1, NA,
             "University/University college (> 4 years)", "University/University college (> 4 years)", 
             "University/University college (< 4 years)"),
  father = c(7,7,8,4,"Primary school (6 years)",5, 10),
  stringsAsFactors = FALSE
)

edu_data
#>                                        edu4                     edu9 edu_years
#> 1                                         3                        7        NA
#> 2                               High school                        7        12
#> 3                                         1                        8         9
#> 4                                      <NA>                     <NA>        NA
#> 5 University/University college (> 4 years) Primary school (6 years)        19
#> 6                                      <NA>                        5        19
#> 7 University/University college (< 4 years)                        9        NA
#>                                      mother                   father
#> 1                                         3                        7
#> 2                               High school                        7
#> 3                                         1                        8
#> 4                                      <NA>                        4
#> 5 University/University college (> 4 years) Primary school (6 years)
#> 6 University/University college (> 4 years)                        5
#> 7 University/University college (< 4 years)                       10

edu_compute(edu_data,
            edu4 = edu4, 
            edu9 = edu9, 
            edu_years = edu_years)
#> New names:
#>  `edu_years` -> `edu_years...3`
#>  `edu_years` -> `edu_years...8`
#>                                        edu4                     edu9
#> 1                                         3                        7
#> 2                               High school                        7
#> 3                                         1                        8
#> 4                                      <NA>                     <NA>
#> 5 University/University college (> 4 years) Primary school (6 years)
#> 6                                      <NA>                        5
#> 7 University/University college (< 4 years)                        9
#>   edu_years...3                                    mother
#> 1            NA                                         3
#> 2            12                               High school
#> 3             9                                         1
#> 4            NA                                      <NA>
#> 5            19 University/University college (> 4 years)
#> 6            19 University/University college (> 4 years)
#> 7            NA University/University college (< 4 years)
#>                     father
#> 1                        7
#> 2                        7
#> 3                        8
#> 4                        4
#> 5 Primary school (6 years)
#> 6                        5
#> 7                       10
#>                                                    edu_coded9
#> 1 Lower level University/University college degree (16 years)
#> 2 Lower level University/University college degree (16 years)
#> 3        Upper level University/University college (19 years)
#> 4                                                        <NA>
#> 5                                    Primary school (6 years)
#> 6                              High school diploma (13 years)
#> 7                                            Ph.D. (21 years)
#>                                  edu_coded4 edu_years...8
#> 1 University/University college (< 4 years)            16
#> 2                               High school            12
#> 3                  Primary school (9 years)             9
#> 4                                      <NA>            NA
#> 5 University/University college (> 4 years)            19
#> 6                                      <NA>            19
#> 7 University/University college (< 4 years)            21

You will see that the specified edu_years column wil have been populated with data available in the other two columns.

Compile education from several sources

Lastly, because some of our participants are children, using their education is not possible as they are still completing lower-level education. In these cases it is nice to have a variable that actually spans educational information from participants and parents for kids, for easier reporting and analysis. For this we can use the information we have from parents to fill in gpas in pediatric data. The rule for this filler is Participant - Mother - Father, so that only those missing all of these will have NA in the data. The columns this is done in are called edu_Compiled, and there are three of them:

edu_Compiled_Coded4 - 4 category coded education from participant, mother, or father
edu_Compiled_Years - Year approximation of edu_Compiled_Coded4 edu_Compiled_Source - The source of the two above, one of “Participant”, “Mother”, or “Father”.

edu_compile(data = edu_data, 
            participant = edu4, 
            mother = edu4_factorise(mother), 
            father = father
)
#>                                        edu4                     edu9 edu_years
#> 1                                         3                        7        NA
#> 2                               High school                        7        12
#> 3                                         1                        8         9
#> 4                                      <NA>                     <NA>        NA
#> 5 University/University college (> 4 years) Primary school (6 years)        19
#> 6                                      <NA>                        5        19
#> 7 University/University college (< 4 years)                        9        NA
#>                                      mother                   father
#> 1                                         3                        7
#> 2                               High school                        7
#> 3                                         1                        8
#> 4                                      <NA>                        4
#> 5 University/University college (> 4 years) Primary school (6 years)
#> 6 University/University college (> 4 years)                        5
#> 7 University/University college (< 4 years)                       10
#>                          edu_compiled_code4 edu_compiled_years
#> 1 University/University college (< 4 years)                 16
#> 2                               High school                 12
#> 3                  Primary school (9 years)                  9
#> 4 University/University college (> 4 years)                 19
#> 5 University/University college (> 4 years)                 19
#> 6 University/University college (> 4 years)                 19
#> 7 University/University college (< 4 years)                 16
#>   edu_compiled_source
#> 1         Participant
#> 2         Participant
#> 3         Participant
#> 4              Father
#> 5         Participant
#> 6              Mother
#> 7         Participant