var_name          dataset_id
APSNO2GRIDPTS     APSL1AD
APSNO2MEAN        APSL1AD
APSNO2MEDIAN      APSL1AD
APSNO2STD         APSL1AD
APSPM25GRIDPTS    APSL1AD
APSPM25MEAN       APSL1AD
APSPM25MEDIAN     APSL1AD
APSPM25STD        APSL1AD
APSPM25THRES      APSL1AD
This workflow goes through the steps needed to generate FAIR SALURBAL data.
Due to the heterogeneity in existing SALURBAL data and codebooks, the process of renovating each dataset will differ. The Renovation manual pages provide guidance for the FAIR renovation of existing SALURBAL data as well as instructions for new SALURBAL datasets. The steps are summarized in the list below.
var_name to each variable
The deliverables at each step are displayed in the tabs below. Within each step there are tabs that contain examples (table + downloadable csv) of the deliverables for three different datasets.
var_name.csv is a table summarizing var_name. It should contain two columns:
var_name
dataset_id
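As a minimal sketch of this deliverable, the two-column table can be written with Python's standard csv module. The variable names and dataset_id are the APS values shown in the example table; the helper name build_var_name_csv and the uniqueness check are assumptions made for illustration, not part of the SALURBAL workflow itself.

```python
import csv
import io

# APS variable names and dataset_id taken from the example table.
APS_VAR_NAMES = [
    "APSNO2GRIDPTS", "APSNO2MEAN", "APSNO2MEDIAN", "APSNO2STD",
    "APSPM25GRIDPTS", "APSPM25MEAN", "APSPM25MEDIAN", "APSPM25STD",
    "APSPM25THRES",
]
APS_DATASET_ID = "APSL1AD"


def build_var_name_csv(var_names, dataset_id):
    """Return var_name.csv content as text (hypothetical helper)."""
    # Assumption: each var_name should appear only once per dataset.
    if len(set(var_names)) != len(var_names):
        raise ValueError("duplicate var_name found")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["var_name", "dataset_id"])
    for name in var_names:
        writer.writerow([name, dataset_id])
    return buffer.getvalue()


aps_var_name_csv = build_var_name_csv(APS_VAR_NAMES, APS_DATASET_ID)
```

Writing the returned text to a file named var_name.csv would give a table with the same shape as the APS example.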
strata.csv is a table that contains all possible strata_id values for each variable. Strata information is organized 'long', meaning that a stratified variable should have multiple rows. It should contain the following columns:
var_name
strata_1_name: name of the first stratum. Should have no spaces and no underscores ('_'); all text should be in Pascal case.
strata_1_value: value of the first stratum. Should have no spaces and no underscores ('_'); all text should be in Pascal case.
strata_2_name: name of the second stratum. Should have no spaces and no underscores ('_'); all text should be in Pascal case.
strata_2_value: value of the second stratum. Should have no spaces and no underscores ('_'); all text should be in Pascal case.
linkage.csv is a table that describes the linkage for each of the codebook fields. Starting with this template (📥 linkage.csv), for each codebook field (row) you should write a value of '1' in a column cell if any variable falls under that linkage type.
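The strata.csv naming rules described above (no spaces, no underscores, Pascal case) lend themselves to a mechanical check. The sketch below is one possible approach, not an official validator: the regular expression is only an approximation of Pascal case (a leading capital followed by letters or digits), empty cells are assumed to be allowed for unstratified variables, and the variable name and strata values in the sample rows are hypothetical.

```python
import re

STRATA_COLUMNS = [
    "strata_1_name", "strata_1_value",
    "strata_2_name", "strata_2_value",
]

# Approximation of Pascal case: starts with a capital letter and
# contains only letters and digits (so no spaces, no underscores).
PASCAL_LIKE = re.compile(r"^[A-Z][A-Za-z0-9]*$")


def check_strata_row(row):
    """Return the strata columns in `row` that break the naming rules."""
    problems = []
    for column in STRATA_COLUMNS:
        value = row.get(column, "")
        # Assumption: an empty cell means the variable has no such stratum.
        if value and not PASCAL_LIKE.match(value):
            problems.append(column)
    return problems


# Hypothetical rows, for illustration only.
good_row = {
    "var_name": "EXAMPLEVAR",
    "strata_1_name": "Sex", "strata_1_value": "Female",
    "strata_2_name": "AgeGroup", "strata_2_value": "Age20To34",
}
bad_row = {
    "var_name": "EXAMPLEVAR",
    "strata_1_name": "age group", "strata_1_value": "Female",
    "strata_2_name": "Sex", "strata_2_value": "under_20",
}

good_problems = check_strata_row(good_row)
bad_problems = check_strata_row(bad_row)
```

Here good_row passes with no problems, while bad_row is flagged for strata_1_name (contains a space) and strata_2_value (contains an underscore).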
All codebook fields in the APS dataset are linkable by variable only, so for every codebook field we check (fill the cell with '1') only the by_var column.
Most of the codebook fields in the CNS dataset are linkable by variable only, except for:
source: varies by var_name + iso2 for some variables but not for others, so this row has both by_var and by_var_iso2 filled out.
Most of the codebook fields in the SVY dataset are linkable by variable only, except for:
var_def: varies by var_name + strata for some variables but not for others, so this row has both by_var and by_var_strata filled out.
source: varies by var_name + iso2 for some variables but not for others, so this row has both by_var and by_var_iso2 filled out.
For each dataset, review the metadata linkage evaluation. For each unique linkage type that is present in your dataset you will need to prepare the associated codebook. The thought process and deliverables for our three example datasets can be seen below.
Based on the APS codebook evaluation, we saw that all the metadata are categorized as simple; therefore, our step 4 (codebooks) deliverable for this dataset contains one file: codebook_simple.csv
Based on the CNS codebook evaluation we saw:
Therefore we need to prepare two codebooks for this dataset (see below).
Based on the SVY codebook evaluation we saw:
Therefore we need to prepare three codebooks for this dataset (see below).
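The reasoning in the three examples above (one codebook per unique linkage type that has at least one '1' in linkage.csv) can be sketched in Python. This is an illustration under assumptions: the linkage column names by_var, by_var_iso2 and by_var_strata come from the text, but the name of the first column (field) and the exact layout of the template are guesses, and the sample tables contain only the codebook fields named in the examples.

```python
import csv
import io

# Linkage-type columns mentioned in the text.
LINKAGE_COLUMNS = ["by_var", "by_var_iso2", "by_var_strata"]


def required_linkages(linkage_csv_text):
    """List the linkage types that have at least one cell set to '1'."""
    reader = csv.DictReader(io.StringIO(linkage_csv_text))
    used = set()
    for row in reader:
        for column in LINKAGE_COLUMNS:
            if (row.get(column) or "").strip() == "1":
                used.add(column)
    # Keep a stable order so results are easy to compare.
    return [column for column in LINKAGE_COLUMNS if column in used]


# Assumed layout; 'field' is a hypothetical header for the codebook field.
aps_linkage = """field,by_var,by_var_iso2,by_var_strata
var_def,1,,
source,1,,
"""
cns_linkage = """field,by_var,by_var_iso2,by_var_strata
var_def,1,,
source,1,1,
"""
svy_linkage = """field,by_var,by_var_iso2,by_var_strata
var_def,1,,1
source,1,1,
"""

aps_needed = required_linkages(aps_linkage)
cns_needed = required_linkages(cns_linkage)
svy_needed = required_linkages(svy_linkage)
```

With these sample tables, the number of linkage types found matches the number of codebooks stated for each dataset: one for APS, two for CNS and three for SVY.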
For each dataset you should have a data.csv as a deliverable:
aps_data.csv
cns_data.csv
svy_data.csv
For the three examples provided, here are the final deliverable files.