Step 4: Renovate codebooks

Now that we have a profile of the metadata linkage for a dataset, we can start preparing the individual linkage-specific codebooks.

Things you will need

Step 4 template: 📥 codebook_template.csv
Codebook field definitions

Description

After we have evaluated the metadata linkage for a dataset, we will know which codebook and codebook variations to prepare. For each dataset we could potentially have up to four:

codebook_simple.csv (Very common) will link to the data via a single identifier, var_name; it will contain all the metadata fields that were categorized as ‘simple’.
codebook_by_country.csv (Very common) will link to the data via var_name and iso2; it will contain all the metadata fields that were categorized as ‘by_country’.
codebook_by_year.csv (Uncommon?) will link to the data via var_name and year; it will contain all the metadata fields that were categorized as ‘by_year’.
codebook_by_strata.csv (Rare) will link to the data via var_name and strata_id; it will contain all the metadata fields that were categorized as ‘by_strata’.
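The linkage described above can be sketched as a keyed lookup: each codebook is joined onto the data using its own set of key columns. This is a minimal illustration using only the Python standard library, not the project's actual tooling. The miniature data, the variable VAR_A, and the helper names read_rows and attach_metadata are all hypothetical.

```python
import csv
import io

# Hypothetical miniature data and codebooks; all values are made up.
DATA_CSV = """var_name,iso2,year,value
VAR_A,BR,2010,1.5
VAR_A,MX,2010,2.5
"""

CODEBOOK_SIMPLE_CSV = """var_name,var_label
VAR_A,Example variable A
"""

CODEBOOK_BY_COUNTRY_CSV = """var_name,iso2,source
VAR_A,BR,Example source for Brazil
VAR_A,MX,Example source for Mexico
"""


def read_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_metadata(data_rows, codebook_rows, keys):
    """Left-join codebook fields onto data rows using the linkage keys."""
    lookup = {}
    for row in codebook_rows:
        key = tuple(row[k] for k in keys)
        if key in lookup:
            # A codebook must have exactly one row per linkage key.
            raise ValueError(f"duplicate linkage key in codebook: {key}")
        lookup[key] = {f: v for f, v in row.items() if f not in keys}
    joined = []
    for row in data_rows:
        extra = lookup.get(tuple(row[k] for k in keys), {})
        joined.append({**row, **extra})
    return joined


data = read_rows(DATA_CSV)
# 'simple' metadata links via var_name only.
data = attach_metadata(data, read_rows(CODEBOOK_SIMPLE_CSV), ["var_name"])
# 'by_country' metadata links via var_name and iso2.
data = attach_metadata(data, read_rows(CODEBOOK_BY_COUNTRY_CSV), ["var_name", "iso2"])

for row in data:
    print(row["var_name"], row["iso2"], row["var_label"], "|", row["source"])
```

The same helper would cover the year and strata linkages by passing a different key list, for example ["var_name", "year"].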

Deliverable

For each dataset, review the metadata linkage evaluation. For each unique linkage that is present in your dataset, you will need to prepare the associated codebook. The thought process and deliverables for our three example datasets can be seen below. Note that these codebooks are from an older template; please use the template provided below.
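As a rough sketch of this thought process, the snippet below turns a linkage evaluation (one category per metadata field) into the list of codebook files to prepare and the columns each one would hold. The evaluation shown is a hypothetical example, the helper plan_codebooks is illustrative only, and the 'by_strata' category name is an assumption based on the strata_id linkage.

```python
from collections import defaultdict

# Linkage keys for each category, following the description in this step.
# The 'by_strata' category name is an assumption for illustration.
LINKAGE_KEYS = {
    "simple": ["var_name"],
    "by_country": ["var_name", "iso2"],
    "by_year": ["var_name", "year"],
    "by_strata": ["var_name", "strata_id"],
}


def plan_codebooks(evaluation):
    """Map {field: category} to {codebook file name: list of columns}."""
    fields_by_category = defaultdict(list)
    for field, category in evaluation.items():
        if category not in LINKAGE_KEYS:
            raise ValueError(f"unknown linkage category: {category!r}")
        fields_by_category[category].append(field)
    # One codebook per category that actually occurs in the evaluation.
    return {
        f"codebook_{category}.csv": LINKAGE_KEYS[category] + fields
        for category, fields in fields_by_category.items()
    }


# Hypothetical evaluation: most fields are simple, two vary by country.
evaluation = {
    "domain": "simple",
    "var_label": "simple",
    "units": "simple",
    "source": "by_country",
    "public": "by_country",
}

plan = plan_codebooks(evaluation)
for file_name, columns in plan.items():
    print(file_name, columns)
```

A dataset whose fields are all categorized as simple would yield a single entry, codebook_simple.csv.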

Important

Step 4 template: 📥 codebook_template.csv

Based on the APS codebook evaluation, we saw that all the metadata are categorized as simple; therefore our step-4-codebooks deliverable for this dataset contains one file: codebook_simple.csv

codebook_simple.csv

Columns: var_name, domain, subdomain, var_label, var_def, value_type, units, coding, interpretation, strata_description, source, dataset_notes, limitations, acknowledgment, file, working_group, longitudinal, public, censor, fair

Rows (empty cells not shown):

APSPM25GRIDPTS | Physical and Natural Enviro... | Air Pollution | Number of PM2.5 grid point ... | Number of grid point centro... | 999998 = No data available | A higher number indicates m... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE
APSPM25MEAN | Physical and Natural Enviro... | Air Pollution | Average PM2.5 concentration | Arithmetic mean of PM2.5 me... | micromilligrams per cubic m... | 999998 = No data available | A higher value indicates hi... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE
APSPM25STD | Physical and Natural Enviro... | Air Pollution | Standard Deviation of PM2.5 | Standard deviation of PM2.5... | micromilligrams per cubic m... | 999998 = No data available | A higher value indictates m... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE
APSPM25MEDIAN | Physical and Natural Enviro... | Air Pollution | Median PM2.5 concentration | Median of PM2.5 measurement... | micromilligrams per cubic m... | 999998 = No data available | A higher value indicates hi... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE
APSPM25THRES | Physical and Natural Enviro... | Air Pollution | Average PM2.5 is above WHO ... | Indicator if the average PM... | 0=10 or less 1=Greater tha... | A value of 1 indicates the ... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE
APSNO2GRIDPTS | Physical and Natural Enviro... | Air Pollution | Number of NO2 grid point ce... | Number of grid point centro... | 999998 = No data available | A higher number indicates m... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE
APSNO2MEAN | Physical and Natural Enviro... | Air Pollution | Area-weighted average NO2 | Weighted arithmetic mean of... | parts per billion (ppb) | 999998 = No data available | A higher value indicates hi... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE
APSNO2STD | Physical and Natural Enviro... | Air Pollution | Area-weighted Standard Devi... | Weighted standard deviation... | parts per billion (ppb) | 999998 = No data available | A higher value indictates m... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE
APSNO2MEDIAN | Physical and Natural Enviro... | Air Pollution | Area-weighted median NO2 | Weighted median of NO2 meas... | parts per billion (ppb) | 999998 = No data available | A higher value indicates hi... | Atmospheric Composition Ana... | Citation: Hammer, M. S.; et... | raw-data/physical environme... | Climate - UHC | TRUE | FALSE

Based on the CNS codebook evaluation we saw:

17 simple fields
2 by_country fields (source, public)

Therefore we need to prepare two codebooks for this dataset (see below).

codebook_simple.csv
codebook_by_iso2.csv

var_name

domain

subdomain

var_label

var_def

value_type

units

coding

interpretation

strata_description

dataset_notes

limitations

acknowledgment

file

working_group

longitudinal

censor

fair

CNSCROWD25BR

Social, Economic, and Servi...

Housing

Overcrowding: Proportion of...