Getting Started • champsmortality

The goal of champsmortality is to provide functions for calculating factor-adjusted mortality fractions and rates at CHAMPS sites and catchments for various causes.

Installation

You can install the development version of champsmortality with the following:

install.packages("remotes") # one time only
remotes::install_github("ki-tools/champs-mortality")

Once installed, load the package:

library(champsmortality)

Data setup

The first time you use this package, you need to place the appropriate data files in a data directory that the package reads from when performing its calculations. The function create_dataset_directory() helps set this up.

Here we will create the dataset directory in a temporary location.

data_dir <- tempfile()
create_dataset_directory(data_dir)
#> ✔ The directory '/tmp/RtmpE72IRs/file317416d132b8' is ready for the appropriate
#>   data files to be placed in it. The following datasets should be placed in
#>   this directory:
#> 
#> 1. CHAMPS Analytics Dataset: This dataset is available as a downloadable file
#>    from LabKey and is continuously updated. It contains most of the CHAMPS
#>    variables that are needed for the analysis.
#> 2. Maternal Registry Forms table: This dataset is also available as a
#>    downloadable file from LabKey and contains information about maternal age
#>    and education.
#> 3. CHAMPS vocabulary: This dataset provides a lookup table for all CHAMPS
#>    codes, providing a corresponding 'name' and 'preferred name' for each. This
#>    file is accessible from the CHAMPS L2 dataset from dataverse.
#> 4. DSS: This dataset contains counts of cases from the demographic surveillance
#>    system (DSS) corresponding to each CHAMPS site and catchment area, only for
#>    DSS cases that are not in the CHAMPS data. These counts are broken down by
#>    age group, year, location of death, season of death, maternal education, sex
#>    of child, and verbal autopsy cause of death.
#> 5. Season definition: This dataset is a csv file containing rainy and dry
#>    season date ranges for each site, which will be used to classify the season
#>    in which each case occurs. A dataset with known season definitions,
#>    '/tmp/RtmpE72IRs/file317416d132b8/seasons.csv', has been provided. Please
#>    update that file if necessary.
#> 6. Catchment lookup: This dataset is a csv file containing mappings from
#>    catchment codes to catchment names, used to link the DSS data, which uses
#>    catchment names, to the CHAMPS analysis dataset, which uses catchment IDs. A
#>    dataset with known catchment lookups,
#>    '/tmp/RtmpE72IRs/file317416d132b8/catchment_lookup.csv', has been provided.
#>    Please update that file if necessary.
#> 7. Live births: This dataset is a csv file containing yearly live births by
#>    site and catchment from DSS. A dataset with known live birth statistics by
#>    site, catchment, and year,
#>    '/tmp/RtmpE72IRs/file317416d132b8/live_births.csv', has been provided.
#>    Please update that file if necessary.
#> 8. Live births: This dataset is a csv file containing yearly DHS all-cause
#>    mortality data by site and catchment from DSS. A dataset with known DHS
#>    statistics by site, catchment, year, and age,
#>    '/tmp/RtmpE72IRs/file317416d132b8/dhs.csv', has been provided. Please update
#>    that file if necessary.
#> 
#> → Once the files are in place, edit the file
#>   /tmp/RtmpE72IRs/file317416d132b8/config.yaml to provide the file names
#>   corresponding to each of these datasets.
#> 
#> The config.yaml template looks like this:
#>     ┌──────────────────────────────────────────┐
#>     │  champs_analytics_dataset: ''            │
#>     │  maternal_registry_dataset: ''           │
#>     │  champs_vocabulary_dataset: ''           │
#>     │  dss_dataset: ''                         │
#>     │  season_lookup: seasons.csv              │
#>     │  catchment_lookup: catchment_lookup.csv  │
#>     │  live_births_dataset: live_births.csv    │
#>     │  dhs_dataset: dhs.csv                    │
#>     └──────────────────────────────────────────┘
#> 
#> So for example, if the CHAMPS Analytics Dataset you placed in this directory is
#> named 'Analytics_Dataset_2021-09-01.xlsx', you would edit the corresponding
#> line in config.yaml as follows:
#>     ┌─────────────────────────────────────────────────────────────────┐
#>     │  champs_analytics_dataset: 'Analytics_Dataset_2021-09-01.xlsx'  │
#>     └─────────────────────────────────────────────────────────────────┘

As long as this directory remains on your computer, you do not need to run create_dataset_directory() again. For real use, we recommend choosing a persistent directory rather than a temporary one like the one used here.
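A minimal sketch of a persistent setup follows. The folder name under the home directory is an assumption chosen for illustration, and the guards (checking that the package is installed and that the directory does not already exist) are our own additions rather than anything the package requires.

```r
# Assumption: any stable path works; this folder name is only an example
persistent_dir <- file.path(path.expand("~"), "champs-mortality-data")

# Run the setup only once, and only if champsmortality is installed
if (requireNamespace("champsmortality", quietly = TRUE) &&
    !dir.exists(persistent_dir)) {
  champsmortality::create_dataset_directory(persistent_dir)
}
```

In later sessions the same path can be reused without calling the setup function again.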

create_dataset_directory() creates the directory and copies over some data files that come with the package:

list.files(data_dir)
#> [1] "catchment_lookup.csv" "config.yaml"          "dhs.csv"             
#> [4] "seasons.csv"

These files are described in greater detail in the next article.
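To check which of the file names referenced in config.yaml are actually present in the directory, a quick comparison in base R can help. This is only a sketch: missing_files() is a hypothetical helper, not part of champsmortality, and the scratch directory below stands in for data_dir, populated with empty files named as in the listing above.

```r
# Hypothetical helper: report which expected files are absent from a directory
missing_files <- function(dir, expected) {
  setdiff(expected, list.files(dir))
}

# Scratch directory standing in for data_dir (empty placeholder files)
demo_dir <- tempfile()
dir.create(demo_dir)
present <- c("catchment_lookup.csv", "config.yaml", "dhs.csv", "seasons.csv")
invisible(file.create(file.path(demo_dir, present)))

# Default file names taken from the config.yaml template
expected <- c("seasons.csv", "catchment_lookup.csv", "live_births.csv", "dhs.csv")

missing <- missing_files(demo_dir, expected)
print(missing)
#> [1] "live_births.csv"
```

Any file name reported here would need to be placed in the directory, or its entry in config.yaml updated, before running an analysis.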

The config.yaml file contains pointers to each dataset that is required for analysis:

cat(paste(readLines(file.path(data_dir, "config.yaml")), collapse = "\n"))
#> champs_analytics_dataset: ''
#> maternal_registry_dataset: ''
#> champs_vocabulary_dataset: ''
#> dss_dataset: ''
#> season_lookup: seasons.csv
#> catchment_lookup: catchment_lookup.csv
#> live_births_dataset: live_births.csv
#> dhs_dataset: dhs.csv

Each entry gives the file name of one required data file. Four entries (season_lookup, catchment_lookup, live_births_dataset, and dhs_dataset) are pre-filled with default file names, while the other four are blank because those datasets are not distributed with the package due to their sensitive nature. They can be obtained from CHAMPS, and more information about their contents and structure is provided in the next article.
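The config.yaml file can be edited by hand in any text editor. If you prefer to script the change, the following base R sketch shows one way to fill in a single entry. set_config_entry() is a hypothetical helper, not part of champsmortality, and it assumes the flat "key: value" layout shown above. The template is recreated in a scratch directory here so the example runs on its own.

```r
# Recreate the config.yaml template in a scratch directory (illustration only;
# in practice create_dataset_directory() writes this file)
demo_dir <- tempfile()
dir.create(demo_dir)
config_path <- file.path(demo_dir, "config.yaml")
writeLines(c(
  "champs_analytics_dataset: ''",
  "maternal_registry_dataset: ''",
  "champs_vocabulary_dataset: ''",
  "dss_dataset: ''",
  "season_lookup: seasons.csv",
  "catchment_lookup: catchment_lookup.csv",
  "live_births_dataset: live_births.csv",
  "dhs_dataset: dhs.csv"
), config_path)

# Hypothetical helper: set one key in a flat "key: value" YAML file
set_config_entry <- function(path, key, file_name) {
  lines <- readLines(path)
  idx <- which(startsWith(lines, paste0(key, ":")))
  if (length(idx) != 1) stop("Expected exactly one entry for key: ", key)
  lines[idx] <- sprintf("%s: '%s'", key, file_name)
  writeLines(lines, path)
  invisible(lines)
}

# File name taken from the example in the setup message
set_config_entry(config_path, "champs_analytics_dataset",
                 "Analytics_Dataset_2021-09-01.xlsx")
updated <- readLines(config_path)
```

The same call can be repeated for the other blank entries once the corresponding files are in the directory.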