This page provides an overview for the get_troopdata()
function, highlighting some of its potential uses.
First things first—let’s load the troopdata package
The troopdata package provides multiple functions to generate
customizable datasets containing information on US military deployments
and accompanying data. The get_troopdata()
function
represents the core of this package, providing customized data on US
overseas troop deployments, specifically.
Country-year data
The first function of this package is the
get_troopdata()
function. At its most basic this function
returns a data frame of country-year troop deployment values for the
selected time period, using the startdate
and
enddate
parameters.
#> # A tibble: 6 × 6
#> iso3c year ccode countryname region troops_ad
#> <chr> <dbl> <dbl> <chr> <chr> <dbl>
#> 1 USA 1990 2 United States North America 1138627
#> 2 USA 1991 2 United States North America 1216348
#> 3 USA 1992 2 United States North America 1171208
#> 4 USA 1993 2 United States North America 1127242
#> 5 USA 1994 2 United States North America 1073309
#> 6 USA 1995 2 United States North America 1041320
For users who want more refined data, the there are a number of arguments that allow the user to further tailor the output to their needs.
The host
argument allows users to specify the set of
host countries for which they would like data returned. This can be a
vector of numerical values equal to a Correlates of War (COW) Project
country code, a vector of character values equal to an ISO3C country
code, or a vector of character values corresponding to full country
names. Note that when supplying a vector of values they must be
consistent and correspond to a single type of identifier at a time
(i.e. they must all be numeric COW codes, ISO3C character codes, country
names, or region names).
For example, you can use a numeric vector of COW country codes like this:
# Let's make the host selection more specific
hostlist <- c(200, 220)
example <- get_troopdata(host = hostlist, startyear = 1990, endyear = 2020)
head(example)
#> # A tibble: 6 × 6
#> ccode year iso3c countryname region troops_ad
#> <dbl> <dbl> <chr> <chr> <chr> <dbl>
#> 1 200 1990 GBR United Kingdom Europe & Central Asia 25111
#> 2 200 1991 GBR United Kingdom Europe & Central Asia 23442
#> 3 200 1992 GBR United Kingdom Europe & Central Asia 20048
#> 4 200 1993 GBR United Kingdom Europe & Central Asia 16100
#> 5 200 1994 GBR United Kingdom Europe & Central Asia 13781
#> 6 200 1995 GBR United Kingdom Europe & Central Asia 12131
Or you can use a character vector of ISO3C codes.
hostlist.char <- c("CAN", "GBR")
example.char <- get_troopdata(host = hostlist.char, startyear = 1970, endyear = 2020)
head(example.char)
#> # A tibble: 6 × 6
#> iso3c year ccode countryname region troops_ad
#> <chr> <dbl> <dbl> <chr> <chr> <dbl>
#> 1 CAN 1970 20 Canada North America 2643
#> 2 CAN 1971 20 Canada North America 1835
#> 3 CAN 1972 20 Canada North America 1742
#> 4 CAN 1973 20 Canada North America 1362
#> 5 CAN 1974 20 Canada North America 1690
#> 6 CAN 1975 20 Canada North America 2607
Similarly, we can search for full country names:
hostlist.names <- c("Canada", "United Kingdom")
example.names <- get_troopdata(host = hostlist.names, startyear = 1970, endyear = 2020)
head(example.names)
#> # A tibble: 6 × 6
#> countryname year ccode iso3c region troops_ad
#> <chr> <dbl> <dbl> <chr> <chr> <dbl>
#> 1 Canada 1970 20 CAN North America 2643
#> 2 Canada 1971 20 CAN North America 1835
#> 3 Canada 1972 20 CAN North America 1742
#> 4 Canada 1973 20 CAN North America 1362
#> 5 Canada 1974 20 CAN North America 1690
#> 6 Canada 1975 20 CAN North America 2607
When searching for country names, the function will do its best to identify the correct country based on the character string that’s included. This can include cases where fragments of country names are included and the function will try to return the correct country.
example.frag <- get_troopdata(host = "South Ko", startyear = 1970, endyear = 2020)
head(example.frag)
#> # A tibble: 6 × 6
#> countryname year ccode iso3c region troops_ad
#> <chr> <dbl> <dbl> <chr> <chr> <dbl>
#> 1 South Korea 1970 732 KOR East Asia & Pacific 52283
#> 2 South Korea 1971 732 KOR East Asia & Pacific 40740
#> 3 South Korea 1972 732 KOR East Asia & Pacific 41600
#> 4 South Korea 1973 732 KOR East Asia & Pacific 41864
#> 5 South Korea 1974 732 KOR East Asia & Pacific 40878
#> 6 South Korea 1975 732 KOR East Asia & Pacific 41186
Finally, we can also search by region. Instead of inserting a country name or code into the host argument you can simply include character strings that represent regions. In these cases the function returns the aggregate sum of all deployments within that region for the specified time period
region.list <- c("Europe", "Asia")
example.region <- get_troopdata(host = region.list, startyear = 1970, endyear = 2020)
head(example.region)
#> # A tibble: 6 × 3
#> region year troops_ad
#> <chr> <dbl> <dbl>
#> 1 East Asia & Pacific 1970 410878
#> 2 East Asia & Pacific 1971 223025
#> 3 East Asia & Pacific 1972 69242
#> 4 East Asia & Pacific 1973 56240
#> 5 East Asia & Pacific 1974 54946
#> 6 East Asia & Pacific 1975 55739
Disaggregated Data
By default the get_troopdata()
function returns the
aggregate sum of active duty military personnel. But the original DMDC
reports often include disaggregated figures, with separate counts for
each branch of the military. The branch
argument allows
users to specify whether they would like to receive the aggregate sum of
all branches or the disaggregated figures for each branch. This argument
can take on three values: TRUE
, FALSE
, or a
vector of branch names.
# Let's get the disaggregated data for the US deployments to Canada and the UK
hostlist <- c("Canada", "United Kingdom")
example.branch <- get_troopdata(host = hostlist, branch = TRUE, startyear = 1970, endyear = 2020)
#> Warning: Branch data only includes active duty by default. This preserves
#> continuity across time periods as guard and reserve data are not reported prior
#> to 2000s.
head(example.branch)
#> # A tibble: 6 × 12
#> countryname year ccode iso3c region troops_ad army_ad navy_ad air_force_ad
#> <chr> <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Canada 1970 20 CAN North Am… 2643 12 413 2218
#> 2 Canada 1971 20 CAN North Am… 1835 12 433 1381
#> 3 Canada 1972 20 CAN North Am… 1742 14 410 1315
#> 4 Canada 1973 20 CAN North Am… 1362 12 390 951
#> 5 Canada 1974 20 CAN North Am… 1690 11 703 969
#> 6 Canada 1975 20 CAN North Am… 2607 11 1839 757
#> # ℹ 3 more variables: marine_corps_ad <dbl>, coast_guard_ad <dbl>,
#> # space_force_ad <dbl>
In each case the _ad
suffix on the variable name
indicates “Active Duty” numbers for the given branch.
Note that the total does not necessarily equal the sum of the individual branches. The function returns the maximum annual value for each branch. In cases where there are quarterly values reported, the sum total may come from one quarter and the individual branch values may come from another quarter.
We can also include disaggregated data national guard and reserve personnel, as well as DoD civilians. These numbers are generally only reported for more recent years, and are not available for all countries and time periods. Later updates will include more observations for earlier time periods where they are available.
hostlist <- c("Canada", "United Kingdom")
example.branch <- get_troopdata(host = hostlist, branch = TRUE, startyear = 1970, endyear = 2020, guard_reserve = TRUE, civilians = TRUE)
#> Warning: Branch data only includes active duty by default. This preserves
#> continuity across time periods as guard and reserve data are not reported prior
#> to 2000s.
#> Warning: Guard and Reserve data only available for 2006 forward.
head(example.branch)
#> # A tibble: 6 × 26
#> countryname year ccode iso3c region troops_ad army_ad navy_ad air_force_ad
#> <chr> <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Canada 1970 20 CAN North Am… 2643 12 413 2218
#> 2 Canada 1971 20 CAN North Am… 1835 12 433 1381
#> 3 Canada 1972 20 CAN North Am… 1742 14 410 1315
#> 4 Canada 1973 20 CAN North Am… 1362 12 390 951
#> 5 Canada 1974 20 CAN North Am… 1690 11 703 969
#> 6 Canada 1975 20 CAN North Am… 2607 11 1839 757
#> # ℹ 17 more variables: marine_corps_ad <dbl>, coast_guard_ad <dbl>,
#> # space_force_ad <dbl>, army_national_guard <dbl>, air_national_guard <dbl>,
#> # army_reserve <dbl>, navy_reserve <dbl>, marine_corps_reserve <dbl>,
#> # air_force_reserve <dbl>, coast_guard_reserve <dbl>,
#> # total_selected_reserve <dbl>, army_civilian <dbl>, navy_civilian <dbl>,
#> # marine_corps_civilian <dbl>, air_force_civilian <dbl>, dod_civilian <dbl>,
#> # total_civilian <dbl>
Time Periods
The most recent update also allows users to specify more fine grained
temporal coverage. DMDC reports have historically been released on an
annual basis, but in more recent years they have been released twice
annually or even quarterly, and the get_troopdata()
function allows users to specify whether they would like to receive the
quarterly data or the annual data. The quarters
argument
allows users to specify whether they would like to receive the quarterly
data or the annual data. This argument can take on two values:
TRUE
or FALSE
with the default being
FALSE
.
If the user opts to return quarterly data, the function will return
the month and quarter columns in addition to the year. Note that not
every quarter corresponds to a quarterly report for every country. In
some cases the quarterly value may be a 0 rather than NA
.
Additionally, the Army has not reported branch data for several recent
quarters between 2022 and 2023 due to internal personnel management
changes. Accordingly the aggregate totals for these quarters may be
lower than expected or not available.
Here we use full country names. See! Neat!
# Let's get the quarterly data for the US deployments to Canada and the UK
hostlist <- c("Canada", "United Kingdom")
example.quarters <- get_troopdata(host = hostlist, branch = TRUE, startyear = 2015, endyear = 2022, quarters = TRUE)
#> Warning: Branch data only includes active duty by default. This preserves
#> continuity across time periods as guard and reserve data are not reported prior
#> to 2000s.
#> Warning: Some service branches do not report data for all quarters. See the
#> following note from December, 2022, June 2023, and March 2023 DMDC reports:
#> 'The Army is converting its Integrated Personnel and Pay System (IPPS-A) and so
#> the Army did not provide military personnel data for end-of-June 2023.'
head(example.quarters)
#> # A tibble: 6 × 14
#> countryname year month quarter ccode iso3c region troops_ad army_ad navy_ad
#> <chr> <dbl> <chr> <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 Canada 2015 Decemb… 4 20 CAN North… 91 3 0
#> 2 Canada 2015 June 2 20 CAN North… 153 8 0
#> 3 Canada 2015 March 1 20 CAN North… 144 8 0
#> 4 Canada 2015 Septem… 3 20 CAN North… 146 8 0
#> 5 Canada 2016 Decemb… 4 20 CAN North… 141 6 0
#> 6 Canada 2016 June 2 20 CAN North… 146 6 0
#> # ℹ 4 more variables: air_force_ad <dbl>, marine_corps_ad <dbl>,
#> # coast_guard_ad <dbl>, space_force_ad <dbl>
Reports
Finally, users may want to view the original DMDC reports that the
data is drawn from. The reports
argument allows users to
specify whether they would like to receive the original DMDC reports
that the data is drawn from. This argument can take on two values:
TRUE
or FALSE
with the default being
FALSE
.
Users can specify the host
, startyear
, and
endyear
arguments as they would for the main function. The
function will return data frame single data frame containing all of the
original columns found in the DMDC reports upon which the data are
based. The formatting and column names will be roughly consistent with
the original reports, but the data will be filtered to only include the
specified host countries and time period.
The source
column in the data frame provides the month
and the year that the data was drawn from. This allows the user to more
easily track down the original DMDC report that the data was drawn from.
The current and archived reports can be found here: DMDC
Reports
Also note that if reports
is set to TRUE
then the user must also set the quarters
argument to
TRUE
.