creating_exposures.Rmd
Synthetic data called “records” is included in the package. To build an exposure frame, the data must have “key”, “start”, and “end” columns, and the values in the key column must be unique.
key | start | end | issue_age | gender |
---|---|---|---|---|
B10251C8 | 2010-04-10 | 2019-04-04 | 35 | M |
D68554D5 | 2005-01-01 | 2019-04-04 | 30 | F |
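The input requirements above can be checked before building exposures. The following is a minimal sketch in base R; `check_records()` and `records_demo` are hypothetical names introduced here for illustration and are not part of the package.

```r
# Hypothetical helper (not part of the package): checks the documented
# requirements of key, start, and end columns with unique keys.
check_records <- function(records) {
  required <- c("key", "start", "end")
  missing_cols <- setdiff(required, names(records))
  if (length(missing_cols) > 0) {
    stop("missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (anyDuplicated(records$key) > 0) {
    stop("values in the key column must be unique")
  }
  invisible(TRUE)
}

# Hand-built copy of the two records shown in the table above.
records_demo <- data.frame(
  key       = c("B10251C8", "D68554D5"),
  start     = as.Date(c("2010-04-10", "2005-01-01")),
  end       = as.Date(c("2019-04-04", "2019-04-04")),
  issue_age = c(35, 30),
  gender    = c("M", "F")
)

check_records(records_demo)
```

The check stops with an error on the first violated requirement, so it can sit directly in front of a call to addExposures().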
The addExposures() function creates one row for each exposure interval between a record's start and end dates and calculates the exposure for that interval. By default, one exposure row is created for each policy year.
key | duration | start_int | end_int | exposure |
---|---|---|---|---|
B10251C8 | 1 | 2010-04-10 | 2011-04-09 | 0.9993 |
B10251C8 | 2 | 2011-04-10 | 2012-04-09 | 1.002 |
B10251C8 | 3 | 2012-04-10 | 2013-04-09 | 0.9993 |
B10251C8 | 4 | 2013-04-10 | 2014-04-09 | 0.9993 |
B10251C8 | 5 | 2014-04-10 | 2015-04-09 | 0.9993 |
B10251C8 | 6 | 2015-04-10 | 2016-04-09 | 1.002 |
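One exposure unit is 365.25 days, so the exposure for a full policy year is either slightly below 1 (365/365.25 ≈ 0.9993) or slightly above 1 (366/365.25 ≈ 1.002, when the interval contains February 29). This convention is subject to change as we work with experienced actuaries to come up with the best possible implementation.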
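The exposure values in the policy year table can be reproduced by hand. The sketch below is base R only and is not the package's implementation; it assumes that exposure is the inclusive day count of the interval divided by 365.25, which agrees with the six rows shown for record B10251C8.

```r
# Sketch only: rebuilds the first six policy years of record B10251C8.
start <- as.Date("2010-04-10")

# Seven anniversaries one year apart delimit six full policy years.
anniv <- seq(start, by = "year", length.out = 7)

py <- data.frame(
  duration  = 1:6,
  start_int = anniv[1:6],
  end_int   = anniv[2:7] - 1  # day before the next anniversary
)

# Assumed rule: exposure = inclusive day count / 365.25
py$exposure <- (as.numeric(py$end_int - py$start_int) + 1) / 365.25

print(data.frame(py[c("duration", "start_int", "end_int")],
                 exposure = signif(py$exposure, 4)))
```

Durations 2 and 6 contain February 29 (2012 and 2016), which is why those two rows come out above 1.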
###addExposures() arguments
####type
There are several options available for exposure calculations. For example, we can create exposure rows by policy month.
key | duration | policy_month | start_int | end_int | exposure |
---|---|---|---|---|---|
B10251C8 | 1 | 1 | 2010-04-10 | 2010-05-09 | 0.08214 |
B10251C8 | 1 | 2 | 2010-05-10 | 2010-06-09 | 0.08487 |
B10251C8 | 1 | 3 | 2010-06-10 | 2010-07-09 | 0.08214 |
B10251C8 | 1 | 4 | 2010-07-10 | 2010-08-09 | 0.08487 |
B10251C8 | 1 | 5 | 2010-08-10 | 2010-09-09 | 0.08487 |
B10251C8 | 1 | 6 | 2010-09-10 | 2010-10-09 | 0.08214 |
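The policy month rows follow the same pattern with monthly steps. This is a base R sketch under the same assumption (inclusive days divided by 365.25), not the package's code. It relies on `seq()` with `by = "month"`, which is straightforward for a start day of 10; start days after the 28th would need extra care because shorter months roll over.

```r
# Sketch only: first six policy months of record B10251C8.
start <- as.Date("2010-04-10")

# Seven monthly anniversaries delimit six policy months.
month_starts <- seq(start, by = "month", length.out = 7)

pm <- data.frame(
  duration     = 1L,
  policy_month = 1:6,
  start_int    = month_starts[1:6],
  end_int      = month_starts[2:7] - 1  # day before the next monthly anniversary
)

# Assumed rule: exposure = inclusive day count / 365.25
pm$exposure <- (as.numeric(pm$end_int - pm$start_int) + 1) / 365.25

print(signif(pm$exposure, 4))
```

A 30-day policy month gives 30/365.25 and a 31-day policy month gives 31/365.25, which accounts for the two values that alternate in the table.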
The policy year and policy month options only support policy anniversary studies, because their exposure intervals may cross calendar years. Other options create exposure rows that never cross a calendar year or calendar month boundary, which allows for calendar year or calendar month studies.
Policy year with calendar year:
key | duration | start_int | end_int | exposure |
---|---|---|---|---|
B10251C8 | 1 | 2010-04-10 | 2010-12-31 | 0.7283 |
B10251C8 | 1 | 2011-01-01 | 2011-04-09 | 0.271 |
B10251C8 | 2 | 2011-04-10 | 2011-12-31 | 0.7283 |
B10251C8 | 2 | 2012-01-01 | 2012-04-09 | 0.2738 |
B10251C8 | 3 | 2012-04-10 | 2012-12-31 | 0.7283 |
B10251C8 | 3 | 2013-01-01 | 2013-04-09 | 0.271 |
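Splitting a policy year at the calendar year boundary can be sketched as follows. `split_at_year_end()` is a hypothetical helper written for this illustration, not a package function, and it assumes the interval crosses at most one December 31, which holds for a single policy year.

```r
# Hypothetical helper: split one interval at December 31 of its start year.
split_at_year_end <- function(start_int, end_int) {
  year_end <- as.Date(paste0(format(start_int, "%Y"), "-12-31"))
  if (end_int <= year_end) {
    # Interval already lies within one calendar year.
    return(data.frame(start_int = start_int, end_int = end_int))
  }
  data.frame(
    start_int = c(start_int, year_end + 1),
    end_int   = c(year_end, end_int)
  )
}

# Policy year 1 of record B10251C8.
pieces <- split_at_year_end(as.Date("2010-04-10"), as.Date("2011-04-09"))

# Assumed rule: exposure = inclusive day count / 365.25
pieces$exposure <- (as.numeric(pieces$end_int - pieces$start_int) + 1) / 365.25
print(pieces)
```

The two pieces cover 266 and 99 days, so their exposures add up to the 365/365.25 of the unsplit policy year.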
Policy year with calendar month:
key | duration | start_int | end_int | exposure |
---|---|---|---|---|
B10251C8 | 1 | 2010-04-10 | 2010-04-30 | 0.05749 |
B10251C8 | 1 | 2010-05-01 | 2010-05-31 | 0.08487 |
B10251C8 | 1 | 2010-06-01 | 2010-06-30 | 0.08214 |
B10251C8 | 1 | 2010-07-01 | 2010-07-31 | 0.08487 |
B10251C8 | 1 | 2010-08-01 | 2010-08-31 | 0.08487 |
B10251C8 | 1 | 2010-09-01 | 2010-09-30 | 0.08214 |
B10251C8 | 1 | 2010-10-01 | 2010-10-31 | 0.08487 |
B10251C8 | 1 | 2010-11-01 | 2010-11-30 | 0.08214 |
B10251C8 | 1 | 2010-12-01 | 2010-12-31 | 0.08487 |
B10251C8 | 1 | 2011-01-01 | 2011-01-31 | 0.08487 |
B10251C8 | 1 | 2011-02-01 | 2011-02-28 | 0.07666 |
B10251C8 | 1 | 2011-03-01 | 2011-03-31 | 0.08487 |
B10251C8 | 1 | 2011-04-01 | 2011-04-09 | 0.02464 |
B10251C8 | 2 | 2011-04-10 | 2011-04-30 | 0.05749 |
B10251C8 | 2 | 2011-05-01 | 2011-05-31 | 0.08487 |
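The calendar month split works the same way with one cut per month. In this base R sketch, `split_at_month_ends()` is a hypothetical helper and the exposure rule (inclusive days divided by 365.25) is an assumption that agrees with the table above.

```r
# Hypothetical helper: split one interval at every calendar month boundary.
split_at_month_ends <- function(start_int, end_int) {
  # First day of every calendar month touched by the interval.
  first_of_month <- as.Date(format(start_int, "%Y-%m-01"))
  month_firsts <- seq(first_of_month, end_int, by = "month")

  starts <- month_firsts
  starts[1] <- start_int                    # first piece starts mid-month
  ends <- c(month_firsts[-1] - 1, end_int)  # last piece ends at interval end
  data.frame(start_int = starts, end_int = ends)
}

# Policy year 1 of record B10251C8.
pycm <- split_at_month_ends(as.Date("2010-04-10"), as.Date("2011-04-09"))
pycm$exposure <- (as.numeric(pycm$end_int - pycm$start_int) + 1) / 365.25
print(pycm)
```

A policy year that starts mid-month touches 13 calendar months, so the first policy year produces 13 rows, the first and last of which are partial months.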
Policy month with calendar year:
key | duration | policy_month | start_int | end_int | exposure |
---|---|---|---|---|---|
B10251C8 | 1 | 1 | 2010-04-10 | 2010-05-09 | 0.08214 |
B10251C8 | 1 | 2 | 2010-05-10 | 2010-06-09 | 0.08487 |
B10251C8 | 1 | 3 | 2010-06-10 | 2010-07-09 | 0.08214 |
B10251C8 | 1 | 4 | 2010-07-10 | 2010-08-09 | 0.08487 |
B10251C8 | 1 | 5 | 2010-08-10 | 2010-09-09 | 0.08487 |
B10251C8 | 1 | 6 | 2010-09-10 | 2010-10-09 | 0.08214 |
B10251C8 | 1 | 7 | 2010-10-10 | 2010-11-09 | 0.08487 |
B10251C8 | 1 | 8 | 2010-11-10 | 2010-12-09 | 0.08214 |
B10251C8 | 1 | 9 | 2010-12-10 | 2010-12-31 | 0.06023 |
B10251C8 | 1 | 9 | 2011-01-01 | 2011-01-09 | 0.02464 |
B10251C8 | 1 | 10 | 2011-01-10 | 2011-02-09 | 0.08487 |
Policy month with calendar month:
key | duration | policy_month | start_int | end_int | exposure |
---|---|---|---|---|---|
B10251C8 | 1 | 1 | 2010-04-10 | 2010-04-30 | 0.05749 |
B10251C8 | 1 | 1 | 2010-05-01 | 2010-05-09 | 0.02464 |
B10251C8 | 1 | 2 | 2010-05-10 | 2010-05-31 | 0.06023 |
B10251C8 | 1 | 2 | 2010-06-01 | 2010-06-09 | 0.02464 |
B10251C8 | 1 | 3 | 2010-06-10 | 2010-06-30 | 0.05749 |
B10251C8 | 1 | 3 | 2010-07-01 | 2010-07-09 | 0.02464 |
####lower_year and upper_year
The lower_year and upper_year arguments of addExposures() truncate the output by calendar year. Exposure rows are only created if the interval lies entirely within the specified years. This can reduce computation time and memory use.
Policy year with lower and upper truncation year:
exposures_PY_2016_to_2018 <- addExposures(records, type = "PY", lower_year = 2016, upper_year = 2018)
exposures_PY_2016_to_2018
key | duration | start_int | end_int | exposure |
---|---|---|---|---|
B10251C8 | 7 | 2016-04-10 | 2017-04-09 | 0.9993 |
B10251C8 | 8 | 2017-04-10 | 2018-04-04 | 0.9856 |
D68554D5 | 12 | 2016-01-01 | 2016-12-31 | 1.002 |
D68554D5 | 13 | 2017-01-01 | 2017-12-31 | 0.9993 |
D68554D5 | 14 | 2018-01-01 | 2018-04-04 | 0.2574 |
Policy year with calendar month and lower truncation year:
key | duration | start_int | end_int | exposure |
---|---|---|---|---|
B10251C8 | 9 | 2019-01-01 | 2019-01-31 | 0.08487 |
B10251C8 | 9 | 2019-02-01 | 2019-02-28 | 0.07666 |
B10251C8 | 9 | 2019-03-01 | 2019-03-31 | 0.08487 |
B10251C8 | 9 | 2019-04-01 | 2019-04-04 | 0.01095 |
D68554D5 | 15 | 2019-01-01 | 2019-01-31 | 0.08487 |
D68554D5 | 15 | 2019-02-01 | 2019-02-28 | 0.07666 |
D68554D5 | 15 | 2019-03-01 | 2019-03-31 | 0.08487 |
D68554D5 | 15 | 2019-04-01 | 2019-04-04 | 0.01095 |
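The effect of a lower truncation year can be illustrated by filtering calendar month intervals on their start year. This is a sketch, not the package's implementation: the value 2019 for the lower year is inferred from the table above rather than stated in the text, and only record B10251C8 is rebuilt.

```r
# Policy year 9 of record B10251C8, cut short by the record's end date.
start_int <- as.Date("2018-04-10")
end_int   <- as.Date("2019-04-04")

# Split the policy year at calendar month boundaries.
month_firsts <- seq(as.Date(format(start_int, "%Y-%m-01")), end_int, by = "month")
starts <- month_firsts
starts[1] <- start_int
ends <- c(month_firsts[-1] - 1, end_int)
intervals <- data.frame(duration = 9L, start_int = starts, end_int = ends)

# Assumed lower truncation year (inferred from the table, not from the text).
lower_year <- 2019
keep <- as.integer(format(intervals$start_int, "%Y")) >= lower_year
truncated <- intervals[keep, ]

# Assumed rule: exposure = inclusive day count / 365.25
truncated$exposure <- (as.numeric(truncated$end_int - truncated$start_int) + 1) / 365.25
print(truncated)
```

Only the four intervals that start in 2019 survive the filter, which is where the saving in rows and memory comes from.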
###Determine Output Size Before Calling addExposures()
We can estimate how many rows a call to addExposures() will create by using expSize(). Checking this first keeps us from accidentally trying to create something like 100 million rows.
## row_bound
## 25
expSize() takes the same arguments as addExposures().
## row_bound
## 192
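A row bound for policy year exposures can be computed from the dates alone, without building any intervals. The sketch below shows one simple bound, the number of calendar years each record touches. Whether expSize() uses this rule is an assumption; `policy_year_row_bound()` and `records_demo` are hypothetical names, and the data frame is a hand-built copy of the two records shown at the top of this document.

```r
records_demo <- data.frame(
  key   = c("B10251C8", "D68554D5"),
  start = as.Date(c("2010-04-10", "2005-01-01")),
  end   = as.Date(c("2019-04-04", "2019-04-04"))
)

year_of <- function(d) as.integer(format(d, "%Y"))

# Hypothetical bound: a record cannot have more policy years than the
# number of calendar years it touches.
policy_year_row_bound <- function(records) {
  sum(year_of(records$end) - year_of(records$start) + 1L)
}

policy_year_row_bound(records_demo)
```

For these two records the bound is 10 + 15 = 25, while the exact number of policy year rows is 9 + 15 = 24. A bound like this is cheap because it needs one subtraction per record.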
###Adding additional information to the calculated exposures
We can add more information by joining the original records to the created exposures by key. Below we join the gender and issue age from the original records to the exposure frame and calculate the attained age as issue age plus duration minus 1.
library(dplyr)
exposures_mod <- exposures %>%
  inner_join(select(records, key, issue_age, gender), by = "key") %>%
  mutate(attained_age = issue_age + duration - 1)
head(exposures_mod)
key | duration | start_int | end_int | exposure | issue_age | gender | attained_age |
---|---|---|---|---|---|---|---|
B10251C8 | 1 | 2010-04-10 | 2011-04-09 | 0.9993 | 35 | M | 35 |
B10251C8 | 2 | 2011-04-10 | 2012-04-09 | 1.002 | 35 | M | 36 |
B10251C8 | 3 | 2012-04-10 | 2013-04-09 | 0.9993 | 35 | M | 37 |
B10251C8 | 4 | 2013-04-10 | 2014-04-09 | 0.9993 | 35 | M | 38 |
B10251C8 | 5 | 2014-04-10 | 2015-04-09 | 0.9993 | 35 | M | 39 |
B10251C8 | 6 | 2015-04-10 | 2016-04-09 | 1.002 | 35 | M | 40 |
###Making Daily Exposures
You can create a row for each policy day in an interval using the addDays() function.
key | date |
---|---|
B10251C8 | 2010-04-10 |
B10251C8 | 2010-04-11 |
B10251C8 | 2010-04-12 |
B10251C8 | 2010-04-13 |
B10251C8 | 2010-04-14 |
B10251C8 | 2010-04-15 |
There are options for lower and upper truncation dates.
key | date |
---|---|
B10251C8 | 2018-10-10 |
B10251C8 | 2018-10-11 |
B10251C8 | 2018-10-12 |
D68554D5 | 2018-10-10 |
D68554D5 | 2018-10-11 |
D68554D5 | 2018-10-12 |
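Daily rows with truncation dates can be sketched in base R as follows. `expand_days()` and its `lower_date` and `upper_date` parameters are hypothetical names chosen for this illustration; they are not taken from addDays(), and the truncation window is read off the table above.

```r
# Hypothetical helper: one row per day of a record, clipped to a date window.
expand_days <- function(key, start, end, lower_date = NULL, upper_date = NULL) {
  if (!is.null(lower_date)) start <- max(start, lower_date)
  if (!is.null(upper_date)) end <- min(end, upper_date)
  if (start > end) {
    # Record lies entirely outside the window: return an empty frame.
    return(data.frame(key = character(0), date = as.Date(character(0))))
  }
  data.frame(key = key, date = seq(start, end, by = "day"))
}

# Hand-built copy of the two records used throughout this document.
records_demo <- data.frame(
  key   = c("B10251C8", "D68554D5"),
  start = as.Date(c("2010-04-10", "2005-01-01")),
  end   = as.Date(c("2019-04-04", "2019-04-04"))
)

daily <- do.call(rbind, lapply(seq_len(nrow(records_demo)), function(i) {
  expand_days(records_demo$key[i], records_demo$start[i], records_demo$end[i],
              lower_date = as.Date("2018-10-10"),
              upper_date = as.Date("2018-10-12"))
}))
print(daily)
```

Clipping the dates before calling `seq()` means the days outside the window are never generated.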
You can determine the size of the output without creating it by using the daySize() function.
## num_rows
## 6
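The daily row count follows from date arithmetic alone. The sketch below is not the implementation of daySize(); `count_days()` and its parameter names are hypothetical, and the truncation window of 2018-10-10 to 2018-10-12 is taken from the truncated daily table.

```r
# Hypothetical helper: number of daily rows inside a date window,
# computed without creating the rows.
count_days <- function(start, end, lower_date, upper_date) {
  first <- pmax(as.numeric(start), as.numeric(lower_date))
  last  <- pmin(as.numeric(end), as.numeric(upper_date))
  # Inclusive day count per record; records outside the window count as 0.
  sum(pmax(last - first + 1, 0))
}

starts <- as.Date(c("2010-04-10", "2005-01-01"))
ends   <- as.Date(c("2019-04-04", "2019-04-04"))

count_days(starts, ends,
           lower_date = as.Date("2018-10-10"),
           upper_date = as.Date("2018-10-12"))
```

Each of the two records contributes three days inside the window, which agrees with the six rows reported above.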