Premium Pattern • expstudies

Suppose we have some monthly exposures that we would like to add premium data to.

library(expstudies)
library(dplyr)  # provides %>%, group_by(), left_join(), mutate(), etc.

exposures_PM <- addExposures(records, type = "PM")
head(exposures_PM)

key	duration	policy_month	start_int	end_int	exposure
B10251C8	1	1	2010-04-10	2010-05-09	0.08214
B10251C8	1	2	2010-05-10	2010-06-09	0.08487
B10251C8	1	3	2010-06-10	2010-07-09	0.08214
B10251C8	1	4	2010-07-10	2010-08-09	0.08487
B10251C8	1	5	2010-08-10	2010-09-09	0.08487
B10251C8	1	6	2010-09-10	2010-10-09	0.08214

Simulated premium data “trans” comes with the package.

head(trans)

key	trans_date	amt
B10251C8	2012-12-04	199
B10251C8	2013-12-28	197
B10251C8	2015-12-30	177
B10251C8	2019-05-07	192
B10251C8	2012-04-15	206
B10251C8	2019-04-02	220

The addStart function adds the start date of the appropriate exposure interval to the transactions.

trans_with_interval <- addStart(trans, exposures_PM)
head(trans_with_interval)

start_int	key	trans_date	amt
2010-05-10	B10251C8	2010-05-28	190
2010-06-10	B10251C8	2010-07-04	189
2010-11-10	B10251C8	2010-11-21	179
2011-04-10	B10251C8	2011-05-08	210
2011-07-10	B10251C8	2011-07-12	198
2012-01-10	B10251C8	2012-01-14	194
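
As a rough illustration of the lookup described above, the sketch below reproduces the idea on toy data: pair each transaction with its policy's exposure rows and keep the row whose interval contains trans_date. This is not the package's implementation of addStart, only one way to express the same matching. The frames toy_exposures, toy_trans, and toy_with_interval are hypothetical names introduced for this example.

```r
library(dplyr)

# Toy exposure intervals for one policy (hypothetical data with the same columns as above)
toy_exposures <- data.frame(
  key       = "A1",
  start_int = as.Date(c("2010-04-10", "2010-05-10", "2010-06-10")),
  end_int   = as.Date(c("2010-05-09", "2010-06-09", "2010-07-09"))
)

# Toy transactions for the same policy
toy_trans <- data.frame(
  key        = "A1",
  trans_date = as.Date(c("2010-05-28", "2010-07-04")),
  amt        = c(190, 189)
)

# Pair every transaction with every interval of its policy, then keep
# only the interval that contains the transaction date
toy_with_interval <- merge(toy_trans, toy_exposures, by = "key") %>%
  filter(trans_date >= start_int, trans_date <= end_int) %>%
  arrange(trans_date) %>%
  select(start_int, key, trans_date, amt)

toy_with_interval
```

Each transaction ends up with exactly one start_int because the exposure intervals for a policy do not overlap.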

We can group and aggregate by key and start_int to get unique transaction rows corresponding to intervals in exposures_PM.

trans_to_join <- trans_with_interval %>% group_by(start_int, key) %>% summarise(premium = sum(amt))
head(trans_to_join)

start_int	key	premium
2005-06-01	D68554D5	97
2005-10-01	D68554D5	169
2005-12-01	D68554D5	96
2006-01-01	D68554D5	193
2006-02-01	D68554D5	107
2006-03-01	D68554D5	119

Because trans_to_join has at most one row per key and start_int, we can attach it to the exposures with a left join without duplicating any exposure rows.

premium_study <- exposures_PM %>% left_join(trans_to_join, by = c("key", "start_int"))
head(premium_study, 10)

key	duration	policy_month	start_int	end_int	exposure	premium
B10251C8	1	1	2010-04-10	2010-05-09	0.08214	NA
B10251C8	1	2	2010-05-10	2010-06-09	0.08487	190
B10251C8	1	3	2010-06-10	2010-07-09	0.08214	189
B10251C8	1	4	2010-07-10	2010-08-09	0.08487	NA
B10251C8	1	5	2010-08-10	2010-09-09	0.08487	NA
B10251C8	1	6	2010-09-10	2010-10-09	0.08214	NA
B10251C8	1	7	2010-10-10	2010-11-09	0.08487	NA
B10251C8	1	8	2010-11-10	2010-12-09	0.08214	179
B10251C8	1	9	2010-12-10	2011-01-09	0.08487	NA
B10251C8	1	10	2011-01-10	2011-02-09	0.08487	NA
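
One way to check that a join has not duplicated exposures is to compare row counts before and after. The sketch below uses hypothetical toy frames (toy_exposures, toy_with_interval, toy_to_join) to contrast joining raw transactions, where two payments in one interval repeat that exposure row, with joining the aggregated transactions.

```r
library(dplyr)

toy_exposures <- tibble(
  key       = "A1",
  start_int = as.Date(c("2010-04-10", "2010-05-10")),
  exposure  = c(0.08214, 0.08487)
)

# Two payments fall in the interval starting 2010-05-10
toy_with_interval <- tibble(
  key       = "A1",
  start_int = as.Date(c("2010-05-10", "2010-05-10")),
  amt       = c(100, 90)
)

# Joining the raw transactions repeats the exposure row for 2010-05-10
# (some dplyr versions also warn about the multiple matches)
raw_join <- left_join(toy_exposures, toy_with_interval, by = c("key", "start_int"))

# Aggregating first leaves one row per exposure interval
toy_to_join <- toy_with_interval %>%
  group_by(start_int, key) %>%
  summarise(premium = sum(amt), .groups = "drop")
agg_join <- left_join(toy_exposures, toy_to_join, by = c("key", "start_int"))

nrow(toy_exposures)  # 2
nrow(raw_join)       # 3
nrow(agg_join)       # 2
```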

Replace the NA values produced by the join with zeros using if_else.

premium_study <- premium_study %>% mutate(premium = if_else(is.na(premium), 0, premium))
head(premium_study, 10)

key	duration	policy_month	start_int	end_int	exposure	premium
B10251C8	1	1	2010-04-10	2010-05-09	0.08214	0
B10251C8	1	2	2010-05-10	2010-06-09	0.08487	190
B10251C8	1	3	2010-06-10	2010-07-09	0.08214	189
B10251C8	1	4	2010-07-10	2010-08-09	0.08487	0
B10251C8	1	5	2010-08-10	2010-09-09	0.08487	0
B10251C8	1	6	2010-09-10	2010-10-09	0.08214	0
B10251C8	1	7	2010-10-10	2010-11-09	0.08487	0
B10251C8	1	8	2010-11-10	2010-12-09	0.08214	179
B10251C8	1	9	2010-12-10	2011-01-09	0.08487	0
B10251C8	1	10	2011-01-10	2011-02-09	0.08487	0
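
An equivalent way to write the same replacement is dplyr's coalesce, which returns the first non-missing value at each position. A minimal sketch on a hypothetical toy frame (toy_study is not part of the package):

```r
library(dplyr)

toy_study <- tibble(
  policy_month = 1:4,
  premium      = c(NA, 190, 189, NA)
)

# coalesce() keeps premium where it is present and falls back to 0 otherwise
toy_filled <- toy_study %>%
  mutate(premium = coalesce(premium, 0))

toy_filled$premium  # 0 190 189 0
```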

Now we are free to do any calculations we want. As a simple example, we calculate the average premium in each of the first two policy months. Refer to the section on adding additional information for more creative policy splits.

premium_study %>% filter(policy_month %in% c(1,2)) %>% group_by(policy_month) %>% summarise(avg_premium = mean(premium))

policy_month	avg_premium
1	60.46
2	66.88
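
Because unpaid intervals were set to zero, this average is taken over all exposure rows, not only the rows with a payment. The sketch below, on made-up values in a hypothetical toy_study frame, shows how the two averages differ and how they relate through the share of rows with a payment.

```r
library(dplyr)

toy_study <- tibble(
  policy_month = c(1, 1, 1, 2, 2, 2),
  premium      = c(0, 120, 0, 190, 0, 200)
)

toy_summary <- toy_study %>%
  group_by(policy_month) %>%
  summarise(
    avg_premium      = mean(premium),               # zeros included
    avg_paid_premium = mean(premium[premium > 0]),  # rows with a payment only
    share_paying     = mean(premium > 0)            # fraction of rows with a payment
  )

toy_summary
```

Here avg_premium equals avg_paid_premium multiplied by share_paying, which is why the zero fill matters for the interpretation.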

### Other Uses for addStart

Suppose we want the last premium paid on each policy as a feature for a predictive analytics project. Again we left join the premium to the exposure frame.

previous_premium_unfilled <- exposures_PM %>% left_join(trans_to_join, by = c("key", "start_int"))
head(previous_premium_unfilled)

key	duration	policy_month	start_int	end_int	exposure	premium
B10251C8	1	1	2010-04-10	2010-05-09	0.08214	NA
B10251C8	1	2	2010-05-10	2010-06-09	0.08487	190
B10251C8	1	3	2010-06-10	2010-07-09	0.08214	189
B10251C8	1	4	2010-07-10	2010-08-09	0.08487	NA
B10251C8	1	5	2010-08-10	2010-09-09	0.08487	NA
B10251C8	1	6	2010-09-10	2010-10-09	0.08214	NA

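This time we fill NA values with the most recent premium paid instead of 0. We group by key first so that one policy's premium is never carried into the next policy's rows. The first interval is NA because there are no prior premiums.

previous_premium <- previous_premium_unfilled %>%
  group_by(key) %>%  # fill within each policy only
  tidyr::fill(premium, .direction = "down") %>%
  ungroup()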

key	duration	policy_month	start_int	end_int	exposure	premium
B10251C8	1	1	2010-04-10	2010-05-09	0.08214	NA
B10251C8	1	2	2010-05-10	2010-06-09	0.08487	190
B10251C8	1	3	2010-06-10	2010-07-09	0.08214	189
B10251C8	1	4	2010-07-10	2010-08-09	0.08487	189
B10251C8	1	5	2010-08-10	2010-09-09	0.08487	189
B10251C8	1	6	2010-09-10	2010-10-09	0.08214	189
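
When the frame holds more than one policy, a fill applied to the whole frame carries the last premium of one key into the leading rows of the next key. Grouping by key before filling avoids that. A minimal sketch on hypothetical toy data (toy_unfilled is made up for illustration):

```r
library(dplyr)
library(tidyr)

toy_unfilled <- tibble(
  key          = c("A1", "A1", "A1", "B2", "B2"),
  policy_month = c(1, 2, 3, 1, 2),
  premium      = c(NA, 190, NA, NA, 75)
)

# Without grouping, A1's 190 carries into B2's first month
ungrouped <- toy_unfilled %>%
  fill(premium, .direction = "down")

# Grouping by key restarts the fill for each policy
grouped <- toy_unfilled %>%
  group_by(key) %>%
  fill(premium, .direction = "down") %>%
  ungroup()

ungrouped$premium  # NA 190 190 190 75
grouped$premium    # NA 190 190 NA 75
```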

Similar data manipulations can be used to engineer features for anything that varies with time: account values, guarantees, planned premiums, etc.