Currently, `clear_lastminute_nas` will remove any row that contains an `NA` in any column. With multiple signals, this can mean removing rows that contain real data (as an example, see 11-29-24, for which 10-02-24 is the last day with data). Before using `chng`, we will need another scheme to handle missing data differently. Ideas:
- locf `NA`s in non-outcome signals. This is not a great idea, as lags likely become meaningless.
- do the equivalent of `extend_ahead` but for lags (e.g. if the last observation for chng is 30 days behind, adjust the lags from `c(0,7,14)` to `c(30,37,44)`).
- ?
@brookslogan may have other thoughts I forgot about.
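
To make the lag-shifting idea concrete, here is a rough base-R sketch. `adjust_lags` is a hypothetical helper name, not an existing function, and the dates and values are made up to reproduce the 30-day example above. It assumes each signal's latency is measured as the number of days between the forecast date and that signal's last non-`NA` observation.

```r
# Hypothetical helper (not an existing function): shift every lag by the
# signal's latency, i.e. the number of days between the forecast date and
# the last day on which the signal has a non-NA value.
adjust_lags <- function(lags, time_value, value, forecast_date) {
  last_observed <- max(time_value[!is.na(value)])
  latency <- as.integer(difftime(forecast_date, last_observed, units = "days"))
  # Never shift lags backwards if the signal happens to be up to date.
  lags + max(latency, 0L)
}

# Toy signal: observed through 2024-10-01, NA afterwards.
time_value <- seq(as.Date("2024-09-01"), as.Date("2024-10-31"), by = "day")
value <- ifelse(time_value <= as.Date("2024-10-01"), 1, NA_real_)

adjusted <- adjust_lags(c(0, 7, 14), time_value, value, as.Date("2024-10-31"))
print(adjusted)
```

Computing the shift per signal would let an up-to-date signal keep its original lags while only the stale one is pushed back, so no rows need to be dropped just because one column is `NA`.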