Tidy phenology data of multiple types

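# Combine shoot growth and phenophase records into one tidy data set.
dat_multiple <- tidy_multiple_data(dat_shoot, dat_phenophase)

# Save the combined data in the phenologyb4warmed package, then return to the original directory.
usethis::proj_set(file.path(getwd(), "phenologyb4warmed"), force = TRUE)
setwd("phenologyb4warmed/")
usethis::use_data(dat_multiple, overwrite = TRUE)
setwd("..")

# Visualize the combined data.
plot_multiple_data(dat_multiple)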

Phenophase data and shoot growth data are related, but the mapping between the two data types is not yet entirely clear.
Many phenophase statuses may have been filled in or edited after the fact, so use them with care. Artur once mentioned that the first “yes” observation is often the most accurate.
Even after manual editing, some individuals still switch back and forth between phenophase statuses.
“No” observations in the fall are tricky to use: some individuals have true “no” observations, while others seem to have “yes” observations filled in throughout the year. Should we focus on the spring only?
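If the first “yes” observation is taken as the most reliable record, one option is to keep only the earliest “yes” per plant and year within a spring window. The sketch below illustrates this on a toy data frame; the column names (`plant`, `year`, `doy`, `status`) and the spring cutoff of day 180 are assumptions for illustration, not the actual structure of `dat_phenophase`.

```r
# Toy phenophase records (hypothetical columns and values).
obs <- data.frame(
  plant  = c("a", "a", "a", "a", "b", "b", "b"),
  year   = 2020,
  doy    = c(100, 110, 120, 130, 105, 115, 125),
  status = c("no", "yes", "no", "yes", "no", "no", "yes")
)

# Assumed end of the spring window (day of year); adjust as needed.
spring_cutoff <- 180

# Keep spring "yes" records only, then take the earliest day per plant and year.
yes_spring <- obs[obs$status == "yes" & obs$doy <= spring_cutoff, ]
first_yes <- aggregate(doy ~ plant + year, data = yes_spring, FUN = min)
first_yes
```

Plant `a` switches back to “no” after its first “yes”; taking the minimum day ignores that later back-and-forth.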

Propose model to integrate multiple data types

Variables
\begin{align*}
Y = \begin{cases} 0 & \text{dormancy} \newline 1 & \text{budburst} \newline 2 & \text{oneleaf} \newline 3 & \text{mostleaf} \end{cases} & - \text{life stage} \newline
y & - \text{shoot length} \newline
d & - \text{day of year} \newline
T = \begin{cases} 0 & \text{ambient temperature} \newline 1 & \text{+1.7°C temperature} \newline 2 & \text{+3.4°C temperature} \end{cases} & - \text{warming treatment} \newline
D = \begin{cases} 0 & \text{ambient rainfall} \newline 1 & \text{reduced rainfall} \end{cases} & - \text{drought treatment}
\end{align*}

Indices
\begin{align*}
i &- \text{plant (one plant can have multiple shoots)} \newline
j &- \text{block (i.e., a group of closely located experimental plots)} \newline
t &- \text{year} \newline
d &- \text{day of year}
\end{align*}

Parameters
\begin{align*}
\mu &- \text{latent variable for the spring development of a plant} \newline
p &- \text{probability of plant being in a specific life stage} \newline
\theta &- \text{thresholds for latent variables to determine life stage} \newline
\nu &- \text{latent variable for the shoot length} \newline
c &- \text{starting value of logistic curve} \newline
A &- \text{asymptote of logistic curve} \newline
x_0 &- \text{midpoint of logistic curve} \newline
k &- \text{growth rate or steepness of logistic curve} \newline
\beta &- \text{effects of experimental treatment on parameters of logistic curve} \newline
\sigma^2 &- \text{variance (of observation, parameters, or hyperparameters)}
\end{align*}

Data model (discrete)

\begin{align*}
Y_{i,t,d} &\sim \text{Categorical} (p_{0,i,t,d}, p_{1,i,t,d}, p_{2,i,t,d}, p_{3,i,t,d}) \newline
p_{0,i,t,d} &= \Phi \left( \frac{\theta_1-\mu_{i,t,d}}{\sigma_Y} \right) \newline
p_{1,i,t,d} &= \Phi \left( \frac{\theta_2-\mu_{i,t,d}}{\sigma_Y} \right) - \Phi \left( \frac{\theta_1-\mu_{i,t,d}}{\sigma_Y} \right) \newline
p_{2,i,t,d} &= \Phi \left( \frac{\theta_3-\mu_{i,t,d}}{\sigma_Y} \right) - \Phi \left( \frac{\theta_2-\mu_{i,t,d}}{\sigma_Y} \right) \newline
p_{3,i,t,d} &= 1 - \Phi \left( \frac{\theta_3-\mu_{i,t,d}}{\sigma_Y} \right)
\end{align*}
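The four stage probabilities are differences of consecutive normal CDF values, so they can be computed in a few lines. The sketch below is a minimal illustration of the discrete data model; the function name `stage_probs` and all numeric values (thresholds, latent development, standard deviation) are made up for the example.

```r
# Stage probabilities of the ordered-probit data model.
# mu: latent spring development; theta: three increasing thresholds;
# sigma_Y: standard deviation used to scale (theta - mu).
stage_probs <- function(mu, theta, sigma_Y) {
  # Cumulative probabilities P(Y <= s) for s = 0, 1, 2, and 1 for the last stage.
  cum <- c(pnorm((theta - mu) / sigma_Y), 1)
  # Consecutive differences give p_0, p_1, p_2, p_3.
  p <- diff(c(0, cum))
  names(p) <- c("dormancy", "budburst", "oneleaf", "mostleaf")
  p
}

theta <- c(0.2, 0.5, 0.8)  # hypothetical thresholds

# Early in spring (mu near 0) dormancy dominates; late (mu near 1) mostleaf does.
p_early <- stage_probs(mu = 0.05, theta = theta, sigma_Y = 0.1)
p_late  <- stage_probs(mu = 0.95, theta = theta, sigma_Y = 0.1)

# Draw one observed life stage (coded 0 to 3) from the categorical distribution.
set.seed(1)
Y <- sample(0:3, size = 1, prob = p_late)
```

Because the probabilities are built from differences of a cumulative distribution, they are non-negative and sum to one whenever the thresholds are increasing.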

Data model (continuous)

\begin{align*}
y_{i,t,d} &\sim \text{Lognormal}(\nu_{i,t,d}, \sigma_y^2) \newline
\nu_{i,t,d} &= c + A_{i,t} \mu_{i,t,d} \newline
A_{i,t} &= \mu_{A} + \delta_{A,i} + \alpha_{A,i,t}
\end{align*}

Process model

\begin{align*}
\mu_{i,t,d} &= \frac{1}{1+e^{-k_{i,t}(d-x_{0,i,t})}} \newline
x_{0,i,t} &= \mu_{x_0} + \delta_{x_0,i} + \alpha_{x_0,i,t} \newline
\log(k_{i,t}) &= \mu_{\log(k)} + \delta_{\log(k),i} + \alpha_{\log(k),i,t}
\end{align*}

Fixed effects

\begin{align*}
\delta_{A,i} &= \beta_{A,1} T_i + \beta_{A,2} D_i + \beta_{A,3} T_i D_i \newline
\delta_{x_0,i} &= \beta_{x_0,1} T_i + \beta_{x_0,2} D_i + \beta_{x_0,3} T_i D_i \newline
\delta_{\log(k),i} &= \beta_{\log(k),1} T_i + \beta_{\log(k),2} D_i + \beta_{\log(k),3} T_i D_i
\end{align*}
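To see how the process model, the fixed effects, and the continuous data model fit together, the sketch below simulates one season for an ambient plant and a warmed plant. It ignores random effects and drought; the values of `x0`, `k`, and `A` are set at the prior means listed under Priors, while `beta_x0_1`, `c0`, and `sigma_y` are hypothetical values chosen only for illustration.

```r
# Latent spring development: logistic curve in day of year d.
latent_development <- function(d, x0, k) {
  1 / (1 + exp(-k * (d - x0)))
}

d <- 100:220          # days of year to evaluate
mu_x0 <- 160          # baseline midpoint (prior mean of mu_{x_0})
beta_x0_1 <- -4       # hypothetical warming effect on the midpoint (days per level of T)
k <- exp(-2)          # steepness (prior mean of mu_{log(k)} on the log scale)

# Fixed effect of warming on the midpoint, with T = 0 (ambient) and T = 2 (+3.4°C).
mu_ambient <- latent_development(d, x0 = mu_x0 + beta_x0_1 * 0, k = k)
mu_warmed  <- latent_development(d, x0 = mu_x0 + beta_x0_1 * 2, k = k)

# Continuous data model: log shoot length has mean nu = c + A * mu.
c0 <- 0               # intercept c (named c0 to avoid masking base::c)
A <- 5                # asymptote (prior mean of mu_A)
sigma_y <- 0.3        # hypothetical observation sd on the log scale
nu_ambient <- c0 + A * mu_ambient

set.seed(42)
y_ambient <- rlnorm(length(d), meanlog = nu_ambient, sdlog = sigma_y)
```

With a negative `beta_x0_1`, the warmed curve reaches any given development level earlier than the ambient curve, which is the kind of treatment effect the fixed-effects terms are meant to capture.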

Random effects

\begin{align*}
\alpha_{A,i,t} &\sim \text{Normal}(0, \sigma_A^2) \newline
\alpha_{x_0,i,t} &\sim \text{Normal}(0, \sigma_{x_0}^2) \newline
\alpha_{\log(k),i,t} &\sim \text{Truncated Normal}(0, \sigma_{\log(k)}^2, 0, \infty)
\end{align*}

Priors

\begin{align*}
\theta_1 &\sim \text{Uniform} (0, 1) \newline
\theta_2 &\sim \text{Uniform} (\theta_1, 1) \newline
\theta_3 &\sim \text{Uniform} (\theta_2, 1) \newline
\sigma_Y^2 &\sim \text{Truncated Normal}(0, 1, 0, \infty) \newline
c &\sim \text{Normal}(0, 1) \newline
\mu_A &\sim \text{Normal}(5, 1) \newline
\beta_A &\sim \text{Multivariate Normal} \left( \begin{pmatrix} 0 \newline 0 \newline 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \newline 0 & 1 & 0 \newline 0 & 0 & 1 \end{pmatrix} \right) \newline
\sigma_A^2 &\sim \text{Truncated Normal}(0, 1, 0, \infty) \newline
\mu_{x_0} &\sim \text{Normal}(160, 100) \newline
\beta_{x_0} &\sim \text{Multivariate Normal} \left( \begin{pmatrix} 0 \newline 0 \newline 0 \end{pmatrix}, \begin{pmatrix} 100 & 0 & 0 \newline 0 & 100 & 0 \newline 0 & 0 & 100 \end{pmatrix} \right) \newline
\sigma_{x_0}^2 &\sim \text{Truncated Normal}(0, 100, 0, \infty) \newline
\mu_{\log(k)} &\sim \text{Normal}(-2, 0.04) \newline
\beta_{\log(k)} &\sim \text{Multivariate Normal} \left( \begin{pmatrix} 0 \newline 0 \newline 0 \end{pmatrix}, \begin{pmatrix} 0.04 & 0 & 0 \newline 0 & 0.04 & 0 \newline 0 & 0 & 0.04 \end{pmatrix} \right) \newline
\sigma_{\log(k)}^2 &\sim \text{Truncated Normal}(0, 0.04, 0, \infty) \newline
\sigma_y^2 &\sim \text{Truncated Normal}(0, 1, 0, \infty)
\end{align*}
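A quick way to check what these priors imply is to draw from them. The sketch below samples a few of the priors in base R. It assumes that the second argument of each Normal is a variance, as the notation Normal(0, σ²) in the random effects suggests, so `sd` is its square root. A Normal with mean 0 truncated to [0, ∞) is drawn as the absolute value of a Normal draw. The number of draws `n` is arbitrary.

```r
set.seed(2024)
n <- 1000  # number of prior draws (arbitrary)

# Nested uniform priors keep the thresholds ordered: theta1 <= theta2 <= theta3 <= 1.
theta1 <- runif(n, min = 0, max = 1)
theta2 <- runif(n, min = theta1, max = 1)
theta3 <- runif(n, min = theta2, max = 1)

# Normal priors, with the second parameter read as a variance (assumption).
mu_x0   <- rnorm(n, mean = 160, sd = sqrt(100))
mu_logk <- rnorm(n, mean = -2, sd = sqrt(0.04))

# Zero-mean Normal truncated to [0, Inf) is a half-normal: take absolute values.
sigma2_Y <- abs(rnorm(n, mean = 0, sd = sqrt(1)))

# Implied prior range of the logistic midpoint and steepness.
quantile(mu_x0, c(0.025, 0.975))
quantile(exp(mu_logk), c(0.025, 0.975))
```

Under the nested uniforms, later thresholds are pushed toward 1, which may be worth keeping in mind when interpreting the spacing between life stages.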
