library(SHELF)
library(sn)
library(ggplot2)
Exercise. Food safety assessment with uncertainty analysis
MVEN10 Risk Assessment in Environment and Public Health
Exercise overview
We do this jointly in class.
Background
We can never be completely certain about the future, either in science, or in everyday life. Even when there is strong evidence that something will happen, there will almost always be uncertainty about the outcome. But by taking account of this uncertainty, we often can make better, more transparent decisions about things that may affect the outcome.
The European Food Safety Authority (EFSA) has developed a guidance for uncertainty analysis in scientific assessment which requires all assessment to say
- what sources of uncertainty have been identified
and contain
- a characterisation of their overall impact on the assessment conclusion.
The reason is that uncertainty of scientific conclusions has important implications for decision making and it is important to communicate this uncertainty for the transparency of assessments.
Purpose
- To practice performing a probabilistic uncertainty analysis.
Content
A human chemical risk assessment problem
A probabilistic uncertainty analysis using expert judgement and Monte Carlo-simulation
Duration
45 minutes
Reporting
Be prepared to report back during and at the end of the exercise.
References
Tutorial videos on EFSA’s topic page on uncertainty (examples in the chemical area)
Key concepts (17 minutes)
Methods and options for basic assessment of uncertainty (27 minutes)
Description of the assessment
The example is taken from the human health risk assessment of inorganic arsenic in food by EFSA that constituted one of the case studies.
Here we are to reproduce part of the uncertainty analysis. More specifically, the part where they perform probabilistic uncertainty analysis that a high exposure would exceed the Benchmark Dose (BMD) (which is the same that the MOE is less than 1).
Hazard assessment
Estimates of benchmark dose as μg iAs/kg bw per day was identified by the CONTAM Panel for different critical effects. A lower, median and upper value for the BMD were derived from model averaging of a BMD modelling.
Critical effect | BMDL | BMD | BMDU |
---|---|---|---|
Skin cancer | 0.06 | 0.15 | 0.21 |
Bladder cancer | 0.15 | 1.33 | 5.46 |
- Which quantiles does the BMDL, BMD and BMDU corresponds to?
Exposure assessment
The ranges and medians of the 95th percentile dietary exposure estimates for iAs seen over different consumption surveys and for two relevant age groups are:
Age group | Low value | Mid value | High value |
---|---|---|---|
Infants | 0.21 | 0.6 | 1.20 |
Toddlers | 0.24 | 0.51 | 0.99 |
- Decide which quantiles does the Low, Mid and High value corresponds to.
Load packages
Probabilistic uncertainty analysis
Characterise uncertainty in the Reference Point
- Choose one of the critical effects and find a probability distribution describing uncertainty about the BMD for that effect. Use the SHELF tool, where you feed in your own values.
Let us start with Skin cancer
<- fitdist(vals = c(0.06,0.15,0.21), probs = c(0.05, 0.5, 0.95), lower = 0)
fit_rp fit_rp
$Normal
mean sd
1 0.1492205 0.04007987
$Student.t
location scale df
1 0.1497456 0.02845681 3
$Skewnormal
location scale slant
1 0.1968077 0.06980073 -2.788872
$Gamma
shape rate
1 21.42747 140.6492
$Log.normal
mean.log.X sd.log.X
1 -1.897159 0.2046353
$Log.Student.t
location.log.X scale.log.X df.log.X
1 -1.8975 0.1496427 3
$Beta
shape1 shape2
1 NA NA
$mirrorgamma
shape rate
1 NA NA
$mirrorlognormal
mean.log.X sd.log.X
1 NA NA
$mirrorlogt
location.log.X scale.log.X df.log.X
1 NA NA NA
$ssq
normal t skewnormal gamma lognormal logt beta
1 0.001644973 0.0007581116 7.295898e-12 0.002487694 0.00249963 0.002105911 NA
mirrorgamma mirrorlognormal mirrorlogt
1 NA NA NA
$best.fitting
best.fit
1 skewnormal
$vals
[,1] [,2] [,3]
[1,] 0.06 0.15 0.21
$probs
[,1] [,2] [,3]
[1,] 0.05 0.5 0.95
$limits
lower upper
1 0 Inf
$notes
NULL
attr(,"class")
[1] "elicitation"
- Choose one of the distributions. You can justify your choice as the one having the lowest sum of squares.
<- data.frame(ssq = as.numeric(fit_rp$ssq[1:6]), dist = colnames(fit_rp$ssq[1:6]))
df ggplot(df,aes(x=ssq,y=dist)) +
geom_col()
::plotfit(fit_rp, d = "best", ql = 0.05, qu = 0.95) SHELF
- Generate a report. In the report you can see how to draw random numbers from the chosen probability distribution.
#SHELF::generateReport(fit_rp)
= 10^4
niter <- sn::rsn(n = niter, xi = 0.197, omega = 0.0698, alpha = -2.79)
rp <- rp rp_skin
Characterise uncertainty in a high exposure
- Choose which age group to consider.
I choose Infants
<- fitdist(vals = c(0.21, 0.6, 1.2), probs = c(0.025, 0.5, 0.975), lower = 0)
fit_exposure fit_exposure
$Normal
mean sd
1 0.6004057 0.2043064
$Student.t
location scale df
1 0.6002239 0.1334065 3
$Skewnormal
location scale slant
1 0.3447033 0.3815849 2.628322
$Gamma
shape rate
1 5.950066 9.3859
$Log.normal
mean.log.X sd.log.X
1 -0.5116764 0.3644407
$Log.Student.t
location.log.X scale.log.X df.log.X
1 -0.511229 0.2372064 3
$Beta
shape1 shape2
1 NA NA
$mirrorgamma
shape rate
1 NA NA
$mirrorlognormal
mean.log.X sd.log.X
1 NA NA
$mirrorlogt
location.log.X scale.log.X df.log.X
1 NA NA NA
$ssq
normal t skewnormal gamma lognormal logt
1 0.0005540256 0.0002489426 1.103311e-11 0.0001079138 0.00054172 0.0002361558
beta mirrorgamma mirrorlognormal mirrorlogt
1 NA NA NA NA
$best.fitting
best.fit
1 skewnormal
$vals
[,1] [,2] [,3]
[1,] 0.21 0.6 1.2
$probs
[,1] [,2] [,3]
[1,] 0.025 0.5 0.975
$limits
lower upper
1 0 Inf
$notes
NULL
attr(,"class")
[1] "elicitation"
<- data.frame(ssq = as.numeric(fit_exposure$ssq[1:6]), dist = colnames(fit_exposure$ssq[1:6]))
df ggplot(df,aes(x=ssq,y=dist)) +
geom_col()
Although the skewed normal is the best fitting distribution, I choose Gamma since it is stricly positive and is also having a good fit.
plotfit(fit_exposure, d = "gamma")
- Generate a report. In the report you can see how to draw random numbers from the chosen probability distribution.
#SHELF::generateReport(fit_exposure)
<- rgamma(n = niter, shape = 5.95, rate = 9.39)
exposure <- exposure exposure_infant
Uncertainty in a high exposure and the Reference Point
ggplot(data.frame(concentration = c(rp,exposure), quantity = rep(c("Reference Point", "High Exposure"),each = niter)),aes(x=concentration, fill = quantity)) +
geom_density(alpha=0.2) +
ggtitle("Uncertainty in a High Exposure and the Reference Point")
Characterise uncertainty in the Margin of Exposure
A Margin of Exposure (MOE) is defined as \[MOE=\frac{\text{Reference Point}}{\text{high exposure}}\]
A \(MOE < 1\) corresponds to \(\text{high exposure} < \text{Reference Point}\) and a risk. Risk is unaccepatble if \(P(MOE < 1)\) is too high.
<- density(rp/exposure, from=0)
df_dens <- data.frame(x=df_dens$x,y=df_dens$y)
df_moe ggplot(df_moe,aes(x=x, y=y)) +
geom_line() +
geom_ribbon(data=subset(df_moe,x<1), aes(x=x,ymax=y),ymin=0,fill="darkred", alpha=0.5) +
xlab("MOE") +
ylab("density") +
ggtitle("Uncertainty in MOE", subtitle=paste0("P(MOE<1)=",round(mean(rp<exposure)*100),"%")) +
xlim(0,5)
Characterise uncertainty in the Risk Characterisation Ratio
Alternative risk measure is the RCR
\[RCR=\frac{\text{high exposure}}{\text{Reference Point}}\]
<- density(exposure/rp)
df_dens <- data.frame(x=df_dens$x,y=df_dens$y)
df_rcr ggplot(df_rcr,aes(x=x, y=y)) +
geom_line() +
geom_ribbon(data=subset(df_rcr,x>1), aes(x=x,ymax=y),ymin=0,fill="darkred", alpha=0.5) +
xlab("RCR") +
ylab("density") +
ggtitle("Uncertainty in RCR", subtitle=paste0("P(RCR>1)=",round(mean(rp<exposure)*100),"%")) +
xlim(0,25)
Conclusion
After taking all sources of uncertainty into account, there is a 99% probability that the Margin of Exposure is less than one. I am therefore 99% certain that there exposure of iAr is a health concern for Infants.
My results when redoing this for all combinations of hazard and expsoure.
Characterise uncertainty in a high exposure for Toddlers
<- fitdist(vals = c(0.24, 0.51,0.99), probs = c(0.025, 0.5, 0.975), lower = 0)
fit_exposure fit_exposure
$Normal
mean sd
1 0.5100632 0.1386545
$Student.t
location scale df
1 0.5101256 0.09129073 3
$Skewnormal
location scale slant
1 0.3035722 0.306259 3.747637
$Gamma
shape rate
1 8.140514 15.2918
$Log.normal
mean.log.X sd.log.X
1 -0.6749394 0.3549878
$Log.Student.t
location.log.X scale.log.X df.log.X
1 -0.6735561 0.2193942 3
$Beta
shape1 shape2
1 NA NA
$mirrorgamma
shape rate
1 NA NA
$mirrorlognormal
mean.log.X sd.log.X
1 NA NA
$mirrorlogt
location.log.X scale.log.X df.log.X
1 NA NA NA
$ssq
normal t skewnormal gamma lognormal logt
1 0.0006121929 0.0003579367 4.471613e-11 6.686226e-05 9.704474e-05 2.942128e-05
beta mirrorgamma mirrorlognormal mirrorlogt
1 NA NA NA NA
$best.fitting
best.fit
1 skewnormal
$vals
[,1] [,2] [,3]
[1,] 0.24 0.51 0.99
$probs
[,1] [,2] [,3]
[1,] 0.025 0.5 0.975
$limits
lower upper
1 0 Inf
$notes
NULL
attr(,"class")
[1] "elicitation"
<- data.frame(ssq = as.numeric(fit_exposure$ssq[1:6]), dist = colnames(fit_exposure$ssq[1:6]))
df ggplot(df,aes(x=ssq,y=dist)) +
geom_col()
plotfit(fit_exposure, d = "best")
#SHELF::generateReport(fit_exposure)
#exposure <- exp(-0.674 + 0.219 * rt(n = niter, df = 3))
<- sn::rsn(n = niter, xi = 0.304, omega = 0.306, alpha = 3.75)
exposure <- exposure exposure_toddler
Characterise uncertainty in the Reference Point for Bladder cancer
<- fitdist(vals = c(0.15,1.33,5.46), probs = c(0.05, 0.5, 0.95), lower = 0)
fit_rp fit_rp
$Normal
mean sd
1 1.329963 0.7173644
$Student.t
location scale df
1 1.3306 0.5133742 3
$Skewnormal
location scale slant
1 0.01798258 1.970323 63.57979
$Gamma
shape rate
1 1.165205 0.6355956
$Log.normal
mean.log.X sd.log.X
1 0.2688761 0.936087
$Log.Student.t
location.log.X scale.log.X df.log.X
1 0.2792185 0.6703632 3
$Beta
shape1 shape2
1 NA NA
$mirrorgamma
shape rate
1 NA NA
$mirrorlognormal
mean.log.X sd.log.X
1 NA NA
$mirrorlogt
location.log.X scale.log.X df.log.X
1 NA NA NA
$ssq
normal t skewnormal gamma lognormal logt beta
1 0.0025 0.002309861 0.002000257 0.0001131296 0.001803428 0.0008492425 NA
mirrorgamma mirrorlognormal mirrorlogt
1 NA NA NA
$best.fitting
best.fit
1 gamma
$vals
[,1] [,2] [,3]
[1,] 0.15 1.33 5.46
$probs
[,1] [,2] [,3]
[1,] 0.05 0.5 0.95
$limits
lower upper
1 0 Inf
$notes
NULL
attr(,"class")
[1] "elicitation"
<- data.frame(ssq = as.numeric(fit_rp$ssq[1:6]), dist = colnames(fit_rp$ssq[1:6]))
df ggplot(df,aes(x=ssq,y=dist)) +
geom_col()
::plotfit(fit_rp, d = "gamma", ql = 0.05, qu = 0.95) SHELF
- Generate a report. In the report you can see how to draw random numbers from the chosen probability distribution.
#SHELF::generateReport(fit_rp)
<- rgamma(n = niter, shape = 1.17, rate = 0.636)
rp <- rp rp_bladder
Summary skin cancer and infants
Skin cancer: skewed normal
Infants: gamma
\(P(MOE<1) = 100 \%\)
ggplot(data.frame(concentration = c(rp_skin,exposure_infant), quantity = rep(c("Reference Point", "High Exposure"),each = niter)),aes(x=concentration, fill = quantity)) +
geom_density(alpha=0.2) +
ggtitle("Uncertainty in a High Exposure and the Reference Point")
<- density(rp_skin/exposure_infant, from=0)
df_dens <- data.frame(x=df_dens$x,y=df_dens$y)
df_moe ggplot(df_moe,aes(x=x, y=y)) +
geom_line() +
geom_ribbon(data=subset(df_moe,x<1), aes(x=x,ymax=y),ymin=0,fill="darkred", alpha=0.5) +
xlab("MOE") +
ylab("density") +
ggtitle("Uncertainty in MOE", subtitle=paste0("P(MOE<1)=",round(mean(rp_skin<exposure_infant)*100),"%")) +
xlim(0,5)
Summary Skin cancer and Toddlers
Skin cancer: skewed normal
Toddlers: skewed normal
\(P(MOE<1) = 100 \%\)
ggplot(data.frame(concentration = c(rp_skin,exposure_toddler), quantity = rep(c("Reference Point", "High Exposure"),each = niter)),aes(x=concentration, fill = quantity)) +
geom_density(alpha=0.2) +
ggtitle("Uncertainty in a High Exposure and the Reference Point")
<- density(rp_skin/exposure_toddler, from=0)
df_dens <- data.frame(x=df_dens$x,y=df_dens$y)
df_moe ggplot(df_moe,aes(x=x, y=y)) +
geom_line() +
geom_ribbon(data=subset(df_moe,x<1), aes(x=x,ymax=y),ymin=0,fill="darkred", alpha=0.5) +
xlab("MOE") +
ylab("density") +
ggtitle("Uncertainty in MOE", subtitle=paste0("P(MOE<1)=",round(mean(rp_skin<exposure_toddler)*100),"%")) +
xlim(0,5)
Summary Bladder cancer and Infants
Bladder cancer: gamma
Infants: gamma
\(P(MOE<1) = 25 \%\)
Here one can argue that the assessors are not certain enough to make a conclusion, and advice them to go back and reduce uncertainty (by collecting more data) och refine uncertaint (by changing worst case or conservative estimates to more precise ones).
ggplot(data.frame(concentration = c(rp_bladder,exposure_infant), quantity = rep(c("Reference Point", "High Exposure"),each = niter)),aes(x=concentration, fill = quantity)) +
geom_density(alpha=0.2) +
ggtitle("Uncertainty in a High Exposure and the Reference Point")
<- density(rp_bladder/exposure_infant, from=0)
df_dens <- data.frame(x=df_dens$x,y=df_dens$y)
df_moe ggplot(df_moe,aes(x=x, y=y)) +
geom_line() +
geom_ribbon(data=subset(df_moe,x<1), aes(x=x,ymax=y),ymin=0,fill="darkred", alpha=0.5) +
xlab("MOE") +
ylab("density") +
ggtitle("Uncertainty in MOE", subtitle=paste0("P(MOE<1)=",round(mean(rp_bladder<exposure_infant)*100),"%")) +
xlim(0,5)
Summary Bladder cancer and Toddlers
Bladder cancer: gamma
Toddlers: skewed normal
\(P(MOE<1) = 22 \%\)
ggplot(data.frame(concentration = c(rp_bladder,exposure_toddler), quantity = rep(c("Reference Point", "High Exposure"),each = niter)),aes(x=concentration, fill = quantity)) +
geom_density(alpha=0.2) +
ggtitle("Uncertainty in a High Exposure and the Reference Point")
<- density(rp_bladder/exposure_toddler, from=0)
df_dens <- data.frame(x=df_dens$x,y=df_dens$y)
df_moe ggplot(df_moe,aes(x=x, y=y)) +
geom_line() +
geom_ribbon(data=subset(df_moe,x<1), aes(x=x,ymax=y),ymin=0,fill="darkred", alpha=0.5) +
xlab("MOE") +
ylab("density") +
ggtitle("Uncertainty in MOE", subtitle=paste0("P(MOE<1)=",round(mean(rp_bladder<exposure_toddler)*100),"%")) +
xlim(0,5)