represent a combination of true differences between subsamples and measurement error
(Stockl, 1998). Herein, we visualized the precision of methods by quantifying the standard
deviation among replicates for each trial (flow cytometry excluded, since measurements were
not replicated). These graphs show that variability is low at low concentrations (i.e. the region
of utmost concern for compliance monitoring), but they do not allow us to directly compare
instruments quantitatively since most methods use different measurement units. Ideally,
each tool would analyze multiple replicates of multiple subsamples to
help separate these sources of variation, but operational considerations limited the number of
measurements taken during this voyage. Such an approach could be the focus of future
empirical studies, or alternatively, lab-based studies may be useful to study the reliability and
precision of these methods across a controlled range of values (see Vanden Byllaardt et al.,
submitted).
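As an illustration of the precision summary described above, the following minimal Python sketch computes the sample standard deviation among replicates for each trial. The trial names and values are hypothetical and are not data from this voyage. The coefficient of variation (standard deviation divided by the mean) is included only as an assumption on our part: it is one common unit-free quantity that could in principle support comparison across instruments reporting in different units, and it is not a calculation performed in this study.

```python
from statistics import mean, stdev


def replicate_precision(replicates):
    """Return (mean, sample SD, CV) for the replicates of a single trial."""
    if len(replicates) < 2:
        # A standard deviation cannot be estimated from one measurement
        # (the reason unreplicated methods are excluded).
        raise ValueError("need at least two replicates")
    m = mean(replicates)
    sd = stdev(replicates)  # sample standard deviation (n - 1 denominator)
    cv = sd / m if m != 0 else float("nan")  # unit-free; undefined at mean 0
    return m, sd, cv


# Hypothetical replicate measurements for two trials of one method.
trials = {
    "trial_1": [10.0, 12.0, 14.0],
    "trial_2": [100.0, 110.0, 120.0],
}

summary = {name: replicate_precision(vals) for name, vals in trials.items()}

for name, (m, sd, cv) in summary.items():
    print(f"{name}: mean={m:.1f} SD={sd:.2f} CV={cv:.3f}")
```

Because the coefficient of variation becomes unstable as the mean approaches zero, it should be interpreted with caution at the low concentrations most relevant to compliance monitoring.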
Conducting our trials on board a ship in transit offered both advantages and
disadvantages. While we were disadvantaged by the added complexity of conducting
microscopy on a moving ship, we expect that our results benefit from (i) organisms being subjected
to the stresses of the ballast system during collection akin to real compliance scenarios and (ii)
the diversity of communities that were sampled. Indeed, we expect that the composition of
plankton communities is likely to affect the level of concordance observed between analytic
methods. While our results showed that CFA devices performed well for the > 50 µm size class,
these results might have differed dramatically had the samples for this size class not been
dominated by dinoflagellates, as was the case during this voyage. Thus, future work is needed
to ensure that a variety of communities are considered when testing methods, including