# Agreement Between Data Sets

Kappa = (observed agreement [Po] − expected agreement [Pe]) / (1 − expected agreement [Pe]).

Agreement between measurements refers to the degree of concordance between two (or more) sets of measurements. Statistical methods for assessing agreement are used to evaluate inter-rater variability or to decide whether one measurement technique can replace another. In this article, we look at statistical measures of agreement for different types of data and discuss how they differ from measures of correlation.

It is important to note that in each of the three situations in Table 1, the pass percentages are the same for both examiners, so if the two examiners were compared with a typical 2 × 2 test for paired data (McNemar's test), no difference between their performances would be found. On the other hand, the agreement between the observers is very different in these three situations. The basic idea to grasp here is that "agreement" quantifies the concordance between the two examiners for each of the "pairs" of scores, not the similarity of the overall pass percentage between the examiners.

For the three situations described in Table 1, McNemar's test (designed to compare paired categorical data) would show no difference. This, however, cannot be construed as evidence of agreement. McNemar's test compares overall proportions; therefore, any situation in which the two examiners' overall pass/fail proportions are equal (e.g., situations 1, 2 and 3 in Table 1) would yield no difference. Similarly, the paired t-test compares the mean difference between two observations in a single group. It can therefore be non-significant when the average difference between paired values is small, even though the differences between the two observers are large for individual subjects. A seemingly simple way to look at agreement between two sets of ratings is Spearman's rank correlation; however, correlation measures association, not agreement.
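As a sketch of the kappa formula above (not code from the article), Cohen's kappa can be computed directly from a 2 × 2 table of paired pass/fail ratings. The function name and the counts in the usage example are hypothetical, chosen so that both examiners pass 50% of candidates while the observed agreement is 80%:

```python
def cohens_kappa(table):
    """Cohen's kappa for a 2x2 table of paired ratings.

    table[i][j] = number of subjects rated i by examiner A and j by
    examiner B (index 0 = pass, 1 = fail).
    """
    n = sum(sum(row) for row in table)
    # Observed agreement Po: proportion of subjects on the diagonal,
    # i.e. those given the same grade by both examiners.
    p_o = (table[0][0] + table[1][1]) / n
    # Expected agreement Pe: for each grade, the product of the two
    # examiners' marginal proportions, summed over grades.
    row = [sum(r) / n for r in table]                                  # examiner A marginals
    col = [sum(table[i][j] for i in range(2)) / n for j in range(2)]   # examiner B marginals
    p_e = row[0] * col[0] + row[1] * col[1]
    # Kappa = (Po - Pe) / (1 - Pe)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical counts: 40 pass/pass, 40 fail/fail, 10 + 10 disagreements.
# Po = 0.80, Pe = 0.50, so kappa = (0.80 - 0.50) / (1 - 0.50) = 0.60.
print(cohens_kappa([[40, 10], [10, 40]]))
```

Note that a plain proportion-comparing test on the same table would find no difference between the examiners, even though kappa here credits them with only 60% of the achievable beyond-chance agreement.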

Limits of agreement = mean observed difference ± 1.96 × standard deviation of the observed differences. Now consider a hypothetical situation in which the examiners assign grades by tossing a coin: heads = pass, tails = fail (Table 1, situation 2). In this case, one would expect 25% (= 0.50 × 0.50) of the students to receive a "pass" from both examiners and 25% to receive a "fail" from both, for a total "expected" agreement of 50% (= 0.25 + 0.25). Therefore, the observed agreement (80% in situation 1) must be interpreted in light of the fact that 50% agreement was expected purely by chance. The examiners could have improved on chance by at most 50 percentage points (best possible agreement minus chance-expected agreement = 100% − 50% = 50%), but actually achieved only 30 (observed agreement minus chance-expected agreement = 80% − 50% = 30%). The ratio of these two quantities, 0.30/0.50 = 0.60, is the kappa statistic.
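The limits-of-agreement formula above (the Bland-Altman approach for continuous paired measurements) can be sketched as follows; the function name and the paired measurements in the usage example are illustrative, not from the article:

```python
import math

def limits_of_agreement(x, y):
    """Bland-Altman 95% limits of agreement for paired measurements x, y.

    Returns (mean difference, lower limit, upper limit), where the limits
    are mean difference +/- 1.96 * SD of the differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample standard deviation of the differences (n - 1 denominator).
    sd = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d, mean_d - 1.96 * sd, mean_d + 1.96 * sd

# Illustrative paired readings from two hypothetical instruments.
mean_d, lower, upper = limits_of_agreement([10, 12, 11, 13], [9, 11, 12, 12])
print(mean_d, lower, upper)  # mean difference 0.5, limits -1.46 and 2.46
```

Unlike a paired t-test, which only asks whether the mean difference is near zero, the limits describe how far apart the two methods can be for an individual subject, which is what matters when deciding whether one technique can replace another.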