Interactive Explanations by Conflict Resolution via Argumentative Exchanges


Antonio Rago, Hengzhi Li and Francesca Toni

Department of Computing, Imperial College London, UK

{a.rago, hengzhi.li21, ft}@imperial.ac.uk

Abstract

As the field of explainable AI (XAI) is maturing, calls for interactive explanations for (the outputs of) AI models are growing, but the state-of-the-art predominantly focuses on static explanations. In this paper, we focus instead on interactive explanations framed as conflict resolution between agents (i.e. AI models and/or humans) by leveraging computational argumentation. Specifically, we define Argumentative eXchanges (AXs) for dynamically sharing, in multi-agent systems, information harboured in individual agents’ quantitative bipolar argumentation frameworks towards resolving conflicts amongst the agents. We then deploy AXs in the XAI setting in which a machine and a human interact about the machine’s predictions. We identify and assess several theoretical properties characterising AXs that are suitable for XAI. Finally, we instantiate AXs for XAI by defining various agent behaviours, e.g. capturing counterfactual patterns of reasoning in machines and highlighting the effects of cognitive biases in humans. We show experimentally (in a simulated environment) the comparative advantages of these behaviours in terms of conflict resolution, and show that the strongest argument may not always be the most effective.

1 Introduction

The need for interactivity in explanations of the outputs of

AI models has long been called for (Cawsey 1991), and the

recent wave of explainable AI (XAI) has given rise to re-

newed urgency in the matter. In (Miller 2019), it is stated

that explanations need to be social, and thus for machines

to truly explain themselves, they must be interactive, so that

XAI is not just “more AI”, but a human-machine interac-

tion problem. Some have started exploring explanations as

dialogues (Lakkaraju et al. 2022), while several are exploring forms of interactive machine learning for model debugging (Teso et al. 2023). It has also been claimed that it is our

responsibility to create machines which can argue with hu-

mans (Hirsch et al. 2018). However, despite the widespread

acknowledgement of the need for interactivity, typical ap-

proaches to XAI deliver “static” explanations, whether they

be based on feature attribution (e.g. as in (Lundberg and

Lee 2017)), counterfactuals (e.g. as in (Wachter, Mittel-

stadt, and Russell 2017)) or other factors such as prime

implicants (e.g. as in (Shih, Choi, and Darwiche 2018;

Ignatiev, Narodytska, and Marques-Silva 2019)). These ex-

planations typically focus exclusively on aspects of the in-

put deemed responsible (in different ways, according to the

method used) for the outputs of the explained AI model, and

offer little opportunity for interaction. For illustration, con-

sider a recommender system providing positive and negative

evidence drawn from input features as an explanation for a

movie recommendation to a user: this form of explanation

is static in that it does not support interactions between the

system and the user, e.g. if the latter disagrees with the role

of the input features in the explanation towards the recom-

mendation, or with the system’s recommendation itself.

A parallel research direction focuses on argumentative ex-

planations for AI models of various types (see (Cyras et

al. 2021; Vassiliades, Bassiliades, and Patkos 2021) for re-

cent overviews), often motivated by the appeal of argumen-

tation in explanations amongst humans, e.g. as in (Antaki

and Leudar 1992), within the broader view that XAI should

take ﬁndings from the social sciences into account (Miller

2019). Argumentative explanations in XAI employ compu-

tational argumentation (see (Atkinson et al. 2017; Baroni et

al. 2018) for overviews), leveraging upon (existing or novel)

argumentation frameworks, semantics and properties.

Argumentative explanations seem well suited to support

interactivity when the mechanics of AI models can be ab-

stracted away argumentatively (e.g. as for some recom-

mender systems (Rago, Cocarascu, and Toni 2018) or neural

networks (Albini et al. 2020; Potyka 2021)). For illustration,

consider the case of a movie review aggregation system, as

in (Cocarascu, Rago, and Toni 2019), and assume that its

recommendation of a movie x and its reasoning therefor

can be represented by the bipolar argumentation framework

(BAF) (Cayrol and Lagasquie-Schiex 2005) ⟨X , A, S⟩ with

arguments X = {e, m_1, m_2}, attacks A = ∅ and supports S = {(m_1, e), (m_2, m_1)} (see left of Figure 1 for a graphical visualisation). Then, by supporting e, m_1 (statically) conveys

shallow evidence for the output (i.e. movie x being recom-

mended). Argumentative explanations may go beyond the

shallow nature of state-of-the-art explanations by facilitating

dynamic, interactive explanations, e.g. by allowing a human

explainee who does not agree with the machine’s output or

the evidence it provides (in other words, there is a conﬂict

between the machine and the human) to provide feedback (in

Figure 1, by introducing attacks (h_1, e) or (h_2, m_1)), while also allowing for the system to provide additional information (in Figure 1, by introducing the support (m_2, m_1)). The

Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning

Main Track


Figure 1: An argumentative explanation for a review aggregation

system, amounting to the interactions between a machine and a

human sharing their reasoning following a recommendation for x.

resulting interactive explanations can be seen as a conﬂict

resolution process, e.g. as in (Raymond, Gunes, and Pro-

rok 2020). Existing approaches focus on speciﬁc settings.

Also, although the need for studying properties of explana-

tions is well-acknowledged (e.g. see (Sokol and Flach 2020;

Amgoud and Ben-Naim 2022)), to the best of our knowl-

edge properties of interactive explanations, e.g. relating to

how well they represent and resolve any conﬂicts, have been

neglected to date. We ﬁll these gaps by providing a gen-

eral argumentative framework for interactive explanations as

conﬂict resolution, as well as properties and instantiations

thereof, backed by simulated experiments. Speciﬁcally:

• We deﬁne Argumentative eXchanges (AXs, §4), in which

agents, whose reasoning is represented as quantitative

bipolar argumentation frameworks (QBAFs) under grad-

ual semantics (Baroni, Rago, and Toni 2018), contribute

attacks/supports between arguments, to interactively ob-

tain BAFs as in Figure 1 towards resolving conﬂicts on the

agents’ stances on explananda. We use QBAFs, which

are BAFs where arguments are equipped with intrinsic

strengths, as they are well suited to modelling private

viewpoints, public conﬂicts, and resolutions, as well as

cognitive biases, which are important in XAI (Bertrand et

al. 2022). We use gradual semantics to capture individual

evaluations of stance, taking biases into account.

• We identify and assess several properties (§5) which AXs

may satisfy to be rendered suitable in an XAI setting.

These properties concern, amongst others, the representa-

tion and possible resolution of conﬂicts within interactive

explanations drawn from AXs.

• We instantiate AXs to the standard XAI setting of two

agents, a machine and a human, and deﬁne a catalogue

of agent behaviours for this setting (§6). We experiment

in a simulated environment (§7) with the behaviours, ex-

ploring ﬁve hypotheses about conﬂict resolution and the

accuracy of contributed arguments towards it, noting that

the strongest argument is not always the most effective.

2 Related Work

There is a vast literature on multi-agent argumentation,

e.g. recently, (Raymond, Gunes, and Prorok 2020) deﬁne

an argumentation-based human-agent architecture integrat-

ing regulatory compliance, suitable for human-agent path

deconﬂiction and based on abstract argumentation (Dung

1995); (Panisson, McBurney, and Bordini 2021) develop a

multi-agent framework whereby agents can exchange information to jointly reason with argument schemes and critical questions; and (de Tarlé, Bonzon, and Maudet 2022) let

agents debate using a shared abstract argumentation frame-

work. These works mostly focus on narrow settings us-

ing structured and abstract argumentation under extension-

based semantics, and mostly ignore the XAI angle ((Ray-

mond, Gunes, and Prorok 2020; Calegari et al. 2022) are

exceptions). Instead, with XAI as our core drive, we fo-

cus on (quantitative) bipolar argumentation under gradual

semantics, motivated by their usefulness in several XAI

approaches (e.g. in (Cocarascu, Rago, and Toni 2019;

Albini et al. 2020; Potyka 2021; Rago, Baroni, and Toni

2022)). Other works consider (Q)BAFs in multi-agent ar-

gumentation, e.g. (Kontarinis and Toni 2015), but not for

XAI. We adapt some aspects of these works on multi-agent

argumentation approaches, speciﬁcally the idea of agents

contributing attacks or supports (rather than arguments) to

debates (Kontarinis and Toni 2015) and the restriction to

trees rooted at explananda under gradual semantics from (de Tarlé, Bonzon, and Maudet 2022). We leave other interesting aspects they cover to future work, notably handling maliciousness (Kontarinis and Toni 2015), regulatory compliance (Raymond, Gunes, and Prorok 2020), and defining suitable utterances (Panisson, McBurney, and Bordini 2021).

Several approaches to obtain argumentative explanations

for AI models exist (see (Cyras et al. 2021; Vassiliades,

Bassiliades, and Patkos 2021) for overviews), often rely-

ing upon argumentative abstractions of the models. Our ap-

proach is orthogonal, as we assume that suitable QBAF ab-

stractions of models and humans exist, focusing instead on

formalising and validating interactive explanations.

Our AXs and agent behaviours are designed to resolve

conﬂicts and are thus related to works on conﬂict res-

olution, e.g. (Black and Atkinson 2011; Fan and Toni

2012a), or centered around conﬂicts, e.g. (Pisano et al.

2022), but these works have different purposes to inter-

active XAI and use forms of argumentation other than

(Q)BAFs under gradual semantics. Our agent behaviours

can also be seen as attempts at persuasion in that they

aim at selecting the most efficacious arguments for changing the minds of the other agents, as e.g. in (Fan and Toni

2012b; Hunter 2018; Calegari, Riveret, and Sartor 2021;

Donadello et al. 2022). Further, our AXs can be seen

as supporting forms of information-seeking and inquiry, as

they allow agents to share information, and are thus re-

lated to work in this spectrum (e.g. (Black and Hunter

2007; Fan and Toni 2015a)). Our framework however dif-

fers from general-purpose forms of argumentation-based

persuasion/information-seeking/inquiry in its focus on inter-

active XAI supported by (Q)BAFs under gradual semantics.

The importance of machine handling of information from

humans when explaining outputs, rather than the humans

exclusively receiving information, has been highlighted e.g.

for recommender systems (Balog, Radlinski, and Arakelyan

2019; Rago et al. 2020) and debugging (Lertvittayakumjorn,

Specia, and Toni 2020) or other human-in-the-loop methods


(see (Wu et al. 2022) for a survey). Differently from these

works, we capture two-way interactions.

Some works advocate interactivity in XAI (Paulino-

Passos and Toni 2022), but do not make concrete sugges-

tions on how to support it. Other works advocate dialogues

for XAI (Lakkaraju et al. 2022), but it is unclear how these

can be generated. We contribute to grounding the problem

of generating interactive explanations by a computational

framework implemented in a simulated environment.

3 Preliminaries

A BAF (Cayrol and Lagasquie-Schiex 2005) is a triple

⟨X , A, S⟩ such that X is a ﬁnite set (whose elements are ar-

guments), A ⊆ X ×X (called the attack relation) and S ⊆ X ×

X (called the support relation), where A and S are disjoint.

A QBAF (Baroni et al. 2015) is a quadruple ⟨X , A, S, τ ⟩

such that ⟨X , A, S⟩ is a BAF and τ ∶ X → I ascribes base

scores to arguments; these are values in some given I rep-

resenting the arguments’ intrinsic strengths. Given BAF

⟨X , A, S⟩ or QBAF ⟨X , A, S, τ ⟩, for any a ∈ X , we call

{b ∈ X ∣(b, a) ∈ A} the attackers of a and {b ∈ X ∣(b, a) ∈ S}

the supporters of a.

We make use of the following notation: given BAFs B = ⟨X, A, S⟩, B′ = ⟨X′, A′, S′⟩, we say that B ⊑ B′ iff X ⊆ X′, A ⊆ A′ and S ⊆ S′; also, we use B′ ∖ B to denote ⟨X′ ∖ X, A′ ∖ A, S′ ∖ S⟩. Similarly, given QBAFs Q = ⟨X, A, S, τ⟩, Q′ = ⟨X′, A′, S′, τ′⟩, we say that Q ⊑ Q′ iff X ⊆ X′, A ⊆ A′, S ⊆ S′ and ∀a ∈ X ∩ X′ (which, by the other conditions, is exactly X), it holds that τ′(a) = τ(a). Also, we use Q′ ∖ Q to denote ⟨X′ ∖ X, A′ ∖ A, S′ ∖ S, τ″⟩, where τ″ is τ′ restricted to the arguments in X′ ∖ X.

Given a BAF B and a QBAF Q = ⟨X, A, S, τ⟩, with an abuse of notation we use B ⊑ Q to stand for B ⊑ ⟨X, A, S⟩ and Q ⊑ B to stand for ⟨X, A, S⟩ ⊑ B. For any BAFs or QBAFs F, F′, we say that F = F′ iff F ⊑ F′ and F′ ⊑ F, and F ⊏ F′ iff F ⊑ F′ but F ≠ F′.
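The containment and difference notation above can be made concrete with a small sketch. It is illustrative only: the class BAF and the helpers is_sub_baf, is_sub_qbaf and baf_difference are hypothetical names introduced here (they do not come from the paper), arguments are assumed to be plain strings, and base scores are assumed to be stored in dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BAF:
    """A bipolar argumentation framework <X, A, S> (hypothetical helper class)."""
    args: frozenset      # X: the arguments
    attacks: frozenset   # A: (attacker, target) pairs
    supports: frozenset  # S: (supporter, target) pairs


def is_sub_baf(b, b_prime):
    """B ⊑ B': each component of B is included in the matching component of B'."""
    return (b.args <= b_prime.args
            and b.attacks <= b_prime.attacks
            and b.supports <= b_prime.supports)


def is_sub_qbaf(b, tau, b_prime, tau_prime):
    """Q ⊑ Q': containment of the BAF parts plus equal base scores on shared arguments."""
    return is_sub_baf(b, b_prime) and all(tau_prime[a] == tau[a] for a in b.args)


def baf_difference(b_prime, b):
    """B' ∖ B, taken component-wise; the result need not be a well-formed BAF."""
    return BAF(b_prime.args - b.args,
               b_prime.attacks - b.attacks,
               b_prime.supports - b.supports)


# Two small frameworks: b1 extends b0 with one attack and one support on e.
b0 = BAF(frozenset({"e"}), frozenset(), frozenset())
b1 = BAF(frozenset({"e", "a", "b"}),
         frozenset({("a", "e")}),
         frozenset({("b", "e")}))

print(is_sub_baf(b0, b1))            # containment check
print(baf_difference(b1, b0).args)   # arguments that are new in b1
```

Storing the relations as sets of pairs keeps ⊑ a direct translation of the set inclusions in the text.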

Both BAFs and QBAFs may be equipped with a gradual

semantics σ, e.g. as in (Baroni et al. 2017) for BAFs and as

in (Potyka 2018) for QBAFs (see (Baroni, Rago, and Toni

2019) for an overview), ascribing to arguments a dialectical

strength from within some given I (which, in the case of

QBAFs, is typically the same as for base scores): thus, for a

given BAF or QBAF F and argument a, σ(F , a) ∈ I.

Inspired by (de Tarlé, Bonzon, and Maudet 2022)’s use

of (abstract) argumentation frameworks (Dung 1995) of a

restricted kind (amounting to trees rooted with a single ar-

gument of focus), we use restricted BAFs and QBAFs:

Definition 1. Let F be a BAF ⟨X, A, S⟩ or QBAF ⟨X, A, S, τ⟩. For any arguments a, b ∈ X, let a path from a to b be defined as (c_0, c_1), . . . , (c_{n−1}, c_n) for some n > 0 (referred to as the length of the path) where c_0 = a, c_n = b and, for any 1 ≤ i ≤ n, (c_{i−1}, c_i) ∈ A ∪ S. Then, for e ∈ X, F is a BAF/QBAF (resp.) for e iff i) ∄(e, a) ∈ A ∪ S; ii) ∀a ∈ X ∖ {e}, there is a path from a to e; and iii) ∄a ∈ X with a path from a to a.

(Footnote on the difference notation: B′ ∖ B and Q′ ∖ Q may not be BAFs and QBAFs, resp., as they may include no arguments but non-empty attack/support relations.)

(Footnote on paths: later, we will use paths(a, b) to indicate the set of all paths between arguments a and b, leaving the (Q)BAF implicit, and use ∣p∣ for the length of path p. Also, we may see paths as sets of pairs.)

Here e plays the role of an explanandum.

When inter-

preting the BAF/QBAF as a graph (with arguments as nodes

and attacks/supports as edges), i) amounts to sanctioning

that e admits no outgoing edges, ii) that e is reachable from

any other node, and iii) that there are no cycles in the graph

(and thus, when combining the three requirements, the graph

is a multi-tree rooted at e). The restrictions in Definition 1 impose that every argument in a BAF/QBAF for e is “related” to e, in the spirit of (Fan and Toni 2015b).
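The three conditions of Definition 1 translate directly into graph checks. The sketch below is one possible way to test them, under the assumptions that arguments are strings and relations are sets of (source, target) pairs; the function name is_framework_for is hypothetical, and the disjointness of attacks and supports is not checked here.

```python
def is_framework_for(args, attacks, supports, e):
    """Check conditions i)-iii) of Definition 1 for explanandum e (illustrative sketch)."""
    edges = set(attacks) | set(supports)
    if e not in args or any(s not in args or d not in args for s, d in edges):
        return False

    # i) the explanandum has no outgoing attack or support
    if any(s == e for s, _ in edges):
        return False

    incoming = {a: set() for a in args}
    outgoing = {a: set() for a in args}
    for s, d in edges:
        incoming[d].add(s)
        outgoing[s].add(d)

    # ii) every other argument has a path to e: walk the edges backwards from e
    seen, stack = {e}, [e]
    while stack:
        node = stack.pop()
        for s in incoming[node]:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    if seen != set(args):
        return False

    # iii) no cycles: repeatedly strip arguments whose outgoing edges are all gone
    remaining = {a: len(outgoing[a]) for a in args}
    ready = [a for a, k in remaining.items() if k == 0]
    stripped = 0
    while ready:
        node = ready.pop()
        stripped += 1
        for s in incoming[node]:
            remaining[s] -= 1
            if remaining[s] == 0:
                ready.append(s)
    return stripped == len(args)


# A four-argument framework rooted at e: a attacks e, b supports e, c supports a.
print(is_framework_for({"e", "a", "b", "c"},
                       {("a", "e")},
                       {("b", "e"), ("c", "a")},
                       "e"))
```

An argument with no edges at all fails condition ii), which is how "floating" arguments are excluded.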

In all illustrations (and in some of the experiments in §7)

we use the DF-QuAD gradual semantics (Rago et al. 2016)

for QBAFs for explananda. This uses I = [0, 1] and:

• a strength aggregation function Σ such that Σ(()) = 0 and, for v_1, . . . , v_n ∈ [0, 1] (n ≥ 1), if n = 1 then Σ((v_1)) = v_1, if n = 2 then Σ((v_1, v_2)) = v_1 + v_2 − v_1 ⋅ v_2, and if n > 2 then Σ((v_1, . . . , v_n)) = Σ(Σ((v_1, . . . , v_{n−1})), v_n);
• a combination function c such that, for v^0, v^−, v^+ ∈ [0, 1]: if v^− ≥ v^+ then c(v^0, v^−, v^+) = v^0 − v^0 ⋅ ∣v^+ − v^−∣ and if v^− < v^+, then c(v^0, v^−, v^+) = v^0 + (1 − v^0) ⋅ ∣v^+ − v^−∣.

Then, for F = ⟨X, A, S, τ⟩ and any a ∈ X, given A(a) = {b ∈ X ∣ (b, a) ∈ A} and S(a) = {b ∈ X ∣ (b, a) ∈ S}, σ(F, a) = c(τ(a), Σ(σ(F, A(a))), Σ(σ(F, S(a)))) where, for any S ⊆ X, σ(F, S) = (σ(F, a_1), . . . , σ(F, a_k)) for (a_1, . . . , a_k), an arbitrary permutation of S.
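The DF-QuAD definition above can be sketched as a short recursive evaluation. This is a minimal illustration rather than a reference implementation: the function names aggregate, combine and strength are chosen here, the QBAF is assumed to be acyclic (as for QBAFs for explananda), and the sample base scores are those given for the machine agent in Example 1, with the base score 0.6 for c taken as an assumption consistent with its reported strength.

```python
def aggregate(values):
    """Strength aggregation Σ: Σ(()) = 0, and each further value v maps total to total + v - total*v."""
    total = 0.0
    for v in values:
        total = total + v - total * v
    return total


def combine(v0, v_minus, v_plus):
    """Combination function c applied to a base score and aggregated attack/support strengths."""
    if v_minus >= v_plus:
        return v0 - v0 * abs(v_plus - v_minus)
    return v0 + (1 - v0) * abs(v_plus - v_minus)


def strength(arg, attacks, supports, tau):
    """σ(F, arg), computed recursively; terminates because the QBAF is assumed acyclic."""
    v_minus = aggregate(strength(b, attacks, supports, tau)
                        for (b, a) in attacks if a == arg)
    v_plus = aggregate(strength(b, attacks, supports, tau)
                       for (b, a) in supports if a == arg)
    return combine(tau[arg], v_minus, v_plus)


# Sample QBAF: a attacks e, b supports e, c supports a.
attacks = {("a", "e")}
supports = {("b", "e"), ("c", "a")}
tau = {"e": 0.7, "a": 0.8, "b": 0.4, "c": 0.6}

for argument in ("c", "b", "a", "e"):
    print(argument, round(strength(argument, attacks, supports, tau), 3))
```

Because the aggregation is commutative and associative, the order in which attackers or supporters are visited does not affect the result beyond floating-point rounding.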

4 Argumentative Exchanges (AXs)

We deﬁne AXs as a general framework in which agents ar-

gue with the goal of conﬂict resolution. The conﬂicts may

arise when agents hold different stances on explananda. To

model these settings, we rely upon QBAFs for explananda

as abstractions of agents’ internals. Speciﬁcally, we as-

sume that each agent α is equipped with a QBAF and a

gradual semantics (σ): the former provides an abstraction

of the agent’s knowledge/reasoning, with the base score

(τ) representing biases over arguments; the latter can be

seen as an evaluation method for arguments. To reﬂect the

use of QBAFs in our multi-agent explanatory setting, we

adopt this terminology (of biases and evaluation methods)

in the remainder. Intuitively, biases and evaluations repre-

sent agents’ views on the quality of arguments before and

after, resp., other arguments are considered. For illustra-

tion, in the setting of Figure 1, biases may result from ag-

gregations of votes from reviews for the machine and from

personal views for the human, and evaluation methods allow

the computation of the machine/human stance on the recom-

mendation during the interaction (as in (Cocarascu, Rago,

and Toni 2019)). Agents may choose their own evaluation

range for measuring biases/evaluating arguments.

Definition 2. An evaluation range I is a set equipped with a pre-order ≤ (where, as usual x < y denotes x ≤ y and y ≰ x) such that I = I^+ ∪ I^0 ∪ I^− where I^+, I^0 and I^− are disjoint and for any i ∈ I^+, j ∈ I^0 and k ∈ I^−, k < j < i. We refer to I^+, I^0 and I^−, resp., as positive, neutral and negative evaluations.

Other terms to denote the “focal point” of BAFs/QBAFs could

be used. We use explanandum given our focus on the XAI setting.


Figure 2: AX for explanandum e amongst agents AG = {µ, η},

with the exchange BAF representing an interactive explanation.

White (grey) boxes represent contributions (learnt relations, resp.).

Thus, an evaluation range discretises the space of possible

evaluations into three categories.

Definition 3. A private triple for an agent α and an explanandum e is (I_α, Q_α, σ_α) where:
• I_α = I_α^+ ∪ I_α^0 ∪ I_α^− is an evaluation range, referred to as α’s private evaluation range;
• Q_α = ⟨X_α, A_α, S_α, τ_α⟩ is a QBAF for e, referred to as α’s private QBAF, such that ∀a ∈ X_α, τ_α(a) ∈ I_α;
• σ_α is an evaluation method, referred to as α’s private evaluation method, such that, for any QBAF Q = ⟨X, A, S, τ⟩ (τ ∶ X → I_α) and, for any a ∈ X, σ_α(Q, a) ∈ I_α.

Agents’ stances on explananda are determined by their

private biases and evaluation methods.

Definition 4. Let (I_α, Q_α, σ_α) be a private triple for agent α (for some e), with Q_α = ⟨X_α, A_α, S_α, τ_α⟩. Then, for a ∈ X_α, α’s stance on a is defined, for ∗ ∈ {−, 0, +}, as Σ_α(Q_α, a) = ∗ iff σ_α(Q_α, a) ∈ I_α^∗.

Note that a may be the explanandum or any other argu-

ment (namely, an agent may hold a stance on any arguments

in its private QBAF). Also, abusing notation, we will lift the

pre-order over elements of I to stances, whereby − < 0 < +.

In general, agents may hold different evaluation ranges,

biases, QBAFs and evaluation methods, but the discretisa-

tion of the agents’ evaluation ranges to obtain their stances

allows for direct comparison across agents.

(Footnote on evaluation ranges: we choose three discrete values only for simplicity. This may mean that very close values, e.g. 0.49 and 0.51, belong to different categories. We leave to future work the analysis of further value categorisations, e.g. a distinction between strongly and mildly positive values or comfort zones (de Tarlé, Bonzon, and Maudet 2022).)

Example 1. Consider a machine agent µ and a human agent η equipped resp. with private triples (I_µ, Q_µ, σ_µ) and (I_η, Q_η, σ_η), with Q_µ = ⟨X_µ, A_µ, S_µ, τ_µ⟩, Q_η = ⟨X_η, A_η, S_η, τ_η⟩ QBAFs for the same e and:
• I_µ^− = I_η^− = [0, 0.5), I_µ^0 = I_η^0 = {0.5} and I_µ^+ = I_η^+ = (0.5, 1];
• X_µ = {e, a, b, c}, A_µ = {(a, e)}, S_µ = {(b, e), (c, a)} (represented graphically on the top left of Figure 2) and τ_µ(e) = 0.7, τ_µ(a) = 0.8, τ_µ(b) = 0.4, and τ_µ(c) = 0.6;
• X_η = {e, a, b, d, f}, A_η = {(a, e), (d, a)}, S_η = {(b, e), (f, b)} (represented on the top right of Figure 2) and τ_η(e) = 0.6, τ_η(a) = 0.8, τ_η(b) = 0.2, τ_η(d) = 0.6 and τ_η(f) = 0.5;
• σ_µ is the DF-QuAD semantics, giving σ_µ(Q_µ, e) = 0.336, σ_µ(Q_µ, a) = 0.92, σ_µ(Q_µ, b) = 0.4, and σ_µ(Q_µ, c) = 0.6;
• σ_η is also DF-QuAD, giving σ_η(Q_η, e) = 0.712, σ_η(Q_η, a) = 0.32, σ_η(Q_η, b) = 0.6, σ_η(Q_η, d) = 0.6, σ_η(Q_η, f) = 0.5.
Thus, the machine and human agents hold entirely different views on the arguments (based on their private QBAFs and their evaluations) and Σ_µ(Q_µ, e) = − while Σ_η(Q_η, e) = +. Thus, there is a conflict between the agents’ stances on e.
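The step from evaluations to stances in Example 1 can be sketched as a simple discretisation. The helper names stance and in_conflict are hypothetical, and the sketch assumes the specific evaluation range used in the example (negative below 0.5, neutral at exactly 0.5, positive above 0.5) rather than an arbitrary pre-ordered range.

```python
def stance(evaluation, neutral=0.5):
    """Map an evaluation in [0, 1] to '-', '0' or '+' using the ranges of Example 1."""
    if evaluation > neutral:
        return "+"
    if evaluation < neutral:
        return "-"
    return "0"


def in_conflict(stances):
    """Agents are in conflict on an argument when their stances on it are not all equal."""
    return len(set(stances)) > 1


# Evaluations of the explanandum e reported in Example 1.
machine_stance = stance(0.336)
human_stance = stance(0.712)
print(machine_stance, human_stance, in_conflict([machine_stance, human_stance]))
```

Only the stances are compared across agents, so the two agents could use different ranges or evaluation methods without changing the conflict test.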

We deﬁne AXs so that they can provide the ground to

identify and resolve conﬂicts in stance amongst agents.

Definition 5. An Argumentative eXchange (AX) for an explanandum e amongst agents AG (where ∣AG∣ ≥ 2) is a tuple ⟨B^0, . . . , B^n, AG^0, . . . , AG^n, C⟩ where n > 0 and:
• for every timestep 0 ≤ t ≤ n:
– B^t = ⟨X^t, A^t, S^t⟩ is a BAF for e, called the exchange BAF at t, such that X^0 = {e}, A^0 = S^0 = ∅ and for t > 0, B^{t−1} ⊑ B^t;
– AG^t is a set of private triples (I_α^t, Q_α^t, σ_α^t) for e, one for each agent α ∈ AG, where, for t > 0, I_α^{t−1} = I_α^t, σ_α^{t−1} = σ_α^t, Q_α^{t−1} ⊑ Q_α^t and Q_α^t ∖ Q_α^{t−1} ⊑ B^t ∖ B^{t−1};
• C, referred to as the contributor mapping, is a mapping such that, for every (a, b) ∈ A^n ∪ S^n: C((a, b)) = (α, t) with 0 < t ≤ n and α ∈ AG.
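Some of the structural conditions in Definition 5 can be checked mechanically. The sketch below covers only the exchange BAFs and the contributor mapping: it does not check that each exchange BAF is a BAF for e, nor any condition on the private triples. The function name check_exchange and the encoding (each BAF as a tuple of three sets, the mapping as a dictionary) are assumptions made for illustration.

```python
def check_exchange(bafs, contributors, agents, e):
    """Partial check of Definition 5 on exchange BAFs B^0..B^n and the mapping C."""
    n = len(bafs) - 1
    if n < 1:
        return False

    # B^0 contains only the explanandum and no relations
    args0, attacks0, supports0 = bafs[0]
    if args0 != {e} or attacks0 or supports0:
        return False

    # B^{t-1} ⊑ B^t: nothing contributed earlier is ever withdrawn
    for previous, current in zip(bafs, bafs[1:]):
        if not all(p <= c for p, c in zip(previous, current)):
            return False

    # every attack/support in the final BAF has a contributor (agent, timestep)
    _, attacks_n, supports_n = bafs[-1]
    for edge in attacks_n | supports_n:
        if edge not in contributors:
            return False
        agent, t = contributors[edge]
        if agent not in agents or not (0 < t <= n):
            return False
    return True


# A three-step exchange between a machine "mu" and a human "eta".
bafs = [
    ({"e"}, set(), set()),
    ({"e", "a", "b"}, {("a", "e")}, {("b", "e")}),
    ({"e", "a", "b", "c"}, {("a", "e")}, {("b", "e"), ("c", "a")}),
]
contributors = {("a", "e"): ("mu", 1), ("b", "e"): ("eta", 1), ("c", "a"): ("mu", 2)}
print(check_exchange(bafs, contributors, {"mu", "eta"}, "e"))
```

Using a dictionary for C makes the "each element is contributed once by exactly one agent" reading automatic, since a key can carry only one (agent, timestep) pair.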

Agents’ private triples thus change over time during AXs,

with several restrictions, in particular that agents do not

change their evaluation ranges and methods, and that their

biases on known arguments propagate across timesteps (but

note that Deﬁnition 5 does not impose any restriction on

the agents’ private triples at timestep 0, other than they

are all for e). The restriction that all BAFs/QBAFs in

exchanges are for the explanandum means that all contributed attacks and supports (and underlying arguments)

are “relevant” to the explanandum. Implicitly, while we

do not assume that agents share arguments, we assume

that they agree on an underpinning ‘lingua franca’, so that,

in particular, if two agents are both aware of two argu-

ments, they must agree on any attack or support between

them, e.g. it cannot be that an argument attacks another

argument for one agent but not for another (in line with

other works, e.g. (de Tarlé, Bonzon, and Maudet 2022;

Raymond, Gunes, and Prorok 2020)). We leave to future

work the study of the impact of this assumption in practice

when AXs take place between machines and humans.

During AXs, agents contribute elements of the at-

tack/support relations, thus “arguing” with one another.

These elements cannot be withdrawn once contributed, in


line with human practices, and, by deﬁnition of C, each ele-

ment is said once by exactly one agent, thus avoiding repe-

titions that may occur in human exchanges. Note that we

do not require that all agents contribute something to an AX, namely it may be that {α ∣ C((a, b)) = (α, t), (a, b) ∈ A^n ∪ S^n} ⊂ AG. Also, we do not force agents to contribute something at every timestep (i.e. it may be the case that B^{t−1} = B^t at some timestep t). Further, while the definition of AX does not impose that agents are truthful, from now on we will focus on truthful agents only and thus assume that if (a, b) ∈ A^n or S^n and C((a, b)) = (α, t) (with 0 < t ≤ n), then, resp., (a, b) ∈ A_α^{t−1} or S_α^{t−1}.

In the remainder, we may denote the private triple (I_α^t, Q_α^t, σ_α^t) as α^t and the stance Σ_α(Q_α^t, a) as Σ_α^t(a).

Example 2. An AX amongst {µ, η} from Example 1 may be ⟨B^0, B^1, AG^0, AG^1, C⟩ such that (see top row of Figure 2):
• B^0 = ⟨{e}, ∅, ∅⟩, B^1 = ⟨{e, a, b}, {(a, e)}, {(b, e)}⟩;
• µ^0 = µ^1 and η^0 = η^1 are as in Example 1;
• C((a, e)) = (µ, 1) and C((b, e)) = (η, 1), i.e. µ and η contribute, resp., attack (a, e) and support (b, e) at 1.

Here, each agent contributes a single attack or support jus-

tifying their stances (negative for µ and positive for η), but,

in general, multiple agents may contribute multiple relations

at single timesteps, or no relations at all.

When contributed attacks/supports are new to agents, they

may (rote) learn them, with the arguments they introduce.

Definition 6. Let ⟨B^0, . . . , B^n, AG^0, . . . , AG^n, C⟩ be an AX amongst agents AG. Then, for any α ∈ AG, with private tuples (I_α^0, Q_α^0, σ_α^0), . . . , (I_α^n, Q_α^n, σ_α^n):
• for any 0 < t ≤ n, for ⟨X_α, A_α, S_α, τ_α⟩ = Q_α^t ∖ Q_α^{t−1}, X_α, A_α, and S_α are, resp., the learnt arguments, attacks, and supports by α at timestep t;
• for ⟨X_α, A_α, S_α, τ_α⟩ = Q_α^n ∖ Q_α^0, X_α, A_α, and S_α are, resp., the learnt arguments, attacks, and supports by α.

Note that, by deﬁnition of AXs, all learnt arguments, at-

tacks and supports are from the (corresponding) exchange

BAFs. Note also that in Example 2 neither agent learns any-

thing, as indeed each contributed an attack/support already

present in the other agent’s private QBAF.

Example 3. Let us extend the AX from Example 2 to obtain ⟨B^0, B^1, B^2, AG^0, AG^1, AG^2, C⟩ such that (see the top two rows of Figure 2):
• B^2 = ⟨{e, a, b, c}, {(a, e)}, {(b, e), (c, a)}⟩;
• µ^2 = µ^1; η^2 is such that Q_η^2 ⊐ Q_η^1 where X_η^2 = X_η^1 ∪ {c}, A_η^2 = A_η^1, S_η^2 = S_η^1 ∪ {(c, a)} and τ_η^2(c) = 0.2;
• C((c, a)) = (µ, 2), namely µ contributes the support (c, a) in B^2 at timestep 2.

We will impose that any attack/support which is added

to the exchange BAF by an agent is learnt by the other

agents, alongside any new arguments introduced by those attacks/supports. Thus, for any α ∈ AG and t > 0, B^t ∖ B^{t−1} ⊑ Q_α^t. However, agents have a choice on their biases on the learnt arguments. These biases could reflect, e.g., their trust in the contributing agents or the intrinsic quality of the arguments. Depending on these biases, learnt attacks and supports may influence the agents’ stances on the explanandum differently. For illustration, in Example 3, η opted for a low bias (0.2) on the learnt argument c, resulting in σ_η(Q_η^2, e) = 0.648, σ_η(Q_η^2, a) = 0.48 and σ_η(Q_η^2, c) = 0.2, and thus Σ_η^2(e) = + still, as in Examples 1, 2. If, instead, η had chosen a high bias on the new argument, e.g. τ_η^2(c) = 1, this would have given σ_η(Q_η^2, e) = 0.432, σ_η(Q_η^2, a) = 0.88 and σ_η(Q_η^2, c) = 1, leading to Σ_η^2(e) = −, thus resolving the

conﬂict. This illustration shows that learnt attacks, supports

and arguments may ﬁll gaps, change agents’ stances on ex-

plananda and pave the way to the resolution of conﬂicts.
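The effect of the bias on a learnt argument can be reproduced numerically. The sketch below re-evaluates the human's private QBAF from Example 1 after it learns the support (c, a), once with a low and once with a high base score for c. The function dfquad is a compact, illustrative re-statement of the DF-QuAD equations from the preliminaries, not the authors' code, and it assumes an acyclic QBAF.

```python
def dfquad(arg, attacks, supports, tau):
    """Compact DF-QuAD evaluation for acyclic QBAFs (illustrative sketch)."""
    def aggregate(values):
        total = 0.0
        for v in values:
            total = total + v - total * v
        return total

    v_minus = aggregate(dfquad(b, attacks, supports, tau) for (b, a) in attacks if a == arg)
    v_plus = aggregate(dfquad(b, attacks, supports, tau) for (b, a) in supports if a == arg)
    v0 = tau[arg]
    if v_minus >= v_plus:
        return v0 - v0 * abs(v_plus - v_minus)
    return v0 + (1 - v0) * abs(v_plus - v_minus)


# The human's private QBAF after learning the support (c, a) contributed by the machine.
attacks = {("a", "e"), ("d", "a")}
supports = {("b", "e"), ("f", "b"), ("c", "a")}
base = {"e": 0.6, "a": 0.8, "b": 0.2, "d": 0.6, "f": 0.5}

low_bias = dfquad("e", attacks, supports, {**base, "c": 0.2})
high_bias = dfquad("e", attacks, supports, {**base, "c": 1.0})

# With the ranges of Example 1, values above 0.5 are positive and values below are negative.
print(round(low_bias, 3), round(high_bias, 3))
```

The same learnt support leaves the human's stance on e positive under the low bias and flips it to negative under the high bias, which is the resolution described in the text.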

Definition 7. Let E = ⟨B^0, . . . , B^n, AG^0, . . . , AG^n, C⟩ be an AX for explanandum e amongst agents AG such that Σ_α^0(e) ≠ Σ_β^0(e) for some α, β ∈ AG. Then:
• E is resolved at timestep t, for some 0 < t ≤ n, iff ∀α, β ∈ AG, Σ_α^t(e) = Σ_β^t(e), and is unresolved at t otherwise;
• E is resolved iff it is resolved at timestep n and it is unresolved at every timestep 0 ≤ t < n;
• E is unresolved iff it is unresolved at every 0 < t ≤ n.
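Definition 7 can be read as a classification of the sequence of shared stances on the explanandum. The following sketch assumes the stances are already available as one dictionary per timestep (agent name to '-', '0' or '+'); the function name resolution_status and the extra label "neither" (for sequences that agree before the final timestep, which fit neither notion in the definition) are assumptions made here.

```python
def resolution_status(stances_by_timestep):
    """Classify an AX from the agents' stances on e at timesteps 0..n (illustrative sketch)."""
    def agreed(stances):
        return len(set(stances.values())) == 1

    n = len(stances_by_timestep) - 1
    if n < 1 or agreed(stances_by_timestep[0]):
        raise ValueError("Definition 7 assumes n > 0 and a conflict at timestep 0")

    resolved_at = [t for t in range(1, n + 1) if agreed(stances_by_timestep[t])]
    if resolved_at == [n]:
        return "resolved"      # agreement reached at n and at no earlier timestep
    if not resolved_at:
        return "unresolved"    # no agreement at any timestep after 0
    return "neither"           # agreement occurred before the final timestep


# The human keeps a positive stance throughout: the exchange stays unresolved.
kept = [{"mu": "-", "eta": "+"}, {"mu": "-", "eta": "+"}, {"mu": "-", "eta": "+"}]
# The human's stance becomes negative at timestep 2: the exchange is resolved.
flipped = [{"mu": "-", "eta": "+"}, {"mu": "-", "eta": "+"}, {"mu": "-", "eta": "-"}]
print(resolution_status(kept), resolution_status(flipped))
```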

Thus, a resolved AX starts with a conﬂict between at least

two agents and ends when no conﬂicts amongst any of the

agents exist or when the agents give up on trying to ﬁnd

a resolution. Practically, AXs may be governed by a turn-making function π ∶ Z^+ → 2^AG determining which agents

should contribute at any timestep. Then, an AX may be

deemed to be unresolved if, for example, all agents decide,

when their turn comes, against contributing.

Note that, while agents’ biases and evaluations are kept

private during AXs, we assume that agents share their

stances on the explanandum, so that they are aware of

whether the underpinning conﬂicts are resolved. Agents’

stances, when ascertaining whether an AX is resolved, are

evaluated internally by the agents, without any shared evaluation of the exchange BAF, unlike, e.g. in (de Tarlé, Bonzon,

and Maudet 2022) and other works we reviewed in §2.

Finally, note that our deﬁnition of AX is neutral as to the

role of agents therein, allowing in particular that agents have

symmetrical roles (which is natural, e.g., for inquiry) as well

as asymmetrical roles (which is natural, e.g., when machines

explain to humans: this will be our focus from §5).

5 Explanatory Properties of AXs

Here we focus on singling out desirable properties that AXs may need to satisfy to support interactive XAI. Let us assume as given an AX E = ⟨B^0, . . . , B^n, AG^0, . . . , AG^n, C⟩ for e as in Definition 5. The first three properties impose basic requirements on AXs so that they result in fitting explanations.

Property 1. E satisfies connectedness iff for any 0 ≤ t ≤ n, if ∣X^t∣ > 1 then ∀a ∈ X^t, ∃b ∈ X^t such that (a, b) ∈ A^t ∪ S^t or (b, a) ∈ A^t ∪ S^t.

Basically, connectedness imposes that there should be no

ﬂoating arguments and no “detours” in the exchange BAFs,

at any stage during the AX. It is linked to directional con-

nectedness in (Cyras, Kampik, and Weng 2022). A violation


of this property would lead to counter-intuitive (interactive)

explanations, with agents seemingly “off-topic”.

Property 2. E satisfies acyclicity iff for any 0 ≤ t ≤ n, ∄a ∈ X^t such that paths(a, a) ≠ ∅.

Acyclicity ensures that all reasoning is directed towards

the explanandum in AXs. A violation of this property may

lead to seemingly non-sensical (interactive) explanations.

Property 3. E satisfies contributor irrelevance iff for any AX for e ⟨B′^0, . . . , B′^n, AG′^0, . . . , AG′^n, C′⟩, if B′^0 = B^0, B′^n = B^n, AG′^0 = AG^0, then ∀α ∈ AG: Σ_α(Q_α^n, e) = Σ_α(Q′_α^n, e).

Contributor irrelevance ensures that the same ﬁnal ex-

change BAF results in the same stances for all agents, re-

gardless of the contributors of its attacks and supports or the

order in which they were contributed.

These three properties are basically about the exchange

BAFs in AXs, and take the viewpoint of an external “judge”

for the explanatory nature of AXs. These basic properties

are all satisﬁed, by design, by AXs:

Proposition 1. Every AX satisﬁes Properties 1 to 3.

We now introduce properties which AXs may not always

satisfy, but which, nonetheless, may be desirable if AXs are

to generate meaningful (interactive) explanations. First, we

deﬁne notions of pro and con arguments in AXs, amounting

to positive and negative reasoning towards the explanandum.

Deﬁnition 8. Let B = ⟨X , A , S⟩ be any BAF for e. Then,

the pro arguments and con arguments for B are, resp.:

●pro(B)={a ∈X ∣∃p∈paths(a, e), where ∣p ∩ A∣ is even};

●con(B)={a ∈X ∣∃p∈paths(a, e), where ∣p ∩ A∣ is odd}.

Note that the intersection of pro and con arguments may

be non-empty as multiple paths to explananda may exist, so

an argument may bring both positive and negative reasoning.
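The pro and con sets of Definition 8 can be computed by tracking the parity of the number of attacks along each path to the explanandum. The sketch below assumes an acyclic BAF for e with string arguments; the function name pro_con is hypothetical, and the sample BAF is the union of the two private QBAFs of Example 1, used here as an assumed final exchange BAF.

```python
def pro_con(args, attacks, supports, e):
    """Return (pro, con) for an acyclic BAF for e, following Definition 8 (illustrative sketch)."""
    memo = {}

    def parities(a):
        # Set of parities (0 = even, 1 = odd) of attack counts over all paths from a to e.
        if a not in memo:
            found = set()
            for is_attack, relation in ((1, attacks), (0, supports)):
                for source, target in relation:
                    if source != a:
                        continue
                    if target == e:
                        found.add(is_attack)
                    found |= {(p + is_attack) % 2 for p in parities(target)}
            memo[a] = found
        return memo[a]

    pro = {a for a in args if 0 in parities(a)}
    con = {a for a in args if 1 in parities(a)}
    return pro, con


args = {"e", "a", "b", "c", "d", "f"}
attacks = {("a", "e"), ("d", "a")}
supports = {("b", "e"), ("c", "a"), ("f", "b")}
pro, con = pro_con(args, attacks, supports, "e")
print(sorted(pro), sorted(con))
```

The explanandum itself belongs to neither set, since paths have length n > 0 and e has no outgoing edges; an argument with paths of both parities appears in both sets.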

Pro/con arguments with an even/odd, resp., number of

attacks in their path to e are related to chains of sup-

ports (supported/indirect defeats, resp.) in (Cayrol and

Lagasquie-Schiex 2005) (we leave the study of formal links

to future work). Pro/con arguments are responsible for in-

creases/decreases, resp., in e’s strength using DF-QuAD:

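Proposition 2. For any $\alpha \in AG$, let $\sigma_\alpha$ indicate the evaluation method by DF-QuAD. Then, for any $0 < t \le n$:
● if $\sigma_\alpha(\mathcal{Q}^{t}_\alpha, e) > \sigma_\alpha(\mathcal{Q}^{t-1}_\alpha, e)$, then $\mathrm{pro}(\mathcal{B}^{t}) \supset \mathrm{pro}(\mathcal{B}^{t-1})$;
● if $\sigma_\alpha(\mathcal{Q}^{t}_\alpha, e) < \sigma_\alpha(\mathcal{Q}^{t-1}_\alpha, e)$, then $\mathrm{con}(\mathcal{B}^{t}) \supset \mathrm{con}(\mathcal{B}^{t-1})$.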

We conjecture (but leave to future work) that this result

(and more later) holds for other gradual semantics satisfying

monotonicity (Baroni, Rago, and Toni 2019) or bi-variate

monotony/reinforcement (Amgoud and Ben-Naim 2018).

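Property 4. E satisfies resolution representation iff E is resolved and $\forall \alpha \in AG$: if $\Sigma^{n}_\alpha(e) > \Sigma^{0}_\alpha(e)$, then $\mathrm{pro}(\mathcal{B}^{n}) \neq \emptyset$; and if $\Sigma^{n}_\alpha(e) < \Sigma^{0}_\alpha(e)$, then $\mathrm{con}(\mathcal{B}^{n}) \neq \emptyset$.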

This property also takes the viewpoint of an external

“judge”, by imposing that the ﬁnal exchange BAF convinc-

ingly represents a resolution of the conﬂicts between agents’

stances, thus showing why stances were changed. Specifically, it imposes that a changed stance must be the result of pro or con arguments (depending on how stances have changed). For example, in Figure 2, b, d, f are pro arguments which could justify an increase in stance for e, while a, c are con arguments which could justify its decrease.

Proofs for all propositions are in arxiv.org/abs/2303.15022.

Note that this property does not hold in general, e.g., given

an agent which (admittedly counter-intuitively) increases its

evaluation of arguments when they are attacked. However, it

holds for some evaluation models, notably DF-QuAD again:

Proposition 3. If E is resolved and $\forall \alpha \in AG$, $\sigma_\alpha$ is DF-QuAD, then E satisfies resolution representation.

The ﬁnal property we consider concerns unresolved AXs,

in the same spirit as resolution representation.

Property 5. E satisfies conflict representation iff E is unresolved, $\mathrm{pro}(\mathcal{B}^{n}) \neq \emptyset$ and $\mathrm{con}(\mathcal{B}^{n}) \neq \emptyset$.

This property thus requires that the conflict in an unresolved AX is apparent in the exchange BAF, namely it includes both pro and con arguments (representing the conflicting stances). For example, if the AX in Figure 2 concluded unresolved at t = 2, this property requires that $\mathcal{B}^{2}$ contains both pro arguments for e (e.g. b) and con arguments against it (e.g. a or c). This property does not hold in general, e.g. for an agent who rejects all arguments by imposing on them minimum biases and contributes no attack or support. Proving that this property holds requires consideration of the agents' behaviour, which we examine next.

6 Agent Behaviour in AXs for XAI

All our examples so far have illustrated how AXs may support explanatory interactions amongst a machine µ and a human η. This specific XAI setting is our focus in the remainder, where we assume AG = {µ, η}. Also, for simplicity, we impose (as in all illustrations) that $I_\mu = I_\eta = [0, 1]$, $I^{-}_\mu = I^{-}_\eta = [0, 0.5)$, $I^{0}_\mu = I^{0}_\eta = \{0.5\}$ and $I^{+}_\mu = I^{+}_\eta = (0.5, 1]$. We also restrict attention to AXs governed by a turn-making function π imposing a strict interleaving such that π(i) = {µ} if i is odd, and π(i) = {η} otherwise (thus, in particular, the machine starts the interactive explanation process).
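The restrictions above can be sketched as follows. This is an illustrative encoding only: the function names `stance` and `turn`, and the string labels for stances and agents, are assumptions and do not appear in the formal definitions.

```python
def stance(strength: float) -> str:
    """Map an evaluation in [0, 1] to a stance: negative below 0.5, neutral at 0.5, positive above."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("evaluations are assumed to lie in [0, 1]")
    if strength < 0.5:
        return "-"
    if strength > 0.5:
        return "+"
    return "0"


def turn(i: int) -> str:
    """Strict interleaving: the machine ("mu") moves at odd timesteps, the human ("eta") otherwise."""
    return "mu" if i % 2 == 1 else "eta"
```

With this encoding, `turn(1)` is the machine, matching the remark that the machine starts the interactive explanation process.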

In line with standard argumentative XAI, the machine

may draw the QBAF in its initial private triple (at t = 0)

from the model it is explaining. This QBAF may be obtained

by virtue of some abstraction methodology or may be the ba-

sis of the model itself (see (Cyras et al. 2021)). The humans,

instead, may draw the QBAF in their initial private triple, for

example, from their own knowledge, biases, and/or regula-

tions on the expected machine’s behaviour. The decision on

the evaluation method, for machines and humans, may be

dictated by speciﬁc settings and desirable agent properties

therein. Here we focus on how to formalise and evaluate in-

teractive explanations between a machine and a human using

AXs, and ignore how their initial private triples are obtained.

Below we deﬁne various behaviours dictating how ma-

chines and humans can engage in AXs for XAI, focusing on

ways to i) determine their biases and ii) decide their contri-

butions (attacks/supports) to (unresolved) AXs.

Biases. As seen in §4, the degree to which learnt at-

tacks/supports impact the stances of agents on explananda

is determined by the agents’ biases on the learnt arguments.

Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning

Main Track

587

In XAI different considerations regarding this learning ap-

ply to machines and humans. Firstly, not all machines may

be capable of learning: simple AI systems which provide ex-

planations but do not have the functionality for understand-

ing any input from humans are common in AI. Secondly,

machines capable of learning may assign different biases to

the learnt arguments: a low bias indicates scepticism while

a high bias indicates credulity. Machines may be designed

to give low biases to arguments from sources which cannot

be trusted, e.g. when the expertise of a human is deemed

insufﬁcient, or high biases to arguments when the human

is deemed competent, e.g. in debugging. Here, we refrain

from accommodating such challenges and focus instead on

the restrictive (but sensible, as a starting point) case where

machines assign constant biases to arguments from humans.

Definition 9. Let c ∈ [0, 1] be a chosen constant. For any learnt argument $a \in \mathcal{X}^{t}_\mu \setminus \mathcal{X}^{t-1}_\mu$ at timestep t, $\tau^{t}_\mu(a) = c$.

If c = 0 then the machine is unable to learn, whereas

0 < c < 1 gives partially sceptical machines and c = 1

gives credulous machines. The choice of c thus depends

on the speciﬁc setting of interest, and may have an impact

on the conﬂict resolution desideratum for AXs. For exam-

ple, let µ use DF-QuAD as its evaluation method: if c = 1

we can derive guarantees of rejection/weakening or accep-

tance/strengthening of arguments which are attacked or sup-

ported, resp., by learnt arguments,

demonstrating the po-

tential (and dangers) of credulity in machines (see §7).
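To illustrate the kind of guarantee meant here, the following worked derivation is a sketch under stated assumptions, not a restatement of the formal propositions. It assumes the standard DF-QuAD aggregation and combination functions (Rago et al. 2016), a learnt argument a with bias c = 1 and no attackers or supporters of its own, and that a attacks some argument b in µ's private QBAF. The symbols $v_a(b)$ and $v_s(b)$ are illustrative notation for the aggregated strengths of b's attackers and supporters.

```latex
% Illustrative notation (an assumption, not from the text): v_a(b) and v_s(b) are the
% aggregated strengths of b's attackers and supporters; \tau is the bias (base score).
\begin{align*}
\sigma(a) &= \tau(a) = c = 1
  && \text{(assumption: $a$ has no attackers or supporters)} \\
v_a(b) &= 1 - \prod_{x \text{ attacks } b} \bigl(1 - \sigma(x)\bigr) = 1
  && \text{(the factor for $x = a$ is $0$)} \\
\sigma(b) &= \tau(b) - \tau(b)\,\bigl(v_a(b) - v_s(b)\bigr) = \tau(b)\, v_s(b) \le \tau(b)
  && \text{(case $v_a(b) \ge v_s(b)$)}
\end{align*}
```

Under these assumptions b is weakened to at most its bias, and rejected outright ($\sigma(b) = 0$) when it has no supporters. The symmetric case of a learnt supporter with strength 1 gives $\sigma(b) = \tau(b) + (1 - \tau(b))(1 - v_a(b)) \ge \tau(b)$.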

Humans, meanwhile, typically assign varying biases to

arguments based on their own internal beliefs. These as-

signments may reﬂect cognitive biases such as the conﬁrma-

tion bias (Nickerson 1998) – the tendency towards looking

favourably at evidence which supports one’s prior views. In

§7 we model humans so that they assign random biases to

learnt arguments, but explore conﬁrmation bias by applying

a constant offset to reduce the bias assigned by the human.

This differs, e.g., from the modelling of confirmation bias in (de Tarlé, Bonzon, and Maudet 2022), acting on the probability of an argument being learned. We leave the exploration of alternatives for assigning biases to future work.

Attack/Support Contributions. We consider shallow,

greedy and counterfactual behaviours: intuitively, the ﬁrst

corresponds to the one-shot explanations in most XAI,

the second contributes the (current) strongest argument in

favour of the agent position, and the third considers how

each attack/support may (currently) affect the exchange

BAF before it is contributed. All behaviours identify ar-

gument pairs to be added to the exchange BAF as attacks

or supports reﬂecting their role in the private QBAFs from

which they are drawn. We use the following notion:

Definition 10. For E unresolved at timestep t, if $\Sigma^{t}_\mu(e) > \Sigma^{t}_\eta(e)$ then the states of µ and η at t are, resp., arguing for and arguing against e (else, the states are reversed).

The agents' states point to a “window for persuasion”, whereby an agent arguing for (against) e may wish to attempt to increase (decrease, resp.) the stance of the other agent, without accessing their private QBAFs, thus differing from other works, e.g. (de Tarlé, Bonzon, and Maudet 2022), which rely on shared evaluations: in our case, reasoning is shared but it is not evaluated in a shared manner.

Propositions on such effects are in arxiv.org/abs/2303.15022.

The shallow behaviour selects a (bounded by max) max-

imum number of supports for/attacks against the explanan-

dum if the agent is arguing for/against, resp., it, as follows:

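Definition 11. Let $max \in \mathbb{N}$. Agent α ∈ AG exhibits shallow behaviour (wrt max) iff, at any $0 \le t < n$ where $\pi(t) = \{\alpha\}$, $C = \{(a, b) \mid \mathcal{C}((a, b)) = (\alpha, t)\}$ is a maximal (wrt cardinality) set $\{(a_1, e), \ldots, (a_p, e)\}$ with $p \le max$ such that:
• if α is arguing for e then $C \subseteq \mathcal{S}^{t-1}_\alpha \setminus \mathcal{S}^{t-1}$ where $\forall i \in \{1, \ldots, p\}$, $\nexists (b, e) \in \mathcal{S}^{t-1}_\alpha \setminus (\mathcal{S}^{t-1} \cup C)$ with $\sigma^{t-1}_\alpha(b) > \sigma^{t-1}_\alpha(a_i)$;
• if α is arguing against e then $C \subseteq \mathcal{A}^{t-1}_\alpha \setminus \mathcal{A}^{t-1}$ where $\forall i \in \{1, \ldots, p\}$, $\nexists (b, e) \in \mathcal{A}^{t-1}_\alpha \setminus (\mathcal{A}^{t-1} \cup C)$ with $\sigma^{t-1}_\alpha(b) > \sigma^{t-1}_\alpha(a_i)$.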

This behaviour thus focuses on reasoning for or against

the explanandum e exclusively. It selects supports or at-

tacks in line with the agent’s stance on e and with the

highest evaluation in the contributing agent’s private QBAF.

This behaviour is inspired by static explanation methods in

XAI, which deliver all information in a single contribution.

Clearly, if we let µ exhibit this shallow behaviour and η be

unresponsive, i.e. never contribute any attack/support, then

the AX cannot satisfy conﬂict representation.
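A minimal sketch of the selection step in the shallow behaviour is given below. It assumes the caller passes the agent's private supports when the agent is arguing for e and its private attacks when arguing against, together with a precomputed map of argument strengths in the agent's private QBAF. The function name, the argument names and the alphabetical tie-breaking are illustrative assumptions, not part of Definition 11.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (source, target)


def shallow_contribution(private_edges: Set[Edge], exchange_edges: Set[Edge],
                         strength: Dict[str, float], e: str, max_n: int) -> List[Edge]:
    """Pick up to max_n not-yet-shared edges into e, strongest source argument first."""
    # Only relations that target the explanandum directly and are not already
    # in the exchange BAF are candidates.
    candidates = [(a, b) for (a, b) in private_edges
                  if b == e and (a, b) not in exchange_edges]
    # Sort by decreasing strength; ties are broken by name so the result is
    # deterministic (an assumption, the definition leaves ties open).
    candidates.sort(key=lambda edge: (-strength[edge[0]], edge[0]))
    return candidates[:max_n]
```

For instance, with private supports (b, e), (f, e) and (g, b), strengths 0.4 for b and 0.9 for f, and max = 1, the sketch returns only (f, e): the support (g, b) is never considered because it does not target e, which is what makes this behaviour shallow.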

The greedy behaviour allows an agent arguing for e to

support the pro or attack the con arguments, while that argu-

ing against can support the con or attack the pro arguments.

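Definition 12. Agent α ∈ AG exhibits greedy behaviour iff, at any $0 \le t < n$ where $\pi(t) = \{\alpha\}$, $C = \{(a, b) \mid \mathcal{C}((a, b)) = (\alpha, t)\}$ is empty or amounts to a single attack or support $(a, b) \in (\mathcal{A}^{t-1}_\alpha \cup \mathcal{S}^{t-1}_\alpha) \setminus (\mathcal{A}^{t-1} \cup \mathcal{S}^{t-1})$ such that:
1. if α is arguing for e then: $(a, b) \in \mathcal{S}^{t-1}_\alpha$ and $b \in \mathrm{pro}(\mathcal{B}^{t-1}) \cup \{e\}$; or $(a, b) \in \mathcal{A}^{t-1}_\alpha$ and $b \in \mathrm{con}(\mathcal{B}^{t-1})$;
if α is arguing against e then: $(a, b) \in \mathcal{S}^{t-1}_\alpha$ and $b \in \mathrm{con}(\mathcal{B}^{t-1})$; or $(a, b) \in \mathcal{A}^{t-1}_\alpha$ and $b \in \mathrm{pro}(\mathcal{B}^{t-1}) \cup \{e\}$;
2. $\nexists (a', b') \in (\mathcal{A}^{t-1}_\alpha \cup \mathcal{S}^{t-1}_\alpha) \setminus (\mathcal{A}^{t-1} \cup \mathcal{S}^{t-1})$ satisfying 1. such that $\sigma^{t-1}_\alpha(a') > \sigma^{t-1}_\alpha(a)$;
3. $\nexists (a'', b'') \in (\mathcal{A}^{t-1}_\alpha \cup \mathcal{S}^{t-1}_\alpha) \setminus (\mathcal{A}^{t-1} \cup \mathcal{S}^{t-1})$ satisfying 1. such that $\sigma^{t-1}_\alpha(a'') = \sigma^{t-1}_\alpha(a)$ and $\lvert \operatorname{argmin}_{P'' \in \mathrm{paths}(a'', e)} \lvert P'' \rvert \rvert < \lvert \operatorname{argmin}_{P \in \mathrm{paths}(a, e)} \lvert P \rvert \rvert$.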

Intuitively, 1. requires that the attack or support, if any,

is in line with the agent’s views; 2. ensures that the attack-

ing or supporting argument has maximum strength; and 3.

ensures that it is “close” to the explanandum. We posit that

enforcing agents to contribute at most one argument per turn

will aid minimality without affecting conﬂict resolution neg-

atively wrt the shallow behaviour (see §7). Minimality is a

common property of explanations in XAI, deemed beneﬁ-

cial both from a machine perspective, e.g. wrt computational

aspects (see computational complexity in (Sokol and Flach

2020)), and from a human perspective, e.g. wrt cognitive

load and privacy maintenance (see parsimony in (Sokol and

Flach 2020)). Naturally, however, conﬂict resolution in AXs

should always take precedence over minimality, as prioritis-

ing the latter would force AXs to remain empty.


Proposition 4. If E is unresolved and $\forall \alpha \in AG$: α exhibits greedy behaviour and $\{(a, b) \in \mathcal{A}^{n} \cup \mathcal{S}^{n} \mid \mathcal{C}((a, b)) = (\alpha, t), t \in \{1, \ldots, n\}\} \neq \emptyset$, then E satisfies conflict representation.

Proposition 5. If $\forall \alpha \in AG$, for all $0 \le t < n$ and $\forall a \in \mathcal{X}^{t}_\alpha$, paths((a, e)) = {(a, e)}, then the shallow (with max = 1) and greedy behaviours are aligned.

The greedy behaviour may not always lead to resolutions:

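Example 4. Let us extend the AX from Example 3 to $\langle \mathcal{B}^{0}, \ldots, \mathcal{B}^{3}, AG^{0}, \ldots, AG^{3}, \mathcal{C} \rangle$ such that (see Figure 2):
• $\mathcal{B}^{3} = \langle \{e, a, b, c, d\}, \{(a, e), (d, a)\}, \{(b, e), (c, a)\} \rangle$;
• $\eta^{3} = \eta^{2}$; $\mu^{3}$ is such that $\mathcal{Q}^{3}_\mu \sqsupset \mathcal{Q}^{2}_\mu$ where $\mathcal{X}^{3}_\mu = \mathcal{X}^{2}_\mu \cup \{d\}$, $\mathcal{A}^{3}_\mu = \mathcal{A}^{2}_\mu \cup \{(d, a)\}$, $\mathcal{S}^{3}_\mu = \mathcal{S}^{2}_\mu$, $\tau^{3}_\mu(d) = 0.6$; then, the argument evaluations are $\sigma_\mu(\mathcal{Q}^{3}_\mu, e) = 0.42$, $\sigma_\mu(\mathcal{Q}^{3}_\mu, a) = 0.8$, $\sigma_\mu(\mathcal{Q}^{3}_\mu, b) = 0.4$, $\sigma_\mu(\mathcal{Q}^{3}_\mu, c) = 0.6$, $\sigma_\mu(\mathcal{Q}^{3}_\mu, d) = 0.6$;
• $\mathcal{C}((d, a)) = (\eta, 3)$, i.e. η contributes attack (d, a) at t = 3.
Here, in line with the greedy behaviour, µ learns the attack (d, a) contributed by η at timestep 3. Then, even if µ assigns the same bias to these learnt arguments as η (which is by no means guaranteed), this is insufficient to change the stance, i.e. $\Sigma^{3}_\mu(e) = -$, and so the AX remains unresolved.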

The ﬁnal counterfactual behaviour takes greater consid-

eration of the argumentative structure of the reasoning avail-

able to the agents in order to maximise the chance of conﬂict

resolution with a limited number of arguments contributed.

This behaviour is deﬁned in terms of the following notion.

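Definition 13. Given an agent α ∈ AG, a private view of the exchange BAF by α at timestep t is any $\mathcal{Q}^{t}_{\alpha v} = \langle \mathcal{X}^{t}_{\alpha v}, \mathcal{A}^{t}_{\alpha v}, \mathcal{S}^{t}_{\alpha v}, \tau^{t}_{\alpha v} \rangle$ such that $\mathcal{B}^{t} \sqsubseteq \mathcal{Q}^{t}_{\alpha v} \sqsubseteq \mathcal{Q}^{t}_\alpha$.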

An agent’s private view of the exchange BAF thus

projects their private biases onto the BAF, while also po-

tentially accommodating counterfactual reasoning with ad-

ditional arguments. Based on arguments’ evaluations in an

agent’s private view, the agent can then judge which attack

or support it perceives will be the most effective.

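Definition 14. Given an agent α ∈ AG, α's perceived effect on e at $0 < t \le n$ of any $(a, b) \in (\mathcal{A}^{t-1}_\alpha \cup \mathcal{S}^{t-1}_\alpha) \setminus (\mathcal{A}^{t-1} \cup \mathcal{S}^{t-1})$, where $a \in \mathcal{X}^{t-1}_\alpha \setminus \mathcal{X}^{t-1}$ and $b \in \mathcal{X}^{t-1}$, is $\epsilon((a, b), \mathcal{Q}^{t}_{\alpha v}) = \sigma_\alpha(\mathcal{Q}^{t}_{\alpha v}, e) - \sigma_\alpha(\mathcal{Q}^{t-1}_{\alpha v}, e)$ for $\mathcal{Q}^{t}_{\alpha v} \sqsupset \mathcal{Q}^{t-1}_{\alpha v}$ a private view of the exchange BAF at t by α such that $\mathcal{X}^{t}_{\alpha v} = \mathcal{X}^{t-1}_{\alpha v} \cup \{a\}$, $\mathcal{A}^{t}_{\alpha v} = (\mathcal{X}^{t}_{\alpha v} \times \mathcal{X}^{t}_{\alpha v}) \cap \mathcal{A}^{t-1}_\alpha$ and $\mathcal{S}^{t}_{\alpha v} = (\mathcal{X}^{t}_{\alpha v} \times \mathcal{X}^{t}_{\alpha v}) \cap \mathcal{S}^{t-1}_\alpha$.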

The counterfactual view underlying this notion of per-

ceived effect relates to (Kampik and Cyras 2022), although

we consider the effect of adding an attack or support,

whereas they consider an argument's contribution by removing it. It also relates to the hypothetical value of (de Tarlé, Bonzon, and Maudet 2022), which however amounts to the explanandum's evaluation in the shared graph.

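Definition 15. Agent α ∈ AG exhibits counterfactual behaviour iff, at any $0 \le t < n$ where $\pi(t) = \{\alpha\}$, $C = \{(a, b) \mid \mathcal{C}((a, b)) = (\alpha, t)\}$ is empty or is $\{(a, b)\}$ such that:
● if α is arguing for e then $\epsilon((a, b), \mathcal{Q}^{t}_{\alpha v}) > 0$ and (a, b) is $\operatorname{argmax}_{(a', b') \in (\mathcal{A}^{t-1}_\alpha \cup \mathcal{S}^{t-1}_\alpha) \setminus (\mathcal{A}^{t-1} \cup \mathcal{S}^{t-1})} \epsilon((a', b'), \mathcal{Q}^{t}_{\alpha v})$;
● if α is arguing against e then $\epsilon((a, b), \mathcal{Q}^{t}_{\alpha v}) < 0$ and (a, b) is $\operatorname{argmin}_{(a', b') \in (\mathcal{A}^{t-1}_\alpha \cup \mathcal{S}^{t-1}_\alpha) \setminus (\mathcal{A}^{t-1} \cup \mathcal{S}^{t-1})} \epsilon((a', b'), \mathcal{Q}^{t}_{\alpha v})$.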

Identifying attacks and supports based on their effect on

the explanandum is related to proponent and opponent argu-

ments in (Cyras, Kampik, and Weng 2022), deﬁned however

in terms of quantitative dispute trees for BAFs.

The counterfactual behaviour may better consider argu-

mentative structure, towards resolved AXs, as shown next.

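Example 5. Consider the AX from Example 4 but where:
• $\mathcal{B}^{3} = \langle \{e, a, b, c, f\}, \{(a, e)\}, \{(b, e), (c, a), (f, b)\} \rangle$;
• $\mu^{3}$ is such that $\mathcal{Q}^{3}_\mu \sqsupset \mathcal{Q}^{2}_\mu$ where $\mathcal{X}^{3}_\mu = \mathcal{X}^{2}_\mu \cup \{f\}$, $\mathcal{A}^{3}_\mu = \mathcal{A}^{2}_\mu$, $\mathcal{S}^{3}_\mu = \mathcal{S}^{2}_\mu \cup \{(f, b)\}$, $\tau^{3}_\mu(f) = 0.5$; then, the argument evaluations are $\sigma_\mu(\mathcal{Q}^{3}_\mu, e) = 0.546$, $\sigma_\mu(\mathcal{Q}^{3}_\mu, a) = 0.92$, $\sigma_\mu(\mathcal{Q}^{3}_\mu, b) = 0.7$, $\sigma_\mu(\mathcal{Q}^{3}_\mu, c) = 0.6$ and $\sigma_\mu(\mathcal{Q}^{3}_\mu, f) = 0.5$;
• $\mathcal{C}((f, b)) = (\eta, 3)$, i.e. η contributes support (f, b) to $\mathcal{B}^{3}$.
Here, η contributes (f, b) in line with the counterfactual behaviour as $\epsilon((f, b), \mathcal{Q}^{3}_{\eta v}) = 0.24 > \epsilon((d, a), \mathcal{Q}^{3}_{\eta v}) = 0.216$. This sufficiently modifies µ's private QBAF such that $\Sigma^{3}_\mu(e) = +$, and the AX is now resolved: the counterfactual behaviour succeeds where the greedy behaviour did not (Example 4).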
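The evaluations reported in Examples 4 and 5 can be checked with a short sketch of DF-QuAD (Rago et al. 2016) for acyclic QBAFs. Two assumptions are made loudly here: first, that µ's private QBAF contains only the arguments and relations listed in the examples; second, that µ's biases are τ(e) = 0.7, τ(a) = 0.8, τ(b) = 0.4 and τ(c) = 0.6. These four values are not stated in this section; they are inferred because they reproduce the reported strengths. Only τ(d) = 0.6 and τ(f) = 0.5 are taken from the text. The function name `dfquad` is illustrative.

```python
from math import prod
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (source, target)


def dfquad(args: Set[str], attacks: Set[Edge], supports: Set[Edge],
           tau: Dict[str, float]) -> Dict[str, float]:
    """Strengths under DF-QuAD for an acyclic QBAF with biases (base scores) tau."""
    memo: Dict[str, float] = {}

    def aggregate(values) -> float:
        # Probabilistic sum; evaluates to 0 when there are no attackers/supporters.
        return 1.0 - prod(1.0 - v for v in values)

    def strength(x: str) -> float:
        if x not in memo:
            va = aggregate([strength(a) for (a, b) in attacks if b == x])
            vs = aggregate([strength(a) for (a, b) in supports if b == x])
            base = tau[x]
            if va >= vs:   # attackers dominate (or balance): move towards 0
                memo[x] = base - base * (va - vs)
            else:          # supporters dominate: move towards 1
                memo[x] = base + (1.0 - base) * (vs - va)
        return memo[x]

    return {x: strength(x) for x in args}


# Biases for e, a, b, c are ASSUMED (inferred to match the reported evaluations).
assumed_tau = {"e": 0.7, "a": 0.8, "b": 0.4, "c": 0.6}

# Example 4: the machine learns attack (d, a) with bias 0.6.
example4 = dfquad({"e", "a", "b", "c", "d"},
                  {("a", "e"), ("d", "a")},
                  {("b", "e"), ("c", "a")},
                  {**assumed_tau, "d": 0.6})

# Example 5: the machine learns support (f, b) with bias 0.5 instead.
example5 = dfquad({"e", "a", "b", "c", "f"},
                  {("a", "e")},
                  {("b", "e"), ("c", "a"), ("f", "b")},
                  {**assumed_tau, "f": 0.5})
```

Under these assumptions, e evaluates to about 0.42 in Example 4 (below 0.5, so the stance stays negative) and to about 0.546 in Example 5 (above 0.5, so the stance becomes positive), in agreement with the values above. In Example 4 the learnt attacker d exactly cancels the supporter c of a, so a stays at its bias, which is why the strongest-looking move has limited effect on e.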

We end showing some conditions under which conﬂict

representation is satisﬁed by the counterfactual behaviour.

Proposition 6. If E is unresolved and is such that $\forall \alpha \in AG$: α exhibits counterfactual behaviour; $\sigma_\alpha$ is DF-QuAD; $\{(a, b) \in \mathcal{A}^{n} \cup \mathcal{S}^{n} \mid \mathcal{C}((a, b)) = (\alpha, t), t \in \{1, \ldots, n\}\} \neq \emptyset$; then E satisfies conflict representation.

7 Evaluation

We now evaluate sets of AXs obtained from the behaviours

from §6 via simulations, using the following metrics:

Resolution Rate (RR): the proportion of resolved AXs.

Contribution Rate (CR): the average number of argu-

ments contributed to the exchange BAFs in the resolved

AXs, in effect measuring the total information exchanged.

Persuasion Rate (PR): for an agent, the proportion of

resolved AXs in which the agent’s initial stance is the other

agent’s ﬁnal stance, measuring the agent’s persuasiveness.

Contribution Accuracy (CA): for an agent, the propor-

tion of the contributions which, if the agent was arguing

for (against) e, would have maximally increased (decreased,

resp.) e’s strength in the other agent’s private QBAF.
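The first three metrics can be sketched as simple aggregates over simulated AXs. The record type `AXRun` and its field names are hypothetical and introduced only for illustration. CA is omitted because it additionally requires evaluating each contribution in the other agent's private QBAF.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AXRun:
    """Hypothetical summary of one simulated AX."""
    resolved: bool
    contributed: int               # arguments contributed to the exchange BAF
    machine_initial_stance: str    # "+", "-" or "0"
    human_final_stance: str


def resolution_rate(runs: List[AXRun]) -> float:
    """RR: proportion of resolved AXs."""
    return sum(r.resolved for r in runs) / len(runs)


def contribution_rate(runs: List[AXRun]) -> Optional[float]:
    """CR: average number of contributed arguments over the resolved AXs."""
    resolved = [r for r in runs if r.resolved]
    if not resolved:
        return None
    return sum(r.contributed for r in resolved) / len(resolved)


def machine_persuasion_rate(runs: List[AXRun]) -> Optional[float]:
    """PR for the machine: share of resolved AXs where the human ends on the machine's initial stance."""
    resolved = [r for r in runs if r.resolved]
    if not resolved:
        return None
    persuaded = sum(r.machine_initial_stance == r.human_final_stance for r in resolved)
    return persuaded / len(resolved)
```

Returning `None` when no AX is resolved is a design choice of this sketch, since CR and PR are only defined over resolved AXs.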

We tested PR and CA for machines only. Let unrespon-

sive behaviour amount to contributing nothing (as in §6).

Then, our hypotheses were:

H1: For a shallow machine and an unresponsive human,

as the max constant increases, RR, CR and CA increase.

H2: For a shallow machine and an unresponsive human,

as the human’s conﬁrmation bias increases, RR decreases.

H3: For a greedy machine and a counterfactual human,

RR increases relative to a shallow machine and an unrespon-

sive human.

H4: For a greedy machine and a counterfactual human,

as the machine’s bias on learnt arguments increases, RR in-

creases while CR and PR decrease.

H5: For a counterfactual machine and a counterfactual

human, RR and CA increase relative to a greedy machine.

See arxiv.org/abs/2303.15022 for exact formulations.


Experimental Setup. For each AX for e (restricted as in

§6), we created a “universal BAF”, i.e. a BAF for e of which

all argumentation frameworks are subgraphs. We populated

the universal BAFs with 30 arguments by ﬁrst generating a

6-ary tree with e as the root. Then, any argument other than

e had a 50% chance of having a directed edge towards a

random previous argument in the tree, to ensure that multi-

ple paths to the explanandum are present. 50% of the edges

in the universal BAF were randomly selected to be attacks,

and the rest to be supports. We built agents’ private QBAFs

from the universal BAF by performing a random traversal

through the universal BAF and stopped when the QBAFs

reached 15 arguments, selecting a random argument from

each set of children, as in (de Tarlé, Bonzon, and Maudet 2022). We then assigned random biases to arguments in the

agents’ QBAFs, and (possibly different) random evaluation

methods to agents amongst QuAD (Baroni et al. 2015), DF-

QuAD, REB (Amgoud and Ben-Naim 2017) and QEM (Po-

tyka 2018) (all with evaluation range [0, 1]). We used differ-

ent evaluation methods to simulate different ways to evalu-

ate arguments in real-world humans/machines. We repeated

this process till agents held different stances on e.
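The construction of a universal BAF can be sketched as follows. Several details are assumptions of this sketch rather than facts about the actual NetLogo implementation: the tree is filled breadth-first, the 30 arguments include e, an extra edge never duplicates the edge to an argument's own parent, exactly half of the edges (rounded down) become attacks, and the function and argument names are illustrative.

```python
import random
from typing import Set, Tuple

Edge = Tuple[str, str]  # (source, target)


def universal_baf(n_args: int = 30, arity: int = 6, p_extra: float = 0.5,
                  seed: int = 0) -> Tuple[Set[str], Set[Edge], Set[Edge]]:
    """Generate (arguments, attacks, supports) for an acyclic BAF rooted at e."""
    rng = random.Random(seed)
    names = ["e"] + [f"a{i}" for i in range(1, n_args)]
    edges = []
    # Breadth-first arity-ary tree: argument i points to its parent (i - 1) // arity.
    for i in range(1, n_args):
        edges.append((names[i], names[(i - 1) // arity]))
    # With probability p_extra, add one more edge towards an earlier argument
    # (other than the parent). Edges always point to a lower index, so no cycle arises.
    for i in range(1, n_args):
        parent = (i - 1) // arity
        earlier = [j for j in range(i) if j != parent]
        if earlier and rng.random() < p_extra:
            edges.append((names[i], names[rng.choice(earlier)]))
    # Randomly select half of the edges as attacks; the rest are supports.
    rng.shuffle(edges)
    half = len(edges) // 2
    return set(names), set(edges[:half]), set(edges[half:])
```

Because every argument keeps its tree edge, each one has at least one path to e, and the extra edges are what create the multiple paths (and hence arguments that are both pro and con) mentioned above.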

For each hypothesis, we ran 1000 experiments per conﬁg-

uration, making sure the experiments for different strategies

are run with the same QBAFs. We ran the simulations on

the NetLogo platform using BehaviorSpace.

We tested the

signiﬁcance between testing conditions in a pairwise man-

ner using the chi-squared test for the discrete measures RR

and PR, and Student’s t-test for the continuous measures CR

and CA. We rejected the null hypotheses when p < 0.01.
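For the discrete measures, a pairwise comparison of two resolution rates can be sketched as a chi-squared test on a 2×2 contingency table. This sketch assumes a Pearson test without continuity correction and uses only the standard library; it need not reproduce the exact p-values reported, since the precise test configuration is not given here. The function name is illustrative.

```python
from math import erfc, sqrt
from typing import Tuple


def chi2_two_proportions(success_a: int, n_a: int,
                         success_b: int, n_b: int) -> Tuple[float, float]:
    """Pearson chi-squared statistic and p-value (1 d.o.f.) for two proportions."""
    table = [[success_a, n_a - success_a],
             [success_b, n_b - success_b]]
    total = n_a + n_b
    row = [n_a, n_b]
    col = [success_a + success_b, total - success_a - success_b]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-squared distribution with one degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Illustrative use: resolution rates of 42.2% vs 55.5% over 1000 experiments each.
stat, p_value = chi2_two_proportions(422, 1000, 555, 1000)
```

With these counts the null hypothesis of equal resolution rates is rejected at the p < 0.01 threshold used here.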

Experimental Results. Table 1 reports the results of our

simulations: all hypotheses were (at least partially) veriﬁed.

H1: As expected, increasing max for shallow machines

results in signiﬁcantly higher RR, CR and CA up to max =

3 (p < 0.005 for max values of 1 vs 2 and 2 vs 3 for all

metrics). Above this limit (max values of 3 vs 4 and 4 vs 5),

this trend was no longer apparent, suggesting that there was

a limit to the effectiveness of contributing arguments at this

distance from e. Note that the machine’s PR is always 100%

here, since the (unresponsive) human does not contribute.

H2: We ﬁxed max = 4 (the value with the maximum

RR for H1) and found that increasing the conﬁrmation bias

in the human signiﬁcantly decreased the machine’s RR ini-

tially (p < 0.01 for 0 vs −0.1 and −0.1 vs −0.2), before the

effect tailed off as RR became very low (p = 0.09 for −0.2

vs −0.3 and p = 0.03 for −0.3 vs −0.4), demonstrating the

need for behaviours which consider deeper reasoning than

the shallow behaviour to achieve higher resolution rates.

H3: From here onwards we tested with a counterfactual human and fixed the level of confirmation bias therein to −0.2. We compared shallow against greedy machines, also limiting the number of arguments they contributed to maxima of three and four to compare fairly with the shallow machine with the fixed max constant. RR increased significantly with the greedy behaviour over the shallow machine (p < 0.001); the increase remained statistically significant when we restricted the greedy machine's contributed arguments to 4 (p < 0.005), but not to 3 (p = 0.202).

See github.com/CLArg-group/argumentative exchanges.

Experiments with greedy humans gave similar findings.

| Hypothesis | Behaviour µ | Behaviour η | Learning µ | Learning η | RR | CR | PR_µ | CA_µ |
|---|---|---|---|---|---|---|---|---|
| H1 | S (1) | - | - | 0 | 5.4 | 1 | 100 | 45.4 |
| H1 | S (2) | - | - | 0 | 9.6 | 1.96 | 100 | 51.9 |
| H1 | S (3) | - | - | 0 | 13.0 | 2.76 | 100 | 56.7 |
| H1 | S (4) | - | - | 0 | 13.9 | 3.22 | 100 | 58.1 |
| H1 | S (5) | - | - | 0 | 13.7 | 3.38 | 100 | 58.3 |
| H2 | S (4) | - | - | -0.1 | 11.2 | 3.26 | 100 | 57.6 |
| H2 | S (4) | - | - | -0.2 | 8.6 | 3.27 | 100 | 58.0 |
| H2 | S (4) | - | - | -0.3 | 6.7 | 3.30 | 100 | 58.3 |
| H2 | S (4) | - | - | -0.4 | 5.3 | 3.38 | 100 | 58.5 |
| H3 | G (≤3) | C | 0 | -0.2 | 9.8 | 3.15 | 83.7 | 38.8 |
| H3 | G (≤4) | C | 0 | -0.2 | 11.9 | 3.88 | 79.0 | 37.1 |
| H3 | G | C | 0 | -0.2 | 18.8 | 7.16 | 79.3 | 35.7 |
| H4 | G | C | 0.5 | -0.2 | 42.2 | 6.73 | 31.5 | 37.5 |
| H4 | G | C | 1.0 | -0.2 | 55.5 | 5.24 | 20.4 | 38.2 |
| H5 | C | C | 0.5 | -0.2 | 48.4 | 7.37 | 41.5 | 50.5 |

Table 1: Results in the simulations for the five hypotheses for three behaviours: Shallow (max constant given in parentheses); Greedy (where any limit on the number of contributed arguments by the agent is in brackets); and Counterfactual. Learning amounts to c in Definition 9 for µ and to the confirmation bias offset for η (where appropriate). We report RR, PR_µ and CA_µ as percentages. We indicate in bold the chosen baseline for the next hypothesis.

H4: RR increased significantly with the bias on learnt arguments (p < 0.001 for both comparisons of learning configurations: 0 vs 0.5 and 0.5 vs 1). However, the machine's CR and PR fell significantly (p < 0.001 for similar pairwise comparisons, except for 0 vs 0.5 for CR, where p = 0.27), highlighting the naive nature of machines learning credulously (i.e. assigning all learnt arguments the top bias).

H5: The counterfactual behaviour outperformed the

greedy behaviour signiﬁcantly in terms of both RR (p <

0.01) and CA (p < 0.001), showing, even in this limited set-

ting, the advantages in taking a counterfactual view, given

that the strongest argument (as selected by the greedy be-

haviour) may not always be the most effective in persuading.

8 Conclusions

We deﬁned the novel concept of AXs, and deployed AXs

in the XAI setting where a machine and a human engage

in interactive explanations, powered by non-shallow reason-

ing, contributions from both agents and modelling of agents’

learning and explanatory behaviour. This work opens sev-

eral avenues for future work, besides those already men-

tioned. It would be interesting to experiment with any num-

ber of agents, besides the two that are standard in XAI,

and to identify restricted cases where hypotheses H1-H5 are

guaranteed to hold. It would also be interesting to accommo-

date mechanisms for machines to model humans, e.g. as in

opponent modelling (Hadjinikolis et al. 2013). Also fruitful

could be an investigation of how closely AXs can represent

machine and human behaviour. Further, while we used AXs

in XAI, they may be usable in various multi-agent settings.


Acknowledgements

This research was partially funded by the ERC under the

EU’s Horizon 2020 research and innovation programme

(No. 101020934, ADIX) and by J.P. Morgan and by the

Royal Academy of Engineering, UK.

References

Albini, E.; Lertvittayakumjorn, P.; Rago, A.; and Toni, F.

2020. DAX: deep argumentative explanation for neural net-

works. CoRR abs/2012.05766.

Amgoud, L., and Ben-Naim, J. 2017. Evaluation of ar-

guments in weighted bipolar graphs. In ECSQARU 2017,

25–35.

Amgoud, L., and Ben-Naim, J. 2018. Evaluation of argu-

ments in weighted bipolar graphs. Int. J. Approx. Reason.

99:39–55.

Amgoud, L., and Ben-Naim, J. 2022. Axiomatic founda-

tions of explainability. In IJCAI 2022, 636–642.

Antaki, C., and Leudar, I. 1992. Explaining in conversation:

Towards an argument model. Europ. J. of Social Psychology

22:181–194.

Atkinson, K.; Baroni, P.; Giacomin, M.; Hunter, A.;

Prakken, H.; Reed, C.; Simari, G. R.; Thimm, M.; and Vil-

lata, S. 2017. Towards artiﬁcial argumentation. AI Magazine

38(3):25–36.

Balog, K.; Radlinski, F.; and Arakelyan, S. 2019. Transpar-

ent, scrutable and explainable user models for personalized

recommendation. In SIGIR 2019, 265–274.

Baroni, P.; Romano, M.; Toni, F.; Aurisicchio, M.; and

Bertanza, G. 2015. Automatic evaluation of design alter-

natives with quantitative argumentation. Argument Comput.

6(1):24–49.

Baroni, P.; Comini, G.; Rago, A.; and Toni, F. 2017. Ab-

stract games of argumentation strategy and game-theoretical

argument strength. In PRIMA 2017, 403–419.

Baroni, P.; Gabbay, D.; Giacomin, M.; and van der Torre,

L., eds. 2018. Handbook of Formal Argumentation. College

Publications.

Baroni, P.; Rago, A.; and Toni, F. 2018. How many proper-

ties do we need for gradual argumentation? In AAAI 2018,

1736–1743.

Baroni, P.; Rago, A.; and Toni, F. 2019. From ﬁne-grained

properties to broad principles for gradual argumentation: A

principled spectrum. Int. J. Approx. Reason. 105:252–286.

Bertrand, A.; Belloum, R.; Eagan, J. R.; and Maxwell, W.

2022. How cognitive biases affect XAI-assisted decision-

making: A systematic review. In AIES ’22, 78–91.

Black, E., and Atkinson, K. 2011. Choosing persuasive

arguments for action. In AAMAS 2011, 905–912.

Black, E., and Hunter, A. 2007. A generative inquiry dia-

logue system. In AAMAS 2007, 241.

Calegari, R.; Omicini, A.; Pisano, G.; and Sartor, G. 2022.

Arg2P: an argumentation framework for explainable intelli-

gent systems. J. Log. Comput. 32(2):369–401.

Calegari, R.; Riveret, R.; and Sartor, G. 2021. The burden

of persuasion in structured argumentation. In ICAIL 2021,

180–184.

Cawsey, A. 1991. Generating interactive explanations. In

AAAI 1991, 86–91.

Cayrol, C., and Lagasquie-Schiex, M. 2005. On the accept-

ability of arguments in bipolar argumentation frameworks.

In ECSQARU 2005, 378–389.

Cocarascu, O.; Rago, A.; and Toni, F. 2019. Extracting

dialogical explanations for review aggregations with argu-

mentative dialogical agents. In AAMAS 2019, 1261–1269.

Cyras, K.; Rago, A.; Albini, E.; Baroni, P.; and Toni, F.

2021. Argumentative XAI: A survey. In IJCAI 2021, 4392–

4399.

Cyras, K.; Kampik, T.; and Weng, Q. 2022. Dispute trees

as explanations in quantitative (bipolar) argumentation. In

ArgXAI 2022 co-located with COMMA 2022.

de Tarlé, L. D.; Bonzon, E.; and Maudet, N. 2022. Multiagent dynamics of gradual argumentation semantics. In AAMAS 2022, 363–371.

Donadello, I.; Hunter, A.; Teso, S.; and Dragoni, M. 2022.

Machine learning for utility prediction in argument-based

computational persuasion. In AAAI 2022, 5592–5599.

Dung, P. M. 1995. On the Acceptability of Arguments and

its Fundamental Role in Nonmonotonic Reasoning, Logic

Programming and n-Person Games. Artiﬁcial Intelligence

77(2):321–358.

Fan, X., and Toni, F. 2012a. Argumentation dialogues for

two-agent conﬂict resolution. In COMMA 2012, 249–260.

Fan, X., and Toni, F. 2012b. Mechanism design for

argumentation-based persuasion. In COMMA 2012, 322–

333.

Fan, X., and Toni, F. 2015a. Mechanism design for

argumentation-based information-seeking and inquiry. In

PRIMA 2015, 519–527.

Fan, X., and Toni, F. 2015b. On computing explanations in

argumentation. In AAAI 2015, 1496–1502.

Hadjinikolis, C.; Siantos, Y.; Modgil, S.; Black, E.; and

McBurney, P. 2013. Opponent modelling in persuasion dia-

logues. In IJCAI 2013, 164–170.

Hirsch, T.; Soma, C. S.; Merced, K.; Kuo, P.; Dembe, A.;

Caperton, D. D.; Atkins, D. C.; and Imel, Z. E. 2018. “It’s

hard to argue with a computer”: Investigating psychothera-

pists’ attitudes towards automated evaluation. In DIS 2018,

559–571.

Hunter, A. 2018. Towards a framework for computational

persuasion with applications in behaviour change. Argument

Comput. 9(1):15–40.

Ignatiev, A.; Narodytska, N.; and Marques-Silva, J. 2019.

Abduction-based explanations for machine learning models.

In AAAI 2019, 1511–1519.

Kampik, T., and Cyras, K. 2022. Explaining change in

quantitative bipolar argumentation. In COMMA 2022, 188–

199.


Kontarinis, D., and Toni, F. 2015. Identifying malicious

behavior in multi-party bipolar argumentation debates. In

EUMAS/AT 2015, 267–278.

Lakkaraju, H.; Slack, D.; Chen, Y.; Tan, C.; and Singh, S.

2022. Rethinking explainability as a dialogue: A practi-

tioner’s perspective. CoRR abs/2202.01875.

Lertvittayakumjorn, P.; Specia, L.; and Toni, F. 2020.

FIND: human-in-the-loop debugging deep text classiﬁers. In

EMNLP 2020, 332–348.

Lundberg, S. M., and Lee, S. 2017. A uniﬁed approach to

interpreting model predictions. In NIPS 2017, 4765–4774.

Miller, T. 2019. Explanation in artiﬁcial intelligence: In-

sights from the social sciences. Artif. Intell. 267:1–38.

Nickerson, R. S. 1998. Conﬁrmation bias: A ubiquitous

phenomenon in many guises. Review of General Psychology

2:175 – 220.

Panisson, A. R.; McBurney, P.; and Bordini, R. H. 2021. A

computational model of argumentation schemes for multi-

agent systems. Argument Comput. 12(3):357–395.

Paulino-Passos, G., and Toni, F. 2022. On interactive ex-

planations as non-monotonic reasoning. In XAI 2022 co-

located with IJCAI 2022.

Pisano, G.; Calegari, R.; Prakken, H.; and Sartor, G. 2022.

Arguing about the existence of conﬂicts. In COMMA 2022,

284–295.

Potyka, N. 2018. Continuous dynamical systems for

weighted bipolar argumentation. In KR 2018, 148–157.

Potyka, N. 2021. Interpreting neural networks as quantita-

tive argumentation frameworks. In AAAI 2021, 6463–6470.

Rago, A.; Baroni, P.; and Toni, F. 2022. Explaining causal

models with argumentation: the case of bi-variate reinforce-

ment. In KR 2022, 505–509.

Rago, A.; Toni, F.; Aurisicchio, M.; and Baroni, P. 2016.

Discontinuity-free decision support with quantitative argu-

mentation debates. In KR 2016, 63–73.

Rago, A.; Cocarascu, O.; Bechlivanidis, C.; and Toni, F.

2020. Argumentation as a framework for interactive expla-

nations for recommendations. In KR 2020, 805–815.

Rago, A.; Cocarascu, O.; and Toni, F. 2018. Argumentation-

based recommendations: Fantastic explanations and how to

ﬁnd them. In IJCAI 2018, 1949–1955.

Raymond, A.; Gunes, H.; and Prorok, A. 2020. Culture-

based explainable human-agent deconﬂiction. In AAMAS

2020, 1107–1115.

Shih, A.; Choi, A.; and Darwiche, A. 2018. A symbolic ap-

proach to explaining bayesian network classiﬁers. In IJCAI

2018, 5103–5111.

Sokol, K., and Flach, P. A. 2020. Explainability fact sheets:

a framework for systematic assessment of explainable ap-

proaches. In FAT* 2020, 56–67.

Teso, S.; Alkan, Ö.; Stammer, W.; and Daly, E. 2023. Leveraging explanations in interactive machine learning: An overview. Frontiers Artif. Intell. 6.

Vassiliades, A.; Bassiliades, N.; and Patkos, T. 2021. Ar-

gumentation and explainable artiﬁcial intelligence: a survey.

The Knowledge Engineering Review 36:e5.

Wachter, S.; Mittelstadt, B. D.; and Russell, C. 2017. Coun-

terfactual explanations without opening the black box: Au-

tomated decisions and the GDPR. CoRR abs/1711.00399.

Wu, X.; Xiao, L.; Sun, Y.; Zhang, J.; Ma, T.; and He, L.

2022. A survey of human-in-the-loop for machine learning.

Future Gener. Comput. Syst. 135:364–381.
