Abstract

This study examined whether three heads are better than four in terms of performance and learning properties in group decision-making. It was predicted that learning incoherence took place in tetrads because the majority rule could not be applied when two subgroups emerged. As a result, tetrads underperformed triads. To examine this hypothesis, we adopted a reinforcement learning framework using simple Q-learning and estimated learning parameters. Overall, the results were consistent with the hypothesis. Further, this study is one of a few attempts to apply a computational approach to learning behavior in small groups. This approach enables the identification of underlying learning parameters in group decision-making.

Introduction

Division of labor and specialization have significantly increased in modern society, and most of the tasks involved in these processes are carried out by groups1,2. Management of groups plays a critical role in achieving greater performance, which in turn hinges on understanding underlying group dynamics. Lewin3 coined the term “group dynamics” to refer to the way groups and individuals act and react to changing circumstances. In related literature, group performance has been found to be related to a combination of personality traits4, member ability4,5, team familiarity, team roles6 or leadership styles7,8, identity, conformity, psychological safety, and cohesiveness9,10,11.

Although these psychological and sociological factors indeed account for group performance, little attention has been paid to empirically measuring and characterizing the learning properties in group decision-making. Thus, this study does so, aiming to estimate relevant learning parameters by taking a computational approach to group decision-making.

Group decision-making has some advantages over individual decision-making. The former induces more employee involvement and satisfaction12, leading to higher performance13. However, a number of related studies on group decision-making have supported the proposition that groups rarely outperform their best members14,15. Nevertheless, the majority of related literature on group decision-making, under the assumption that group members cooperate and share information voluntarily15,16,17, has shown that groups outperform individuals in decision-making, enabling knowledge transfer from group to individual contexts18,19, more accurate information recall20, better negotiation outcomes21, more creative ideas22, and more accuracy23,24.

Although the single detection approach highlights the performance problems that can occur with individuals vs. dyads, the effects of larger group size on performance remain to be examined. This study examined whether three heads perform better than four or one. The comparison of groups of three and four and individuals highlights new, interesting issues in group decision-making that do not arise with groups of two vs. one, that is, even-sized groups vs. odd-sized groups14,25. Small groups are likely to break into two coalitions. If a group has an even number of members, the two subgroups can be equal in size. In this case, since the majority rule cannot be applied, subgroup dynamics might lead to deadlock26,27,28,29. In contrast, if a small group has an odd number of members, a minority and a majority subgroup emerge, and the majority influence provides a clear direction and group cohesion14,25,30,31. Thus, it is predicted that odd-sized small groups have higher cohesion and more consistent decision-making, leading to performance superior to that of even-sized ones.

This hypothesis could be reformulated in terms of learning coherence and incoherence. That is, triads (more generally, odd-sized groups) perform better than tetrads (more generally, even-sized groups) because the former maintain learning coherence due to the majority rule, whereas the latter suffer from learning incoherence due to conflict among group members. To formalize learning coherence in triads and learning incoherence in tetrads, assume two learning strategies exist, Sh and Sl, and the proportion of Sh in the population is p. The strategies Sh and Sl generate expected rewards of Rh and Rl, where Rh > Rl. In addition, a randomized strategy between Sh and Sl could exist, which underperforms both Sh and Sl due to its learning incoherence. The expected reward for this randomized strategy is Rw, where Rw < Rl < Rh. In the tetrads, when two subgroups of two members have different learning strategies, conflicts arise, leading to a situation in which both strategies are randomly adopted. This two-versus-two split occurs with probability $$6{p}^{2}{\left(1-p\right)}^{2}$$. In contrast, the triads never encounter this situation because a majority, which decides the preferred strategy, always exists. Thus, learning coherence could be achieved in triads. The difference between the expected rewards for the tetrads and triads is $$-3{p}^{2}{\left(1-p\right)}^{2}\left[Rh+Rl-2Rw\right]<0$$, indicating that tetrads underperform triads.
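The closed-form difference above can be checked numerically. The following is a minimal sketch, assuming each member independently uses Sh with probability p, the group adopts the majority's strategy, and an exact tie yields Rw; the function names and the numerical values of p, Rh, Rl, and Rw are illustrative assumptions, not taken from the study.

```python
from math import comb

def expected_reward(n, p, r_h, r_l, r_w):
    """Expected reward of an n-member group under majority rule.

    Each member uses S_h with probability p (assumed independent).
    A majority for S_h yields r_h, a majority for S_l yields r_l,
    and an exact tie yields the randomized-strategy reward r_w.
    """
    total = 0.0
    for k in range(n + 1):  # k = number of members using S_h
        prob = comb(n, k) * p**k * (1 - p) ** (n - k)
        if 2 * k > n:
            total += prob * r_h
        elif 2 * k < n:
            total += prob * r_l
        else:
            total += prob * r_w
    return total

def closed_form_gap(p, r_h, r_l, r_w):
    """Tetrad minus triad expected reward, as given in the text."""
    return -3 * p**2 * (1 - p) ** 2 * (r_h + r_l - 2 * r_w)

# Illustrative values only (hypothetical), chosen so that r_w < r_l < r_h.
p, r_h, r_l, r_w = 0.6, 1.0, 0.5, 0.2
gap = expected_reward(4, p, r_h, r_l, r_w) - expected_reward(3, p, r_h, r_l, r_w)
print(gap, closed_form_gap(p, r_h, r_l, r_w))
```

For any p strictly between 0 and 1 and any Rw < Rl < Rh, the enumerated gap should agree with the closed form and be negative.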

The purpose of this study was to examine the hypothesis that learning coherence emerges in individuals and triads and learning incoherence occurs in tetrads by estimating and comparing learning parameters.

Methods

Participants

A total of 343 healthy undergraduate students at Kobe University (103 women, age range = 19–25 years, SD = 1.21) participated in the study for course credit. All experimental protocols in this study were approved by the Ethics Committee, Graduate School of Business Administration, Kobe University, and the study was carried out in accordance with the relevant guidelines and regulations. All participants signed an informed consent form before the experiment.

Experiments

In test 1, participants undertook cognitive tasks (two-armed bandit [TAB] problems) individually. In test 2, they formed groups of three and performed as groups. In test 3, they formed groups of four and performed the same cognitive tasks as groups. There were seven rounds of tests. To control for learning effects, the three tests were randomly assigned to either groups or individuals in each round, that is, some groups were triads, other groups were tetrads, and the remaining participants performed as individuals. All tests were performed with PsyToolkit32,33, and when participants performed the TAB tasks as a group, they communicated with each other via a breakout session in Zoom to decide the choices in the TAB.

All participants undertook test 1, and most of the participants took part in tests 2 and 3. In each of the triad and tetrad groups, at least one member participated in both tests 2 and 3. Because all group members in the triads did not participate in test 3, and all group members in the tetrads did not participate in test 2, 8 triad groups and 4 tetrad groups were dropped from the sample. As a result, the total numbers of triad and tetrad groups examined in this study were 100 and 104, respectively.

Q-learning model

In this study, a simple Q-learning reinforcement learning algorithm34 was adopted to account for asymmetric learning rates (learning biases). Participants played a TAB problem, in which they chose either the right or the left box on the screen. After the selection, the participants were awarded either 10 or 0 points, and they were instructed to try to achieve the highest score over a series of 100 choices. One of the boxes had a higher probability of being worth 10 points (70%), and the corresponding probability of the other box was set at 30%. However, we switched these probabilities twice over 100 choices. For example, the right and left boxes had a respective 70% and 30% probability of being worth 10 points for the first 30 choices, and from the 31st to the 70th choice, the probabilities switched such that the probability of earning 10 points for the right and left boxes became 30% and 70%, respectively. Then, for the last 30 choices, the probabilities of the right and left boxes returned to the initial respective levels of 70% and 30%. Thus, in each round of tests 1 and 2, the probabilities passed through three phases over 100 choices. Moreover, the probabilities were randomized for every round of tests 1 and 2 so that even in the same test, the probability for each round differed. Therefore, participants could not transfer learning obtained in one round of the test to other rounds.
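The example schedule described above can be sketched as follows. This is only an illustration of the worked example in the text (right box favored for choices 1 to 30, left box for 31 to 70, right box again for 71 to 100); the function names are hypothetical, and the per-round randomization mentioned in the text is not modeled here.

```python
import random

def reward_probabilities(trial, p_high=0.7, p_low=0.3):
    """Return (P(reward | right), P(reward | left)) for a 1-based trial index.

    Follows the example in the text: the right box is the better option
    for trials 1-30 and 71-100, the left box for trials 31-70.
    """
    if 31 <= trial <= 70:
        return p_low, p_high
    return p_high, p_low

def draw_reward(choice, trial, rng, points=10):
    """Draw a reward of `points` or 0 for choosing 'right' or 'left'."""
    p_right, p_left = reward_probabilities(trial)
    p = p_right if choice == "right" else p_left
    return points if rng.random() < p else 0

# Usage: a hypothetical participant who always chooses the right box.
rng = random.Random(0)
score = sum(draw_reward("right", t, rng) for t in range(1, 101))
print(score)
```

A participant who never switches boxes would be on the worse option for 40 of the 100 choices, which is why tracking the reversals matters for the final score.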

In the Q-learning framework, a decision-maker is assumed to calculate the action value for each choice (i.e., the right and left boxes). The action value of option i at trial t is denoted by $${Q}_{i}\left(t\right)$$, calculated as follows:

$$\begin{array}{c}{Q}_{i}\left(t+1\right)=\begin{cases}{Q}_{i}\left(t\right)+{\alpha }^{+}\delta \left(t\right)+\phi & \text{if } \delta \left(t\right)\ge 0,\\ {Q}_{i}\left(t\right)+{\alpha }^{-}\delta \left(t\right)+\phi & \text{if } \delta \left(t\right)<0,\end{cases}\end{array}$$
(1)

with

$$\begin{array}{c}\delta \left(t\right)={R}_{i}\left(t\right)-{Q}_{i}\left(t\right),\end{array}$$
(2)

where $${\mathrm{R}}_{i}\left(t\right)$$ is the reward associated with option $$i$$ at trial $$t$$, either 10 or 0 points, and $$\delta \left(t\right)$$ is the reward prediction error. $${\alpha }^{\pm }$$ denotes the learning rates for non-negative and negative prediction errors, so that learning biases are measured by $${\alpha }^{+}-{\alpha }^{-}$$. If this difference is positive (negative), positivity (negativity) biases exist. $$\phi$$ is added in Eq. 1 as the choice trace to account for autocorrelation of choice, which could affect learning biases35.
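Eqs. 1 and 2 can be sketched as a single update step for the chosen option. This is a minimal illustration that follows the equations as written; the function name and the numerical values in the usage lines are hypothetical.

```python
def update_chosen_value(q, reward, alpha_pos, alpha_neg, phi):
    """One step of the asymmetric Q-learning update for the chosen option.

    delta = reward - q                      (Eq. 2)
    q_new = q + alpha_pos * delta + phi     if delta >= 0   (Eq. 1)
    q_new = q + alpha_neg * delta + phi     if delta <  0   (Eq. 1)
    """
    delta = reward - q
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q + alpha * delta + phi

# Usage with illustrative (hypothetical) parameter values.
alpha_pos, alpha_neg, phi = 0.3, 0.1, 0.0
q_after_win = update_chosen_value(0.0, 10, alpha_pos, alpha_neg, phi)
q_after_loss = update_chosen_value(5.0, 0, alpha_pos, alpha_neg, phi)
learning_bias = alpha_pos - alpha_neg  # positive value = positivity bias
print(q_after_win, q_after_loss, learning_bias)
```

With these values the agent moves further toward a reward of 10 than it moves away after a reward of 0, which is what a positivity bias means in this model.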

As one of the characteristics of learning, this study compared positivity biases. The positivity and confirmation biases refer, respectively, to the tendency to respond to positive news more sensitively than to negative news, and the tendency to favor outcomes consistent with one’s hypothesis36. Related studies examined the existence of these biases in individual reinforcement learning, and reported that learning rates tend to be positively biased37,38,39,40,41. Katahira35 suggested that the autocorrelation of choices itself tends to generate pseudo-positivity biases. Harada42 controlled for this autocorrelation by incorporating the effects of past choices into the learning model, and demonstrated that the positivity biases were indeed confirmed in a simple Q-learning model. However, once a more dynamic model was introduced, the positivity biases disappeared. Therefore, learning biases depended not only on the autocorrelation of choices, but also on the autocorrelation of learning parameters in the model. While previous studies examined learning biases for individuals, this study investigated the existence of positivity biases in group learning of triads and tetrads. As related studies indicated, it could be inferred that either positivity biases existed or no biases existed for both triads and tetrads. According to our hypothesis, we speculated that learning coherence in triads led to positivity biases, because individual learning was reported to generate positivity biases in related studies, while tetrads generated no biases due to learning incoherence.

If the decision-maker does not choose option $$\mathrm{j }\left(\mathrm{i}\ne \mathrm{j}\right)$$, its action value remains unchanged:

$$\begin{array}{c}{Q}_{j}\left(t+1\right)={Q}_{j}\left(t\right).\end{array}$$
(4)

Given these action values of the two options, the decision-maker determines one of the two options according to the softmax decision rule:

$$\begin{array}{c}P\left(a\left(t\right)=i\right)=\frac{\exp\left(\beta {Q}_{i}\left(t\right)\right)}{\sum_{j=1}^{2}\exp\left(\beta {Q}_{j}\left(t\right)\right)},\end{array}$$
(5)

where $$P\left(a\left(t\right)=i\right)$$ is the probability of choosing the action $$a\left(t\right)=i$$ at trial $$t$$. The parameter $$\upbeta$$ is the inverse temperature, which measures the relative strength of exploitation vs. exploration (exploitation/exploration ratio). Exploitation is related to optimization under current contexts, implying the choice of the option with the highest action value $${Q}_{i}\left(t\right)$$. Exploration, on the other hand, refers to the digression from optimization so that one of the options without the highest action value is selected. If $$\upbeta$$ is high, the probability of choosing the option with the highest action value increases, leading to exploitation. In contrast, if $$\upbeta$$ is low, the probability of choosing the option without the highest action value increases, leading to exploration.
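The effect of the inverse temperature in Eq. 5 can be illustrated with a short sketch. The function name and the action values used below are hypothetical; subtracting the maximum before exponentiating is a standard numerical-stability step and does not change the probabilities.

```python
import math

def choice_probabilities(q_values, beta):
    """Softmax choice probabilities for the given action values (Eq. 5)."""
    scaled = [beta * q for q in q_values]
    m = max(scaled)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative action values: the first option currently looks better.
q_values = [7.0, 3.0]
p_random = choice_probabilities(q_values, beta=0.0)   # pure exploration
p_mixed = choice_probabilities(q_values, beta=0.2)    # intermediate
p_greedy = choice_probabilities(q_values, beta=2.0)   # near-pure exploitation
print(p_random, p_mixed, p_greedy)
```

At $$\upbeta = 0$$ both options are chosen with equal probability regardless of their values, and as $$\upbeta$$ grows the probability mass concentrates on the option with the higher action value.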

Estimation method

The parameters specified in Eqs. (1)–(5) were estimated by optimizing the maximum a posteriori objective function:

$$\begin{array}{c}{\widehat{\theta }}_{s}=\underset{{\theta }_{s}}{\mathrm{argmax}}\; p\left({D}_{s}|{\theta }_{s}\right)p\left({\theta }_{s}\right),\end{array}$$
(6)

where $$p\left({D}_{s}|{\theta }_{s}\right)$$ is the likelihood of data $${D}_{s}$$ for a subject $$\mathrm{s}$$ conditional on parameters $${\theta }_{s}=\left\{{{\alpha }^{\pm }}^{S},{\phi }^{S},{\beta }^{S}\right\}$$, and $$p\left({\theta }_{s}\right)$$ is the prior probability of $${\theta }_{s}$$. Note that $$\mathrm{\alpha }$$ should be bounded between 0 and 1, and $$\upbeta$$ takes non-negative values. Therefore, the corresponding priors were assumed to follow beta distributions for $${\alpha }^{\pm }$$ with shape parameters of 2 and 2, and a gamma distribution for $$\upbeta$$ with a shape parameter of 2 and a scale parameter of 3. In addition, $${\phi }^{S}$$ is assumed to follow a standard normal distribution with mean 0 and variance 1.
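The objective in Eq. 6 can be sketched as a log-posterior function built from the priors stated above and the likelihood implied by Eqs. 1 to 5. This is a minimal illustration, not the estimation code used in the study: the function names are hypothetical, the initial action values are assumed to be zero (the text does not state them), options are coded as 0 and 1, and no particular optimizer is implied.

```python
import math

def log_prior(alpha_pos, alpha_neg, phi, beta):
    """Log prior: Beta(2, 2) on each alpha, Gamma(shape 2, scale 3) on beta,
    standard normal on phi."""
    if not (0 < alpha_pos < 1 and 0 < alpha_neg < 1 and beta > 0):
        return -math.inf
    lp = math.log(6 * alpha_pos * (1 - alpha_pos))     # Beta(2, 2) density
    lp += math.log(6 * alpha_neg * (1 - alpha_neg))    # Beta(2, 2) density
    lp += math.log(beta) - beta / 3 - math.log(9)      # Gamma(2, scale=3)
    lp += -0.5 * phi**2 - 0.5 * math.log(2 * math.pi)  # N(0, 1)
    return lp

def log_likelihood(choices, rewards, alpha_pos, alpha_neg, phi, beta):
    """Log likelihood of a choice sequence under Eqs. 1, 2, 4, and 5."""
    q = [0.0, 0.0]  # assumption: both action values start at zero
    ll = 0.0
    for a, r in zip(choices, rewards):
        scaled = [beta * v for v in q]
        m = max(scaled)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
        ll += scaled[a] - log_norm             # log softmax probability
        delta = r - q[a]                       # reward prediction error
        alpha = alpha_pos if delta >= 0 else alpha_neg
        q[a] += alpha * delta + phi            # unchosen value is unchanged
    return ll

def log_posterior(choices, rewards, theta):
    """Unnormalized log posterior; its maximizer is the MAP estimate."""
    alpha_pos, alpha_neg, phi, beta = theta
    lp = log_prior(alpha_pos, alpha_neg, phi, beta)
    if lp == -math.inf:
        return lp
    return lp + log_likelihood(choices, rewards, alpha_pos, alpha_neg, phi, beta)

# Usage on a tiny made-up sequence (0 = left, 1 = right).
choices = [0, 0, 1, 0]
rewards = [10, 0, 0, 10]
print(log_posterior(choices, rewards, (0.4, 0.2, 0.0, 0.5)))
```

Maximizing this function over the four parameters, for example by passing its negative to a general-purpose numerical minimizer, would give a MAP estimate for one subject or group.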

Results

This study investigated underlying learning mechanisms of triads and tetrads from two perspectives: (1) group differences and (2) within-group effects. The descriptive statistics for relevant variables are reported in Table 1. Because the data violated either homogeneity of variance (Bartlett test) or normality (Shapiro–Wilk test) in the comparisons across and within groups, the Kruskal–Wallis test was applied in all subsequent analyses. Owing to space limitations, the results of the Bartlett and Shapiro–Wilk tests are not reported.
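For readers unfamiliar with the Kruskal–Wallis test, the two-group case can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study: the function names and sample values are hypothetical, tied values receive average ranks but no tie correction is applied to the statistic, and the p-value assumes a chi-square reference distribution with one degree of freedom, as is usual when two groups are compared.

```python
import math

def midranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2))

def kruskal_wallis_two_groups(a, b):
    """Kruskal-Wallis H statistic (no tie correction) and p-value."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = midranks(pooled)
    r_a = sum(ranks[: len(a)])
    r_b = sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (r_a**2 / len(a) + r_b**2 / len(b)) - 3 * (n + 1)
    return h, chi2_sf_1df(h)

# Usage on made-up scores for two small groups.
h, p_value = kruskal_wallis_two_groups([1, 2, 3], [4, 5, 6])
print(h, p_value)
```

Under the one-degree-of-freedom assumption, `chi2_sf_1df` can also be used to relate a reported chi-square statistic to its p-value.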

Group differences

Performance

First, the performance difference between triads and tetrads was examined. The result suggested that a performance difference existed between triads and tetrads and triads outperformed tetrads ($${\chi }^{2}$$=4.12, p = 0.04). Thus, we could identify that triads generated slightly higher performance than tetrads (see Fig. 1).

Inverse temperature

As the first characteristic of learning, the magnitude of the inverse temperature between triads and tetrads was compared. Inverse temperature measured the degree of exploitation vis-à-vis exploration. Exploitation adopts the optimal choices, given existing information, whereas exploration makes random choices. Inverse temperature was significantly higher for triads than for tetrads ($${\chi }^{2}$$=42.88, p = 5.8e−11) (see Fig. 2). It follows that tetrads were more likely to make random choices, regardless of past records. It could be inferred that this result arose because the majority rule was harder to apply in tetrads than in triads. This implied learning coherence in triads and incoherence in tetrads.

Positivity biases

As the second characteristic of learning, this study compared positivity biases. While previous studies examined learning biases for individuals, this study investigated the existence of positivity biases in group learning of triads and tetrads. As related studies indicated, it could be inferred that either positivity biases existed or no biases existed for both triads and tetrads. For triads, positivity biases were supported ($${\chi }^{2}$$=13.39, p = 2.5e−04). However, for tetrads, we confirmed negativity biases ($${\chi }^{2}$$=24.05, p = 9.4e−07). This study also investigated learning biases for individuals, revealing that positivity biases existed ($${\chi }^{2}$$=22.08, p = 2.6e−06). Thus, while individuals and triads exhibited positivity biases, tetrads exhibited negativity biases (see Fig. 3). According to related studies, this result suggested learning coherence for triads and learning incoherence for tetrads.

Within-group effects

As the within-group effects, the maximum, minimum, and the average of group members’ individual performances and learning parameters were compared with the corresponding group variables.

Performance

In triads, group performance exceeded the minimum of the individual performances of group members ($${\chi }^{2}$$=45.7, p = 1.4e-11), but fell short of their maximum ($${\chi }^{2}$$=23.91, p = 1.1e-06). However, group performance and the average of individual performances were not differentiated ($${\chi }^{2}$$=0.89, p = 0.34). Similarly, in tetrads, group performance exceeded the minimum of individual members ($${\chi }^{2}$$=47.94, p = 4.4e-12), but fell short of their maximum ($${\chi }^{2}$$=59.46, p = 1.3e-14). Group performance and the average of individual performances were not differentiated ($${\chi }^{2}$$=0.03, p = 0.87).

Inverse temperature

In triads, inverse temperature was greater than its minimum ($${\chi }^{2}$$=58.41, p = 2.1e−14) and average ($${\chi }^{2}$$=7.12, p = 0.01) of individual group members, but was weakly smaller than its maximum version ($${\chi }^{2}$$=3.50, p = 0.06). In tetrads, group inverse temperature was greater than the minimum of individual members ($${\chi }^{2}$$=26.68, p = 2.4e-07), but was smaller than both its maximum ($${\chi }^{2}$$=137.48, p = 2.2e-16) and average versions ($${\chi }^{2}$$=87.80, p = 2.2e-16).

Thus, group effects in triads were higher in inverse temperature because triads achieved higher group inverse temperature than their average, while those in tetrads were smaller than their average version.

Positivity biases

In triads, positivity biases were greater than the minimum of individual group members ($${\chi }^{2}$$=40.02, p = 2.5e-10), but were smaller than their maximum ($${\chi }^{2}$$=26.05, p = 3.3e-07). However, positivity biases and their average version were not differentiated ($${\chi }^{2}$$=1.03, p = 0.31). In tetrads, negativity biases were greater than the minimum ($${\chi }^{2}$$=34.33, p = 4.7e-09), maximum ($${\chi }^{2}$$=142, p = 2.2e-16), and average ($${\chi }^{2}$$=46.45, p = 9.4e-12) of individual members.

Thus, group effects in triads were high in generating positivity biases, but those in tetrads were also significant in giving rise to negativity biases.

Discussion

Overall, our statistical analysis revealed that triads had higher performance, higher inverse temperature, and more positivity biases than tetrads. Since inverse temperature and positivity biases were indicated to be positively related to performance, these results implied that triads achieved learning coherence, but tetrads experienced learning incoherence. On the one hand, it can be inferred that triads, which might break into majority and minority subgroups, achieved consistent and efficient learning over 100 choices, as indicated by high performance, inverse temperature, and positivity biases. On the other hand, tetrads, which might be constrained by two equal subgroups, encountered dispute and confrontation, sometimes leading to deadlock, resulting in lower performance and inconsistent learning behavior, represented as low inverse temperature and high negativity biases. These results were consistent with related studies14,25,26,27,28,29,31.

In contrast to performance, inverse temperature, and positivity biases, the risk parameters $$\upmu$$ and $$\upnu$$ did not account for the difference between triads and tetrads. Note that risk-seeking behavior also has a tendency toward divergence from current learning strategies. In this sense, risk-seeking has some similarity to exploration. However, in our model, exploration corresponded to divergence from the optimal Q value, which already incorporated risk-seeking behavior. Hence, risk-seeking and exploration have subtle differences. The fact that inverse temperature differed between triads and tetrads implies that the divergence from a consistent learning strategy was reflected in the inverse temperature but not in risk attitudes.

In addition to these results, this paper contributes a novel methodology for the study of small groups. To the best of our knowledge, this is one of the first attempts to take a computational approach to the study of small-group dynamics. Of course, a large body of literature on group dynamics has empirically investigated the properties of the dynamics of small groups. However, most of these studies did not explicitly model the underlying mechanism of group decision-making or estimate parameters that characterize group dynamics. The computational approach proposed in this paper articulates the algorithm of group decision-making and enables the underlying learning parameters to be estimated, allowing for rigorous comparison among small groups in terms of learning parameters such as inverse temperature and risk attitudes. We hope this computational approach sheds new light on group dynamics and group decision-making.

In this respect, it should also be noted that a simple Q-learning model, or reinforcement learning in general, closely corresponds to the actual working of neural networks in the brain. The key variables are the actual rewards and reward prediction errors. The Q value is the expected reward, which is updated by feedback from a reward prediction error. This reinforcement learning framework is supported by a number of empirical studies, including studies of neural signals in various cortical and subcortical structures that behave as predicted43,44,45,46. For example, it is now commonly accepted that dopamine neurons in the midbrain of humans and monkeys encode reward prediction errors46,47,48. Thus, the reinforcement learning model class is typically matched by brain activity. Since the simple Q-learning model considered in this paper belongs to this model class, the model matches brain activity, unlike abstract and unrealistic models without an empirical foundation.

One of the managerial implications derived from this study is that group size is crucial to the management of small groups. In particular, when groups undertake learning under uncertainty without the burden of creativity and insight, triads, rather than tetrads, should be selected. However, when tasks require much creativity and insight, tetrads, rather than triads, might be preferred, although this idea was not examined in this study. In broader contexts, odd-sized groups are favored for learning tasks without creativity and even-sized groups for creative problem-solving25. This rule is clear and straightforward to implement, but, of course, diversity in knowledge, skill, working experiences, cultural backgrounds, and personalities also account for group performance. However, unless managers have sufficient time to take these factors into account, this simple rule should be implemented.

References

1. Wuchty, S., Jones, B. F. & Uzzi, B. The increasing dominance of teams in production of knowledge. Science 316, 1036–1039. https://doi.org/10.1126/science.1136099 (2007).

2. Gowers, T. & Nielsen, M. Massively collaborative mathematics. Nature 461, 879–881. https://doi.org/10.1038/461879a (2009).

3. Lewin, K. In Field theory in social science: selected theoretical papers (ed. Dorwin, C.) (Harpers, 1951).

4. van Vianen, A. E. M. & De Dreu, C. K. W. Personality in teams: Its relationship to social cohesion, task cohesion, and team performance. Eur. J. Work Organ. Psy. 10, 97–120. https://doi.org/10.1080/13594320143000573 (2001).

5. Barrick, M. R., Stewart, G. L., Neubert, M. J. & Mount, M. K. Relating member ability and personality to work-team processes and team effectiveness. J. Appl. Psychol. 83, 377–391. https://doi.org/10.1037/0021-9010.83.3.377 (1998).

6. Fisher, S., Hunter, T. A. & Macrosson, W. D. K. Belbin’s team role theory: for non-managers also?. J. Manag. Psychol. 17, 14–20. https://doi.org/10.1108/02683940210415906 (2002).

7. De Church, L. A. & Marks, M. A. Leadership in multiteam systems. J. Appl. Psychol. 91, 311–329. https://doi.org/10.1037/0021-9010.91.2.311 (2006).

8. Gerstner, C. R. & Day, D. V. Meta-Analytic review of leader–member exchange theory: Correlates and construct issues. J. Appl. Psychol. 82, 827–844. https://doi.org/10.1037/0021-9010.82.6.827 (1997).

9. Beal, D. J., Cohen, R. R., Burke, M. J. & McLendon, C. L. Cohesion and performance in groups: A meta-analytic clarification of construct relations. J. Appl. Psychol. 88, 989–1004. https://doi.org/10.1037/0021-9010.88.6.989 (2003).

10. Chiocchio, F. & Essiembre, H. Cohesion and performance: A meta-analytic review of disparities between project teams, production teams, and service teams. Small Group Res. 40, 382–420. https://doi.org/10.1177/1046496409335103 (2009).

11. Mullen, B. & Copper, C. The relation between group cohesiveness and performance: An integration. Psychol. Bull. 115, 210–227. https://doi.org/10.1037/0033-2909.115.2.210 (1994).

12. Wellins, R. S., Byham, W. C. & Dixon, G. R. Inside Teams (Jossey-Bass, 1994).

13. Salas, E., Cooke, N. J. & Rosen, M. A. On teams, teamwork, and team performance: Discoveries and developments. Hum. Factors 50, 540–547. https://doi.org/10.1518/001872008x288457 (2008).

14. Hastie, R. & Kameda, T. The robust beauty of majority rules in group decisions. Psychol. Rev. 112, 494–508. https://doi.org/10.1037/0033-295X.112.2.494 (2005).

15. Kerr, N. L. & Tindale, R. S. Group performance and decision making. Annu. Rev. Psychol. 55, 623–655. https://doi.org/10.1146/annurev.psych.55.090902.142009 (2004).

16. Adamowicz, W. et al. Decision strategy and structure in households: A “groups” perspective. Mark. Lett. 16, 387–399. https://doi.org/10.1007/s11002-005-5900-6 (2005).

17. Tindale, R. S. & Kluwe, K. In The Wiley Blackwell Handbook of Judgment and Decision Making Vol. 2 (eds Gideon, K. & George, W.) 849–874 (John Wiley & Sons, 2015).

18. Maciejovsky, B. & Budescu, D. V. Collective induction without cooperation? Learning and knowledge transfer in cooperative groups and competitive auctions. J. Pers. Soc. Psychol. 92, 854–870. https://doi.org/10.1037/0022-3514.92.5.854 (2007).

19. Laughlin, P. R. Group Problem Solving (Princeton University Press, 2011).

20. Hinsz, V. B. Cognitive and consensus processes in group recognition memory performance. J. Pers. Soc. Psychol. 59, 705–718. https://doi.org/10.1037/0022-3514.59.4.705 (1990).

21. Morgan, P. M. & Tindale, R. S. Group vs individual performance in mixed-motive situations: Exploring an inconsistency. Organ. Behav. Hum. Decis. Process. 87, 44–65. https://doi.org/10.1006/obhd.2001.2952 (2002).

22. Nijstad, B. A. & Paulus, P. B. In Group Creativity: Innovation Through Collaboration (eds Paulus, P. B. & Nijstad, B. A.) 326–339 (Oxford University Press, 2003).

23. Kerr, N. L. & Tindale, R. S. Group-based forecasting?: A social psychological analysis. Int. J. Forecast. 27, 14–40. https://doi.org/10.1016/j.ijforecast.2010.02.001 (2011).

24. Mellers, B. et al. Psychological strategies for winning a geopolitical forecasting tournament. Psychol. Sci. 25, 1106–1115. https://doi.org/10.1177/0956797614524255 (2014).

25. Menon, T. & Phillips, K. W. Getting even or being at odds? Cohesion in even- and odd-sized small groups. Organ. Sci. 22, 738–753. https://doi.org/10.1287/orsc.1100.0535 (2011).

26. Murnighan, J. K. Models of coalition behavior: Game theoretic, social psychological, and political perspectives. Psychol. Bull. 85, 1130–1153. https://doi.org/10.1037/0033-2909.85.5.1130 (1978).

27. O’Leary, M. B. & Mortensen, M. Go (con)figure: Subgroups, imbalance, and isolates in geographically dispersed teams. Organ. Sci. 21, 115–131. https://doi.org/10.1287/orsc.1090.0434 (2010).

28. Polzer, J. T., Crisp, C. B., Jarvenpaa, S. L. & Kim, J. W. Extending the faultline model to geographically dispersed teams: How colocated subgroups can impair group functioning. Acad. Manag. J. 49, 679–692. https://doi.org/10.5465/amj.2006.22083024 (2006).

29. Shears, L. M. Patterns of coalition formation in two games played by male tetrads. Behav. Sci. 12, 130–137. https://doi.org/10.1002/bs.3830120206 (1967).

30. Asch, S. E. In Groups, Leadership and Men; Research in Human Relations (ed. Guetzkow, H.) 177–190 (Carnegie Press, 1951).

31. Wittenbaum, G. M., Stasser, G. & Merry, C. J. Tacit coordination in anticipation of small group task completion. J. Exp. Soc. Psychol. 32, 129–152. https://doi.org/10.1006/jesp.1996.0006 (1996).

32. Stoet, G. PsyToolkit—A software package for programming psychological experiments using Linux. Behav. Res. Methods 42, 1096–1104. https://doi.org/10.3758/BRM.42.4.1096 (2010).

33. Stoet, G. PsyToolkit: A novel web-based method for running online questionnaires and reaction-time experiments. Teach. Psychol. 44, 24–31. https://doi.org/10.1177/0098628316677643 (2017).

34. Watkins, C. J. C. H. & Dayan, P. Q-learning. Mach. Learn. 8, 279–292. https://doi.org/10.1007/BF00992698 (1992).

35. Katahira, K. The statistical structures of reinforcement learning with asymmetric value updates. J. Math. Psychol. 87, 31–45. https://doi.org/10.1016/j.jmp.2018.09.002 (2018).

36. Palminteri, S., Lefebvre, G., Kilford, E. J. & Blakemore, S.-J. Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing. PLoS Comput. Biol. 13, 1–22. https://doi.org/10.1371/journal.pcbi.1005684 (2017).

37. Aberg, K. C., Doell, K. C. & Schwartz, S. Hemispheric asymmetries in striatal reward responses relate to approach–avoidance learning and encoding of positive–negative prediction errors in dopaminergic midbrain regions. J. Neurosci. 35, 14491–14500. https://doi.org/10.1523/JNEUROSCI.1859-15.2015 (2015).

38. den Ouden, H. E. M. et al. Dissociable effects of dopamine and serotonin on reversal learning. Neuron 80, 1090–1100. https://doi.org/10.1016/j.neuron.2013.08.030 (2013).

39. Frank, M. J., Moustafa, A. A., Haughey, H. M., Curran, T. & Hutchison, K. E. Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proc. Natl. Acad. Sci. U.S.A. 104, 16311–16316. https://doi.org/10.1073/pnas.0706111104 (2007).

40. Lefebvre, G., Lebreton, M., Meyniel, F., Bourgeois-Gironde, S. & Palminteri, S. Behavioural and neural characterization of optimistic reinforcement learning. Nat. Hum. Behav. 1, 1–9. https://doi.org/10.1038/s41562-017-0067 (2017).

41. van den Bos, W., Cohen, M. X., Kahnt, T. & Crone, E. A. Striatum-medial prefrontal cortex connectivity predicts developmental changes in reinforcement learning. Cereb. Cortex 22, 1247–1255. https://doi.org/10.1093/cercor/bhr198 (2012).

42. Harada, T. Learning from success or failure?—Positivity biases revisited. Front. Psychol. 11, 1–11. https://doi.org/10.3389/fpsyg.2020.01627 (2020).

43. Glimcher, P. W. & Rustichini, A. Neuroeconomics: The consilience of brain and decision. Science 306, 447–452. https://doi.org/10.1126/science.1102566 (2004).

44. Hikosaka, O., Nakamura, K. & Nakahara, H. Basal ganglia orient eyes to reward. J. Neurophysiol. 95, 567–584. https://doi.org/10.1152/jn.00458.2005 (2006).

45. Rangel, A., Camerer, C. & Montague, P. R. A framework for studying the neurobiology of value-based decision making. Nat. Rev. Neurosci. 9, 545–556. https://doi.org/10.1038/nrn2357 (2008).

46. Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593–1599. https://doi.org/10.1126/science.275.5306.1593 (1997).

47. Bayer, H. M. & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141. https://doi.org/10.1016/j.neuron.2005.05.020 (2005).

48. Cohen, J. Y., Haesler, S., Vong, L., Lowell, B. B. & Uchida, N. Neuron-type-specific signals for reward and punishment in the ventral tegmental area. Nature 482, 85–88. https://doi.org/10.1038/nature10754 (2012).

49. Simmel, G. The Sociology of Georg Simmel (The Free Press, 1964).

50. Harada, T. Three heads are better than two: Comparing learning properties and performances across individuals, dyads, and triads through a computational approach. PLoS ONE 16, 1–16. https://doi.org/10.1371/journal.pone.0252122 (2021).

Author information

Contributions

T.H. wrote the whole manuscript, prepared figures and tables.

Ethics declarations

Competing interests

The author declares no competing interests.

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Harada, T. Examining learning coherence in group decision-making: triads vs. tetrads. Sci Rep 11, 20461 (2021). https://doi.org/10.1038/s41598-021-00089-w
