Multiagent Systems
See recent articles
Showing new listings for Monday, 27 January 2025
- [1] arXiv:2501.14111 [pdf, html, other]
-
Title: Collaborating in a competitive world: Heterogeneous Multi-Agent Decision Making in Symbiotic Supply Chain EnvironmentsSubjects: Multiagent Systems (cs.MA)
Supply networks require collaboration in a competitive environment. To achieve this, nodes in the network often form symbiotic relationships as they can be adversely effected by the closure of companies in the network, especially where products are niche. However, balancing support for other nodes in the network against profit is challenging. Agents are increasingly being explored to define optimal strategies in these complex networks. However, to date much of the literature focuses on homogeneous agents where a single policy controls all of the nodes. This isn't realistic for many supply chains as this level of information sharing would require an exceptionally close relationship. This paper therefore compares the behaviour of this type of agent to a heterogeneous structure, where the agents each have separate polices, to solve the product ordering and pricing problem. An approach to reward sharing is developed that doesn't require sharing profit. The homogenous and heterogeneous agents exhibit different behaviours, with the homogenous retailer retaining high inventories and witnessing high levels of backlog while the heterogeneous agents show a typical order strategy. This leads to the heterogeneous agents mitigating the bullwhip effect whereas the homogenous agents do not. In the high demand environment, the agent architecture dominates performance with the Soft Actor-Critic (SAC) agents outperforming the Proximal Policy Optimisation (PPO) agents. Here, the factory controls the supply chain. In the low demand environment the homogenous agents outperform the heterogeneous agents. Control of the supply chain shifts significantly, with the retailer outperforming the factory by a significant margin.
- [2] arXiv:2501.14488 [pdf, html, other]
-
Title: Breaking the Pre-Planning Barrier: Real-Time Adaptive Coordination of Mission and Charging UAVs Using Graph Reinforcement LearningSubjects: Multiagent Systems (cs.MA)
Unmanned Aerial Vehicles (UAVs) are pivotal in applications such as search and rescue and environmental monitoring, excelling in intelligent perception tasks. However, their limited battery capacity hinders long-duration and long-distance missions. Charging UAVs (CUAVs) offers a potential solution by recharging mission UAVs (MUAVs), but existing methods rely on impractical pre-planned routes, failing to enable organic cooperation and limiting mission efficiency. We introduce a novel multi-agent deep reinforcement learning model named \textbf{H}eterogeneous \textbf{G}raph \textbf{A}ttention \textbf{M}ulti-agent Deep Deterministic Policy Gradient (HGAM), designed to dynamically coordinate MUAVs and CUAVs. This approach maximizes data collection, geographical fairness, and energy efficiency by allowing UAVs to adapt their routes in real-time to current task demands and environmental conditions without pre-planning. Our model uses heterogeneous graph attention networks (GATs) to present heterogeneous agents and facilitate efficient information exchange. It operates within an actor-critic framework. Simulation results show that our model significantly improves cooperation among heterogeneous UAVs, outperforming existing methods in several metrics, including data collection rate and charging efficiency.
New submissions (showing 2 of 2 entries)
- [3] arXiv:2501.13946 (cross-list from cs.CL) [pdf, html, other]
-
Title: Hallucination Mitigation using Agentic AI Natural Language-Based FrameworksComments: 18 pages, 6 figuresSubjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Hallucinations remain a significant challenge in current Generative AI models, undermining trust in AI systems and their reliability. This study investigates how orchestrating multiple specialized Artificial Intelligent Agents can help mitigate such hallucinations, with a focus on systems leveraging Natural Language Processing (NLP) to facilitate seamless agent interactions. To achieve this, we design a pipeline that introduces over three hundred prompts, purposefully crafted to induce hallucinations, into a front-end agent. The outputs are then systematically reviewed and refined by second- and third-level agents, each employing distinct large language models and tailored strategies to detect unverified claims, incorporate explicit disclaimers, and clarify speculative content. Additionally, we introduce a set of novel Key Performance Indicators (KPIs) specifically designed to evaluate hallucination score levels. A dedicated fourth-level AI agent is employed to evaluate these KPIs, providing detailed assessments and ensuring accurate quantification of shifts in hallucination-related behaviors. A core component of this investigation is the use of the OVON (Open Voice Network) framework, which relies on universal NLP-based interfaces to transfer contextual information among agents. Through structured JSON messages, each agent communicates its assessment of the hallucination likelihood and the reasons underlying questionable content, thereby enabling the subsequent stage to refine the text without losing context. The results demonstrate that employing multiple specialized agents capable of interoperating with each other through NLP-based agentic frameworks can yield promising outcomes in hallucination mitigation, ultimately bolstering trust within the AI community.
- [4] arXiv:2501.14170 (cross-list from cs.LG) [pdf, html, other]
-
Title: Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via Large Language ModelsSubjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)
Observability in cloud infrastructure is critical for service providers, driving the widespread adoption of anomaly detection systems for monitoring metrics. However, existing systems often struggle to simultaneously achieve explainability, reproducibility, and autonomy, which are three indispensable properties for production use. We introduce Argos, an agentic system for detecting time-series anomalies in cloud infrastructure by leveraging large language models (LLMs). Argos proposes to use explainable and reproducible anomaly rules as intermediate representation and employs LLMs to autonomously generate such rules. The system will efficiently train error-free and accuracy-guaranteed anomaly rules through multiple collaborative agents and deploy the trained rules for low-cost online anomaly detection. Through evaluation results, we demonstrate that Argos outperforms state-of-the-art methods, increasing $F_1$ scores by up to $9.5\%$ and $28.3\%$ on public anomaly detection datasets and an internal dataset collected from Microsoft, respectively.
- [5] arXiv:2501.14189 (cross-list from cs.AI) [pdf, html, other]
-
Title: Distributed Multi-Agent Coordination Using Multi-Modal Foundation ModelsSubjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Distributed Constraint Optimization Problems (DCOPs) offer a powerful framework for multi-agent coordination but often rely on labor-intensive, manual problem construction. To address this, we introduce VL-DCOPs, a framework that takes advantage of large multimodal foundation models (LFMs) to automatically generate constraints from both visual and linguistic instructions. We then introduce a spectrum of agent archetypes for solving VL-DCOPs: from a neuro-symbolic agent that delegates some of the algorithmic decisions to an LFM, to a fully neural agent that depends entirely on an LFM for coordination. We evaluate these agent archetypes using state-of-the-art LLMs (large language models) and VLMs (vision language models) on three novel VL-DCOP tasks and compare their respective advantages and drawbacks. Lastly, we discuss how this work extends to broader frontier challenges in the DCOP literature.
- [6] arXiv:2501.14653 (cross-list from cs.LG) [pdf, html, other]
-
Title: Federated Domain Generalization with Data-free On-server Gradient MatchingComments: 26 pages, 15 figures, ICLRSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)
Domain Generalization (DG) aims to learn from multiple known source domains a model that can generalize well to unknown target domains. One of the key approaches in DG is training an encoder which generates domain-invariant representations. However, this approach is not applicable in Federated Domain Generalization (FDG), where data from various domains are distributed across different clients. In this paper, we introduce a novel approach, dubbed Federated Learning via On-server Matching Gradient (FedOMG), which can \emph{efficiently leverage domain information from distributed domains}. Specifically, we utilize the local gradients as information about the distributed models to find an invariant gradient direction across all domains through gradient inner product maximization. The advantages are two-fold: 1) FedOMG can aggregate the characteristics of distributed models on the centralized server without incurring any additional communication cost, and 2) FedOMG is orthogonal to many existing FL/FDG methods, allowing for additional performance improvements by being seamlessly integrated with them. Extensive experimental evaluations on various settings to demonstrate the robustness of FedOMG compared to other FL/FDG baselines. Our method outperforms recent SOTA baselines on four FL benchmark datasets (MNIST, EMNIST, CIFAR-10, and CIFAR-100), and three FDG benchmark datasets (PACS, VLCS, and OfficeHome).
- [7] arXiv:2501.14654 (cross-list from cs.LG) [pdf, html, other]
-
Title: MedAgentBench: Dataset for Benchmarking LLMs as Agents in Medical ApplicationsSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Recent large language models (LLMs) have demonstrated significant advancements, particularly in their ability to serve as agents thereby surpassing their traditional role as chatbots. These agents can leverage their planning and tool utilization capabilities to address tasks specified at a high level. However, a standardized dataset to benchmark the agent capabilities of LLMs in medical applications is currently lacking, making the evaluation of LLMs on complex tasks in interactive healthcare environments challenging. To address this gap, we introduce MedAgentBench, a broad evaluation suite designed to assess the agent capabilities of large language models within medical records contexts. MedAgentBench encompasses 100 patient-specific clinically-derived tasks from 10 categories written by human physicians, realistic profiles of 100 patients with over 700,000 data elements, a FHIR-compliant interactive environment, and an accompanying codebase. The environment uses the standard APIs and communication infrastructure used in modern EMR systems, so it can be easily migrated into live EMR systems. MedAgentBench presents an unsaturated agent-oriented benchmark that current state-of-the-art LLMs exhibit some ability to succeed at. The best model (GPT-4o) achieves a success rate of 72%. However, there is still substantial space for improvement to give the community a next direction to optimize. Furthermore, there is significant variation in performance across task categories. MedAgentBench establishes this and is publicly available at this https URL , offering a valuable framework for model developers to track progress and drive continuous improvements in the agent capabilities of large language models within the medical domain.
Cross submissions (showing 5 of 5 entries)
- [8] arXiv:2409.06888 (replaced) [pdf, html, other]
-
Title: A Quality Diversity Method to Automatically Generate Multi-Agent Path Finding Benchmark MapsCheng Qian, Yulun Zhang, Varun Bhatt, Matthew Christopher Fontaine, Stefanos Nikolaidis, Jiaoyang LiComments: 13 pages, 18 figuresSubjects: Multiagent Systems (cs.MA)
We use the Quality Diversity (QD) algorithm with Neural Cellular Automata (NCA) to generate benchmark maps for Multi-Agent Path Finding (MAPF) algorithms. Previously, MAPF algorithms are tested using fixed, human-designed benchmark maps. However, such fixed benchmark maps have several problems. First, these maps may not cover all the potential failure scenarios for the algorithms. Second, when comparing different algorithms, fixed benchmark maps may introduce bias leading to unfair comparisons between algorithms. Third, since researchers test new algorithms on a small set of fixed benchmark maps, the design of the algorithms may overfit to the small set of maps. In this work, we take advantage of the QD algorithm to (1) generate maps with patterns to comprehensively understand the performance of MAPF algorithms, (2) be able to make fair comparisons between two MAPF algorithms, providing further information on the selection between two algorithms and on the design of the algorithms. Empirically, we employ this technique to generate diverse benchmark maps to evaluate and compare the behavior of different types of MAPF algorithms, including search-based, priority-based, rule-based, and learning-based algorithms. Through both single-algorithm experiments and comparisons between algorithms, we identify patterns where each algorithm excels and detect disparities in runtime or success rates between different algorithms.
- [9] arXiv:2402.16201 (replaced) [pdf, html, other]
-
Title: Honeybee: Byzantine Tolerant Decentralized Peer Sampling with Verifiable Random WalksComments: 29 pages; acmsmall-confSubjects: Networking and Internet Architecture (cs.NI); Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Multiagent Systems (cs.MA)
Popular blockchains today have hundreds of thousands of nodes and need to be able to support sophisticated scaling solutions$\unicode{x2013}$such as sharding, data availability sampling, and layer-2 methods. Designing secure and efficient peer-to-peer (p2p) networking protocols at these scales to support the tight demands of the upper layer crypto-economic primitives is a highly non-trivial endeavor. We identify decentralized, uniform random sampling of nodes as a fundamental capability necessary for building robust p2p networks in emerging blockchain networks. Sampling algorithms used in practice today (primarily for address discovery) rely on either distributed hash tables (e.g., Kademlia) or sharing addresses with neighbors (e.g., GossipSub), and are not secure in a Sybil setting. We present Honeybee, a decentralized algorithm for sampling nodes that uses verifiable random walks and peer consistency checks. Honeybee is secure against attacks even in the presence of an overwhelming number of Byzantine nodes (e.g., $\geq50\%$ of the network). We evaluate Honeybee through experiments and show that the quality of sampling achieved by Honeybee is significantly better compared to the state-of-the-art. Our proposed algorithm has implications for network design in both full nodes and light nodes.
- [10] arXiv:2501.10388 (replaced) [pdf, html, other]
-
Title: Beyond the Sum: Unlocking AI Agents Potential Through Market ForcesComments: 20 pages, 5 figuresSubjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)
The emergence of Large Language Models has fundamentally transformed the capabilities of AI agents, enabling a new class of autonomous agents capable of interacting with their environment through dynamic code generation and execution. These agents possess the theoretical capacity to operate as independent economic actors within digital markets, offering unprecedented potential for value creation through their distinct advantages in operational continuity, perfect replication, and distributed learning capabilities. However, contemporary digital infrastructure, architected primarily for human interaction, presents significant barriers to their participation.
This work presents a systematic analysis of the infrastructure requirements necessary for AI agents to function as autonomous participants in digital markets. We examine four key areas - identity and authorization, service discovery, interfaces, and payment systems - to show how existing infrastructure actively impedes agent participation. We argue that addressing these infrastructure challenges represents more than a technical imperative; it constitutes a fundamental step toward enabling new forms of economic organization. Much as traditional markets enable human intelligence to coordinate complex activities beyond individual capability, markets incorporating AI agents could dramatically enhance economic efficiency through continuous operation, perfect information sharing, and rapid adaptation to changing conditions. The infrastructure challenges identified in this work represent key barriers to realizing this potential.