Notes
Slide Show
Outline
1
Machine Learning Approaches
for UAV Coordination
(and Adaptation)
  • Yoav Shoham & Kevin Leyton-Brown
  • Dept. of Computer Science
  • Stanford University
2
Critical Elements
3
Critical Elements (1)
4
Critical Elements (1)
5
Dispersion Games
  • A novel class of games in which agents attempt to disperse across actions to maximize a utility function
    • cooperative: all agents have the same utility function
    • agents are assumed to have neither a central coordinator nor any direct communication with other agents
    • related to coordination games (anti-coordination)
    • agents can observe action of {no; some; all} other agents
      • algorithms for each case
6
Experimental Results
  • Example:
  • agents choose tasks from an available set
  • utility maximized when all agents choose different tasks, each task is covered by one agent
7
Dispersion Games: Extensions
  • Our past work on dispersion games only analyzed the case of symmetric agents and tasks, uniform cost for each task
  • Non-uniform task costs:
    • action space expanded to all subsets of tasks with size between 1 and max
    • utility function =
  • Other possible extensions:
    • restrictions on which agents can service which tasks
    • some tasks must be serviced by multiple agents
  • Fundamentally, this approach is best suited to problems in without initial coordination or communication
8
Non-Cooperative Setting
  • Consider the task of tracking intelligent, moving targets
  • Reasonable to expect that these targets would be able to adapt to fixed strategies to reduce UAV effectiveness
    • If searches are done by day, move at night
    • If searches are on predictable schedule, move between them
  • Mobile targets thus give rise to a (partly) competitive setting
    • UAVs relate cooperatively with each other
    • UAVs relate competitively with the targets
9
Critical Elements (2)
10
Critical Elements (2)
11
Learning in Multi-agent Systems
  • Recently, many attempts to extend the techniques of reinforcement learning to multi-agent competitive settings
    • allow for the possibility of other agents working directly against the learner
    • not just a static, stochastic environment
  • Unfortunately most of this work is limited to small and sometimes unnatural subsets from the domain of multi-agent problems
12
Overview of Related Publications
13
Learning in Multi-agent Systems
  • We are currently developing a framework for categorizing problems in multi-agent learning
    • paper to become available early 2003
  • This framework will provide a number of advantages:
    • Clearer understanding of the space of problems and where existing algorithms are most effective
    • Ability to map the dimensions in the UAV domain into this framework
      • make tradeoffs between flexibility and performance more explicit
    • A set of metrics for comparing the performance of proposed algorithms within a problem domain
14
Critical Elements (3)
15
Critical Elements (3)
16
Centralized Cooperation
  • Well-studied OR problem: Vehicle Routing (Multiple Traveling Salesmen)
  • Existing Vehicle Routing variants handle:
    • uncertainty about location of tasks
    • UAVs starting from different locations
    • tasks have different costs
    • tasks must be serviced within time constraints
  • May be useful for us:
    • to solve simple problems
    • to show how we improve over existing technology
    • incorporate these algorithms as building blocks in a MAS
17
Critical Elements (4)
18
Critical Elements (4)
19
Empirical Hardness
  • Use machine learning to build a model of the empirical running time of an optimization algorithm, using features of problem structure
    • case study on combinatorial auction winner determination problem
      • running times vary from roughly 0.01s to 10000s
      • we can predict log running time with mean error of roughly 0.25
        • e.g., actual running time of 102 (100s) will have expected prediction between 101.75 (56s) and 102.25 (178s)
    • we can analyze the learned model to determine which features are most important
  • Current extensions:
    • tuning test distributions for computational hardness
    • building algorithm portfolios using learned models
20
Empirical Hardness
  • These techniques can be useful for UAV problems
    • before sending out UAVs, estimate how long it will take to solve the centralized Vehicle Routing problem
      • if it will take 5 minutes, delay them and solve centrally
      • if it will take 3 days, use a distributed algorithm instead
    • Once a UAV has been assigned a set of tasks, it must decide on an order in which to visit them (TSP)
      • if the TSP can be solved quickly, do so
      • if not, use a greedy strategy, local search, etc.
    • Algorithm portfolios can be used to choose between multiple optimization algorithms
21
Conclusions
  • Dispersion games can be effective for large problems with restricted communication
  • Multiagent reinforcement learning offers a way to tackle intelligent, mobile adversaries
  • Shouldn’t ignore centralized approaches for simple problems
  • Models of empirical hardness can help with many optimization problems