|
1
|
- Yoav Shoham & Kevin Leyton-Brown
- Dept. of Computer Science
- Stanford University
|
|
2
|
|
|
3
|
|
|
4
|
|
|
5
|
- A novel class of games in which agents attempt to disperse across
actions to maximize a utility function
- cooperative: all agents have the same utility function
- agents are assumed to have neither a central coordinator nor any direct
communication with other agents
- related to coordination games (anti-coordination)
- agents can observe action of {no; some; all} other agents
|
|
6
|
- Example:
- agents choose tasks from an available set
- utility maximized when all agents choose different tasks, each task is
covered by one agent
|
|
7
|
- Our past work on dispersion games only analyzed the case of symmetric
agents and tasks, uniform cost for each task
- Non-uniform task costs:
- action space expanded to all subsets of tasks with size between 1 and
max
- utility function =
- Other possible extensions:
- restrictions on which agents can service which tasks
- some tasks must be serviced by multiple agents
- Fundamentally, this approach is best suited to problems in without
initial coordination or communication
|
|
8
|
- Consider the task of tracking intelligent, moving targets
- Reasonable to expect that these targets would be able to adapt to fixed
strategies to reduce UAV effectiveness
- If searches are done by day, move at night
- If searches are on predictable schedule, move between them
- Mobile targets thus give rise to a (partly) competitive setting
- UAVs relate cooperatively with each other
- UAVs relate competitively with the targets
|
|
9
|
|
|
10
|
|
|
11
|
- Recently, many attempts to extend the techniques of reinforcement
learning to multi-agent competitive settings
- allow for the possibility of other agents working directly against the
learner
- not just a static, stochastic environment
- Unfortunately most of this work is limited to small and sometimes
unnatural subsets from the domain of multi-agent problems
|
|
12
|
|
|
13
|
- We are currently developing a framework for categorizing problems in
multi-agent learning
- paper to become available early 2003
- This framework will provide a number of advantages:
- Clearer understanding of the space of problems and where existing
algorithms are most effective
- Ability to map the dimensions in the UAV domain into this framework
- make tradeoffs between flexibility and performance more explicit
- A set of metrics for comparing the performance of proposed algorithms
within a problem domain
|
|
14
|
|
|
15
|
|
|
16
|
- Well-studied OR problem: Vehicle Routing (Multiple Traveling Salesmen)
- Existing Vehicle Routing variants handle:
- uncertainty about location of tasks
- UAVs starting from different locations
- tasks have different costs
- tasks must be serviced within time constraints
- May be useful for us:
- to solve simple problems
- to show how we improve over existing technology
- incorporate these algorithms as building blocks in a MAS
|
|
17
|
|
|
18
|
|
|
19
|
- Use machine learning to build a model of the empirical running time of
an optimization algorithm, using features of problem structure
- case study on combinatorial auction winner determination problem
- running times vary from roughly 0.01s to 10000s
- we can predict log running time with mean error of roughly 0.25
- e.g., actual running time of 102 (100s) will have expected
prediction between 101.75 (56s) and 102.25 (178s)
- we can analyze the learned model to determine which features are most
important
- Current extensions:
- tuning test distributions for computational hardness
- building algorithm portfolios using learned models
|
|
20
|
- These techniques can be useful for UAV problems
- before sending out UAVs, estimate how long it will take to solve the
centralized Vehicle Routing problem
- if it will take 5 minutes, delay them and solve centrally
- if it will take 3 days, use a distributed algorithm instead
- Once a UAV has been assigned a set of tasks, it must decide on an order
in which to visit them (TSP)
- if the TSP can be solved quickly, do so
- if not, use a greedy strategy, local search, etc.
- Algorithm portfolios can be used to choose between multiple
optimization algorithms
|
|
21
|
- Dispersion games can be effective for large problems with restricted
communication
- Multiagent reinforcement learning offers a way to tackle intelligent,
mobile adversaries
- Shouldn’t ignore centralized approaches for simple problems
- Models of empirical hardness can help with many optimization problems
|