Johan Kummeneje and Harko Verhagen
Department of Computer and
Systems Sciences
Stockholm University/Royal Institute of Technology
Forum
100, SE-16400 Kista, Sweden
johank@dsv.su.se, verhagen@dsv.su.se
We describe three main phases of multi-agent system education during a ten year period and analyze the practical exercises used during these phases. Combined with a description of the student population, pedagogical principles, and the course aims we draw conclusions as to the suitability of each of the exercises.
In the early 1980's Newell (1982) defined the concept of Agent within
computer science in his presidential speech as the first AAAI president, and
agents and multi-agent systems have since been an area of interest for computer
science students and researchers alike. There is no real consensus on the exact
definition of agent though. The working definition that we will use is that an
agent is a software component that is to some degree autonomous and has a
purpose. How to fully define an agent is a subject that we will not touch upon
in this article. Agents as a concept unite two flows within computer science,
one coming from networking and distributed systems, the other one coming from AI
(or maybe even AI nouveau, with its focus on situated intelligence). The first
category focuses more on the technical aspects of unifying distributed software
processes and more related to problems with an engineering flavour: The second
category is more interested in how autonomous individual decision-making
entities can take system effects into account and is focussed on problems
related to those of the social sciences.
Researchers can spend several years
learning to master the agent area. This is however often not the case with
university students that in the best case can spend 20 weeks writing a thesis on
the subject, but more often students can spend approximately one month of full
time studies. Within these constraints the tutor must often balance between a
theoretical course with little or no programming or a course with little or no
theoretical ambition, but a lot of exercises and experiments. We believe that
there is a possible hybrid, where half the course is spent on theory and the
other half is spent on applying the theories on a domain.
We have since 1997
been part of a course in agent programming (which started out in 1995 as agent
oriented programming) at the Department of Computer and Systems Sciences at
Stockholm University and the Royal Institute of Technology, where we now use
such a hybrid, but on the way passed through several possible problem domains
and corresponding course lay-outs. The main question we will address is what
domains are suitable to use in the education of university students in a month
long course attempting to eat the agent cake and have it too.
The students attending the course are mainly third and fourth year students
as part of their four year Masters level studies or 4.5 years civic engineering
studies. The department is shared between two universities, the Royal Institute
of Technology (KTH) and Stockholm University's Faculty of the Social Sciences
(SU). The students thus differ highly with respect to their background,
interests, and indeed expectations towards the course. The KTH students have a
broader and deeper programming education, are better trained in mathematics, and
have more interest in formal systems. They view agents as technical solutions to
real world problems. The SU students have some programming experience but are
trained to try to see different sides to the story and are comfortable with
conflicting theories. For them, the agent programming course is most interesting
as an example of a multidisciplinary course.
We have over the past ten years
been working on solving this equation of multidisciplinary, theoretical
discussions within the agent research field, and practical hands-on programming
of an agent system. Below we will get into the details of our quest for a
solution, but first some more details about the course.
The course is given
annually as a six ECTS credit course during the spring term and each year
approximately 60 students attend. In the first couple of years, the focus was on
basic agent topics, such as decision making and rationality, and agent
frameworks. After 1997, we changed focus as agent research matured and following
the change in course responsible, with more time spend on coordination and
cooperation, and current agent research topics such as agent-based simulation
and the use of agents in computer games. Half the credits are spend on theories
and are examined via a written exam. The other half consists of one or more
practical assignments. These were examined in different ways in the various
versions of the course.
The course has over the ten years of its history developed in three different phases. The fist phase lasted between 1995 and 2000. In this phase, the students had to work on two different assignments, to make sure that a larger area of MASresearch was covered. Each assignment amounted to approximately 40 hours of work. The assignments were developed or adapted by the agent researchers in our group. The second phase had a focus on a bigger assignment of about 80 hours of work, using the Trading Agent Competition as a domain. Focus was on having a large, current research topic at the heart of the assignment. This phase lasted only one the year. The third phase from 2001 until today also has a focus one larger assignment. The domain is deliberately not part of agent research in general so that the students can put their 80 hours of work on more fundamental agent research topics and also produce a higher quality report.
The first version of the course had students finish two practical exercises. Four different exercises were used for these:
The taxilab consisted of a single dispatcher (the agent) that receives a request for a cab, and it tries to assign the request to a cab in the simulated world. As we mentioned earlier, the focus was on decision algorithms. EaterSoar (2005) was a tutorial developed for teaching Soar that was adapted to our purposes. Students had to program a Soar agent in a Pacman environment, where the agent is the player. The emergence of the RoboCup initiative was embraced in our course. Students, in the first attempt, had to write a short essay, but both the students and the teachers agreed that it was not rewarding to just write an essay on such complex domain. In the next attempt, the students were given a fully implemented team (Kummeneje 2001) and were to in smaller groups make some improvements to the team's behaviour. The intelligent homes assignment was strictly speaking not a practical assignment, at least not in the sense that it involved any programming. In this case, students had to write a report on the use of agents and sensors in intelligent homes research at the time. The programming assignments required no written report; working code was the examination for every pair of students.
The lessons learned from the previous period were:
Thus, we wanted to see if we could limit the practical part of the course to
one assignment. Of course this meant that the assignment had to change from the
ones used in the previous period, these were either too small or too hard to
use. We also wanted to preserve the competition part of some of the practical
assignments since we found that students work harder when they have fun. Since
one of the teachers was partly involved in the then new Trading Agent
Competition (TAC) initiative (Wellman and Wurman, 1999; TAC, 2005), we decided
to give TAC a try. Supervising the lab were two masters students working on
their thesis with TAC.
Unfortunately, this decision turned out to be a
catastrophe. Preparing for the lab, we build upon the assumption that the
previous year's agents (that were made public after the event) could be used as
an inspiration for the students and that they could build upon these. But the
organizers behind TAC decided to change to a new TAC server version. This change
was made in the middle of our course. Since the server could not run locally, we
were stuck. The masters students tried to help our students with the change and
coding consequences, but were unable to deal with the problems and amount of
work within the time frame of the course. Thus we turned to a written report as
an unsatisfactory assignment.
From TAC, we learned that being dependent on forces outside of our control for the practical assignment is not a good idea. We considered the simplified RoboCup idea developed in Biter (Vidal and Buhler, 2002) but had problems getting the software to work, and also felt a bit uncomfortable with getting back to the RoboCup domain with the attached problems. So we decided to develop something ourselves, that we could administer ourselves, and that would run locally, enabling the students to test-run locally as well. Looking for a more or less simple game involving at least 2 players, that most students probably are familiar with, we turned to the game of chess inspired by the chess playing ants of Alexis Drogoul (1995). Chess of course has a long history in AI, but our interest was not to have students developing a good chess program, but instead to have them put together a team as a multi-agent system that together becomes the player on one side of the chessboard. To limit the importance of playing strength (which would give good chess playing students an inappropriate advantage), we changed some of the rules. Complementing the code, each group of about 4 students has to write a scientific report, explaining the ideas they wanted to test with the multi-agent system, how they implement these ideas, and what the results were including recommendations for improvement. All reports and agents from the previous courses are made available at the beginning of the course, so that students can build upon previous ideas and code.
Here we describe all the programming assignments used during the ten years of the course's existence in more detail. Apart from the assignment, we also specify the learning goals and outcomes. We analyze the programming assignment using the classification scheme of Russell and Norvig (2003) for agent systems so we can compare the assignments.
Picture the agent as the central point (dispatcher) in a cab business.
Customers call in and order a cab. To the dispatcher's disposal there are a few
cabs, which can be either occupied or available over time. Available to the
students were the cabs and some stubs for the dispatcher. The task for the
students was to implement the decision algorithm of the dispatcher, which could
be reactive or/and deliberate (planning). As the requests are not known
beforehand in order to simulate a real world, the reactive approach is simple to
implement, and fairly easy to extend with a few special cases. The planning
approach proved to be less used as it would require the students to build
additional modules to be able to for example simulate the future few steps by
extrapolation.
We expected the students to examine different algorithms or
methods to implement the dispatcher. The domain (and the lab) can be
characterized according to (Rusell and Norvig, 2003) as accessible,
deterministic, non-episodic, dynamic, and continuous. Accessible, as the state
of all cabs are at all times known. Deterministic, as when the dispatcher
assigns a task to an available cab, it always will perform it. Non-episodic,
i.e. the domain typically does not repeat itself at the larger scale. Of course,
at the micro level everything can be deemed as episodic. The system has several
cabs that continuously will be performing tasks, or being available, as well as
it has no predefined discrete steps of time, we will characterize it as
dynamic.
The students experienced the lab as quite pointless as too much
focus was put on the programming language (Tcl/Tk), which was new to the
students, as well as the algorithm could be very simple and you still finished
the assignment. We also noted a lack of enthusiasm among the students as it is
neither especially interesting nor real-world like. Most important is also that
the domain had no clear metric by which success could be measured, that is, the
students could not assess whether their implementation was doing well or even
get indications along the lines on how to improve their design.
Alan Mackworth (1993) proposed that the domain of robotic soccer could be the
follow-up of chess as the "Grand Challenge of AI", since Deep Blue (admittedly
not really AI) played better than the human chess master, became a reality for
agent researchers to move on to as Kitano et al. (1995) launched RoboCup.
RoboCup consists of several robotic domains as well as one simulated league.
Since our focus is on agent system and our students have no previous knowledge
on robotics, we are naturally limited to the simulated league.
The simulated
league consists of one server (preferably run on Linux) to which 22 agents (or
process if you like) connect over standard IP-sockets. During the game the
server gives updates to the agents that are allowed to act every 100
milliseconds.
The choice of soccer as the domain is natural as it is well
known in all parts of the world, and it is a dynamic environment with several
players doing things simultaneously which makes it non-accessible as the
environment is limited by physical parameters such as a cone of view and not
being able to see forever.
As the domain is modelled to simulate the real
world, each action does not deterministically lead to a given consequence, i.e.
if you hit the ball in a certain angle it is not always the case that it will
end up exactly where you intended it to, depending on the randomness variable in
the simulation server.
It can be argued that the environment consists of
episodes at not only the micro level, but also on a higher level such as
situations as "the passing situation", "the scoring situation". We consider the
domain to be semi-episodic as the situations are not exactly alike every time. A
real world domain is by nature continuous, however the simulated league has
introduced a programmatically limitation of using simulation cycles, which makes
the simulated league discrete, while the others are continuous.
As the
assignment for the students on the course, they were given the team UBU
(Kummeneje 2001), which they were to extend and/or improve in order for the team
to play better, or explore a novel technology by implementing it into the team.
We expected the students to implement interesting research grade algorithms
that would give a contribution both to the knowledge of the students, as well as
contribute to the codebase of the team that was supposed to be competing in the
World Cup Tournaments held annually.
From the experiences with this
assignment we can conclude that it is not possible for the students to give a
significant contribution in such a short time given that the codebase exceeds
10,000 lines of code. As Heintz et al (2000) noted that if the time and level of
the students are sufficiently high, it can be rewarding for both teachers and
students to use RoboCup. Our experience in hindsight is that the effort required
from the students and the teachers are not worth the effort for three ECTS
credits.
The EaterSoar lab was originally developed by the development team behind
Soar for an AAAI tutorial in 1997 and is still in use in Soar tutorials. We
adapted this tutorial for our purposes. Soar is a general cognitive architecture
for developing systems that exhibit intelligent behaviour and can be seen as a
computational implementation of the ideas of (mainly) Allen Newell (1990). Soar
has been developed since 1983.
In the EaterSoar tutorial, the students use
and extend a simple agent written in Soar. The agent plays an interactive game
called "Eaters" (hence the name) which is essentially the well-known Pacman
game. Instead of eating "powerpills" (energy packages placed in the maze) while
avoiding the evil ghosts, in this version there are various agents competing for
the powerpills, which come in 2 different values. Another addition is the
possibility to fight another agent, which results in a random placement in the
maze, and an equal division of the total number of points both agents had before
the fight. Furthermore, agents can move and jump (a larger displacement which
costs more energy). The agent with the highest amount of points wins the game.
The students were provided with an agent that randomly chooses between jump
and move to a spot it can reach, regardless of the spots contents (other than
avoiding the walls of the maze). This agent had to be extended with decision
making rules build upon strategy rather than chance. This meant that an
evaluation function had to be developed as well. In order to predict fights,
agents also needed a model of their opponent's choice. In terms of Russell &
Norvig's schema, the environment is accessible, non-deterministic, non-episodic,
dynamic, and discrete. Thus it is quite a complicated situation. Luckily, the
agents are autonomous with respect to each other, no need for coordination in
the search neither for a higher, non-individual goal nor for communication.
Another issue is the syntax and semantics of the Soar language. Soar is a rule
based language but it is not easily learned. The Soar project has seen several
tools aimed at flattening the learning curve, without much success though. Given
that students only had 40 hours to master Soar and solve the assignment, the lab
was doomed to fail. Its positive points were the capabilities to have agents
play each other, and the possibility to test play against a non-intelligent
agent. Most students did master to program in Soar but did not succeed in
developing a satisfactory decision making module.
The Trading Agent Competition (TAC) is an annual contest which started in
2000, where researchers and students can try their trading agents in the noble
art of booking hotels, travel tickets etc, at the best possible rate. In the
terms of the students' task, they should in small teams implement such agents
and participate in the contest. As in the previous domains our expectation was
for the students to be able to implement different algorithms for agents and
possibly produce agents that could participate in the annual contest. The domain
is not entirely about getting the best possible rate, but getting it in time as
well, which in many cases would boil down to trying to close the deals just
before the timeframe closes for the auction.
We characterize the domain as
being non-accessible as not every other player is fully known, nor all aspects
of the particular auction. This in turn leads to the characterization of the
domain as non-deterministic and dynamic as well as continuous. The episodic part
is quite evident as the same situation (the auction) keeps repeating itself over
the tournament.
Unfortunately, in mid-course the TAC-group decided to change
several server parameters, which rendered the students efforts useless, and we
resorted to examine the lab based on a written report, to both teachers and
students disappointment.
The most important thing we learnt from this
incident was that never use something that cannot be run completely in-house and
you cannot control fully.
Our latest shot at the hybrid course is a domain we named ChessMAS. In
essence, it is a chess game which two players participate in. Each of the
players consists of a separate MAS. As the MAS can be constituted in a number of
ways, it opens up for strategy agents, as well as an agent for each piece on the
board (Drogoul, 1995); however for our task we defined the MAS to be consisting
of at least 3 agents that communicates over a blackboard. The teams communicate
with the server by a CommunicatorAgent. To eliminate the skills of the best
chess players in the course, we have removed some of the more advanced draws,
such as au passant, and also reversed the positions of the king and queen. This
does not to any particular extent alter the basic domain. Each draw must be made
within 30 seconds or the draw is transferred to the opposing team, and the game
is not more than 15 minutes. We have chosen these time limits to force the
students to take the time aspects into consideration while not having to focus
fully on it. To get the students started they receive the server, the
blackboard, the CommunicatorAgent as well a code stub for the communication with
the blackboard.
Simply put for the students, their task is to create a MAS
that will play a game of chess against another MAS created by another group of
students. As part of the examination we play a tournament, and each group are
required to write a report on the algorithms they have used, with an emphasis on
the novel parts of their system.
Chess is an accessible and deterministic
domain, as it has only 64 positions that the 32 pieces can be in, and each piece
can only do a few different actions (totalling to a maximum of 130 possible
moves at any time), which always will be performed. The game is static and
discrete as nothing changes when a team deliberates over the next draw, and as
in real chess the game is played in turns each ended by moving a piece.
Our
experience is that all the students are enjoying the task, as each year there
are 10-15 (leaving only a handful of students not completing the task) teams
finishing the course. We encourage the students to borrow code and ideas (with
due credit in the reports) from previous years and improve upon these. This
enables the students to focus on the social parts of agents, as they not
necessarily have to implement every part of the agents by themselves.
However, as Kummeneje (2001) and others have noted, there is a risk in each
competitive domain that the teams forget about doing research and focus upon
trimming the teams to only play better chess or soccer. We believe that the risk
is worth taking in this case as we have had only one or two case (of
approximately 35 teams) that had over-focused on the chess playing instead of
agent technology.
During the spring semester of 2005, we conducted a small
survey to gather the students' points of views on the good and not so good parts
of using the domain as an assignment. The most occurring and positive aspect
among the students were that the tournament added incentive for the students to
finish their work and also try to do as good as possible. Second to that, they
noted that the domain was fun and the problems interesting, as well as there
were a good support in the availability of previous years teams and reports. But
on the other hand, they also said that the task was to time-consuming to get
into (software etc), and it left not enough time to actually get into the
high-level parts of multi agent coordination. The rest of the comments given
were mainly to administrative issues, for example that the assignment could have
been introduced earlier on course.
The criticisms given by the students are
alike those given when we used the RoboCup domain as assignment, but the
strength of the negative criticisms has been greatly reduced when comparing the
criticisms of the RoboCup assignment and the ChessMAS assignment.
We have presented our view on a successful hybrid course on MAS, and have in
the process discussed a number of domains and their characteristics (see table
1).
| Domain | Accessible | Deterministic | Episodic | Static | Discrete |
| Taxilab | Yes | Yes | No | No | No |
| RoboCup | No | No | No | No | No |
| EaterSOAR | Yes | No | No | No | Yes |
| TAC | No | No | Yes | No | No |
| ChessMAS | Yes | Yes | No | Yes | Yes |
Drogoul, A. When ants play chess. In C. Castelfranchi and J. P. Muller,
editors, From Reaction to Cognition, 13-27. Springer-Verlag,
1995.
SoarTurotial, http://www.eecs.umich.edu/~soar/tutorial.html or
http://sourceforge.net/projects/soar/. Last checked May 19th, 2005.
Heintz,
F., Kummeneje, J., and Scerri, P. Simulated RoboCup in University Undergraduate
Education. In Proceedings of the fourth international workshop on RoboCup at the
4th RoboCup World Championships, 2000.
Kitano, H., Asada, M., Kuniyoshi, Y.,
Noda, I., and Osawa, E. RoboCup: The Robot World Cup Initiative. In Proceedings
of IJCAI-95 Workshop on Entertainment and AI/Alife, 19-24, 1995.
Kummeneje,
J. RoboCup as a Means to Research, Education, and Dissemination, Ph. Lic.
thesis, Department of Computer and Systems Sciences, Stockholm University and
the Royal Institute of Technology, 2001.
Mackworth, A. On Seeing Robots. In:
Computer Vision: System, Theory, and Applications, pages 1-13, World Scientific
Press, Singapore, 1993
Newell, A. The Knowledge Level. Artificial
Intelligence, 18, 87-127, 1982.
Newell, A. Unified Theories of Cognition,
Harvard University Press, 1990.
Russell S., and Norvig, P. Artificial
Intelligence: A Modern Approach. Prentice Hall, 2003.
TAC,
http://www.sics.se/tac/page.php?id=1. Last checked May 21st, 2005.
Vidal, J.
M., and Buhler, P. Teaching Multiagent Systems using RoboCup and Biter.
Interactive Multimedia Electric Journal of Computer-Enhanced Learning, 4, 2
(October, 2002.). Online at: http://imej.wfu.edu/articles/2002/2/05/index.asp.
Last checked May 21st, 2005.
Wellman, M. P., and Wurman, P. R. A Trading
Agent Competition for the Research Community. IJCAI-99 Workshop on
Agent-Mediated Electronic Commerce. Stockholm, August 1999.