Teaching Agent Programming to a Hybrid Student Population

Johan Kummeneje and Harko Verhagen
Department of Computer and Systems Sciences
Stockholm University/Royal Institute of Technology
Forum 100, SE-16400 Kista, Sweden
johank@dsv.su.se, verhagen@dsv.su.se

ABSTRACT

We describe three main phases of multi-agent system education during a ten year period and analyze the practical exercises used during these phases. Combined with a description of the student population, pedagogical principles, and the course aims we draw conclusions as to the suitability of each of the exercises.

1. Introduction

In the early 1980's Newell (1982) defined the concept of Agent within computer science in his presidential speech as the first AAAI president, and agents and multi-agent systems have since been an area of interest for computer science students and researchers alike. There is no real consensus on the exact definition of agent though. The working definition that we will use is that an agent is a software component that is to some degree autonomous and has a purpose. How to fully define an agent is a subject that we will not touch upon in this article. Agents as a concept unite two flows within computer science, one coming from networking and distributed systems, the other one coming from AI (or maybe even AI nouveau, with its focus on situated intelligence). The first category focuses more on the technical aspects of unifying distributed software processes and more related to problems with an engineering flavour: The second category is more interested in how autonomous individual decision-making entities can take system effects into account and is focussed on problems related to those of the social sciences.
Researchers can spend several years learning to master the agent area. This is however often not the case with university students that in the best case can spend 20 weeks writing a thesis on the subject, but more often students can spend approximately one month of full time studies. Within these constraints the tutor must often balance between a theoretical course with little or no programming or a course with little or no theoretical ambition, but a lot of exercises and experiments. We believe that there is a possible hybrid, where half the course is spent on theory and the other half is spent on applying the theories on a domain.
We have since 1997 been part of a course in agent programming (which started out in 1995 as agent oriented programming) at the Department of Computer and Systems Sciences at Stockholm University and the Royal Institute of Technology, where we now use such a hybrid, but on the way passed through several possible problem domains and corresponding course lay-outs. The main question we will address is what domains are suitable to use in the education of university students in a month long course attempting to eat the agent cake and have it too.

2. The students and the course

The students attending the course are mainly third and fourth year students as part of their four year Masters level studies or 4.5 years civic engineering studies. The department is shared between two universities, the Royal Institute of Technology (KTH) and Stockholm University's Faculty of the Social Sciences (SU). The students thus differ highly with respect to their background, interests, and indeed expectations towards the course. The KTH students have a broader and deeper programming education, are better trained in mathematics, and have more interest in formal systems. They view agents as technical solutions to real world problems. The SU students have some programming experience but are trained to try to see different sides to the story and are comfortable with conflicting theories. For them, the agent programming course is most interesting as an example of a multidisciplinary course.
We have over the past ten years been working on solving this equation of multidisciplinary, theoretical discussions within the agent research field, and practical hands-on programming of an agent system. Below we will get into the details of our quest for a solution, but first some more details about the course.
The course is given annually as a six ECTS credit course during the spring term and each year approximately 60 students attend. In the first couple of years, the focus was on basic agent topics, such as decision making and rationality, and agent frameworks. After 1997, we changed focus as agent research matured and following the change in course responsible, with more time spend on coordination and cooperation, and current agent research topics such as agent-based simulation and the use of agents in computer games. Half the credits are spend on theories and are examined via a written exam. The other half consists of one or more practical assignments. These were examined in different ways in the various versions of the course.

3. The history of the practical part of the course

The course has over the ten years of its history developed in three different phases. The fist phase lasted between 1995 and 2000. In this phase, the students had to work on two different assignments, to make sure that a larger area of MASresearch was covered. Each assignment amounted to approximately 40 hours of work. The assignments were developed or adapted by the agent researchers in our group. The second phase had a focus on a bigger assignment of about 80 hours of work, using the Trading Agent Competition as a domain. Focus was on having a large, current research topic at the heart of the assignment. This phase lasted only one the year. The third phase from 2001 until today also has a focus one larger assignment. The domain is deliberately not part of agent research in general so that the students can put their 80 hours of work on more fundamental agent research topics and also produce a higher quality report.

3.1 The early days

The first version of the course had students finish two practical exercises. Four different exercises were used for these:

The taxilab consisted of a single dispatcher (the agent) that receives a request for a cab, and it tries to assign the request to a cab in the simulated world. As we mentioned earlier, the focus was on decision algorithms. EaterSoar (2005) was a tutorial developed for teaching Soar that was adapted to our purposes. Students had to program a Soar agent in a Pacman environment, where the agent is the player. The emergence of the RoboCup initiative was embraced in our course. Students, in the first attempt, had to write a short essay, but both the students and the teachers agreed that it was not rewarding to just write an essay on such complex domain. In the next attempt, the students were given a fully implemented team (Kummeneje 2001) and were to in smaller groups make some improvements to the team's behaviour. The intelligent homes assignment was strictly speaking not a practical assignment, at least not in the sense that it involved any programming. In this case, students had to write a report on the use of agents and sensors in intelligent homes research at the time. The programming assignments required no written report; working code was the examination for every pair of students.

3.2 The course, it is a changing

The lessons learned from the previous period were:

  1. do not have too many different programming languages
  2. if possible, do not dictate the choice of programming language
  3. competitions are fun and inspiring to students

Thus, we wanted to see if we could limit the practical part of the course to one assignment. Of course this meant that the assignment had to change from the ones used in the previous period, these were either too small or too hard to use. We also wanted to preserve the competition part of some of the practical assignments since we found that students work harder when they have fun. Since one of the teachers was partly involved in the then new Trading Agent Competition (TAC) initiative (Wellman and Wurman, 1999; TAC, 2005), we decided to give TAC a try. Supervising the lab were two masters students working on their thesis with TAC.
Unfortunately, this decision turned out to be a catastrophe. Preparing for the lab, we build upon the assumption that the previous year's agents (that were made public after the event) could be used as an inspiration for the students and that they could build upon these. But the organizers behind TAC decided to change to a new TAC server version. This change was made in the middle of our course. Since the server could not run locally, we were stuck. The masters students tried to help our students with the change and coding consequences, but were unable to deal with the problems and amount of work within the time frame of the course. Thus we turned to a written report as an unsatisfactory assignment.

3.3 Valhalla

From TAC, we learned that being dependent on forces outside of our control for the practical assignment is not a good idea. We considered the simplified RoboCup idea developed in Biter (Vidal and Buhler, 2002) but had problems getting the software to work, and also felt a bit uncomfortable with getting back to the RoboCup domain with the attached problems. So we decided to develop something ourselves, that we could administer ourselves, and that would run locally, enabling the students to test-run locally as well. Looking for a more or less simple game involving at least 2 players, that most students probably are familiar with, we turned to the game of chess inspired by the chess playing ants of Alexis Drogoul (1995). Chess of course has a long history in AI, but our interest was not to have students developing a good chess program, but instead to have them put together a team as a multi-agent system that together becomes the player on one side of the chessboard. To limit the importance of playing strength (which would give good chess playing students an inappropriate advantage), we changed some of the rules. Complementing the code, each group of about 4 students has to write a scientific report, explaining the ideas they wanted to test with the multi-agent system, how they implement these ideas, and what the results were including recommendations for improvement. All reports and agents from the previous courses are made available at the beginning of the course, so that students can build upon previous ideas and code.

4. The domains in detail

Here we describe all the programming assignments used during the ten years of the course's existence in more detail. Apart from the assignment, we also specify the learning goals and outcomes. We analyze the programming assignment using the classification scheme of Russell and Norvig (2003) for agent systems so we can compare the assignments.

4.1 Taxilab

Picture the agent as the central point (dispatcher) in a cab business. Customers call in and order a cab. To the dispatcher's disposal there are a few cabs, which can be either occupied or available over time. Available to the students were the cabs and some stubs for the dispatcher. The task for the students was to implement the decision algorithm of the dispatcher, which could be reactive or/and deliberate (planning). As the requests are not known beforehand in order to simulate a real world, the reactive approach is simple to implement, and fairly easy to extend with a few special cases. The planning approach proved to be less used as it would require the students to build additional modules to be able to for example simulate the future few steps by extrapolation.
We expected the students to examine different algorithms or methods to implement the dispatcher. The domain (and the lab) can be characterized according to (Rusell and Norvig, 2003) as accessible, deterministic, non-episodic, dynamic, and continuous. Accessible, as the state of all cabs are at all times known. Deterministic, as when the dispatcher assigns a task to an available cab, it always will perform it. Non-episodic, i.e. the domain typically does not repeat itself at the larger scale. Of course, at the micro level everything can be deemed as episodic. The system has several cabs that continuously will be performing tasks, or being available, as well as it has no predefined discrete steps of time, we will characterize it as dynamic.
The students experienced the lab as quite pointless as too much focus was put on the programming language (Tcl/Tk), which was new to the students, as well as the algorithm could be very simple and you still finished the assignment. We also noted a lack of enthusiasm among the students as it is neither especially interesting nor real-world like. Most important is also that the domain had no clear metric by which success could be measured, that is, the students could not assess whether their implementation was doing well or even get indications along the lines on how to improve their design.

4.2 RoboCup

Alan Mackworth (1993) proposed that the domain of robotic soccer could be the follow-up of chess as the "Grand Challenge of AI", since Deep Blue (admittedly not really AI) played better than the human chess master, became a reality for agent researchers to move on to as Kitano et al. (1995) launched RoboCup. RoboCup consists of several robotic domains as well as one simulated league. Since our focus is on agent system and our students have no previous knowledge on robotics, we are naturally limited to the simulated league.
The simulated league consists of one server (preferably run on Linux) to which 22 agents (or process if you like) connect over standard IP-sockets. During the game the server gives updates to the agents that are allowed to act every 100 milliseconds.
The choice of soccer as the domain is natural as it is well known in all parts of the world, and it is a dynamic environment with several players doing things simultaneously which makes it non-accessible as the environment is limited by physical parameters such as a cone of view and not being able to see forever.
As the domain is modelled to simulate the real world, each action does not deterministically lead to a given consequence, i.e. if you hit the ball in a certain angle it is not always the case that it will end up exactly where you intended it to, depending on the randomness variable in the simulation server.
It can be argued that the environment consists of episodes at not only the micro level, but also on a higher level such as situations as "the passing situation", "the scoring situation". We consider the domain to be semi-episodic as the situations are not exactly alike every time. A real world domain is by nature continuous, however the simulated league has introduced a programmatically limitation of using simulation cycles, which makes the simulated league discrete, while the others are continuous.
As the assignment for the students on the course, they were given the team UBU (Kummeneje 2001), which they were to extend and/or improve in order for the team to play better, or explore a novel technology by implementing it into the team.
We expected the students to implement interesting research grade algorithms that would give a contribution both to the knowledge of the students, as well as contribute to the codebase of the team that was supposed to be competing in the World Cup Tournaments held annually.
From the experiences with this assignment we can conclude that it is not possible for the students to give a significant contribution in such a short time given that the codebase exceeds 10,000 lines of code. As Heintz et al (2000) noted that if the time and level of the students are sufficiently high, it can be rewarding for both teachers and students to use RoboCup. Our experience in hindsight is that the effort required from the students and the teachers are not worth the effort for three ECTS credits.

4.3 Eater SOAR

The EaterSoar lab was originally developed by the development team behind Soar for an AAAI tutorial in 1997 and is still in use in Soar tutorials. We adapted this tutorial for our purposes. Soar is a general cognitive architecture for developing systems that exhibit intelligent behaviour and can be seen as a computational implementation of the ideas of (mainly) Allen Newell (1990). Soar has been developed since 1983.
In the EaterSoar tutorial, the students use and extend a simple agent written in Soar. The agent plays an interactive game called "Eaters" (hence the name) which is essentially the well-known Pacman game. Instead of eating "powerpills" (energy packages placed in the maze) while avoiding the evil ghosts, in this version there are various agents competing for the powerpills, which come in 2 different values. Another addition is the possibility to fight another agent, which results in a random placement in the maze, and an equal division of the total number of points both agents had before the fight. Furthermore, agents can move and jump (a larger displacement which costs more energy). The agent with the highest amount of points wins the game.
The students were provided with an agent that randomly chooses between jump and move to a spot it can reach, regardless of the spots contents (other than avoiding the walls of the maze). This agent had to be extended with decision making rules build upon strategy rather than chance. This meant that an evaluation function had to be developed as well. In order to predict fights, agents also needed a model of their opponent's choice. In terms of Russell & Norvig's schema, the environment is accessible, non-deterministic, non-episodic, dynamic, and discrete. Thus it is quite a complicated situation. Luckily, the agents are autonomous with respect to each other, no need for coordination in the search neither for a higher, non-individual goal nor for communication. Another issue is the syntax and semantics of the Soar language. Soar is a rule based language but it is not easily learned. The Soar project has seen several tools aimed at flattening the learning curve, without much success though. Given that students only had 40 hours to master Soar and solve the assignment, the lab was doomed to fail. Its positive points were the capabilities to have agents play each other, and the possibility to test play against a non-intelligent agent. Most students did master to program in Soar but did not succeed in developing a satisfactory decision making module.

4.4 Trading Agent Competition

The Trading Agent Competition (TAC) is an annual contest which started in 2000, where researchers and students can try their trading agents in the noble art of booking hotels, travel tickets etc, at the best possible rate. In the terms of the students' task, they should in small teams implement such agents and participate in the contest. As in the previous domains our expectation was for the students to be able to implement different algorithms for agents and possibly produce agents that could participate in the annual contest. The domain is not entirely about getting the best possible rate, but getting it in time as well, which in many cases would boil down to trying to close the deals just before the timeframe closes for the auction.
We characterize the domain as being non-accessible as not every other player is fully known, nor all aspects of the particular auction. This in turn leads to the characterization of the domain as non-deterministic and dynamic as well as continuous. The episodic part is quite evident as the same situation (the auction) keeps repeating itself over the tournament.
Unfortunately, in mid-course the TAC-group decided to change several server parameters, which rendered the students efforts useless, and we resorted to examine the lab based on a written report, to both teachers and students disappointment.
The most important thing we learnt from this incident was that never use something that cannot be run completely in-house and you cannot control fully.

4.5 ChessMAS

Our latest shot at the hybrid course is a domain we named ChessMAS. In essence, it is a chess game which two players participate in. Each of the players consists of a separate MAS. As the MAS can be constituted in a number of ways, it opens up for strategy agents, as well as an agent for each piece on the board (Drogoul, 1995); however for our task we defined the MAS to be consisting of at least 3 agents that communicates over a blackboard. The teams communicate with the server by a CommunicatorAgent. To eliminate the skills of the best chess players in the course, we have removed some of the more advanced draws, such as au passant, and also reversed the positions of the king and queen. This does not to any particular extent alter the basic domain. Each draw must be made within 30 seconds or the draw is transferred to the opposing team, and the game is not more than 15 minutes. We have chosen these time limits to force the students to take the time aspects into consideration while not having to focus fully on it. To get the students started they receive the server, the blackboard, the CommunicatorAgent as well a code stub for the communication with the blackboard.
Simply put for the students, their task is to create a MAS that will play a game of chess against another MAS created by another group of students. As part of the examination we play a tournament, and each group are required to write a report on the algorithms they have used, with an emphasis on the novel parts of their system.
Chess is an accessible and deterministic domain, as it has only 64 positions that the 32 pieces can be in, and each piece can only do a few different actions (totalling to a maximum of 130 possible moves at any time), which always will be performed. The game is static and discrete as nothing changes when a team deliberates over the next draw, and as in real chess the game is played in turns each ended by moving a piece.
Our experience is that all the students are enjoying the task, as each year there are 10-15 (leaving only a handful of students not completing the task) teams finishing the course. We encourage the students to borrow code and ideas (with due credit in the reports) from previous years and improve upon these. This enables the students to focus on the social parts of agents, as they not necessarily have to implement every part of the agents by themselves.
However, as Kummeneje (2001) and others have noted, there is a risk in each competitive domain that the teams forget about doing research and focus upon trimming the teams to only play better chess or soccer. We believe that the risk is worth taking in this case as we have had only one or two case (of approximately 35 teams) that had over-focused on the chess playing instead of agent technology.
During the spring semester of 2005, we conducted a small survey to gather the students' points of views on the good and not so good parts of using the domain as an assignment. The most occurring and positive aspect among the students were that the tournament added incentive for the students to finish their work and also try to do as good as possible. Second to that, they noted that the domain was fun and the problems interesting, as well as there were a good support in the availability of previous years teams and reports. But on the other hand, they also said that the task was to time-consuming to get into (software etc), and it left not enough time to actually get into the high-level parts of multi agent coordination. The rest of the comments given were mainly to administrative issues, for example that the assignment could have been introduced earlier on course.
The criticisms given by the students are alike those given when we used the RoboCup domain as assignment, but the strength of the negative criticisms has been greatly reduced when comparing the criticisms of the RoboCup assignment and the ChessMAS assignment.

5. Summary and conclusions

We have presented our view on a successful hybrid course on MAS, and have in the process discussed a number of domains and their characteristics (see table 1).
Domain Accessible Deterministic Episodic Static Discrete
Taxilab Yes Yes No No No
RoboCup No No No No No
EaterSOAR Yes No No No Yes
TAC No No Yes No No
ChessMAS Yes Yes No Yes Yes

Table 1. Analysis of the environment's characteristics

From the table we can conclude that ChessMAS is the simplest MASdomain among the ones we have tried, while RoboCup is the hardest. As mentioned before, this gives the students the best opportunity to focus on fundamental MAS problems such as coordination and to a lesser extent communication. This is fully in line with the change of course focus that occurred after the first phase of the course's history. It also enables the agents to program a MAS that functions within the time available for the assignment. This makes sure that the competition we organize at the end of the course (with semi-finals and finals live during the last lecture) is of interest to all students. The downside of the competition is that some students tend to put too much strain on the winning strategy rather than on the agent research topics, but this is levelled out by the fact that the students that put more effort on the coordination problem in MAS have an easier task when writing their report. Thus, we feel we have found a good balance between agent programming and agent theory, serving both our student populations equally well.
To finish off, we want to mention are main lessons learned as points of advice to colleague teachers:

  1. If possible, develop your own software for the practical part of a course, so you do not become dependent upon external factors.
  2. Do not force the students to learn yet another computer programming language unless this is the focus of your course.
  3. Choose a domain as simple as possible that will serve your purposes while still having clear metrics.
  4. Competitions are fun!
  5. Make sure your theoretical and practical parts are in synch, offering synergy effects.
  6. Complementing the programming part of an assignment with a scientific report broadens the learning effect for all types of students.
  7. Giving students access to the code and report from previous courses inspires them, and enables them to avoid inventing the same thing over and over again.
  8. Different student populations make interesting discussions during lectures and between students.

6. References

Drogoul, A. When ants play chess. In C. Castelfranchi and J. P. Muller, editors, From Reaction to Cognition, 13-27. Springer-Verlag, 1995.
SoarTurotial, http://www.eecs.umich.edu/~soar/tutorial.html or http://sourceforge.net/projects/soar/. Last checked May 19th, 2005.
Heintz, F., Kummeneje, J., and Scerri, P. Simulated RoboCup in University Undergraduate Education. In Proceedings of the fourth international workshop on RoboCup at the 4th RoboCup World Championships, 2000.
Kitano, H., Asada, M., Kuniyoshi, Y., Noda, I., and Osawa, E. RoboCup: The Robot World Cup Initiative. In Proceedings of IJCAI-95 Workshop on Entertainment and AI/Alife, 19-24, 1995.
Kummeneje, J. RoboCup as a Means to Research, Education, and Dissemination, Ph. Lic. thesis, Department of Computer and Systems Sciences, Stockholm University and the Royal Institute of Technology, 2001.
Mackworth, A. On Seeing Robots. In: Computer Vision: System, Theory, and Applications, pages 1-13, World Scientific Press, Singapore, 1993
Newell, A. The Knowledge Level. Artificial Intelligence, 18, 87-127, 1982.
Newell, A. Unified Theories of Cognition, Harvard University Press, 1990.
Russell S., and Norvig, P. Artificial Intelligence: A Modern Approach. Prentice Hall, 2003.
TAC, http://www.sics.se/tac/page.php?id=1. Last checked May 21st, 2005.
Vidal, J. M., and Buhler, P. Teaching Multiagent Systems using RoboCup and Biter. Interactive Multimedia Electric Journal of Computer-Enhanced Learning, 4, 2 (October, 2002.). Online at: http://imej.wfu.edu/articles/2002/2/05/index.asp. Last checked May 21st, 2005.
Wellman, M. P., and Wurman, P. R. A Trading Agent Competition for the Research Community. IJCAI-99 Workshop on Agent-Mediated Electronic Commerce. Stockholm, August 1999.