AnalysisTools

There are several kinds of analysis tools that operate on Power TAC game logs. One way to do analyses would be to develop a database schema for the data contained in a game log, populate the database, and then write queries to extract the data. Nobody has yet worked out such a schema, or if they have, they have not shared it. So here we describe how to extract data from an individual game using a logtool analyzer, and how to iterate a logtool analyzer across a tournament.

Access to game logs

Each tournament round produces a csv file with columns

   gameId, gameName, status, gameSize, gameLength, lastTick, weatherLocation, weatherDate, DOI, logUrl

followed by a column for each broker, giving the broker's final score in each game. The logUrl is a download link to the compressed log files. Each compressed log bundle contains

  • a boot-log directory containing the state and trace logs from the boot session,
  • a log directory containing the state and trace logs from the sim session, and
  • the xml boot record.

Links to these tournament csv files will be maintained on the powertac.org website for "official" tournaments.
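
As an illustration of how such a file might be read, here is a minimal Python sketch. It assumes the file is comma-separated with a header row whose names match the column list above exactly; neither detail is confirmed here, so check a real file first. The function name read_games and the sample data are made up for this example.

```python
import csv
import io

# Fixed columns as listed above; any other column is assumed to be a broker.
FIXED_COLUMNS = {
    "gameId", "gameName", "status", "gameSize", "gameLength",
    "lastTick", "weatherLocation", "weatherDate", "DOI", "logUrl",
}

def read_games(csv_file):
    """Return one dict per game: its id, log URL, and per-broker scores."""
    # Assumption: comma-separated with a header row; adjust if it is not.
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    games = []
    for row in reader:
        scores = {name: value for name, value in row.items()
                  if name is not None and name not in FIXED_COLUMNS}
        games.append({"gameId": row["gameId"],
                      "logUrl": row["logUrl"],
                      "scores": scores})
    return games

# Made-up sample data, for illustration only.
SAMPLE = (
    "gameId,gameName,status,gameSize,gameLength,lastTick,"
    "weatherLocation,weatherDate,DOI,logUrl,brokerA,brokerB\n"
    "1,game-1,complete,2,1440,1440,somewhere,2010-01-01,none,"
    "http://example.org/game-1.tar.gz,10.5,-3.0\n"
)

for game in read_games(io.StringIO(SAMPLE)):
    print(game["gameId"], game["logUrl"], game["scores"])
```

Scores are left as strings, since the sketch makes no assumption about how missing or non-numeric scores are written.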

Per-game analyzers

The Power TAC simulator produces two log files for each boot session, and two for each simulation (or sim) session with brokers.

  1. The first is the "trace" log that contains a record of significant activities in the simulator. The trace log is primarily intended for debugging -- for example, it's common to run a boot or sim and search for the string "ERROR" in the trace log. The trace log also sometimes contains information needed for analysis. Since the format of log entries in the trace log is not standardized, data extraction from the trace log requires a certain amount of text processing. There are currently no examples of such analyzers, since it's considered a bug to have important data not appearing in the state log.
  2. The second is the "state" log, containing a record of each significant state change in the simulator. These are all object creation and method calls. Not all object creation and method call events are logged, but generally all instances of domain types and messages are logged when they are created or when they arrive as messages from brokers, as are method calls that change the state of domain objects. The format of this log is standardized by the use of Java annotations to trigger the logging behavior, which is virtually all implemented using aspect-oriented programming. This means that a single codebase (it's in the powertac-aop module inside powertac-core) generates all the log entries. The format is also designed to be easy to process by re-creating copies of the original objects, and calling their methods. This allows the game-replay behavior of the visualizer, as well as data extraction from the re-created objects.
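
The search for "ERROR" in a trace log mentioned above is usually done with a text search tool, but it can also be sketched in a few lines of Python. The function name find_errors and the sample lines are made up for this example; real trace log entries will look different.

```python
def find_errors(lines, marker="ERROR"):
    """Yield (line_number, text) for every line that contains marker."""
    for number, line in enumerate(lines, start=1):
        if marker in line:
            yield number, line.rstrip("\n")

# Made-up lines standing in for a trace log; real entries will look different.
sample_lines = [
    "first entry\n",
    "ERROR something went wrong\n",
    "last entry\n",
]

for number, text in find_errors(sample_lines):
    print(number, text)
```

To scan a real trace log, pass an open file object instead of the sample list, since a text file iterates line by line in the same way.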

The analyzers in powertac-tools/logtool-examples all operate by reading a state log file and extracting data from the objects created as the file is processed. Most of them produce tables in csv format for ease in further analysis via spreadsheet, R, or Python, while a few produce output in the form of dictionary structures that are directly readable in Python. The format of the output is described in the header comments for each analyzer.

Analyzers are not difficult to write; in general, each has a setup section that processes the command line and initializes data collectors, methods that respond to events such as message arrivals, per-timeslot reporting, and summary reporting. Some understanding of simulator operation and message structure is needed in most cases. For example, wholesale market interactions for a given timeslot typically take place in the 24 timeslots preceding the delivery of energy.

To use these analyzers, you will need a Java 8 or newer SDK (not just the runtime), and a working installation of Maven 3.3 or newer. You will need to clone the powertac-tools repo (git clone https://github.com/powertac/powertac-tools), then run a build in powertac-tools/logtool-examples (mvn clean test).

The per-game analyzers can all be run from a command line as

   mvn exec:exec -Dexec.args="classname [options] input-file output-file"
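
If you want to launch an analyzer from a script rather than by hand, the command above could be assembled roughly as in the following Python sketch. The helper names, the analyzer name "SomeAnalyzer", and the file names are placeholders invented for this example. The sketch also assumes the command must be run from the logtool-examples directory, which is where the build was run.

```python
import subprocess

def logtool_command(classname, input_file, output_file, options=()):
    """Build the mvn command line shown above as an argument list."""
    # Arguments are joined with spaces, so paths containing spaces
    # are not handled by this sketch.
    exec_args = " ".join([classname, *options, input_file, output_file])
    return ["mvn", "exec:exec", "-Dexec.args=" + exec_args]

def run_analyzer(classname, input_file, output_file, options=(),
                 examples_dir="powertac-tools/logtool-examples"):
    """Run one analyzer; examples_dir is wherever the repo was cloned."""
    command = logtool_command(classname, input_file, output_file, options)
    return subprocess.run(command, cwd=examples_dir, check=True)

# "SomeAnalyzer" and both file names are placeholders, not real names.
print(logtool_command("SomeAnalyzer", "game-1.state", "game-1.csv"))
```

Because the command is passed to subprocess.run as a list, no shell quoting is needed around the exec.args value.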

Tournament processing

In many cases, interesting analyses must be extended across all the games in a tournament or experiment. The TournamentLogtoolProcessor in powertac-tools/python-scripts serves this purpose. It reads the csv file produced by the tournament or experiment manager, and iterates over the games. If the logs for a game are not yet stored locally, it downloads and unpacks them. For each game, it then runs the requested logtool analyzer. Note that you will need a significant amount of free disk space to download and unpack all the logs for a tournament. The complete set of unpacked logs for the 2018 tournament requires nearly 300 GB.

To use TournamentLogtoolProcessor.py, you need a working installation of Python 3.4 or newer. You can run it directly from the command line if your operating system supports this behavior (Linux and Mac OS should) and if you make the file executable (chmod +x TournamentLogtoolProcessor.py). If you run it without arguments, it will tell you what it needs. If your operating system does not support this behavior (anyone care to test on Windows and tell us?), then you need to be sure that the python command runs Python version 3 (it tells you when it starts up, and you can quit by typing quit() at the prompt). If not, then try python3. You can then run it as python3 TournamentLogtoolProcessor.py to get the usage instructions. Alternatively, you can load it into a Python IDE (like Idle) and run the iterate() method. As it runs, it will print out (or return) lines of the form

   {'path': path-to-data-file, 'gameId': game-number}

Each line can be read into python as a dict structure, which gives the game number and the path to the file.
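
One way to read such a line back into a dict is ast.literal_eval from the Python standard library, as in this minimal sketch. The function name parse_result_line and the sample line are made up for this example, and the sketch does not assume whether the game number arrives as a string or an integer.

```python
import ast

def parse_result_line(line):
    """Turn one printed line into a dict with 'path' and 'gameId' keys."""
    record = ast.literal_eval(line.strip())
    if not isinstance(record, dict) or not {"path", "gameId"} <= set(record):
        raise ValueError("unexpected line: " + line)
    return record

# Made-up output line; the real path and game id come from the processor.
sample = "{'path': 'data/game-1.csv', 'gameId': '1'}"
record = parse_result_line(sample)
print(record["gameId"], record["path"])
```

Unlike eval, literal_eval only accepts Python literals, so a malformed or unexpected line raises an error instead of being executed.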

Also in the powertac-tools/python-scripts directory are a number of examples that use this ability to iterate over games to extract and plot data across an entire tournament. An example of such a script is PlotCapacityData.py.