The PCS Grand Prediction Challenge sequences and rates have been announced

Objectives of the challenge

Prediction of one picture from one or more other pictures is a fundamental element in video coding and multi-view coding. The prediction is usually performed with the aid of side information that describes motion between frames, disparity between views and so forth. The predictor is often embedded within a closed feedback loop, but not always. For this challenge, we aim to separate out just the picture prediction task and provide a framework for evaluating and comparing the merits of different proposals within a competition environment.

The PCS’2015 organizing committee is endeavouring to establish a well-defined prediction challenge that might become a feature of the symposium in multiple years. The committee is interested in novel and innovative prediction approaches as well as approaches within the scope of video coding standards, such as HEVC and VP9, with a view to collecting and objectively assessing some of the most promising prediction technologies that might ultimately underpin compression technologies of the future.

With this in mind, the committee is interested in receiving submissions from researchers who are actively working on technologies that can be used to predict individual pictures from nearby pictures. Submissions are invited in one of two forms:

  1. Full submissions to the grand prediction challenge will involve both encoded forms of the prediction side information, assessed with respect to encoded size, and an algorithm for predicting target frames from source frames, using this side information, assessed with respect to various distortion metrics.
  2. Partial submissions to the grand prediction challenge may involve side information that has not been compressed, whose encoded size will not be assessed. For such submissions, only the predicted target frames are of interest. Partial submissions may include promising techniques, with attractive prediction properties, for which encoding algorithms have not yet been developed. While partial submissions are highly encouraged, they are not included in the competitive process, as the performance of a partial approach cannot be directly assessed. Nonetheless, promising partial solutions may be selected to form part of the competition record.

It is expected that the most promising prediction strategies will be recorded in a compendium of documents and a preamble that is authored by the competition chairs, to be included with the final published record of the Symposium’s proceedings. It is hoped that this may serve as a base, on which future prediction challenges can build.

The committee is also interested in receiving submissions that suggest novel distortion metrics that can be used to assess the merits of prediction technologies, above and beyond the standard metrics prescribed in this document. Submissions of this nature should be accompanied by a tool that can be used to assess the performance of both the proposers’ submission materials and other participants’ submission materials, following the formatting conventions described in the sequel.

Assumptions and constraints

To keep things manageable, the challenge involves prediction of each target video frame from a defined set of “nearby” source frames within the same sequence. In order to make the competition viable and meaningful, we find it necessary to impose some additional constraints, as described below. The intent behind these constraints is to avoid dependencies between the prediction task and the compression of reference pictures or prediction residuals. Such dependencies are inevitable in a real coding scheme, but may hinder the development and meaningful comparison of novel prediction schemes that might open the door to new compression systems.

In order to decouple the prediction process from other aspects of a typical video compression system, the challenge involves predicting target frames from “original” (i.e., uncompressed) source frames. Then, in order to give the results meaning in a context where the source frames are actually likely to be corrupted by lossy compression artefacts, we insist that the prediction scheme represent a continuous function with bounded derivative, when viewed as a function that maps source frame samples to target frame samples.

Finally, in order to avoid admitting solutions in which prediction residuals (or anything similar to residuals) might be encoded as part of the side information file, we insist that the prediction function involve no bias (or offset) that is independent of the source frames – that is, the prediction function must produce zeros when supplied with source frames that are all zero. This constraint is meaningful when combined with the continuity requirement, so the prediction algorithm must be locally linear about the origin and continuous everywhere, ignoring numerical effects. Considering that most frame prediction schemes are linear mappings, these more relaxed requirements should not prove overly restrictive.

In point form, the competition constraints are as follows:

  1. Each target frame is predicted from “original” source frames, as opposed to coded/quantized source frames. Moreover, the number of such original source frames that are involved in the prediction of any given target frame is small, and explicitly specified in the test conditions.
  2. The prediction process should be driven by side information (e.g., motion) that effectively defines a transformation from the source frames to the predicted target frame. No constraints are placed on how the side information should be interpreted or coded, except that the overall transformation from source to target frames should be “smooth”, continuous, and map input frames that are identically 0 to output frames that are also identically 0.
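As a rough illustration of the two constraints above, the sketch below probes a candidate predictor numerically. The function names and the averaging predictor are hypothetical stand-ins, not part of the challenge materials, and a random probe of this kind can only expose violations; it cannot prove that a derivative is bounded.

```python
import numpy as np


def average_predictor(src_a, src_b):
    """Hypothetical stand-in predictor: the plain average of two source frames."""
    return 0.5 * (src_a.astype(np.float64) + src_b.astype(np.float64))


def maps_zero_to_zero(predict, shape):
    """True if all-zero source frames yield an all-zero predicted frame."""
    zeros = np.zeros(shape, dtype=np.float64)
    return not np.any(predict(zeros, zeros))


def max_gain(predict, shape, trials=20, eps=1.0, seed=0):
    """Largest observed ratio ||change in output|| / ||change in input||.

    Random source frames are perturbed slightly and the resulting change in
    the prediction is measured. This is an empirical probe only.
    """
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        a = rng.uniform(0, 255, shape)
        b = rng.uniform(0, 255, shape)
        da = rng.normal(0, eps, shape)
        db = rng.normal(0, eps, shape)
        num = np.linalg.norm(predict(a + da, b + db) - predict(a, b))
        den = np.sqrt(np.linalg.norm(da) ** 2 + np.linalg.norm(db) ** 2)
        worst = max(worst, num / den)
    return worst
```

A predictor that adds a constant offset fails the first check, which is the kind of source-independent bias that the constraints are intended to exclude.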

It is worth noting that prediction schemes which estimate motion or other features based upon previously coded frames, using these to predict other frames (sometimes called pel-recursive methods), are essentially ruled out by the above constraints. Though undoubtedly useful, such approaches cannot be meaningfully evaluated in the context of original source frames (i.e., without a full video coder).

Measures of performance

The ultimate objective of the challenge can be expressed in terms of the operational Distortion-Rate function of the picture prediction system, where Distortion is a measure of the prediction error over a given video sequence and Rate measures the coded data rate of the side information.

The challenge employs the mean squared error (MSE), evaluated over all predicted frames, as its objective distortion measure. Authors are also encouraged to submit one or more additional metrics, which will be taken into consideration when assessing an approach; in this case, the authors should also submit an executable for evaluating these metrics.

The only objective measure of “Rate” is the total size of the encoded side information file, divided by the number of target frames whose prediction is achieved using the side information.

NB: partial submissions are also encouraged, in which the side information might not be coded, so that only the distortion is assessable at this point in time.
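
The two measures described in this section might be computed along the following lines. This is a minimal sketch: the function names are hypothetical, frames are assumed to be available as NumPy arrays of equal size, and the rate is expressed in bytes per target frame, a unit that the challenge description does not specify.

```python
import numpy as np


def sequence_mse(targets, predictions):
    """MSE pooled over every sample of every predicted frame."""
    total = 0.0
    count = 0
    for target, predicted in zip(targets, predictions):
        diff = target.astype(np.float64) - predicted.astype(np.float64)
        total += float(np.sum(diff * diff))
        count += diff.size
    return total / count


def rate_per_target_frame(side_info_bytes, num_target_frames):
    """Encoded side information size divided by the number of target frames."""
    return side_info_bytes / num_target_frames
```

When all frames have the same dimensions, pooling the squared errors in this way gives the same value as averaging the per-frame MSE values.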

Experimental conditions

Participants are expected to process a pre-defined set of video sequences under each of the following two prediction configurations:

  • Bidirectional: Frame f_k predicted using only frames f_(k-1) and f_(k+1), where k≥1.
  • Forward-only: Frame f_k predicted using only frames f_(k-2) and f_(k-1), where k≥2.
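
The two configurations above can be summarised as a small index mapping. The sketch below uses hypothetical function names and configuration strings; the upper limit on k in the bidirectional case is inferred from the need for frame f_(k+1) to exist and is not stated explicitly in the conditions.

```python
def source_indices(k, config):
    """Return the two source-frame indices used to predict target frame k."""
    if config == "bidirectional":
        if k < 1:
            raise ValueError("bidirectional prediction requires k >= 1")
        return (k - 1, k + 1)
    if config == "forward-only":
        if k < 2:
            raise ValueError("forward-only prediction requires k >= 2")
        return (k - 2, k - 1)
    raise ValueError(f"unknown configuration: {config!r}")


def target_indices(num_frames, config):
    """Target indices whose source frames all lie inside the sequence."""
    if config == "bidirectional":
        # f_(k+1) must exist, so the last usable target is num_frames - 2.
        return list(range(1, num_frames - 1))
    if config == "forward-only":
        return list(range(2, num_frames))
    raise ValueError(f"unknown configuration: {config!r}")
```

For a five-frame sequence, this gives targets 1 to 3 for the bidirectional configuration and targets 2 to 4 for the forward-only configuration.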

Additionally, for each video sequence and prediction configuration, participants with full submissions are expected to address a small set of pre-defined “Rate” conditions, expressed in terms of the maximum length of the encoded side information file.

The video sequences and rate conditions will be supplied on the competition page of the PCS’2015 website by 31 Jan 2015.

For each sequence, prediction configuration and rate condition (full submissions only), an encoded side information file is to be generated, forming part of the submission. For partial submissions, only one (presumably uncompressed) side information file should be generated for each sequence and prediction configuration.

A single prediction tool, provided with the submission, should be capable of processing the following input parameters to produce each individual predicted target frame:

  1. the zero-based index k that identifies a specific target frame to be predicted;
  2. the name of the encoded side information file that describes the prediction parameters for all target frames for a given set of test conditions;
  3. the two corresponding source frames, each in a separate input file; and
  4. the name of the file containing the predicted output frame.
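
One possible command-line shape for such a tool is sketched below. The challenge does not prescribe argument names or their order, so the program name, the argument names and the ordering here are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Command-line parser covering the four inputs listed above (assumed order)."""
    parser = argparse.ArgumentParser(
        prog="predict",  # hypothetical program name
        description="Predict one target frame from two source frames.",
    )
    parser.add_argument("k", type=int,
                        help="zero-based index of the target frame")
    parser.add_argument("side_info",
                        help="encoded side information file")
    parser.add_argument("source_a",
                        help="first source frame (header-less YUV)")
    parser.add_argument("source_b",
                        help="second source frame (header-less YUV)")
    parser.add_argument("output",
                        help="file that receives the predicted frame")
    return parser


def parse_args(argv):
    """Parse the arguments and reject a negative frame index."""
    args = build_parser().parse_args(argv)
    if args.k < 0:
        raise ValueError("k must be a non-negative, zero-based frame index")
    return args
```

With this layout, a call such as `predict 3 side.bin f2.yuv f4.yuv out.yuv` would request target frame 3 in the bidirectional configuration; the file names are placeholders.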

For the purpose of this challenge, the source frames processed by the prediction tool and the target frame that it produces correspond to individual header-less 8-bit/component YUV files, of which only the Y component need actually be processed for testing purposes. The dimensions of the frames are to be deduced from the side information file so that they need not be explicitly provided to the prediction tool.
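
Reading the Y component of such a file could look like the sketch below. It assumes planar 4:2:0 storage (a full Y plane followed by quarter-size U and V planes), which the challenge description does not state, and it takes the frame dimensions as parameters, whereas a real prediction tool would recover them from the side information file.

```python
import numpy as np


def frame_bytes(width, height):
    """Bytes per 8-bit frame, assuming planar 4:2:0 (Y, then U, then V)."""
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return width * height + 2 * chroma


def read_luma(path, width, height, frame_index=0):
    """Return the Y plane of one frame as a (height, width) uint8 array."""
    with open(path, "rb") as f:
        # Skip whole frames (luma and chroma) that precede the requested one.
        f.seek(frame_index * frame_bytes(width, height))
        data = f.read(width * height)
    if len(data) != width * height:
        raise ValueError("file is too short for the requested frame")
    return np.frombuffer(data, dtype=np.uint8).reshape(height, width)
```

For single-frame files, as used for the source and target frames of the prediction tool, `frame_index` stays at its default of 0.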

For the purpose of facilitating the testing process, participants are also expected to run their prediction tool under each relevant condition, concatenating all predicted frames to form a single YUV file for each sequence, prediction configuration and rate condition. These YUV files are to be uploaded to the test site, along with the prediction tool and a document describing the proposed method, as described below.
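
Because the files are header-less, the concatenation step amounts to appending the per-frame files in target-frame order. A minimal sketch follows, with a hypothetical function name and with the list of per-frame file paths assumed to be supplied already sorted by k.

```python
import shutil


def concatenate_frames(frame_paths, out_path):
    """Append header-less per-frame YUV files, in order, into one file."""
    with open(out_path, "wb") as out:
        for path in frame_paths:
            with open(path, "rb") as frame:
                shutil.copyfileobj(frame, out)
```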

Materials Submitted

Participants are expected to submit the following competition materials.

  • Side information (files) for each video and test scenario. These files should conform to a naming convention described on the competition web page.
  • A prediction tool, as described above, in the form of an executable program that can run as a command-line program under either Windows or Linux (preferably both).
  • A complete set of predicted frames, where the YUV frames for each test condition are concatenated into a single YUV file. It should be possible for these predicted frames to be recovered exactly by running the prediction tool mentioned above. These YUV files should conform to a naming convention described on the competition web page.
  • A document describing the prediction algorithm, the interpretation of the side information, and briefly explaining the method that was used to determine the side information. The document should also highlight desirable features of the proposed approach that are not explicitly tested. For example, issues related to complexity, hardware/software implementation and scalability are features that are not explicitly tested in the competition but would be of interest to the picture coding community. This document should be formatted according to exactly the same guidelines as a regular PCS’2015 paper, as outlined in the symposium paper kit.

As mentioned, the PCS’2015 committee is also interested in receiving submissions that propose additional, potentially novel metrics for distortion. Submissions of this form are still required to satisfy the requirements of a regular full or partial submission, meaning that there should be a prediction algorithm and predicted frames. Where an additional distortion metric is proposed, the participants are expected to submit the following:

  • A distortion tool, in the form of an executable program that can run as a command-line program under either Windows or Linux (preferably both), that accepts as input:
    • The original YUV test sequence
    • A prediction output YUV file consisting of all relevant predicted frames.
    • The offset of the first predicted frame compared to the first frame in the original sequence. This is 1 for bidirectional prediction and 2 for forward-only prediction.

The dimensions of the individual YUV frames should be determined from the name of the input file that holds the original YUV test sequence. To facilitate the submission of distortion tools, a conforming tool for the MSE metric is available here.
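
The core of a distortion tool with these three inputs might be sketched as follows, using MSE on the Y component as the example metric. Two things are assumed here and are not taken from the challenge materials: that the file name carries a `WIDTHxHEIGHT` token (for example `name_416x240.yuv`), and that frames are stored as 8-bit planar 4:2:0. The actual naming convention is the one given on the competition web page, and the function names are hypothetical.

```python
import os
import re

import numpy as np


def dims_from_name(path):
    """Extract (width, height) from a 'WIDTHxHEIGHT' token in the file name."""
    match = re.search(r"(\d+)x(\d+)", os.path.basename(path))
    if match is None:
        raise ValueError("no WIDTHxHEIGHT token in file name")
    return int(match.group(1)), int(match.group(2))


def luma_frames(path, width, height):
    """All Y planes of a header-less file, one row per frame (4:2:0 assumed)."""
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    size = width * height + 2 * chroma
    raw = np.fromfile(path, dtype=np.uint8)
    count = raw.size // size
    frames = raw[:count * size].reshape(count, size)
    return frames[:, :width * height].astype(np.float64)


def mse_with_offset(original_path, predicted_path, offset):
    """MSE between predicted frames and original frames starting at offset."""
    width, height = dims_from_name(original_path)
    original = luma_frames(original_path, width, height)
    predicted = luma_frames(predicted_path, width, height)
    reference = original[offset:offset + len(predicted)]
    if len(predicted) == 0 or len(reference) != len(predicted):
        raise ValueError("predicted frames do not fit inside the original")
    return float(np.mean((reference - predicted) ** 2))
```

Here `offset` is 1 for the bidirectional configuration and 2 for the forward-only configuration, so that predicted frame 0 is compared with original frame `offset`.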

Outcomes from the Challenge

It is intended that the outcome of the PCS’2015 prediction challenge will become part of the archived Symposium record. Subject to the availability of sufficiently competitive submissions, the Competition Chairs will select the best performing submission(s) and ask them to contribute their submitted document(s) (see above) as part of the Symposium’s enduring record.

Participants should be mindful of the fact that the documents that describe their submission may be published in this fashion, albeit without a formal peer review process. In particular, participants should be careful to correctly cite published, submitted or in-press material on which they rely, including manuscripts submitted as regular Symposium papers.

Where proposal descriptions are included as part of the Symposium proceedings, they will be accompanied by an introductory article, co-authored by the Competition Chairs, that explains the challenge, summarises the results, and provides justification for the selection of those proposals whose documents appear within the proceedings.

Dates

In order to reserve a presentation slot within the PCS’2015 grand prediction challenge, and hence an opportunity to have a proposed method documented as part of the output from the challenge, participants need to express their interest in making either a full submission or a partial submission by no later than 1 April 2015 (previously 15th of March). Thereafter, the participants should upload either a full submission or a partial submission to the competition test site by 15th of April.

Subsequently, updated submissions may be uploaded to the test site up until the 10th of May. Participants who initially submit only a partial submission (no encoding of side information and hence no rate condition) may opt to update their materials to constitute a full submission. It is also possible, but unlikely, for participants to downgrade a full submission to a partial submission prior to the final deadline of 10th of May, 2015. Participants may also update their descriptive document up until the final deadline.

Participation

It is expected that at least one of the proposers, a co-author on the descriptive document, will attend the competition event on Sunday May 31, 2015, to present the proposed method. Submissions that are not presented will not be judged and will not be eligible for inclusion as part of the recorded output from the challenge.