

Activity recognition results on UCF Sports and Hollywood2

The table above shows the results obtained on the UCF Sports dataset (http://crcv.ucf.edu/data/UCF_Sports_Action.php). We report the recognition rate with respect to the number...


Computational efficiency and parallel implementation

The developed algorithms are computationally efficient, and the compositional processing pipeline is well suited to implementation on massively parallel architectures. Many...


Motion hierarchy structure

Our model comprises three processing stages, as shown in the figure. The task of the lowest stage (layers...




L1: motion features

Layer L1 provides the input to the compositional hierarchy. Motion obtained in L0 is encoded using a small dictionary.
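The idea of encoding motion with a small dictionary can be illustrated with a minimal sketch. It is not the project's actual L1 encoding, which is not described here. The sketch assumes that L0 yields per-point flow vectors (dx, dy), that the dictionary consists of eight evenly spaced direction prototypes, and that weak motion is labelled -1. The function names, the dictionary size, and the magnitude threshold are all hypothetical.

```python
import math


def build_direction_dictionary(num_directions=8):
    """Return unit vectors evenly spaced around the circle.

    Hypothetical dictionary: the real L1 dictionary is not specified in the text.
    """
    return [
        (math.cos(2 * math.pi * k / num_directions),
         math.sin(2 * math.pi * k / num_directions))
        for k in range(num_directions)
    ]


def encode_motion(flow, dictionary, min_magnitude=0.5):
    """Map each (dx, dy) flow vector to the index of its best-matching entry.

    Vectors shorter than min_magnitude (an assumed threshold) get the
    label -1, meaning "no significant motion".
    """
    labels = []
    for dx, dy in flow:
        if math.hypot(dx, dy) < min_magnitude:
            labels.append(-1)
            continue
        # The largest dot product corresponds to the smallest angular difference.
        scores = [dx * ux + dy * uy for ux, uy in dictionary]
        labels.append(max(range(len(dictionary)), key=scores.__getitem__))
    return labels


dictionary = build_direction_dictionary(8)
flow = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0), (0.1, 0.0)]
labels = encode_motion(flow, dictionary)
print(labels)  # [0, 2, 5, -1]
```

Each flow vector is reduced to a single symbol, so higher layers of a hierarchy could compose these symbols rather than raw motion vectors.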



Perception of motion plays a central role in biological visual systems. Sophisticated mechanisms for observing, extracting, and utilizing motion exist even in primitive animals. For humans, successful motion processing is a prerequisite for accomplishing many everyday tasks. For example, action and activity recognition and categorisation are of crucial importance for the awareness of one’s environment and for interaction with one’s surroundings.

Current state-of-the-art computer vision methods work well for problems within limited domains and for specific tasks, and activity recognition and categorisation are no exception. However, when applied in more general settings, such methods turn out to be fragile, less efficient, or even computationally intractable. In a nutshell, the classic approaches are neither general nor scalable. Consequently, new paradigms that would alleviate these shortcomings are constantly sought.

Scientific advances in recent years, especially in the field of neuroscience, have provided inspiration and insights that have given rise to novel approaches in computer vision. Far from duplicating the functionality of the human brain, these approaches aim to improve the performance of computer vision methods by utilizing a selection of biologically inspired design principles. One such principle is hierarchical compositionality, which has already been exploited in the design of state-of-the-art object categorization methods, with significant contributions from the proposers of this project. Compared to other state-of-the-art approaches, those based on hierarchical compositionality allow for much more efficient use of existing resources. This is achieved by sharing both the representation units and the computations, and by transferring knowledge, which makes the learning process much more efficient. Recently, hierarchical approaches to the analysis of motion have begun to emerge. Nevertheless, the analysis of motion with hierarchical compositional models is still in its infancy.


The proposed project aims at a holistic approach towards learning, detection and recognition / categorisation of visual motion and the phenomena derived from it. The approach is based on a novel and powerful paradigm of learning multi-layer compositional hierarchies. While individual ingredients, such as hierarchical processing, compositionality and incremental learning, have already been subjects of research, they have, to the best of our knowledge, never been treated in a unified motion-related framework. Such a framework is crucial for robustness, versatility, ease of learning and inference, generalisation, real-time performance, knowledge transfer, and scalability across a variety of cognitive vision tasks.

Scientific novelty and relevance

Current state-of-the-art engineered solutions require extensive training and hand design, are fragile, and cannot generalize well enough to respond to novel situations. The paradigm of learning multi-layer compositional visual hierarchies offers a way to overcome these limitations.


The main scientific challenge lies in designing the structures and the learning process so as to enable efficient learning of robust, extendable, and general-purpose visual representations that facilitate the execution of various motion-related tasks in real-world settings. Another part of the challenge is the introduction of proper benchmarks for these tasks, to establish a basis for the evaluation and comparison of competing approaches.


The proposed project is feasible because it focuses on a set of well-defined requirements that have been translated into an advanced methodology, from both the scientific and the technological point of view, followed by a carefully defined set of experiments planned to validate the proposed approach.
