The player core component is a framework capable of providing real-time synchronization between the viewing window of the video and a set of plugins. In addition to this core, each plugin has a set of parameters that define its behavior for a given application.
The main principle of amalia.js is to have a unified metadata model. Ensuring that all the metadata types are consistent and use the same standards facilitates their use and enables us to design generic visualization plugins. A typical usage of amalia.js is to have a video file with a few metadata blocks. When instantiating the player, a binding has to be made between the metadata blocks and the visualization plugins, so as to decide which metadata is displayed and in which way. A single metadata block can be bound to several visualization plugins, each one responsible for displaying a specific facet of the data.
There are two main types of metadata that the player has to manage: on one side the metadata of media streams, described in this section, and on the other side the settings of the player, the plugins and the interactions with applications, discussed in the following sections. A generic metadata model has been designed; it is intended to represent all the metadata of a media stream. It can describe both the audio and video of a content, technical information, documentary notes and the results of automatic metadata extraction algorithms. Currently, this model only manages the metadata of a single stream. For analysis results over a corpus, involving links between streams (clustering, mining, similarity detection ...), a specific data model extension has to be created and is currently under investigation.
A full XML schema has been written to represent this metadata model. This schema has many more elements than what is strictly needed to display simple metadata with amalia.js. The current version is 0.2.5. We use this schema to generate Java classes with the JAXB plugin for Maven, and Jackson to serialize the metadata stream in Json. This Java implementation is available in a separate project along with several helper methods through a general factory (see documentation below). This Json format is currently the only supported metadata format.
Each metadata block has a unique `id` that is used for the binding with the visualization plugins (see below). A metadata block should only represent a coherent and homogeneous set of data. If you need to display distinct types of data, or even several blocks of data of the same type (e.g. the output of different versions of the same algorithm), you should provide them in separate metadata blocks.
The model is generic and extensible: as described below, it can represent metadata that is localized in time, organized hierarchically, and localized spatially within the image.
The timecode representation in our format, for the `tc`, `tcin` and `tcout` fields, must match the following regular expression: `[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{4}` (hh:mm:ss.SSSS). Using 4 digits for the milliseconds field is useful when dealing with precise temporal metadata, especially with audio description.
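As a quick illustration, the pattern can be checked in JavaScript (the anchors and the escaped dot are added here for strictness; the format itself is the one given above):

```javascript
// Illustrative check of the hh:mm:ss.SSSS timecode format used by amalia.js.
var TC_PATTERN = /^[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{4}$/;

console.log(TC_PATTERN.test("00:00:30.0000")); // true
console.log(TC_PATTERN.test("00:00:30.000"));  // false: only 3 millisecond digits
```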
The only mandatory field of a metadata block is its `id`. All the other fields are, for the moment, only here for information purposes. In the near future, the content of the `type` field will be normalized.
All the temporal metadata are in the main `localisation` field. Each localisation has its temporal information represented either as a single timecode (using the `tc` field) or as a time segment (with both `tcin` and `tcout` timecodes provided). The hierarchical structure of the temporally localized metadata is represented by nested `localisation` blocks within `sublocalisations`. As a full pyramid of levels is not mandatory, and also to improve parsing performance, the level of a `localisation` is also explicitly provided with a `tclevel` field.
The model will be fully documented later. If you need to add specific information that is not yet managed by our model, you can still use the `data` field to store any Json block with your own structure and access it in your plugins.
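For instance, a minimal sketch of a localisation carrying such a custom block might look as follows (the content of `data` is entirely application-specific; the field names inside it are hypothetical):

```json
{
  "label": "A demo label !",
  "tc": "00:00:30.0000",
  "tclevel": 1,
  "data": {
    "myScore": 0.87,
    "myTags": ["studio", "interview"]
  }
}
```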
In the following example, we have a simple metadata block for a 1 minute video. The id of this block, which should be used to bind it to a visualization plugin, is `amalia-simple01`. It contains a single item of data, at 30 seconds, which has a label.
{ "id":"amalia-simple01", "type":"simple", "algorithm":"demo-json-generator", "processor":"Ina Research Department - N. HERVE", "processed":1418900533632, "version":1, "localisation":[ { "type":"simple", "tcin":"00:00:00.0000", "tcout":"00:01:00.0000", "tclevel":0, "sublocalisations":{ "localisation":[ { "label":"A demo label !", "tc":"00:00:30.0000", "tclevel":1 } ] } } ] }
You may also have a look at the Json files used in the following examples: cue points, segments, text and overlay.
Textual metadata can be displayed by two specific plugins: caption and text synchronization. Depending on the textual metadata you have, it may be useful to exploit its hierarchical structure. Typically, one may choose to have the words, sentences and paragraphs of a text temporally localized. In such a case, one would choose a specific `tclevel` for each level of text. When binding a text metadata block with one of the text visualization plugins, one has to choose which level will be displayed.
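As a hedged sketch of such a hierarchy (the `id`, `type` value, labels and level assignment are illustrative; the structure simply reuses the nested `localisation` / `sublocalisations` mechanism described above), a sentence at `tclevel` 1 with its words at `tclevel` 2 could be described as:

```json
{
  "id": "amalia-text01",
  "type": "synchronized_text",
  "localisation": [
    {
      "tcin": "00:00:00.0000",
      "tcout": "00:00:03.0000",
      "tclevel": 0,
      "sublocalisations": {
        "localisation": [
          {
            "label": "Hello world, welcome.",
            "tcin": "00:00:00.0000",
            "tcout": "00:00:03.0000",
            "tclevel": 1,
            "sublocalisations": {
              "localisation": [
                { "label": "Hello", "tc": "00:00:00.2000", "tclevel": 2 },
                { "label": "world,", "tc": "00:00:01.0000", "tclevel": 2 },
                { "label": "welcome.", "tc": "00:00:02.0000", "tclevel": 2 }
              ]
            }
          }
        ]
      }
    }
  ]
}
```

When binding this block to a text plugin, choosing level 1 would display sentences while level 2 would display individual words.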
The overlay plugin is able to display the spatial localisation of the metadata. Currently, we manage only rectangle and ellipse bounding boxes. All the coordinate information (`x`, `y`, `w`, `h`, ...) consists of real values between 0 and 1, normalized to the media size. The orientation `o` is a real value between -π and π.
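Since the coordinates are fractions of the media size, converting them back to pixels is a simple multiplication; a minimal sketch (variable and function names are ours):

```javascript
// Convert normalized overlay coordinates to pixel coordinates.
// mediaWidth / mediaHeight are the displayed media dimensions in pixels.
function toPixels(box, mediaWidth, mediaHeight) {
    return {
        x: box.x * mediaWidth,
        y: box.y * mediaHeight,
        w: box.w * mediaWidth,
        h: box.h * mediaHeight
    };
}

// Example: a box covering the central quarter of a 1280x720 video.
console.log(toPixels({ x: 0.25, y: 0.25, w: 0.5, h: 0.5 }, 1280, 720));
// { x: 320, y: 180, w: 640, h: 360 }
```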
In order to track an object, you need to have a `localisation` in your metadata with the correct duration (`tcin` and `tcout` fields). Then you add a `spatials` block to this `localisation`. Each specific track is represented by a `spatial` block. In this `spatial`, you only provide the key points of your object trajectory, each with its own timecode (`tc`). Between these points, the position of the bounding box is linearly interpolated by the overlay plugin (in fact, currently, by the Raphaël library). You can have a look at the example Json file of the overlay plugin for more details.
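As a rough sketch only, a tracked object could be described with two key points as below. The exact nesting inside `spatials` and the field names `shape` and `keypoints` are assumptions for illustration; refer to the overlay example Json file for the authoritative format.

```json
{
  "tcin": "00:00:10.0000",
  "tcout": "00:00:20.0000",
  "tclevel": 1,
  "label": "Tracked object",
  "spatials": {
    "spatial": [
      {
        "shape": "rectangle",
        "keypoints": [
          { "tc": "00:00:10.0000", "x": 0.10, "y": 0.20, "w": 0.30, "h": 0.40, "o": 0.0 },
          { "tc": "00:00:20.0000", "x": 0.50, "y": 0.25, "w": 0.30, "h": 0.40, "o": 0.0 }
        ]
      }
    ]
  }
}
```

Between 10 s and 20 s, the overlay plugin would move the box linearly from (0.10, 0.20) to (0.50, 0.25).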
You can either have static binding, where you explicitly associate a metadata block with a plugin, or dynamic binding, where the association is based on the metadata types. If you provide a metadata block id to the plugin, you activate static binding; if you don't provide any, dynamic binding is activated. The following table lists the automatic bindings between metadata types and visualization plugins (a configuration sketch is given after the table):
Type | Description | Key |
---|---|---|
DEFAULT | | fr.ina.amalia.player.PluginBindingManager.dataTypes.DEFAULT |
DETECTION | Mapped to the timeline plugin in cue point mode | fr.ina.amalia.player.PluginBindingManager.dataTypes.DETECTION |
VISUAL_DETECTION | Mapped to the timeline plugin in visual component mode and to the overlay plugin | fr.ina.amalia.player.PluginBindingManager.dataTypes.VISUAL_DETECTION |
VISUAL_TRACKING | Mapped to the timeline plugin in visual component mode and to the overlay plugin | fr.ina.amalia.player.PluginBindingManager.dataTypes.VISUAL_TRACKING |
SEGMENTATION | Mapped to the timeline plugin in segment mode | fr.ina.amalia.player.PluginBindingManager.dataTypes.SEGMENTATION |
AUDIO_SEGMENTATION | Mapped to the timeline plugin in segment mode | fr.ina.amalia.player.PluginBindingManager.dataTypes.AUDIO_SEGMENTATION |
TRANSCRIPTION | Mapped to the text synchronization and caption plugins | fr.ina.amalia.player.PluginBindingManager.dataTypes.TRANSCRIPTION |
SYNCHRONIZED_TEXT | Mapped to the text synchronization and caption plugins | fr.ina.amalia.player.PluginBindingManager.dataTypes.SYNCHRONIZED_TEXT |
KEYFRAMES | Mapped to the timeline plugin in image mode | fr.ina.amalia.player.PluginBindingManager.dataTypes.KEYFRAMES |
HISTOGRAM | Mapped to the timeline plugin in histogram mode | fr.ina.amalia.player.PluginBindingManager.dataTypes.HISTOGRAM |
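As a hedged illustration of static binding, a timeline line can be tied to the `amalia-simple01` block roughly as follows. The configuration keys below (in particular `metadataId`, `listOfLines` and the plugin class name) are drawn from typical amalia.js examples and should be checked against the plugin documentation before use.

```javascript
// Sketch of a player configuration with an explicit (static) binding
// between the metadata block "amalia-simple01" and one timeline line.
// Key names are assumptions; check the plugin documentation before use.
$("#myplayer").mediaPlayer({
    src: "video.mp4",
    plugins: {
        dataServices: ["amalia-simple01.json"],
        list: [
            {
                className: "fr.ina.amalia.player.plugins.TimelinePlugin",
                container: "#timeline-container",
                parameters: {
                    listOfLines: [
                        {
                            title: "Demo cue points",
                            type: "cuepoint",
                            // Omitting metadataId would fall back to dynamic
                            // binding based on the metadata type.
                            metadataId: "amalia-simple01"
                        }
                    ]
                }
            }
        ]
    }
});
```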
For any information, you can contact us at