Vitrivr is an open-source research system. This guide is intended to help you fully set up vitrivr and its components and show you how to search an example collection provided by us. If you plan to use vitrivr in your research, contact us.
The vitrivr stack does not require any special hardware and can be deployed on any reasonably modern machine (or on multiple machines for a distributed deployment). Since both Cottontail and Cineast are capable of operating on multiple CPU cores in parallel, it is recommended to use a multi-core machine. The memory consumption of Cottontail and Cineast (during extraction) can also be substantial. While Cineast has an integrated swapping mechanism, it is highly recommended to run the vitrivr stack in an environment with 8 GB or more of main memory.
The software requirements depend on the flavor of the deployment. The following list provides an overview of the required software packages:
Instructions for both native and Docker deployment are provided in the Cottontail README.
We provide an example config. To read more about how to configure Cineast, consult the Wiki.
Once this configuration is done, Cineast should be able to communicate with the storage layer. Next, Cineast can be instructed to create all necessary entities which it needs for retrieval. To do this, the following command can be run from within the directory containing both cineast.jar and cineast.json:
In case the Cineast CLI is enabled in the configuration – which it is by default – Cineast will not terminate after the setup but rather wait for further instructions. As there is nothing to do at this point, it can be terminated using either exit or quit.
To build the UI from source, have a look at the Wiki of vitrivr-ng.
Currently, vitrivr is undergoing a heavy rewrite. Therefore, this section is under construction as of June 2020 and subject to change.
After the setup is complete, vitrivr is ready to receive some multimedia documents. To add documents to the collection known to vitrivr, Cineast has to perform an extraction process on them. The properties of such an extraction job are specified in a job file. Several examples of such job files can be found on GitHub.
Shown below is a basic example of an extraction job file for videos, as specified by the type property. For every extraction, only one type of multimedia document is processed.
The first block after the type specification is the input block, which contains information on where the files are located and how they should be traversed in the file system. If a relative file location is provided in the path property, it is considered to be relative to the location of the job file. Note, however, that Cineast stores the file paths relative to itself, so the stored paths for the documents may differ from what is specified in the job file.
The following block lists the extractors which are to be used for this job. Each entry in this list refers to an extraction module and contains the Java class to be loaded. These modules produce the actual feature vectors during the extraction.
Similar to the extractors, the exporters also process the decoded information from the documents. However, they do not produce any feature vectors but are instead used to export information which can be used elsewhere. In this example, only a single exporter is used, which generates thumbnail images to be used as static previews during retrieval.
To run an extraction job, Cineast has to be called with the job file as a parameter. Since the extraction can have a large memory footprint, especially for video, it is recommended to pre-allocate sufficient memory to avoid triggering the internal swapping mechanisms – which would increase extraction time – or running out of memory, which would cause the extraction to abort. The following example starts an extraction job with 8 GB of pre-allocated memory:
After the extraction of the first multimedia objects is complete, vitrivr will be able to use them for retrieval. For retrieval to work properly, however, some settings might need to be adjusted.
Depending on the types of documents within a collection and the features used during extraction, the configuration which tells Cineast which features to use during retrieval needs to be changed. This is done in the cineast.json configuration file using the retriever.features property. It specifies which feature categories exist and which retrieval modules each category comprises. The category names need to match those defined in the UI, as they are used as query parameters. For retrieval to work efficiently, the categories should contain only those features which were used during extraction, and every feature should be present in only one category. Within a category, each feature can be assigned a weight which is used for result fusion within the category. These weights determine the influence of an individual feature on the result from a category and can be tuned depending on the use case at hand.
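The two rules above (each feature in only one category, weights controlling influence within a category) can be checked with a small script. The following is a minimal sketch and not part of Cineast: the shape of the configuration object, the property names feature and weight, and all feature names are assumptions made for illustration, and it assumes that a feature's influence is proportional to its weight.

```typescript
// Assumed shape of a retriever.features-like block: category name mapped to
// a list of weighted feature modules. Property names are illustrative only.
interface WeightedFeature {
  feature: string; // name of the retrieval module (hypothetical names below)
  weight: number;  // weight used for result fusion within the category
}

type FeatureCategories = { [category: string]: WeightedFeature[] };

// Returns the names of all features that are listed in more than one category.
function findDuplicateFeatures(categories: FeatureCategories): string[] {
  const seenIn: { [feature: string]: string } = Object.create(null);
  const duplicates: string[] = [];
  Object.keys(categories).forEach((category) => {
    categories[category].forEach((entry) => {
      const previous = seenIn[entry.feature];
      const isNew = duplicates.indexOf(entry.feature) === -1;
      if (previous !== undefined && previous !== category && isNew) {
        duplicates.push(entry.feature);
      }
      seenIn[entry.feature] = category;
    });
  });
  return duplicates;
}

// Share of each feature within one category, with weights normalised to sum to 1.
function relativeInfluence(features: WeightedFeature[]): { [feature: string]: number } {
  const total = features.reduce((sum, f) => sum + f.weight, 0);
  const shares: { [feature: string]: number } = Object.create(null);
  features.forEach((f) => {
    shares[f.feature] = total > 0 ? f.weight / total : 0;
  });
  return shares;
}

// Hypothetical configuration in which one feature is listed twice.
const exampleCategories: FeatureCategories = {
  globalcolor: [
    { feature: "ExampleColorFeature", weight: 3 },
    { feature: "ExampleHistogramFeature", weight: 1 },
  ],
  edge: [
    { feature: "ExampleEdgeFeature", weight: 1 },
    { feature: "ExampleColorFeature", weight: 1 },
  ],
};

const duplicatedFeatures = findDuplicateFeatures(exampleCategories);
const colorShares = relativeInfluence(exampleCategories.globalcolor);
```

Running such a check before starting Cineast would flag ExampleColorFeature as appearing in two categories, and shows that a weight of 3 next to a weight of 1 corresponds to three quarters of the category's score under the proportionality assumption.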
After the features are configured correctly, start Cineast using
In order for the UI to be able to display the retrieved content correctly, it needs to know how the actual documents (original files) and optional preview images (thumbnails) can be accessed via HTTP(S). The relevant base paths must be specified in its config.json configuration file in the resources block using the host_thumbnails and host_object properties.
The final paths are constructed by the ResolverService class relative to the base paths set up in the configuration. For thumbnails, Vitrivr NG expects a folder structure under the base path based on the media type (‘audio’, ‘video’, ‘image’, ‘model3d’) followed by the media object’s ID. For the original files, Vitrivr NG expects there to be folders separating the media types under the base path. The path known to Cineast is resolved directly against this media-type-specific folder. If a different folder structure is required for a particular installation, the ResolverService can be adjusted to reflect these differences.
After the setup, vitrivr should be ready for search. In the UI, the sidebar on the left-hand side of the screen contains all the relevant elements for query formulation. Query formulation is centered around the concept of query containers and query terms. Upon execution, the query terms within a query container are connected through a logical AND relationship, whereas different query containers are connected by a logical OR. Query containers can be added by clicking the green (+) button. Per container, the UI presents the end-user with a choice of up to five query terms: image, audio, 3D model, motion, and text (the selection can depend on the configuration). Each term can be toggled, and only one instance of a query term of each type can be active per container. Generally, a query term allows the end-user to either select or create a reference document for similarity. For example, the UI includes a sketchpad that can be used to draw sketches for Query-by-Sketch. It is also possible to upload files such as images, audio snippets, or 3D models for Query-by-Example. Moreover, the execution of some query terms can be refined through additional settings, which influence the feature modules that will be used.
Once a query has been formulated, a click on the search button at the top of the left side panel starts the retrieval process. On the Cineast side, partial results are aggregated and transmitted per feature category. Vitrivr NG displays these partial results as they become available and usually updates the view several times in the process. Currently, there are three different views for presenting results: two types of gallery views and a simple list view. You can navigate between those views using the toolbar (top). Switching between views does not influence the result set, and you can navigate even while the query is being executed.
As results become available, the panel on the right side of the screen is updated with additional options (if it is not visible, you can activate it using the appropriate button in the toolbar). Using this panel, the weights for the different categories can be adjusted, which updates the ranking and thus the order of the results. Furthermore, media type filters can be toggled. Since these operations are executed entirely by the UI, no further communication with the back-end is necessary. However, they only operate on the result set that is available locally.
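To make the expected folder layout concrete, the following sketch builds both kinds of URLs from the two base paths. It is an illustration only and not the actual ResolverService code: the function names, the example hosts, and in particular the thumbnail file name are assumptions, since the text above only specifies the folders.

```typescript
type MediaType = "audio" | "video" | "image" | "model3d";

// Mirrors the two properties of the resources block named above.
interface ResourceConfig {
  host_thumbnails: string; // base path for preview images
  host_object: string;     // base path for original files
}

// Joins URL parts with exactly one slash between them.
function joinUrl(...parts: string[]): string {
  return parts
    .map((part, index) =>
      index === 0 ? part.replace(/\/+$/, "") : part.replace(/^\/+|\/+$/g, ""))
    .filter((part) => part.length > 0)
    .join("/");
}

// Thumbnails: <base>/<media type>/<object ID>/<file name>.
// The file name is a hypothetical parameter; the source does not define it.
function thumbnailUrl(config: ResourceConfig, mediaType: MediaType,
                      objectId: string, fileName: string): string {
  return joinUrl(config.host_thumbnails, mediaType, objectId, fileName);
}

// Originals: <base>/<media type>/<path known to Cineast>.
function objectUrl(config: ResourceConfig, mediaType: MediaType,
                   pathKnownToCineast: string): string {
  return joinUrl(config.host_object, mediaType, pathKnownToCineast);
}

// Hypothetical hosts and identifiers, for illustration only.
const exampleResources: ResourceConfig = {
  host_thumbnails: "http://localhost/thumbnails/",
  host_object: "http://localhost/objects",
};
const exampleThumbnail = thumbnailUrl(exampleResources, "video", "v_1", "s_1.png");
const exampleObject = objectUrl(exampleResources, "video", "movies/clip.mp4");
```

With these example values, the thumbnail resolves to http://localhost/thumbnails/video/v_1/s_1.png and the original file to http://localhost/objects/video/movies/clip.mp4. An installation with a different layout would change only the two resolver functions.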
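The container and term semantics can be sketched as a small data structure. This is a deliberately simplified model with hypothetical names, not the Vitrivr NG implementation: real retrieval is similarity-based and produces scores, whereas the sketch reduces each term to a yes/no match in order to show only the AND/OR combination and the one-term-per-type rule.

```typescript
type TermType = "image" | "audio" | "model3d" | "motion" | "text";

// A container holds at most one active term per type.
class QueryContainer {
  private terms: { [type: string]: string } = Object.create(null);

  // Toggling an active type removes it; toggling an inactive type adds it.
  toggle(type: TermType, reference: string): void {
    if (type in this.terms) {
      delete this.terms[type];
    } else {
      this.terms[type] = reference;
    }
  }

  activeTypes(): TermType[] {
    return Object.keys(this.terms) as TermType[];
  }
}

// AND within a container: every active term must match.
// A container without active terms matches nothing.
function containerMatches(container: QueryContainer, matchedTypes: TermType[]): boolean {
  const active = container.activeTypes();
  return active.length > 0 && active.every((t) => matchedTypes.indexOf(t) !== -1);
}

// OR across containers: one matching container is enough.
function queryMatches(containers: QueryContainer[], matchedTypes: TermType[]): boolean {
  return containers.some((c) => containerMatches(c, matchedTypes));
}

// Example query: (image AND text) OR (audio). References are placeholders.
const sketchAndText = new QueryContainer();
sketchAndText.toggle("image", "sketch.png");
sketchAndText.toggle("text", "red car");
const audioOnly = new QueryContainer();
audioOnly.toggle("audio", "snippet.wav");
const exampleQuery = [sketchAndText, audioOnly];
```

In this model, a document matched only by the image term does not satisfy the query, while one matched by both image and text, or by audio alone, does.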
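Because re-ranking and filtering work purely on the local result set, they amount to a filter followed by a sort. The sketch below illustrates this under stated assumptions: the result shape and all names are hypothetical, and the weighted mean used to combine category scores is one plausible fusion rule, not necessarily the one Vitrivr NG applies.

```typescript
// Hypothetical shape of a locally available result.
interface LocalResult {
  id: string;
  mediaType: string;
  categoryScores: { [category: string]: number }; // score per feature category
}

// Weighted mean of the category scores; missing categories count as 0.
function fusedScore(result: LocalResult, weights: { [category: string]: number }): number {
  let weightedSum = 0;
  let totalWeight = 0;
  Object.keys(weights).forEach((category) => {
    const score = result.categoryScores[category];
    weightedSum += weights[category] * (score === undefined ? 0 : score);
    totalWeight += weights[category];
  });
  return totalWeight > 0 ? weightedSum / totalWeight : 0;
}

// Keeps only enabled media types and orders by descending fused score.
// The input array is left unchanged; no back-end call is involved.
function rerank(results: LocalResult[], weights: { [category: string]: number },
                enabledMediaTypes: string[]): LocalResult[] {
  return results
    .filter((r) => enabledMediaTypes.indexOf(r.mediaType) !== -1)
    .sort((a, b) => fusedScore(b, weights) - fusedScore(a, weights));
}

// Hypothetical local result set with two categories.
const localResults: LocalResult[] = [
  { id: "A", mediaType: "video", categoryScores: { color: 0.9, edge: 0.1 } },
  { id: "B", mediaType: "image", categoryScores: { color: 0.2, edge: 0.8 } },
  { id: "C", mediaType: "video", categoryScores: { color: 0.5, edge: 0.5 } },
];

const byEdge = rerank(localResults, { color: 0, edge: 1 }, ["video", "image"]);
const videosByColor = rerank(localResults, { color: 1, edge: 0 }, ["video"]);
```

Shifting all weight to the edge category orders the example results as B, C, A, while weighting only color and enabling only videos yields A, C. Results that were never transmitted to the UI cannot appear in either ranking.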
The vitrivr stack is designed in such a way that it can easily be extended and adjusted to different use cases.
Due to its modular architecture, adding further feature modules to Cineast is simple and does not even require a rebuild of Cineast itself. Feature modules are loaded via Java reflection whenever they are needed during extraction or retrieval. Therefore, they just need to be present in the classpath of Cineast. We provide an example repository which shows how such feature modules are constructed. Once the newly added modules are present in the classpath, they can be used in the same way as those provided with Cineast.
All the different UI components in Vitrivr NG have been realized as dedicated Angular components. Adjusting the UI therefore boils down to either adjusting existing Angular components or adding new ones. All you need is knowledge of Angular and TypeScript.
Adding new forms of results presentation is straightforward and only requires two steps. First, one must create a new component that deals with the presentation logic. Inheriting from the existing AbstractResultsViewComponent class ensures that all the wiring with the QueryService is already in place. See the implementations of the GalleryComponent or ListComponent for examples. Second, one must provide the user with a means to navigate to the new view. This can be achieved by adding a navigation rule to the AppRoutingModule and adding a link to the toolbar. Of course, it is also possible to create a component from scratch, in which case the interface with the QueryService must be established manually.
All communication facilities for the Cineast API are implemented as services. Angular services are reusable classes that can be injected into components and other services. Currently, there is a QueryService singleton, which provides the similarity query functionality, and a MetadataLookupService, which enables lookup of metadata entries. It is easy to add new services that connect to other WebSocket or RESTful endpoints exposed by Cineast. The CineastAPI service class provides the basic communication primitives required for WebSocket communication and can be reused in other classes. The entire communication layer uses an Observer pattern powered by RxJS, and we encourage users to adopt the same pattern when creating extensions.
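For readers unfamiliar with the Observer pattern, the following sketch shows the idea with a hand-rolled subject, so that it runs without any dependencies. In an actual extension, an RxJS Subject or Observable would take the place of SimpleSubject. All class and property names here are hypothetical and do not correspond to the real Vitrivr NG services.

```typescript
type Observer<T> = (value: T) => void;

// Minimal stand-in for an observable stream of values.
class SimpleSubject<T> {
  private observers: Observer<T>[] = [];

  // Registers an observer and returns a function that unsubscribes it.
  subscribe(observer: Observer<T>): () => void {
    this.observers.push(observer);
    return () => {
      this.observers = this.observers.filter((o) => o !== observer);
    };
  }

  // Pushes a value to all currently registered observers.
  next(value: T): void {
    this.observers.slice().forEach((o) => o(value));
  }
}

// Hypothetical message: partial results for one feature category.
interface PartialResult {
  category: string;
  ids: string[];
}

// Hypothetical service; in a real extension the messages would arrive
// from a WebSocket or RESTful endpoint instead of a direct call.
class ExampleQueryService {
  readonly results = new SimpleSubject<PartialResult>();

  receive(message: PartialResult): void {
    this.results.next(message);
  }
}

// A view subscribes and is updated each time a partial result arrives.
const exampleService = new ExampleQueryService();
const receivedCategories: string[] = [];
const unsubscribe = exampleService.results.subscribe((partial) => {
  receivedCategories.push(partial.category);
});

exampleService.receive({ category: "globalcolor", ids: ["s_1", "s_2"] });
exampleService.receive({ category: "edge", ids: ["s_2"] });
unsubscribe();
exampleService.receive({ category: "motion", ids: ["s_3"] }); // no longer observed
```

The benefit of the pattern is that the service does not need to know which views exist: any number of components can subscribe, and each one stops receiving updates as soon as it unsubscribes.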
The query container system described previously is modular by design and new types of query terms can be added by creating the respective model entities and UI components. If you want to support new modalities, however, then adding query terms to Vitrivr NG is only the first step in the process. Obviously, Cineast must be extended as well in order to support these modalities.