Overview

The features and benefits of Kowari™ are outlined in the following sections.

Note - User and role security and JAAS support are available only in the commercial version of Kowari, the Tucana Knowledge Server™.

In This Section

General

Performance and Scalability

Reliability

Connectivity

Manageability

Cross OS/Platform Support

Scalability

Extensibility

General

  • Native RDF support
  • Multiple databases (models) per server
  • Simple SQL-like query language, the interactive Tucana Query Language, with W3C SPARQL support intended for a future release
  • Small footprint
  • Full text search functionality
  • Datatype support
  • Supports and tracks W3C Specifications and guidelines

Performance and Scalability

  • Large storage capacity
  • Optimized for metadata storage and retrieval
  • Multi-processor support
  • Independently tuned for both 64-bit and 32-bit architectures
  • Low memory requirements
  • On-disk joins
  • Streamed query results

More information is available in the Scalability section below.

Reliability

  • Full transaction support
  • Clustering and store level fail-over
  • Permanent integrity

Connectivity

  • JRDF
  • SOAP
  • Software Developers Kit (SDK)

Manageability

  • Near zero administration
  • Web based configuration and monitoring tools

Cross OS/Platform Support

  • Microsoft® Windows NT®, Windows® 2000 and XP
  • UNIX® and Linux®
  • Solaris™
  • Mac OS® X
  • IRIX®

Scalability

The storage engine of Kowari™ is a transactional triplestore known as the XA Triplestore. Much of the scalability of Kowari is due to the following features of the XA Triplestore.

64-bit Data Structures

All relevant fields of in-memory and on-disk data structures are 64 bits wide, thus ensuring that Kowari™ can store very large amounts of data up to the limits imposed by the host operating system.

Multiple Sessions with no Lock Contention

A single writing session and multiple reading sessions can access the triplestore concurrently, and reading sessions do not need to hold a global lock while processing a query. This avoids lock contention during query processing. In general, each session executes in its own thread. Because readers do not contend for locks, the maximum number of active reading sessions is limited only by the concurrency of the host operating system and I/O subsystem.

When a session initiates a query, which may involve multiple requests to the triplestore, it first takes a snapshot of the entire database. This ensures that all requests to the triplestore during the processing of the query see the database in a consistent state.

The triplestore is designed such that obtaining a snapshot is a very quick operation and does not cause any I/O to be performed. It should take less than a millisecond on current hardware, regardless of the size of the database.

The session must hold a global lock only during this brief period while it obtains the snapshot. Once the snapshot is obtained, no further locking is required regardless of the number of triplestore operations that must be performed or the amount of time required to execute the query.

The existence of a snapshot does not by itself cause any additional storage to be consumed but it will cause any modifications to use copy-on-write semantics. The on-disk data structures of the triplestore are designed to minimize the amount of copying required to perform a modification thus improving performance while also maximizing the amount of storage shared between snapshots.

A snapshot is released once the query processing is complete. Any disk storage used by the snapshot and not shared with any other snapshot is immediately available for reuse. Releasing a snapshot is just as quick as obtaining one, and the session does not even need to hold the global lock during this operation.

A separate global lock (the write lock) is used to ensure that there is only one writing session at any given time. The write lock is released after the writer either commits or rolls back the current transaction.
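The locking scheme described above can be sketched in a few lines of Java. This is a minimal in-memory model under stated assumptions, not Kowari's implementation: the class and method names (SnapshotStoreSketch, snapshot, beginWrite, WriteTransaction) are hypothetical, triples are represented as plain strings, and copy-on-write is simplified to copying the whole set rather than only the modified on-disk blocks.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Semaphore;

/** Hypothetical in-memory model of snapshot reads with a single writer. */
final class SnapshotStoreSketch {
    // Latest committed state. The set is never modified after publication,
    // so a reader can keep using it without holding any lock.
    private Set<String> committed = Collections.emptySet();

    // Global lock, held only for the instant it takes to copy a reference.
    private final Object snapshotLock = new Object();

    // Separate write lock: one permit, so at most one writing session exists.
    private final Semaphore writeLock = new Semaphore(1);

    /** Obtains a snapshot: constant time, no I/O, independent of store size. */
    Set<String> snapshot() {
        synchronized (snapshotLock) {
            return committed;
        }
    }

    /** Blocks until no other writer is active, then starts a transaction. */
    WriteTransaction beginWrite() {
        writeLock.acquireUninterruptibly();
        return new WriteTransaction(snapshot());
    }

    final class WriteTransaction {
        private final Set<String> working;
        private boolean open = true;

        private WriteTransaction(Set<String> base) {
            // Copy-on-write, simplified: this copies everything, whereas the
            // text describes copying only what a modification touches.
            working = new HashSet<>(base);
        }

        void add(String triple) {
            checkOpen();
            working.add(triple);
        }

        /** Publishes the new state, then releases the write lock. */
        void commit() {
            checkOpen();
            synchronized (snapshotLock) {
                committed = Collections.unmodifiableSet(working);
            }
            finish();
        }

        /** Discards the working copy, then releases the write lock. */
        void rollback() {
            checkOpen();
            finish();
        }

        private void checkOpen() {
            if (!open) {
                throw new IllegalStateException("transaction already finished");
            }
        }

        private void finish() {
            open = false;
            writeLock.release();
        }
    }
}
```

A reader that called snapshot() before a commit continues to see the old state for as long as it keeps the reference; in this sketch, releasing a snapshot amounts to dropping that reference.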

On-Line Backups

The XA Triplestore allows modifications and queries to proceed concurrently with a backup operation. The session performing the backup acquires a snapshot of the entire database, just as it would if it were performing a query.

Permanent Integrity

System crashes caused by power failures and some types of hardware fault will not cause data corruption.

The on-disk data structures of the triplestore are designed to be kept in a consistent state at all times while minimizing the overhead required to achieve this. Disk writes during a write transaction are unordered thus preserving good write performance. Write ordering is imposed only during a commit operation.
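One common way to get this behaviour is to write new data to unused blocks in any order and to impose ordering only at commit: force the data out, then switch a root pointer, then force again. The sketch below illustrates that idea with the standard FileChannel API. The file layout (a 64-bit root pointer at offset 0 followed by 8-byte data blocks) and the names CommitOrderSketch, commit and readCommitted are assumptions made for illustration, not Kowari's actual on-disk format, and the sketch assumes the 8-byte root pointer write is not torn by a crash.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of commit-time write ordering. */
final class CommitOrderSketch {
    private static final int WORD = 8; // bytes per 64-bit value

    /** Positioned write of one 64-bit value; no ordering is implied. */
    static void putLong(FileChannel ch, long offset, long value) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(WORD);
        buf.putLong(value);
        buf.flip();
        while (buf.hasRemaining()) {
            ch.write(buf, offset + buf.position());
        }
    }

    /** Positioned read of one 64-bit value. */
    static long getLong(FileChannel ch, long offset) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(WORD);
        while (buf.hasRemaining()) {
            if (ch.read(buf, offset + buf.position()) < 0) {
                throw new IOException("unexpected end of file");
            }
        }
        buf.flip();
        return buf.getLong();
    }

    /** The committed value: follow the root pointer stored at offset 0. */
    static long readCommitted(FileChannel ch) throws IOException {
        return getLong(ch, getLong(ch, 0));
    }

    /** Commit: the only place where write ordering is imposed. */
    static void commit(FileChannel ch, long blockOffset) throws IOException {
        ch.force(false);             // data blocks reach storage first
        putLong(ch, 0, blockOffset); // then the root pointer is switched
        ch.force(false);             // and the new root is made durable
    }

    /** Returns the committed value seen before and after the second commit. */
    static long[] demo() {
        try {
            Path file = Files.createTempFile("commit-order", ".dat");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                putLong(ch, 8, 111L);   // first transaction writes block at 8
                commit(ch, 8);
                putLong(ch, 16, 222L);  // second transaction, not yet committed
                long beforeCommit = readCommitted(ch);
                commit(ch, 16);
                long afterCommit = readCommitted(ch);
                return new long[] {beforeCommit, afterCommit};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the uncommitted block is written to a different location from the committed one, a crash before the root pointer is switched leaves the previous state intact.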

Use of Java NIO

The XA Triplestore uses the Java™ NIO (new I/O) API which was introduced in Java 2 SDK Version 1.4. The NIO API provides access to advanced I/O facilities which were previously only available to native C programs. The use of NIO allows the XA Triplestore to provide transactions, permanent integrity and good performance while still remaining a pure Java implementation.

Some of the features of NIO that are used by the triplestore include:

  • Positioned reads and writes

    NIO file channels allow multiple threads to concurrently read and write different parts of the same file without having to use thread synchronization to protect the current file position.

  • Forcing out dirty buffers to physical storage

    The NIO force operation can be used to ensure that all written data has been forced out to physical storage and can also be used to impose write ordering. This is an essential feature for providing permanent integrity and implementing transaction support.

  • Memory mapped file I/O

    The NIO API can be used to map files into virtual memory. Once a file has been mapped, its content is accessed through an NIO ByteBuffer as if it had been loaded into memory. This form of I/O can be much more efficient than I/O that uses explicit read and write calls because it uses the virtual memory paging hardware to eliminate some system call overhead and the overhead of copying data between the operating system buffer cache and the application's buffers.

    On 32-bit platforms the amount of virtual memory that is available for mapping files is usually limited to less than 2 GB. As this would impose a restriction on the maximum size of database that can be used by Kowari on 32-bit platforms, the XA Triplestore has an I/O abstraction layer that allows the file I/O mechanism for accessing a file to be selected when the file is opened.

    Kowari can be started in one of three modes: all files mapped, index files mapped and no files mapped. Each successive mode allows Kowari to use larger databases, at some cost in performance. By trading off database size for performance in this way it is possible to use databases of any size on 32-bit platforms while still retaining maximum performance for smaller databases.
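An I/O abstraction layer of the kind described above might look like the following sketch, in which the caller chooses between memory-mapped and explicit positioned I/O when a file is opened. The names BlockIoSketch, IoMode and LongFile are hypothetical and do not come from Kowari. The calls used (FileChannel.map, and the positioned FileChannel.read and FileChannel.write) are the standard NIO API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical I/O abstraction; names are illustrative only. */
final class BlockIoSketch {
    enum IoMode { MAPPED, EXPLICIT }

    /** 64-bit values at byte offsets, independent of the I/O mechanism. */
    interface LongFile {
        long getLong(long offset) throws IOException;
        void putLong(long offset, long value) throws IOException;
    }

    /** Selects the I/O mechanism at the time the file is opened. */
    static LongFile open(FileChannel ch, IoMode mode, long size) throws IOException {
        if (mode == IoMode.MAPPED) {
            // A single mapping is indexed by int, so it cannot exceed 2 GB.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            return new LongFile() {
                public long getLong(long offset) {
                    return map.getLong(Math.toIntExact(offset));
                }
                public void putLong(long offset, long value) {
                    map.putLong(Math.toIntExact(offset), value);
                }
            };
        }
        // Explicit positioned I/O: no shared file position to synchronize on.
        return new LongFile() {
            public long getLong(long offset) throws IOException {
                ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
                while (buf.hasRemaining()) {
                    if (ch.read(buf, offset + buf.position()) < 0) {
                        throw new IOException("read past end of file");
                    }
                }
                buf.flip();
                return buf.getLong();
            }
            public void putLong(long offset, long value) throws IOException {
                ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
                buf.putLong(value);
                buf.flip();
                while (buf.hasRemaining()) {
                    ch.write(buf, offset + buf.position());
                }
            }
        };
    }

    /** Writes a value and reads it back through the chosen mechanism. */
    static long roundTrip(IoMode mode, long offset, long value) {
        try {
            Path file = Files.createTempFile("block-io", ".dat");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                LongFile f = open(ch, mode, 64);
                f.putLong(offset, value);
                return f.getLong(offset);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Code written against the LongFile interface does not need to know which mechanism is in use, which is what makes it possible to pick a mode per file at startup.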

Extensibility

Kowari may be extended programmatically, via the Resolver SPI, to treat external data stores as if they were RDF graphs. GIS systems, databases, file systems and Web resources may all be queried alongside native RDF metadata. More information on creating Resolvers may be found in the documentation.