RizeDb - Document Oriented

Document-Oriented

RizeDb has a document-oriented implementation that allows for the storing, indexing and retrieving of documents. Documents are not stored in tables and have very little structure imposed on them.

RizeDb's document store requires no schema, and document values can change types without requiring the database to update all existing data. By default, nearly all document data is indexed so that finding data is fast and easy.

RizeDb Primer

The first step to creating a document store is simply to instantiate the DocumentStore object with a stream.

using(var stream = new MemoryStream())
{
   using(var documentStore = new RizeDb.DocumentStore(stream))
   {
      //Code goes here
   }
}

Once your document store is created, create POCO classes.

public class Order
{
    public long Id { get; set; } //This is the only required field
    public string Number { get; set; }
    public DateTime Date { get; set; }
    public List<OrderItem> OrderItems { get; set; }
}

public class OrderItem
{
    public string ItemName { get; set; }
    public decimal Price { get; set; }
    public int Quantity { get; set; }
}

Now that your POCO classes are created, instantiate and fill an Order object with data.

var order = new Order()
{
    Number = "1001",
    Date = DateTime.Now,
    OrderItems = new List<OrderItem>()
};

order.OrderItems.Add(new OrderItem()
{
    ItemName = "Sun Glasses",
    Price = 19.99m, //The m suffix makes this a decimal literal
    Quantity = 1
});

order.OrderItems.Add(new OrderItem()
{
    ItemName = "Flashlight",
    Price = 10.50m,
    Quantity = 4
});

Once your order is ready to be stored, simply add it to a document collection.

documentStore.Store("Orders", order);

This will store the order object and its items as one document in a collection named "Orders". If the collection "Orders" does not already exist, it will be created. Now that the order document has been stored in a collection, a unique value will have been assigned to the Id property of order. The Id value is what will be used to retrieve, update or delete the order in the future.

Retrieving your document is as simple as requesting it by Id.

var order = documentStore.Retreive<Order>("Orders", 1 /*Assuming the document Id is 1*/);

This call will create an Order object and its OrderItem objects and populate them with the same data that was stored.

You can also look up documents by values.

var orders = documentStore.Retreive<Order>("Orders", o => o.Number == "1001");

This call to the Retrieve method will create an IEnumerable object containing all Order objects with the Number "1001".

You can also search by child object values.

var orders = documentStore.Retreive<Order>("Orders", o => o.OrderItems.Any(i => i.ItemName == "Flashlight"));

Again, you will get an IEnumerable object containing all orders that have an order item with the ItemName of "Flashlight".

By default, all values except for byte arrays are indexed so lookups are fast and flexible.

You can even use a different object to retrieve data.

public class OrderHeader
{
    public long Id { get; set; }
    public string Number { get; set; }
}

var orderHeader = documentStore.Retreive<OrderHeader>("Orders", 1 /*Assuming the document Id is 1*/);

This call will create and return an OrderHeader object with the Number property set to "1001".

Finally, you can change the type of a property, and the values will still be set as long as the types are compatible.

public class OrderHeader2
{
    public long Id { get; set; }
    public int Number { get; set; } //<-- This was changed to an Int32
}

var orderHeader2 = documentStore.Retreive<OrderHeader2>("Orders", 1 /*Assuming the document Id is 1*/);

Now the newly created OrderHeader2 will have an integer value of 1001 for the Number property. Changing types will work for most types, but some are not compatible, and those properties will be either null or the default value when the document is retrieved.
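
To make "compatible" concrete, the sketch below models the string-to-int case with standard .NET parsing. It only mirrors the behavior described above and is not RizeDb's conversion code; `Conversion.ConvertOrDefault` is a hypothetical helper introduced for illustration.

```csharp
using System;
using System.Globalization;

Console.WriteLine(Conversion.ConvertOrDefault("1001")); // compatible value: 1001
Console.WriteLine(Conversion.ConvertOrDefault("A-17")); // incompatible value: default(int), which is 0

public static class Conversion
{
    // Hypothetical helper: reads a stored string as an int and
    // falls back to the default value when the text is not a number.
    public static int ConvertOrDefault(string stored)
    {
        return int.TryParse(stored, NumberStyles.Integer, CultureInfo.InvariantCulture, out var value)
            ? value
            : default;
    }
}
```

An order number such as "1001" survives the change to int, while a value such as "A-17" would come back as 0.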

Operations

Storing

One of the many downsides of relational databases is the work it takes to create and alter tables. While relational databases provide many benefits, developers often don't need those benefits, and the extra work becomes a barrier to the rest of the development process. The Document Store in RizeDb was designed to make storing, retrieving, updating and deleting records as easy as possible. In fact, there is only one requirement: every document must have a property named Id that is a long. The Id field is the primary key and is required to be present when adding, updating or removing a document.

As seen in the RizeDb Primer section, adding a document is as easy as:

class Customer
{
    public long Id { get; set; }
    public string Name { get; set; }
}

var customer = new Customer() { Name = "John Smith" };

using(var stream = new MemoryStream())
{
   using(var documentStore = new RizeDb.DocumentStore(stream))
   {
      documentStore.Store("Customers", customer);
   }
}

In the above code example, the customer was stored in the document store and a unique Int64 value was assigned to the Id property. It is important to note that if the Id property is not 0 when a document is stored, RizeDb will first try to find an existing document with that Id and replace it with the new document. If no such document exists, the new document will be added without generating a new Id value.
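
The Id rule can be modeled with a small in-memory sketch. This is not RizeDb's implementation: `InMemoryCollection`, its `Store` and `Get` methods, and the way new Ids are chosen are hypothetical, and exist only to show when an Id is generated and when a document is replaced.

```csharp
using System;
using System.Collections.Generic;

var collection = new InMemoryCollection();

var first = new Customer { Name = "John Smith" };   // Id is 0, so a new Id is generated
collection.Store(first);

var replacement = new Customer { Id = first.Id, Name = "John Doe" }; // Id exists, so the document is replaced
collection.Store(replacement);

var preset = new Customer { Id = 42, Name = "Jane Roe" }; // unused non-zero Id, so it is kept as given
collection.Store(preset);

Console.WriteLine($"{first.Id}, {collection.Count}, {collection.Get(first.Id).Name}"); // prints "1, 2, John Doe"

public class Customer
{
    public long Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical stand-in for a document collection; it illustrates the Id rule only.
public class InMemoryCollection
{
    private readonly Dictionary<long, Customer> _documents = new Dictionary<long, Customer>();
    private long _lastId;

    public int Count => _documents.Count;

    public Customer Get(long id) => _documents[id];

    public void Store(Customer document)
    {
        if (document.Id == 0)
        {
            // No Id yet: pick the next unused one.
            do { _lastId++; } while (_documents.ContainsKey(_lastId));
            document.Id = _lastId;
        }

        // Non-zero Id: replace the existing document, or add it under the given Id.
        _documents[document.Id] = document;
    }
}
```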

Retrieving

Nearly all document data is indexed by default; the only exception is byte arrays. This makes searching and retrieving documents easy and flexible.

The simplest way to retrieve a document is with the Id.

class Customer
{
    public long Id { get; set; }
    public string Name { get; set; }
}

using(var stream = new MemoryStream())
{
   using(var documentStore = new RizeDb.DocumentStore(stream))
   {
      var customer = documentStore.Retreive<Customer>("Customers", 1);
   }
}

The code above will look for the document with the Id of 1 and, if found, will create a Customer object and populate any matching property names with the stored data. Retrieving by Id returns just one document, if a document is found.

A lambda predicate can also be used to retrieve documents.

class Customer
{
    public long Id { get; set; }
    public string Name { get; set; }
}

using(var stream = new MemoryStream())
{
   using(var documentStore = new RizeDb.DocumentStore(stream))
   {
      var customers = documentStore.Retreive<Customer>("Customers", c => c.Name == "John Smith");
   }
}

The code above will look for any document in the Customers collection that has a Name field with the value "John Smith". Since the system cannot be sure that only one document matches the predicate, an IEnumerable of Customer containing all matching documents is returned. There are some scenarios where properties cannot be searched on, or where a search may not find matching documents even though they exist. These scenarios are always a result of index changes. Please see Index Optimizing for more details.

Updating

An update essentially overwrites the existing document. If you create a new document and assign it an Id that already exists in the collection, the previous document will be removed and the new document will be added. This means that all existing data will be lost unless it is included in the new document.

In the following example, the current customer will be loaded and the name changed.

class Customer
{
    public long Id { get; set; }
    public string Name { get; set; }
}

using(var stream = new MemoryStream())
{
   using(var documentStore = new RizeDb.DocumentStore(stream))
   {
      var customer = documentStore.Retreive<Customer>("Customers", 1);
      customer.Name = "John Doe";
      documentStore.Store("Customers", customer);
   }
}

Removing

Removing a document is very straightforward. All you need is the Id of the document.

using(var stream = new MemoryStream())
{
   using(var documentStore = new RizeDb.DocumentStore(stream))
   {
      var customer = documentStore.Remove("Customers", 1);
   }
}

The document with the Id of 1 will no longer exist in the Customers collection.

Transaction Group Operations

The Transaction Group ensures that either all operations in the group are written to the document store or none of them are. For example, suppose you create a new customer and that customer has a new order. You would not want the customer stored without the order, or the order stored without the customer.

When using a transaction group, if there is some sort of failure and the order or customer is not fully committed to the document store, both operations will be rolled back as if they never happened.

The following example shows how easy it is to use a Transaction Group.

var transactionGroup = documentStore.CreateTransactionGroup()
    .Store("Customer", customer)
    .Store("Order", order)
    .Commit();

Additionally, you can supply an action that gets called after each operation. This allows you to pass information, such as the Id of a parent document, to child documents.

var transactionGroup = documentStore.CreateTransactionGroup()
    .Store("Customer", customer, c => order.CustomerId = c.Id)
    .Store("Order", order)
    .Commit();

It is important to remember to call the Commit method; without it, your documents will not be stored.
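
The all-or-nothing behavior, and the need to call Commit, can be modeled with a short in-memory sketch. `OperationGroup` is a hypothetical class that works on a plain dictionary; it illustrates the pattern under those assumptions and says nothing about how RizeDb implements transaction groups internally.

```csharp
using System;
using System.Collections.Generic;

var data = new Dictionary<string, string>();

// Both operations succeed, so both become visible together.
new OperationGroup(data)
    .Store("customer:1", "John Smith")
    .Store("order:1", "1001")
    .Commit();
Console.WriteLine(data.Count); // 2

// The second operation fails during Commit, so neither change is kept.
try
{
    new OperationGroup(data)
        .Store("customer:2", "Jane Roe")
        .Store("order:2", null)
        .Commit();
}
catch (ArgumentNullException)
{
    // Expected: nothing from this group was written.
}
Console.WriteLine(data.Count); // still 2

// A group that is never committed changes nothing.
new OperationGroup(data).Store("customer:3", "Sam Poe");
Console.WriteLine(data.Count); // still 2

public class OperationGroup
{
    private readonly Dictionary<string, string> _target;
    private readonly List<KeyValuePair<string, string>> _pending = new List<KeyValuePair<string, string>>();

    public OperationGroup(Dictionary<string, string> target)
    {
        _target = target;
    }

    // Queues an operation; nothing is written until Commit is called.
    public OperationGroup Store(string key, string value)
    {
        _pending.Add(new KeyValuePair<string, string>(key, value));
        return this;
    }

    public void Commit()
    {
        // Apply every queued operation to a copy first.
        var working = new Dictionary<string, string>(_target);
        foreach (var operation in _pending)
        {
            if (operation.Value == null) throw new ArgumentNullException(nameof(operation));
            working[operation.Key] = operation.Value;
        }

        // Only reached when every operation succeeded: publish the changes.
        foreach (var pair in working) _target[pair.Key] = pair.Value;
    }
}
```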

Index Optimizing

When a collection is created, indexes for each primitive type are created as well. Ints, longs, bytes, etc. all have their own indexes, as do strings, Guids, DateTimes, etc. No space is allocated for an index until the first time a value is inserted into it. So if your document does not have any properties of type short, you don't have to worry that the collection will needlessly be allocated space for an index of type short.

If you have a property on your document but never plan on searching by that property, you can use the IndexAttribute to tell the document store not to index it.

class Log
{
    public long Id { get; set; } 
    public DateTime Date { get; set; }
    public string Source { get; set; }

    [Index(IndexSearchOptions.None)]
    public string Message { get; set; }
}

In the above example, all properties will be indexed except for the Message property. Since string properties are slow to index, especially when they are large, removing indexing from the Message property is a good idea.

You can also tell the document store which indexes it should search when looking for a document. Since document stores are schemaless, there is no way for the document store to know which kind of index your properties are stored in. This requires the document store to look in all of the indexes. For example, the Source property used in the previous Log example could hold a numeric value.

var logs = documentStore.Retreive<Log>("Logs", c => c.Source == "23");

Since 23 could be a string, integer, decimal, byte, etc., the system will look in all indexes that support numbers. This causes a lot of overhead and increases lookup time.

To optimize the search you can change your POCO class to look like the code below.

class Log
{
    public long Id { get; set; } 
    public DateTime Date { get; set; }

    [Index(IndexSearchOptions.String)]
    public string Source { get; set; }

    [Index(IndexSearchOptions.None)]
    public string Message { get; set; }
}

This code will limit the search to just the String index. Now suppose you change your mind about the type of the Source property and change it to an int instead. All future logs will have the source indexed in the int index, while your old values remain in the string index. If your POCO class is still limited to searching only the string index, you will not get back any of your new documents. To solve that, you can tell the system to look in multiple indexes by changing your POCO class to match the one below.

class Log
{
    public long Id { get; set; } 
    public DateTime Date { get; set; }

    [Index(IndexSearchOptions.String)]
    [Index(IndexSearchOptions.Int)]
    public int Source { get; set; }

    [Index(IndexSearchOptions.None)]
    public string Message { get; set; }
}

Now any searches done on the Source property will look in both the String and Int indexes.

As a side note, threading is used when searching through multiple indexes for values. This helps keep searching fast, but optimizing your indexes can still have significant benefits for the performance of the system.

Dropping Collections

There may come a time when you no longer need the data in a collection. You can drop that collection and all of its indexes. When a collection is dropped, its file space is flagged for reuse. This allows other existing or new collections to use that file space in the future.

To drop a collection, follow the example below.

database.DropCollection("Logs");

Encryption

RizeDb supports AES encryption. When a database is encrypted, all data, including the logs, is encrypted so that no part of your database is exposed. However, if the password that unlocks the database is ever lost, you will not be able to recover your data. To use encryption, the database must be created with it, and once created, the password cannot be changed or removed.

Future versions of RizeDb will allow for changing and removing of passwords but for now, it is a one-way trip.

An example of creating an encrypted data store can be seen below.

using (var database = new RizeDb.DocumentStore(stream, "Password"))
{
    //Do operations here
}

Use the same code when opening the database in the future.

Logging

Logging is an important part of any database. Logging is what guarantees that data is written correctly and prevents data corruption. RizeDb keeps its logs in the same file as the data. This makes the file more portable without risking corruption, but it also means that the file can get large. Each time a document is stored, updated or removed, a record of that operation is written to the log first. Once the data has been written to the log, it is then written to the database. If an outage of some sort were to happen while writing to the log, the database would remain intact with no changes made. If the data were written to the log successfully and a crash or outage then took place, the system would see that the data was written to the log and would finish updating the database.
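
The log-first ordering described above is the general write-ahead logging pattern. The sketch below models it in memory; `LoggedStore`, its members, and the `crashAfterLog` flag are hypothetical and only illustrate the order of operations, not RizeDb's file format or recovery code.

```csharp
using System;
using System.Collections.Generic;

var store = new LoggedStore();
store.Write(1, "John Smith");                    // logged, then applied
store.Write(2, "Jane Roe", crashAfterLog: true); // logged, but the simulated outage skips the data write
Console.WriteLine(store.Data.ContainsKey(2));    // False: the data is missing document 2
store.Recover();                                 // replays the pending log entry
Console.WriteLine(store.Data[2]);                // Jane Roe

public class LoggedStore
{
    // Pending log entries: changes recorded but not yet written to the data.
    private readonly List<KeyValuePair<long, string>> _log = new List<KeyValuePair<long, string>>();

    public Dictionary<long, string> Data { get; } = new Dictionary<long, string>();

    public void Write(long id, string value, bool crashAfterLog = false)
    {
        _log.Add(new KeyValuePair<long, string>(id, value)); // step 1: record the change in the log
        if (crashAfterLog) return;                            // simulated outage between the two steps
        Apply();                                              // step 2: write the change to the data
    }

    // On startup, finish any change that reached the log but not the data.
    public void Recover() => Apply();

    private void Apply()
    {
        foreach (var entry in _log) Data[entry.Key] = entry.Value;
        _log.Clear();
    }
}
```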

In RizeDb, logs can grow but they don't shrink. So if you are storing large documents or files, your database file may be bigger than expected. That is because the data is first written to the log, which allocates as much space as needed to hold it, and is then stored in the database. So if you store a 2MB file in the database, the database file will end up being a little more than 4MB in size. Log space is reused, but the log will always be as large as the largest transaction written. So storing two 2MB files in the database will yield a database file of about 6MB, and three 2MB files will yield about 8MB.
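
The arithmetic in these examples can be written as a rough estimate: the total size of the stored documents plus one log area as large as the largest single write. `FileSize.EstimateMb` is a hypothetical helper that assumes each file is stored in its own transaction and ignores index and bookkeeping overhead, which is why real files come out "a little more".

```csharp
using System;
using System.Linq;

Console.WriteLine(FileSize.EstimateMb(2));       // 4
Console.WriteLine(FileSize.EstimateMb(2, 2));    // 6
Console.WriteLine(FileSize.EstimateMb(2, 2, 2)); // 8

public static class FileSize
{
    // Stored documents plus a log sized to the largest single write.
    public static int EstimateMb(params int[] documentSizesMb)
    {
        if (documentSizesMb.Length == 0) return 0;
        return documentSizesMb.Sum() + documentSizesMb.Max();
    }
}
```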

Future versions of RizeDb will allow log files to be reused in other parts of the database if they are not frequently used.

Settings

Every application needs somewhere to store simple data: perhaps login credentials for a third-party app, a version number of your app, or a timestamp of the last time data was synchronized. Whatever it is, it does not make sense to create a custom file or a full collection in the database to store that data. The Document Store in RizeDb has a feature called Settings that allows you to store whatever objects you want through a key-value interface. If you are using encryption, the data stored in Settings will be encrypted as well.

Using Settings is simple.

using (var database = new RizeDb.DocumentStore(stream))
{
    var settings = new SettingsClass() { Value = "Hey Yall" };
    database.SetSetting("MySetting", settings);
}

Retrieving a setting is just as simple.

using (var database = new RizeDb.DocumentStore(stream))
{
    var restoredSetting = database.GetSetting<SettingsClass>("MySetting");
}

The Future

The document store of RizeDb is already very powerful and feature-rich. Going forward, we will keep adding features based on its popularity, user feedback and our own wants.

If you would like to report a bug or recommend a feature, please email vizeotech@outlook.com.