JSORM.db.index

From Jsorm

Jump to: navigation, search

Overview

An index is responsible for implementing a faster (general constant-time) lookup within a database table, in the case of jsormdb, a JSORM.db.engine implementation. The purpose of an index is to provide a constant-time or near-constant-time lookup for indexed fields and specific searches.

Normally, to lookup values in a table requires scanning the entire table, known as a full-table scan. For example, if there are 1,000 records in a table, and the query wants to find all of those where the name field is equal to "John", then the database must look at each and every record, in order, i.e. "scan the full table", in order to find all of the records where the name field is equal to "John". When you index the name field, you provide a lookup table, or similar structure, such that you can go from "John" directly to the record(s) you need.

Of course, there is no such thing as a free lunch (most politicians' assertions notwithstanding). The price you pay for indexing a field is that every activity that changes that field - update, insert or delete - must also update the index, thus taking slightly more time. The index's underlying format can be anything it chooses, some of the most common structures being:

  • hash
  • B-tree

Check out the Wikipedia entry for more information and references.

jsormdb ships with one index, a hash JSORM.db.index.hash.

Signature

An index is expected to accept a single parameter passed to it at instantiation, the initial fields to index. This initial set of fields can be a single string naming the field, or an array of fields.

The actual constructor should be a function that does not require the "new" keyword, and thus should be named in lower-case.

An index is expected to have the following method signature:

  • fields: Add one or more fields to be indexed. A single argument, either a string name of a field, or an array of field names. If a field is already indexed, do nothing with the call.
  • unfields: Remove one or more fields from indexing. A single argument, either a string name of a field, or an array of field names. If a field is not indexed, do nothing with the call.
  • add: Add one or more records to the index. Expects the following arguments:
    • index: A single index or an array of index locations, internal to the storage engine, where these records are stored.
    • records: A single record or an array of the actual data of the records added to the storage engine. The length of records must match the length of index.
  • remove: Remove a single record from the index. Expects the following arguments:
    • index: The index location, internal to the storage engine, where the record to be removed is stored.
    • record: The actual record being removed.
  • clear: Clear all indexed data, but keep the actual field names indexed. If "name" is indexed and "John" is known at records 225 and 350, then after clear(), "name" is still indexed, but "John" is not known to be anywhere.
  • find: Find all records matching a certain query. Accepts one argument, "where", which is a search term. See jsormdb#Querying_data. The index is expected to determine if it can do a lookup. Should return one of three values:
    • null: If the index cannot lookup this query, e.g. the field is not indexed, it is not a query primitive, the compares is not valid for this type of index.
    • empty array: If the index can lookup this query, but no records actually matched.
    • filled array: If the index can lookup this query, it should return an array of the indexes, internal to the storage engine, where the matching records can be found.
  • update: Change data in a single record, which may require changing indexes. For example, if the indexed field is "name", and "John" points to [1,5,88] while "Jill" points to [2,65,100] and then record 65 is changed from "Jill" to "John", the index needs to update the "John" index to [1,5,65,88] while "Jill" needs to be updated to [2,100]. Expects the following arguments:
    • old: Object with the data as it was before the change. Only changed fields should be in this object.
    • newdata: Object with the data as it is after the change. Only changed fields should be in this object.
    • index: Index of the changed record.


Internal Representation

The index is used only by the storage engine. Thus, the data entered into the index and retrieved from the index are internal references useful only to the storage engine that manages the index. An index instance should never be shared across multiple storage engines, and should never be accessed outside of the managing storage engine.

Personal tools