Last Updated: 9/25/2001
A_Index
This Ruby class provides index support for ArunaDB. ArunaDB is a database server written in Ruby. Indexes allow you to easily iterate over sub-sets of the data in tables. You can create indexes using one or more columns in a table and then iterate over the data in the table by using the index.
Don't create indexes on the primary key of a table. This is redundant and time consuming. Indexes are useful when, for example, you have a membership table with the member id as the primary key and you want to find all membership records with a last name of 'Davis'. Create an index on last_name and iterate over that index looking for rows with last_name = 'Davis'. Indexes use the primary key of the table for looking up data. As a result, iterating over all rows in a table using an index is slower than iterating over the rows in the table directly. As a rule of thumb, only use an index if you can pare down the data (using the min_key and max_key parameters) so that you iterate over only a small subset of the data in the table.
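To illustrate why an index adds a primary-key lookup per row, and why narrowing the range matters, here is a minimal sketch in plain Ruby. It does not use ArunaDB: a Hash stands in for the table, a sorted Array stands in for the index btree, and the scan method and sample rows are hypothetical.

```ruby
# Toy model only: plain Ruby structures stand in for ArunaDB's btrees.
# The table maps primary key => row.
table = {
  101 => { last_name: 'Davis', first_name: 'Michael' },
  102 => { last_name: 'Smith', first_name: 'Bob' },
  103 => { last_name: 'Davis', first_name: 'Ann' },
  104 => { last_name: 'Adams', first_name: 'Zoe' }
}

# The index holds [last_name, primary_key] pairs sorted by last_name.
index = table.map { |pkey, row| [row[:last_name], pkey] }.sort

# Range scan: visit only entries with min <= key <= max, then fetch each
# row from the table by primary key (the extra lookup per row).
def scan(index, table, min, max)
  start = index.bsearch_index { |key, _| key >= min } || index.size
  rows = []
  index[start..-1].each do |key, pkey|
    break if key > max
    rows << table[pkey]
  end
  rows
end

davis_rows = scan(index, table, 'Davis', 'Davis')
davis_rows.each { |row| puts "#{row[:last_name]}, #{row[:first_name]}" }
```

With a narrow range the scan touches two index entries and does two table lookups. With no bounds it would touch every entry and still pay one table lookup per row, which is why a full iteration through an index is slower than iterating the table itself.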
Each index uses two btrees: one holds the data (the index btree) and the other provides transaction support (the lock btree). When transactions are enabled (this is the default), all inserts, updates, and deletes are written to the lock btree. When you call commit, the inserts, updates, and deletes are taken out of the lock btree and applied to the index btree. Each index is opened by the table it is associated with so the table can automatically update its indexes. When you connect to an index, you are actually connecting to the same index used by the A_Table class to maintain the index. In other words, the table (the A_Table_Shared class) caches all indexes in memory.
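The staging idea behind the two btrees can be sketched in a few lines of plain Ruby. This is a toy model under stated assumptions, not ArunaDB's implementation: ToyIndex is a hypothetical class, a Hash stands in for the index btree, and an Array of pending operations stands in for the lock btree.

```ruby
# Toy model only: a Hash and an Array stand in for the two btrees.
class ToyIndex
  attr_reader :index_tree, :lock_tree

  def initialize
    @index_tree = {}   # committed entries: key => primary key
    @lock_tree  = []   # pending operations, in arrival order
  end

  # Writes are only recorded as pending; committed data is untouched.
  def insert(key, pkey)
    @lock_tree << [:insert, key, pkey]
  end

  def delete(key)
    @lock_tree << [:delete, key, nil]
  end

  # Commit drains the pending operations into the committed tree.
  def commit
    @lock_tree.each do |op, key, pkey|
      case op
      when :insert then @index_tree[key] = pkey
      when :delete then @index_tree.delete(key)
      end
    end
    @lock_tree.clear
  end
end

idx = ToyIndex.new
idx.insert('Davis', 101)
idx.insert('Smith', 102)
pending_before_commit = idx.lock_tree.size     # operations staged so far
committed_before_commit = idx.index_tree.size  # nothing committed yet
idx.commit
puts "#{committed_before_commit} committed before, #{idx.index_tree.size} after"
```

Until commit is called, readers of the committed structure see none of the staged changes; after commit, the pending list is empty and the changes are visible.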
#create a table
cols = []
cols.push(A_Column.new('member_id', 'I', true, nil, '%d > 0', nil, '%3d'))
cols.push(A_Column.new('last_name', 'v32', true, nil, "'%s' != ''", "'%s'.capitalize", '%-12s'))
cols.push(A_Column.new('first_name', 'v32', nil, '', nil, "'%s'.capitalize", '%-12s'))
m = A_Table.new('Membership', cols, 'member_id')
# let's put some data in the table
t1 = A_Transaction.new()
m.insert(t1, %w(member_id last_name first_name), [101, 'davis', 'Michael'])
m.insert(t1, %w(member_id last_name first_name), [102, 'smith', 'bob'])
t1.commit()
# create an index on the name
i1 = A_Index.new('name_idx', 'Membership', 'last_name, first_name', '', 'Indexes', 'Indexes', 2, 1)
# print all rows using the index
i1.each(nil, nil, nil){|data| print " #{data}\n"}
print " #{i1.rows_accessed} rows\n"
# print all rows using the index matching 'Davis'
# (the min_key and max_key values must be passed as arrays)
i1.each(nil, ['Davis'], ['Davis']){|data| print " #{data}\n"}
print " #{i1.rows_accessed} rows\n"
i1.close
m.close
A_Index.connect(index_name, table_name)
Connect to an existing index.
A_Index.create(index_name, table_name, column_names, type='', filestore_name='Indexes', lock_filestore='Indexes', node_page_size=1, leaf_page_size=1)
Alias for A_Index.new()
A_Index.drop(index_name, table_name)
Drop an index.
A_Index.exists?(index_name, table_name)
Returns true if this index exists. Otherwise, returns false.
A_Index.new(index_name, table_name, column_names=nil, type='', filestore_name='Indexes', lock_filestore='Indexes', node_page_size=1, leaf_page_size=1)
Creates a new index.
A_Index.open(index_name, table_name)
Alias for A_Index.connect()
begin_update(transaction)
This prepares the A_Table_Data object that has been yielded by an iterator for updating. You must call this before you change any values in the A_Table_Data object and before you call update_row(). This method is only used in conjunction with the update_row() method.
clear()
Alias for truncate().
close()
Close this index and free any used resources. This does not close any references the table may have to this index.
column_count()
Returns a count of the number of columns in the table associated with this index. This is an alias for A_Table.column_count(). To get a count of columns that are indexed, see index_column_count().
column_names()
Returns an array of column names in the table associated with this index. This is an alias for A_Table.column_names(). To get the name of the columns used in this index, see index_key_names().
delete(transaction, pkey)
Delete the row that matches the pkey from the table associated with this index. This is an alias for A_Table.delete(). FYI, this method operates on the data in the table rather than the data in the index.
delete_row(transaction)
Delete a row from the table this index is associated with while you are iterating over rows in this index. This deletes the row you are looking at. This is an alias for A_Table.delete_row(). FYI, this method operates on the data in the table rather than the data in the index. The iterator must also have the update parameter set to true.
drop()
Drop this index.
each(transaction=nil, min=nil, max=nil, update=nil)
Iterate over all rows in the index. This yields the A_Table_Data object for the table associated with this index (not the A_Table_Data object for this index). This allows you to operate on the table using this index. See each_key() to iterate over rows in the index using the A_Table_Data object for the index. The min and max parameters are key values passed as arrays; the btree uses them to narrow down how many pages have to be read to retrieve the values you are looking for. If both values are nil, then all rows in the index are iterated over. To iterate over a range of keys, set min to the smallest key value and max to the largest. Supplying min and max can improve performance considerably because pages outside the range are not read.
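The following plain-Ruby sketch shows one way array-valued min and max bounds could select rows from a multi-column key. It assumes that a bound with fewer columns than the key is compared only against the key's leading columns; that behavior is inferred from the ['Davis'] example in this document, not confirmed from the ArunaDB source, and in_range? is a hypothetical helper.

```ruby
# Sketch only: assumes a bound shorter than the key is compared against the
# key's leading columns. This is an assumption, not documented behavior.
def in_range?(key, min, max)
  above_min = min.nil? || (key.first(min.size) <=> min) >= 0
  below_max = max.nil? || (key.first(max.size) <=> max) <= 0
  above_min && below_max
end

# Keys of a two-column index on (last_name, first_name), in sorted order.
keys = [
  ['Adams', 'Zoe'],
  ['Davis', 'Ann'],
  ['Davis', 'Michael'],
  ['Smith', 'Bob']
]

davis  = keys.select { |k| in_range?(k, ['Davis'], ['Davis']) }
a_to_d = keys.select { |k| in_range?(k, ['Adams'], ['Davis', 'Ann']) }
all    = keys.select { |k| in_range?(k, nil, nil) }

puts davis.inspect
puts a_to_d.inspect
puts all.size
```

The sketch filters every key for simplicity. A btree would instead seek to the first key at or above min and stop at the first key above max, which is where the page-read savings come from.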
each_col()
Yields each A_Column object in the table. This is an alias for A_Table.each_col().
each_key(transaction=nil, min=nil, max=nil, update=nil) # for each primary key, yield(pkey); the pkey may not exist yet in the primary key
Iterate over all rows in the index. This yields the A_Table_Data object for the index. This allows you to retrieve the primary keys for the values in the index. See each() to iterate over rows in the index using the A_Table_Data object for the table.
each_pkey()
Alias for each.
each_row
Alias for each.
each_sorted(transaction, min, max, sort_block=nil)
Iterate over all rows in the index sorted by values other than the key. This uses the A_FileSort class to sort the rows. This yields an A_Table_Data object for the table associated with this index (not the A_Table_Data object for this index). The min and max parameters bound the key range; the btree uses them to narrow down how many pages have to be read to retrieve the values you are looking for. If both values are nil, then all rows in the table are iterated over. To iterate over a range of keys, set min to the smallest key value and max to the largest. Supplying min and max can improve performance considerably because pages outside the range are not read.
each_value
Alias for each.
exists?(transaction, key)
Returns true if key exists in the index. Returns false if key does not exist in the index.
find(transaction, key)
Returns the first occurrence of the A_Table_Data object of the record associated with this key. This is the A_Table_Data object associated with the table and not the index.
find_last(transaction, key)
Returns the last occurrence of the A_Table_Data object of the record associated with this key. This is the A_Table_Data object associated with the table and not the index.
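Because an index key such as a last name can occur more than once, the first and last occurrences can differ. The sketch below shows the distinction on a sorted array of [key, primary_key] pairs in plain Ruby. It is a toy model: the ordering of duplicates by primary key and the helper names find_first and find_last_entry are assumptions, not ArunaDB behavior.

```ruby
# Toy model only: a sorted array of [key, primary_key] pairs stands in for
# the index. Duplicates are assumed to be ordered by primary key.
entries = [['Adams', 104], ['Davis', 101], ['Davis', 103], ['Smith', 102]]

# First entry whose key equals the requested key, or nil.
def find_first(entries, key)
  i = entries.bsearch_index { |k, _| k >= key }
  i && entries[i][0] == key ? entries[i] : nil
end

# Last entry whose key equals the requested key, or nil.
def find_last_entry(entries, key)
  i = entries.bsearch_index { |k, _| k > key } || entries.size
  i > 0 && entries[i - 1][0] == key ? entries[i - 1] : nil
end

first_davis = find_first(entries, 'Davis')
last_davis  = find_last_entry(entries, 'Davis')
puts "first: #{first_davis.inspect}, last: #{last_davis.inspect}"
```

For a key that occurs once, both helpers return the same entry; for a key that is absent, both return nil.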
get_columns(col_names)
Returns an array of columns used by this index.
index_key_names()
Returns an array of column names used in this index.
index_column_count()
Returns the count of column names used in this index.
index_name()
Returns the name of this index.
insert(transaction, columns, values) # insert one record into the table based on the provided column list and values list
This inserts the row (values) into the table associated with this index. This is an alias for A_Table.insert(). FYI, this method operates on the data in the table rather than the data in the index.
insert_row(transaction)
This is an alias for A_Table.insert_row(). See A_Table.insert_row() for details. FYI, this method operates on the data in the table rather than the data in the index.
is_key?(column_name)
Returns true if this column_name is part of the key, otherwise returns false.
key_count()
Returns a count of the number of columns used in the primary key of the table associated with this index. This is an alias for A_Table.key_count(). To get a count of columns that are indexed, see index_column_count().
key_names()
Returns an array of column names used in the primary key of the table associated with this index. This is an alias for A_Table.key_names(). To get an array of column names that are indexed, see index_key_names().
load()
This loads data into the index from the table. This truncates the data in this index before starting. This can be used to repopulate the index at any time. You must call this if you disabled indexes in a transaction and then committed that transaction. This is automatically called when you create an index.
lock()
Grab the lock associated with this index. This lock is shared by all objects that connect to this index.
name()
Returns the name of this index.
prepare(transaction)
Prepares the table for inserting. This is an alias for A_Table.prepare(). See A_Table.prepare() for details.
rename_table(new_table_name)
Don't call this directly. To rename a table call A_Table.rename(). This is called internally by the method that renames a table. This affects or updates the table and all indexes associated with the table.
reset()
Alias for prepare().
rows_accessed()
Returns the number of rows yielded by the iterator methods. I was too lazy to use my own counters, so I track the counter internally in the iterator methods.
rows_deleted()
Returns the number of rows deleted in a call to delete() or after iterating and calling delete_row().
rows_inserted()
Returns the number of rows inserted in a call to insert(). This should always return 1.
rows_updated()
Returns the number of rows updated in a call to update() or after iterating and calling update_row().
set_defaults()
Alias for prepare().
show(prefix='')
Print information about this index and the btree used by this index. Prefix allows you to indent the output for nice formatting.
synchronize()
Mutex.synchronize the lock associated with this index. This lock is shared by all objects that connect to this index.
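As an illustration of several objects sharing one lock, here is a small plain-Ruby sketch built on the standard Mutex class. SharedIndexHandle and its bump method are hypothetical stand-ins for objects connected to the same index; they are not part of ArunaDB.

```ruby
# Sketch only: SharedIndexHandle is a hypothetical stand-in, not A_Index.
class SharedIndexHandle
  def initialize(lock, counter)
    @lock = lock        # the one Mutex shared by every handle
    @counter = counter  # shared state protected by that Mutex
  end

  # Run the block while holding the shared lock.
  def synchronize(&block)
    @lock.synchronize(&block)
  end

  def bump
    synchronize { @counter[:value] += 1 }
  end
end

lock = Mutex.new
counter = { value: 0 }

# Four handles, all built around the same lock and the same counter.
handles = Array.new(4) { SharedIndexHandle.new(lock, counter) }
threads = handles.map { |h| Thread.new { 1000.times { h.bump } } }
threads.each(&:join)

puts counter[:value]
```

Because every handle synchronizes on the same Mutex, the increments from the four threads never interleave, and the block form releases the lock even if the block raises.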
table_name()
Returns the name of the table associated with this index.
to_s()
Returns a string containing information about this index.
truncate()
Remove all the rows in this index. This is automatically called when the index is loaded. To truncate the table, call A_Table.truncate. This will also truncate all indexes associated with this table. If you call this, you must reload the index.
unlock()
Release the lock associated with this index. This lock is shared by all objects that connect to this index.
update(transaction, pkey, column_names, values) # update one record in the table based on the provided column list and values list
This updates the row in the table associated with pkey. This is an alias for A_Table.update(). FYI, this method operates on the data in the table rather than the data in the index.
update_row(transaction)
Update the current row in the table you are iterating over. This method can only be used in conjunction with one of the methods that iterate, such as each, each_key, each_value, etc. You must call begin_update before you start changing values in the A_Table_Data object that is yielded by the iterator. FYI, this method operates on the data in the table rather than the data in the index. The iterator must also have the update parameter set to true.
Here is an example:
m1 = A_Table.connect('Membership')
t1 = A_Transaction.new
# this iterates only over rows where the pkey == 102
m1.each(t1, 102, 102) {|data|
  data = m1.begin_update(t1)
  data.first_name = 'atest'
  m1.update_row(t1)
}
t1.commit
m1.close
use()
Alias for prepare().
tst_a_index.rb - this script performs basic testing for the A_Index class. To run these tests, type:
ruby -I.. tst_a_index.rb
11 - prints information about creating, dropping, and closing indexes. Also prints information relating to loading indexes.
15 - prints information about iterating over indexes.