db



NAME

       db - the DB library overview and introduction


DESCRIPTION

       The  DB  library  is  a family of groups of functions that
       provides a modular programming interface  to  transactions
       and  record-oriented  file  access.   The library includes
       support for transactions, locking, logging and  file  page
       caching,  as well as various indexed access methods.  Many
       of the functional groups  (e.g.,  the  file  page  caching
       functions)  are  useful  independent of the other DB func-
       tions, although  some  functional  groups  are  explicitly
       based  on  other functional groups (e.g., transactions and
       logging).  For a general description of  the  DB  package,
       see  db(3).   For a description of the access methods, see
       db_open(3).  For a description of  cursors  within  access
       methods,  see  db_cursor(3);  transactions, see db_txn(3);
       the lock manager, see db_lock(3);  the  log  manager,  see
       db_log(3);  the memory pool manager, see db_mpool(3).  For
       information on configuring the DB  transaction  processing
       environment,  and DB support utilities, see db_appinit(3),
       db_archive(1),   db_checkpoint(1),   db_deadlock(1)    and
       db_recover(1).   For  information on dumping and reloading
       DB databases, see db_dump(1) and db_load(1).

       The DB library does not provide user interfaces, data entry
       GUIs, SQL support or any of the other standard user-level
       database interfaces.  What it does provide is the set of
       programmatic building blocks that allow you to easily embed
       database-style functionality and support into other objects
       or interfaces.


ARCHITECTURE

       The  DB  library supports two different models of applica-
       tions: client-server and embedded.

       In the client-server model, a database server  is  created
       by  writing  an application that accepts requests via some
       form of IPC and issues calls to the DB functions based  on
       those  queries.   In  this  model, applications are client
       programs that attach to the server and issue queries.  The
       client-server  model trades performance for protection, as
       it does not require that the applications share a  protec-
       tion  domain  with  the  server,  but IPC/RPC is generally
       slower than a function call.  In addition, this model sim-
       plifies  the  creation  of  network client-server applica-
       tions.

       In the embedded model, an application links the DB library
       directly into its address space.  This provides for faster
       access to  database  functionality,  but  means  that  the
       applications  sharing log files, lock manager, transaction
       manager or memory pool manager have the ability  to  read,
       write, and corrupt each other's data.

       It is the application designer's responsibility to select
       the appropriate model for their application.

       Applications require a single include file, <db.h>,  which
       must  be  installed in an appropriate location on the sys-
       tem.

       The DB library is made up of  five  major  subsystems,  as
       follows:

       Access methods
            The access methods subsystem is made up of
            general-purpose support for creating and accessing
            files formatted as B+trees, hashed files, and fixed
            and variable length records.  These modules are useful
            in the absence of transactions for processes that want
            fast, formatted file support.  See db_open(3) and
            db_cursor(3).

       Locking
            The  locking subsystem is a general-purpose lock man-
            ager used by  DB.   This  module  is  useful  in  the
            absence  of  the rest of the DB package for processes
            that want a fast,  configurable  lock  manager.   See
            db_lock(3) for more information.

       Logging
            The logging subsystem is the logging facility that
            underlies the DB transaction model.  It is largely
            specific to the DB package, and unlikely to be used
            elsewhere.  See db_log(3) for more information.

       Memory Pool
            The memory  pool  subsystem  is  the  general-purpose
            shared memory buffer pool used by DB.  This module is
            useful outside of the DB package for  processes  that
            want  page-oriented, cached, shared file access.  See
            db_mpool(3) for more information.

       Transactions
            The transaction subsystem implements the DB  transac-
            tion  model.   It is largely specific to the DB pack-
            age.  See db_txn(3) for more information.

       There are several stand-alone utilities that  support  the
       DB environment.  They are as follows:

       db_archive
            The  db_archive  utility  supports  database  backup,
            archival   and   log   file   administration.     See
            db_archive(1) for more information.

       db_recover
            The db_recover utility runs after an unexpected DB or
            system failure to restore the database to  a  consis-
            tent  state.  See db_recover(1) for more information.

       db_checkpoint
            The db_checkpoint utility runs as a  daemon  process,
            monitoring  the database log and periodically issuing
            checkpoints.  See db_checkpoint(1) for more  informa-
            tion.

       db_deadlock
            The  db_deadlock  utility  runs  as a daemon process,
            periodically traversing the database lock  structures
            and aborting transactions when it detects a deadlock.
            See db_deadlock(1) for more information.

       db_dump
            The db_dump utility writes a copy of the database  to
            a   flat-text   file   in  a  portable  format.   See
            db_dump(1) for more information.

       db_load
            The db_load utility reads the flat-text file produced
            by  db_dump,  and loads it into a database file.  See
            db_load(1) for more information.

       db_stat
            The db_stat utility displays statistics for databases
            and  database  environments.  See db_stat(1) for more
            information.
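
       As an illustration of the kind of check a deadlock detector
       performs, the following C sketch looks for a cycle in a
       waits-for graph.  It is a textbook technique shown under
       stated assumptions, not the algorithm used by db_deadlock;
       the names ex_visit, ex_find_deadlock and EX_MAXLOCKERS are
       hypothetical and are not part of DB.

```c
#define EX_MAXLOCKERS 8    /* hypothetical limit for this sketch */

/*
 * waits_for[a][b] != 0 means locker a is blocked on a lock held by b.
 * state[] values: 0 = unvisited, 1 = on the current path, 2 = done.
 * Returns a locker that lies on a cycle, or -1 if none was found.
 */
static int
ex_visit(int waits_for[EX_MAXLOCKERS][EX_MAXLOCKERS], int n, int node,
    int *state)
{
    int next, victim;

    state[node] = 1;
    for (next = 0; next < n; ++next) {
        if (!waits_for[node][next])
            continue;
        if (state[next] == 1)
            return (next);        /* back edge: a cycle */
        if (state[next] == 0 &&
            (victim = ex_visit(waits_for, n, next, state)) >= 0)
            return (victim);
    }
    state[node] = 2;
    return (-1);
}

/* Return a locker on some cycle (a candidate to abort), or -1. */
int
ex_find_deadlock(int waits_for[EX_MAXLOCKERS][EX_MAXLOCKERS], int n)
{
    int state[EX_MAXLOCKERS] = { 0 };
    int i, victim;

    for (i = 0; i < n; ++i)
        if (state[i] == 0 &&
            (victim = ex_visit(waits_for, n, i, state)) >= 0)
            return (victim);
    return (-1);
}
```

       A periodic detector would rebuild the graph from the lock
       structures on each pass and abort the transaction owned by
       the returned locker.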


NAMING AND THE DB ENVIRONMENT

       The  DB  application  environment  is  described  by   the
       db_appinit(3)  manual  page.   The  db_appinit function is
       used to create a consistent naming scheme for all  of  the
       subsystems sharing a DB environment.  If db_appinit is not
       called by a DB application, naming is performed as  speci-
       fied by the manual page for the specific subsystem.

       DB  applications that run with additional privilege should
       always call the db_appinit function to initialize DB  nam-
       ing for their application.  This ensures that the environ-
       ment variables DB_HOME and TMPDIR will only be used if the
       application explicitly specifies that they are safe.
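
       The following sketch shows why a privileged application
       wants explicit control over environment variables.  The
       function and flag names (ex_choose_home, EX_TRUST_ENVIRON)
       are hypothetical, and the precedence shown is an assumption
       made for illustration; db_appinit(3) describes the actual
       rules.

```c
#include <stdlib.h>

/* Hypothetical flag for this sketch; not a DB constant. */
#define EX_TRUST_ENVIRON 0x1

/*
 * Choose the home directory for an environment.  An explicit path
 * wins; DB_HOME is consulted only when the caller has stated that
 * the process environment can be trusted.  Returns NULL if no home
 * directory could be determined.
 */
const char *
ex_choose_home(const char *explicit_home, unsigned int flags)
{
    if (explicit_home != NULL)
        return (explicit_home);
    if (flags & EX_TRUST_ENVIRON)
        return (getenv("DB_HOME"));
    return (NULL);
}
```

       A set-user-id program would call ex_choose_home with flags
       of 0, so that a DB_HOME value planted by the invoking user
       is never consulted.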


ADMINISTERING THE DB ENVIRONMENT

       A DB environment consists of a database home directory and
       all the long-running daemons necessary to ensure continued
       functioning  of  DB and its applications.  In the presence
       of transactions,  the  checkpoint  daemon,  db_checkpoint,
       must be run as long as there are applications present (see
       db_checkpoint(1) for  details).   When  locking  is  being
       used,  the deadlock detection daemon, db_deadlock, must be
       run  as  long  as  there  are  applications  present  (see
       db_deadlock(1)   for  details).   The  db_archive  utility
       provides information to  facilitate  log  reclamation  and
       creation of database snapshots (see db_archive(1) for
       details).  After application or system failure, the
       db_recover utility must be run before any applications are
       restarted to return the database  to  a  consistent  state
       (see db_recover(1) for details).

       The  simplest  way to administer a DB application environ-
       ment is to create a single ``home'' directory which houses
       all the files for the applications that are sharing the DB
       environment.  In this model,  the  shared  memory  regions
       (i.e.,  the locking, logging, memory pool, and transaction
       regions) and log files will be  stored  in  the  specified
       directory  hierarchy.   In addition, all data files speci-
       fied using relative pathnames will be  named  relative  to
       this home directory.  When recovery needs to be run (e.g.,
       after system or application failure),  this  directory  is
       specified  as the home directory to db_recover(1), and the
       system is restored to a consistent state,  ready  for  the
       applications to be restarted.
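
       The naming rule for relative pathnames can be sketched as
       follows.  This is a simplification that assumes POSIX-style
       pathnames and ignores any customization made in a DB_CONFIG
       file; ex_home_path is a hypothetical helper, not a DB
       function.

```c
#include <stdio.h>

/*
 * Build the pathname used for a file inside an environment.
 * Absolute names are used unchanged; relative names are placed
 * under the home directory.  Returns 0 on success, or -1 if the
 * result does not fit in buf.
 */
int
ex_home_path(const char *home, const char *name, char *buf, size_t len)
{
    int n;

    if (name[0] == '/')
        n = snprintf(buf, len, "%s", name);
    else
        n = snprintf(buf, len, "%s/%s", home, name);
    return (n < 0 || (size_t)n >= len ? -1 : 0);
}
```

       With a home directory of /var/db, the relative name
       log.0001 becomes /var/db/log.0001, while /tmp/x is left
       unchanged.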

       In situations where further customization is desired, such
       as placing the log files on a separate device, it is  rec-
       ommended  that the application installation process create
       a configuration file named ``DB_CONFIG'' in  the  database
       home   directory,   specifying   the  customization.   See
       db_appinit(3) for details on this procedure.

       The DB architecture does not support  placing  the  shared
       memory  regions  on  remote filesystems, e.g., the Network
       File System (NFS) and the Andrew File System  (AFS).   For
       this  reason, the database home directory must reside on a
       local filesystem.   Databases,  log  files  and  temporary
       files  may  be  placed on remote filesystems, although the
       application may incur a performance penalty for so  doing.

       It is important to realize that all applications sharing a
       single home directory implicitly trust each  other.   They
       have  access  to  each  other's  data as it resides in the
       shared memory buffer pool and will share resources such as
       buffer  space  and  locks.  At the same time, any applica-
       tions that access the same files must share an environment
       if  consistency  is  to be maintained across the different
       applications.


MULTI-THREADING

       The DB library is not itself multi-threaded.  The  library
       was deliberately architected to not use threads internally
       because of the portability  problems  that  using  threads
       within the library would introduce.

       DB  supports  multi-threaded  applications with the caveat
       that it loads and calls functions that are commonly avail-
       able   in  C  language  environments  and  which  may  not
       themselves be thread-safe.  Other than this usage, DB  has
       no  static  data  and  maintains  no local context between
       calls to DB functions.  To ensure  that  applications  can
       safely  use  threads  in the context of DB, porters to new
       operating systems and/or C libraries must confirm that the
       system  and C library functions used by the DB library are
       thread-safe.

       Object handles returned from DB library functions can be
       made free-threaded, i.e., usable by multiple threads
       concurrently, by specifying the DB_THREAD flag to
       db_appinit(3) and the other subsystem open functions.

       There  are  a  few  additional  caveats  concerning  using
       threads to access the DB library:

       1      Spinlocks must have been implemented for  the  com-
              piler/architecture   combination.    Attempting  to
              specify the DB_THREAD flag will fail  if  spinlocks
              are not available.

       2      The  DB_THREAD  flag must be specified for all sub-
              systems either explicitly  or  via  the  db_appinit
              function.   Setting  the  DB_THREAD  flag inconsis-
              tently may result in database corruption.

       3      Only a single thread may call  the  close  function
              for  a  returned database or subsystem handle.  See
              db_open(3) and  the  appropriate  subsystem  manual
              pages for more information.

       4      Either the DB_DBT_MALLOC or the DB_DBT_USERMEM flag
              must be set in a DBT used for key or data
              retrieval.  See db_open(3) for more information.

       5      The  DB_CURRENT,  DB_NEXT  and DB_PREV flags to the
              log_get function may not be used by a free-threaded
              handle.   If  such  calls  are  necessary, a thread
              should explicitly create a unique DB_LOG handle  by
              calling log_open(3).  See db_log(3) for more infor-
              mation.

       6      Each database operation (i.e., any call to a  func-
              tion  underlying the handles returned by db_open(3)
              and db_cursor(3)) is normally performed  on  behalf
              of  a unique locker.  If, within a single thread of
              control, multiple  calls  on  behalf  of  the  same
              locker are desired, then transactions must be used.
              For example, consider the case where a cursor  scan
              locates  a  record,  and then based on that record,
              accesses some other item in the database.  If these
              are  done using the default lockers for the handle,
              there is no guarantee  that  these  two  operations
              will  not  conflict.   If the application wishes to
              guarantee that  the  operations  do  not  conflict,
              locks  must be obtained on behalf of a transaction,
              instead of the default locker id, and a transaction
              must be specified to the cursor creation and the
              subsequent DB call.

       7      Transactions  may  not  span  threads,  i.e.,  each
              transaction  must begin and end in the same thread,
              and each transaction may only be used by  a  single
              thread.
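
       Caveat 3 above (only a single thread may close a handle)
       is one that applications sometimes enforce with a small
       guard of their own.  The following is a minimal sketch
       using C11 atomics; struct ex_shared, ex_close_once and
       ex_count_close are hypothetical names, and the handle is
       an opaque pointer rather than a real DB handle.

```c
#include <stdatomic.h>

/* Hypothetical wrapper around a shared handle; not a DB type. */
struct ex_shared {
    void *handle;               /* the handle all threads share */
    void (*close_fn)(void *);   /* how to close it */
    atomic_flag closed;         /* set by the first closer */
};

/*
 * Close the shared handle at most once.  Returns 1 if this caller
 * performed the close, 0 if another caller already had.
 */
int
ex_close_once(struct ex_shared *s)
{
    if (atomic_flag_test_and_set(&s->closed))
        return (0);
    s->close_fn(s->handle);
    return (1);
}

/* Stand-in close function for demonstration: counts its calls. */
void
ex_count_close(void *counter)
{
    ++*(int *)counter;
}
```

       The guard only prevents a second close.  The application
       must still ensure that no other thread is using the handle
       when the close takes place.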


ERROR RETURNS

       Except  for  the  historic dbm and hsearch interfaces (see
       db_dbm(3) and db_hsearch(3)), DB does not use  the  global
       variable  errno to return error values to the calling pro-
       cess or thread.  The return values for  all  DB  functions
       can be grouped into three categories:

        0   A  return value of 0 indicates that the operation was
            successful.

       >0   A return value that is greater than 0 indicates  that
            there  was  a system error.  The errno value returned
            by the system is returned by the function, e.g., when
            a  DB  function  is  unable  to  allocate memory, the
            return value from the function will be ENOMEM.

       <0   A return value that is less than 0 indicates a condi-
            tion  that  was  not a system failure, but was not an
            unqualified success, either.  For example, a  routine
            to  retrieve  a  key/data pair from the database will
            return DB_NOTFOUND when the key/data  pair  does  not
            appear  in the database, as opposed to the value of 0
            which would be returned if  the  key/data  pair  were
            found  in  the  database.   All  such  special values
            returned by DB functions are less than 0 in order  to
            avoid conflict with possible values of errno.
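
       A caller can sort return values into these three categories
       without consulting errno.  In the sketch below, EX_NOTFOUND
       is a stand-in for a special value such as DB_NOTFOUND (the
       real constant comes from <db.h> and is assumed here only to
       be negative); ex_classify and ex_describe are hypothetical
       helpers.

```c
#include <string.h>

/* Stand-in for a DB special return value; the value is assumed. */
#define EX_NOTFOUND (-1)

enum ex_result { EX_OK, EX_SYSTEM_ERROR, EX_SPECIAL };

/* Sort a DB-style return value into one of the three categories. */
enum ex_result
ex_classify(int ret)
{
    if (ret == 0)
        return (EX_OK);
    return (ret > 0 ? EX_SYSTEM_ERROR : EX_SPECIAL);
}

/* Describe a return value without reading the global errno. */
const char *
ex_describe(int ret)
{
    switch (ex_classify(ret)) {
    case EX_OK:
        return ("success");
    case EX_SYSTEM_ERROR:
        return (strerror(ret));  /* ret is itself an errno value */
    default:
        return (ret == EX_NOTFOUND ? "not found" : "special return");
    }
}
```

       Because the error value travels in the return value, this
       style works unchanged in threaded programs where a shared
       errno would be unreliable.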


DATABASE AND PAGE SIZES

       DB stores database file page numbers as unsigned 32-bit
       numbers and database file page sizes as unsigned 16-bit
       numbers.  This results in a maximum database size of 2^48
       bytes (2^32 pages of at most 2^16 bytes each).  The minimum
       database page size is 512 (2^9) bytes, so the smallest
       value this maximum can take is 2^41 bytes.
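
       The arithmetic behind these limits can be checked with a
       few lines of C.  The helper name ex_max_db_bytes is
       hypothetical; the only facts it relies on are the 32-bit
       page number and the page sizes given above.

```c
#include <stdint.h>

/*
 * Largest database, in bytes, for a given page size.  Page numbers
 * are unsigned 32-bit values, so a file holds at most 2^32 pages.
 */
uint64_t
ex_max_db_bytes(uint32_t page_size)
{
    return (((uint64_t)1 << 32) * page_size);
}
```

       A page size of 512 yields 2^41 bytes, and a page size of
       65536 yields 2^48 bytes.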

       DB is potentially further limited if the host system does
       not have filesystem support for files larger than 2^32
       bytes, including seeking to absolute offsets within such
       files.

       The maximum btree depth is 255.


BYTE ORDERING

       The database files created by DB can be created in either
       little-endian or big-endian format.  By default, the native
       format of the machine on which the database is created is
       used.  A database in either format can be used on a machine
       with a different native format, although the application
       may incur a performance penalty for the run-time
       conversion.
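
       The run-time conversion amounts to detecting the host byte
       order and swapping multi-byte values when it differs from
       the file's.  The following is a generic sketch of that
       idea, not DB's own conversion code; the ex_ names are
       hypothetical.

```c
#include <stdint.h>

/* Return 1 if this machine stores the low-order byte first. */
int
ex_little_endian(void)
{
    const uint16_t probe = 1;

    return (*(const unsigned char *)&probe == 1);
}

/* Reverse the bytes of a 32-bit value, e.g. a stored page number. */
uint32_t
ex_swap32(uint32_t v)
{
    return ((v >> 24) | ((v >> 8) & 0x0000ff00) |
        ((v << 8) & 0x00ff0000) | (v << 24));
}

/* Convert a value read from a file to host order if formats differ. */
uint32_t
ex_to_host32(uint32_t stored, int file_is_little_endian)
{
    return (!!file_is_little_endian == ex_little_endian() ?
        stored : ex_swap32(stored));
}
```

       When the file and host formats match, ex_to_host32 returns
       its argument untouched, which is why only foreign-format
       databases pay the conversion cost.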


EXTENDING DB

       DB  includes tools to simplify the development of applica-
       tion-specific logging and recovery.  Specifically, given a
       description  of  the information to be logged, these tools
       will automatically  create  logging  functions  (functions
       that  take the values as parameters and construct a single
       record that is written to the log), read functions  (func-
       tions  that  read  a  log record and unmarshall the values
       into a structure that maps onto the values  you  chose  to
       log),  a print function (for debugging), templates for the
       recovery functions,  and  automatic  dispatching  to  your
       recovery functions.


EXAMPLES

       There  are  several  different ways that the DB library is
       used:

       1      Applications that want to use  formatted  files  to
              store  data,  and  are  unconcerned with concurrent
              access and loss of data due to  catastrophic  fail-
              ure.   Generally,  these applications create short-
              lived databases that  are  discarded  or  recreated
              when the system fails.  Such applications will only
              be concerned with the DB access  methods.   The  DB
              access  methods will use the memory pool subsystem,
              but the application is  unlikely  to  be  aware  of
              this.   See the file examples/ex_access.c in the DB
              source distribution for a C language  code  example
              of  how  such  an  application  might  use  the  DB
              library.

       2      Applications similar to #1, but that also  wish  to
              use  db_appinit(3)  for environment initialization.
              See the file examples/ex_appinit.c in the DB source
              distribution  for  a C language code example of how
              such an application might use the DB library.

       3      Applications that wish to transaction-protect
              structures other than the DB access methods.  See
              the file examples/ex_trans.c in the DB source
              distribution for a C language code example of how
              such an application might use the DB library.

       4      Applications that use the DB access methods, but
              are concerned about catastrophic failure, and
              therefore want to transaction-protect the
              underlying DB files.  See the file
              examples/ex_tpcb.c in the DB source distribution
              for a C language code example of how such an
              application might use the DB library.

       5      Applications that want to buffer input files  other
              than  the  DB  access  method  files.  See the file
              examples/ex_mpool.c in the DB  source  distribution
              for a C language code example of how such an appli-
              cation might use the DB library.

       6      Applications that want a general purpose lock  man-
              ager  separate  from  locking  support  for  the DB
              access methods.  See the file examples/ex_lock.c in
              the  DB  source  distribution for a C language code
              example of how such an application might use the DB
              library.


COMPATIBILITY

       The DB 2.0 library provides backward compatible interfaces
       for the  historic  UNIX  dbm(3),  ndbm(3)  and  hsearch(3)
       interfaces.   See  db_dbm(3) and db_hsearch(3) for further
       information on these interfaces.  It also provides a back-
       ward   compatible  interface  for  the  historic  DB  1.85
       release.  DB 2.0 does not provide  database  compatibility
       for  any  of  the above interfaces, and existing databases
       must be converted manually.  To convert existing databases
       from  the  DB 1.85 format to the DB 2.0 format, review the
       db_dump185(1) and db_load(1) manual pages.

       The name space in DB 2.0 has been  changed  from  that  of
       previous  DB versions, notably version 1.85, for portabil-
       ity and consistency reasons.  The only name collisions  in
       the  two  libraries  are  the  names  used  by the dbm(3),
       ndbm(3), hsearch(3) and the DB 1.85  compatibility  inter-
       faces.   To  include  both  DB 1.85 and DB 2.0 in a single
       library, remove the dbm(3), ndbm(3) and hsearch(3)  inter-
       faces  from  either  of the two libraries, and the DB 1.85
       compatibility interface from the DB 2.0 library.  This can
       be done by editing the library Makefiles and reconfiguring
       and rebuilding the DB 2.0 library.  Obviously, if you  use
       the  historic  interfaces, you will get the version in the
       library from which you did not remove it.  Similarly,  you
       will  not be able to access DB 2.0 files using the DB 1.85
       compatibility interface, since you have removed that  from
       the library as well.

       It  is  possible  to simply relink applications written to
       the DB 1.85 interface against the DB 2.0 library.   Recom-
       pilation  of  such  applications is slightly more complex.
       When the DB 2.0 library  is  installed,  it  installs  two
       include  files,  db.h  and  db_185.h.   The former file is
       likely to replace the DB 1.85 version's include file which
       had the same name.  If this did not happen, recompiling DB
       1.85 applications to use the DB  2.0  library  is  simple:
       recompile  as  done  historically, and load against the DB
       2.0 library instead of the DB 1.85 library.  If,  however,
       the  DB 2.0 installation process has replaced the system's
       db.h include file, replace the  application's  include  of
       db.h with inclusion of db_185.h, recompile as done histor-
       ically, and then load against the DB 2.0 library.

       Applications written using the historic interfaces of  the
       DB  library  should not require significant effort to port
       to the DB 2.0 interfaces.   While  the  functionality  has
       been greatly enhanced in DB 2.0, the historic interface
       and functionality are largely unchanged.  Reviewing the
       application's calls into the DB library and updating those
       calls to the new names, flags and return values should  be
       sufficient.

       Loading applications that use the DB 1.85 interfaces
       against the DB 2.0 library, or converting DB 1.85 function
       calls to DB 2.0 function calls, will work.  However,
       reconsidering your application's interface to the DB
       library in light of the additional functionality in DB 2.0
       is recommended, as it is likely to result in enhanced
       application performance.



SEE ALSO

       db_archive(1), db_checkpoint(1), db_deadlock(1), db_dump(1),
       db_dump185(1), db_load(1), db_recover(1), db_stat(1), db(3),
       db_appinit(3), db_cursor(3), db_dbm(3), db_hsearch(3),
       db_lock(3), db_log(3), db_mpool(3), db_open(3), db_txn(3)

       LIBTP: Portable, Modular Transactions for UNIX, Margo Seltzer,
       Michael Olson, USENIX proceedings, Winter 1992.