db_open



NAME

       db_open - database access methods


SYNOPSIS

       On Solaris, load with -lthread:
            cc [ flag ... ] file ...  -lthread [ library ... ]

       #include <db.h>

       int
       db_open(const char *file, DBTYPE type,
            int flags, int mode, DB_ENV *dbenv, DB_INFO *dbinfo, DB **dbpp);


DESCRIPTION

       The  DB  library  is  a family of groups of functions that
       provides a modular programming interface  to  transactions
       and  record-oriented  file  access.   The library includes
       support for transactions, locking, logging and  file  page
       caching,  as well as various indexed access methods.  Many
       of the functional groups  (e.g.,  the  file  page  caching
       functions)  are  useful  independent of the other DB func-
       tions, although  some  functional  groups  are  explicitly
       based  on  other functional groups (e.g., transactions and
       logging).  For a general description of  the  DB  package,
       see  db(3).   For a description of the access methods, see
       db_open(3).  For a description of  cursors  within  access
       methods,  see  db_cursor(3);  transactions, see db_txn(3);
       the lock manager, see db_lock(3);  the  log  manager,  see
       db_log(3);  the memory pool manager, see db_mpool(3).  For
       information on configuring the DB  transaction  processing
       environment,  and DB support utilities, see db_appinit(3),
       db_archive(1),   db_checkpoint(1),   db_deadlock(1)    and
       db_recover(1).   For  information on dumping and reloading
       DB databases, see db_dump(1) and db_load(1).

       This manual page describes the overall structure of the DB
       library access methods.

       The currently supported file formats are btree, hashed and
       recno.  The btree format is a representation of a  sorted,
       balanced tree structure.  The hashed format is an extensi-
       ble, dynamic hashing scheme.  The  recno  format  supports
       fixed  or  variable  length  records (optionally retrieved
       from a flat text file).

       The db_open function opens  the  database  represented  by
       file  for  both reading and writing.  Files never intended
       to be shared or preserved on disk may be created  by  set-
       ting the file parameter to NULL.

       The db_open function copies a pointer to a DB structure
       (as typedef'd in the <db.h> include file) into the memory
       location referenced by dbpp.  This structure includes a
       set of functions to perform various database  actions,  as
       described  below.   The db_open function returns the value
       of errno on failure and 0 on success.

       Note, while most of the access methods use file as the
       name of an underlying file on disk, this is not guaranteed.
       Also, calling db_open is a reasonably expensive operation.
       (This is based on a model where the DBMS keeps a set of
       files open for a long time rather than opening and closing
       them on each query.)

       The  type  argument  is  of type DBTYPE (as defined in the
       <db.h> include file) and must be set to one  of  DB_BTREE,
       DB_HASH,  DB_RECNO  or DB_UNKNOWN.  If type is DB_UNKNOWN,
       the database must already  exist  and  db_open  will  then
       determine if it's of type DB_BTREE, DB_HASH or DB_RECNO.

       The  flags  and  mode  arguments specify how files will be
       opened and/or created when they don't already exist.   The
       flags value is specified by or'ing together one or more of
       the following values:

       DB_CREATE
            Create any underlying files, as  necessary.   If  the
            files  do not already exist and the DB_CREATE flag is
            not specified, the call will fail.

       DB_NOMMAP
            Do not map this file  (see  db_mpool(3)  for  further
            information).

       DB_RDONLY
            Open  the  database for reading only.  Any attempt to
            write the database using the access methods will fail
            regardless  of the actual permissions of any underly-
            ing files.

       DB_THREAD
            Cause the DB handle returned by the db_open  function
            to  be  useable  by  multiple threads within a single
            address space, i.e., to be ``free-threaded''.

       DB_TRUNCATE
            ``Truncate'' the database if it exists, i.e.,  behave
            as  if the database were just created, discarding any
            previous contents.

       All files created by the access methods are  created  with
       mode  mode  (as described in chmod(2)) and modified by the
       process'  umask  value  at  the  time  of  creation   (see
       umask(2)).   The group ownership of created files is based
       on the system and directory defaults, and is  not  further
       specified by DB.
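
       As a rough illustration of the paragraph above, the
       permission bits of a newly created file are the requested
       mode with the umask bits cleared.  The helper name
       effective_mode is hypothetical and is not part of DB:

```c
#include <sys/types.h>

/*
 * Hypothetical helper, not part of DB: the permission bits a newly
 * created file receives when `mode` is requested under umask `mask`.
 */
mode_t
effective_mode(mode_t mode, mode_t mask)
{
	return (mode & ~mask);
}
```

       For example, a mode of 0666 under a umask of 022 yields
       permission bits of 0644.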



DB_ENV

       The  access  methods make calls to the other subsystems in
       the DB library based on the  dbenv  argument  to  db_open,
       which  is  a  pointer to a structure of type DB_ENV (type-
       def'd in <db.h>).  It is expected that  applications  will
       use  a  single  DB_ENV structure as the argument to all of
       the subsystems in the DB package.  In order to ensure com-
       patibility  with  future releases of DB, all fields of the
       DB_ENV structure that are not  explicitly  set  should  be
       initialized  to  0  before the first time the structure is
       used.  Do this by  declaring  the  structure  external  or
       static,  or  by  calling the C library routine bzero(3) or
       memset(3).

       The fields of DB_ENV used by db_open are described  below.
       As references to the DB_ENV structure may be maintained by
       db_open, it is necessary that  the  DB_ENV  structure  and
       memory  it references be valid until after the close func-
       tion is called.  If dbenv is NULL or any of its fields are
       set  to  0,  defaults  appropriate for the system are used
       where possible.

       The following DB_ENV  fields  may  be  initialized  before
       calling db_open:

       DB_LOG *lg_info;
            If  modifications  to the file being opened should be
            logged, the lg_info field  contains  a  return  value
            from  the  function log_open.  If lg_info is NULL, no
            logging is done by the DB access methods.

       DB_LOCKTAB *lk_info;
            If locking is required for the file being opened  (as
            is  the  case  when multiple processes or threads are
            accessing the same file), the lk_info field  contains
            a  return  value  from  the  function  lock_open.  If
            lk_info is NULL, no locking is done by the DB  access
            methods.

            If  both locking and transactions are being performed
            (i.e., both lk_info and tx_info  are  non-NULL),  the
            transaction  ID  will  be  used as the locker ID.  If
            only locking is being performed, db_open will acquire
            a  locker ID from lock_id(3), and will use it for all
            locks required for this instance of db_open.

       DB_MPOOL *mp_info;
            If the cache for the  file  being  opened  should  be
            maintained in a shared buffer pool, the mp_info field
            contains a return value from the function  memp_open.
            If  mp_info  is NULL, a memory pool may still be cre-
            ated by DB, but it will be private to the application
            and managed by DB.

       DB_TXNMGR *tx_info;
            If  the accesses to the file being opened should take
            place in the context of transactions (providing atom-
            icity and error recovery), the tx_info field contains
            a  return  value  from  the  function  txn_open  (see
            db_txn(3)).    If  transactions  are  specified,  the
            application is responsible for making suitable  calls
            to txn_begin, txn_abort, and txn_commit.  If tx_info
            is NULL, no transaction support is provided by the DB
            access methods.

            When  the access methods are used in conjunction with
            transactions, the application must abort the transac-
            tion (using txn_abort) if any of the transaction pro-
            tected access method calls  (i.e.,  any  calls  other
            than  open,  close  and sync) returns an error value.
            As described by db(3), an error value  is  any  value
            greater than 0.


DB_INFO

       The  access  methods are configured using the DB_INFO data
       structure argument to db_open.  The DB_INFO  structure  is
       typedef'd in <db.h> and has a large number of fields, most
       specific to a single access method,  although  a  few  are
       shared.   The fields that are common to all access methods
       are listed here; those specific to  an  individual  access
       method  are  described below.  No reference to the DB_INFO
       structure is maintained by DB, so it is possible  to  dis-
       card it as soon as the db_open call returns.

       In  order  to ensure compatibility with future releases of
       DB, all fields of the DB_INFO structure should be initial-
       ized  to  0  before  the  structure  is  used.  Do this by
       declaring the structure external or static, or by  calling
       the C library function bzero(3) or memset(3).

       If  possible, defaults appropriate for the system are used
       for the DB_INFO fields if dbinfo is NULL or any fields  of
       the DB_INFO structure are set to 0.  The following DB_INFO
       fields may be initialized before calling db_open:

       size_t db_cachesize;
            A suggested maximum size of the memory pool cache, in
            bytes.   If db_cachesize is 0, an appropriate default
            is used.  If the mp_info  field  is  also  specified,
            this field is ignored.

            The cache should hold at least 10 pages; the access
            methods will fail if an insufficiently large cache is
            specified.  In addition, for applications that exhibit
            strong locality in their data access patterns,
            increasing the size of the cache can significantly
            improve application performance.

       int db_lorder;
            The byte order for integers in the stored database
            metadata.  The number should represent the order as
            an integer; for example, big endian order is the
            number 4321, and little endian order is the number
            1234.  If db_lorder is 0, the host order of the
            machine where the DB library was compiled is used.

            The access methods provide no  guarantees  about  the
            byte ordering of the data stored in the database, and
            applications are responsible for maintaining any nec-
            essary ordering.
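
            Because the ordering of application data is left to
            the application, one common approach is to store
            integers in a fixed byte order.  The following is a
            minimal sketch of that idea; the names host_lorder,
            put_be32 and get_be32 are hypothetical and are not
            part of DB, and uint32_t stands in for u_int32_t:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of DB: report the host byte order in
 * the numeric form db_lorder uses (1234 little endian, 4321 big).
 */
int
host_lorder(void)
{
	uint32_t probe = 1;

	/* A little endian host stores the low-order byte first. */
	return (*(unsigned char *)&probe == 1 ? 1234 : 4321);
}

/* Store v most significant byte first, whatever the host order. */
void
put_be32(unsigned char *buf, uint32_t v)
{
	buf[0] = (unsigned char)(v >> 24);
	buf[1] = (unsigned char)(v >> 16);
	buf[2] = (unsigned char)(v >> 8);
	buf[3] = (unsigned char)v;
}

/* Read back a value written by put_be32. */
uint32_t
get_be32(const unsigned char *buf)
{
	return ((uint32_t)buf[0] << 24 | (uint32_t)buf[1] << 16 |
	    (uint32_t)buf[2] << 8 | (uint32_t)buf[3]);
}
```

            Unsigned values stored this way also compare in
            numeric order under a byte-by-byte lexical comparison,
            since all such keys have the same length.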

       size_t db_pagesize;
            The  size  of  the  pages  used  to hold items in the
            database, in bytes.  The minimum  page  size  is  512
            bytes  and  the  maximum  page size is 64K bytes.  If
            db_pagesize is 0, a page size is  selected  based  on
            the   underlying  filesystem  I/O  block  size.   The
            selected size has a lower limit of 512 bytes  and  an
            upper limit of 16K bytes.
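
            The limits above can be summarized in a short sketch.
            It models only the stated bounds; the function names
            are hypothetical, and the selection logic inside the
            library may apply further rules not described here:

```c
#include <stddef.h>

#define MIN_PAGESIZE		512u		/* smallest legal page size */
#define MAX_PAGESIZE		(64u * 1024)	/* largest legal page size */
#define MAX_DEF_PAGESIZE	(16u * 1024)	/* cap on the automatic choice */

/*
 * Hypothetical model: the page size chosen when db_pagesize is 0,
 * given the filesystem I/O block size, clamped to 512 bytes..16K.
 */
size_t
default_pagesize(size_t fs_blksize)
{
	if (fs_blksize < MIN_PAGESIZE)
		return (MIN_PAGESIZE);
	if (fs_blksize > MAX_DEF_PAGESIZE)
		return (MAX_DEF_PAGESIZE);
	return (fs_blksize);
}

/* Nonzero if an explicitly requested page size is within 512..64K. */
int
pagesize_in_range(size_t pagesize)
{
	return (pagesize >= MIN_PAGESIZE && pagesize <= MAX_PAGESIZE);
}
```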

       void *(*db_malloc)(size_t);
            The  flag  DB_DBT_MALLOC,  when  specified in the DBT
            structure, will cause the DB library to allocate mem-
            ory  which  then  becomes  the  responsibility of the
            calling application.

            On systems where separate heaps  are  maintained  for
            applications  and  libraries  (notably  Windows  NT),
            specifying the DB_DBT_MALLOC flag will  fail  because
            the  DB library will allocate memory from a different
            heap than the application will use to  free  it.   To
            avoid this problem, the db_malloc field should be set
            to point to the application's allocation routine.  If
            db_malloc  is  non-NULL,  it will be used to allocate
            the memory returned when the  DB_DBT_MALLOC  flag  is
            set.   The  db_malloc function must match the calling
            conventions of the malloc(3) library routine.
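
            A function suitable for the db_malloc field only
            needs the same calling convention as malloc(3).  A
            minimal sketch, in which the name app_malloc is
            hypothetical:

```c
#include <stdlib.h>

/*
 * Hypothetical application allocator with the calling convention of
 * malloc(3).  Memory obtained here comes from the application's own
 * heap, so the application can later release it with free(3).
 */
void *
app_malloc(size_t size)
{
	return (malloc(size));
}
```

            Assuming a DB_INFO structure named dbinfo, the field
            would be set with ``dbinfo.db_malloc = app_malloc;''
            before calling db_open.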


BTREE

       The btree data structure is a sorted, balanced tree
       structure storing associated key/data pairs.  Searches,
       insertions, and deletions in the btree will all complete
       in O(log base N) time, where N is the number of keys in
       the tree and base is the average number of keys per page.
       Often, inserting ordered data into btrees results in pages
       that are half-full.  This implementation has been modified
       to make ordered (or inverse ordered) insertion the best
       case, resulting in nearly perfect page space utilization.

       Space freed by deleting key/data pairs from the database
       is never returned to the filesystem, although it is reused
       where possible.
       This means that the btree storage structure is  grow-only.
       If  sufficiently  many  keys  are deleted from a tree that
       shrinking the tree is desirable, this can be  accomplished
       by  periodically  creating  a  new tree from a scan of the
       existing one.

       The following additional fields and flags may be  initial-
       ized  before  calling db_open, when using the btree access
       method:

       int (*bt_compare)(const DBT *, const DBT *);
            The bt_compare field is the key comparison function.
            It must return an integer less than, equal to, or
            greater than zero if the first key argument is
            considered to be respectively less than, equal to, or
            greater than the second key argument.  The same
            comparison function must be used on a given tree every
            time it is opened.  If bt_compare is NULL, the keys are
            compared lexically, with shorter keys collating before
            longer keys.
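
            The following sketch shows a comparison function with
            the required signature that implements the lexical
            ordering described above.  It is an illustration, not
            the library's own default routine.  The DBT typedef
            is a local stand-in that mirrors the layout shown
            under KEY/DATA PAIRS (with uint32_t in place of
            u_int32_t) so that the example builds without <db.h>,
            and the name lexical_compare is hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in for the DBT declared in <db.h>. */
typedef struct {
	void *data;
	uint32_t size;
	uint32_t ulen;
	uint32_t dlen;
	uint32_t doff;
	uint32_t flags;
} DBT;

/*
 * Compare two keys byte by byte.  When one key is a prefix of the
 * other, the shorter key collates first.
 */
int
lexical_compare(const DBT *a, const DBT *b)
{
	size_t len = a->size < b->size ? a->size : b->size;
	int cmp = len == 0 ? 0 : memcmp(a->data, b->data, len);

	if (cmp != 0)
		return (cmp);
	/* The common prefix is identical: order by length. */
	return ((a->size > b->size) - (a->size < b->size));
}
```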

       int bt_minkey;
            The minimum number of keys that will be stored on any
            single page.  This value is used to determine which
            keys will be stored on overflow pages, i.e., if a key
            or data item is larger than the page size divided by
            the bt_minkey value, it will be stored on overflow
            pages instead of in the page itself.  The bt_minkey
            value specified must be at least 2; if bt_minkey is 0,
            a value of 2 is used.
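
            The overflow rule can be expressed as a one-line
            test.  This is a simplified model of the rule as
            stated; it ignores any per-page overhead the library
            may account for, and the name stored_on_overflow is
            hypothetical:

```c
#include <stddef.h>

/*
 * Hypothetical model: nonzero if an item of item_size bytes is larger
 * than the page size divided by bt_minkey, and so would be placed on
 * overflow pages.
 */
int
stored_on_overflow(size_t item_size, size_t pagesize, int bt_minkey)
{
	/* A bt_minkey of 0 selects the documented default of 2. */
	size_t minkey = bt_minkey == 0 ? 2 : (size_t)bt_minkey;

	return (item_size > pagesize / minkey);
}
```

            With a 4096 byte page and the default bt_minkey, an
            item of 2048 bytes stays on the page under this model
            and an item of 2049 bytes does not.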

       size_t (*bt_prefix)(const DBT *, const DBT *);
            The bt_prefix field is the prefix comparison function.
            If specified, this function must return the number of
            bytes of the second key argument that are necessary to
            determine that it is greater than the first key
            argument.  If the keys are equal, the key length
            should be returned.

            This is used to compress the keys stored on the btree
            internal  pages.   The  usefulness  of  this  is data
            dependent, but in some data sets can produce signifi-
            cantly  reduced  tree  sizes  and  search  times.  If
            bt_prefix is NULL,  and  no  comparison  function  is
            specified,  a  default lexical comparison function is
            used.  If bt_prefix is NULL and a comparison function
            is specified, no prefix comparison is done.
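
            A prefix function matching a byte-by-byte lexical
            comparison might look like the sketch below.  It
            assumes the first key does not sort after the second,
            uses a local stand-in for the DBT typedef (uint32_t in
            place of u_int32_t) so that it builds without <db.h>,
            and the name lexical_prefix is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the DBT declared in <db.h>. */
typedef struct {
	void *data;
	uint32_t size;
	uint32_t ulen;
	uint32_t dlen;
	uint32_t doff;
	uint32_t flags;
} DBT;

/*
 * Return how many leading bytes of b are needed to tell that b is
 * greater than a, or the key length if the two keys are equal.
 */
size_t
lexical_prefix(const DBT *a, const DBT *b)
{
	const unsigned char *p1 = a->data, *p2 = b->data;
	size_t len = a->size < b->size ? a->size : b->size;
	size_t i;

	/* Count bytes up to and including the first one that differs. */
	for (i = 0; i < len; i++)
		if (p1[i] != p2[i])
			return (i + 1);
	/* a is a prefix of b: one byte past a suffices, unless equal. */
	return (a->size < b->size ? (size_t)a->size + 1 : a->size);
}
```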

       unsigned long flags;
            The  following  additional  flags may be specified by
            or'ing together one or more of the following values:

            DB_DUP
                 Permit duplicate keys in the tree, i.e., insertion
                 when the key of the key/data pair being inserted
                 already exists in the tree will be successful.
                 The ordering of duplicates in the tree is
                 determined by the order of insertion, unless the
                 ordering is otherwise specified by use of a cursor
                 (see db_cursor(3) for more information).


HASH

       The hash data structure is an extensible, dynamic hashing
       scheme.  Backward compatible interfaces to the functions
       described in dbm(3), ndbm(3) and hsearch(3) are provided;
       however, these interfaces are not compatible with previous
       file formats.

       The  following additional fields and flags may be initial-
       ized before calling db_open, when using  the  hash  access
       method:

       unsigned int h_ffactor;
            The h_ffactor field indicates a desired density within
            the hash table.  It is an approximation of the number
            of keys allowed to accumulate in any one bucket,
            determining when the hash table grows or shrinks.  The
            default value is 0, indicating that the fill factor
            will be selected dynamically as pages are filled.

       u_int32_t (*h_hash)(const void *, u_int32_t);
            The h_hash field is a user defined hash function;  if
            h_hash  is  NULL,  a  default  hash function is used.
            Since no hash function performs equally well  on  all
            possible  data,  the  user may find that the built-in
            hash function performs poorly with a particular  data
            set.   User  specified  hash  functions  must  take a
            pointer to a byte string and a  length  as  arguments
            and return a u_int32_t value.

            If a hash function is specified, db_open will attempt
            to determine if the hash function specified is the
            same as the one with which the database was created,
            and will fail if it detects that it is not.
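
            As an illustration of the required signature, the
            sketch below implements the 32-bit FNV-1a hash.  The
            choice of FNV-1a is an assumption made for the
            example and says nothing about the hash function
            built into DB; uint32_t stands in for u_int32_t, and
            the name fnv1a_hash is hypothetical:

```c
#include <stdint.h>

/*
 * Hypothetical user hash function: 32-bit FNV-1a over `length` bytes
 * starting at `bytes`.
 */
uint32_t
fnv1a_hash(const void *bytes, uint32_t length)
{
	const unsigned char *p = bytes;
	uint32_t h = 2166136261u;	/* FNV offset basis */
	uint32_t i;

	for (i = 0; i < length; i++) {
		h ^= p[i];		/* fold in the next byte */
		h *= 16777619u;		/* multiply by the FNV prime */
	}
	return (h);
}
```

            Whatever function is chosen, the same one has to be
            supplied every time the database is opened.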

       unsigned int h_nelem;
            An  estimate of the final size of the hash table.  If
            not set or set  too  low,  hash  tables  will  expand
            gracefully  as  keys  are  entered, although a slight
            performance degradation may be noticed.  The  default
            value is 1.

       unsigned long flags;
            The  following  additional  flags may be specified by
            or'ing together one or more of the following values:

            DB_DUP
                 Permit duplicate keys in the database, i.e.,
                 insertion when the key of the key/data pair being
                 inserted already exists in the database will be
                 successful.  The ordering of duplicates is
                 determined by the order of insertion, unless the
                 ordering is otherwise specified by use of a cursor
                 (see db_cursor(3) for more information).


RECNO

       The  recno  access  method  provides support for fixed and
       variable length records, optionally backed by a flat  text
       (byte  stream)  file.   Both  fixed  and  variable  length
       records are accessed by their logical record number.

       The logical record numbers are mutable and change as lines
       are  added to and deleted from the file.  For example, the
       existence of record number five requires the existence  of
       records  one through four, and the deletion of record num-
       ber one causes records numbered two  through  five  to  be
       renumbered  to be records numbered one through four.  If a
       cursor were positioned after record number one,  it  would
       be  shifted down one logical record as well, continuing to
       reference the same record as  before.   For  this  reason,
       concurrent access to a recno database may be largely mean-
       ingless, although it is supported.

       Using the c_put or put interfaces to  create  new  records
       will  cause the creation of multiple, empty records if the
       record number is more than one greater  than  the  largest
       record  currently  in the database.  For example, the cre-
       ation of record number five, when records one through four
       do  not  exist,  causes  their logical creation with zero-
       length data.  If the created record is not at the  end  of
       the database, all records following the new record will be
       automatically renumbered.
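
       The renumbering and gap-filling behavior described above
       can be pictured with a toy in-memory model.  This is not
       DB code: the structure and the functions model_put and
       model_del are hypothetical and exist only to illustrate
       how logical record numbers shift.

```c
#include <string.h>

#define MAX_RECS 16

/* Toy model: rec[i] holds logical record number i + 1. */
struct recno_model {
	const char *rec[MAX_RECS];
	unsigned int count;
};

/*
 * Store data as record `recno`, first creating zero-length records
 * to fill any gap.  Returns 0 on success, -1 if recno is out of range.
 */
int
model_put(struct recno_model *m, unsigned int recno, const char *data)
{
	if (recno == 0 || recno > MAX_RECS)
		return (-1);
	while (m->count < recno)
		m->rec[m->count++] = "";	/* zero-length filler */
	m->rec[recno - 1] = data;
	return (0);
}

/* Delete record `recno`; every later record moves down by one. */
int
model_del(struct recno_model *m, unsigned int recno)
{
	if (recno == 0 || recno > m->count)
		return (-1);
	memmove(&m->rec[recno - 1], &m->rec[recno],
	    (m->count - recno) * sizeof(m->rec[0]));
	m->count--;
	return (0);
}
```

       In this model, storing record five into an empty set
       creates records one through four with zero-length data,
       and deleting record one afterwards leaves the stored data
       at record number four.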

       The following additional fields and flags may be  initial-
       ized  before  calling db_open, when using the recno access
       method:

       int re_delim;
            For variable length records, if the re_source file is
            specified  and  the  DB_DELIMITER  flag  is  set, the
            delimiting byte used to mark the end of a  record  in
            the  source file.  If the re_source file is specified
            and the DB_DELIMITER flag is not set, <newline> char-
            acters (i.e. ``\n'', 0x0a) are interpreted as end-of-
            record markers.

       u_int32_t re_len;
            The length of a fixed-length record.

       int re_pad;
            For fixed length records, if the DB_PAD flag is  set,
            the  pad  character for short records.  If the DB_PAD
            flag is not set, <space> characters (i.e., 0x20)  are
            used for padding.

       char *re_source;
            The purpose of the re_source field is to provide fast
            access and modification to databases  that  are  nor-
            mally  stored  as  flat text files.  In this case, no
            index is maintained across calls to db_open.

            If the re_source field is non-NULL, it  specifies  an
            underlying  flat  text  database file that is read to
            initialize a transient record number index.   In  the
            case of variable length records, the records are sep-
            arated by the  byte  value  re_delim.   For  example,
            standard UNIX byte stream files can be interpreted as
            a sequence of variable length  records  separated  by
            <newline> characters.

            In addition, when cached data would normally be writ-
            ten back to the underlying database file  (e.g.,  the
            close  or  sync  functions are called), the in-memory
            copy of the database is written back to the re_source
            file.  When the close function is called, the in-mem-
            ory copy of the database is discarded.

            Because there is no  meta-data  associated  with  the
            underlying  source  file,  any  differences  from the
            default values (e.g., fixed  record  length  or  byte
            separator  value)  must  be explicitly specified each
            time the file is opened.

            Because the close and sync functions write a  backing
            file  that is not transactionally protected, it is an
            error to specify a  re_source  file  and  either  the
            DB_THREAD  flag  or  a  non-NULL tx_info field in the
            DB_ENV argument to db_open.

            The re_source file must already  exist  (but  may  be
            zero-length) when db_open is called.

       unsigned long flags;
            The  following  additional  flags may be specified by
            or'ing together one or more of the following values:

            DB_DELIMITER
                 The re_delim field is set.

            DB_FIXEDLEN
                 The records are fixed-length,  not  byte  delim-
                 ited.   The  structure  element re_len specifies
                 the length of the record, and the structure ele-
                 ment re_pad is used as the pad character.

                 Any  records added to the database that are less
                 than re_len bytes long are automatically padded.
                 Any  attempt to insert records into the database
                 that are greater than  re_len  bytes  long  will
                 cause the call to fail immediately and return an
                 error.

            DB_PAD
                 The re_pad field is set.

            DB_SNAPSHOT
                 This flag requires that a copy of any  specified
                 re_source file be taken immediately when db_open
                 is called.  If this flag is  not  specified,  DB
                 may  choose  to retrieve unmodified records from
                 the re_source file  (modified  records  must  be
                 stored  elsewhere  since  they could potentially
                 cause the re_source file to change in size).
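
       The DB_FIXEDLEN rules above (short records are padded,
       long records are rejected) can be modelled as follows.
       This is a sketch of the described behavior, not DB code,
       and the name fixed_record is hypothetical:

```c
#include <string.h>

/*
 * Hypothetical model of DB_FIXEDLEN handling: build an re_len byte
 * record in dst from src, padding a short record with re_pad.
 * Returns 0 on success, -1 if src is longer than re_len.
 */
int
fixed_record(unsigned char *dst, size_t re_len,
    const void *src, size_t src_len, int re_pad)
{
	if (src_len > re_len)
		return (-1);		/* too long: reject */
	memcpy(dst, src, src_len);
	memset(dst + src_len, re_pad, re_len - src_len);
	return (0);
}
```

       With re_len set to 8 and the default <space> pad, the
       three byte record ``abc'' becomes ``abc'' followed by
       five spaces.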



KEY/DATA PAIRS

       Storage and retrieval for the access methods are based  on
       key/data  pairs.   Key and data byte strings may reference
       strings of essentially unlimited length, although any  two
       keys  must  fit  into available memory at the same time so
       that they may be compared and any one data item  must  fit
       into available memory so that it may be returned.

       The access methods provide no guarantees about byte string
       alignment, and applications are responsible for  maintain-
       ing  any necessary alignment.  Use the DB_DBT_USERMEM flag
       to cause returned items to be placed in  memory  of  arbi-
       trary alignment.

       Both  keys  and data are represented by the following data
       structure:

       typedef struct {
              void *data;
              u_int32_t size;
              u_int32_t ulen;
              u_int32_t dlen;
              u_int32_t doff;
              u_int32_t flags;
       } DBT;

       In order to ensure compatibility with future  releases  of
       DB,  all  fields of the DBT structure that are not explic-
       itly set should be initialized to 0 before the first  time
       the structure is used.  Do this by declaring the structure
       external or static, or by calling the  C  library  routine
       bzero(3) or memset(3).
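
       A minimal sketch of this initialization pattern follows.
       The DBT typedef is a local stand-in that copies the
       layout shown above (with uint32_t in place of u_int32_t)
       so that the example builds without <db.h>, and the helper
       name dbt_set is hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in for the DBT declared in <db.h>. */
typedef struct {
	void *data;
	uint32_t size;
	uint32_t ulen;
	uint32_t dlen;
	uint32_t doff;
	uint32_t flags;
} DBT;

/* Zero every field, then set only the two the caller cares about. */
void
dbt_set(DBT *dbt, void *data, uint32_t size)
{
	memset(dbt, 0, sizeof(*dbt));
	dbt->data = data;
	dbt->size = size;
}
```

       The same memset(3) call applies to the DB_ENV and DB_INFO
       structures before their fields are set.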

       By  default, the flags structure element is expected to be
       0.  In this default case, when being  provided  a  key  or
       data  item  by the application, the DB package expects the
       data structure element to point to a byte string  of  size
       bytes.  When returning a key/data item to the application,
       the DB package will store into the data structure  element
       a pointer to a byte string of size bytes.  By default, the
       memory referenced by this stored  pointer  is  only  valid
       until  the next call to the DB package using the DB handle
       returned by db_open.

       The elements of the DBT structure are defined as follows:

       void *data;
            A pointer to a byte string.

       u_int32_t size;
            The length of data, in bytes.

       u_int32_t ulen;
            The size of the user's buffer (referenced  by  data),
            in  bytes.   This  location  is not written by the DB
            functions.  See  the  DB_DBT_USERMEM  flag  for  more
            information.

       u_int32_t dlen;
            The  length of the partial record being read or writ-
            ten  by  the  application,   in   bytes.    See   the
            DB_DBT_PARTIAL flag for more information.

       u_int32_t doff;
            The  offset of the partial record being read or writ-
            ten  by  the  application,   in   bytes.    See   the
            DB_DBT_PARTIAL flag for more information.

       u_int32_t flags;
            The  flags  value is specified by or'ing together one
            or more of the following values:

            DB_DBT_MALLOC
                 Ignored except when retrieving information  from
                 a  database, e.g., a get call.  This flag causes
                 DB to allocate memory for the  returned  key  or
                 data item (using malloc(3)) and return a pointer
                 to it in the data field of the key or  data  DBT
                 structure.   The  allocated  memory  becomes the
                 responsibility of the calling  application.   It
                 is  an  error  to specify both DB_DBT_MALLOC and
                 DB_DBT_USERMEM.

            DB_DBT_USERMEM
                 Ignored except when retrieving information  from
                 a database, e.g., a get call.  The data field of
                 the key or data structure must reference  memory
                 that  is  at least ulen bytes in length.  If the
                 length of the requested item  is  less  than  or
                 equal  to  that  number  of  bytes,  the item is
                 copied into the memory referenced  by  the  data
                 field.   Otherwise,  an  error  is returned, the
                 size field is set to the length needed  for  the
                 requested item, and the errno variable is set to
                 ENOMEM.   It  is  an  error  to   specify   both
                 DB_DBT_MALLOC and DB_DBT_USERMEM.

            DB_DBT_PARTIAL
                 Ignored except when specified for a data parame-
                 ter,  where  this  flag   causes   the   partial
                 retrieval or storage of an item.  If the calling
                 application is  doing  a  get,  the  dlen  bytes
                 starting  doff  bytes  from the beginning of the
                 retrieved data record are returned  as  if  they
                 comprised  the  entire record.  If the specified
                 bytes do not exist in the  record,  the  get  is
                 successful, and 0 bytes are returned.

                 For  example, if the data portion of a retrieved
                 record was 100 bytes, and  a  partial  retrieval
                 was  done  using a DBT having a dlen field of 20
                 and a doff field of 85, the get call would  suc-
                 ceed, the data field would reference the last 15
                 bytes of the record, and the size field would be
                 set to 15.

                 If  the  calling application is doing a put, the
                 dlen bytes starting doff bytes from  the  begin-
                 ning  of  the  specified  key's  data record are
                 replaced by the data specified by the  data  and
                 size  structure  elements.   If  dlen is smaller
                 than size, the record will grow, and if dlen  is
                 larger  than  size,  the record will shrink.  If
                 the specified bytes do  not  exist,  the  record
                 will  be  extended using nul bytes as necessary,
                 and the put call will succeed.

                 It is an error to attempt a  partial  put  using
                 the  db_open returned put function in a database
                 that supports duplicate records.   Partial  puts
                 in  databases  supporting duplicate records must
                 be done using a db_cursor function.   It  is  an
                 error  to  attempt  a partial put with differing
                 dlen and size values in a  recno  database  with
                 fixed-length records.

                 For  example, if the data portion of a retrieved
                 record was 100 bytes, and a  partial  store  was
                 done  using  a  DBT having a dlen field of 20, a
                 doff field of 85, and a size field  of  30,  the
                 resulting  record  would be 115 bytes in length,
                 where the last 30 bytes would be those specified
                 by the put call.
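
       The arithmetic in the DB_DBT_PARTIAL examples above can
       be checked with the following sketch.  Both functions are
       hypothetical models of the stated rules and are not part
       of DB:

```c
#include <stddef.h>

/*
 * Number of bytes a partial get returns from a record of rec_len
 * bytes when dlen bytes are requested starting at offset doff.
 */
size_t
partial_get_size(size_t rec_len, size_t doff, size_t dlen)
{
	if (doff >= rec_len)
		return (0);		/* requested bytes do not exist */
	return (dlen < rec_len - doff ? dlen : rec_len - doff);
}

/*
 * Length of a record of rec_len bytes after a partial put replaces
 * the dlen bytes at offset doff with `size` new bytes.
 */
size_t
partial_put_size(size_t rec_len, size_t doff, size_t dlen, size_t size)
{
	size_t tail_start = doff + dlen;	/* first byte kept after the hole */
	size_t tail = tail_start < rec_len ? rec_len - tail_start : 0;

	/* Bytes before doff (nul-extended if needed), new data, tail. */
	return (doff + size + tail);
}
```

       For the 100 byte record used above, a get with a doff of
       85 and a dlen of 20 gives 15 bytes, and a put with a doff
       of 85, a dlen of 20 and a size of 30 gives a 115 byte
       record.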

       When multiple threads are using the returned DB handle
       concurrently, either the DB_DBT_MALLOC or the
       DB_DBT_USERMEM flag must be specified for any DBT used for
       key or data retrieval.
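
       With DB_DBT_USERMEM, each caller supplies its own buffer,
       and the outcome depends on the ulen and size fields as
       described under that flag.  The sketch below is a toy
       model of that contract, not DB code.  The DBT typedef is
       a local stand-in (uint32_t in place of u_int32_t), the
       name usermem_copy is hypothetical, and returning ENOMEM
       directly is an assumption made for the example:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the DBT declared in <db.h>. */
typedef struct {
	void *data;
	uint32_t size;
	uint32_t ulen;
	uint32_t dlen;
	uint32_t doff;
	uint32_t flags;
} DBT;

/*
 * Copy an item into the caller's buffer if it fits in ulen bytes.
 * Otherwise leave the buffer alone, report the length needed in the
 * size field, and return ENOMEM.
 */
int
usermem_copy(DBT *dbt, const void *item, uint32_t item_len)
{
	dbt->size = item_len;	/* item length, or length needed */
	if (item_len > dbt->ulen)
		return (ENOMEM);
	memcpy(dbt->data, item, item_len);
	return (0);
}
```

       A caller that receives ENOMEM can enlarge its buffer to
       at least size bytes, update ulen, and retry.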

       The data part of the key/data pair used to access fixed
       and variable length records (the recno access method) is
       the same as for the other access methods.  The key, used to
       specify the logical record number, is different.

       In  the case of the recno access method, the data field of
       the key  is  a  pointer  to  a  memory  location  of  type
       db_recno_t,  typedef'd  in  the <db.h> include file.  This
       type is normally the largest unsigned integral type avail-
       able  to  the  implementation.   The size field of the key
       should   be    the    size    of    that    type,    e.g.,
       ``sizeof(db_recno_t)''.
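
       As an illustration of the key layout just described, the
       following sketch fills in a key for a record number.  So
       that it compiles without <db.h>, it uses stand-in types:
       model_recno_t and model_dbt are hypothetical substitutes
       for db_recno_t and DBT, and the 32-bit width chosen here is
       an assumption, not a statement about the library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Stand-ins so the sketch is self-contained.  Real code would include
 * <db.h> and use its db_recno_t and DBT definitions instead.
 */
typedef uint32_t model_recno_t;     /* assumed width, for illustration */
typedef struct {
    void     *data;                 /* address of the key bytes */
    uint32_t  size;                 /* length of the key bytes */
} model_dbt;

/* Point key at the caller's record number and set size to its width. */
void
model_recno_key(model_dbt *key, model_recno_t *recnop)
{
    assert(key != NULL && recnop != NULL);
    memset(key, 0, sizeof(*key));   /* clear fields that are not set below */
    key->data = recnop;
    key->size = sizeof(*recnop);
}
```

       The record number lives in the caller's storage, so it
       must remain valid for as long as the key is in use.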


DB OPERATIONS

       The  DB structure returned by db_open describes a database
       type, and includes a set of functions to  perform  various
       actions,  as  described  below.   Each  of these functions
       takes a pointer to a DB structure, and  may  take  one  or
       more  DBT *'s and a flag value as well.  Individual access
       methods may specify additional functions and  flags  which
       are  specific  to the method.  The fields of the DB struc-
       ture are as follows:

       DBTYPE type;
            The type of the underlying access  method  (and  file
            format).    Set   to  one  of  DB_BTREE,  DB_HASH  or
            DB_RECNO.  This field may be used  to  determine  the
            type of the database after a return from db_open with
            the type argument set to DB_UNKNOWN.

       int (*close)(DB *db, int flags);
            A pointer to a function to flush any cached  informa-
            tion  to  disk,  close  any open cursors (see db_cur-
            sor(3)), free any allocated resources, and close  any
            underlying files.  Since key/data pairs are cached in
            memory, failing to sync the file with  the  close  or
            sync  function  may  result  in  inconsistent or lost
            information.

            The flags parameter must be set to 0 or the following
            value:

            DB_NOSYNC
                 Do not flush cached information to disk.

            The  DB_NOSYNC flag is a dangerous option.  It should
            only be set if the application is doing logging (with
            or  without  transactions)  so  that  the database is
            recoverable after a system or application  crash,  or
            if  the  database  is  always  generated from scratch
            after any system or application crash.

            It is important to understand that flushing cached
            information to disk only minimizes the window of
            opportunity for corrupted data.  While unlikely, it
            is possible for database corruption to happen if a
            system or application crash occurs while writing data
            to the database.  To ensure that database corruption
            never occurs, applications must either use logging
            to guarantee recoverability, or edit a copy of the
            database and, once all applications using the
            database have successfully called close, replace the
            original database with the updated copy.

            When multiple threads are using the DB handle concur-
            rently,  only  a single thread may call the DB handle
            close function.

            The close function returns  the  value  of  errno  on
            failure and 0 on success.

       int (*cursor)(DB *db, DB_TXN *txnid, DBC **cursorp);
            A pointer to a function to create a cursor and copy a
            pointer to it into the memory referenced by  cursorp.

            A cursor is a structure used to provide sequential
            access through a database.  This interface and its
            associated functions replace the functionality
            provided by the seq function in previous releases of
            the DB library.


            If the file is being accessed under transaction
            protection, the txnid parameter is a transaction ID
            returned from txn_begin; otherwise, it is NULL.  If
            transaction protection is enabled, cursors must be
            opened and closed within the context of a transaction,
            and the txnid parameter specifies the transaction
            context in which the cursor may be used.  See
            db_cursor(3) for more information.

            The cursor function returns the  value  of  errno  on
            failure and 0 on success.

       int (*del)(DB *db, DB_TXN *txnid, DBT *key, int flags);
            A pointer to a function to remove key/data pairs from
            the database.  The key/data pair associated with  the
            specified key is discarded from the database.  In the
            presence of duplicate key values, all records associ-
            ated with the designated key will be discarded.

            If the file is being accessed under transaction
            protection, the txnid parameter is a transaction ID
            returned from txn_begin; otherwise, it is NULL.

            The  flags parameter is currently unused, and must be
            set to 0.

            The del function returns the value of errno on  fail-
            ure,  0  on success, and DB_NOTFOUND if the specified
            key did not exist in the file.

       int (*fd)(DB *db, int *fdp);
            A pointer to a function that copies a file descriptor
            representative  of  the  underlying database into the
            memory referenced by fdp.  A file  descriptor  refer-
            encing  the  same  file  will be returned to all pro-
            cesses that call db_open with the same file argument.
            This  file  descriptor may be safely used as an argu-
            ment to the fcntl(2) and flock(2) locking  functions.
            The  file  descriptor  is  not necessarily associated
            with any of the underlying files used by  the  access
            method.

            The  fd  function was introduced in early versions of
            DB, before the lock manager was added, to  support  a
            coarse-grained  form of locking.  Applications should
            be converted to use the lock manager where  possible,
            and this interface should not be used by new applica-
            tions.

            The fd function returns the value of errno on failure
            and 0 on success.

       int (*get)(DB *db, DB_TXN *txnid,
                 DBT *key, DBT *data, int flags);
            A  pointer  to  a  function  that is an interface for
            keyed retrieval from the database.  The  address  and
            length  of the data associated with the specified key
            are returned in the structure referenced by data.

            In the presence of duplicate key values, get will
            return the first data item for the designated key.
            Duplicates are sorted by insert order except where
            this order has been overridden by cursor operations.
            Retrieval of the remaining duplicates requires the
            use of cursor operations.  See db_cursor(3) for
            details.

            If the file is being accessed under transaction
            protection, the txnid parameter is a transaction ID
            returned from txn_begin; otherwise, it is NULL.

            The  flags parameter is currently unused, and must be
            set to 0.

            The get function returns the value of errno on  fail-
            ure, 0 on success, and DB_NOTFOUND if the key was not
            found.
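
            The three-way return convention can be handled as in
            the sketch below.  It is a stand-alone model:
            MODEL_DB_NOTFOUND is a placeholder for the DB_NOTFOUND
            value that <db.h> defines, and classify_get is a
            hypothetical helper.  The sketch assumes that
            DB_NOTFOUND is distinct from 0 and from every errno
            value.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Placeholder only: the real DB_NOTFOUND value comes from <db.h>. */
#define MODEL_DB_NOTFOUND (-1)

typedef enum { GET_OK, GET_MISSING, GET_FAILED } get_outcome;

/*
 * Sort a get-style return value into the three documented cases and
 * store a short description of it in *msgp.
 */
get_outcome
classify_get(int ret, const char **msgp)
{
    assert(msgp != NULL);
    if (ret == 0) {
        *msgp = "success";
        return (GET_OK);
    }
    if (ret == MODEL_DB_NOTFOUND) {
        *msgp = "key not found";
        return (GET_MISSING);
    }
    /* Anything else is treated as an errno value. */
    *msgp = strerror(ret);
    return (GET_FAILED);
}
```

            A missing key is an expected outcome rather than a
            failure, which is why the model reports it separately
            from the errno case.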

       int (*put)(DB *db, DB_TXN *txnid,
                 DBT *key, DBT *data, int flags);
            A pointer to a function to store  key/data  pairs  in
            the  database.   If the database supports duplicates,
            the put function adds the new data value at  the  end
            of the duplicate set.

            If the file is being accessed under transaction
            protection, the txnid parameter is a transaction ID
            returned from txn_begin; otherwise, it is NULL.

            The flags parameter must be set to 0 or the following
            value:

            DB_NOOVERWRITE
                 Enter the new key/data pair only if the key does
                 not already appear in the database.

            The  default behavior of the put function is to enter
            the  new  key/data  pair,  replacing  any  previously
            existing  key if duplicates are disallowed, or to add
            a duplicate entry if duplicates are allowed.  Even if
            the  designated database allows duplicates, a call to
            put with the DB_NOOVERWRITE flag set will fail if the
            key already exists in the database.

            The  put function returns the value of errno on fail-
            ure, 0 on success, and DB_KEYEXIST if the  DB_NOOVER-
            WRITE  flag was set and the key already exists in the
            file.
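
            The interaction between duplicates and DB_NOOVERWRITE
            can be summarized with a toy in-memory model.  Nothing
            in the sketch below is DB library code: model_db and
            model_put are hypothetical, and MODEL_NOOVERWRITE and
            MODEL_KEYEXIST are placeholders for the DB_NOOVERWRITE
            and DB_KEYEXIST values defined in <db.h>.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MODEL_NOOVERWRITE 0x1   /* placeholder for DB_NOOVERWRITE */
#define MODEL_KEYEXIST    (-1)  /* placeholder for DB_KEYEXIST */
#define MODEL_MAXPAIRS    16

/* A toy store: parallel arrays of borrowed key and data strings. */
typedef struct {
    int         allow_dups;     /* nonzero if duplicates are allowed */
    int         npairs;
    const char *keys[MODEL_MAXPAIRS];
    const char *vals[MODEL_MAXPAIRS];
} model_db;

int
model_put(model_db *db, const char *key, const char *val, int flags)
{
    int i, found = -1;

    assert(db != NULL && key != NULL && val != NULL);
    for (i = 0; i < db->npairs; i++)
        if (strcmp(db->keys[i], key) == 0) {
            found = i;
            break;
        }

    /* The flag wins even when the store allows duplicates. */
    if (found >= 0 && (flags & MODEL_NOOVERWRITE))
        return (MODEL_KEYEXIST);

    /* Duplicates disallowed: replace the existing data. */
    if (found >= 0 && !db->allow_dups) {
        db->vals[found] = val;
        return (0);
    }

    /* New key, or a duplicate added after the existing entries. */
    if (db->npairs == MODEL_MAXPAIRS)
        return (ENOSPC);
    db->keys[db->npairs] = key;
    db->vals[db->npairs] = val;
    db->npairs++;
    return (0);
}
```

            In the model, a second put of an existing key replaces
            the data when duplicates are disallowed, appends when
            they are allowed, and returns the key-exists value in
            both configurations when the no-overwrite flag is set.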

       int (*sync)(DB *db, int flags);
            A pointer to a function to flush any cached  informa-
            tion to disk.  If the database is in memory only, the
            sync function has no effect and will always  succeed.

            The  flags parameter is currently unused, and must be
            set to 0.

            See the close function description above for  a  dis-
            cussion of DB and cached data.

            The sync function returns the value of errno on fail-
            ure and 0 on success.

       int (*stat)(DB *db, void *gsp, void *lsp,
                 void *(*db_malloc)(size_t));
            A pointer to a function to create statistical  struc-
            tures  and  copy pointers to them into user-specified
            memory locations.

            In the presence  of  multiple  threads  or  processes
            accessing  an  active database, the returned informa-
            tion can be out-of-date.  This  function  may  access
            all  of  the pages in the database, and therefore may
            incur a severe performance penalty and  have  obvious
            negative effects on the underlying buffer pool.

            If gsp is non-NULL, a pointer to the global statistics
            for the database is copied into the memory location
            it references.  If lsp is non-NULL, a pointer to the
            per-DB-handle statistics for the database is copied
            into the memory location it references.  Calls to the
            sync function aggregate local statistics with global
            statistics and reinitialize the local statistics to 0.
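
            The aggregation step can be pictured with the
            following model.  It is only a sketch of the
            bookkeeping described above: model_counts and
            model_sync_counts are hypothetical, and only three of
            the counters are represented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A reduced stand-in for the per-handle and global counters. */
typedef struct {
    uint32_t bt_split;
    uint32_t bt_added;
    uint32_t bt_deleted;
} model_counts;

/* Fold the local counters into the global ones, then reset the local ones. */
void
model_sync_counts(model_counts *global, model_counts *local)
{
    assert(global != NULL && local != NULL);
    global->bt_split += local->bt_split;
    global->bt_added += local->bt_added;
    global->bt_deleted += local->bt_deleted;
    memset(local, 0, sizeof(*local));
}
```

            One consequence is that local statistics read just
            after a sync describe only the activity since that
            sync.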

            The statistical structures are created in allocated
            memory.  If db_malloc is non-NULL, it is called to
            allocate the memory; otherwise, the library malloc(3)
            function is used.  The function db_malloc must match
            the calling conventions of the malloc(3) library
            routine.  The caller is responsible for deallocating
            this memory.
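
            A function with the required calling convention might
            look like the sketch below.  The names counting_malloc,
            model_malloc_fn, model_alloc and model_bytes_requested
            are hypothetical.  model_alloc only mirrors the
            documented fallback rule and is not how the library is
            implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

size_t model_bytes_requested;   /* running total, for illustration only */

/* Same calling convention as malloc(3): takes a size, returns void *. */
void *
counting_malloc(size_t size)
{
    model_bytes_requested += size;
    return (malloc(size));
}

/* The parameter type that the stat prototype declares for db_malloc. */
typedef void *(*model_malloc_fn)(size_t);

/* Use the supplied allocator if it is non-NULL, otherwise malloc(3). */
void *
model_alloc(model_malloc_fn db_malloc, size_t size)
{
    assert(size > 0);
    return (db_malloc != NULL ? db_malloc(size) : malloc(size));
}
```

            Because this allocator obtains its memory from
            malloc(3), the caller would release the result with
            free(3).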

            In the case of a btree or recno database, the  global
            statistics   are   stored  in  a  structure  of  type
            DB_BTREE_STAT (typedef'd in <db.h>).   The  following
            fields will be filled in:

            u_int32_t bt_pagesize;
                 Underlying tree page size.
            u_int32_t bt_levels;
                 Number of levels in the tree.
            u_int32_t bt_nrecs;
                 Number  of  data  items in the tree (since there
                 may be multiple data items per key, this  number
                 may not be the same as the number of keys).
            u_int32_t bt_int_pg;
                 Number of tree internal pages.
            u_int32_t bt_leaf_pg;
                 Number of tree leaf pages.
            u_int32_t bt_dup_pg;
                 Number of tree duplicate pages.
            u_int32_t bt_over_pg;
                 Number of tree overflow pages.
            u_int32_t bt_free;
                 Number of pages on the free list.
            u_int32_t bt_freed;
                 Number of pages made available for reuse because
                 they were emptied.
            u_int32_t bt_int_pgfree;
                 Number of bytes free in tree internal pages.
            u_int32_t bt_leaf_pgfree;
                 Number of bytes free in tree leaf pages.
            u_int32_t bt_dup_pgfree;
                 Number of bytes free in tree duplicate pages.
            u_int32_t bt_over_pgfree;
                 Number of bytes free in tree overflow pages.
            u_int32_t bt_pfxsaved;
                 Number of bytes saved by prefix compression.
            u_int32_t bt_split;
                 Total number of tree page splits (includes  fast
                 and root splits).
            u_int32_t bt_rootsplit;
                 Number of root page splits.
            u_int32_t bt_fastsplit;
                 Number  of  fast  splits.   When sorted keys are
                 added to the database, the DB btree  implementa-
                 tion  will  split  left or right to increase the
                 page-fill factor.  This number is a  measure  of
                 how  often it was possible to make such a split.
            u_int32_t bt_added;
                 Number of keys added.
            u_int32_t bt_deleted;
                 Number of keys deleted.
            u_int32_t bt_get;
                 Number of keys retrieved.  (Note, when  returned
                 as  part  of  the  global statistics, this value
                 will not reflect any  keys  retrieved  when  the
                 database was open for read-only access.)
            u_int32_t bt_cache_hit;
                 Number  of  hits in tree fast-insert code.  When
                 sorted keys are added to the  database,  the  DB
                 btree  implementation  will  check the last page
                 where an insert occurred  before  doing  a  full
                 lookup.   This  number is a measure of how often
                 the lookup was successful.
            u_int32_t bt_cache_miss;
                 Number of misses in tree fast-insert code.   See
                 the  description of bt_cache_hit; this number is
                 a measure of how often the lookup failed.
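
            Some of these fields are most useful in combination.
            The sketch below derives two ratios from them.  Both
            helpers are hypothetical, and the fill calculation
            assumes that every leaf page is bt_pagesize bytes and
            that page overhead counts as used space.

```c
#include <assert.h>
#include <stdint.h>

/* Fraction of leaf-page bytes in use: 1 - free / (pages * pagesize). */
double
model_leaf_fill(uint32_t bt_pagesize, uint32_t bt_leaf_pg,
    uint32_t bt_leaf_pgfree)
{
    double total = (double)bt_pagesize * (double)bt_leaf_pg;

    if (total == 0.0)
        return (0.0);           /* no leaf pages, nothing to report */
    assert((double)bt_leaf_pgfree <= total);
    return (1.0 - (double)bt_leaf_pgfree / total);
}

/* Fraction of fast-insert lookups that succeeded. */
double
model_hit_ratio(uint32_t bt_cache_hit, uint32_t bt_cache_miss)
{
    double lookups = (double)bt_cache_hit + (double)bt_cache_miss;

    if (lookups == 0.0)
        return (0.0);           /* no lookups recorded */
    return ((double)bt_cache_hit / lookups);
}
```

            For example, 10 leaf pages of 4096 bytes with 10240
            bytes free give a fill of 0.75, and 3 hits against
            1 miss give a hit ratio of 0.75.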

            In the case of a btree or recno database,  the  local
            statistics   are   stored  in  a  structure  of  type
            DB_BTREE_LSTAT (typedef'd in <db.h>).  The  following
            fields will be filled in:

            u_int32_t bt_split;
                 Total  number of tree page splits (includes fast
                 and root splits).
            u_int32_t bt_rootsplit;
                 Number of root page splits.
            u_int32_t bt_fastsplit;
                 Number of fast splits.   When  sorted  keys  are
                 added  to the database, the DB btree implementa-
                 tion will split left or right  to  increase  the
                 page-fill  factor.   This number is a measure of
                 how often it was possible to make such a  split.
            u_int32_t bt_added;
                 Number of keys added.
            u_int32_t bt_deleted;
                 Number of keys deleted.
            u_int32_t bt_get;
                 Number of keys retrieved.
            u_int32_t bt_pgdeleted;
                 Number  of  pages deleted because they were emp-
                 tied.
            u_int32_t bt_cache_hit;
                 Number of hits in tree fast-insert  code.   When
                 sorted  keys  are  added to the database, the DB
                 btree implementation will check  the  last  page
                 where  an  insert  occurred  before doing a full
                 lookup.  This number is a measure of  how  often
                 the lookup was successful.
            u_int32_t bt_cache_miss;
                 Number  of misses in tree fast-insert code.  See
                 the description of bt_cache_hit; this number  is
                 a measure of how often the lookup failed.


ENVIRONMENT VARIABLES

       The  following  environment variables affect the execution
       of db_open:

       DB_HOME
            If the dbenv argument to db_open was initialized
            using db_appinit, the environment variable DB_HOME
            may be used as the path of the database home for the
            interpretation of the file argument to db_open, as
            described in db_appinit(3).  Specifically, db_open is
            affected by the configuration string value of
            DB_DATA_DIR.


EXAMPLES

       Applications that create short-lived databases, which are
       discarded or recreated when the system fails, and that are
       unconcerned with concurrent access and loss of data due to
       catastrophic failure may wish to use the db_open
       functionality without other parts of the DB library.  Such
       applications will only be concerned with the DB access
       methods.  The DB access methods will use the memory pool
       subsystem, but the application is unlikely to be aware of
       this.  See the file examples/ex_access.c in the DB source
       distribution for a C language code example of how such an
       application might use the DB library.


ERRORS

       The db_open function may fail and return errno for any  of
       the  errors  specified  for  the  following DB and library
       functions:  close(2),   fcntl(2),   fstat(2),   getpid(2),
       mmap(2), munmap(2), open(2), read(2), unlink(2), abort(3),
       calloc(3),  db->sync,   fflush(3),   free(3),   getenv(3),
       isdigit(3),    lock_get(3),    lock_id(3),    lock_put(3),
       lock_vec(3),  log_register(3),   log_unregister(3),   mal-
       loc(3),    memcpy(3),    memp_close(3),    memp_fclose(3),
       memp_fget(3), memp_fopen(3),  memp_fput(3),  memp_fset(3),
       memp_fsync(3),  memp_open(3), memp_register(3), memset(3),
       sigfillset(3),   sigprocmask(3),    stat(3),    strcpy(3),
       strdup(3),    strerror(3),   strlen(3),   t->re_irec   and
       vsnprintf(3).

       In addition, the db_open  function  may  fail  and  return
       errno for the following conditions:

       [EAGAIN]
            A lock was unavailable.

       [EINVAL]
            An  invalid  flag  value  or  parameter was specified
            (e.g., unknown database type, page size,  hash  func-
            tion,  recno pad byte, byte order) or a flag value or
            parameter that is incompatible with the current  file
            specification.

            TMPDIR If the dbenv argument to db_open was NULL or
            not initialized using db_appinit, the environment
            variable TMPDIR may be used as the directory in which
            to create the , as described in the db_open section
            above.

            There is a mismatch between the version number of
            the file and the software.

            A  re_source  file  was  specified  with  either  the
            DB_THREAD  flag  or  a  non-NULL tx_info field in the
            DB_ENV argument to db_open.

       [ENOENT]
            A non-existent re_source file was specified.

       [EPERM]
            Database corruption  was  detected.   All  subsequent
            database  calls  (other  than  db->close) will return
            EPERM.

       The db->close function may fail and return errno  for  any
       of  the  errors specified for the following DB and library
       functions:  close(2),  fcntl(2),   getpid(2),   munmap(2),
       open(2),  unlink(2),  abort(3),  db->db_malloc,  db->sync,
       fflush(3),  fprintf(3),  free(3),  getenv(3),  isdigit(3),
       lock_get(3),  lock_put(3),  lock_vec(3),  log_put(3), mal-
       loc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3),
       memp_fset(3),  memset(3),  realloc(3), sigfillset(3), sig-
       procmask(3), snprintf(3), stat(3),  strcpy(3),  strdup(3),
       strerror(3), strlen(3) and vsnprintf(3).

       The  db->cursor function may fail and return errno for any
       of the errors specified for the following DB  and  library
       functions: free(3).


       In  addition,  the db->cursor function may fail and return
       errno for the following conditions:

       [EINVAL]
            An invalid flag value or parameter was specified.

       [EPERM]
            Database corruption  was  detected.   All  subsequent
            database  calls  (other  than  db->close) will return
            EPERM.

       The db->del function may fail and return errno for any  of
       the  errors  specified  for  the  following DB and library
       functions: db->db_malloc, fflush(3), fprintf(3),  free(3),
       lock_get(3),  lock_put(3),  lock_vec(3),  log_put(3), mal-
       loc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3),
       memp_fset(3), memset(3), realloc(3) and vsnprintf(3).


       In  addition,  the  db->del  function  may fail and return
       errno for the following conditions:

       [EAGAIN]
            A lock was unavailable.

       [EINVAL]
            An invalid flag value or parameter was specified.

       [EPERM]
            Database corruption  was  detected.   All  subsequent
            database  calls  (other  than  db->close) will return
            EPERM.


       The db->fd function may fail and return errno for the
       following conditions:

       [ENOENT]
            The  db->fd  function  was  called  for  an in-memory
            database, or no underlying file has yet been created.

       [EPERM]
            Database  corruption  was  detected.   All subsequent
            database calls (other  than  db->close)  will  return
            EPERM.

       The  db->get function may fail and return errno for any of
       the errors specified for  the  following  DB  and  library
       functions:     db->db_malloc,    fflush(3),    fprintf(3),
       lock_get(3),  lock_put(3),  lock_vec(3),  malloc(3),  mem-
       cpy(3),   memp_fget(3),   memp_fput(3),   realloc(3)   and
       vsnprintf(3).


       In addition, the db->get  function  may  fail  and  return
       errno for the following conditions:

       [EAGAIN]
            A lock was unavailable.

       [EINVAL]
            An invalid flag value or parameter was specified.

            The DB_THREAD flag was specified to the db_open(3)
            function and neither the DB_DBT_MALLOC nor the
            DB_DBT_USERMEM flag was set in the DBT.

            A record number of 0 was specified.

       [EPERM]
            Database  corruption  was  detected.   All subsequent
            database calls (other  than  db->close)  will  return
            EPERM.

       The  db->put function may fail and return errno for any of
       the errors specified for  the  following  DB  and  library
       functions:  db->db_malloc, fflush(3), fprintf(3), free(3),
       lock_get(3), lock_put(3),  lock_vec(3),  log_put(3),  mal-
       loc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3),
       memp_fset(3),  memset(3),  realloc(3),  t->bt_prefix   and
       vsnprintf(3).


       In  addition,  the  db->put  function  may fail and return
       errno for the following conditions:

       [EACCES]
            An attempt was made to modify a read-only database.

       [EAGAIN]
            A lock was unavailable.

       [EINVAL]
            An invalid flag value or parameter was specified.

            A record number of 0 was specified.

            An attempt was made to add a record to a fixed-length
            database that was too large to fit.

            An attempt was made to do a partial put.

       [EPERM]
            Database  corruption  was  detected.   All subsequent
            database calls (other  than  db->close)  will  return
            EPERM.

       [ENOSPC]
            A btree exceeded the maximum btree depth (255).

       The db->sync function may fail and return errno for any of
       the errors specified for  the  following  DB  and  library
       functions:    close(2),   fcntl(2),   open(2),   write(2),
       abort(3), db->db_malloc, fflush(3),  fprintf(3),  free(3),
       lock_get(3),  lock_put(3),  lock_vec(3),  log_put(3), mal-
       loc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3),
       memp_fset(3),  memp_fsync(3),  memset(3), realloc(3), str-
       error(3), t->bt_prefix, t->re_irec and vsnprintf(3).

       In addition, the db->sync function  may  fail  and  return
       errno for the following conditions:

       [EINVAL]
            An invalid flag value or parameter was specified.

       [EPERM]
            Database  corruption  was  detected.   All subsequent
            database calls (other  than  db->close)  will  return
            EPERM.

       The db->stat function may fail and return errno for any of
       the errors specified for  the  following  DB  and  library
       functions: malloc(3).


BUGS

       The access methods provide no guarantees about byte string
       alignment, and applications are responsible for  maintain-
       ing any necessary alignment.

       The name DBT is a mnemonic for ``data base thang'', and
       was used because no one could think of a reasonable name
       that wasn't already used somewhere else.


SEE ALSO

       The  Ubiquitous  B-tree,  Douglas Comer, ACM Comput. Surv.
       11, 2 (June 1979), 121-138.

       Prefix B-trees, Bayer and Unterauer, ACM  Transactions  on
       Database Systems, Vol. 2, 1 (March 1977), 11-26.

       The  Art  of  Computer  Programming  Vol.  3:  Sorting and
       Searching, D.E. Knuth, 1968, pp 471-480.

       Dynamic Hash Tables, Per-Ake Larson, Communications of the
       ACM, April 1988.

       A  New  Hash  Package for UNIX, Margo Seltzer, USENIX Pro-
       ceedings, Winter 1991.

       Document  Processing  in  a  Relational  Database  System,
       Michael   Stonebraker,   Heidi  Stettner,  Joseph  Kalash,
       Antonin  Guttman,  Nadene  Lynn,  Memorandum  No.  UCB/ERL
       M82/32, May 1982.

       db_archive(1), db_checkpoint(1), db_deadlock(1), db_dump(1),
       db_load(1), db_recover(1), db(3), db_appinit(3), db_cursor(3),
       db_dbm(3), db_lock(3), db_log(3), db_mpool(3), db_open(3),
       db_txn(3)