libsm sm_io general overview


$Id: io.html,v 1.3 2001-03-17 03:22:50 gshapiro Exp $

Introduction

The sm_io portion of the libsm library is similar to the stdio library. It is derived from the Chris Torek version of the stdio library (BSD). There are some key differences between sm_io and stdio, described below, but many similarities remain.

A key difference between stdio and sm_io is that the functional code that does the open, close, read, write, etc. on a file can be different for different files. For example, with stdio the functional code (read, write) is either the default supplied in the library or a "programmer specified" set of functions set via sm_io_open(). Whichever set of functions is specified, all opens, reads, writes, etc. use the same set of functions. In contrast, with sm_io a different set of functions can be specified for each active file for reads, writes, etc. These different function sets are identified as file types (see sm_io_open()). Each function set can handle the actions directly, pass the action request to another function set, or do some work before passing it on to another function set. The function set for a file can be changed at any time (even after the file is open).
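The idea of per-file function sets can be pictured with a small stand-alone model. This is not libsm code: the names model_file, read_plain, read_upper and model_read are hypothetical, and the sketch only shows two open files dispatching reads through different function pointers, with one function set doing some work before passing the request on to another.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a file whose read function is chosen per file. */
struct model_file {
	const char *data;	/* backing data for this model */
	size_t pos;		/* current read offset */
	size_t (*read)(struct model_file *, char *, size_t);
};

/* Plain read: copy up to n bytes from the backing data. */
size_t read_plain(struct model_file *f, char *buf, size_t n)
{
	size_t left = strlen(f->data) - f->pos;

	if (n > left)
		n = left;
	memcpy(buf, f->data + f->pos, n);
	f->pos += n;
	return n;
}

/* Layered read: pass the request to read_plain, then upper-case the result. */
size_t read_upper(struct model_file *f, char *buf, size_t n)
{
	size_t i, got = read_plain(f, buf, n);

	for (i = 0; i < got; i++)
		buf[i] = (char) toupper((unsigned char) buf[i]);
	return got;
}

/* Generic entry point: dispatch through the file's own function. */
size_t model_read(struct model_file *f, char *buf, size_t n)
{
	assert(f != NULL && f->read != NULL);
	return f->read(f, buf, n);
}
```

Two model_file values with the same data but different read members return different bytes from the same model_read call, which is the behavior the paragraph above describes for file types.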

A second difference is the use of rpools. An rpool is specified with the opening of a file (sm_io_open()). This allows a file to be associated with an rpool so that when the rpool is released the open file will be closed; sm_io_open() registers that sm_io_close() should be called when the rpool is released.
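The close-on-release relationship can be sketched with a toy resource pool. Everything here (model_pool, pool_attach, model_open, pool_release, the fixed capacity, and the newest-first release order) is an assumption made for illustration and does not describe the real rpool implementation.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_MAX 8	/* fixed capacity keeps the model short */

/* Hypothetical resource pool: a list of cleanup callbacks. */
struct model_pool {
	void (*cleanup[POOL_MAX])(void *);
	void *arg[POOL_MAX];
	int count;
};

struct model_file {
	int is_open;
};

/* Close callback registered with the pool at open time. */
void model_close(void *p)
{
	((struct model_file *) p)->is_open = 0;
}

/* Register a cleanup; returns 0 on success, -1 if the pool is full. */
int pool_attach(struct model_pool *pool, void (*fn)(void *), void *arg)
{
	if (pool->count >= POOL_MAX)
		return -1;
	pool->cleanup[pool->count] = fn;
	pool->arg[pool->count] = arg;
	pool->count++;
	return 0;
}

/* Open a file; with a non-NULL pool, arrange for it to be closed on release. */
int model_open(struct model_file *f, struct model_pool *pool)
{
	assert(f != NULL);
	f->is_open = 1;
	if (pool != NULL && pool_attach(pool, model_close, f) != 0) {
		f->is_open = 0;
		return -1;
	}
	return 0;
}

/* Release the pool: run every registered cleanup, newest first. */
void pool_release(struct model_pool *pool)
{
	while (pool->count > 0) {
		pool->count--;
		pool->cleanup[pool->count](pool->arg[pool->count]);
	}
}
```

A file opened with a NULL pool stays open after pool_release(), mirroring the documented behavior when no rpool is given to the open.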

A third difference is that the I/O functions take a timeout argument. This sets a maximum amount of time allowed for the I/O to complete, so the calling program does not need to set up its own timeout mechanism. NOTE: SIGALRMs should not be active in the calling program when an I/O function with a timeout is used.

When converting source code from stdio to sm_io, note that the arguments to functions have been rationalized. That is, unlike stdio, all sm_io functions that take a file pointer (SM_FILE_T *) argument have the file pointer as the first argument. Also, not every stdio function has an identical matching sm_io API: the API list has been thinned since a number of stdio APIs overlapped in functionality. Remember that many functions also have a timeout argument added.

When a file is going to be opened, the file type is included with sm_io_open(). A file type is either one automatically included with the sm_io library or one created by the program at runtime. File types can be either buffered or unbuffered. When buffered the buffering is either the builtin sm_io buffering or as done by the file type. File types can be disk files, strings, TCP/IP connections or whatever your imagination can come up with that can be read and/or written to.

Information about a particular file type or pointer can be obtained or set with the sm_io "info" functions. The sm_io_setinfo() and sm_io_getinfo() functions work on an active file pointer.

Include files

There is one main include file for use with sm_io: io.h. Since the use of rpools is specified with sm_io_open() an rpool may be created and thus rpool.h may need to be included as well (before io.h).

#include <rpool.h>
#include <io.h>

Functions/APIs

Below is a list of the functions for sm_io in alphabetical order. Currently these functions return error codes and set errno when appropriate. This may change to raising exceptions later.

SM_FILE_T *sm_io_autoflush(SM_FILE_T *fp1, SM_FILE_T *fp2)

void sm_io_automode(SM_FILE_T *fp1, SM_FILE_T *fp2)

void sm_io_clearerr(SM_FILE_T *fp)

int sm_io_close(SM_FILE_T *fp, int timeout)

int sm_io_dup(SM_FILE_T *fp)

int sm_io_eof(SM_FILE_T *fp)

int sm_io_error(SM_FILE_T *fp)

char * sm_io_fgets(SM_FILE_T *fp, int timeout, char *buf, int n)

int sm_io_flush(SM_FILE_T *fp, int timeout)

SM_FILE_T *sm_io_fopen(char *pathname, int flags [, MODE_T mode])

int sm_io_fprintf(SM_FILE_T *fp, int timeout, const char *fmt, ...)

int sm_io_fputs(SM_FILE_T *fp, int timeout, const char *s)

int sm_io_fscanf(SM_FILE_T *fp, int timeout, char const *fmt, ...) 

int sm_io_getc(SM_FILE_T *fp, int timeout)

int sm_io_getinfo(SM_FILE_T *sfp, int what, void *valp)

SM_FILE_T * sm_io_open(const SM_FILE_T *type, int timeout, const void *info, int flags, void *rpool)

int sm_io_purge(SM_FILE_T *fp)

int sm_io_putc(SM_FILE_T *fp, int timeout, int c)

size_t sm_io_read(SM_FILE_T *fp, int timeout, char *buf, size_t size)

void sm_io_rewind(SM_FILE_T *fp, int timeout)

int sm_io_seek(SM_FILE_T *fp, off_t offset, int timeout, int whence)

int sm_io_setinfo(SM_FILE_T *sfp, int what, void *valp)

int sm_io_setvbuf(SM_FILE_T *fp, int timeout, char *buf, int mode, size_t size)

int sm_io_sscanf(const char *str, char const *fmt, ...)

long sm_io_tell(SM_FILE_T *fp, int timeout)

int sm_io_ungetc(SM_FILE_T *fp, int timeout, int c)

size_t sm_io_write(SM_FILE_T *fp, int timeout, char *buf, size_t size)

int sm_snprintf(char *str, size_t n, char const *fmt, ...)

Timeouts

For many of the functions a timeout argument is given. This limits the amount of time allowed for the function to complete. There are three pre-defined values:

  • SM_TIME_DEFAULT - timeout using the default setting for this file type
  • SM_TIME_FOREVER - never time out; block until the task completes
  • SM_TIME_IMMEDIATE - timeout (virtually) now

A function caller can also specify a positive integer value in milliseconds. A function will return with errno set to EINVAL if a bad value is given for timeout. When a function times out, it returns in error with errno set to EAGAIN. In the future this may change to an exception being thrown.
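The caller-visible contract (a millisecond limit, EINVAL for a bad value, EAGAIN on expiry) can be illustrated with a stand-alone POSIX sketch. The helper read_with_timeout is hypothetical and is not part of libsm; it uses poll(2) purely for illustration and makes no claim about how sm_io implements timeouts internally. It also handles only non-negative millisecond values, not the SM_TIME_* constants.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
**  Hypothetical helper, not part of libsm: read up to n bytes from fd,
**  waiting at most timeout_ms milliseconds for data to arrive.
**  Returns the byte count, or -1 with errno set (EINVAL for a negative
**  timeout, EAGAIN when the wait times out).
*/
ssize_t read_with_timeout(int fd, int timeout_ms, char *buf, size_t n)
{
	struct pollfd pfd;
	int r;

	assert(buf != NULL);
	if (timeout_ms < 0) {
		errno = EINVAL;
		return -1;
	}
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	r = poll(&pfd, 1, timeout_ms);
	if (r < 0)
		return -1;	/* errno set by poll(2) */
	if (r == 0) {
		errno = EAGAIN;	/* nothing became readable in time */
		return -1;
	}
	return read(fd, buf, n);
}
```

A caller would check for a -1 return and then test errno against EAGAIN to tell a timeout apart from other failures, in the same way the text above describes for the sm_io functions.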

    Function Descriptions

    SM_FILE_T *
    sm_io_fopen(char *pathname, int flags)
    SM_FILE_T *
    sm_io_fopen(char *pathname, int flags, MODE_T mode)
    Open the file named by pathname, and associate a stream with it. The arguments are the same as for the open(2) system call.
    If memory could not be allocated, an exception is raised. If successful, an SM_FILE_T pointer is returned. Otherwise, NULL is returned and errno is set.

    SM_FILE_T *
    sm_io_open(const SM_FILE_T *type, int timeout, const void *info, int flags, void *rpool)
    Opens a file by type directed by info. Type is a filled-in SM_FILE_T structure from the following builtin list (descriptions below) or one specified by the program.
  • SmFtString
  • SmFtStdio
  • SmFtStdiofd
  • smioin *
  • smioout *
  • smioerr *
  • smiostdin *
  • smiostdout *
  • smiostderr *
  • SmFtSyslog

    The file types listed above are already appropriately filled in. Those marked with a "*" are already open and may be used directly and immediately. For program-specified types, the macro SM_IO_SET_TYPE should be used to set the type argument easily and with minimal error. The SM_FILE_T structure is fairly large, but only a small portion of it needs to be initialized for a new type. See also "Writing Functions for a File Type".
    SM_IO_SET_TYPE(type, name, open, close, read, write, seek, get, set, timeout)
    

    Timeout is set as described in the Timeouts section.
    Info is information that tells the file type what is to be opened, along with any associated information. For a disk file this would be a file path; for a TCP connection this could be a structure containing an IP address and port.
    Flags is a set of sm_io flags that describes how the file is to be interacted with:
  • SM_IO_RDWR - read and write
  • SM_IO_RDONLY - read only
  • SM_IO_WRONLY - write only
  • SM_IO_APPEND - allow write to EOF only
  • SM_IO_APPENDRW - allow read-write from EOF only
  • SM_IO_RDWRTR - read and write with truncation of file first
    Rpool is the address of the rpool that this open is to be associated with. When the rpool is released, the close function for this file type will be called automatically to close the file for cleanup. If NULL is specified for rpool then the close function is not associated with (attached to) an rpool.
    If memory cannot be allocated, an exception is raised. If the type is invalid, sm_io_open will abort the process. On success an SM_FILE_T * pointer is returned. On failure the NULL pointer is returned and errno is set.

    int
    sm_io_setinfo(SM_FILE_T *sfp, int what, void *valp)
    For the open file sfp set the indicated information (what) to the new value (valp). This will make the change for this SM_FILE_T only. The file type that sfp originally belonged to will still be configured the same way (this is to prevent side effects on other opens of the same file type, particularly with threads). The value of what will be file-type dependent since this function is one of the per-file-type settable functions. One value for what that is valid for all file types is SM_WHAT_VECTORS. This sets the currently open file with a new function vector set for open, close, etc. The new values are taken from valp, an SM_FILE_T filled in by the user via the macro SM_IO_SET_TYPE (see "Writing Functions for a File Type" for more information).
    On success 0 (zero) is returned. On failure -1 is returned and errno is set.

    int
    sm_io_getinfo(SM_FILE_T *sfp, int what, void *valp)
    For the open file sfp get the indicated information (what) and place the result in valp. This will obtain information for this SM_FILE_T only, which may be different from the information for the file type it was originally opened as. The value of what will be file-type dependent since this function is one of the per-file-type settable functions. One value for what that is valid for all file types is SM_WHAT_VECTORS. This gets a copy of the function vectors from the currently open file and stores them in valp, an SM_FILE_T (see "Writing Functions for a File Type" for more information).
    On success 0 (zero) is returned. On failure -1 is returned and errno is set.

    SM_FILE_T *
    sm_io_autoflush(SM_FILE_T *fp1, SM_FILE_T *fp2)
    Associate a read of fp1 with a data flush for fp2. If a read of fp1 discovers that there is no data available to be read, then fp2 will have its data buffer flushed of writable data. It is assumed that fp1 is open for reading and fp2 is open for writing.
    The old file pointer associated with fp1 for flushing is returned. A return of NULL is not an error; this merely indicates no previous association.
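The read-triggers-flush relationship can be pictured with a small model. The type model_stream and the functions model_autoflush, model_flush and model_read are hypothetical stand-ins, not libsm internals; the flush simply discards the pending bytes where a real stream would write them out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffered stream used only for this model. */
struct model_stream {
	char buf[64];		/* buffered input, or pending output */
	size_t len;		/* bytes currently in buf */
	int flushes;		/* how many times the buffer was flushed */
	struct model_stream *flush_on_read;	/* writer tied to this reader */
};

/* Tie reader to writer; return the previously tied writer (may be NULL). */
struct model_stream *
model_autoflush(struct model_stream *rd, struct model_stream *wr)
{
	struct model_stream *old;

	assert(rd != NULL);
	old = rd->flush_on_read;
	rd->flush_on_read = wr;
	return old;
}

/* Discard pending output, standing in for a real write to the device. */
void model_flush(struct model_stream *wr)
{
	wr->len = 0;
	wr->flushes++;
}

/* Read up to n bytes; if no input is buffered, flush the tied writer first. */
size_t model_read(struct model_stream *rd, char *out, size_t n)
{
	if (rd->len == 0 && rd->flush_on_read != NULL)
		model_flush(rd->flush_on_read);
	if (n > rd->len)
		n = rd->len;
	memcpy(out, rd->buf, n);
	memmove(rd->buf, rd->buf + n, rd->len - n);
	rd->len -= n;
	return n;
}
```

The typical use is an interactive prompt: output written to the writer is pushed out just before the program blocks waiting for input on the reader.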

    void
    sm_io_automode(SM_FILE_T *fp1, SM_FILE_T *fp2)
    Associate the two file pointers for blocking/non-blocking mode changes. In the handling of timeouts sm_io may need to switch the mode of a file between blocking and non-blocking. If the underlying file descriptor has been duplicated with dup(2) and these descriptors are used by sm_io (for example with an SmFtStdiofd file type), then this API should be called to associate them. Otherwise odd behavior (i.e. errors) may result that is not consistently reproducible nor easily identifiable.

    int
    sm_io_close(SM_FILE_T *sfp, int timeout)
    Release all resources (file handles, memory, etc.) associated with the open SM_FILE_T sfp. If buffering is active then the buffer is flushed before any resources are released. Timeout is set as described in the Timeouts section. The first resources released after buffer flushing will be the buffer itself. Then the close function specified in the file type at open will be called. It is the responsibility of the close function to release any file type specific resources allocated and to call sm_io_close() for the next file type layer(s) that the current file type uses (if any).
    On success 0 (zero) is returned. On failure SM_IO_EOF is returned and errno is set.

    Description of Builtin File Types

    There are several builtin file types as mentioned in sm_io_open(). More file types may be added later.

    SmFtString
    Operates on a character string. SmFtString is a file type only. The string starts at location 0 (zero) and ends at the last character. A read will obtain the requested number of characters if available; else as many as possible. A read will not terminate the read characters with a NUL ('\0'). A write will place the requested number of characters at the current location. To append to a string, either the pointer must currently be at the end of the string or a seek must be done to position the pointer there. The file type handles the space needed for the string. Thus space needed for the string will be grown automagically without the user worrying about space management.
    SmFtStdio
    A predefined SM_FILE_T structure with function vectors pointing to functions that result in the file-type behaving as the system stdio normally does. The info portion of the sm_io_open is the path of the file to be opened. Note that this file type does not interact with the system's stdio. Thus a program mixing system stdio and sm_io stdio (SmFtStdio) will result in uncoordinated input and output.
    SmFtStdiofd
    A predefined SM_FILE_T structure with function vectors pointing to functions that result in the file-type behaving as the system stdio normally does. The info portion of the sm_io_open is a file descriptor (the value returned by open(2)). Note that this file type does not interact with the system's stdio. Thus a program mixing system stdio and sm_io stdio (SmFtStdiofd) will result in uncoordinated input and output.
    smioin
    smioout
    smioerr
    The three types smioin, smioout and smioerr are grouped together. These three types perform in the same manner as stdio's stdin, stdout and stderr. These types are both the names and the file pointers. They are already open when a program starts (unless the parent explicitly closed file descriptors 0, 1 and 2). Thus sm_io_open() should never be called for these types: the named file pointers should be used directly. smioin and smioout are buffered by default. smioerr is not buffered by default. Calls to stdio are safe to make when using these three sm_io file pointers. There is no interaction between sm_io and stdio. Hence, due to buffering, the sequence of input and output data from both sm_io and stdio at the same time may appear unordered. For coordination between sm_io and stdio use the three file pointers below (smiostdin, smiostdout, smiostderr).
    smiostdin
    smiostdout
    smiostderr
    The three types smiostdin, smiostdout and smiostderr are grouped together. These three types perform in the same manner as stdio's stdin, stdout and stderr. These types are both the names and the file pointers. They are already open when a program starts (unless the parent explicitly closed file descriptors 0, 1 and 2). Thus sm_io_open() should never be called: the named file pointers should be used directly. Calls to stdio are safe to make when using these three sm_io file pointers, though no code is shared between the two libraries. However, the input and output between sm_io and stdio is coordinated for these three file pointers: smiostdin, smiostdout and smiostderr are layered on top of the system's stdio. smiostdin, smiostdout and smiostderr are not buffered by default. Hence, due to buffering in stdio only, the sequence of input and output data from both sm_io and stdio at the same time will appear ordered. If sm_io buffering is turned on then the input and output can appear unordered or lost.
    SmFtSyslog
    This opens the channel to the system log. Reads are not allowed. Writes cannot be undone once they have left the sm_io buffer. The man pages for syslog(3) should be read for information on syslog.


    Writing Functions for a File Type

    When writing functions to create a file type a function needs to be created for each function vector in the SM_FILE_T structure that will be passed to sm_io_open() or sm_io_setinfo(). Otherwise the setting will be rejected and errno set to EINVAL. Each function should accept and handle the number and types of arguments as described in the portion of the SM_FILE_T structure shown below:

            int      (*open) __P((SM_FILE_T *fp, const void *, int flags,
                                  const void *rpool));
            int      (*close) __P((SM_FILE_T *fp));
            int      (*read)  __P((SM_FILE_T *fp, char *buf, size_t size));
            int      (*write) __P((SM_FILE_T *fp, const char *buf, size_t size));
            off_t    (*seek)  __P((SM_FILE_T *fp, off_t offset, int whence));
            int      (*getinfo) __P((SM_FILE_T *fp, int what, void *valp));
            int      (*setinfo) __P((SM_FILE_T *fp, int what, void *valp));
    

    The macro SM_IO_SET_TYPE should be used to initialize an SM_FILE_T as a file type for an sm_io_open():

    SM_IO_SET_TYPE(type, name, open, close, read, write, seek, get, set, timeout)
    

    where:
  • type - is the SM_FILE_T being filled-in
  • name - a human readable character string for human identification purposes
  • open - the vector to the open function
  • close - the vector to the close function
  • read - the vector to the read function
  • write - the vector to the write function
  • seek - the vector to the seek function
  • get - the vector to the get function
  • set - the vector to the set function
  • timeout - the default to be used for a timeout when SM_TIME_DEFAULT is specified

    You should avoid trying to change or use the other structure members of the SM_FILE_T. The file pointer content (internal structure members) of an active file should only be set and observed with the "info" functions. The two exceptions to the above statement are the structure members cookie and ival. Cookie is of type void * while ival is of type int. These two structure members exist specifically for your created file type to use. The sm_io functions will not change or set these two structure members; only the specific file type will change or set these variables.

    For maintaining information privately about status for a file type the information should be encapsulated in a cookie. A cookie is an opaque type that contains information that is only known to the file type layer itself. The sm_io package will know nothing about the contents of the cookie; sm_io only maintains the location of the cookie so that it may be passed to the functions of a file type. It is up to the file type to determine what to do with the cookie. It is the responsibility of the file type's open to create the cookie and point the SM_FILE_T's cookie at the address of the cookie. It is the responsibility of close to clean up any resources that the cookie and instance of the file type have used.

    For the cookie to be passed cleanly to all functions of a file type, the location of the cookie must be assigned during the call to open. The file type functions should not attempt to maintain the cookie internally since the file type may have several instances (file pointers).

    The SM_FILE_T's member ival may be used in a manner similar to cookie. It is not to be used for maintaining the file's offset or access status (other members do that). It is intended as a "light" reference.

    The file type vector functions are called by the sm_io_*() functions after sm_io processing has occurred. The sm_io processing validates SM_FILE_T's and may then handle the call entirely itself or pass the request to the file type vector functions.

    All of the "int" functions should return -1 (minus one) on failure and 0 (zero) or greater on success. Errno should be set to provide diagnostic information to the caller if it has not already been set by another function the file type function used.

    Examples are a wonderful manner of clarifying details. Below is an example of an open function.

    This shows the setup.

    SM_FILE_T *fp;
    SM_FILE_T vector;
    
    SM_IO_SET_TYPE(vector, "my_type", myopen, myclose, myread, mywrite,
    				myseek, myget, myset, SM_TIME_FOREVER);
    
    fp = sm_io_open(&vector, 1000, "data", SM_IO_RDONLY, NULL);
    
    if (fp == NULL)
    	return(-1);
    
    The above code opens a file of type "my_type". The info is set to the string "data". "data" may be the name of a file or have some special meaning to the file type. For the sake of the example, we will have it be the name of a file in the home directory of the user running the program. The only file type function that depends on this information is the open function.
    We have also specified read-only access (SM_IO_RDONLY) and that no rpool will be used. The timeout has been set to 1000 milliseconds, which directs that the open and all associated setup should be done within 1000 milliseconds or the function returns in error (with errno == EAGAIN).
    #include <limits.h>
    #include <pwd.h>
    #include <unistd.h>
    
    int myopen(fp, info, flags, rpool)
    	SM_FILE_T *fp;
    	const void *info;
    	int flags;
    	void *rpool;
    {
    	/*
    	**  Now we could do the open raw (i.e., with open(2)), but we will
    	**  use file layering instead. We will use the stdio file
    	**  type (different than the system's stdio).
    	*/
    	struct passwd *pw;
    	char path[PATH_MAX];
    
    	pw = getpwuid(getuid());
    	if (pw == NULL)
    		return(-1); /* no password entry for this user */
    	sm_snprintf(path, sizeof(path), "%s/%s", pw->pw_dir,
    		    (const char *) info);
    
    	/*
    	**  Okay. Now the path passed in has been prefixed with the
    	**  user's HOME directory. We'll call the regular stdio (SmFtStdio)
    	**  now to handle the rest of the open. The vector functions are
    	**  not handed a timeout, so SM_TIME_DEFAULT is used here.
    	*/
    	fp->cookie = sm_io_open(SmFtStdio, SM_TIME_DEFAULT, path, flags, rpool);
    	if (fp->cookie == NULL)
    		return(-1); /* errno set by sm_io_open call */
    	else
    		return(0);
    }
    
    Later on when a write is performed the function mywrite will be invoked. To match the above myopen, mywrite could be written as:
    int mywrite(fp, buf, size)
    	SM_FILE_T *fp;
            char *buf;
            size_t size;
    {
    	/*
    	**  As an example, we can change, modify, refuse, filter, etc.
    	**  the content being passed through before we ask the SmFtStdio
    	**  to do the actual write.
    	**  This example is very simple and contrived, but this keeps it
    	**  clear.
    	*/
    	if (size == 0)
    		return(0); /* why waste the cycles? */
    	if (*buf == 'X')
    		*buf = 'Y';
    
    	/*
    	**  Note that the file pointer passed to the next level is the
    	**  one that was stored in the cookie during the open. No
    	**  timeout is passed in, so SM_TIME_DEFAULT is used.
    	*/
    	return(sm_io_write(fp->cookie, SM_TIME_DEFAULT, buf, size));
    }
    
    As a thought exercise for the fair reader: how would you modify the above two functions to make a "tee"? That is, the program will call sm_io_open or sm_io_write and two or more files will be opened and written to. (Hint: create a cookie to hold two or more file pointers.)
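One possible shape for an answer, sketched as a stand-alone model rather than real libsm code. The struct model_file stands in for the cookie member of an SM_FILE_T, the lower layers are ordinary stdio FILE pointers instead of sm_io files, and the names tee_cookie, tee_open, tee_write and tee_close are hypothetical. In a real file type the lower layers would be SM_FILE_T pointers opened with sm_io_open() and written with sm_io_write().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the part of SM_FILE_T a file type may use (hypothetical). */
struct model_file {
	void *cookie;
};

/* Cookie for the tee type: the lower-layer files that receive each write. */
struct tee_cookie {
	FILE *out[2];
};

/* Open: create the cookie and point the file at it. */
int tee_open(struct model_file *fp, FILE *first, FILE *second)
{
	struct tee_cookie *tc;

	assert(fp != NULL);
	if (first == NULL || second == NULL)
		return -1;
	tc = malloc(sizeof *tc);
	if (tc == NULL)
		return -1;
	tc->out[0] = first;
	tc->out[1] = second;
	fp->cookie = tc;
	return 0;
}

/* Write: forward the same bytes to every lower-layer file. */
int tee_write(struct model_file *fp, const char *buf, size_t size)
{
	struct tee_cookie *tc = fp->cookie;
	int i;

	for (i = 0; i < 2; i++)
		if (fwrite(buf, 1, size, tc->out[i]) != size)
			return -1;
	return (int) size;
}

/* Close: release what open allocated; the caller still owns the FILEs. */
int tee_close(struct model_file *fp)
{
	free(fp->cookie);
	fp->cookie = NULL;
	return 0;
}
```

Because all per-instance state lives in the cookie, several tee files can be open at once without interfering with each other, which is the reason given above for not keeping the cookie inside the file type functions themselves.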




    libsm sm_io default API definition

    Introduction

    A number of sm_io APIs perform similarly to their stdio counterparts (same name once the "sm_io_" prefix is removed). One difference between sm_io and stdio functions is that if a "file pointer" (FILE/SM_FILE_T) is one of the arguments to a function, it is always the first argument. Many of the sm_io functions take a timeout argument (see Timeouts).

    The API you have selected is one of these. Please consult the appropriate stdio man page for now.