-int relay_open(channel_path, bufsize, nbufs, channel_flags,
- channel_callbacks, start_reserve, end_reserve,
- rchan_start_reserve, resize_min, resize_max, mode,
- init_buf, init_buf_size)
-int relay_write(channel_id, *data_ptr, count, time_delta_offset, **wrote_pos)
-rchan_reader *add_rchan_reader(channel_id, auto_consume)
-int remove_rchan_reader(rchan_reader *reader)
-rchan_reader *add_map_reader(channel_id)
-int remove_map_reader(rchan_reader *reader)
-int relay_read(reader, buf, count, wait, *actual_read_offset)
-void relay_buffers_consumed(reader, buffers_consumed)
-void relay_bytes_consumed(reader, bytes_consumed, read_offset)
-int relay_bytes_avail(reader)
-int rchan_full(reader)
-int rchan_empty(reader)
-int relay_info(channel_id, *channel_info)
-int relay_close(channel_id)
-int relay_realloc_buffer(channel_id, nbufs, async)
-int relay_replace_buffer(channel_id)
-int relay_reset(int rchan_id)
-int relay_discard_init_buf(rchan_id)
-
-----------
-int relay_open(channel_path, bufsize, nbufs,
- channel_flags, channel_callbacks, start_reserve,
- end_reserve, rchan_start_reserve, resize_min, resize_max, mode,
- init_buf, init_buf_size)
-
-relay_open() is used to create a new entry in relayfs. This new entry
-is created according to channel_path. channel_path contains the
-absolute path to the channel file on relayfs. If, for example, the
-caller sets channel_path to "/xlog/9", a "xlog/9" entry will appear
-within relayfs automatically and the "xlog" directory will be created
-in the filesystem's root. relayfs does not implement any policy on
-its content, except to disallow the opening of two channels using the
-same file. There is, nevertheless, a set of guidelines for using
-relayfs. Basically, each facility using relayfs should use a top-level
-directory identifying it. The entry created above, for example,
-presumably belongs to the "xlog" software.
-
-The remaining parameters for relay_open() are as follows:
-
-- channel_flags - an ORed combination of attribute values controlling
- common channel characteristics:
-
- - logging scheme - relayfs uses two mutually exclusive schemes
- for logging data to a channel. The 'lockless scheme'
- reserves and writes data to a channel without the need of
- any type of locking on the channel. This is the preferred
- scheme, but may not be available on a given architecture (it
- relies on the presence of a cmpxchg instruction). It's
- specified by the RELAY_SCHEME_LOCKLESS flag. The 'locking
- scheme' either obtains a lock on the channel for writing or
- disables interrupts, depending on whether the channel was
- opened for SMP or global usage (see below). It's specified
- by the RELAY_SCHEME_LOCKING flag. While a client may want
- to explicitly specify a particular scheme to use, it's more
- convenient to specify RELAY_SCHEME_ANY for this flag, which
- will allow relayfs to choose the best available scheme i.e.
- lockless if supported.
-
- - overwrite mode (default is RELAY_MODE_CONTINUOUS) -
- If RELAY_MODE_CONTINUOUS is specified, writes to the channel
- will succeed regardless of whether there are up-to-date
- consumers or not. If RELAY_MODE_NO_OVERWRITE is specified,
- the channel becomes 'full' when the total amount of buffer
- space unconsumed by readers equals or exceeds the total
- buffer size. With the buffer in this state, writes to the
- buffer will fail - clients need to check the return code from
- relay_write() to determine if this is the case and act
- accordingly - 0 or a negative value indicates the write failed.
-
- - SMP usage - this applies only when the locking scheme is in
- use. If RELAY_USAGE_SMP is specified, it's assumed that the
- channel will be used in a per-CPU fashion and consequently,
- the only locking that will be done for writes is to disable
- local irqs. If RELAY_USAGE_GLOBAL is specified, it's assumed
- that writes to the buffer can occur within any CPU context,
- and spin_lock_irqsave() will be used to lock the buffer.
-
- - delivery mode - if RELAY_DELIVERY_BULK is specified, the
- client will be notified via its deliver() callback whenever a
- sub-buffer has been filled. Alternatively,
- RELAY_DELIVERY_PACKET will cause delivery to occur after the
- completion of each write. See the description of the channel
- callbacks below for more details.
-
- - timestamping - if RELAY_TIMESTAMP_TSC is specified and the
- architecture supports it, efficient TSC 'timestamps' can be
- associated with each write, otherwise more expensive
- gettimeofday() timestamping is used. At the beginning of
- each sub-buffer, a gettimeofday() timestamp and the current
- TSC, if supported, are read, and are passed on to the client
- via the buffer_start() callback. This allows correlation of
- the current time with the current TSC for subsequent writes.
- Each subsequent write is associated with a 'time delta',
- which is either the current TSC, if the channel is using
- TSCs, or the difference between the buffer_start gettimeofday
- timestamp and the gettimeofday time read for the current
- write. Note that relayfs never writes either a timestamp or
- time delta into the buffer unless explicitly asked to (see
- the description of relay_write() for details).
-
-- bufsize - the size of the 'sub-buffers' making up the circular channel
- buffer. For the lockless scheme, this must be a power of 2.
-
-- nbufs - the number of 'sub-buffers' making up the circular
- channel buffer. This must be a power of 2.
-
- The total size of the channel buffer is bufsize * nbufs rounded up
- to a multiple of the kernel page size. If the lockless scheme is used, both
- bufsize and nbufs must be a power of 2. If the locking scheme is
- used, the bufsize can be anything and nbufs must be a power of 2. If
- RELAY_SCHEME_ANY is used, the bufsize and nbufs should be a power of 2.
-
- NOTE: if nbufs is 1, relayfs will bypass the normal size
- checks and will allocate an rvmalloced buffer of size bufsize.
- This buffer will be freed when relay_close() is called, if the channel
- isn't still being referenced.
-
-- channel_callbacks - a table of callback functions called when events occur
- within the data relay that clients need to know about:
-
- - int buffer_start(channel_id, current_write_pos, buffer_id,
- start_time, start_tsc, using_tsc) -
-
- called at the beginning of a new sub-buffer, the
- buffer_start() callback gives the client an opportunity to
- write data into space reserved at the beginning of a
- sub-buffer. The client should only write into the buffer
- if it specified a value for start_reserve and/or
- rchan_start_reserve (see below) when the channel was
- opened. In the latter case, the client can determine
- whether to write its one-time rchan_start_reserve data by
- examining the value of buffer_id, which will be 0 for the
- first sub-buffer. The address that the client can write
- to is contained in current_write_pos (the client by
- definition knows how much it can write, i.e. the value it
- passed to relay_open() for start_reserve/
- rchan_start_reserve). start_time contains the
- gettimeofday() value for the start of the buffer and
- start_tsc contains the TSC read at the same time. The
- using_tsc param indicates whether or not start_tsc is valid
- (it wouldn't be if TSC timestamping isn't being used).
-
- The client should return the number of bytes it wrote to
- the channel, 0 if none.
-
- - int buffer_end(channel_id, current_write_pos, end_of_buffer,
- end_time, end_tsc, using_tsc)
-
- called at the end of a sub-buffer, the buffer_end()
- callback gives the client an opportunity to perform
- end-of-buffer processing. Note that the current_write_pos
- is the position where the next write would occur, but
- since the current write wouldn't fit (which is the trigger
- for the buffer_end event), the buffer is considered full
- even though there may be unused space at the end. The
- end_of_buffer param pointer value can be used to determine
- exactly the size of the unused space. The client should
- only write into the buffer if it specified a value for
- end_reserve when the channel was opened. If the client
- doesn't write anything, i.e. returns 0, the unused space at
- the end of the sub-buffer is available via relay_info() -
- this data may be needed by the client later if it needs to
- process raw sub-buffers. One alternative would be to save
- the unused bytes count in the end_reserve space at the
- end of each sub-buffer during buffer_end processing and
- read it when needed at a later time. Another
- alternative would be to use read(2), which makes the
- unused count invisible to the caller. end_time contains
- the gettimeofday() value for the end of the buffer and
- end_tsc contains the TSC read at the same time. The
- using_tsc param indicates whether or not end_tsc is valid
- (it wouldn't be if TSC timestamping isn't being used).
-
- The client should return the number of bytes it wrote to
- the channel, 0 if none.
-
- - void deliver(channel_id, from, len)
-
- called when data is ready for the client. This callback
- is used to notify a client when a sub-buffer is complete
- (in the case of bulk delivery) or a single write is
- complete (packet delivery). A bulk delivery client might
- wish to then signal a daemon that a sub-buffer is ready.
- A packet delivery client might wish to process the packet
- or send it elsewhere. The from param is a pointer to the
- delivered data and len specifies how many bytes are ready.
-
- - void user_deliver(channel_id, from, len)
-
- called when data has been written to the channel from user
- space. This callback is used to notify a client when a
- successful write from userspace has occurred, independent
- of whether bulk or packet delivery is in use. This can be
- used to allow userspace programs to communicate with the
- kernel client through the channel via out-of-band write(2)
- 'commands' instead of via ioctls, for instance. The from
- param is a pointer to the delivered data and len specifies
- how many bytes are ready. Note that this callback occurs
- after the bytes have been successfully written into the
- channel, which means that channel readers must be able to
- deal with the 'command' data which will appear in the
- channel data stream just as any other userspace or
- non-userspace write would.
-
- - int needs_resize(channel_id, resize_type,
- suggested_buf_size, suggested_n_bufs)
-
- called when a channel's buffers are in danger of becoming
- full i.e. the number of unread bytes in the channel passes
- a preset threshold, or when the current capacity of a
- channel's buffer is no longer needed. Also called to
- notify the client when a channel's buffer has been
- replaced. If resize_type is RELAY_RESIZE_EXPAND or
- RELAY_RESIZE_SHRINK, the kernel client should arrange to
- call relay_realloc_buffer() with the suggested sub-buffer
- count, which will allocate a new buffer of the recommended
- size for the channel (but will not replace the old one).
- When the allocation has completed,
- needs_resize() is again called, this time with a
- resize_type of RELAY_RESIZE_REPLACE. The kernel client
- should then arrange to call relay_replace_buffer() to
- actually replace the old channel buffer with the newly
- allocated buffer. Finally, once the buffer replacement
- has completed, needs_resize() is again called, this time
- with a resize_type of RELAY_RESIZE_REPLACED, to inform the
- client that the replacement is complete and additionally
- confirming the current sub-buffer size and number of
- sub-buffers. Note that a resize can be canceled if
- relay_realloc_buffer() is called with the async param
- non-zero and the resize conditions no longer hold. In
- this case, the RELAY_RESIZE_REPLACED suggested number of
- sub-buffers will be the same as the number of sub-buffers
- that existed before the RELAY_RESIZE_SHRINK or EXPAND i.e.
- values indicating that the resize didn't actually occur.
-
- - int fileop_notify(channel_id, struct file *filp, enum relay_fileop)
-
- called when a userspace file operation has occurred or
- will occur on a relayfs channel file. These notifications
- can be used by the kernel client to trigger actions within
- the kernel client when the corresponding event occurs,
- such as enabling logging only when a userspace application
- opens or mmaps a relayfs file and disabling it again when
- the file is closed or unmapped. The kernel client can
- also return its own return value, which can affect the
- outcome of the file operation - returning 0 indicates that
- the operation should succeed, and returning a negative value
- indicates that the operation should fail, and that
- the returned value should be returned to the ultimate
- caller e.g. returning -EPERM from the open fileop will
- cause the open to fail with -EPERM. Among other things,
- the return value can be used to restrict a relayfs file
- from being opened or mmap'ed more than once. The currently
- implemented fileops are:
-
- RELAY_FILE_OPEN - a relayfs file is being opened. Return
- 0 to allow it to succeed, negative to
- have it fail. A negative return value will
- be passed on unmodified to the open fileop.
- RELAY_FILE_CLOSE - a relayfs file is being closed. The return
- value is ignored.
- RELAY_FILE_MAP - a relayfs file is being mmap'ed. Return 0
- to allow it to succeed, negative to have
- it fail. A negative return value will be
- passed on unmodified to the mmap fileop.
- RELAY_FILE_UNMAP - a relayfs file is being unmapped. The return
- value is ignored.
-
- - int ioctl(rchan_id, cmd, arg)
-
- called when an ioctl call is made using a relayfs file
- descriptor. The cmd and arg are passed along to this
- callback unmodified for it to do as it wishes with. The
- return value from this callback is used as the return value
- of the ioctl call.
-
- If the callbacks param passed to relay_open() is NULL, a set of
- default do-nothing callbacks will be defined for the channel.
- Likewise, any NULL rchan_callback function contained in a non-NULL
- callbacks struct will be filled in with a default callback function
- that does nothing.
-
-- start_reserve - the number of bytes to be reserved at the start of
- each sub-buffer. The client can do what it wants with this number
- of bytes when the buffer_start() callback is invoked. Typically
- clients would use this to write per-sub-buffer header data.
-
-- end_reserve - the number of bytes to be reserved at the end of each
- sub-buffer. The client can do what it wants with this number of
- bytes when the buffer_end() callback is invoked. Typically clients
- would use this to write per-sub-buffer footer data.
-
-- rchan_start_reserve - the number of bytes to be reserved, in
- addition to start_reserve, at the beginning of the first sub-buffer
- in the channel. The client can do what it wants with this number of
- bytes when the buffer_start() callback is invoked. Typically
- clients would use this to write per-channel header data.
-
-- resize_min - if set, this signifies that the channel is
- auto-resizeable. The value specifies the size that the channel will
- try to maintain as a normal working size, and that it won't go
- below. The client makes use of the resizing callbacks and
- relay_realloc_buffer() and relay_replace_buffer() to actually effect
- the resize.
-
-- resize_max - if set, this signifies that the channel is
- auto-resizeable. The value specifies the maximum size the channel
- can have as a result of resizing.
-
-- mode - if non-zero, specifies the file permissions that will be given
- to the channel file. If 0, the default rw user perms will be used.
-
-- init_buf - if non-NULL, rather than allocating the channel buffer,
- this buffer will be used as the initial channel buffer. The kernel
- API function relay_discard_init_buf() can later be used to have
- relayfs allocate a normal mmappable channel buffer and switch over
- to using it after copying the init_buf contents into it. Currently,
- the size of init_buf must be exactly bufsize * nbufs. The caller
- is responsible for managing the init_buf memory. This feature is
- typically used for init-time channel use and should normally be
- specified as NULL.
-
-- init_buf_size - the total size of init_buf, if init_buf is specified
- as non-NULL. Currently, the size of init_buf must be exactly
- bufsize * nbufs.
-
-Upon successful completion, relay_open() returns a channel id
-to be used for all other operations with the relay. All buffers
-managed by the relay are allocated using rvmalloc/rvfree to allow
-for easy mmapping to user-space.
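The bufsize/nbufs rules described above can be sketched as a small userspace check. This is only an illustration under stated assumptions: the enum, the function names and the page_size parameter are hypothetical and are not part of the relayfs API, RELAY_SCHEME_ANY is treated as requiring power-of-2 sizes (the text only says "should"), and the nbufs == 1 special case is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if v is a non-zero power of 2. */
bool is_power_of_2(size_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Hypothetical scheme selector for this sketch; not the relayfs flag values. */
enum sketch_scheme { SKETCH_LOCKLESS, SKETCH_LOCKING, SKETCH_ANY };

/* Apply the documented sizing rules to a bufsize/nbufs pair. */
bool sizes_ok(enum sketch_scheme scheme, size_t bufsize, size_t nbufs)
{
	if (bufsize == 0 || !is_power_of_2(nbufs))
		return false;                  /* nbufs must always be a power of 2 */
	if (scheme == SKETCH_LOCKING)
		return true;                   /* bufsize can be anything */
	return is_power_of_2(bufsize);         /* lockless, or ANY (assumed strict) */
}

/* bufsize * nbufs rounded up to a multiple of page_size. */
size_t total_channel_size(size_t bufsize, size_t nbufs, size_t page_size)
{
	assert(is_power_of_2(page_size));      /* required by the mask trick below */
	size_t raw = bufsize * nbufs;
	return (raw + page_size - 1) & ~(page_size - 1);
}
```

For instance, with a 4096-byte page, four sub-buffers of 3000 bytes are acceptable only for the locking scheme and occupy 12288 bytes in total.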
-
-----------
-int relay_write(channel_id, *data_ptr, count, time_delta_offset, **wrote_pos)
-
-relay_write() reserves space in the channel and writes count bytes of
-data pointed to by data_ptr to it. Automatically performs any
-necessary locking, depending on the scheme and SMP usage in effect (no
-locking is done for the lockless scheme regardless of usage). It
-returns the number of bytes written, or 0/negative on failure. If
-time_delta_offset is >= 0, the internal time delta calculated when
-the slot was reserved will be written at that offset. This is the
-TSC or gettimeofday() delta between the current write and the
-beginning of the buffer, whichever method is being used by the
-channel. Trying to write a count larger than the bufsize
-specified to relay_open() (taking into account the reserved
-start-of-buffer and end-of-buffer space as well) will fail. If
-wrote_pos is non-NULL, it will receive the location the data was
-written to, which may be needed for some applications but is not
-normally interesting. Most applications should pass in NULL for this
-param.
-
-----------
-struct rchan_reader *add_rchan_reader(int rchan_id, int auto_consume)
-
-add_rchan_reader creates and initializes a reader object for a
-channel. An opaque rchan_reader object is returned on success, and is
-passed to relay_read() when reading the channel. If the boolean
-auto_consume parameter is 1, the reader is defined to be
-auto-consuming. auto-consuming reader objects are automatically
-created and used for VFS read(2) readers.
-
-----------
-void remove_rchan_reader(struct rchan_reader *reader)
-
-remove_rchan_reader finds and removes the given reader from the
-channel. This function is used only by non-VFS read(2) readers. VFS
-read(2) readers are automatically removed when the corresponding file
-object is closed.
-
-----------
-struct rchan_reader *add_map_reader(int rchan_id)
-
-Creates and initializes an rchan_reader object for a channel map
-reader. A map reader is needed so that kernel clients can call
-relay_bytes_consumed()/relay_buffers_consumed() when their mmap
-user clients inform them that data has been consumed.
-
-----------
-int remove_map_reader(reader)
-
-Finds and removes the given map reader from the channel. This function
-is useful only for map readers.
-
-----------
-int relay_read(reader, buf, count, wait, *actual_read_offset)
-
-Reads count bytes from the channel, or as much as is available within
-the sub-buffer currently being read. The read offset that will be
-read from is the position contained within the reader object. If the
-wait flag is set, buf is non-NULL, and there is nothing available, it
-will wait until there is. If the wait flag is 0 and there is nothing
-available, -EAGAIN is returned. If buf is NULL, the value returned is
-the number of bytes that would have been read. actual_read_offset is
-the value that should be passed as the read offset to
-relay_bytes_consumed(). It is needed only if the reader is not
-auto-consuming and the channel is RELAY_MODE_NO_OVERWRITE, but in
-any case it must not be NULL.
-
-----------
-
-int relay_bytes_avail(reader)
-
-Returns the number of bytes available relative to the reader's current
-read position within the corresponding sub-buffer, 0 if there is
-nothing available. Note that this doesn't return the total bytes
-available in the channel buffer. It is enough, however, to know
-whether anything is available or how many bytes might be returned
-from the next read.
-
-----------
-void relay_buffers_consumed(reader, buffers_consumed)
-
-Adds to the channel's consumed buffer count. buffers_consumed should
-be the number of buffers newly consumed, not the total number
-consumed. NOTE: kernel clients don't need to call this function if
-the reader is auto-consuming or the channel is RELAY_MODE_CONTINUOUS.
-
-In order for the relay to detect the 'buffers full' condition for a
-channel, it must be kept up-to-date with respect to the number of
-buffers consumed by the client. If the addition of the value of the
-buffers_consumed param to the current bufs_consumed count for the channel
-would exceed the bufs_produced count for the channel, the channel's
-bufs_consumed count will be set to the bufs_produced count for the
-channel. This allows clients to 'catch up' if necessary.
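The 'catch up' clamping described above can be sketched in userspace as follows. The struct layout and function names are hypothetical; only the bufs_produced/bufs_consumed names come from the text. The 'full' test is expressed in whole sub-buffers, which is an assumption of this sketch (the text defines it in terms of unconsumed buffer space).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-channel counters; not the real relayfs channel struct. */
struct sketch_counts {
	unsigned int bufs_produced;
	unsigned int bufs_consumed;
	unsigned int nbufs;
};

/* Add newly consumed sub-buffers, never letting consumed exceed produced. */
void sketch_buffers_consumed(struct sketch_counts *c, unsigned int newly_consumed)
{
	assert(c->bufs_consumed <= c->bufs_produced);
	if (newly_consumed > c->bufs_produced - c->bufs_consumed)
		c->bufs_consumed = c->bufs_produced;   /* clamp: client has caught up */
	else
		c->bufs_consumed += newly_consumed;
}

/* No-overwrite 'full' test: unconsumed sub-buffers cover the whole buffer. */
bool sketch_channel_full(const struct sketch_counts *c)
{
	return c->bufs_produced - c->bufs_consumed >= c->nbufs;
}
```

Passing a deliberately large count is therefore a simple way for a client that has lost track to mark everything produced so far as consumed.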
-
-----------
-void relay_bytes_consumed(reader, bytes_consumed, read_offset)
-
-Adds to the channel's consumed count. bytes_consumed should be the
-number of bytes actually read e.g. return value of relay_read() and
-the read_offset should be the actual offset the bytes were read from
-e.g. the actual_read_offset set by relay_read(). NOTE: kernel clients
-don't need to call this function if the reader is auto-consuming or
-the channel is RELAY_MODE_CONTINUOUS.
-
-In order for the relay to detect the 'buffers full' condition for a
-channel, it must be kept up-to-date with respect to the number of
-bytes consumed by the client. For packet clients, it makes more sense
-to update after each read rather than after each complete sub-buffer
-read. The bytes_consumed count updates bufs_consumed when a buffer
-has been consumed so this count remains consistent.
-
-----------
-int relay_info(channel_id, *channel_info)
-
-relay_info() fills in an rchan_info struct with channel status and
-attribute information such as usage modes, sub-buffer size and count,
-the allocated size of the entire buffer, buffers produced and
-consumed, the current buffer id, and the count of writes lost due to
-the buffers-full condition.
-
-The virtual address of the channel buffer is also available here, for
-those clients that need it.
-
-Clients may need to know how many 'unused' bytes there are at the end
-of a given sub-buffer. This would only be the case if the client 1)
-didn't either write this count to the end of the sub-buffer or
-otherwise note it, and 2) isn't using the read() system call to read
-the buffer. The count is available as the difference between the
-end_of_buffer and current_write_pos params in the buffer_end()
-callback; if the client returned 0 from that callback, it's assumed
-that it did not record the count. In other words, if the client isn't
-annotating the stream and is reading the buffer by mmapping it, this
-information would be needed in order for the client to 'skip over'
-the unused bytes at the ends of sub-buffers.
-
-Additionally, for the lockless scheme, clients may need to know
-whether a particular sub-buffer is actually complete. An array of
-boolean values, one per sub-buffer, contains non-zero if the buffer is
-complete, zero otherwise.
-
-----------
-int relay_close(channel_id)
-
-relay_close() is used to close the channel. It finalizes the last
-sub-buffer (the one currently being written to) and marks the channel
-as finalized. The channel buffer and channel data structure are then
-freed automatically when the last reference to the channel is given
-up.
-
-----------
-int relay_realloc_buffer(channel_id, nbufs, async)
-
-Allocates a new channel buffer using the specified sub-buffer count
-(note that resizing can't change sub-buffer sizes). If async is
-non-zero, the allocation is done in the background using a work queue.
-When the allocation has completed, the needs_resize() callback is
-called with a resize_type of RELAY_RESIZE_REPLACE. This function
-doesn't replace the old buffer with the new - see
-relay_replace_buffer().
-
-This function is called by kernel clients in response to a
-needs_resize() callback call with a resize type of RELAY_RESIZE_EXPAND
-or RELAY_RESIZE_SHRINK. That callback also supplies
-suggested_buf_size and suggested_n_bufs; the suggested sub-buffer
-count should be used when calling this function.
-
-Returns 0 on success, or an error code if the channel is busy or if
-the allocation couldn't happen for some reason.
-
-NOTE: if async is not set, this function should not be called with a
-lock held, as it may sleep.
-
-----------
-int relay_replace_buffer(channel_id)
-
-Replaces the current channel buffer with the new buffer allocated by
-relay_realloc_buffer and contained in the channel struct. When the
-replacement is complete, the needs_resize() callback is called with
-RELAY_RESIZE_REPLACED. This function is called by kernel clients in
-response to a needs_resize() callback having a resize type of
-RELAY_RESIZE_REPLACE.
-
-Returns 0 on success, or an error code if the channel is busy or if the
-replacement or previous allocation didn't happen for some reason.
-
-NOTE: This function will not sleep, so it can be called in any context and
-with locks held. The client should, however, ensure that the channel
-isn't actively being read from or written to.
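The three-step resize handshake between needs_resize(), relay_realloc_buffer() and relay_replace_buffer() can be summarized as a dispatch on resize_type. The sketch below is a userspace illustration only: the enum values and the action names are hypothetical stand-ins (the real constants are the RELAY_RESIZE_* values named in the text), and it merely reports which call a client would arrange next rather than calling into relayfs.

```c
#include <assert.h>

/* Stand-ins for the RELAY_RESIZE_* types named in the text; values assumed. */
enum sketch_resize_type {
	SKETCH_RESIZE_EXPAND,
	SKETCH_RESIZE_SHRINK,
	SKETCH_RESIZE_REPLACE,
	SKETCH_RESIZE_REPLACED
};

/* What the kernel client should arrange to do next (hypothetical names). */
enum sketch_action {
	SKETCH_CALL_REALLOC,   /* arrange to call relay_realloc_buffer()  */
	SKETCH_CALL_REPLACE,   /* arrange to call relay_replace_buffer()  */
	SKETCH_RESIZE_DONE     /* replacement finished (or was canceled)  */
};

/* Decide the next step for a given needs_resize() notification. */
enum sketch_action sketch_needs_resize(enum sketch_resize_type type)
{
	switch (type) {
	case SKETCH_RESIZE_EXPAND:
	case SKETCH_RESIZE_SHRINK:
		return SKETCH_CALL_REALLOC;
	case SKETCH_RESIZE_REPLACE:
		return SKETCH_CALL_REPLACE;
	case SKETCH_RESIZE_REPLACED:
		break;
	}
	return SKETCH_RESIZE_DONE;
}
```

A real client would defer the actual calls to a context where they are safe, since a synchronous relay_realloc_buffer() may sleep and relay_replace_buffer() requires that the channel not be actively read or written.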
-
-----------
-int relay_reset(rchan_id)
-
-relay_reset() has the effect of erasing all data from the buffer and
-restarting the channel in its initial state. The buffer itself is not
-freed, so any mappings are still in effect. NOTE: Care should be
-taken that the channel isn't actually being used by anything when
-this call is made.
-
-----------
-int rchan_full(reader)
-
-Returns 1 if the channel is full with respect to the reader, 0 if not.
-
-----------
-int rchan_empty(reader)
-
-Returns 1 if the channel is empty with respect to the reader, 0 if not.
-
-----------
-int relay_discard_init_buf(rchan_id)
-
-Allocates an mmappable channel buffer, copies the contents of init_buf
-into it, and sets the current channel buffer to the newly allocated
-buffer. This function is used only in conjunction with the init_buf
-and init_buf_size params to relay_open(), and is typically used when
-the ability to write into the channel at init-time is needed. The
-basic usage is to specify an init_buf and init_buf_size to relay_open,
-then call this function when it's safe to switch over to a normally
-allocated channel buffer. 'Safe' means that the caller is in a
-context that can sleep and that nothing is actively writing to the
-channel. Returns 0 if successful, negative otherwise.
-
-
-Writing directly into the channel
-=================================
-
-Using the relay_write() API function as described above is the
-preferred means of writing into a channel. In some cases, however,
-in-kernel clients might want to write directly into a relay channel
-rather than have relay_write() copy it into the buffer on the client's
-behalf. Clients wishing to do this should follow the model used to
-implement relay_write itself. The general sequence is:
-
-- get a pointer to the channel via rchan_get(). This increments the
- channel's reference count.
-- call relay_lock_channel(). This will perform the proper locking for
- the channel given the scheme in use and the SMP usage.
-- reserve a slot in the channel via relay_reserve()
-- write directly to the reserved address
-- call relay_commit() to commit the write
-- call relay_unlock_channel()
-- call rchan_put() to release the channel reference
-
-In particular, clients should make sure they call rchan_get() and
-rchan_put() and not hold on to references to the channel pointer.
-Also, forgetting to use relay_lock_channel()/relay_unlock_channel()
-has no effect if the lockless scheme is being used, but could result
-in corrupted buffer contents if the locking scheme is used.
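The get/lock/reserve/write/commit/unlock/put sequence listed above can be illustrated with a userspace mock. Everything here is an assumption made for the sketch: the mock_* functions and struct mock_chan are invented stand-ins with made-up signatures, not the prototypes of rchan_get(), relay_lock_channel(), relay_reserve(), relay_commit(), relay_unlock_channel() or rchan_put().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock channel: a flat buffer plus the state the sequence manipulates. */
struct mock_chan {
	char buf[64];
	size_t write_pos;   /* offset of the next free byte */
	int refcount;
	int locked;
};

struct mock_chan *mock_rchan_get(struct mock_chan *c) { c->refcount++; return c; }
void mock_rchan_put(struct mock_chan *c) { c->refcount--; }
void mock_lock(struct mock_chan *c) { assert(!c->locked); c->locked = 1; }
void mock_unlock(struct mock_chan *c) { assert(c->locked); c->locked = 0; }

/* Return the address of a slot of len bytes, or NULL if it doesn't fit. */
char *mock_reserve(struct mock_chan *c, size_t len)
{
	if (len > sizeof(c->buf) - c->write_pos)
		return NULL;
	return c->buf + c->write_pos;
}

void mock_commit(struct mock_chan *c, size_t len) { c->write_pos += len; }

/* Direct write following the documented order; returns bytes written or 0. */
size_t direct_write(struct mock_chan *chan, const void *data, size_t len)
{
	struct mock_chan *c = mock_rchan_get(chan);   /* take a reference */
	size_t written = 0;
	char *slot;

	mock_lock(c);
	slot = mock_reserve(c, len);
	if (slot) {
		memcpy(slot, data, len);              /* write into the slot */
		mock_commit(c, len);
		written = len;
	}
	mock_unlock(c);                               /* always undo lock ... */
	mock_rchan_put(c);                            /* ... and reference */
	return written;
}
```

The point of the shape is that the unlock and put happen on both the success and failure paths, so the caller never holds on to the channel reference or the lock.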
-
-
-Limitations
-===========
-
-Writes made via the write() system call are currently limited to 2
-pages worth of data. There is no such limit on the in-kernel API
-function relay_write().
-
-User applications can currently only mmap the complete buffer (it
-doesn't really make sense to mmap only part of it, given its purpose).
-
-
-Latest version
-==============
-
-The latest version can be found at:
-
-http://www.opersys.com/relayfs