Prev: file-io-the-universal-io-model Next: processes
All system calls are executed atomically, so there are no conditions where a system call can interrupt another one.
The O_EXCL
flag passed to open
can be used
to guarantee exclusive creation of a file.
Without this flag it’s not possible to guarantee exclusively opening a file, because it would take more than one syscall.
Another use case (common in logging) is multiple processes appending data to the same file. Again, we require one syscall.
To do so, we can use O_APPEND
to open the file
exclusively in append mode.
fcntl
fcntl
is used for control operations on an open file
descriptor.
#include <fcntl.h>
int fcntl(int fd, int cmd, ...); // Return on success depends on cmd, or -1 on error.
fcntl
can be used to modify the access mode and open
file status flags of a file.
To get them, pass F_GETFL
to fcntl
.
int flags = fcntl(fd, F_GETFL);
And check the flags for synchronized writes:
if (flags & O_SYNC)
Flags can be set using F_SETFL
:
int flags;
= fcntl(fd, F_GETFL);
flags |= O_APPEND; // add O_APPEND to the flags
flags (fd, F_SETFL, flags); // set the flags fcntl
Multiple file descriptors may refer to the same open file. They could be open in the same process or different processes.
File descriptors can be duplicated using the dup
,
dup2
system calls, which is what is done when piping at
shell:
$ ./myscript > results.log 2>&1
dup
looks like this: it returns a value that corresponds
to the new file descriptor on success, which is the lowest unused file
descriptor.
#include <unistd.h>
int dup(int oldfd); // Returns (new) file descriptor on success, or –1 on error
But if you want to ensure the file descriptor has a specific number,
use dup2
:
#include <unistd.h>
int dup2(int oldfd, int newfd); // Returns (new) file descriptor on success, or –1 on error
This closes the newfd
if it exists, then returns the
value again.
You can also use fcntl
, which duplicates
oldfd
, and returning a value at least as large as
startfd
.
int newfd = fcntl(oldfd, F_DUPED, startfd);
Another call, dup3
also exists:
#define _GNU_SOURCE
#include <unistd.h>
int dup3(int oldfd, int newfd, int flags); // Returns (new) file descriptor on success, or –1 on error
The supported flag is O_CLOEXEC
, which allows the kernel
to enable the close-on-exec flag for the new file descriptor.
pread
and
pwrite
pread
and pwrite
allow you to pass an
offset, and a file, to read or write starting at an offset.
#include <unistd.h>
ssize_t pread(int fd, void *buf, size_t count, off_t offset); // Returns number of bytes read, 0 on EOF, or –1 on error
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset); // Returns number of bytes written, or –1 on error
This is useful for multithreaded programs, because the file offset
for each open file is global to all threads, so trying to use
lseek
on one thread would mess up the offset for other
threads.
readv
and writev
#include <sys/uio.h>
ssize_t readv(int fd, const struct iovec *iov, int iovcnt); // Returns number of bytes read, 0 on EOF, or –1 on error
ssize_t writev(int fd, const struct iovec *iov, int iovcnt); // Returns number of bytes written, or –1 on error
readv
and writev
allow for scatter-gather
I/O, which transfer multiple buffers of data to and from files in a
single system call.
Each element of iov
is a struct:
struct iovec {
void *iov_base; /* Start address of buffer */
size_t iov_len; /* Number of bytes to transfer to/from buffer */
};
readv
reads a contiguous sequence of bytes from the file
at fd
and then scatters these bytes into the buffers at
iov
. It fills up each buffer in order, before proceeding to
the next buffer.
readv
completes atomically, which is an improvement over
read
. readv
returns the number of bytes read
or 0 if it encountered EOF
.
writev
performs gather output, which concatenates data
from all the buffers specified by iov
and writes them as a
sequence of contigious bytes to the file referred to by the file
descriptor fd
.
writev
also completes atomically, and also can partially
write, so checking its return value (same as readv
is
important.
Linux also allows for scatter-gather I/O at a specified offset with
preadv
and pwritev
.
#define _BSD_SOURCE
#include <sys/uio.h>
ssize_t preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset); // Returns number of bytes read, 0 on EOF, or –1 on error
ssize_t pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset); // Returns number of bytes written, or –1 on error
This is useful for multithreaded applications that also want to use scatter-gather I/O.
truncate
and
ftruncate
truncate
and ftruncate
set the size of a
file to the value specified by length
.
#include <unistd.h>
int truncate(const char *pathname, off_t length);
int ftruncate(int fd, off_t length); // Both return 0 on success, or –1 on error
If the file is longer than length
, the excess data is
lost. If the file is shorter than length, it is extended with null bytes
or a hole.
truncate
takes a pathname, and ftruncate
takes a file descriptor.
The O_NONBLOCK
flag when opening a flag can be used for
nonblocking I/O, which means:
open
returns an error instead of blocking.open
, subsequent I/O operations are also
non-blocking. If an I/O system call can’t complete immediately, either
partial data transfer is performed, or the system call fails with
EAGAIN
or EWOULDBLOCK
.Since off_t
used to hold a file offset is a signed long,
on 32-bit architectures, this would limit files to 2GB in size, which
disks were much larger than. There was a large file summit (LFS)
interface that allowed for the extension of SUSv2 to support larger
files.
The lfs interface has the same name as normal functions but with
64
appended, like open64
.
As well, lfs includes some other structs, like stat64
and off64_t
.
We can also just set the _FILE_OFFSET_BITS
macro to 64,
which converts all system calls into their LFS equivalents
automatically.
/dev/fd
directoryThe virtual directory /dev/fd
contains filenames
/dev/fd/n
, where 0..n
is a number
corresponding to one of the open file descriptors for the process.
Opening one of the files in /dev/fd
is the same as
duplicating the file descriptor.
Creating temporary files can be done with mkstemp
or
tmpfile
, from glibc.
#include <stdlib.h>
int mkstemp(char *template); // Returns file descriptor on success, or –1 on error
The template argument is a pathname in which the last 6 characters
must be XXXXXX
, which will be replaced to create a unique
filename.
mkstemp
creates the file with read and writer
permissions for the file owner, no permissions for other users, and
opens it with O_EXCL
, so the caller has exclusive access to
the file.
tmpfile
creates a uniquely named temporary file that is
opened for reading and writing. It is similarly opened with
O_EXCL
.
#include <stdio.h>
FILE *tmpfile(void); // Returns file pointer on success, or NULL on error
Prev: file-io-the-universal-io-model Next: processes