This project was developed for 42 school. For comprehensive information regarding the requirements, please consult the PDF file in the subject folder of the repository. Furthermore, I have provided my notes and a concise summary below.
+ keywords: multi-process programming
+ unidirectional communication (data flows one way through the pipe)
The program will be executed as follows:
./pipex file1 cmd1 cmd2 file2
$> ./pipex infile "ls -l" "wc -l" outfile
Should behave like: < infile ls -l | wc -l > outfile
$> ./pipex infile "grep a1" "wc -w" outfile
Should behave like: < infile grep a1 | wc -w > outfile
It must take four arguments: `file1` and `file2` are file names, and `cmd1` and `cmd2` are shell commands with their parameters.
The program executes `cmd1` with the contents of `infile` as input, and redirects the output to `cmd2`, which writes the result to `outfile`.
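A minimal sketch of how the two shell redirections (`< infile` and `> outfile`) might map to `open()` calls. The helper names `open_infile` and `open_outfile` are hypothetical and not taken from the project; the flags and the 0644 mode follow the notes in the table further down.

```c
#include <fcntl.h>
#include <unistd.h>

/* Like `< infile`: open an existing file read-only.
   Returns a file descriptor, or -1 if the file cannot be opened. */
int	open_infile(const char *path)
{
	return (open(path, O_RDONLY));
}

/* Like `> outfile`: create the file if it does not exist (O_CREAT),
   clear its contents if it does (O_TRUNC), and open it write-only.
   0644 = owner read/write, group read-only, others read-only. */
int	open_outfile(const char *path)
{
	return (open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644));
}
```

Both helpers return -1 on failure, so the caller can check the result before handing the descriptor to `dup2()`.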
The parent process is responsible for setting up the input and output redirection and coordinating the execution of the child processes. It creates the pipe to establish communication channels between the processes.
- The parent process calls pipe() to create a pipe and obtains the read and write file descriptors;
- The parent process calls fork() twice to create two children;
- The children inherit the file descriptors from the parent;
- The children close the unnecessary end of the pipe (e.g., the write end if it only needs to read, or the read end if it only needs to write);
- First child process `pid1` executes `cmd1` with the contents of `infile` as input and writes data to the pipe using the write file descriptor;
- Second child process `pid2` executes `cmd2`, taking the pipe's read end as its input, and writes the result to `outfile`;
- The parent process waits for both child processes to finish before exiting.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int     fd[2];
    pid_t   pid1;
    pid_t   pid2;

    if (pipe(fd) == -1)
        return (1);
    pid1 = fork();
    if (pid1 < 0)
        return (2);
    if (pid1 == 0) {
        // first child (ping): its stdout becomes the write end of the pipe
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        // execlp searches the PATH environment variable for "ping"
        execlp("ping", "ping", "-c", "5", "google.com", NULL);
        // only reached if execlp failed
        _exit(127);
    }
    // else is not necessary: from here on, the code is only executed by the parent
    pid2 = fork();
    if (pid2 < 0)
        return (3);
    if (pid2 == 0) {
        // second child (grep): its stdin becomes the read end of the pipe
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]);
        close(fd[1]);
        // "round-trip" matches the macOS ping summary line (Linux prints "rtt")
        execlp("grep", "grep", "round-trip", NULL);
        // only reached if execlp failed
        _exit(127);
    }
    // the parent closes both ends, otherwise grep never sees end-of-file
    close(fd[0]);
    close(fd[1]);
    waitpid(pid1, NULL, 0);
    waitpid(pid2, NULL, 0);
    return (0);
}
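The ping/grep example hardcodes both commands. Below is a rough sketch of how the same pattern could be adapted to the `file1 cmd1 cmd2 file2` shape. `run_pipeline` and `run_child` are hypothetical names, the commands are assumed to be already split into NULL-terminated argv arrays, and `execvp()` is swapped in for brevity (it searches PATH by itself; a project restricted to `execve()` would need its own lookup). Returning the exit status of the second command is also an assumption, chosen to mirror what a shell pipeline reports.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs in a child: in_fd becomes stdin, out_fd becomes stdout,
   unused_fd is the pipe end this child does not need. Never returns. */
static void	run_child(int in_fd, int out_fd, int unused_fd, char *const argv[])
{
	if (in_fd < 0 || out_fd < 0)
		_exit(1);	/* an open() failed */
	if (dup2(in_fd, STDIN_FILENO) == -1 || dup2(out_fd, STDOUT_FILENO) == -1)
		_exit(1);
	close(in_fd);
	close(out_fd);
	close(unused_fd);
	execvp(argv[0], argv);
	_exit(127);		/* only reached if execvp failed */
}

/* Behaves roughly like: < infile cmd1 | cmd2 > outfile
   Returns the exit status of cmd2, or -1 on an internal error. */
int	run_pipeline(const char *infile, char *const cmd1[],
		char *const cmd2[], const char *outfile)
{
	int		fd[2];
	pid_t	pid1;
	pid_t	pid2;
	int		status;

	if (pipe(fd) == -1)
		return (-1);
	pid1 = fork();
	if (pid1 == 0)
		run_child(open(infile, O_RDONLY), fd[1], fd[0], cmd1);
	pid2 = -1;
	if (pid1 > 0)
		pid2 = fork();
	if (pid2 == 0)
		run_child(fd[0], open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644),
			fd[1], cmd2);
	/* parent: close both ends so cmd2 can see end-of-file */
	close(fd[0]);
	close(fd[1]);
	if (pid1 > 0)
		waitpid(pid1, NULL, 0);
	if (pid2 < 0 || waitpid(pid2, &status, 0) == -1)
		return (-1);
	if (WIFEXITED(status))
		return (WEXITSTATUS(status));
	return (-1);
}
```

Each file is opened inside its own child, so a missing `infile` only makes the first command fail while the second one still runs, which is close to how the shell behaves. A call could look like `run_pipeline("infile", cmd1, cmd2, "outfile")` with `char *cmd1[] = {"ls", "-l", NULL}` and `char *cmd2[] = {"wc", "-l", NULL}`.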
Task | Prototype | Description |
---|---|---|
fork() | `pid_t fork(void)`; returns 0 in the child, the child's PID (a positive number) in the parent, and -1 on error | Forks the line of execution: parent and child run in parallel, and the child gets a copy of the parent's memory. After the call, the two processes are independent and can follow different code paths |
fd | fd = 0 (STDIN), fd = 1 (STDOUT), fd = 2 (STDERR), fd = 3 (file.txt) | File descriptor: a number unique within a process that works as a key to an input/output resource. The OS maintains it in the process's file descriptor table |
pid_t | `pid_t fork(void)` | Data type for process IDs (pid stands for process ID) |
pipe() | `int pipe(int pipefd[2])`; fills `pipefd` with two file descriptors, returns 0 on success and -1 on error | Communication between processes: a unidirectional buffer that you write to through `fd[1]` (in pipex, duplicated onto STDOUT of the first command) and read from through `fd[0]` (duplicated onto STDIN of the second command) |
exit() | `noreturn void exit(int status)` | Causes normal process termination and returns control to the operating system. `exit(1)` terminates the whole program with an error status, while `return` only leaves the current function and provides a return value. The status can be `EXIT_SUCCESS` (0) or `EXIT_FAILURE` (1) |
wait() | `waitpid(pipex.pid1, NULL, 0)` | Stops the caller until the given child process is finished. `NULL` means that the parent process is not interested in the exit status of the child. The `0` is the options argument of `waitpid()`: with no options set, the parent blocks until the specified child terminates. Here the parent waits for the first child `pipex.pid1` to finish before proceeding further |
dup() | `int dup(int oldfd)`; returns the new file descriptor, or -1 on error | Duplicates a fd, so you can have two fds pointing to the same file. You cannot choose the value of the new fd: the lowest unused number is picked |
dup2() | `int dup2(int oldfd, int newfd)`; returns the new file descriptor. On error, -1 is returned and errno is set to indicate the error | Duplicates a fd like `dup()`, but you choose the new value: `newfd` is made to refer to the same open file description as `oldfd`. If `newfd` was previously open, it is closed before being reused |
PATH | `echo $PATH`, `which ls` | Environment variable (Unix-like operating systems) that contains a colon-separated list of directories, e.g. `/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin`, each one a search location for executable files. If the command is found in none of them, you receive a 'command not found' error. A flexible and convenient way to execute commands without needing to know the exact location of the executable |
open() | `int open(const char *pathname, int flags)` or `int open(const char *pathname, int flags, mode_t mode)`; returns a file descriptor, or -1 on error | Opens and possibly creates a file. When the file may be created (O_CREAT), you must add the file permissions as the third argument! In my case I used 0644 (octal format): owner has read and write permissions (4 + 2 = 6), group has read-only permission (4), others also have read-only permission (4) |
O_TRUNC | x | If the file already exists, its contents are cleared before any data is written to it. Ensures that the output file starts in a clean state |
O_CREAT | x | Creates the file if it does not exist |
close() | `int close(int fd)`; returns zero on success, -1 on error | Takes an integer parameter representing the file descriptor to close. As a rule, each process closes the end of the pipe it does not use, e.g. if you write, close the read end, and vice versa |
access() | `int access(const char *pathname, int mode)`; on success (all requested permissions granted, or mode is F_OK and the file exists), zero is returned. Otherwise, -1 is returned | Checks whether the calling process can access the file `pathname`. F_OK tests for the existence of the file. R_OK, W_OK, and X_OK test whether the file exists and grants read, write, and execute permissions, respectively |
execlp() | `int execlp(const char *file, const char *arg, ...)`; only returns if an error has occurred (-1) | The first argument is the name of the file to execute. The following arguments (arg0, arg1, ..., argn) are pointers to null-terminated strings that form the argument list available to the executed program; the list must be terminated by NULL. The file is sought in the colon-separated list of directory pathnames specified in the PATH environment variable |
execve() | `int execve(const char *pathname, char *const argv[], char *const envp[])` | Executes a different program within your process, effectively replacing the current one. Everything after `execve()` won't run! `execve()` only returns if an error occurs (-1) |
struct | `typedef struct` | Declares a new datatype of your own, unifying several variables of different datatypes into a single, new variable. Dot notation (`.`) is used to access members when you have an actual instance of the struct, whereas arrow notation (`->`) is used when you have a pointer to the struct |
linked list | `typedef struct node {int number; struct node *next;} node;` | A more dynamic data structure: you can expand or shrink it, because its nodes are spread out in memory wherever there is room (it is not contiguous like an array). To find the elements, every value carries metadata: a pointer to the next node. The `next` pointer of the last node is NULL (absence of an address, 0x0). The tradeoff is that it uses more memory |
fsanitize | `-fsanitize=address -g` | Check sanitizer support: run the command `clang --help \| grep sanitize` in your terminal to see if the sanitizer options are listed |
lldb | Run -> `lldb ./pipex`, then `run grocery_list.txt "head -4" "cat" sorted3.txt` | Interactive debugger tool (attach events/errors to the program, explore source code). To enable debugging symbols with LLDB, you need to compile your program with the `-g` flag, which tells the compiler (e.g. gcc or clang) to include debug information in the executable file. Relaunch -> `target create ./pipex`. Other commands: breakpoint `b`, backtrace `bt`, graphical user interface `gui` |
valgrind | `valgrind --track-fds=yes` | Checks if all your fds are closed at the end of the process. Run it in your terminal and not in VSCode, otherwise it will also show the fds opened to allow communication between its sandbox and your computer |
cool tests | `/dev/random` | An empty string as first cmd and `ls` as second cmd (this should throw an error, but produce an output anyway); `/dev/random` as infile; test if there are open fds with `valgrind --track-fds=yes`; handling errors with ft_split (what happens if cmd is NULL?) |
empty string | `./pipex grocery_list.txt " " "" sorted.txt` | When using empty strings as arguments, perror will produce a success message. Therefore, I developed a flag to treat errors differently in those cases: `void msg_error(char *err, int empty)`. When `empty` is true, it forces an error description through `errno = 1;` |
MultiPass | Ubuntu | Use Ubuntu to test with Valgrind (so I can check if there are leaks or opened fds in my code). I chose Multipass to create an Ubuntu VM on my macOS: `multipass start pipex`, `multipass shell pipex`, `multipass stop pipex` |
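
The PATH, access() and execve() rows fit together: since `execve()` needs a full pathname, the command name first has to be resolved against the PATH directories. A minimal sketch of that lookup, assuming the PATH value has already been extracted into a string; `find_in_path` is a hypothetical helper name, not the project's own function, and standard library calls are used where a 42 project would use its libft equivalents.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns a malloc'ed "dir/cmd" for the first directory of the
   colon-separated list path_env in which cmd is executable,
   or NULL if there is none. Empty PATH entries are skipped. */
char	*find_in_path(const char *cmd, const char *path_env)
{
	const char	*start;
	const char	*end;
	char		*candidate;
	size_t		dir_len;
	size_t		size;

	if (cmd == NULL || *cmd == '\0' || path_env == NULL)
		return (NULL);
	start = path_env;
	while (*start != '\0')
	{
		end = strchr(start, ':');
		if (end == NULL)
			end = start + strlen(start);
		dir_len = (size_t)(end - start);
		size = dir_len + 1 + strlen(cmd) + 1;
		candidate = malloc(size);
		if (candidate == NULL)
			return (NULL);
		/* build "<dir>/<cmd>" from the current PATH entry */
		snprintf(candidate, size, "%.*s/%s", (int)dir_len, start, cmd);
		if (dir_len > 0 && access(candidate, X_OK) == 0)
			return (candidate);
		free(candidate);
		if (*end == '\0')
			break ;
		start = end + 1;
	}
	return (NULL);
}
```

The returned string can be passed as the first argument of `execve()` and must be freed by the caller. The sketch only checks X_OK, so a directory with a matching name would also be accepted; a fuller version would handle that case and commands that already contain a slash.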