Pipex

This project was developed for 42 school. For comprehensive information regarding the requirements, please consult the PDF file in the subject folder of the repository. Furthermore, I have provided my notes and a concise summary below.

+ keywords: multi-processes programming
+ unidirectional

Mindmap (shinckel, 2023)

High-level Overview

The program will be executed as follows:

./pipex file1 cmd1 cmd2 file2

$> ./pipex infile "ls -l" "wc -l" outfile
Should behave like: < infile ls -l | wc -l > outfile

$> ./pipex infile "grep a1" "wc -w" outfile
Should behave like: < infile grep a1 | wc -w > outfile

It must take four arguments: file1 and file2 are file names, and cmd1 and cmd2 are shell commands with their parameters. The program executes cmd1 with the contents of infile as input, and redirects the output to cmd2, which writes the result to outfile. The parent process is responsible for setting up the input and output redirection and coordinating the execution of the child processes. It creates the pipe to establish communication channels between the processes.

The parent process calls pipe() to create a pipe and obtains the read and write file descriptors;
The parent process calls fork() to create two children;
The children inherits the file descriptors from the parent;
The children close the unnecessary end of the pipe (e.g., the write end if it only needs to read, or the read end if it only needs to write);
First child process pid1 executes cmd1 with the contents of infile as input and writes data to the pipe using the write file descriptor;
Second child process pid2 executes cmd2, taking the pipe's read end as its input, and writes the result to outfile;
The parent process waits for both child processes to finish before exiting.

int	main(int argc, char* argv[])
{
	int fd[2];
	int pid1;
	int pid2;
	
	if (pipe(fd) == -1)
		return (1);
	pid1 = fork();
	if (pid1 < 0)
		return (2);
	if (pid1 == 0) {
		// first child (ping)
		dup2(fd[1], STDOUT_FILENO);
		close(fd[0]);
		close(fd[1]);
		// get access to the path environment variable
		execlp("ping", "ping", "-c", "5", "google.com", NULL);
	}
	// else is not necessary. after here, code only executed by the parent
	// duplicate fd, both pointing to the same pipe
	pid2 = fork();
	if (pid2 < 0)
		return (3);
	if (pid2 == 0) {
		// child process 2 (grep)
		dup2(fd[0], STDIN_FILENO);
		close(fd[0]);
		close(fd[1]);
		execlp("grep", "grep", "round-trip", NULL);
	}
	close(fd[0]);
	close(fd[1]);
	waitpid(pid1, NULL, 0);
	waitpid(pid2, NULL, 0);
	return (0);
}

Concepts

Task	Prototype	Description
`fork()`	`pid_t fork(void)`, id zero if child process, not-zero if main process, negative if error	Forking the execution line - parent and child processes in parallel, copy memory over. After its call, the parent and child processes are independent and can execute different code paths
fd	`fd = 0 (STDIN), fd = 1 (STDOUT), fd = 2 (STDERR), fd = 3 (file.txt)`	Unique number across a process. Key to an input/output resource, maintained by OS process's table
`pid_t`	`pid_t fork(void)`	Data type, pid stands for process id
`pipe()`	`int pipe(int pipefd[2])`, file descriptor	Communicate between processes, 'buffer' that saves memory that you can read(`fd[0]`, STDIN) and write(`fd[1]`, STDOUT) from it
`exit()`	`noreturn void exit(int status)`	cause normal process termination(and return control to the operating system). exit(1) is used to terminate the program with an error status, while return is used to exit from a function and provide a return value. The status can be EXIT_SUCCESS(0) or EXIT_FAILURE(1)
`wait()`	`waitpid(pipex.pid1, NULL, 0)`	Stop the execution until the process is finished. NULL means that the parent process is not interested in the exit status of the child. Zero specifies the options for the `waitpid()`, in this case, the parent process will block until the specified child process terminates. Parent process waits for the first child process `pipex.pid1` to finish its execution before proceeding further
`dup()`	`int dup(int oldfd)`, new file descriptor	Duplicates fd. You can have two fd pointing to the same file, but here isn't possible to set the new fd value
`dup2()`	`int dup2(int oldfd, int newfd)`, new file descriptor. On error, -1 is returned, and `errno` is set to indicate the error	Duplicates fd, allocates a new file descriptor that refers to the same open file description as the descriptor oldfd. So, you can set the new value. If file descriptor `newfd` was previously open, it is closed before being reused
`PATH`	`echo $PATH` `which ls`	(Unix-like operating systems) contains a list of directories, each one representing a search location for executable files. Otherwise, you will receive a 'command not found' error `/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin`. A flexible and convenient way to execute commands without needing to know the exact location of the executable
`open()`	`int open(const char *pathname, int flags)`, returns a file descriptor	open and possibly create a file. You must add the file permissions! In my case I used `0644`(octal format) = owner has read and write permissions `4 + 2 = 6`, group has read-only permission `4`, others also have read-only permission `4`
`O_TRUNC`	x	if the file already exists, its contents should be cleared before any data is written to it. Ensure that the output file starts with a clean state
`O_CREAT`	x	This flag is used to create the file if it does not exist
`close()`	`int close(int fd)`, returns zero on success. On error, -1	it takes an integer parameter representing the file descriptor to close. It is standard that you need to close one of the processes of the pipe, e.g.if you write, close the read end and vice-versa
`access()`	`int access(const char *pathname, int mode)`, On success (all requested permissions granted, or mode is F_OK and the file exists), zero is returned. Otherwise, -1 is returned	it checks whether the calling process can access the file pathname, F_OK tests for the existence of the file. R_OK, W_OK, and X_OK test whether the file exists and grants read, write, and execute permissions, respectively
`execlp()`	`int execlp(const char file, const char arg, ...)`, only return if an error has occurred(-1)	initial argument is the name of a file that is to be executed, subsequent ellipses(arg0, arg1, ..., argn). Together they describe a list of one or more pointers to null-terminated strings that represent the argument list available to the executed program. NULL terminated. The file is sought in the colon-separated list of directory pathnames specified in the PATH environment variable
`execve()`	`int execve(const char pathname, char const argv[], char *const envp[])`	you can execute a different program within your process, effectively replacing it through the function. Everything after `execve()` won't run! `execve()` only returns something if an error occurs (-1)
`struct`	`typedef struct`	Declare a new datatype of your own, unify several variables of different datatypes into a single, new variable. dot notation (.) is used to access members of a struct when you have an actual instance of the struct, whereas the arrow notation (->) is used to access members of a struct when you have a pointer to the struct
`linked list`	`typedef struct node {int number; struct node *next;} node;`	more dynamic data structure, you can expand or shrink it, as it is spread out in computer memory (it doesn't have contiguous memory as arrays). However, how to find it? Every number that I care about will have metadata(pointer to the next element). The last node will be NULL(absence of an address, 0x0). Plot it anywhere! Where there is room. Nodes connected via pointers (the tradeoff is: it uses more memory)
`fsanitize`	`-fsanitize=address -g`	check sanitizer support: Run the command `clang --help grep sanitize` in your terminal to see if the sanitizer options are listed
`lldb`	Run -> `lldb ./pipex` `run grocery_list.txt "head -4" "cat" sorted3.txt`	interactive debugger tool (attach events/errors to the program), explore source code. To enable debugging symbols with LLDB, you need to compile your program with the -g flag. This flag tells the compiler (e.g. gcc or clang) to include debug information in the executable file. Relaunch -> `target create ./pipex`. Other commands `breakpoint` `b`, `backtrace` `bt`, `graphical-user-interface` `gui`
`valgrind`	`valgrind --track-fds=yes`	Check if all your fds are closed at the end of the process. Do it in your terminal and not in VSCode, otherwise, it will show the fds opened to allow communication between its sandbox and your computer.
`cool tests`	`/dev/random`	empty string as first `cmd` and `ls` as second `cmd` (this should throw an error, but produce an output anyway), `/dev/random` as `infile`, test if there are open fds `valgrind --track-fds=yes`, handling errors with `ft_split` (what happen if cmd is `NULL`?)
`empty string`	`./pipex grocery_list.txt " " "" sorted.txt`	When using empty strings as arguments, `perror` will produce a `success` message. Therefore, I developed a flag to treat errors differently in those cases `void msg_error(char *err, int empty)`. When empty is true, it will force a error description through `errno = 1;`
MultiPass	`Ubuntu`	use ubuntu OS to test with Valgrind (so I can check if there are leaks or opened fds in my code). I chose to use Multipass for creating an Ubuntu VM in my MacOS `multipass start pipex` `multipass shell pipex` `multipass stop pipex`

Name		Name	Last commit message	Last commit date
Latest commit History 96 Commits
header		header
libft @ 1e4179c		libft @ 1e4179c
src		src
subject		subject
.gitignore		.gitignore
.gitmodules		.gitmodules
Makefile		Makefile
README.md		README.md
grocery_list.txt		grocery_list.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Pipex

High-level Overview

Concepts

About

Releases

Packages

Languages

shinckel/pipex

Folders and files

Latest commit

History

Repository files navigation

Pipex

High-level Overview

Concepts

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages