Skip to content

Commit

Permalink
Boost subshell performance by avoiding expensive pwd related syscalls
Browse files Browse the repository at this point in the history
In ksh93v- a sh.pwdfd variable was introduced, ostensibly for
implementing 'cd -f', 'pwd -f' and '${.sh.pwdfd}' as new features. I
might backport these in the future, but those aren't the focus of this
commit. Rather, what caught my interest was the effect of this change on
subshell performance (results from shbench with 20 iterations, compiled
with CCFLAGS='-O2 -fPIC'):

----------------------------------------------------------------------------------------------------------------------------------
name           /tmp/ksh93u+2012-08-01  /tmp/ksh93u+m-dev       /tmp/ksh93v-2012-08-24  /tmp/ksh93v-2014-12-24  /tmp/ksh93u+redhat
----------------------------------------------------------------------------------------------------------------------------------
fibonacci.ksh  0.230 [0.227-0.235]     0.216 [0.211-0.222]     0.269 [0.266-0.273]     0.247 [0.242-0.251]     0.264 [0.260-0.269]
parens.ksh     0.208 [0.204-0.212]     0.173 [0.170-0.182]     0.147 [0.145-0.149]     0.125 [0.121-0.127]     0.140 [0.137-0.144]
----------------------------------------------------------------------------------------------------------------------------------

As can be seen above, 93u+ and 93u+m perform worse in the parens
benchmark than the ksh93v- release which implements the change, as well
as a copy of ksh93u+ compiled with Red Hat's patches for cd (fdstatus,
rmdirfix, cdfix and cdfix2 applied in that order). However, 93u+ and
93u+m perform better in the fibonacci benchmark.

93v- is faster in the parens benchmark because it obtains the file
descriptor for the current working directory via openat(2) in
sh_diropenat, then caches it in sh.pwdfd. This fd is then reused in
sh_subshell(), which allows 93v- to avoid taxing open(2) and fcntl(2)
syscalls.

However, the 93v- implementation has a flaw. Unlike in 93u+, it always
duplicates sh.pwdfd with fcntl, even when it's a superfluous operation.
This is the cause of the performance regression in the fibonacci
benchmark.

To increase ksh93u+m's performance beyond both 93u+ and 93v-, I have
backported 93v-'s code, but with a crucial change:
- sh_subshell() no longer duplicates sh.pwdfd with fcntl during
  virtual subshell scope initialization. Rather, a sh_pwdupdate
  function is used in tandem with sh_diropenat in b_cd() to
  do all of this when the directory is being changed.
  This fixes the regression in the fibonacci benchmark.
  sh_pwdupdate will be called whenever sh.pwdfd changes to
  prevent file descriptor leaks while also avoiding erroneous close
  operations on the parent shell's file descriptor for the PWD.

Other changes applied include:
- The removal of support for systems which lack fchdir. ksh93v- did this
  twelve years ago in the oldest archived alpha release, which I think
  is a more than long enough deprecation window.
- If sh.pwdfd is -1 (i.e., the shell was initialized in a nonexistent
  directory), don't fork preemptively. It's more efficient to let cd
  check sp->pwdfd via a short function before attempting to change the
  directory. This allows subshells run in nonexistent directories to
  avoid forking, provided cd is never invoked.
- Removed <=1995 hackery from nv_open and env_init. During SH_INIT
  we should simply always invoke path_pwd(); giving $PWD NV_TAGGED is
  not just unnecessary, but possibly harmful when one attempts to use
  'typeset -t' with the PWD variable.
- With the removal of that code, sh.pwd is guaranteed to get set during
  shell initialization (if it wasn't already). As such, the code no
  longer needs to cope with a NULL pwd and many such checks have
  been removed.
- sh.pwd no longer has a pointless and misleading const modifier. This
  allows for the elimination of many casts involving sh.pwd.
- sh.pwdfd is reset every time the pwd is (re)initialized in
  path_pwd().
- libast/features/fcntl.c: Account for systems which don't implement
  O_SEARCH and/or O_DIRECTORY. (I avoided backporting the ksh93v-
  feature test code because it's ginormous and filled with far too
  much compatibility hackery. If anyone wants to backport the mess
  that are the libast syscall intercepts, be my guest. In any case
  ksh doesn't need any of that to function correctly.)
- Goat sacrifices are hereby banned from 93u+m (for the curious,
  please see att#567).

With these changes ksh93u+m now handily beats the previous ksh releases
in the fibonacci and parens benchmarks (ksh2020 included for comparison):

----------------------------------------------------------------------------------------------------------------------------------
name           /tmp/ksh93u+2012-08-01  /tmp/ksh93v-2014-12-24  /tmp/ksh2020            /tmp/ksh93u+m-dev       /tmp/ksh93u+m-patch
----------------------------------------------------------------------------------------------------------------------------------
fibonacci.ksh  0.230 [0.224-0.236]     0.247 [0.244-0.252]     0.369 [0.364-0.375]     0.217 [0.213-0.222]     0.202 [0.198-0.206]
parens.ksh     0.207 [0.203-0.213]     0.124 [0.121-0.127]     0.165 [0.162-0.169]     0.173 [0.171-0.176]     0.082 [0.080-0.085]
----------------------------------------------------------------------------------------------------------------------------------
  • Loading branch information
JohnoKing committed Dec 19, 2024
1 parent 53937b9 commit a1df0fb
Show file tree
Hide file tree
Showing 10 changed files with 131 additions and 108 deletions.
71 changes: 56 additions & 15 deletions src/cmd/ksh93/bltins/cd_pwd.c
Original file line number Diff line number Diff line change
Expand Up @@ -46,14 +46,32 @@ static void rehash(Namval_t *np,void *data)
nv_rehash(np,data);
}

/*
* Obtain a file handle to the directory "path" relative to directory "dir"
*/
int sh_diropenat(int dir, const char *path)
{
int fd,shfd;
if((fd = openat(dir, path, O_DIRECTORY|O_NONBLOCK|O_cloexec)) < 0)
#if O_SEARCH
if(errno != EACCES || (fd = openat(dir, path, O_SEARCH|O_DIRECTORY|O_NONBLOCK|O_cloexec)) < 0)
#endif
return fd;
/* Move fd to a number > 10 and register the fd number with the shell */
shfd = sh_fcntl(fd, F_dupfd_cloexec, 10);
close(fd);
return(shfd);
}

int b_cd(int argc, char *argv[],Shbltin_t *context)
{
char *dir;
Pathcomp_t *cdpath = 0;
const char *dp;
int saverrno=0;
int rval,pflag=0,eflag=0,ret=1;
char *oldpwd;
int newdirfd;
char *oldpwd, *cp;
Namval_t *opwdnod, *pwdnod;
NOT_USED(context);
while((rval = optget(argv,sh_optcd))) switch(rval)
Expand Down Expand Up @@ -118,13 +136,8 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
* If sh_subshell() in subshell.c cannot use fchdir(2) to restore the PWD using a saved file descriptor,
* we must fork any virtual subshell now to avoid the possibility of ending up in the wrong PWD on exit.
*/
if(sh.subshell && !sh.subshare)
{
#if _lib_fchdir
if(!test_inode(sh.pwd,e_dot))
#endif
sh_subfork();
}
if(sh.subshell && !sh.subshare && (!sh_validate_subpwdfd() || !test_inode(sh.pwd,e_dot)))
sh_subfork();
/*
* Do $CDPATH processing, except if the path is absolute or the first component is '.' or '..'
*/
Expand Down Expand Up @@ -181,21 +194,49 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
}
if(!pflag)
{
char *cp;
stkseek(sh.stk,PATH_MAX+PATH_OFFSET);
if(*(cp=stkptr(sh.stk,PATH_OFFSET))=='/')
if(!pathcanon(cp,PATH_DOTDOT))
continue;
}
if((rval=chdir(path_relative(stkptr(sh.stk,PATH_OFFSET)))) >= 0)
goto success;
if(errno!=ENOENT && saverrno==0)
cp = path_relative(stkptr(sh.stk,PATH_OFFSET));
rval = newdirfd = sh_diropenat((sh.pwdfd>0)?sh.pwdfd:AT_FDCWD,cp);
if(newdirfd>0)
{
/* chdir for directories on HSM/tapeworms may take minutes */
if((rval=fchdir(newdirfd)) >= 0)
{
sh_pwdupdate(newdirfd);
goto success;
}
sh_close(newdirfd);
}
#if !O_SEARCH
else if((rval=chdir(cp)) >= 0)
sh_pwdupdate(sh_diropenat(AT_FDCWD,cp));
#endif
if(saverrno==0)
saverrno=errno;
}
while(cdpath);
if(rval<0 && *dir=='/' && *(path_relative(stkptr(sh.stk,PATH_OFFSET)))!='/')
rval = chdir(dir);
/* use absolute chdir() if relative chdir() fails */
{
rval = newdirfd = sh_diropenat((sh.pwdfd>0)?sh.pwdfd:AT_FDCWD,dir);
if(newdirfd>0)
{
/* chdir for directories on HSM/tapeworms may take minutes */
if((rval=fchdir(newdirfd)) >= 0)
{
sh_pwdupdate(newdirfd);
goto success;
}
sh_close(newdirfd);
}
#if !O_SEARCH
else if((rval=chdir(dir)) >= 0)
sh_pwdupdate(sh_diropenat(AT_FDCWD,dir));
#endif
}
if(rval<0)
{
if(saverrno)
Expand All @@ -221,7 +262,7 @@ int b_cd(int argc, char *argv[],Shbltin_t *context)
if(*dp && (*dp!='.'||dp[1]) && strchr(dir,'/'))
sfputr(sfstdout,dir,'\n');
nv_putval(opwdnod,oldpwd,NV_RDONLY);
free((void*)sh.pwd);
free(sh.pwd);
if(*dir == '/')
{
size_t len = strlen(dir);
Expand Down
2 changes: 1 addition & 1 deletion src/cmd/ksh93/features/externs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
hdr nc
mem exception.name,_exception.name math.h
lib setreuid,setregid,nice,fork,fchdir
lib setreuid,setregid,nice,fork
lib pathnative,pathposix
lib memcntl sys/mman.h

Expand Down
3 changes: 3 additions & 0 deletions src/cmd/ksh93/include/defs.h
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,7 @@ extern char *sh_checkid(char*,char*);
extern void sh_chktrap(void);
extern void sh_deparse(Sfio_t*,const Shnode_t*,int,int);
extern int sh_debug(const char*,const char*,const char*,char *const[],int);
extern int sh_diropenat(int,const char *);
extern char **sh_envgen(void);
extern Sfdouble_t sh_arith(const char*);
extern void *sh_arithcomp(char*);
Expand Down Expand Up @@ -141,6 +142,8 @@ extern Dt_t *sh_subfuntree(int);
extern void sh_subjobcheck(pid_t);
extern int sh_subsavefd(int);
extern void sh_subtmpfile(void);
extern void sh_pwdupdate(int);
extern int sh_validate_subpwdfd(void);
extern char *sh_substitute(const char*,const char*,char*);
extern void sh_timetraps(void);
extern const char *_sh_translate(const char*);
Expand Down
3 changes: 2 additions & 1 deletion src/cmd/ksh93/include/shell.h
Original file line number Diff line number Diff line change
Expand Up @@ -371,7 +371,8 @@ struct Shell_s
int offsets[10];
Sfio_t **sftable;
unsigned char *fdstatus;
const char *pwd;
char *pwd;
int pwdfd; /* file descriptor for pwd */
void *jmpbuffer;
void *mktype;
Sfio_t *strbuf;
Expand Down
2 changes: 2 additions & 0 deletions src/cmd/ksh93/sh/fault.c
Original file line number Diff line number Diff line change
Expand Up @@ -687,6 +687,8 @@ noreturn void sh_done(int sig)
if(sh_isoption(SH_NOEXEC))
kiaclose((Lex_t*)sh.lex_context);
#endif /* SHOPT_KIA */
if(sh.pwdfd > 0)
close(sh.pwdfd);
/* Exit with portable 8-bit status (128 + signum) if last child process exits due to signal */
if(sh.chldexitsig)
savxit = savxit & ~SH_EXITSIG | 0200;
Expand Down
6 changes: 1 addition & 5 deletions src/cmd/ksh93/sh/init.c
Original file line number Diff line number Diff line change
Expand Up @@ -1906,11 +1906,7 @@ static void env_init(void)
}
if(save_env_n)
sh.save_env_n = save_env_n;
if(nv_isnull(PWDNOD) || nv_isattr(PWDNOD,NV_TAGGED))
{
nv_offattr(PWDNOD,NV_TAGGED);
path_pwd();
}
path_pwd();
if((cp = nv_getval(SHELLNOD)) && (sh_type(cp)&SH_TYPE_RESTRICTED))
sh_onoption(SH_RESTRICTED); /* restricted shell */
}
Expand Down
4 changes: 0 additions & 4 deletions src/cmd/ksh93/sh/name.c
Original file line number Diff line number Diff line change
Expand Up @@ -1507,11 +1507,7 @@ Namval_t *nv_open(const char *name, Dt_t *root, int flags)
{
cp++;
if(sh_isstate(SH_INIT))
{
nv_putval(np, cp, NV_RDONLY);
if(np==PWDNOD)
nv_onattr(np,NV_TAGGED);
}
else
{
char *sub=0, *prefix= sh.prefix;
Expand Down
8 changes: 5 additions & 3 deletions src/cmd/ksh93/sh/path.c
Original file line number Diff line number Diff line change
Expand Up @@ -213,8 +213,8 @@ char *path_pwd(void)
if(sh.pwd)
{
if(*sh.pwd=='/')
return (char*)sh.pwd;
free((void*)sh.pwd);
return sh.pwd;
free(sh.pwd);
}
/* First see if PWD variable is correct */
pwdnod = sh_scoped(PWDNOD);
Expand Down Expand Up @@ -249,7 +249,9 @@ char *path_pwd(void)
if(!tofree)
cp = sh_strdup(cp);
sh.pwd = cp;
return (char*)sh.pwd;
/* Set sh.pwdfd */
sh_pwdupdate(sh_diropenat(AT_FDCWD,cp));
return sh.pwd;
}

/*
Expand Down
126 changes: 47 additions & 79 deletions src/cmd/ksh93/sh/subshell.c
Original file line number Diff line number Diff line change
Expand Up @@ -39,14 +39,6 @@
# define PIPE_BUF 512
#endif

#ifndef O_SEARCH
# ifdef O_PATH
# define O_SEARCH O_PATH
# else
# define O_SEARCH O_RDONLY
# endif
#endif

/*
* Note that the following structure must be the same
* size as the Dtlink_t structure
Expand Down Expand Up @@ -90,12 +82,9 @@ static struct subshell
char comsub;
unsigned int rand_seed; /* parent shell $RANDOM seed */
int rand_last; /* last random number from $RANDOM in parent shell */
int rand_state; /* 0 means sp->rand_seed hasn't been set, 1 is the opposite */
char rand_state; /* 0 means sp->rand_seed hasn't been set, 1 is the opposite */
uint32_t srand_upper_bound; /* parent shell's upper bound for $SRANDOM */
#if _lib_fchdir
int pwdfd; /* file descriptor for PWD */
char pwdclose;
#endif /* _lib_fchdir */
int pwdfd; /* parent shell's file descriptor for PWD */
} *subshell_data;

static unsigned int subenv;
Expand Down Expand Up @@ -457,6 +446,28 @@ void sh_subjobcheck(pid_t pid)
}
}

/*
* Set the file descriptor for the current shell's PWD without wiping
* out the stored file descriptor for the parent shell's PWD.
*/
void sh_pwdupdate(int fd)
{
struct subshell *sp = subshell_data;
if(!(sh.subshell && !sh.subshare && sh.pwdfd == sp->pwdfd) && sh.pwdfd > 0)
sh_close(sh.pwdfd);
sh.pwdfd = fd;
}

/*
* Check if the parent shell has a valid PWD fd to return to.
* Only for use by cd inside of virtual subshells.
*/
int sh_validate_subpwdfd(void)
{
struct subshell *sp = subshell_data;
return sp->pwdfd > 0;
}

/*
* Run command tree <t> in a virtual subshell
* If comsub is not null, then output will be placed in temp file (or buffer)
Expand Down Expand Up @@ -505,8 +516,6 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
path_get(e_dot);
sh.pathinit = 0;
}
if(!sh.pwd)
path_pwd();
sp->bckpid = sh.bckpid;
if(comsub)
sh_stats(STAT_COMSUB);
Expand All @@ -521,38 +530,9 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
sh.comsub = comsub;
if(!sh.subshare)
{
struct subshell *xp;
char *save_debugtrap = 0;
#if _lib_fchdir
sp->pwdfd = -1;
for(xp=sp->prev; xp; xp=xp->prev)
{
if(xp->pwdfd>0 && xp->pwd && strcmp(xp->pwd,sh.pwd)==0)
{
sp->pwdfd = xp->pwdfd;
break;
}
}
if(sp->pwdfd<0)
{
int n = open(e_dot,O_SEARCH);
if(n>=0)
{
sp->pwdfd = n;
if(n<10)
{
sp->pwdfd = sh_fcntl(n,F_DUPFD,10);
close(n);
}
if(sp->pwdfd>0)
{
fcntl(sp->pwdfd,F_SETFD,FD_CLOEXEC);
sp->pwdclose = 1;
}
}
}
#endif /* _lib_fchdir */
sp->pwd = (sh.pwd?sh_strdup(sh.pwd):0);
sp->pwd = sh_strdup(sh.pwd);
sp->pwdfd = sh.pwdfd;
sp->mask = sh.mask;
sh_stats(STAT_SUBSHELL);
/* save trap table */
Expand Down Expand Up @@ -626,21 +606,6 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
if(sh.savesig < 0)
{
sh.savesig = 0;
#if _lib_fchdir
if(sp->pwdfd < 0 && !sh.subshare) /* if we couldn't get a file descriptor to our PWD ... */
sh_subfork(); /* ...we have to fork, as we cannot fchdir back to it. */
#else
if(!sh.subshare)
{
if(sp->pwd && access(sp->pwd,X_OK)<0)
{
free(sp->pwd);
sp->pwd = NULL;
}
if(!sp->pwd)
sh_subfork();
}
#endif /* _lib_fchdir */
/* Virtual subshells are not safe to suspend (^Z, SIGTSTP) in the interactive main shell. */
if(sh_isstate(SH_INTERACTIVE))
{
Expand Down Expand Up @@ -824,25 +789,29 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
free(savsig);
}
sh.options = sp->options;
/* restore the present working directory */
#if _lib_fchdir
if(sp->pwdfd > 0 && fchdir(sp->pwdfd) < 0)
#else
if(sp->pwd && strcmp(sp->pwd,sh.pwd) && chdir(sp->pwd) < 0)
#endif /* _lib_fchdir */
if(sh.pwdfd != sp->pwdfd)
{
saveerrno = errno;
fatalerror = 2;
/*
* Restore the parent shell's present working directory.
* Note: cd will always fork if sp->pwdfd is -1 (after calling sh_validate_subpwdfd()),
* which only occurs when a subshell is started with sh.pwdfd == -1. As such, in this
* if block sp->pwdfd is always > 0 (whilst sh.pwdfd is guaranteed to differ, and
* might not be valid).
*/
if(fchdir(sp->pwdfd) < 0)
{
/* Couldn't fchdir back; close the fd and cope with the error */
sh_close(sp->pwdfd);
saveerrno = errno;
fatalerror = 2;
}
else if(sp->pwd && strcmp(sp->pwd,sh.pwd))
path_newdir(sh.pathlist);
if(fatalerror != 2)
sh_pwdupdate(sp->pwdfd);
}
else if(sp->pwd && strcmp(sp->pwd,sh.pwd))
path_newdir(sh.pathlist);
if(sh.pwd)
free((void*)sh.pwd);
free(sh.pwd);
sh.pwd = sp->pwd;
#if _lib_fchdir
if(sp->pwdclose)
close(sp->pwdfd);
#endif /* _lib_fchdir */
if(sp->mask!=sh.mask)
umask(sh.mask=sp->mask);
if(sh.coutpipe!=sp->coutpipe)
Expand Down Expand Up @@ -918,8 +887,7 @@ Sfio_t *sh_subshell(Shnode_t *t, volatile int flags, int comsub)
UNREACHABLE();
case 2:
/* reinit PWD as it will be wrong */
if(sh.pwd)
free((void*)sh.pwd);
free(sh.pwd);
sh.pwd = NULL;
path_pwd();
errno = saveerrno;
Expand Down
Loading

0 comments on commit a1df0fb

Please sign in to comment.