-
Notifications
You must be signed in to change notification settings - Fork 328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for linefeed \n
character in paths saved to ~/.local/share/lf/files
(paths delimiter bug)
#1114
Comments
The standard solution here is to use I think it would not be a bad idea to change the separator to |
I can also see a more dirty solution: save each copied/cut file in a separate file (the more you have selected, the more temporary files have been created). That will solve the whole Unicode/delimiter problem because each character in a single file is a part of one valid path name (1 path per file). P.S. I know about |
If Stackoverflow is to be believed, this is just not true, even on Windows
(where paths might be Unicode, AFAIK. POSIX paths are just sequences of bytes, allowed to be invalid utf8 or to be invalid in whatever your preferred encoding of Unicode is, as some programming languages keep reminding me):
https://stackoverflow.com/questions/54205087/how-can-i-create-a-file-with-null-bytes-in-the-filename
Even if you could somehow create such a file, most unix tools wouldn't be able to handle it. According to the link, the Linux kernel wouldn't be able to handle it. It actually defined a path as a sequence of non-null bytes.
|
Ok, now I can live happily again. I had some knowledge about this, but thankfully it was wrong. For once, I have a screenshot of UNIX/Windows prohibited characters in file names, but I completely skipped over the "printable" part which is also included in the screenshot (NULL is not printable). The second thing was a laboratory work in Operating Systems class. In the PDF, it is said that any character is ok for a name (except of course But in the end it is only logical that in C all strings are ended with NULL char. So yes, I was wrong and that's wonderful. |
Let me point out the options so far:
|
I'll add one more option: if we do shell encoding of file names (as opposed to base64), that has the advantage of only breaking backwards compatibility for files with special characters in their names. This does have the compatibility problem @Andrew15-5 mentioned in their original comment. Decoding of such filenames could be done with I'm not sure if this is worth the fuss. If we can find a way to transition to |
~/.local/share/lf/files
to be able to handle special names\n
character in paths saved to ~/.local/share/lf/files
(paths delimiter bug)
I noticed that there is a I'm thinking that perhaps we could switch to using |
This issue comes up from time to time and there was even a hot discussion about this in the past but in the end I still think it is more useful to keep things simple. Part of the problem that I haven't seen in this discussion is that environment variables can not contain |
I can agree with this, but there are things:
Yes, apparently you can't store |
@Andrew15-5 It's not a problem when you handle things in Go but users are not using Go for customization and the information should be passed to shell somehow. For example, you can't use |
I'm not familiar with Go, but you should be able to create a new process by providing a list such as Oh, wait. Are you talking about things like
I see your point. But we do still have an option to create a separate file for each path. On Linux and some other platforms (BSD), you can have up to 255 character length file names. This means that you can have |
I had an idea to use Something like the below implementation: diff --git a/app.go b/app.go
index 0c40d6c..c262326 100644
--- a/app.go
+++ b/app.go
@@ -113,27 +113,26 @@ func loadFiles() (list []string, cp bool, err error) {
}
defer files.Close()
- s := bufio.NewScanner(files)
-
- s.Scan()
-
- switch s.Text() {
+ reader := bufio.NewReader(files)
+ line, _, err := reader.ReadLine()
+ switch string(line) {
case "copy":
cp = true
case "move":
cp = false
default:
- err = fmt.Errorf("unexpected option to copy file(s): %s", s.Text())
+ err = fmt.Errorf("unexpected option to copy file(s): %s", string(line))
return
}
- for s.Scan() && s.Text() != "" {
- list = append(list, s.Text())
+ bytes, err := io.ReadAll(reader)
+ if err != nil {
+ err = fmt.Errorf("loading file list: %s", err)
+ return
}
- if s.Err() != nil {
- err = fmt.Errorf("scanning file list: %s", s.Err())
- return
+ if len(bytes) > 0 {
+ list = strings.Split(string(bytes), gOpts.filesep)
}
log.Printf("loading files: %v", list)
@@ -160,10 +159,7 @@ func saveFiles(list []string, cp bool) error {
fmt.Fprintln(files, "move")
}
- for _, f := range list {
- fmt.Fprintln(files, f)
- }
-
+ files.WriteString(strings.Join(list, gOpts.filesep))
files.Sync()
return nil
} Although this might end up being a breaking change since the format of |
Since the project in beta/alpha, this means that breaking changes are expected. To be a complete/mature file manager it has to support LF in file names since such file names are possible. But is definitely not a priority. On the other hand, if this will be the last breaking change to the file structure, then it's better to change the structure ASAP, IMO. Now user base is small with time it will only grow. |
@Andrew15-5 I don't know what you mean by beta/alpha but we have been avoiding breaking changes as much as possible for quite some time. @joelim-work Wouldn't that still be the case that users can not use To be honest, this newline in filenames issues have been going around for years and at this point I don't think there is an easy solution. An alternative approach might be to proactively give warnings to users before shell commands. So maybe we can consider adding a check before running shell commands to see if the filenames contain a newline and/or the |
Really? This doesn't mean that there weren't breaking changes, because they were in every second version or so (just by looking through https://github.com/gokcehan/lf/releases). I think there was a WIP status, but not anymore. I used to the semver, so I can't really tell just by looking at version r31.
Hmm, I think we are likely to introduce a breaking change anyway, so this is almost a fact. A hard refuse on super rare special names encountered (that can make a chaos) does make me feel safer. So you are saying that some kind of switch or an additional flag must be passed every time to bypass the error? Because otherwise the file manager wouldn't be 100% file manager if it can't manage special named files just because they're named strangely.
Right now (I didn't test it for a while though) when copying such a file it will take 2+ lines in the |
@Andrew15-5 All I'm saying is that we try to avoid breaking changes "as much as possible". In practice, it means if you have a suggestion that involves a breaking change, you also need to have an argument that is as strong as the change itself. I mentioned this because there has been previous suggestions for this issue that would break like 90% of all custom commands in existence. That is not the kind of breaking changes that we make these days. Instead, we often use the breaking label for changes that you should be aware of. For example, if we were to add a filename check for shell commands, we would likely have it listed as a breaking change in the release notes, even though it does not really break anything in practice, since it was already not working in the first place. Technically, you can always argue that every change in a software is a breaking change. In any case, you are welcome to have your own judgement on the stability of this software. In your case, I suggest you to interpret our every release as a major version in semver. Note, I consider I agree, some builtin commands like |
I understand what "as much as possible" means (in this context).
According to SemVer breaking changes is when the major number is increased. So things like new features that only add something new and don't break anything old is a non-breaking change, hence a minor number is increased. I mean, since you are deliberately (I assume) don't use SemVer or any other kind of v1.1(.1), it would indicate that the versions are treated differently. Now I know that each version is like a major. But it's probably more like major or minor. If there were or will be hotfix versions (to fix just bugs), then it's like patch too.
I would agree if we are closing our eyes on the fact that a file manager should manage files with its own API/commands. I consider using other file managers or shell as cheating, because it relies on 3rd party software (including shell, instead of direct Kernel/OS API). It should be better than this! All non-built-in/auxiliary commands/API is just an enhancement to allow for greater number of (nonessential) features and for bigger flexibility. Can you comment on this:
Because it kind of looks like you just want to add a precaution to not work with files with the |
@Andrew15-5 Why should there be a mechanism to override the error, if we already know beforehand that the command will fail? |
No-no. Mm, I imagine it a bit differently. Yes, the built-in commands will fail either way. But shell script/command might not fail (shell commands can handle arguments with I was thinking... I don't know... I have a thought, but now I don't think it's the right thing, but here it is. If the special name triggers the error, then if we (pass a magical flag to) ignore the error, then the algorithm of copying/pasting/deleting will be different. Normally in the |
@gokcehan @Andrew15-5 To clarify a few points in this discussion:
I forgot to mention this earlier, but theoretically you can set
There used to be such a WIP notice in the Just because there are breaking changes in every release, that does not mean that they can always be accepted. Breaking changes are not created equally and have to be decided on a case-by-case basis. In general a breaking change is more likely to be accepted if:
This isn't a problem with other file managers because they do not have the same design as The current format of The more I think about this, the less attractive I find the idea of using diff --git a/app.go b/app.go
index 0c40d6c..4ce07b3 100644
--- a/app.go
+++ b/app.go
@@ -2,6 +2,7 @@ package main
import (
"bufio"
+ "encoding/base64"
"fmt"
"io"
"log"
@@ -128,7 +129,9 @@ func loadFiles() (list []string, cp bool, err error) {
}
for s.Scan() && s.Text() != "" {
- list = append(list, s.Text())
+ if f, err := base64.StdEncoding.DecodeString(s.Text()); err == nil {
+ list = append(list, string(f))
+ }
}
if s.Err() != nil {
@@ -161,7 +164,7 @@ func saveFiles(list []string, cp bool) error {
}
for _, f := range list {
- fmt.Fprintln(files, f)
+ fmt.Fprintln(files, base64.StdEncoding.EncodeToString([]byte(f)))
}
files.Sync() This behavior could be toggled using an option named |
I think I already left a note about this. You already forgot that we are dealing with the very rare but totally valid case (using
I thought that I can think of another less ideal hack. We use a dynamic I want to know if Right now, I see 2 clear winners that will 100% help manage any file name:
If creating a server, and its (shellscript) API won't be a huge problem, then maybe we could transition the |
Again, in my opinion, a file manager has to be able to manage any files with after a clean installation. If we are going to "hide" or make an opt-in option for handling files with It's probably "too late" or something, but a lot of things are not "too late" it's just a strong emotional connection with all the work or use that is associated with the |
I do not think it should be expected from software in any stage of development to cover all the possibilities. That is actually true for most things: people will not get vaccines for all possible infections, cars won't be tested for all possible incidents but for the most common... Anyway, if anyone wants to cover new lines issue, you should also consider marks and tags |
I understand that it would be good to cover it, but I just disagree with the notion that it is a necessity for completeness/ok only for beta version software. While I always value the pursuit of perfection, it's important to consider the balance between effort and benefit. Especially in the context of unpaid effort in FOSS. As for the display format, while I don't anticipate ever seeing it, I prefer to have it displayed in a single line. So maybe |
Good.
What do you mean by that? Support for LF is ok/great in any version, but I'm talking about necessity to support it in the end, i.e. stable release (i.e., 1.0.0 in SemVer).
If you are worried that unpaid effort to add support for LF will be greater than benefit, then I can take over the implementation. But I don't know when I will be able to do this, considering that I don't even know Go and I failed to quickly setup Go LSP (a lot of import errors), although I don't think it will take more than 2 days to learn all the basics. And as most of us, I don't have enough time. The only question is which implementation @gokcehan will choose, if any (issue is still open, therefore I assume it (LF) will be supported eventually).
Since our UI taste (of how to display LF) is different, we can provide an option to change between different styles:
Default can be inline variant so that file will take the same number of lines as any other file would (the way users are used to see files). BTW, LF is for Line Feed |
They handle them enough to be able to copy/cut/delete them. And this is probably 90%–100% of functionality anyway. Because you can't create a file, and you don't create a file with a GUI FM (like Nautilus) — you create all files in 100s of different GUI apps. This is why I said "100%" previously. But in any FM you do create a lot of folders, but it is still nothing compared to an actual handling of files/folders, so this is why I said "90%".
Heh, you only mentioned ranger for renaming, huh? I wonder why. /s Fortunately during renaming Nautilus does show LF character which you can copy and paste it in the rename field, and it will save it just fine (or you can copy
IDK, ask the author. I probably would've started the same discussion there (maybe it already exists), but I don't use it anymore. I already left a note about that:
The fact that |
You've seen right through me :) Nautilus file display and renaming (as long as you can copy LF from somewhere) work great.
I did not mean lf should not support LF, just that it should support single line representation with |
You got me there. I threw them into discussion as examples only because I used them for some time and don't remember using anything else. I mean, I did use Dolphin, and it probably works very similar to Nautilus, but I don't have it installed, and I didn't use it as much. I could go out of my way and search for a better collection of stable FMs that do fully support LF char, but I'm too busy (and lazy).
Wonderful!
It only sounds like a headache to implement, but not to use (if implemented in a good way). I can see some solutions to some problems that can arise after setting "display LF literally" option, but also some can be more serious and will require more thinking, but overall, I don't see any UI/UX problems right now. And it's too early to discuss this anyway. |
To respond to some more points in this discussion:
It is already impractical enough that we are talking about dealing with filenames with newlines in them, but now we are extending this discussion to the filesystem of a hypothetical user in which every single possible byte happens to be used in filenames. As I have mentioned above, I agree with scrapping the idea of using
You are correct that
This is rather opinionated, and I am not convinced about changing the default behavior. You might personally be very invested in having
As such, changing the default behavior may benefit you, but result in a net negative experience for other users.
Thank you for bringing this up, I can also reproduce a crash when tagging a file with newlines. I think whatever is used to address
I am worried that adding a real newline could be confusing, intuitively you would think that each entry takes the same amount of space (i.e. one line). As for Printing a hardcoded value of diff --git a/ui.go b/ui.go
index 9a18120..2de2f49 100644
--- a/ui.go
+++ b/ui.go
@@ -468,7 +468,7 @@ func (win *win) printDir(screen tcell.Screen, dir *dir, context *dirContext, dir
maxFilenameWidth -= len(info)
}
- filename := []rune(f.Name())
+ filename := []rune(strings.ReplaceAll(f.Name(), "\n", "$'\\n'"))
if runeSliceWidth(filename) > maxFilenameWidth {
truncatePos := (maxFilenameWidth - 1) * gOpts.truncatepct / 100
lastPart := runeSliceWidthLastRange(filename, maxFilenameWidth-truncatePos-1) Anyway regarding the decision on what action to take (if any), I will leave that decision up to @gokcehan, who I imagine might be tired from seeing all the comments here and in other issues like #47. |
You are talking about the encoding approach. I was referring to the in-memory approach, which is used instead/with load=$(cat ~/.local/share/lf/files)
mode=$(echo "$load" | head -n 1)
list=$(echo "$load" | sed 1d)
# or
# mode=$(head -n 1 ~/.local/share/lf/files)
# list=$(sed 1d ~/.local/share/lf/files) everyone would use: mode=$(lf -mode)
list=$(lf -files) or something like: mode=$(lf -storage mode)
list=$(lf -storage files) or: mode=$(lf -files mode)
list=$(lf -files list) I think it's super pretty and very minimalistic and doesn't rely on anything but the built-in functionality (no POSIX commands, no file read). This again, will rely on Go's implementation of table/set/hashmap or on some kind of DB (SQLite, Redis). Then the I want to hear an opinion about such approach and which difficulties it can introduce during implementation and execution (from someone who knows Go and |
I think what you're trying to describe already exists. The server lf -remote "query $id maps" Support for the list of cut/copied files was not added because it was deemed redundant, since In any case, such a command just outputs a stream of bytes. How are you planning to delimit entries while allowing for newlines in filenames? Using
I assume you are referring to the people mentioned in #1329. All of them have responded at some point in this issue. |
Oh yeah, cool. I skimmed through the changes too fast (a while ago) and forgot about this.
That's the idea (like
This is the question of adding a new flag (or changing behavior of existing map a $a=$(printf '\n\0') # example map
Are we talking about Windows (maybe some other supported OSes)? In UNIX, you can't use
I refer to people in this (#1114) issue. Like not specifically you, but since you are very active, then you could be the first one in the group. |
Sorry I might be wrong on the possibility of Similar to From the direction of this conversation, it looks like you're requesting for some kind of |
Some time ago I was working with
~/.local/share/lf/files
file and after some edge cases I quickly figured out that right now file/dir names are stored unreliably. The easiest example of this is when there is a linefeed\n
symbol in the name: it will be saved in the "file" as 2 lines, e.g., as 2 names. And you really can't have 1 delimiter, as any character can be in path (because Linux support all Unicode characters).Right now I was working with
opener.sh
forcmd open $.../opener.sh
and because of POSIX (I like to make as many of my scripts as portable as possible) shell I stumped into the same problem with Unicode names and delimiter (mainly because there are no arrays in POSIX shell). Luckily, I was able to find some kind of portable solution: usingbase64
encoder/decoder fromcoreutils
(which is not POSIX, but a lot of distros are GNU/Linux).The point is that using any simple and fast encoding you can greatly limit the output charset and therefore choose any delimiter you like (even the same linefeed). Since I've limited myself to mostly POSIX shellscript I have limited encoders (or I can make my own), but you are making the program in Go and I think your options are far richer than mine. Hence, the implementation of this enhancement should be pretty straightforward and not as bulky as mine, IMO.
So, basically my idea is to put encoded path
echo "$unicode_path" | base64
into the same~/.local/share/lf/files
file and delimit each encoded path with a linefeed\n
(same as now). And when you need to paste selected files/dirs you just useecho "$encoded_path" | base64 --decode
and then process each Unicode path separately in Go (or shell or whatever).The only downgrade will be in user-friendlyness because after encoding, the "file" will be "ciphered" for the end user. And to overcome this problem, you will have to add a note in the manual for how to get the original (decoded) data from the "file" (or at least which steps are performed to create the "file").
The text was updated successfully, but these errors were encountered: