Couple of questions #19

Closed
Rambatino opened this issue Aug 5, 2023 · 3 comments

Rambatino commented Aug 5, 2023

Hi, I'm thinking of using this to increase the rate of instrument transactions that we can process. By using a local WAL I can increase the throughput, as I can process the requests to the database in a worker.

So reading this:

	// Sync is whether to synchronize writes through os buffer cache and down onto the actual disk.
	// Setting sync is required for durability of a single write operation, but also results in slower writes.
	//
	// If false, and the machine crashes, then some recent writes may be lost.
	// Note that if it is just the process that crashes (machine does not) then no writes will be lost.
	//
	// In other words, Sync being false has the same semantics as a write
	// system call. Sync being true means write followed by fsync.
	Sync bool

I'm a little bit confused - if there's a fatal crash in the process, how will writes not be lost? If they're stored in a buffer in memory before fsync, then how are those writes recovered?

Second question: if I'm simultaneously writing to and reading from (and then deleting from) the WAL from different threads:

I use:

w.WAL.Write(b)

to write, and:

	reader := w.WAL.NewReader()
	for {
		val, pos, err := reader.Next()
		if err == io.EOF {
			break
		}
		fmt.Println(string(val))
		fmt.Println(pos) // get position of the data for next read
		w.ch <- val
	}

to read. Does reader := w.WAL.NewReader() return all the segments up to the point in time that the function is called? I think it does, looking at:

	if segId == 0 || wal.activeSegment.id <= segId {
		reader := wal.activeSegment.NewReader()
		segmentReaders = append(segmentReaders, reader)
	}

and then:

func (seg *segment) NewReader() *segmentReader {
	return &segmentReader{
		segment:     seg,
		blockNumber: 0,
		chunkOffset: 0,
	}
}

it seems blockNumber and chunkOffset are 0 in the new reader that was created, so does that mean it doesn't process any messages in there?

Also, what's the safest way to delete so that I never reprocess a message twice? (Although it isn't the end of the world if I do, as long as it's chronological, it just costs time.)
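
One idea I'm considering (a sketch only, in the same fragment style as the loop above; readCheckpoint, saveCheckpoint and writeToDB are hypothetical helpers of my own, not part of this library): persist how many records have already made it into the database, and skip that many on replay.

	// Sketch: avoid reprocessing by persisting how many records have already
	// been written to the database. readCheckpoint/saveCheckpoint/writeToDB
	// are hypothetical helpers; this also assumes the WAL isn't truncated
	// between runs, so record N always refers to the same record.
	processed, _ := readCheckpoint("checkpoint")

	reader := w.WAL.NewReader()
	seen := 0
	for {
		val, _, err := reader.Next()
		if err == io.EOF {
			break
		}
		seen++
		if seen <= processed {
			continue // already written to the database before the restart
		}
		if err := writeToDB(val); err != nil {
			break // retry from the checkpoint on the next run
		}
		processed = seen
		saveCheckpoint("checkpoint", processed)
	}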

I can work it out with sufficient testing, but I figured it may be worth asking here.

Thank you in advance 🧡

Rambatino (Author) commented:
Basically, thinking about it more, I need to be in a situation where I'm writing to the WAL and it's persisting, and if anything happens, like a crash, I can restart the process, it reads from the WAL, knows it can drain the WAL by writing all those messages to the database, and then carries on as if there were no WAL and all the messages had been written to the db synchronously.
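
Concretely, here is roughly what I have in mind for the restart path (just a sketch - I'm assuming the package's Open/DefaultOptions entry points and import path, and writeToDB stands in for my own database write):

	// Sketch of the restart/drain path described above. Assumptions, not
	// checked against the library: the import path and the Open/DefaultOptions
	// entry points. writeToDB is a placeholder for my own database write.
	package main

	import (
		"io"
		"log"

		"github.com/rosedblabs/wal" // assumption: import path
	)

	func writeToDB(record []byte) error {
		return nil // placeholder for the real database write
	}

	func main() {
		w, err := wal.Open(wal.DefaultOptions) // assumption: constructor + default options
		if err != nil {
			log.Fatal(err)
		}

		// Drain everything that was persisted before the crash.
		reader := w.NewReader()
		for {
			val, _, err := reader.Next()
			if err == io.EOF {
				break // WAL fully drained
			}
			if err != nil {
				log.Fatal(err)
			}
			// Replay the record into the database as if it had been
			// written synchronously in the first place.
			if err := writeToDB(val); err != nil {
				log.Fatal(err)
			}
		}

		// From here on, carry on with normal operation: write new records to
		// the WAL in the request path and let the worker drain them to the db.
	}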

roseduan (Contributor) commented Aug 6, 2023

  1. The WAL code relies on the standard os package in Go. All new incoming writes go into the OS cache first, so even if the process crashes, the OS will still flush that data to disk, which ensures durability (see the sketch after this list).
     Data loss is a rare occurrence; all WAL implementations (like the WALs in MySQL, PostgreSQL, LevelDB and RocksDB) behave the same way.
  2. Yes, it will return all data in the WAL up to the point when NewReader is called. chunkOffset being 0 means that it will iterate the data from the beginning to the end of the segment file.
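
As a minimal sketch of the plain write vs. write + fsync distinction, using only the standard os package rather than this library's API (the file name and records are made up for illustration):

	// Sketch: durability of a plain write vs. write followed by fsync,
	// using only the standard library (os.File), not this WAL's API.
	package main

	import (
		"log"
		"os"
	)

	func main() {
		f, err := os.OpenFile("example.log", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
		if err != nil {
			log.Fatal(err)
		}
		defer f.Close()

		// Sync == false case: the data lands in the OS page cache. It survives
		// a crash of this process (the kernel still holds the dirty pages and
		// will flush them), but it may be lost if the whole machine crashes or
		// loses power before the kernel writes it out.
		if _, err := f.Write([]byte("record 1\n")); err != nil {
			log.Fatal(err)
		}

		// Sync == true case: write followed by fsync. Once Sync returns, the
		// data has been pushed through the page cache onto the disk, so it
		// also survives a machine crash, at the cost of a slower write.
		if _, err := f.Write([]byte("record 2\n")); err != nil {
			log.Fatal(err)
		}
		if err := f.Sync(); err != nil {
			log.Fatal(err)
		}
	}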

According to your description, the WAL is suitable for you, because it uses sequential IO, which will improve your system's throughput.

Thanks for your attention. Give us feedback if you run into any problems using WAL, and enjoy!

roseduan (Contributor) commented Aug 7, 2023

You're welcome to add your use case at #13, which would encourage us a lot. Thanks!

roseduan closed this as completed Aug 7, 2023