From 8b9f09cecd8680cc6cb1408cae3cb21c3a1d0748 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Viktor=20S=C3=B6derqvist?= Date: Fri, 20 Dec 2024 18:04:37 +0100 Subject: [PATCH] Doc fixes part 3, various topics pages (#197) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * introduction.md, fixes #111 * installation.md fixes #108 * indexing.md fixes #107 * hashes.md fixes #106 * functions-intro fixes #104 * faq.md fixes #103 * eval-intro.md: Delete old verbatim replication, fixes #102 * data-types.md fixes #100 * Command-tips fixes #99 * admin.md fixes #96 * Delete get-started.md fixes #105 --------- Signed-off-by: Viktor Söderqvist --- topics/admin.md | 9 +- topics/command-tips.md | 30 ++++--- topics/data-types.md | 4 +- topics/eval-intro.md | 174 ++------------------------------------ topics/faq.md | 24 ++---- topics/functions-intro.md | 82 ++++++++---------- topics/get-started.md | 12 --- topics/hashes.md | 3 - topics/indexing.md | 56 ++++++------ topics/installation.md | 32 +++---- topics/introduction.md | 12 ++- topics/protocol.md | 2 +- wordlist | 12 +++ 13 files changed, 140 insertions(+), 312 deletions(-) delete mode 100644 topics/get-started.md diff --git a/topics/admin.md b/topics/admin.md index f9a2661d..6874e70c 100644 --- a/topics/admin.md +++ b/topics/admin.md @@ -7,7 +7,9 @@ description: Advice for configuring and managing Valkey in production ### Linux -* Deploy Valkey using the Linux operating system. Valkey is also tested on OS X, and from time to time on FreeBSD and OpenBSD systems. However, Linux is where most of the stress testing is performed, and where most production deployments are run. +* Deploy Valkey using the Linux operating system. + Valkey is also regularly tested on macOS and FreeBSD, and from time to time on other OpenBSD, NetBSD, DragonFlyBSD and Solaris-derived systems. + However, Linux is where most of the stress testing is performed, and where most production deployments are run. 
* Set the Linux kernel overcommit memory setting to 1. Add `vm.overcommit_memory = 1` to `/etc/sysctl.conf`. Then, reboot or run the command `sysctl vm.overcommit_memory=1` to activate the setting. See [FAQ: Background saving fails with a fork() error on Linux?](faq.md#background-saving-fails-with-a-fork-error-on-linux) for details. @@ -37,14 +39,13 @@ description: Advice for configuring and managing Valkey in production ### Security -* By default, Valkey does not require any authentication and listens to all the network interfaces. This is a big security issue if you leave Valkey exposed on the internet or other places where attackers can reach it. See for example [this attack](https://web.archive.org/web/20241019230117/http://antirez.com/news/96) to see how dangerous it can be. Please check our [security page](security.md) and the [quick start](quickstart.md) for information about how to secure Valkey. +* By default, Valkey does not require any authentication and listens to all the network interfaces. This is a big security issue if you leave Valkey exposed on the internet or other places where attackers can reach it. Please check our [security page](security.md) and the [quick start](quickstart.md) for information about how to secure Valkey. ## Running Valkey on EC2 * Use HVM based instances, not PV based instances. -* Do not use old instance families. For example, use m3.medium with HVM instead of m1.medium with PV. * The use of Valkey persistence with EC2 EBS volumes needs to be handled with care because sometimes EBS volumes have high latency characteristics. -* You may want to try the new diskless replication if you have issues when replicas are synchronizing with the primary. +* You may want to try diskless replication if you have issues when replicas are synchronizing with the primary. 
## Upgrading or restarting a Valkey instance without downtime diff --git a/topics/command-tips.md b/topics/command-tips.md index 67e88ec9..e63292f5 100644 --- a/topics/command-tips.md +++ b/topics/command-tips.md @@ -3,40 +3,42 @@ title: "Command tips" description: Get additional information about a command --- -Command tips are an array of strings. +This page documents a small part of the reply of the [`COMMAND`](../command.md). +In the reply of the COMMAND command, each command is represented by an array. +The 8th element in this array is the command tips. +It's an array of strings. + These provide Valkey clients with additional information about the command. The information can instruct Valkey Cluster clients as to how the command should be executed and its output processed in a clustered deployment. -Unlike the command's flags (see the 3rd element of `COMMAND`'s reply), which are strictly internal to the server's operation, tips don't serve any purpose other than being reported to clients. +Unlike the command's flags (see the 3rd element of [`COMMAND`](../command.md)'s reply), which are strictly internal to the server's operation, tips don't serve any purpose other than being reported to clients. -Command tips are arbitrary strings. -However, the following sections describe proposed tips and demonstrate the conventions they are likely to adhere to. -## nondeterministic_output +## `nondeterministic_output` This tip indicates that the command's output isn't deterministic. That means that calls to the command may yield different results with the same arguments and data. That difference could be the result of the command's random nature (e.g., `RANDOMKEY` and `SPOP`); the call's timing (e.g., `TTL`); or generic differences that relate to the server's state (e.g., `INFO` and `CLIENT LIST`). **Note:** -Prior to Redis OSS 7.0, this tip was the _random_ command flag. +Prior to Redis OSS 7.0, this tip was the `random` command flag. 
-## nondeterministic_output_order +## `nondeterministic_output_order` The existence of this tip indicates that the command's output is deterministic, but its ordering is random (e.g., `HGETALL` and `SMEMBERS`). **Note:** -Prior to Redis OSS 7.0, this tip was the _sort_\__for_\__script_ flag. +Prior to Redis OSS 7.0, this tip was the `sort_for_script` flag. -## request_policy +## `request_policy:`*value* This tip can help clients determine the shards to send the command in clustering mode. -The default behavior a client should implement for commands without the _request_policy_ tip is as follows: +The default behavior a client should implement for commands without the `request_policy` tip is as follows: 1. The command doesn't accept key name arguments: the client can execute the command on an arbitrary shard. 1. For commands that accept one or more key name arguments: the client should route the command to a single shard, as determined by the hash slot of the input keys. -In cases where the client should adopt a behavior different than the default, the _request_policy_ tip can be one of: +In cases where the client should adopt a behavior different than the default, the `request_policy` tip can be one of: - **all_nodes:** the client should execute the command on all nodes - primaries and replicas alike. An example is the `CONFIG SET` command. @@ -53,10 +55,10 @@ In cases where the client should adopt a behavior different than the default, th However, note that `SUNIONSTORE` isn't considered as _multi_shard_ because all of its keys must belong to the same hash slot. - **special:** indicates a non-trivial form of the client's request policy, such as the `SCAN` command. -## response_policy +## `response_policy:`*value* This tip can help clients determine the aggregate they need to compute from the replies of multiple shards in a cluster. 
-The default behavior for commands without a _request_policy_ tip only applies to replies with of nested types (i.e., an array, a set, or a map). +The default behavior for commands without a `request_policy` tip only applies to replies with of nested types (i.e., an array, a set, or a map). The client's implementation for the default behavior should be as follows: 1. The command doesn't accept key name arguments: the client can aggregate all replies within a single nested data structure. @@ -65,7 +67,7 @@ These should be packed in a single in no particular order. 1. For commands that accept one or more key name arguments: the client needs to retain the same order of replies as the input key names. For example, `MGET`'s aggregated reply. -The _response_policy_ tip is set for commands that reply with scalar data types, or when it's expected that clients implement a non-default aggregate. +The `response_policy` tip is set for commands that reply with scalar data types, or when it's expected that clients implement a non-default aggregate. This tip can be one of: * **one_succeeded:** the clients should return success if at least one shard didn't reply with an error. diff --git a/topics/data-types.md b/topics/data-types.md index 853beac1..cea10c4d 100644 --- a/topics/data-types.md +++ b/topics/data-types.md @@ -97,6 +97,4 @@ The [HyperLogLog](hyperloglogs.md) data structures provide probabilistic estimat To extend the features provided by the included data types, use one of these options: 1. Write your own custom [server-side functions in Lua](programmability.md). -1. Write your own Valkey module using the [modules API](modules-intro.md) or check out the [community-supported modules](../modules/). - -
+2. Write your own Valkey module using the [modules API](modules-intro.md) or check out the [modules](../modules/). diff --git a/topics/eval-intro.md b/topics/eval-intro.md index 7d29e97e..fc1159f1 100644 --- a/topics/eval-intro.md +++ b/topics/eval-intro.md @@ -175,7 +175,7 @@ In this case, the application should first load it with `SCRIPT LOAD` and then c Most of [Valkey' clients](../clients/) already provide utility APIs for doing that automatically. Please consult your client's documentation regarding the specific details. -### `!EVALSHA` in the context of pipelining +### `EVALSHA` in the context of pipelining Special care should be given executing `EVALSHA` in the context of a [pipelined request](pipelining.md). The commands in a pipelined request run in the order they are sent, but other clients' commands may be interleaved for execution between these. @@ -202,7 +202,7 @@ However, from the point of view of the Valkey client, there are only two ways to Practically speaking, it is much simpler for the client to assume that in the context of a given connection, cached scripts are guaranteed to be there unless the administrator explicitly invoked the `SCRIPT FLUSH` command. The fact that the user can count on Valkey to retain cached scripts is semantically helpful in the context of pipelining. -## The `!SCRIPT` command +## The `SCRIPT` command The Valkey `SCRIPT` provides several ways for controlling the scripting subsystem. These are: @@ -228,175 +228,17 @@ These are: ## Script replication -In standalone deployments, a single Valkey instance called _primary_ manages the entire database. -A [clustered deployment](cluster-tutorial.md) has at least three primaries managing the sharded database. -Valkey uses [replication](replication.md) to maintain one or more replicas, or exact copies, for any given primary. 
+In a primary-replica setup (see [replication](replication.md)), write commands performed by a script on the primary are also sent to replicas to maintain consistency. +When the script execution finishes, the sequence of commands that the script generated are wrapped into a [`MULTI`/`EXEC` transaction](transactions.md) and are sent to the replicas and written to the AOF file, if an AOF file is used. (See [Persistence](persistence.md).) +This is called *effects replication*. -Because scripts can modify the data, Valkey ensures all write operations performed by a script are also sent to replicas to maintain consistency. -There are two conceptual approaches when it comes to script replication: +In the past, it was also possible to use *verbatim replication* which means that a script was replicated as a whole, but this was removed in 7.0. -1. Verbatim replication: the primary sends the script's source code to the replicas. - Replicas then execute the script and apply the write effects. - This mode can save on replication bandwidth in cases where short scripts generate many commands (for example, a _for_ loop). - However, this replication mode means that replicas redo the same work done by the primary, which is wasteful. - More importantly, it also requires [all write scripts to be deterministic](#scripts-with-deterministic-writes). -1. Effects replication: only the script's data-modifying commands are replicated. - Replicas then run the commands without executing any scripts. - While potentially lengthier in terms of network traffic, this replication mode is deterministic by definition and therefore doesn't require special consideration. - -Verbatim script replication was the only mode supported until Redis OSS 3.2, in which effects replication was added. -The _lua-replicate-commands_ configuration directive and [`server.replicate_commands()`](lua-api.md#server.replicate_commands) Lua API can be used to enable it. 
- -In Redis OSS 5.0, effects replication became the default mode. -As of Redis OSS 7.0, verbatim replication is no longer supported. - -### Replicating commands instead of scripts - -Starting with Redis OSS 3.2, it is possible to select an alternative replication method. -Instead of replicating whole scripts, we can replicate the write commands generated by the script. -We call this **script effects replication**. - -**Note:** -starting with Redis OSS 5.0, script effects replication is the default mode and does not need to be explicitly enabled. - -In this replication mode, while Lua scripts are executed, Valkey collects all the commands executed by the Lua scripting engine that actually modify the dataset. -When the script execution finishes, the sequence of commands that the script generated are wrapped into a [`MULTI`/`EXEC` transaction](transactions.md) and are sent to the replicas and AOF. - -This is useful in several ways depending on the use case: - -* When the script is slow to compute, but the effects can be summarized by a few write commands, it is a shame to re-compute the script on the replicas or when reloading the AOF. - In this case, it is much better to replicate just the effects of the script. -* When script effects replication is enabled, the restrictions on non-deterministic functions are removed. - You can, for example, use the `TIME` or `SRANDMEMBER` commands inside your scripts freely at any place. -* The Lua PRNG in this mode is seeded randomly on every call. 
- -Unless already enabled by the server's configuration or defaults (before Redis OSS 7.0), you need to issue the following Lua command before the script performs a write: - -```lua -server.replicate_commands() -``` - -The [`server.replicate_commands()`](lua-api.md#server.replicate_commands) function returns _true) if script effects replication was enabled; -otherwise, if the function was called after the script already called a write command, -it returns _false_, and normal whole script replication is used. - -This function is deprecated as of Redis OSS 7.0, and while you can still call it, it will always succeed. - -### Scripts with deterministic writes - -**Note:** -Starting with Redis OSS 5.0, script replication is by default effect-based rather than verbatim. -In Redis OSS 7.0, verbatim script replication had been removed entirely. -The following section only applies to versions lower than Redis OSS 7.0 when not using effect-based script replication. - -An important part of scripting is writing scripts that only change the database in a deterministic way. -Scripts executed in a Valkey instance are, by default until version 5.0, propagated to replicas and to the AOF file by sending the script itself -- not the resulting commands. -Since the script will be re-run on the remote host (or when reloading the AOF file), its changes to the database must be reproducible. - -The reason for sending the script is that it is often much faster than sending the multiple commands that the script generates. -If the client is sending many scripts to the primary, converting the scripts into individual commands for the replica / AOF would result in too much bandwidth for the replication link or the Append Only File (and also too much CPU since dispatching a command received via the network is a lot more work for Valkey compared to dispatching a command invoked by Lua scripts). 
- -Normally replicating scripts instead of the effects of the scripts makes sense, however not in all the cases. -So starting with Redis OSS 3.2, the scripting engine is able to, alternatively, replicate the sequence of write commands resulting from the script execution, instead of replication the script itself. - -In this section, we'll assume that scripts are replicated verbatim by sending the whole script. -Let's call this replication mode **verbatim scripts replication**. - -The main drawback with the _whole scripts replication_ approach is that scripts are required to have the following property: -the script **always must** execute the same Valkey _write_ commands with the same arguments given the same input data set. -Operations performed by the script can't depend on any hidden (non-explicit) information or state that may change as the script execution proceeds or between different executions of the script. -Nor can it depend on any external input from I/O devices. - -Acts such as using the system time, calling Valkey commands that return random values (e.g., `RANDOMKEY`), or using Lua's random number generator, could result in scripts that will not evaluate consistently. - -To enforce the deterministic behavior of scripts, Valkey does the following: - -* Lua does not export commands to access the system time or other external states. -* Valkey will block the script with an error if a script calls a Valkey command able to alter the data set **after** a Valkey _random_ command like `RANDOMKEY`, `SRANDMEMBER`, `TIME`. - That means that read-only scripts that don't modify the dataset can call those commands. - Note that a _random command_ does not necessarily mean a command that uses random numbers: any non-deterministic command is considered as a random command (the best example in this regard is the `TIME` command). 
-* In Redis OSS version 4.0, commands that may return elements in random order, such as `SMEMBERS` (because Sets are _unordered_), exhibit a different behavior when called from Lua, -and undergo a silent lexicographical sorting filter before returning data to Lua scripts. - So `server.call("SMEMBERS",KEYS[1])` will always return the Set elements in the same order, while the same command invoked by normal clients may return different results even if the key contains exactly the same elements. - However, starting with Redis OSS 5.0, this ordering is no longer performed because replicating effects circumvents this type of non-determinism. - In general, even when developing for Redis OSS 4.0, never assume that certain commands in Lua will be ordered, but instead rely on the documentation of the original command you call to see the properties it provides. -* Lua's pseudo-random number generation function `math.random` is modified and always uses the same seed for every execution. - This means that calling [`math.random`](lua-api.md#runtime-libraries) will always generate the same sequence of numbers every time a script is executed (unless `math.randomseed` is used). - -All that said, you can still use commands that write and random behavior with a simple trick. -Imagine that you want to write a Valkey script that will populate a list with N random integers. 
- -The initial implementation in Ruby could look like this: - -``` -require 'rubygems' -require 'redis' - -r = Redis.new - -RandomPushScript = <<EOF - local i = tonumber(ARGV[1]) - local res - while (i > 0) do - res = server.call('LPUSH',KEYS[1],math.random()) - i = i-1 - end - return res -EOF - -r.del(:mylist) -puts r.eval(RandomPushScript,[:mylist],[10,rand(2**32)]) -``` - -Every time this code runs, the resulting list will have exactly the -following elements: - -``` -127.0.0.1:6379> LRANGE mylist 0 -1 - 1) "0.74509509873814" - 2) "0.87390407681181" - 3) "0.36876626981831" - 4) "0.6921941534114" - 5) "0.7857992587545" - 6) "0.57730350670279" - 7) "0.87046522734243" - 8) "0.09637165539729" - 9) "0.74990198051087" -10) "0.17082803611217" -``` - -To make the script both deterministic and still have it produce different random elements, -we can add an extra argument to the script that's the seed to Lua's pseudo-random number generator. -The new script is as follows: - -``` -RandomPushScript = <<EOF - local i = tonumber(ARGV[1]) - local res - math.randomseed(tonumber(ARGV[2])) - while (i > 0) do - res = server.call('LPUSH',KEYS[1],math.random()) - i = i-1 - end - return res -EOF - -r.del(:mylist) -puts r.eval(RandomPushScript,1,:mylist,10,rand(2**32)) -``` - -What we are doing here is sending the seed of the PRNG as one of the arguments. -The script output will always be the same given the same arguments (our requirement) but we are changing one of the arguments at every invocation, -generating the random seed client-side. -The seed will be propagated as one of the arguments both in the replication link and in the Append Only File, -guaranteeing that the same changes will be generated when the AOF is reloaded or when the replica processes the script. - -Note: an important part of this behavior is that the PRNG that Valkey implements as `math.random` and `math.randomseed` is guaranteed to have the same output regardless of the architecture of the system running Valkey. -32-bit, 64-bit, big-endian and little-endian systems will all produce the same output.
+The [`server.replicate_commands()`](lua-api.md#server.replicate_commands) function is deprecated and has no effect, but it exists to avoid breaking existing scripts. ## Debugging Eval scripts -Starting with Redis OSS 3.2, Valkey has support for native Lua debugging. +Valkey has a built-in Lua debugger. The Valkey Lua debugger is a remote debugger consisting of a server, which is Valkey itself, and a client, which is by default [`valkey-cli`](cli.md). The Lua debugger is described in the [Lua scripts debugging](ldb.md) section of the Valkey documentation. diff --git a/topics/faq.md b/topics/faq.md index 28cce051..a8bad50f 100644 --- a/topics/faq.md +++ b/topics/faq.md @@ -20,7 +20,7 @@ there is always an updated version of the data set on disk. ## What's the Valkey memory footprint? -To give you a few examples (all obtained using 64-bit instances): +To give you a few examples: * An empty instance uses ~ 3MB of memory. * 1 Million small Keys -> String Value pairs use ~ 85MB of memory. @@ -28,9 +28,6 @@ To give you a few examples (all obtained using 64-bit instances): Testing your use case is trivial. Use the `valkey-benchmark` utility to generate random data sets then check the space used with the `INFO memory` command. -64-bit systems will use considerably more memory than 32-bit systems to store the same keys, especially if the keys and values are small. This is because pointers take 8 bytes in 64-bit systems. But of course the advantage is that you can -have a lot of memory in 64-bit systems, so in order to run large Valkey servers a 64-bit system is more or less required. The alternative is sharding. - ## Why does Valkey keep its entire dataset in memory? In the past, developers experimented with Virtual Memory and other systems in order to allow larger than RAM datasets, but after all we are very happy if we can do one thing well: data served from memory, disk used for storage. So for now there are no plans to create an on disk backend for Valkey. 
Most of what @@ -104,24 +101,19 @@ in RAM is also atomic from the point of view of the disk snapshot. ## How can Valkey use multiple CPUs or cores? -It's not very frequent that CPU becomes your bottleneck with Valkey, as usually Valkey is either memory or network bound. -For instance, when using pipelining a Valkey instance running on an average Linux system can deliver 1 million requests per second, so if your application mainly uses O(N) or O(log(N)) commands, it is hardly going to use too much CPU. - -However, to maximize CPU usage you can start multiple instances of Valkey in -the same box and treat them as different servers. At some point a single -box may not be enough anyway, so if you want to use multiple CPUs you can -start thinking of some way to shard earlier. - -You can find more information about using multiple Valkey instances in the [Partitioning page](cluster-tutorial.md). +Enable I/O threading to offload client communication to threads. +In Valkey 8, the I/O threading implementation has been rewritten and greatly improved. +Reading commands from clients and writing replies back uses considerable CPU time. +By offloading this work to separate threads, the main thread can focus on executing commands. -As of version 4.0, Valkey has started implementing threaded actions. For now this is limited to deleting objects in the background and blocking commands implemented via Valkey modules. For subsequent releases, the plan is to make Valkey more and more threaded. +You can also start multiple instances of Valkey in the same box and combine them into a [cluster](cluster-tutorial.md). ## What is the maximum number of keys a single Valkey instance can hold? What is the maximum number of elements in a Hash, List, Set, and Sorted Set? -Valkey can handle up to 2^32 keys, and was tested in practice to +Valkey can handle up to 2<sup>32</sup> keys, and was tested in practice to handle at least 250 million keys per instance.
-Every hash, list, set, and sorted set, can hold 2^32 elements. +Every hash, list, set, and sorted set, can hold 2<sup>32</sup> elements. In other words your limit is likely the available memory in your system. diff --git a/topics/functions-intro.md b/topics/functions-intro.md index 66f479b1..1d503de8 100644 --- a/topics/functions-intro.md +++ b/topics/functions-intro.md @@ -1,21 +1,24 @@ --- title: "Functions" description: > - Scripting with Redis OSS 7 and beyond + Scripting with functions stored on the server --- -Valkey Functions is an API for managing code to be executed on the server. This feature, which became available in Redis OSS 7, supersedes the use of [EVAL](eval-intro.md) in prior versions of Valkey. +Valkey Functions is an API for managing code to be executed on the server. +This feature is as a complement to [EVAL scripts](eval-intro.md). -## Prologue (or, what's wrong with Eval Scripts?) +## What's wrong with EVAL? -Prior versions of Valkey made scripting available only via the `EVAL` command, which allows a Lua script to be sent for execution by the server. -The core use cases for [Eval Scripts](eval-intro.md) is executing part of your application logic inside Valkey, efficiently and atomically. +There's nothing wrong with `EVAL`, but there are some differences between EVAL scripts and Functions. +With the [`EVAL`](../commands/eval.md) command, scripts are sent to the server for immediate execution. +The core use cases for `EVAL` scripts is executing part of your application logic inside Valkey, efficiently and atomically. Such script can perform conditional updates across multiple keys, possibly combining several different data types. Using `EVAL` requires that the application sends the entire script for execution every time. -Because this results in network and script compilation overheads, Valkey provides an optimization in the form of the `EVALSHA` command.
By first calling `SCRIPT LOAD` to obtain the script's SHA1, the application can invoke it repeatedly afterward with its digest alone. +Because this results in network and script compilation overheads, Valkey provides an optimization in the form of the [`EVALSHA`](../commands/evalsha.md) command. +By first calling [`SCRIPT LOAD`](../commands/script-load.md) to obtain the script's SHA1, the application can invoke it repeatedly afterward with its digest alone. -By design, Valkey only caches the loaded scripts. +Valkey only caches the loaded scripts. That means that the script cache can become lost at any time, such as after calling `SCRIPT FLUSH`, after restarting the server, or when failing over to a replica. The application is responsible for reloading scripts during runtime if any are missing. The underlying assumption is that scripts are a part of the application and not maintained by the Valkey server. @@ -24,68 +27,53 @@ This approach suits many light-weight scripting use cases, but introduces severa 1. All client application instances must maintain a copy of all scripts. That means having some mechanism that applies script updates to all of the application's instances. 1. Calling cached scripts within the context of a [transaction](transactions.md) increases the probability of the transaction failing because of a missing script. Being more likely to fail makes using cached scripts as building blocks of workflows less attractive. -1. SHA1 digests are meaningless, making debugging the system extremely hard (e.g., in a `MONITOR` session). -1. When used naively, `EVAL` promotes an anti-pattern in which scripts the client application renders verbatim scripts instead of responsibly using the [`!KEYS` and `ARGV` Lua APIs](lua-api.md#runtime-globals). -1. Because they are ephemeral, a script can't call another script. This makes sharing and reusing code between scripts nearly impossible, short of client-side preprocessing (see the first point). +1. 
SHA1 digests are not readable for humans, making debugging the system hard (e.g. in a [`MONITOR`](../commands/monitor.md) session). +1. When used naively, `EVAL` promotes an anti-pattern in which the client application renders verbatim scripts instead of responsibly using the [`KEYS` and `ARGV` Lua APIs](lua-api.md#runtime-globals). +1. Because they are ephemeral, a script can't call another script. This makes sharing and reusing code between scripts nearly impossible, short of client-side preprocessing. -To address these needs while avoiding breaking changes to already-established and well-liked ephemeral scripts, Redis OSS v7.0 introduces Valkey Functions. +To address these needs while avoiding breaking changes to already-established and well-liked ephemeral scripts, functions were introduced in version 7.0. ## What are Valkey Functions? -Valkey functions are an evolutionary step from ephemeral scripting. - -Functions provide the same core functionality as scripts but are first-class software artifacts of the database. +Functions provide the same core functionality as scripts but are first-class artifacts of the database. Valkey manages functions as an integral part of the database and ensures their availability via data persistence and replication. Because functions are part of the database and therefore declared before use, applications aren't required to load them during runtime nor risk aborted transactions. An application that uses functions depends only on their APIs rather than on the embedded script logic in the database. Whereas ephemeral scripts are considered a part of the application's domain, functions extend the database server itself with user-provided logic. -They can be used to expose a richer API composed of core Valkey commands, similar to modules, developed once, loaded at startup, and used repeatedly by various applications / clients. -Every function has a unique user-defined name, making it much easier to call and trace its execution. 
- -The design of Valkey Functions also attempts to demarcate between the programming language used for writing functions and their management by the server. -Lua, the only language interpreter that Valkey presently support as an embedded execution engine, is meant to be simple and easy to learn. -However, the choice of Lua as a language still presents many Valkey users with a challenge. - -The Valkey Functions feature makes no assumptions about the implementation's language. -An execution engine that is part of the definition of the function handles running it. -An engine can theoretically execute functions in any language as long as it respects several rules (such as the ability to terminate an executing function). - -Presently, as noted above, Valkey ships with a single embedded [Lua 5.1](lua-api.md) engine. -There are plans to support additional engines in the future. -Valkey functions can use all of Lua's available capabilities to ephemeral scripts, -with the only exception being the [Valkey Lua scripts debugger](ldb.md). +They can be loaded at startup and be used repeatedly by various applications and clients. +Functions are also persisted to the AOF file and replicated from primary to replicas, so they are as durable as the data itself. +When Valkey is used as an ephemeral cache, additional mechanisms (described below) are required to make functions more durable. Functions also simplify development by enabling code sharing. -Every function belongs to a single library, and any given library can consist of multiple functions. +Every function has a user-defined name and belongs to a library, and a library can consist of multiple functions. The library's contents are immutable, and selective updates of its functions aren't allowed. Instead, libraries are updated as a whole with all of their functions together in one operation. 
This allows calling functions from other functions within the same library, or sharing code between functions by using a common code in library-internal methods, that can also take language native arguments. -Functions are intended to better support the use case of maintaining a consistent view for data entities through a logical schema, as mentioned above. -As such, functions are stored alongside the data itself. -Functions are also persisted to the AOF file and replicated from primary to replicas, so they are as durable as the data itself. -When Valkey is used as an ephemeral cache, additional mechanisms (described below) are required to make functions more durable. - Like all other operations in Valkey, the execution of a function is atomic. A function's execution blocks all server activities during its entire time, similarly to the semantics of [transactions](transactions.md). These semantics mean that all of the script's effects either have yet to happen or had already happened. The blocking semantics of an executed function apply to all connected clients at all times. Because running a function blocks the Valkey server, functions are meant to finish executing quickly, so you should avoid using long-running functions. +Functions are written in [Lua 5.1](lua-api.md). +Valkey functions can use all of Lua's available capabilities to ephemeral scripts, +with the only exception being the [Valkey Lua scripts debugger](ldb.md). + ## Loading libraries and functions Let's explore Valkey Functions via some tangible examples and Lua snippets. At this point, if you're unfamiliar with Lua in general and specifically in Valkey, you may benefit from reviewing some of the examples in [Introduction to Eval Scripts](eval-intro.md) and [Lua API](lua-api.md) pages for a better grasp of the language. -Every Valkey function belongs to a single library that's loaded to Valkey. -Loading a library to the database is done with the `FUNCTION LOAD` command. 
-The command gets the library payload as input, -the library payload must start with Shebang statement that provides a metadata about the library (like the engine to use and the library name). +Every Valkey function belongs to a library. +Loading a library to the database is done with the [`FUNCTION LOAD`](../commands/function-load.md) command. +The library source code must start with a Shebang line that provides metadata about the library, like the language (always "lua") and the library name. The Shebang format is: + ``` -#! name= +#!lua name= ``` Let's try loading an empty library: @@ -124,7 +112,7 @@ mylib Notice that the `FUNCTION LOAD` command returns the name of the loaded library, this name can later be used `FUNCTION LIST` and `FUNCTION DELETE`. -We've provided `FCALL` with two arguments: the function's registered name and the numeric value `0`. This numeric value indicates the number of key names that follow it (the same way `EVAL` and `EVALSHA` work). +We've provided [`FCALL`](../commands/fcall.md) with two arguments: the function's registered name and the numeric value `0`. This numeric value indicates the number of key names that follow it (the same way `EVAL` and `EVALSHA` work). We'll explain immediately how key names and additional arguments are available to the function. As this simple example doesn't involve keys, we simply use 0 for now. @@ -141,10 +129,10 @@ To ensure the correct execution of Valkey Functions, both in standalone and clus Any input to the function that isn't the name of a key is a regular input argument. Now, let's pretend that our application stores some of its data in Hashes. -We want an `HSET`-like way to set and update fields in said Hashes and store the last modification time in a new field named `_last_modified_`. +We want an [`HSET`](../commands/hset.md)-like way to set and update fields in said Hashes and store the last modification time in a new field named `_last_modified_`. We can implement a function to do all that. 
-Our function will call `TIME` to get the server's clock reading and update the target Hash with the new fields' values and the modification's timestamp. +Our function will call [`TIME`](../commands/time.md) to get the server's clock reading and update the target Hash with the new fields' values and the modification's timestamp. The function we'll implement accepts the following input arguments: the Hash's key name and the field-value pairs to update. The Lua API for Valkey Functions makes these inputs accessible as the first and second arguments to the function's callback. @@ -233,7 +221,7 @@ server.register_function('my_hlastmodified', my_hlastmodified) ``` While all of the above should be straightforward, note that the `my_hgetall` also calls [`server.setresp(3)`](lua-api.md#server.setresp). -That means that the function expects [RESP3](https://github.com/redis/redis-specifications/blob/master/protocol/RESP3.md) replies after calling `server.call()`, which, unlike the default RESP2 protocol, provides dictionary (associative arrays) replies. +That means that the function expects [RESP3](protocol.md) replies after calling `server.call()`, which, unlike the default RESP2 protocol, returns the replies as maps (associative arrays). Doing so allows the function to delete (or set to `nil` as is the case with Lua tables) specific fields from the reply, and in our case, the `_last_modified_` field. Assuming you've saved the library's implementation in the _mylib.lua_ file, you can replace it with: @@ -307,7 +295,7 @@ local function check_keys(keys) end if error ~= nil then - server.log(redis.LOG_WARNING, error); + server.log(server.LOG_WARNING, error); return server.error_reply(error) end return nil @@ -372,9 +360,9 @@ And your Valkey log file should have lines in it that are similar to: ## Functions in cluster As noted above, Valkey automatically handles propagation of loaded functions to replicas. 
-In a Valkey Cluster, it is also necessary to load functions to all cluster nodes. This is not handled automatically by Valkey Cluster, and needs to be handled by the cluster administrator (like module loading, configuration setting, etc.). +In a [cluster](cluster-tutorial.md), it is necessary to load functions to all primaries. -As one of the goals of functions is to live separately from the client application, this should not be part of the Valkey client library responsibilities. Instead, `valkey-cli --cluster-only-masters --cluster call host:port FUNCTION LOAD ...` can be used to execute the load command on all primary nodes. +As one of the goals of functions is to live separately from the client application, this should not be part of the Valkey client library responsibilities. Instead, `valkey-cli --cluster-only-primaries --cluster call host:port FUNCTION LOAD ...` can be used to execute the load command on all primary nodes. Also, note that `valkey-cli --cluster add-node` automatically takes care to propagate the loaded functions from one of the existing nodes to the new node. diff --git a/topics/get-started.md b/topics/get-started.md deleted file mode 100644 index de9bda2a..00000000 --- a/topics/get-started.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -title: "Quick starts" -hideListLinks: true -description: > - Valkey quick start guides ---- - -Valkey can be used as a database, cache, streaming engine, message broker, and more. The following quick start guides will show you how to use Valkey for the following specific purposes: - -1. [Quick start](quickstart.md) - -You can find answers to frequently asked questions in the [FAQ](faq.md). diff --git a/topics/hashes.md b/topics/hashes.md index 422396f7..f5242ffc 100644 --- a/topics/hashes.md +++ b/topics/hashes.md @@ -49,8 +49,6 @@ as well, like `HINCRBY`: (integer) 4972 ``` -You can find the [full list of hash commands in the documentation](https://redis.io/commands#hash). 
- It is worth noting that small hashes (i.e., a few elements with small values) are encoded in special way in memory that make them very memory efficient. @@ -63,7 +61,6 @@ encoded in special way in memory that make them very memory efficient. See the [complete list of hash commands](../commands/#hash). - ## Examples * Store counters for the number of times bike:1 has been ridden, has crashed, or has changed owners: diff --git a/topics/indexing.md b/topics/indexing.md index bd7acdfb..86839462 100644 --- a/topics/indexing.md +++ b/topics/indexing.md @@ -654,35 +654,37 @@ removing the elements which are outside our search box. Turning this into code is simple. Here is a Ruby example: - def spacequery(x0,y0,x1,y1,exp) - bits=exp*2 - x_start = x0/(2**exp) - x_end = x1/(2**exp) - y_start = y0/(2**exp) - y_end = y1/(2**exp) - (x_start..x_end).each{|x| - (y_start..y_end).each{|y| - x_range_start = x*(2**exp) - x_range_end = x_range_start | ((2**exp)-1) - y_range_start = y*(2**exp) - y_range_end = y_range_start | ((2**exp)-1) - puts "#{x},#{y} x from #{x_range_start} to #{x_range_end}, y from #{y_range_start} to #{y_range_end}" - - # Turn it into interleaved form for ZRANGE query. - # We assume we need 9 bits for each integer, so the final - # interleaved representation will be 18 bits. - xbin = x_range_start.to_s(2).rjust(9,'0') - ybin = y_range_start.to_s(2).rjust(9,'0') - s = xbin.split("").zip(ybin.split("")).flatten.compact.join("") - # Now that we have the start of the range, calculate the end - # by replacing the specified number of bits from 0 to 1. 
- e = s[0..-(bits+1)]+("1"*bits) - puts "ZRANGE myindex [#{s} [#{e} BYLEX" - } +```ruby +def spacequery(x0,y0,x1,y1,exp) + bits=exp*2 + x_start = x0/(2**exp) + x_end = x1/(2**exp) + y_start = y0/(2**exp) + y_end = y1/(2**exp) + (x_start..x_end).each{|x| + (y_start..y_end).each{|y| + x_range_start = x*(2**exp) + x_range_end = x_range_start | ((2**exp)-1) + y_range_start = y*(2**exp) + y_range_end = y_range_start | ((2**exp)-1) + puts "#{x},#{y} x from #{x_range_start} to #{x_range_end}, y from #{y_range_start} to #{y_range_end}" + + # Turn it into interleaved form for ZRANGE query. + # We assume we need 9 bits for each integer, so the final + # interleaved representation will be 18 bits. + xbin = x_range_start.to_s(2).rjust(9,'0') + ybin = y_range_start.to_s(2).rjust(9,'0') + s = xbin.split("").zip(ybin.split("")).flatten.compact.join("") + # Now that we have the start of the range, calculate the end + # by replacing the specified number of bits from 0 to 1. + e = s[0..-(bits+1)]+("1"*bits) + puts "ZRANGE myindex [#{s} [#{e} BYLEX" } - end + } +end - spacequery(50,100,100,300,6) +spacequery(50,100,100,300,6) +``` While non immediately trivial this is a very useful indexing strategy that in the future may be implemented in Valkey in a native way. diff --git a/topics/installation.md b/topics/installation.md index 444c3922..feba57d1 100644 --- a/topics/installation.md +++ b/topics/installation.md @@ -72,10 +72,10 @@ By default Valkey binds to **all the interfaces** and has no authentication at a 1. Make sure the port Valkey uses to listen for connections (by default 6379 and additionally 16379 if you run Valkey in cluster mode, plus 26379 for Sentinel) is firewalled, so that it is not possible to contact Valkey from the outside world. 2. Use a configuration file where the `bind` directive is set in order to guarantee that Valkey listens on only the network interfaces you are using. 
For example, only the loopback interface (127.0.0.1) if you are accessing Valkey locally from the same computer. -3. Use the `requirepass` option to add an additional layer of security so that clients will be required to authenticate using the `AUTH` command. -4. Use [spiped](https://www.tarsnap.com/spiped.html) or another SSL tunneling software to encrypt traffic between Valkey servers and Valkey clients if your environment requires encryption. +3. Set up authentication using [Access Control List (ACL)](acl.md) or use the `requirepass` option to add an additional layer of security so that clients will be required to authenticate using the `AUTH` command. +4. Use [TLS](encryption.md) to encrypt traffic between Valkey servers and Valkey clients if your environment requires encryption. -Note that a Valkey instance exposed to the internet without any security [is very simple to exploit](https://web.archive.org/web/20241119215618/http://antirez.com/news/96), so make sure you understand the above and apply **at least** a firewall layer. After the firewall is in place, try to connect with `valkey-cli` from an external host to confirm that the instance is not reachable. +Make sure you understand the above and apply **at least** a firewall layer. After the firewall is in place, try to connect with `valkey-cli` from an external host to confirm that the instance is not reachable. ## Use Valkey from your application @@ -86,7 +86,11 @@ You'll find a [full list of clients for different languages in this page](../cli ## Valkey persistence -You can learn [how Valkey persistence works on this page](persistence.md). It is important to understand that, if you start Valkey with the default configuration, Valkey will spontaneously save the dataset only from time to time. For example, after at least five minutes if you have at least 100 changes in your data. 
If you want your database to persist and be reloaded after a restart make sure to call the **SAVE** command manually every time you want to force a data set snapshot. Alternatively, you can save the data on disk before quitting by using the **SHUTDOWN** command: +You can learn [how Valkey persistence works on this page](persistence.md). +It is important to understand that, if you start Valkey with the default configuration, Valkey will spontaneously save the dataset only from time to time. +For example, after at least five minutes if you have at least 100 changes in your data. +If you want your database to persist and be reloaded after a restart, make sure to call the [SAVE](../commands/save.md) command manually every time you want to force a data set snapshot. +Alternatively, you can save the data on disk before quitting by using the [SHUTDOWN](../commands/shutdown.md) command: ``` $ valkey-cli shutdown @@ -94,17 +98,16 @@ $ valkey-cli shutdown This way, Valkey will save the data on disk before quitting. Reading the [persistence page](persistence.md) is strongly suggested to better understand how Valkey persistence works. -## Install Valkey properly +## Install Valkey as a system service -Running Valkey from the command line is fine just to hack a bit or for development. However, at some point you'll have some actual application to run on a real server. For this kind of usage you have two different choices: +Running Valkey from the command line is fine just to hack a bit or for development. However, at some point you'll have some actual application to run on a real server. +For this kind of usage, it's highly recommended to install Valkey as a system service so that everything will start properly after a system restart. +The available packages for supported Linux distributions already include the capability of starting the Valkey server as a service. -* Run Valkey using screen. 
-* Install Valkey in your Linux box in a proper way using an init script, so that after a restart everything will start again properly. +Valkey supports systemd, but this document was written for init scripts, before systemd was widely adopted. +There are many guides online for how to set up a systemd service. -A proper install using an init script is strongly recommended. - -**Note:** -The available packages for supported Linux distributions already include the capability of starting the Valkey server from `/etc/init`. +The remainder of this section explains how to set up Valkey using an init script, for distros like Alpine Linux that don't use systemd. If you have not yet run `make install` after building the Valkey source, you will need to do so before continuing. By default, `make install` will copy the `valkey-server` and `valkey-cli` binaries to `/usr/local/bin`. @@ -168,9 +171,8 @@ Make sure that everything is working as expected: 3. Check that your Valkey instance is logging to the `/var/log/valkey_6379.log` file. 4. If it's a new machine where you can try it without problems, make sure that after a reboot everything is still working. -**Note:** +## Configuring Valkey + The above instructions don't include all of the Valkey configuration parameters that you could change. For example, to use AOF persistence instead of RDB persistence, or to set up replication, and so forth. You should also read the example [valkey.conf](https://github.com/valkey-io/valkey/blob/unstable/valkey.conf) file, which is heavily annotated to help guide you on making changes. Further details can also be found in the [configuration article on this site](valkey.conf.md). - -
diff --git a/topics/introduction.md b/topics/introduction.md index e176505d..61f9579d 100644 --- a/topics/introduction.md +++ b/topics/introduction.md @@ -29,8 +29,12 @@ Valkey also includes: * [LRU eviction of keys](lru-cache.md) * [Automatic failover](sentinel.md) -You can use Valkey from [most programming languages](../clients/). - -Valkey is written in **ANSI C** and works on most POSIX systems like Linux, -\*BSD, and Mac OS X, without external dependencies. Linux and OS X are the two operating systems where Valkey is developed and tested the most, and we **recommend using Linux for deployment**. Valkey may work in Solaris-derived systems like SmartOS, but support is *best effort*. +You can use Valkey from most programming languages. See [clients](../clients/). + +Valkey is written in **ANSI C 11** with Atomics and a few GCC/Clang built-ins like `__builtin_clz()`. +It works on most POSIX systems like Linux, \*BSD and macOS, without external dependencies. +Linux and macOS are the two operating systems where Valkey is developed and tested the most, and we **recommend using Linux for deployment**. +Valkey may work on Solaris-derived systems like Illumos, but support is *best effort*. +Supported hardware includes x86-64 (AKA amd64), x86 (32-bit) and AArch64 (64-bit ARM). +It is also known to work on IBM z/Architecture like s390x and builds for this system are available from the Fedora distro. There is no official support for Windows builds. diff --git a/topics/protocol.md b/topics/protocol.md index 08c44c07..61539c5d 100644 --- a/topics/protocol.md +++ b/topics/protocol.md @@ -1,7 +1,7 @@ --- title: "Serialization protocol specification" description: Valkey's serialization protocol (RESP) is the wire protocol that clients implement ----- +--- To communicate with the Valkey server, Valkey clients use a protocol called REdis Serialization Protocol (RESP). While the protocol was designed for Redis, it's used by many other client-server software projects. 
diff --git a/wordlist b/wordlist index 24d7aebf..ee76d8aa 100644 --- a/wordlist +++ b/wordlist @@ -4,6 +4,7 @@ .rdb ˈrɛd-ɪs A1 +AArch64 acknowledgement ACKs acl @@ -20,6 +21,7 @@ allkeys-random allocator allocator's allocators +amd64 AMD64 analytics antirez @@ -44,6 +46,7 @@ Async async Asyncio atomicity +Atomics Atomicvar Attribution-ShareAlike Auth @@ -196,7 +199,10 @@ Diskless diskless DistLock distlock +distro +distros dnf +DragonFlyBSD dup-sentinel Dynomite earts @@ -344,6 +350,7 @@ Identinal idletime idx idx'-th +Illumos incr incrby incrby_get_mget @@ -735,6 +742,7 @@ runlevels RW Rx/Tx S[1-9] +s390x SaaS sadd sadd_smembers @@ -782,6 +790,7 @@ Snapcraft snapd Snapshotting snapshotting +Solaris Solaris-derived somekey SomeOtherValue @@ -832,6 +841,7 @@ SYNC_RDB_START syncd syscall systemctl +systemd T1 T2 taskset @@ -924,6 +934,8 @@ whos-using-redis WQE WQEs WSL2 +x86 +x86-64 xack xadd xadd_2