octopus-merge (part 2: blob-merge) #1585

Merged 10 commits on Sep 30, 2024
23 changes: 23 additions & 0 deletions Cargo.lock


1 change: 1 addition & 0 deletions Cargo.toml
@@ -243,6 +243,7 @@ members = [
"gix-object",
"gix-glob",
"gix-diff",
"gix-merge",
"gix-date",
"gix-traverse",
"gix-dir",
5 changes: 3 additions & 2 deletions README.md
@@ -130,10 +130,11 @@ is usable to some extent.
* [gix-submodule](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-submodule)
* [gix-status](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-status)
* [gix-worktree-state](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-worktree-state)
* `gitoxide-core`
* **very early** _(possibly without any documentation and many rough edges)_
* [gix-date](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-date)
* [gix-dir](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-dir)
* `gitoxide-core`
* **very early** _(possibly without any documentation and many rough edges)_
* [gix-merge](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-merge)
* **idea** _(just a name placeholder)_
* [gix-note](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-note)
* [gix-fetchhead](https://github.com/Byron/gitoxide/blob/main/crate-status.md#gix-fetchhead)
19 changes: 18 additions & 1 deletion crate-status.md
@@ -196,6 +196,9 @@ The top-level crate that acts as hub to all functionality provided by the `gix-*
* [x] probe capabilities
* [x] symlink creation and removal
* [x] file snapshots
* [ ] **BString Interner with Arena-Backing and arbitrary value association**
- probably based on [`internment`](https://docs.rs/internment/latest/internment/struct.Arena.html#),
but needs `bumpalo` support to avoid item allocations/boxing, and avoid internal `Mutex`. (key type is pointer based).

### gix-fs
* [x] probe capabilities
@@ -215,6 +218,7 @@ The top-level crate that acts as hub to all functionality provided by the `gix-*
* [x] [name validation][tagname-validation]
* [x] transform borrowed to owned objects
* [x] edit trees efficiently and write changes back
- [ ] See if `gix-fs::InternedMap` improves performance.
* [x] API documentation
* [ ] Some examples

@@ -320,11 +324,24 @@ Check out the [performance discussion][gix-diff-performance] as well.
* [x] prepare invocation of external diff program
- [ ] pass meta-info
* [ ] working with hunks of data
* [ ] diff-heuristics match Git perfectly
* [x] API documentation
* [ ] Examples

[gix-diff-performance]: https://github.com/Byron/gitoxide/discussions/74

### gix-merge

* [x] three-way merge analysis of blobs with choice of how to resolve conflicts
- [ ] choose how to resolve conflicts on the data-structure
- [ ] produce a new blob based on data-structure containing possible resolutions
- [x] `merge` style
- [x] `diff3` style
- [x] `zdiff3` style
* [ ] diff-heuristics match Git perfectly
* [x] API documentation
* [ ] Examples
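
For orientation, the `merge` and `diff3` styles listed above differ only in whether the ancestor's version of a conflicting hunk is shown. The following is a minimal sketch of the marker layout Git uses, not the code added by this PR; `ConflictStyle`, `render_conflict` and the marker labels are hypothetical names chosen for illustration, and the `zdiff3` variant (which additionally moves common leading and trailing lines out of the conflict) is left out.

```rust
/// Hypothetical enum for this sketch; only two of the three styles are modeled.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum ConflictStyle {
    /// Show only our and their side.
    Merge,
    /// Additionally show the ancestor between our and their side.
    Diff3,
}

/// Render a single conflicting hunk with Git-like conflict markers.
///
/// Each input is assumed to end with a newline. The labels after the
/// markers (`ours`, `ancestor`, `theirs`) are placeholders.
fn render_conflict(style: ConflictStyle, ancestor: &str, ours: &str, theirs: &str) -> String {
    let mut out = String::new();
    out.push_str("<<<<<<< ours\n");
    out.push_str(ours);
    if style == ConflictStyle::Diff3 {
        // Only the diff3 layout includes the common ancestor.
        out.push_str("||||||| ancestor\n");
        out.push_str(ancestor);
    }
    out.push_str("=======\n");
    out.push_str(theirs);
    out.push_str(">>>>>>> theirs\n");
    out
}

fn main() {
    print!("{}", render_conflict(ConflictStyle::Merge, "a\n", "b\n", "c\n"));
    print!("{}", render_conflict(ConflictStyle::Diff3, "a\n", "b\n", "c\n"));
}
```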

### gix-traverse

Check out the [performance discussion][gix-traverse-performance] as well.
4 changes: 2 additions & 2 deletions gix-attributes/src/state.rs
@@ -23,9 +23,9 @@ impl<'a> ValueRef<'a> {
}

/// Access and conversions
impl ValueRef<'_> {
impl<'a> ValueRef<'a> {
/// Access this value as byte string.
pub fn as_bstr(&self) -> &BStr {
pub fn as_bstr(&self) -> &'a BStr {
self.0.as_bytes().as_bstr()
}
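
The change from `&BStr` to `&'a BStr` ties the returned reference to the underlying data rather than to the borrow of `self`. The sketch below shows why that matters. It is a stand-alone illustration that uses `&[u8]` in place of `BStr`; the `ValueRef` type here and the `first_word` helper are hypothetical stand-ins, not the `gix-attributes` types.

```rust
/// Hypothetical stand-in for `ValueRef`: a thin wrapper around borrowed bytes.
#[derive(Copy, Clone)]
struct ValueRef<'a>(&'a [u8]);

impl<'a> ValueRef<'a> {
    /// Returning `&'a [u8]` ties the result to the wrapped data,
    /// not to the (possibly short-lived) `ValueRef` it was read from.
    fn as_bytes(&self) -> &'a [u8] {
        self.0
    }
}

/// This compiles only because `as_bytes()` returns `&'a [u8]`.
/// With an elided lifetime the result would borrow from the local
/// `value`, which cannot be returned from the function.
fn first_word<'a>(data: &'a [u8]) -> &'a [u8] {
    let value = ValueRef(data); // local wrapper, gone at the end of the function
    let bytes = value.as_bytes();
    let end = bytes.iter().position(|b| *b == b' ').unwrap_or(bytes.len());
    &bytes[..end]
}

fn main() {
    let data = b"text eol=lf".to_vec();
    let word = first_word(&data);
    assert_eq!(word, &b"text"[..]);
    println!("{}", String::from_utf8_lossy(word));
}
```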

12 changes: 11 additions & 1 deletion gix-diff/src/blob/pipeline.rs
@@ -22,6 +22,7 @@ pub struct WorktreeRoots {
pub new_root: Option<PathBuf>,
}

/// Access
impl WorktreeRoots {
/// Return the root path for the given `kind`
pub fn by_kind(&self, kind: ResourceKind) -> Option<&Path> {
@@ -30,6 +31,11 @@ impl WorktreeRoots {
ResourceKind::NewOrDestination => self.new_root.as_deref(),
}
}

/// Return `true` if all worktree roots are unset.
pub fn is_unset(&self) -> bool {
self.new_root.is_none() && self.old_root.is_none()
}
}

/// Data as part of an [Outcome].
@@ -184,6 +190,8 @@ impl Pipeline {
/// Access
impl Pipeline {
/// Return all drivers that this instance was initialized with.
///
/// They are sorted by [`name`](Driver::name) to support binary searches.
pub fn drivers(&self) -> &[super::Driver] {
&self.drivers
}
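
The new doc comment states that the drivers are sorted by name so that callers can use a binary search. A caller could exploit this roughly as follows. This is a sketch under assumptions: the `Driver` struct is a trimmed-down stand-in that uses `String` for its name, and `find_driver` and `sample_drivers` are hypothetical helpers that are not part of `gix-diff`.

```rust
/// Hypothetical, trimmed-down stand-in for a diff driver.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Driver {
    name: String,
    command: Option<String>,
}

/// Look up a driver by name, relying on `drivers` being sorted by `name`.
fn find_driver<'a>(drivers: &'a [Driver], name: &str) -> Option<&'a Driver> {
    drivers
        .binary_search_by(|d| d.name.as_str().cmp(name))
        .ok()
        .map(|idx| &drivers[idx])
}

/// Build a small driver list and sort it once, as the pipeline is said to do.
fn sample_drivers() -> Vec<Driver> {
    let mut drivers = vec![
        Driver { name: "pdf".into(), command: Some("pdftotext".into()) },
        Driver { name: "docx".into(), command: None },
        Driver { name: "image".into(), command: None },
    ];
    drivers.sort_by(|a, b| a.name.cmp(&b.name));
    drivers
}

fn main() {
    let drivers = sample_drivers();
    match find_driver(&drivers, "pdf") {
        Some(driver) => println!("found {:?}", driver),
        None => println!("no such driver"),
    }
}
```

The lookup is only correct while the sort order is maintained, which is why documenting the invariant on `drivers()` is useful.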
@@ -445,7 +453,7 @@ impl Pipeline {
}
}
.map_err(|err| {
convert_to_diffable::Error::CreateTempfile {
convert_to_diffable::Error::StreamCopy {
source: err,
rela_path: rela_path.to_owned(),
}
@@ -533,6 +541,8 @@ impl Driver {
pub fn prepare_binary_to_text_cmd(&self, path: &Path) -> Option<std::process::Command> {
let command: &BStr = self.binary_to_text_command.as_ref()?.as_ref();
let cmd = gix_command::prepare(gix_path::from_bstr(command).into_owned())
// TODO: Add support for an actual Context, validate it *can* match Git
.with_context(Default::default())
.with_shell()
.stdin(Stdio::null())
.stdout(Stdio::piped())
3 changes: 2 additions & 1 deletion gix-diff/src/blob/platform.rs
@@ -184,7 +184,7 @@ pub mod prepare_diff {

use crate::blob::platform::Resource;

/// The kind of operation that was performed during the [`diff`](super::Platform::prepare_diff()) operation.
/// The kind of operation that should be performed based on the configuration of the resources involved in the diff.
#[derive(Debug, Copy, Clone, Eq, PartialEq)]
pub enum Operation<'a> {
/// The [internal diff algorithm](imara_diff::diff) should be called with the provided arguments.
@@ -383,6 +383,7 @@ impl Platform {
///
/// If one of the resources is binary, the operation reports an error as such resources don't make their data available
/// which is required for the external diff to run.
// TODO: fix this - the diff shouldn't fail if binary (or large) files are used, just copy them into tempfiles.
pub fn prepare_diff_command(
&self,
diff_command: BString,
6 changes: 4 additions & 2 deletions gix-filter/src/eol/convert_to_git.rs
@@ -57,8 +57,10 @@ pub(crate) mod function {
/// Return `true` if `buf` was written or `false` if nothing had to be done.
/// Depending on the state in `buf`, `index_object` is called to write the version of `src` as stored in the index
/// into the buffer if it is a blob, or to return `Ok(None)` if no such object exists.
/// If renormalization is desired, let it return `Ok(None)` at all times to not let it have any influence over the
/// outcome of this function.
///
/// *If renormalization is desired*, let it return `Ok(None)` at all times to not let it have any influence over the
/// outcome of this function. Otherwise, it will check if the in-index buffer already has newlines that it would now
/// want to change, and avoid doing so as what's in Git should be what's desired (except for when *renormalizing*).
/// If `round_trip_check` is not `None`, round-tripping will be validated and handled accordingly.
pub fn convert_to_git(
src: &[u8],
10 changes: 5 additions & 5 deletions gix-filter/src/pipeline/convert.rs
@@ -91,7 +91,7 @@ impl Pipeline {
self.options.eol_config,
)?;

let mut in_buffer = false;
let mut in_src_buffer = false;
// this is just an approximation, but it's as good as it gets without reading the actual input.
let would_convert_eol = eol::convert_to_git(
b"\r\n",
@@ -119,13 +119,13 @@ impl Pipeline {
}
self.bufs.clear();
read.read_to_end(&mut self.bufs.src)?;
in_buffer = true;
in_src_buffer = true;
}
}
if !in_buffer && (apply_ident_filter || encoding.is_some() || would_convert_eol) {
if !in_src_buffer && (apply_ident_filter || encoding.is_some() || would_convert_eol) {
self.bufs.clear();
src.read_to_end(&mut self.bufs.src)?;
in_buffer = true;
in_src_buffer = true;
}

if let Some(encoding) = encoding {
@@ -158,7 +158,7 @@ impl Pipeline {
if apply_ident_filter && ident::undo(&self.bufs.src, &mut self.bufs.dest)? {
self.bufs.swap();
}
Ok(if in_buffer {
Ok(if in_src_buffer {
ToGitOutcome::Buffer(&self.bufs.src)
} else {
ToGitOutcome::Unchanged(src)
49 changes: 49 additions & 0 deletions gix-merge/Cargo.toml
@@ -0,0 +1,49 @@
[package]
name = "gix-merge"
version = "0.0.0"
repository = "https://github.com/Byron/gitoxide"
license = "MIT OR Apache-2.0"
description = "A crate of the gitoxide project implementing merge algorithms"
authors = ["Sebastian Thiel <[email protected]>"]
edition = "2021"
rust-version = "1.65"

[lints]
workspace = true

[lib]
doctest = false

[features]
default = ["blob"]
## Enable diffing of blobs using imara-diff, which also allows for a generic rewrite tracking implementation.
blob = ["dep:imara-diff", "dep:gix-filter", "dep:gix-worktree", "dep:gix-path", "dep:gix-fs", "dep:gix-command", "dep:gix-tempfile", "dep:gix-trace", "dep:gix-quote"]
## Data structures implement `serde::Serialize` and `serde::Deserialize`.
serde = ["dep:serde", "gix-hash/serde", "gix-object/serde"]

[dependencies]
gix-hash = { version = "^0.14.2", path = "../gix-hash" }
gix-object = { version = "^0.44.0", path = "../gix-object" }
gix-filter = { version = "^0.13.0", path = "../gix-filter", optional = true }
gix-worktree = { version = "^0.36.0", path = "../gix-worktree", default-features = false, features = ["attributes"], optional = true }
gix-command = { version = "^0.3.9", path = "../gix-command", optional = true }
gix-path = { version = "^0.10.11", path = "../gix-path", optional = true }
gix-fs = { version = "^0.11.3", path = "../gix-fs", optional = true }
gix-tempfile = { version = "^14.0.0", path = "../gix-tempfile", optional = true }
gix-trace = { version = "^0.1.10", path = "../gix-trace", optional = true }
gix-quote = { version = "^0.4.12", path = "../gix-quote", optional = true }

thiserror = "1.0.63"
imara-diff = { version = "0.1.7", optional = true }
bstr = { version = "1.5.0", default-features = false }
serde = { version = "1.0.114", optional = true, default-features = false, features = ["derive"] }

document-features = { version = "0.2.0", optional = true }

[dev-dependencies]
gix-testtools = { path = "../tests/tools" }
pretty_assertions = "1.4.0"

[package.metadata.docs.rs]
all-features = true
features = ["document-features"]
1 change: 1 addition & 0 deletions gix-merge/LICENSE-APACHE
1 change: 1 addition & 0 deletions gix-merge/LICENSE-MIT
43 changes: 43 additions & 0 deletions gix-merge/src/blob/builtin_driver/binary.rs
@@ -0,0 +1,43 @@
/// What to do when having to pick a side to resolve a conflict.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub enum ResolveWith {
/// Choose the ancestor to resolve a conflict.
Ancestor,
/// Choose our side to resolve a conflict.
Ours,
/// Choose their side to resolve a conflict.
Theirs,
}

/// Tell the caller of [`merge()`](function::merge) which side was picked.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub enum Pick {
/// Chose the ancestor.
Ancestor,
/// Chose our side.
Ours,
/// Chose their side.
Theirs,
}

pub(super) mod function {
use crate::blob::builtin_driver::binary::{Pick, ResolveWith};
use crate::blob::Resolution;

/// As this algorithm doesn't look at the actual data, it returns a choice solely based on logic.
///
/// It always results in a conflict with `Ours` being picked unless `on_conflict` is `Some`,
/// in which case the requested side is picked and the resolution is complete.
pub fn merge(on_conflict: Option<ResolveWith>) -> (Pick, Resolution) {
match on_conflict {
None => (Pick::Ours, Resolution::Conflict),
Some(resolve) => (
match resolve {
ResolveWith::Ours => Pick::Ours,
ResolveWith::Theirs => Pick::Theirs,
ResolveWith::Ancestor => Pick::Ancestor,
},
Resolution::Complete,
),
}
}
}
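
To see how a caller might consume the `(Pick, Resolution)` pair, here is a self-contained sketch. The `merge` function mirrors the one in the diff above, while `Resolution` is an assumed two-variant shape of `crate::blob::Resolution` (its definition is not part of the shown hunks), and `pick_bytes` is a hypothetical helper added for illustration.

```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum ResolveWith { Ancestor, Ours, Theirs }

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum Pick { Ancestor, Ours, Theirs }

/// Assumed shape of `crate::blob::Resolution`; only the two variants
/// used by the binary driver are modeled here.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum Resolution { Complete, Conflict }

/// Same logic as the binary driver above: no data is inspected.
fn merge(on_conflict: Option<ResolveWith>) -> (Pick, Resolution) {
    match on_conflict {
        None => (Pick::Ours, Resolution::Conflict),
        Some(resolve) => (
            match resolve {
                ResolveWith::Ours => Pick::Ours,
                ResolveWith::Theirs => Pick::Theirs,
                ResolveWith::Ancestor => Pick::Ancestor,
            },
            Resolution::Complete,
        ),
    }
}

/// Hypothetical helper: map the pick back to the blob it refers to.
fn pick_bytes<'a>(pick: Pick, ancestor: &'a [u8], ours: &'a [u8], theirs: &'a [u8]) -> &'a [u8] {
    match pick {
        Pick::Ancestor => ancestor,
        Pick::Ours => ours,
        Pick::Theirs => theirs,
    }
}

fn main() {
    // Without a resolution strategy, our side is kept and a conflict is reported.
    let (pick, resolution) = merge(None);
    assert_eq!((pick, resolution), (Pick::Ours, Resolution::Conflict));

    // With a strategy, the requested side wins and the merge counts as resolved.
    let (pick, resolution) = merge(Some(ResolveWith::Theirs));
    assert_eq!(resolution, Resolution::Complete);
    assert_eq!(pick_bytes(pick, b"base", b"ours", b"theirs"), &b"theirs"[..]);
}
```

Returning a `Pick` instead of the data keeps the driver independent of how blobs are stored; the caller decides whether to copy bytes or just reuse an object id.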
30 changes: 30 additions & 0 deletions gix-merge/src/blob/builtin_driver/mod.rs
@@ -0,0 +1,30 @@
use crate::blob::BuiltinDriver;

impl BuiltinDriver {
/// Return the name of this instance.
pub fn as_str(&self) -> &str {
match self {
BuiltinDriver::Text => "text",
BuiltinDriver::Binary => "binary",
BuiltinDriver::Union => "union",
}
}

/// Get all available built-in drivers.
pub fn all() -> &'static [Self] {
&[BuiltinDriver::Text, BuiltinDriver::Binary, BuiltinDriver::Union]
}

/// Try to match one of our variants to `name`, case-sensitive, and return its instance.
pub fn by_name(name: &str) -> Option<Self> {
Self::all().iter().find(|variant| variant.as_str() == name).copied()
}
}

///
pub mod binary;
pub use binary::function::merge as binary;

///
pub mod text;
pub use text::function::merge as text;
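
A short sketch of how `by_name()` could be used when resolving a merge driver name: try the built-in drivers first, and otherwise treat the name as a user-defined driver. The `BuiltinDriver` enum definition below is an assumption, since it lives in `crate::blob` and is not part of the shown hunks, and `describe` is a hypothetical helper.

```rust
/// Assumed definition; the real enum lives in `crate::blob`.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BuiltinDriver { Text, Binary, Union }

impl BuiltinDriver {
    /// Return the name of this instance.
    fn as_str(&self) -> &str {
        match self {
            BuiltinDriver::Text => "text",
            BuiltinDriver::Binary => "binary",
            BuiltinDriver::Union => "union",
        }
    }

    /// Get all available built-in drivers.
    fn all() -> &'static [Self] {
        &[BuiltinDriver::Text, BuiltinDriver::Binary, BuiltinDriver::Union]
    }

    /// Match `name` case-sensitively against the built-in drivers.
    fn by_name(name: &str) -> Option<Self> {
        Self::all().iter().find(|variant| variant.as_str() == name).copied()
    }
}

/// Hypothetical helper: describe which kind of driver a name refers to.
fn describe(name: &str) -> String {
    match BuiltinDriver::by_name(name) {
        Some(driver) => format!("built-in driver '{}'", driver.as_str()),
        None => format!("custom driver '{}'", name),
    }
}

fn main() {
    println!("{}", describe("union"));
    // The lookup is case-sensitive, so this is not a built-in driver.
    println!("{}", describe("Union"));
}
```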