How can I compare the children of a fragment? #11

Open
@archonic

So far I haven't been able to get sensible diff output for two simple fragments. I have this simple comparison service:

require 'nokogiri'
require 'nokogiri/diff' # nokogiri-diff adds the #diff method used below

class ComparisonService
  def initialize(seq1, seq2)
    @doc1 = Nokogiri::HTML.fragment(seq1)
    @doc2 = Nokogiri::HTML.fragment(seq2)
  end

  def raw
    {
      old: @doc1,
      new: @doc2
    }
  end

  def changes
    output = []
    @doc1.diff(@doc2) do |change, node|
      output << {
        change: change,
        node: node.to_html
      }
    end
    output
  end
end

This is the output of comparison.raw:

{:old=>
  #(DocumentFragment:0x2aef160d6238 {
    name = "#document-fragment",
    children = [
      #(Element:0x2aef160c5de8 { name = "p", children = [ #(Text "This paragraph remains the same.")] }),
      #(Element:0x2aef160c5d84 { name = "p", children = [ #(Text "This paragraph gets removed.")] })]
    }),
 :new=>
  #(DocumentFragment:0x2aef160c5b68 {
    name = "#document-fragment",
    children = [
      #(Element:0x2aef160c5438 { name = "p", children = [ #(Text "This paragraph remains the same.")] }),
      #(Element:0x2aef160c5348 { name = "p", children = [ #(Text "This paragraph is new.")] })]
    })}

I expected one removal and one addition for the change in the second paragraph. Instead, the changes method reports each whole fragment as a single change:

[{:change=>"-", :node=>"<p>This paragraph remains the same.</p><p>This paragraph gets removed.</p>"}, {:change=>"+", :node=>"<p>This paragraph remains the same.</p><p>This paragraph is new.</p>"}]

I've tried @doc1 = Nokogiri::HTML(seq1), but this wraps the content in <html> and <body> (unwanted) and seems to run the comparison against the children recursively, like a Russian doll:

[1] pry(#<DocumentsController>)> comp.raw
=> {:old=>
  #(Document:0x2aef1679ac90 {
    name = "document",
    children = [
      #(DTD:0x2aef1669663c { name = "html" }),
      #(Element:0x2aef16692604 {
        name = "html",
        children = [
          #(Element:0x2aef1668c31c {
            name = "body",
            children = [
              #(Element:0x2aef1667f5cc { name = "p", children = [ #(Text "This paragraph remains the same.")] }),
              #(Element:0x2aef166764f4 { name = "p", children = [ #(Text "This paragraph gets removed.")] })]
            })]
        })]
    }),
 :new=>
  #(Document:0x2aef1679abdc {
    name = "document",
    children = [
      #(DTD:0x2aef1663fcec { name = "html" }),
      #(Element:0x2aef1663e068 {
        name = "html",
        children = [
          #(Element:0x2aef16635ef4 {
            name = "body",
            children = [
              #(Element:0x2aef1662d1f0 { name = "p", children = [ #(Text "This paragraph remains the same.")] }),
              #(Element:0x2aef1661ebf0 { name = "p", children = [ #(Text "This paragraph is new.")] })]
            })]
        })]
    })}
[2] pry(#<DocumentsController>)> comp.changes
=> [{:change=>" ", :node=>""},
 {:change=>" ", :node=>"<html><body>\n<p>This paragraph remains the same.</p>\n<p>This paragraph gets removed.</p>\n</body></html>"},
 {:change=>" ", :node=>"<body>\n<p>This paragraph remains the same.</p>\n<p>This paragraph gets removed.</p>\n</body>"},
 {:change=>" ", :node=>"<p>This paragraph remains the same.</p>\n"},
 {:change=>" ", :node=>"<p>This paragraph gets removed.</p>"},
 {:change=>" ", :node=>"This paragraph remains the same."},
 {:change=>"-", :node=>"This paragraph gets removed."},
 {:change=>"+", :node=>"This paragraph is new."}]
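
For reference, the recursive list above can be narrowed to the actual modifications by filtering on the change marker. This is only a minimal sketch in plain Ruby: the entries are copied (abridged) from the output above, and `only_modifications` is a hypothetical helper name, not part of nokogiri-diff.

```ruby
# Entries copied (abridged) from the comp.changes output above.
changes = [
  { change: " ", node: "<p>This paragraph remains the same.</p>\n" },
  { change: " ", node: "This paragraph remains the same." },
  { change: "-", node: "This paragraph gets removed." },
  { change: "+", node: "This paragraph is new." }
]

# Hypothetical helper: drop unchanged entries (marker " ") and keep
# only removals ("-") and additions ("+").
def only_modifications(changes)
  changes.reject { |entry| entry[:change] == " " }
end

only_modifications(changes).each do |entry|
  puts "#{entry[:change]} #{entry[:node]}"
end
```

This leaves only the two text-level entries, but it still reports text nodes rather than the enclosing paragraphs.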

I'm not sure whether others find that output useful, but I'd like to render the HTML changes side by side, the way commits can be viewed on GitHub. Any suggestions?
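
To make the desired result concrete, here is a rough sketch of the side-by-side rows I have in mind. It does not use nokogiri-diff at all: it assumes each fragment's top-level children have already been serialized to strings (for example with something like `fragment.children.map(&:to_html)`) and pairs them up with a plain longest-common-subsequence table. `side_by_side` is a hypothetical helper name.

```ruby
# Hypothetical helper: aligns two lists of serialized child nodes into
# side-by-side rows using a longest-common-subsequence (LCS) table.
def side_by_side(old_items, new_items)
  old_len = old_items.length
  new_len = new_items.length

  # lcs[i][j] = length of the LCS of old_items[i..] and new_items[j..]
  lcs = Array.new(old_len + 1) { Array.new(new_len + 1, 0) }
  (old_len - 1).downto(0) do |i|
    (new_len - 1).downto(0) do |j|
      lcs[i][j] =
        if old_items[i] == new_items[j]
          lcs[i + 1][j + 1] + 1
        else
          [lcs[i + 1][j], lcs[i][j + 1]].max
        end
    end
  end

  # Walk both lists from the start, emitting one row per step.
  rows = []
  i = j = 0
  while i < old_len || j < new_len
    if i < old_len && j < new_len && old_items[i] == new_items[j]
      rows << { change: " ", old: old_items[i], new: new_items[j] }
      i += 1
      j += 1
    elsif i < old_len && (j == new_len || lcs[i + 1][j] >= lcs[i][j + 1])
      rows << { change: "-", old: old_items[i], new: nil }
      i += 1
    else
      rows << { change: "+", old: nil, new: new_items[j] }
      j += 1
    end
  end
  rows
end

old_items = ["<p>This paragraph remains the same.</p>",
             "<p>This paragraph gets removed.</p>"]
new_items = ["<p>This paragraph remains the same.</p>",
             "<p>This paragraph is new.</p>"]

side_by_side(old_items, new_items).each do |row|
  puts "#{row[:change]} | #{row[:old]} | #{row[:new]}"
end
```

For the two fragments above this yields one unchanged row, one removal and one addition, which is the shape I would like to get from the diff itself.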
