Addition: Palindromic Tree #205

Open
wants to merge 2 commits into
base: main
58 changes: 58 additions & 0 deletions content/strings/PalindromicTree.h
@@ -0,0 +1,58 @@
/**
 * Author: Ahmed Dardery
 * Date: 2020-12-24
 * License: CC0
 * Source: https://codeforces.com/blog/entry/13959
 * Description: Builds a palindromic tree over a string
 * 0 is the imaginary string, 1 is the empty string
 * [2, n) are the palindromes, s.substr(pos[i], len[i]), occurs freq[i]
 * fail[i] is the longest suffix palindrome of ith palindrome
Comment on lines +6 to +9
Member

Suggested change
- * Description: Builds a palindromic tree over a string
- * 0 is the imaginary string, 1 is the empty string
- * [2, n) are the palindromes, s.substr(pos[i], len[i]), occurs freq[i]
- * fail[i] is the longest suffix palindrome of ith palindrome
+ * Description: Builds a palindromic tree over a string.
+ * 0 is the imaginary string, 1 is the empty string.
+ * [2, n) are the palindromes, s.substr(pos[i], len[i]), occurs freq[i] times.
+ * fail[i] is the longest suffix palindrome of i'th palindrome.

We need to explain the tree structure too. As I understand it, fail is a tree parent pointer, but I'm not clear on when fail[i] is 0 and when it's 1.

fail isn't a great name; suf or par may be better.
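
On when a link is 0 and when it is 1: a palindrome of length at least 2 always has its last character as a proper palindromic suffix, so only single-character nodes should link to the empty string (node 1), and only nodes 0 and 1 should link to the imaginary root. The sketch below is a standalone illustration of that structure, not the PR's code; `PalTreeSketch`, `link`, `walk` and the lowercase-only alphabet are assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical standalone sketch: node 0 is the imaginary root (len -1),
// node 1 is the empty string (len 0), nodes >= 2 are real palindromes.
struct PalTreeSketch {
    std::string s;
    std::vector<int> len, link;            // link = longest proper palindromic suffix
    std::vector<std::array<int, 26>> nxt;  // assumes characters 'a'..'z'
    int last = 1;                          // longest palindromic suffix of the prefix so far

    explicit PalTreeSketch(const std::string& str) : s(str) {
        addNode(-1, 0);  // imaginary root, links to itself
        addNode(0, 0);   // empty string, links to the imaginary root
        for (int i = 0; i < (int)s.size(); i++) addChar(i);
    }
    int addNode(int l, int lk) {
        len.push_back(l);
        link.push_back(lk);
        nxt.emplace_back();
        nxt.back().fill(-1);
        return (int)len.size() - 1;
    }
    // Follow links from u until s[i] can extend the palindrome on both sides.
    int walk(int u, int i) const {
        while (i - len[u] - 1 < 0 || s[i - len[u] - 1] != s[i]) u = link[u];
        return u;
    }
    void addChar(int i) {
        int c = s[i] - 'a', u = walk(last, i);
        if (nxt[u][c] == -1) {
            // A single character hangs off the imaginary root and links to node 1.
            int lk = len[u] == -1 ? 1 : nxt[walk(link[u], i)][c];
            int v = addNode(len[u] + 2, lk);
            nxt[u][c] = v;
        }
        last = nxt[u][c];
    }
};
```

For "abacaba" this builds nine nodes; the three single-character nodes link to 1, and every longer palindrome links to a non-empty palindrome.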

 * Time: O(n)
 * Status: stress-tested
 */
#pragma once

Member

Indent with tabs (which will expand to 2 spaces in the pdf)

struct PalinTree {
    const int A = 128;
Member

const member isn't great; make it static constexpr, move it outside, or do enum { A = 128 };
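
One concrete reason to avoid the non-static const member: it implicitly deletes the copy assignment operator, so a tree could no longer be reassigned. A small sketch of the alternatives, with hypothetical struct names chosen only for this comparison:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical structs, one per way of declaring the alphabet size.
struct WithConstMember { const int A = 128; };              // stored in every object
struct WithStaticConstexpr { static constexpr int A = 128; };  // one compile-time constant
struct WithEnum { enum { A = 128 }; };                      // also a compile-time constant

// A non-static const data member deletes the implicit copy assignment.
static_assert(!std::is_copy_assignable<WithConstMember>::value, "");
static_assert(std::is_copy_assignable<WithStaticConstexpr>::value, "");
static_assert(std::is_copy_assignable<WithEnum>::value, "");
```

Either compile-time form also lets A be used as an array bound, which a plain const member cannot be.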

    string str;
    vi fail, len, pos, lz, freq;
    vector<vi> nxt;
    int n, cur;

    PalinTree(const string &s) : str(s) {
        fail = len = pos = lz = freq = vi(sz(s) + 2);
        nxt.resize(sz(s) + 2);
        n = cur = fail[0] = fail[1] = 0;
Member

fail[0] and fail[1] are automatically 0, and n and cur can have = 0 at their declarations

        addNode(-1, -1), addNode(0, 0);
        rep(i, 0, sz(s)) addChar(i, s[i]);
Member

Suggested change
-        rep(i, 0, sz(s)) addChar(i, s[i]);
+        rep(i,0,sz(s)) addChar(i, s[i]);

(kactl doesn't use spaces in rep macro arguments except when they are long)

        propagate();
Member

not all usages of palindromic trees need propagate or palindrome counts, so I suspect we should mark this optional somehow

    }

Member

let's squeeze this a bit by removing blank lines between methods

    void addChar(int i, int c) {
        int u = getFailure(cur, i);
        int &ch = nxt[u][c];
Comment on lines +32 to +33
Member

Suggested change
-        int u = getFailure(cur, i);
-        int &ch = nxt[u][c];
+        int u = getFailure(cur, i), &ch = nxt[u][c];

        if (~ch) return (void) ++lz[cur = ch];
        int v = cur = ch = addNode(len[u] + 2, i - len[u] - 1);
        fail[v] = len[v] == 1 ? 1 : nxt[getFailure(fail[u], i)][c];
Member

this can use ch or cur instead of v (assuming getFailure is shortened to avoid line-wrapping)

    }

    int addNode(int l, int p) {
        nxt[n].assign(A, -1);
Member

Suggested change
-        nxt[n].assign(A, -1);
+        fill(all(nxt[n]), -1);

and make nxt a vector<array<int, A>>; I think that should increase performance
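
A minimal sketch of the suggested storage, assuming the alphabet size is a compile-time constant. The helper name `makeTransitions` is hypothetical, and whether this is actually faster than one heap-allocated vector per node is not measured here; the trade-off is that all rows are allocated up front in one contiguous buffer.

```cpp
#include <array>
#include <cassert>
#include <vector>

constexpr int A = 128;  // alphabet size, as in the PR

// Hypothetical helper: one contiguous buffer holding a fixed-size
// transition row per node, every entry initialised to -1 (no edge).
std::vector<std::array<int, A>> makeTransitions(int nodes) {
    std::array<int, A> empty;
    empty.fill(-1);
    return std::vector<std::array<int, A>>(nodes, empty);
}
```

With rows pre-filled like this, the per-node fill in addNode would no longer be needed.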

        len[n] = l, pos[n] = p, lz[n] = 1, freq[n] = 0;
Member

Suggested change
-        len[n] = l, pos[n] = p, lz[n] = 1, freq[n] = 0;
+        len[n] = l, pos[n] = p, lz[n] = 1;

        return n++;
    }

    void propagate() {
        for (int i = n - 1; ~i; --i) {
            freq[i] += lz[i];
            lz[fail[i]] += lz[i];
            lz[i] = 0;
Member

this is set up for propagate being called multiple times, but I don't see the point of that since it's O(n) which matches the full tree construction?

would it make sense to replace freq by lz, and remove this resetting?
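
A sketch of what the single-array version could look like, assuming propagation runs exactly once and that every link points to a lower index (true for nodes created in insertion order). The free function `propagateCounts` and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// cnt[v] starts as the number of positions whose longest palindromic
// suffix is node v; afterwards cnt[v] is the number of occurrences of v.
// Nodes 0 and 1 (imaginary and empty) are not meaningful palindromes.
void propagateCounts(std::vector<int>& cnt, const std::vector<int>& link) {
    for (int v = (int)cnt.size() - 1; v >= 2; --v)
        cnt[link[v]] += cnt[v];  // every occurrence of v ends with link[v]
}
```

For "aaa" the nodes a, aa, aaa start at one hit each and end at 3, 2 and 1 occurrences.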

        }
    }

    int getFailure(int u, int i) {
        while (i <= len[u] || str[i] != str[i - len[u] - 1]) u = fail[u];
Member

this line goes past 63 chars, wrap it before u = or golf variable names

        return u;
    }
};

1 change: 1 addition & 0 deletions content/strings/chapter.tex
@@ -8,3 +8,4 @@ \chapter{Strings}
\kactlimport{SuffixTree.h}
\kactlimport{Hashing.h}
\kactlimport{AhoCorasick.h}
\kactlimport{PalindromicTree.h}
Member

(it's a bit sad to have both PalindromicTree and Manacher, but I guess they kinda compute different things)
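
To make the difference concrete: Manacher reports, for each center, how far the longest palindrome around it extends, whereas the palindromic tree enumerates the distinct palindromes and can count their occurrences. Below is a sketch of the odd-length half of Manacher under that reading. It is a textbook formulation, not kactl's Manacher.h, and `oddRadii` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// d[i] = number of odd-length palindromes centered at i, i.e. the longest
// one spans s[i - d[i] + 1 .. i + d[i] - 1].
std::vector<int> oddRadii(const std::string& s) {
    int n = (int)s.size();
    std::vector<int> d(n);
    for (int i = 0, l = 0, r = -1; i < n; i++) {
        // Reuse the mirrored center inside the rightmost known palindrome [l, r].
        int k = i > r ? 1 : std::min(d[l + r - i], r - i + 1);
        while (i - k >= 0 && i + k < n && s[i - k] == s[i + k]) k++;
        d[i] = k;
        if (i + k - 1 > r) l = i - k + 1, r = i + k - 1;
    }
    return d;
}
```

The result is indexed by position, not by palindrome, so it says nothing directly about how many distinct palindromes exist or how often each occurs.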

32 changes: 32 additions & 0 deletions stress-tests/strings/PalindromicTree.cpp
@@ -0,0 +1,32 @@
#include "../utilities/template.h"
#include "../../content/strings/PalindromicTree.h"


int main() {
    rep(alpha, 2, 27) {
        rep(t, 0, 10000) {
            int n = 1 + rand() % 30;
            string s(n, 0);
            rep(i, 0, n) {
                s[i] = char(rand() % alpha);
            }
            PalinTree tree(s);
            map<string, int> mp;
            vector<vi> dp(n+1, vi(n+1));
            rep(i, 0, n+1) dp[0][i] = 1;
            rep(l, 1, n + 1) rep(i, 0, n - l + 1) {
                dp[l][i] = l<= 1 || (dp[l-2][i + 1] && s[i] == s[i + l - 1]);
                if (dp[l][i])
                    ++mp[s.substr(i, l)];
            }
            rep(i, 2, tree.n) {
                string sub = s.substr(tree.pos[i], tree.len[i]);
                int cnt = mp.find(sub)->second;
                assert(cnt == tree.freq[i]);
                mp.erase(sub);
            }
            assert(mp.empty());
        }
    }
    cout << "Test passed!\n";
}
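
The reference the stress test compares against can also be written without the dp table. The sketch below is a slower but more direct brute force, assuming short inputs like the test's n <= 30; `palindromeCounts` is a hypothetical helper, not part of the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Maps every distinct palindromic substring of s to its number of
// occurrences by checking each substring directly: O(n^3) overall.
std::map<std::string, int> palindromeCounts(const std::string& s) {
    std::map<std::string, int> cnt;
    int n = (int)s.size();
    for (int i = 0; i < n; i++)
        for (int l = 1; i + l <= n; l++) {
            std::string t = s.substr(i, l);
            // t is a palindrome iff it equals its own reverse.
            if (std::equal(t.begin(), t.end(), t.rbegin())) ++cnt[t];
        }
    return cnt;
}
```

The map size should equal the number of real nodes in the tree (tree.n - 2), which is the same condition the final assert(mp.empty()) enforces.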