In the page_inlink.txt file, some page links (e.g. "Henry_Hutchinson" -> "Stub") are
wrong.
This happens because the page link parser does not distinguish namespaces: some
pages link to "Wikipedia:Stub" rather than "Stub", but the link is still resolved
against the main-namespace page "Stub".
I suggest modifying the method:
public void processPageLinksRow(PagelinksParser plParser)
in SingleDumpVersionJDK.java
from
    public void processPageLinksRow(PagelinksParser plParser)
            throws IOException {
        int pl_from = plParser.getPlFrom();
        String pl_to = plParser.getPlTo();
        if (pl_to != null) {
            KeyType plToHash = (KeyType) hashAlgorithm.hashCode(pl_to);
            Integer pl_toValue = pNamePageIdMap.get(plToHash);
            // skip redirects if skipPage is enabled
            if ((!skipPage || pPageIdNameMap.containsKey(pl_from))
                    && pl_toValue != null) {
                pageOutlinks.addRow(pl_from, pl_toValue);
                pageInlinks.addRow(pl_toValue, pl_from);
            }
        }
    }
to
    public void processPageLinksRow(PagelinksParser plParser)
            throws IOException {
        int pl_from = plParser.getPlFrom();
        String pl_to = plParser.getPlTo();
        int pl_namespace = plParser.getPlNamespace();
        if (pl_to != null) {
            switch (pl_namespace) {
                case NS_MAIN: {
                    KeyType plToHash = (KeyType) hashAlgorithm.hashCode(pl_to);
                    Integer pl_toValue = pNamePageIdMap.get(plToHash);
                    // skip redirects if skipPage is enabled
                    if ((!skipPage || pPageIdNameMap.containsKey(pl_from))
                            && pl_toValue != null) {
                        pageOutlinks.addRow(pl_from, pl_toValue);
                        pageInlinks.addRow(pl_toValue, pl_from);
                    }
                }
            }
        }
    }
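
To illustrate why the namespace check matters, here is a minimal, self-contained sketch. It does not use the real parser or maps: the `LinkRow` class, the `resolve` method, and the page ids are hypothetical stand-ins. It assumes the usual MediaWiki convention that a pagelinks row stores the target title without its namespace prefix, with namespace 0 for the main namespace and 4 for the project ("Wikipedia:") namespace.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NamespaceFilterSketch {
    // MediaWiki namespace ids: 0 = main (articles), 4 = project ("Wikipedia:").
    static final int NS_MAIN = 0;
    static final int NS_PROJECT = 4;

    /** Hypothetical stand-in for one row of the pagelinks table. */
    static final class LinkRow {
        final int from;        // id of the linking page
        final int namespace;   // namespace of the link target
        final String title;    // target title, stored without namespace prefix

        LinkRow(int from, int namespace, String title) {
            this.from = from;
            this.namespace = namespace;
            this.title = title;
        }
    }

    /**
     * Resolves link rows to {fromId, toId} pairs using a map that only
     * contains main-namespace titles. When filterByNamespace is false,
     * this mimics the behavior reported in this issue.
     */
    static List<int[]> resolve(List<LinkRow> rows,
                               Map<String, Integer> mainTitleToId,
                               boolean filterByNamespace) {
        List<int[]> edges = new ArrayList<>();
        for (LinkRow row : rows) {
            if (filterByNamespace && row.namespace != NS_MAIN) {
                continue; // target is not an article, so do not look it up
            }
            Integer targetId = mainTitleToId.get(row.title);
            if (targetId != null) {
                edges.add(new int[] {row.from, targetId});
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        Map<String, Integer> ids = new HashMap<>();
        ids.put("Stub", 100); // hypothetical id of the article "Stub"

        List<LinkRow> rows = new ArrayList<>();
        rows.add(new LinkRow(1, NS_PROJECT, "Stub")); // links to "Wikipedia:Stub"
        rows.add(new LinkRow(2, NS_MAIN, "Stub"));    // links to the article "Stub"

        System.out.println(resolve(rows, ids, false).size()); // prints 2 (one link is wrong)
        System.out.println(resolve(rows, ids, true).size());  // prints 1
    }
}
```

Without the filter, page 1 is recorded as linking to the article "Stub" even though its link points to "Wikipedia:Stub"; with the filter, only the genuine main-namespace link remains.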
Reported by astronautguo on 2012-09-20 16:24:42
Originally reported on Google Code with ID 103