Validation error Silk-singlemachine #56

Open
jplu opened this issue Mar 7, 2016 · 1 comment

jplu commented Mar 7, 2016

Hello,

When I try to run my configuration with release 2.7.1 like this:

java -DconfigFile=ffiec_lei.xml -jar silk.jar

I get the following error:

Silk Link Discovery Framework - Version 2.7.1
Exception in thread "main" org.silkframework.runtime.serialization.ValidationException: Link specification  contains an output that does not reference a predefined output by id
    at org.silkframework.config.LinkSpecification$LinkSpecificationFormat$$anonfun$read$1.apply(LinkSpecification.scala:124)
    at org.silkframework.config.LinkSpecification$LinkSpecificationFormat$$anonfun$read$1.apply(LinkSpecification.scala:121)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
    at scala.collection.Iterator$class.foreach(Iterator.scala:742)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at org.silkframework.config.LinkSpecification$LinkSpecificationFormat$.read(LinkSpecification.scala:121)
    at org.silkframework.config.LinkSpecification$LinkSpecificationFormat$.read(LinkSpecification.scala:94)
    at org.silkframework.runtime.serialization.Serialization$.fromXml(Serialization.scala:19)
    at org.silkframework.config.LinkingConfig$LinkingConfigFormat$$anonfun$6.apply(LinkingConfig.scala:124)
    at org.silkframework.config.LinkingConfig$LinkingConfigFormat$$anonfun$6.apply(LinkingConfig.scala:124)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
    at scala.collection.Iterator$class.foreach(Iterator.scala:742)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at org.silkframework.config.LinkingConfig$LinkingConfigFormat$.read(LinkingConfig.scala:124)
    at org.silkframework.config.LinkingConfig$LinkingConfigFormat$.read(LinkingConfig.scala:105)
    at org.silkframework.runtime.serialization.Serialization$.fromXml(Serialization.scala:19)
    at org.silkframework.Silk$.executeFile(Silk.scala:105)
    at org.silkframework.Silk$.execute(Silk.scala:92)
    at org.silkframework.Silk$$anonfun$1.apply$mcV$sp(Silk.scala:179)
    at org.silkframework.util.CollectLogs$.apply(CollectLogs.scala:29)
    at org.silkframework.Silk$.main(Silk.scala:178)
    at org.silkframework.Silk.main(Silk.scala)

Here is my config file:

<Silk>
<Prefixes>
<Prefix id="rdf" namespace="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
<Prefix id="rdfs" namespace="http://www.w3.org/2000/01/rdf-schema#"/>
<Prefix id="owl" namespace="http://www.w3.org/2002/07/owl#"/>
</Prefixes>
<DataSources>
<Dataset id="ffiec" type="csv">
<Param name="arraySeparator" value=""/>
<Param name="separator" value=","/>
<Param name="prefix" value=""/>
<Param name="uri" value=""/>
<Param name="quote" value=""/>
<Param name="properties" value="IDRSSD,FDIC+Certificate+Number,OCC+Charter+Number,OTS+Docket+Number,Primary+ABA+Routing+Number,Financial+Institution+Name,Financial+Institution+Name+Cleaned,Financial+Institution+Address,Financial+Institution+City,Financial+Institution+State,Financial+Institution+Zip+Code,Financial+Institution+Zip+Code+5,Financial+Institution+Filing+Type,Last+Date%2FTime+Submission+Updated+On"/>
<Param name="regexFilter" value=""/>
<Param name="charset" value="UTF-8"/>
<Param name="file" value="FFIEC.csv"/>
<Param name="linesToSkip" value="0"/>
</Dataset>
<Dataset id="lei" type="csv">
<Param name="arraySeparator" value=""/>
<Param name="separator" value=","/>
<Param name="prefix" value=""/>
<Param name="uri" value=""/>
<Param name="quote" value=""/>
<Param name="properties" value="LOU,LOU_ID,LEI,LegalName,LegalNameCleaned,AssociatedLEI,AssociatedEntityName,LegalAddress_Line_Cleaned,LegalAddress_Line_Combined,LegalAddress_Line1,LegalAddress_Line2,LegalAddress_Line3,LegalAddress_Line4,LegalAddress_City,LegalAddress_Region_2,LegalAddress_Region,LegalAddress_Country,LegalAddress_PostalCode,LegalAddress_PostalCode_Numbers,LegalAddress_PostalCode_5,HeadquartersAddress_Line1,HeadquartersAddress_Line2,HeadquartersAddress_Line3,HeadquartersAddress_Line4,HeadquartersAddress_City,HeadquartersAddress_Region,HeadquartersAddress_Country,HeadquartersAddress_PostalCode,Register,BusinessRegisterEntityID,EntityStatus,InitialRegistrationDate,RegistrationStatus,LastUpdateDate,EntityExpirationDate,EntityExpirationReason,NextRenewalDate,SuccessorLEI,LegalForm"/>
<Param name="regexFilter" value=""/>
<Param name="charset" value="ISO-8859-1"/>
<Param name="file" value="LEI.csv"/>
<Param name="linesToSkip" value="0"/>
</Dataset>
</DataSources>
<Interlinks>
<Interlink id="linking_task">
<SourceDataset dataSource="ffiec" var="a" typeUri="">
<RestrictTo></RestrictTo>
</SourceDataset>
<TargetDataset dataSource="lei" var="b" typeUri="">
<RestrictTo></RestrictTo>
</TargetDataset>
<LinkageRule linkType="owl:sameAs">
<Aggregate id="average1" required="false" weight="1" type="average">
<Compare id="levenshteinDistance7" required="false" weight="1" metric="levenshteinDistance" threshold="3.0" indexing="true">
<TransformInput id="lowerCase13" function="lowerCase">
<TransformInput id="trim13" function="trim">
<Input id="sourcePath10" path="/Financial+Institution+Address"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase14" function="lowerCase">
<TransformInput id="trim14" function="trim">
<Input id="targetPath10" path="/LegalAddress_Line1"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance6" required="false" weight="1" metric="levenshteinDistance" threshold="3.0" indexing="true">
<TransformInput id="lowerCase11" function="lowerCase">
<TransformInput id="trim11" function="trim">
<Input id="sourcePath9" path="/Financial+Institution+Name"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase12" function="lowerCase">
<TransformInput id="trim12" function="trim">
<Input id="targetPath9" path="/LegalName"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance8" required="false" weight="1" metric="levenshteinDistance" threshold="3.0" indexing="true">
<TransformInput id="lowerCase15" function="lowerCase">
<TransformInput id="trim15" function="trim">
<Input id="sourcePath11" path="/Financial+Institution+Address"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase16" function="lowerCase">
<TransformInput id="trim16" function="trim">
<Input id="targetPath11" path="/LegalAddress_Line2"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance9" required="false" weight="1" metric="levenshteinDistance" threshold="3.0" indexing="true">
<TransformInput id="lowerCase17" function="lowerCase">
<TransformInput id="trim17" function="trim">
<Input id="sourcePath12" path="/Financial+Institution+Address"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase18" function="lowerCase">
<TransformInput id="trim18" function="trim">
<Input id="targetPath12" path="/HeadquartersAddress_Line2"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance5" required="false" weight="1" metric="levenshteinDistance" threshold="3.0" indexing="true">
<TransformInput id="lowerCase9" function="lowerCase">
<TransformInput id="trim9" function="trim">
<Input id="sourcePath8" path="/Financial+Institution+Name+Cleaned"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase10" function="lowerCase">
<TransformInput id="trim10" function="trim">
<Input id="targetPath8" path="/LegalNameCleaned"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance4" required="false" weight="1" metric="levenshteinDistance" threshold="2.0" indexing="true">
<TransformInput id="lowerCase7" function="lowerCase">
<TransformInput id="trim7" function="trim">
<Input id="sourcePath7" path="/Financial+Institution+City"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase8" function="lowerCase">
<TransformInput id="trim8" function="trim">
<Input id="targetPath7" path="/HeadquartersAddress_City"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance3" required="false" weight="1" metric="levenshteinDistance" threshold="3.0" indexing="true">
<TransformInput id="lowerCase5" function="lowerCase">
<TransformInput id="trim5" function="trim">
<Input id="sourcePath6" path="/Financial+Institution+Address"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase6" function="lowerCase">
<TransformInput id="trim6" function="trim">
<Input id="targetPath6" path="/HeadquartersAddress_Line1"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance2" required="false" weight="1" metric="levenshteinDistance" threshold="0.0" indexing="true">
<TransformInput id="lowerCase3" function="lowerCase">
<TransformInput id="trim3" function="trim">
<Input id="sourcePath5" path="/Financial+Institution+State"/>
</TransformInput>
</TransformInput>
<TransformInput id="stripPrefix2" function="stripPrefix">
<TransformInput id="lowerCase4" function="lowerCase">
<TransformInput id="trim4" function="trim">
<Input id="targetPath5" path="/LegalAddress_Region"/>
</TransformInput>
</TransformInput>
<Param name="prefix" value="us-"/>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance1" required="false" weight="1" metric="levenshteinDistance" threshold="0.0" indexing="true">
<TransformInput id="lowerCase1" function="lowerCase">
<TransformInput id="trim1" function="trim">
<Input id="sourcePath4" path="/Financial+Institution+State"/>
</TransformInput>
</TransformInput>
<TransformInput id="stripPrefix1" function="stripPrefix">
<TransformInput id="lowerCase2" function="lowerCase">
<TransformInput id="trim2" function="trim">
<Input id="targetPath4" path="/HeadquartersAddress_Region"/>
</TransformInput>
</TransformInput>
<Param name="prefix" value="us-"/>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance10" required="false" weight="1" metric="levenshteinDistance" threshold="2.0" indexing="true">
<TransformInput id="lowerCase19" function="lowerCase">
<TransformInput id="lowerCase23" function="lowerCase">
<Input id="sourcePath13" path="/Financial+Institution+City"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase20" function="lowerCase">
<TransformInput id="lowerCase24" function="lowerCase">
<Input id="targetPath13" path="/HeadquartersAddress_City"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="levenshteinDistance11" required="false" weight="1" metric="levenshteinDistance" threshold="0.0" indexing="true">
<TransformInput id="lowerCase21" function="lowerCase">
<TransformInput id="trim19" function="trim">
<Input id="sourcePath14" path="/Financial+Institution+State"/>
</TransformInput>
</TransformInput>
<TransformInput id="lowerCase22" function="lowerCase">
<TransformInput id="trim20" function="trim">
<Input id="targetPath14" path="/LegalAddress_Region_2"/>
</TransformInput>
</TransformInput>
<Param name="minChar" value="0"/>
<Param name="maxChar" value="z"/>
</Compare>
<Compare id="equality1" required="false" weight="1" metric="equality" threshold="2.0" indexing="true">
<Input id="sourcePath1" path="/Financial+Institution+Zip+Code+5"/>
<Input id="targetPath1" path="/LegalAddress_PostalCode_5"/>
</Compare>
<Compare id="equality2" required="false" weight="1" metric="equality" threshold="2.0" indexing="true">
<Input id="sourcePath2" path="/Financial+Institution+Zip+Code"/>
<Input id="targetPath2" path="/HeadquartersAddress_PostalCode"/>
</Compare>
<Compare id="equality3" required="false" weight="1" metric="equality" threshold="2.0" indexing="true">
<Input id="sourcePath3" path="/Financial+Institution+Zip+Code"/>
<Input id="targetPath3" path="/LegalAddress_PostalCode"/>
</Compare>
</Aggregate>
<Filter/>
</LinkageRule>
<Outputs>
                <Output type="file" minConfidence="0.57">
                    <Param name="file" value="/Users/jplu/tools/silk-singlemachine-2.7.1/accepted_ffiec_lei.nt" />
                    <Param name="format" value="ntriples" />
                </Output>
                <Output type="file" maxConfidence="0.56" minConfidence="0">
                    <Param name="file" value="/Users/jplu/tools/silk-singlemachine-2.7.1/verify_ffiec_lei.nt" />
                    <Param name="format" value="ntriples" />
                </Output>
            </Outputs>
            </Interlink>
        </Interlinks>
</Silk>

I do not understand what is going wrong.

Thanks.

afeliachi (Contributor) commented

Hi Julien,
If you're still getting this problem, I think you can solve it by adding an "id" attribute to your Output elements, like this:
<Output id="acceptedLinks" type="file" minConfidence="0.57"> ...
Tell me what you get.

Abdel.
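
For reference, here is a sketch of what the Outputs block from the config above might look like with that suggestion applied to both outputs. The id "acceptedLinks" comes from the reply; "verifyLinks" is a hypothetical name chosen here only so that the second output also has an id. Everything else is copied from the original config. The thread does not confirm whether this change alone resolves the ValidationException in 2.7.1, so treat it as something to try rather than a verified fix.

```xml
<Outputs>
    <!-- id taken from the suggestion in the reply -->
    <Output id="acceptedLinks" type="file" minConfidence="0.57">
        <Param name="file" value="/Users/jplu/tools/silk-singlemachine-2.7.1/accepted_ffiec_lei.nt" />
        <Param name="format" value="ntriples" />
    </Output>
    <!-- "verifyLinks" is a hypothetical id, not from the thread -->
    <Output id="verifyLinks" type="file" maxConfidence="0.56" minConfidence="0">
        <Param name="file" value="/Users/jplu/tools/silk-singlemachine-2.7.1/verify_ffiec_lei.nt" />
        <Param name="format" value="ntriples" />
    </Output>
</Outputs>
```

The two ids are kept distinct on the assumption that the validator looks outputs up by id, as the wording of the error message ("does not reference a predefined output by id") suggests.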
