Merge branch 'release-0041'
Matteo Riondato committed Nov 25, 2014
2 parents a98b5c7 + 63951a1 commit ac18f54
Showing 20 changed files with 183 additions and 120 deletions.
3 changes: 3 additions & 0 deletions doc/doc/basics/sampler.md
@@ -29,6 +29,9 @@ The sampler executable can be invoked independently of DeepDive. The following
arguments to the sampler executable are used to specify input files, output
file, and learning and inference parameters:

-q, --quiet
Quiet output

-c <int>, --n_datacopy <int> (Linux only)
Number of data copies

15 changes: 15 additions & 0 deletions doc/doc/changelog/0.04.1-alpha.md
@@ -0,0 +1,15 @@
---
layout: default
---

# Changelog for release 0.04.1-alpha (11/25/2014)

This release focuses mostly on bug fixing and minor new features.

- Improve handling of failures in extractors and inference rules.
- Add support for running tests on GreenPlum.
- Add support for `-q`, `--quiet` in the DimmWitted sampler. This reduces
the verbosity of the output.
- Remove some dead code.
- Fix a small bug in the `spouse_example` test.

1 change: 1 addition & 0 deletions doc/index.md
@@ -96,6 +96,7 @@ Riondato](http://cs.brown.edu/~matteo/).

### Updates & Changelog

- [Changelog for version 0.04.1-alpha](doc/changelog/0.04.1-alpha.html) (11/25/2014)
- [Changelog for version 0.04-alpha](doc/changelog/0.04-alpha.html) (11/19/2014)
- [Changelog for version 0.03.2-alpha](doc/changelog/0.03.2-alpha.html) (09/16/2014)
- [Changelog for version 0.03.1-alpha](doc/changelog/0.03.1-alpha.html) (08/15/2014)
18 changes: 18 additions & 0 deletions src/main/scala/org/deepdive/Context.scala
@@ -14,6 +14,24 @@ import scala.util.Try
/* Describes the context of the DeepDive application */
object Context extends Logging {

// The Akka actor system is initialized when "Context.system" is first accessed.
// TODO: it might not be best to use a lazy val here, since we may want
// to run "DeepDive.run" multiple times, e.g. in tests.
/* Notes @zifei:
The difference between lazy val and val is that a val is evaluated
when it is defined, whereas a lazy val is evaluated when it is first
accessed.
We encountered problems when executing DeepDive.run several
times in integration tests. We have been working around them by
running these tests one by one in separate sbt commands. If we can
fix the underlying issue, we can run all tests together.
I need to investigate the alternatives further. One possibility is
to initialize Context.system explicitly every time DeepDive.run
executes (not sure how); another is to make Context a class rather
than an object. But I am not sure.
*/
lazy val system = ActorSystem("deepdive")
var outputDir = "out"
// This needs to be variable since we might reassign it in relearnFrom feature
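The note above turns on when a `val` and a `lazy val` are evaluated. A minimal sketch of that difference follows; `LazyValDemo`, `Holder`, and the field names are hypothetical and are not part of DeepDive. It also shows that a `lazy val` is evaluated only once, which is presumably why a lazily created actor system held in an `object` cannot simply be re-created for a second `DeepDive.run`.

```scala
object LazyValDemo {
  // Records the order in which the initializers run.
  val events = scala.collection.mutable.ArrayBuffer.empty[String]

  class Holder {
    // Evaluated as soon as the Holder is constructed.
    val eager: String = { events += "eager"; "e" }
    // Evaluated on first access only, then cached.
    lazy val deferred: String = { events += "deferred"; "d" }
  }

  def main(args: Array[String]): Unit = {
    val h = new Holder            // runs the `eager` initializer only
    events += "constructed"
    val first = h.deferred        // first access runs the `deferred` initializer
    val second = h.deferred       // second access reuses the cached value
    println(events.mkString(", "))
    println(first == second)
  }
}
```

Running `main` records `eager`, then `constructed`, then `deferred` exactly once.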
23 changes: 22 additions & 1 deletion src/main/scala/org/deepdive/DeepDive.scala
@@ -21,9 +21,12 @@ object DeepDive extends Logging {

def run(config: Config, outputDir: String) {

// Get the actor system
// Initialize and get the actor system
val system = Context.system

// Catch all exceptions and terminate Akka
try {

// Load Settings
val settings = Settings.loadFromConfig(config)
// If relearn_from specified, set output dir to that dir and skip everything
@@ -170,6 +173,24 @@ object DeepDive extends Logging {

// Clean up resources
Context.shutdown()

// end try
} catch {
/* Notes @zifei:
This fix does not guarantee that all non-termination errors are
resolved, since we have multiple Akka actors
(InferenceManager, ExtractionManager, etc.), and simply catching
errors in the DeepDive class may not handle all cases.
But this try-catch does fix some errors, e.g. an invalid configuration
file (mandatory fields are not present) (in: Settings.loadFromConfig).
Tested in BrokenTest.scala
*/
case e: Exception =>
// In case of any exception
Context.shutdown()
throw e
}
}

}
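The change to `DeepDive.run` calls `Context.shutdown()` on the success path and again in the `catch` branch before rethrowing. A hedged sketch of a closely related variant is below: it uses `try`/`finally` rather than the commit's `try`/`catch`, so cleanup also runs for `Throwable`s that are not `Exception`s. `FakeContext` and `runPipeline` are hypothetical stand-ins for illustration, not DeepDive APIs.

```scala
object ShutdownOnFailure {
  // Hypothetical stand-in for Context: counts how often shutdown was called.
  final class FakeContext {
    var shutdownCalls = 0
    def shutdown(): Unit = shutdownCalls += 1
  }

  // Runs `body` and shuts the context down exactly once,
  // whether `body` returns normally or throws.
  def runPipeline[A](ctx: FakeContext)(body: => A): A =
    try body
    finally ctx.shutdown()

  def main(args: Array[String]): Unit = {
    val ctx = new FakeContext
    val result = scala.util.Try {
      runPipeline(ctx)(throw new IllegalArgumentException("mandatory field missing"))
    }
    println(s"failed=${result.isFailure}, shutdownCalls=${ctx.shutdownCalls}")
  }
}
```

The exception still propagates to the caller, as in the commit; only the placement of the cleanup differs.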
@@ -8,6 +8,7 @@ import ProcessExecutor._
import scala.collection.mutable.ArrayBuffer
import scala.concurrent._
import scala.concurrent.duration._
import scala.language.postfixOps
import scala.io.Source
import scala.sys.process._

15 changes: 11 additions & 4 deletions src/main/scala/org/deepdive/inference/InferenceManager.scala
@@ -59,10 +59,17 @@ trait InferenceManager extends Actor with ActorLogging {
case InferenceManager.GroundFactorGraph(factorDescs, calibrationSettings,
skipLearning, weightTable, parallelGrounding) =>
val _sender = sender
inferenceDataStore.asInstanceOf[SQLInferenceDataStore]
.groundFactorGraph(variableSchema, factorDescs, calibrationSettings,
skipLearning, weightTable, dbSettings, parallelGrounding)
sender ! "OK"
try {
inferenceDataStore.asInstanceOf[SQLInferenceDataStore]
.groundFactorGraph(variableSchema, factorDescs, calibrationSettings,
skipLearning, weightTable, dbSettings, parallelGrounding)
sender ! "OK"
} catch {
// If some exception is thrown, terminate DeepDive
case e: Throwable =>
sender ! Status.Failure(e)
context.stop(self)
}
// factorGraphBuilder ? FactorGraphBuilder.AddFactorsAndVariables(
// factorDesc, holdoutFraction, batchSize) pipeTo _sender
case InferenceManager.RunInference(factorDescs, holdoutFraction, holdoutQuery,
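The control flow added to the `GroundFactorGraph` handler (reply `"OK"` on success, reply with a failure carrying the exception otherwise) can be illustrated without Akka. The sketch below is only a model of that flow: `Reply`, `Ok`, `Failed`, and `replyFor` are hypothetical names, and it catches `NonFatal` errors rather than every `Throwable` as the commit does, which is a deliberate substitution here.

```scala
import scala.util.control.NonFatal

object ReplyOnFailure {
  // Hypothetical reply type standing in for "OK" and Status.Failure(e).
  sealed trait Reply
  case object Ok extends Reply
  final case class Failed(cause: Throwable) extends Reply

  // Runs `work` and turns a thrown, non-fatal exception into a Failed reply.
  def replyFor(work: => Unit): Reply =
    try { work; Ok }
    catch { case NonFatal(e) => Failed(e) }

  def main(args: Array[String]): Unit = {
    println(replyFor(()))
    println(replyFor(throw new RuntimeException("grounding failed")))
  }
}
```

In the actor itself the reply would be sent to the requester and the actor stopped; the sketch covers only the success/failure mapping.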
47 changes: 28 additions & 19 deletions src/main/scala/org/deepdive/inference/Sampler.scala
@@ -25,26 +25,35 @@ class Sampler extends Actor with ActorLogging {
val cmd = buildSamplerCmd(samplerCmd, samplerOptions, weightsFile, variablesFile,
factorsFile, edgesFile, metaFile, outputDir, parallelGrounding)
log.info(s"Executing: ${cmd.mkString(" ")}")
// We run the process, get its exit value, and print its output to the log file
val exitValue = cmd!(ProcessLogger(
out => log.info(out),
err => System.err.println(err)
))
// Depending on the exit value we return success or kill the program
exitValue match {
case 0 => sender ! Success()
case _ => {
import scala.sys.process._
import java.lang.management
import sun.management.VMManagement;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
var pid = ManagementFactory.getRuntimeMXBean().getName().toString
val pattern = """\d+""".r
pattern.findAllIn(pid).foreach(id => s"kill -9 ${id}".!)

// Handle the case where cmd! throws an exception rather than returning a value
try {
// We run the process, get its exit value, and print its output to the log file
val exitValue = cmd!(ProcessLogger(
out => log.info(out),
err => System.err.println(err)
))
// Depending on the exit value we return success or kill the program
exitValue match {
case 0 => sender ! Success()
case _ => {
import scala.sys.process._
import java.lang.management
import sun.management.VMManagement;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
var pid = ManagementFactory.getRuntimeMXBean().getName().toString
val pattern = """\d+""".r
pattern.findAllIn(pid).foreach(id => s"kill -9 ${id}".!)
}
}
} catch {
// If some exception is thrown, terminate DeepDive
case e: Throwable =>
sender ! Status.Failure(e)
context.stop(self)
}

}
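On a non-zero exit value, the sampler code above derives the JVM's process id from `ManagementFactory.getRuntimeMXBean().getName()` and runs `kill -9` for every match of `\d+`. On common JVMs that name has the form `pid@hostname`, although the format is not guaranteed, and `findAllIn` returns every run of digits, so a hostname such as `node07` would contribute a second "pid". A minimal sketch of a stricter parse follows; `JvmPid`, `parsePid`, and `currentPid` are hypothetical helpers, not part of DeepDive.

```scala
import java.lang.management.ManagementFactory
import scala.util.Try

object JvmPid {
  // Accepts only the digits before '@', e.g. "12345@node07" yields Some(12345).
  // Returns None when the prefix is empty or not purely numeric.
  def parsePid(runtimeName: String): Option[Long] = {
    val head = runtimeName.takeWhile(_ != '@')
    if (head.nonEmpty && head.forall(_.isDigit)) Try(head.toLong).toOption
    else None
  }

  // Assumes the conventional "pid@hostname" format; yields None otherwise.
  def currentPid(): Option[Long] =
    parsePid(ManagementFactory.getRuntimeMXBean.getName)

  def main(args: Array[String]): Unit =
    println(currentPid())
}
```

On Java 9 and later, `ProcessHandle.current().pid()` avoids parsing the name altogether.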
@@ -106,7 +106,6 @@ trait SQLInferenceDataStore extends InferenceDataStore with Logging {
case "EqualFactorFunction" => 3
case "IsTrueFactorFunction" => 4
case "MultinomialFactorFunction" => 5
case "ContinuousLRFactorFunction" => 20
}
}

@@ -678,80 +677,7 @@ trait SQLInferenceDataStore extends InferenceDataStore with Logging {
}
val weightDesc = generateWeightDesc(factorDesc.weightPrefix, factorDesc.weight.variables)

if (factorDesc.func.getClass.getSimpleName == "ContinuousLRFactorFunction"){

log.info("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~")
if (isFixed || weightlist == ""){
log.error("#########################################")
log.error("DO NOT SUPPORT FIXED ARRAY WEIGHT FOR NOW")
} else { // not fixed and has weight variables
execute(s"""DROP TABLE IF EXISTS ${weighttableForThisFactor} CASCADE;
CREATE TABLE ${weighttableForThisFactor} AS
SELECT ${weightlist}, ${cast(0, "bigint")} AS id, ${cast(isFixed, "int")} AS isfixed, ${initvalue} + 0.0 AS initvalue
FROM ${querytable}
GROUP BY ${weightlist};""")

// handle weight id
if (usingGreenplum) {
executeQuery(s"""SELECT fast_seqassign('${weighttableForThisFactor}', ${cweightid});""")
} else {
execute(s"UPDATE ${weighttableForThisFactor} SET id = ${nextVal(weightidSequence)};")
}

var min_weight_id = 0L
var max_weight_id = 0L

issueQuery(s"""SELECT MIN(id) FROM ${weighttableForThisFactor};""") { rs =>
min_weight_id = rs.getLong(1)
}

issueQuery(s"""SELECT MAX(id) FROM ${weighttableForThisFactor};""") { rs =>
max_weight_id = rs.getLong(1)
}

execute(s"""UPDATE ${weighttableForThisFactor} SET
id = ${min_weight_id} + (id - ${min_weight_id})*4096
;""")

issueQuery(s"""SELECT COUNT(*) FROM ${weighttableForThisFactor};""") { rs =>
cweightid += rs.getLong(1) * 4096
}

execute(s"""
DROP TABLE IF EXISTS ${weighttableForThisFactor}_other CASCADE;
CREATE TABLE ${weighttableForThisFactor}_other (addid int);
""")

var one_2_4096 = (1 to (4096-1)).map(v => s""" (${v}) """).mkString(", ")

execute(s"""
INSERT INTO ${weighttableForThisFactor}_other VALUES ${one_2_4096};
""")

execute(s"""
INSERT INTO ${weighttableForThisFactor}
SELECT t0.feature, t0.id+t1.addid, t0.isfixed, NULL
FROM ${weighttableForThisFactor} t0, ${weighttableForThisFactor}_other t1;
""")

execute(s"""INSERT INTO ${WeightsTable}(id, isfixed, initvalue) SELECT id, isfixed, initvalue FROM ${weighttableForThisFactor};""")

// check null weight
val weightChecklist = factorDesc.weight.variables.map(v => s""" ${quoteColumn(v)} IS NULL """).mkString("AND")
issueQuery(s"SELECT COUNT(*) FROM ${querytable} WHERE ${weightChecklist}") { rs =>
if (rs.getLong(1) > 0) {
throw new RuntimeException("Weight variable has null values")
}
}

// dump factors
val weightjoinlist = factorDesc.weight.variables.map(v => s""" t0.${quoteColumn(v)} = t1.${quoteColumn(v)} """).mkString("AND")
du.unload(s"${outfile}", s"${groundingPath}/${outfile}", dbSettings, parallelGrounding,
s"""SELECT t0.id AS factor_id, t1.id AS weight_id, ${idcols}
FROM ${querytable} t0, ${weighttableForThisFactor} t1
WHERE ${weightjoinlist} AND t1.initvalue IS NOT NULL;""")
}
} else if (factorDesc.func.getClass.getSimpleName != "MultinomialFactorFunction") {
if (factorDesc.func.getClass.getSimpleName != "MultinomialFactorFunction") {

// branch if weight variables present
val hasWeightVariables = !(isFixed || weightlist == "")
@@ -785,7 +711,6 @@ trait SQLInferenceDataStore extends InferenceDataStore with Logging {
SELECT id, isfixed, initvalue, ${weightDesc} FROM ${weighttableForThisFactor};""")

// check null weight (only if there are weight variables)
// TODO BUG here:
if (hasWeightVariables) {
val weightChecklist = factorDesc.weight.variables.map(v => s""" ${quoteColumn(v)} IS NULL """).mkString("AND")
issueQuery(s"SELECT COUNT(*) FROM ${querytable} WHERE ${weightChecklist}") { rs =>
@@ -31,10 +31,6 @@ case class EqualFactorFunction(variables: Seq[FactorFunctionVariable]) extends F
override def variableDataType = "Boolean"
}

case class ContinuousLRFactorFunction(variables: Seq[FactorFunctionVariable]) extends FactorFunction {
override def variableDataType = "Boolean"
}

/* A factor function describing A == True. Restricted to one variable. */
case class IsTrueFactorFunction(variables: Seq[FactorFunctionVariable]) extends FactorFunction {
override def variableDataType = "Boolean"
@@ -50,11 +46,6 @@ case class MultinomialFactorFunction(variables: Seq[FactorFunctionVariable]) ext
override def variableDataType = "Discrete"
}

/* Dummy factor function */
case class DummyFactorFunction(val variables: Seq[FactorFunctionVariable]) extends FactorFunction {
override def variableDataType = "Discrete"
}

/* A variable used in a Factor function */
case class FactorFunctionVariable(relation: String, field: String, isArray: Boolean = false,
isNegated: Boolean = false, predicate: Option[Long] = None) {
@@ -8,7 +8,6 @@ object FactorFunctionParser extends RegexParsers with Logging {
def relationOrField = """[\w]+""".r
def arrayDefinition = """\[\]""".r
def equalPredicate = """[0-9]+""".r
// def factorFunctionName = "Imply" | "Or" | "And" | "Equal" | "IsTrue"

def implyFactorFunction = ("Imply" | "IMPLY") ~> "(" ~> rep1sep(factorVariable, ",") <~ ")" ^^ { varList =>
ImplyFactorFunction(varList)
@@ -31,11 +30,6 @@ object FactorFunctionParser extends RegexParsers with Logging {
EqualFactorFunction(List(v1, v2))
}

def continuousLRFactorFunction = ("ContinuousLR" | "CONTINUOUSLR") ~> "(" ~> factorVariable ~ ("," ~> factorVariable) <~ ")" ^^ {
case v1 ~ v2 =>
ContinuousLRFactorFunction(List(v1, v2))
}

def isTrueFactorFunction = ("IsTrue" | "ISTRUE") ~> "(" ~> factorVariable <~ ")" ^^ { variable =>
IsTrueFactorFunction(List(variable))
}
@@ -60,6 +54,6 @@ object FactorFunctionParser extends RegexParsers with Logging {
}

def factorFunc = implyFactorFunction | orFactorFunction | andFactorFunction |
equalFactorFunction | isTrueFactorFunction | xorFactorFunction | multinomialFactorFunction | continuousLRFactorFunction
equalFactorFunction | isTrueFactorFunction | xorFactorFunction | multinomialFactorFunction

}
5 changes: 4 additions & 1 deletion src/test/resources/application.conf
@@ -15,6 +15,9 @@ deepdive {
password: ${PGPASSWORD}
}

sampler.sampler_args: "-l 300 -i 500 -s 1 --alpha 0.5"
# Use quiet output for tests. Using 0.5 as the learning rate for all tests
# gives more stable results than the default (0.1). We may need to find
# a better default setting.
sampler.sampler_args: "-l 300 -i 500 -s 1 --alpha 0.5 --quiet"

}
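The test configuration appends `--quiet` to `sampler.sampler_args` by hand. A small sketch of doing the same programmatically is shown below, assuming the argument string is whitespace-separated; `SamplerArgs`, `tokens`, and `withQuiet` are hypothetical helpers, and the only flags relied on are `-q` and `--quiet` from the sampler documentation in this commit.

```scala
object SamplerArgs {
  // The two spellings of the quiet flag documented for the sampler.
  private val quietFlags = Set("-q", "--quiet")

  // Splits an argument string on whitespace, dropping empty tokens.
  def tokens(args: String): List[String] =
    args.trim.split("\\s+").toList.filter(_.nonEmpty)

  // Appends "--quiet" unless one of the quiet flags is already present.
  def withQuiet(args: String): String = {
    val ts = tokens(args)
    if (ts.exists(quietFlags)) ts.mkString(" ")
    else (ts :+ "--quiet").mkString(" ")
  }

  def main(args: Array[String]): Unit =
    println(withQuiet("-l 300 -i 500 -s 1 --alpha 0.5"))
}
```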