Merge pull request #1 from Georgetown-University-Libraries/master
Synch changes back to terrywbrady
terrywbrady committed Dec 5, 2013
2 parents 0829bdc + b535948 commit 67b73d3
Showing 134 changed files with 8,178 additions and 1,798 deletions.
63 changes: 59 additions & 4 deletions README
@@ -1,8 +1,50 @@
PURPOSE
The File Analyzer and Metadata Harvester is designed to automate simple, file-based operations performed by cultural heritage institutions.
- Example: count files by type
- Example: test file naming conventions

The application assembles a toolkit of tasks a user can perform
- Present tasks in a simple User Interface
- Reduce learning curve when deploying a new task
- New tasks are easy for a user to locate

The modular code makes it easy to deploy new code and to share code.

APPLICATION DESCRIPTION
- Desktop Application written in Java
- Scans a file system and performs an action on files
- Imports a file, and performs an action on records
- Presents summary results
- Results Merging
- Customizable
- Easy to add additional rules to the application to implement local business rules
- Same UI for all tasks

OVERVIEW PRESENTATIONS
FILE ANALYZER WIKI
- https://github.com/Georgetown-University-Libraries/File-Analyzer/wiki
WRLC (Washington Research Library Consortium) Forum Presentation - Sept 2013
- https://docs.google.com/presentation/d/1hqbusyuNcsphFeAmsNuI3120c4iQqBycB9n7RQ6txdA/pub?start=false&loop=false&delayms=3000
OpenRepositories 2013 Presentation
- http://or2013.net/sessions/focus-your-content-not-ingesting-your-content
- http://or2013.net/sites/or2013.net/files/slides/OR2013%20Focus%20on%20Your%20Content.ppt
Code4Lib Presentation
- https://docs.google.com/presentation/d/1Sq2gqGm58DeuzSTigEofGGci_szqo5rwctgqER2HOKo/pub?start=false&loop=false&delayms=3000
- https://archive.org/details/Code4LibLightningTalksApr2013TerryBrady
DSpace Tools Overview
- http://prezi.com/wridwe2h0d4a/automating-dspace-folder-creation/?kw=view-wridwe2h0d4a&rc=ref-16497460
- http://prezi.com/cigoderbzi2g/dspace-ingest-via-file-analyzer/?kw=view-cigoderbzi2g&rc=ref-16497460
Digitization Tools
- https://docs.google.com/document/d/1KopsLHZCN2dxmZoWAV4QUpFETBRnpOID8lpq65I6Ne4/pub

==================================================================================================

This code has been derived from the NARA File Analyzer and Metadata Harvester which is available at
https://github.com/usnationalarchives/File-Analyzer.

PREREQUISITES
-- JDK 1.6 or higher
+- JDK 1.6 or higher (for build)
+- JRE 1.6 or higher (for runtime)
- Maven (or you will need to compile the modules manually)

INSTALLATION
@@ -19,11 +61,24 @@ This code will build 3 flavors of the File Analyzer.
(3) Demo File Analyzer
This version contains extensions illustrating various capabilities of the File Analyzer.
This version of the file analyzer is a self-extracting jar file that references both the core and dspace file analyzer jar files.
Sample data will also be added.
This version of the application uses features of Apache Tika and BagIt

=================================================================================================
License information is contained below.
-------------------------------------------------------------------------

Copyright (c) 2013, Georgetown University Libraries
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

-The original license information is contained below.
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-----------------------------
=================================================================================================
The NARA license for the base code is contained below.
-------------------------------------------------------------------------

NARA OPEN SOURCE AGREEMENT VERSION 1.3,
Based on NASA Open Source Agreement for Government Agencies, as approved by the Open Source Initiative.
8 changes: 5 additions & 3 deletions core/src/main/gov/nara/nwts/ftapp/FTDriver.java
@@ -2,7 +2,7 @@

import gov.nara.nwts.ftapp.filetest.FileTest;
import gov.nara.nwts.ftapp.filter.FileTestFilter;
-import gov.nara.nwts.ftapp.importer.DelimitedFileImporter;
+import gov.nara.nwts.ftapp.importer.DelimitedFileReader;
import gov.nara.nwts.ftapp.importer.Importer;
import gov.nara.nwts.ftapp.stats.Stats;
import gov.nara.nwts.ftapp.stats.StatsItemConfig;
@@ -68,7 +68,9 @@ public Preferences getPreferences() {
public boolean hasPreferences() {
return getPreferences()!=null;
}


public void setPreference(String path, String value){
}
protected Vector<Vector<String>>batchItems;

public static void dumpNode(Node n) {
@@ -264,7 +266,7 @@ public void initiateFileTest(File input, File output) {
fileTraversal.traverseFile();
}
public void loadBatch(File f) throws IOException {
-batchItems = DelimitedFileImporter.parseFile(f, "\t", false);
+batchItems = DelimitedFileReader.parseFile(f, "\t", false);
batchLoaded();
}
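
The `loadBatch` change above reads a tab-delimited file into a `Vector<Vector<String>>`. The real `DelimitedFileReader.parseFile` is not shown in this diff, so the following is only a minimal sketch of that kind of parsing. The class `DelimitedSketch` and the method `parseLines` are hypothetical names, the sketch works on in-memory lines rather than a `File`, and it does not try to reproduce the meaning of the third `parseFile` argument.

```java
import java.util.Arrays;
import java.util.Vector;
import java.util.regex.Pattern;

// Hypothetical illustration; not the DelimitedFileReader class from the commit.
class DelimitedSketch {
    // Splits each line on a literal separator and returns one Vector per line.
    static Vector<Vector<String>> parseLines(Iterable<String> lines, String sep) {
        // Quote the separator so characters such as "|" are not treated as regex syntax.
        Pattern p = Pattern.compile(Pattern.quote(sep));
        Vector<Vector<String>> rows = new Vector<Vector<String>>();
        for (String line : lines) {
            Vector<String> cols = new Vector<String>();
            // A limit of -1 keeps trailing empty fields.
            for (String field : p.split(line, -1)) {
                cols.add(field);
            }
            rows.add(cols);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(parseLines(Arrays.asList("a\tb\t", "c"), "\t"));
    }
}
```

Keeping trailing empty fields is a design choice made here so that every row of a well-formed batch file has the same number of columns; whether the committed reader does the same is not visible in this diff.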

115 changes: 115 additions & 0 deletions core/src/main/gov/nara/nwts/ftapp/counter/Cell.java
@@ -0,0 +1,115 @@
package gov.nara.nwts.ftapp.counter;

import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Cell {
    int row = 0;
    int col = 0;
    boolean valid = false;

    static Pattern pCell = Pattern.compile("^([A-Z]+)(\\d+)$");

    Cell(int row, int col) {
        this.row = row;
        this.col = col;
        valid = true;
    }

    Cell(String cellname) {
        Matcher m = pCell.matcher(cellname);
        if (!m.matches()) return;
        String colstr=m.group(1);
        int row = Integer.parseInt(m.group(2)) - 1;
        int col = -1; //offset -1 for vector ref
        for(int i=0;i<colstr.length();i++) {
            int ii = colstr.length() - i - 1;
            int x = (colstr.charAt(ii) - 'A') + 1;
            col += Math.pow(26,i)*x;
        }
        this.row = row;
        this.col = col;
        valid = true;
    }

    static Cell at(String cellname) {
        return new Cell(cellname);
    }

    public String getCellname() {
        if (!valid) return "Invalid Cell Name";
        String rowstr = ""+(row+1);
        StringBuffer colstr = new StringBuffer();

        int val = col; //A=0 for last digit
        int rem = val % 26;

        colstr.append((char)('A'+rem));

        for(val = val / 26; val > 0; val = val / 26) {
            val--; //A=1 if not last digit
            rem = val % 26;
            colstr.append((char)('A'+rem));
        }
        return colstr.reverse().toString() + rowstr;
    }

    public static final NumberFormat nf = NumberFormat.getIntegerInstance();
    static {
        nf.setMinimumIntegerDigits(6);
        nf.setGroupingUsed(false);
    }

    public String getCellSort() {
        return nf.format(row) + "," + nf.format(col);
    }

    public static List<Cell> makeRange(int srow, int scol, int endrow, int endcol) {
        ArrayList<Cell> cells = new ArrayList<Cell>();
        for(int r=srow; r<=endrow; r++) {
            for(int c=scol; c<=endcol; c++) {
                cells.add(new Cell(r, c));
            }
        }
        return cells;
    }

    public static void test(String s) {
        Cell c = new Cell(s);
        System.out.println(s+" "+c.getCellname()+" "+c.row+","+c.col);
    }

    public static void main(String[] argv) {
        test("A1");
        test("A2");
        test("B1");
        test("B10");
        test("Z1");
        test("Z99");
        test("AA1");
        test("AA1000");
        test("AZ1");
        test("AZ2");
        test("BA1");
        test("BA2");
        test("ZA1");
        test("ZA2");
        test("ZZ1");
        test("ZZ2");
        test("AAA1");
        test("AAA2");
        test("YYY1");
        test("YYY2");
        test("YZZ1");
        test("YZZ2");
        test("ZYZ1");
        test("ZYZ2");
        test("ZZZ1");
        test("ZZZ2");
        test("AAAA1");
        test("AAAA2");
    }
}
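
The `Cell(String)` constructor and `getCellname()` above convert between spreadsheet-style column letters and zero-based column indexes, which is a bijective base-26 encoding (there is no zero digit, so "Z" is followed by "AA"). The following is a minimal sketch of the same conversion using integer arithmetic only, in place of `Math.pow`. The class `CellNameSketch` and the methods `columnIndex` and `columnName` are hypothetical names and are not part of the commit.

```java
// Hypothetical illustration; not part of the commit.
class CellNameSketch {
    // "A" -> 0, "Z" -> 25, "AA" -> 26: read as bijective base 26, then shift to zero-based.
    static int columnIndex(String letters) {
        if (letters == null || letters.isEmpty()) {
            throw new IllegalArgumentException("Empty column name");
        }
        int n = 0;
        for (int i = 0; i < letters.length(); i++) {
            char ch = letters.charAt(i);
            if (ch < 'A' || ch > 'Z') {
                throw new IllegalArgumentException("Not a column name: " + letters);
            }
            n = n * 26 + (ch - 'A' + 1);
        }
        return n - 1;
    }

    // 0 -> "A", 25 -> "Z", 26 -> "AA": the inverse of columnIndex.
    static String columnName(int index) {
        if (index < 0) {
            throw new IllegalArgumentException("Negative column index: " + index);
        }
        StringBuilder sb = new StringBuilder();
        // Work on the one-based value; each step emits the least significant letter.
        for (int n = index + 1; n > 0; n = (n - 1) / 26) {
            sb.append((char) ('A' + (n - 1) % 26));
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"A", "Z", "AA", "AZ", "ZZ", "AAA"}) {
            int idx = columnIndex(s);
            System.out.println(s + " -> " + idx + " -> " + columnName(idx));
        }
    }
}
```

Integer arithmetic avoids the implicit `double`-to-`int` narrowing that `col += Math.pow(26,i)*x` relies on, although for realistic column widths both approaches should give the same index.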
35 changes: 35 additions & 0 deletions core/src/main/gov/nara/nwts/ftapp/counter/CellCheck.java
@@ -0,0 +1,35 @@
package gov.nara.nwts.ftapp.counter;

import java.util.ArrayList;
import java.util.List;

public class CellCheck {
    ArrayList<Cell> cells = new ArrayList<Cell>();

    CounterCheck check;
    CellCheck(CounterCheck check) {
        this.check = check;
    }
    CellCheck(CounterCheck check, Cell cell) {
        this.check = check;
        cells.add(cell);
    }
    CellCheck(CounterCheck check, List<Cell> acell) {
        this.check = check;
        for(Cell cell: acell){
            cells.add(cell);
        }
    }
    List<CheckResult> performCheck(CounterData cd) {
        ArrayList<CheckResult> results = new ArrayList<CheckResult>();
        for(Cell cell: cells) {
            CheckResult res = check.performCheck(cd, cell, cd.getCellValue(cell));
            results.add(res);
            if (res.stat.ordinal() >= CounterStat.ERROR.ordinal()) {
                break;
            }
        }
        return results;
    }

}
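
`performCheck` above stops checking further cells once a result reaches `CounterStat.ERROR`, comparing enum ordinals. The `CounterStat` enum is not shown in this part of the diff, so the following sketch uses a hypothetical `Severity` enum and assumes that constants are declared from least to most severe. The class `SeveritySketch` and the method `runUntilError` are also hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the commit.
class SeveritySketch {
    // Assumption: declaration order runs from least severe to most severe.
    enum Severity { VALID, WARNING, INVALID, ERROR }

    // Records outcomes in order and stops after the first one at ERROR or worse.
    static List<Severity> runUntilError(List<Severity> outcomes) {
        List<Severity> results = new ArrayList<Severity>();
        for (Severity s : outcomes) {
            results.add(s);
            // Enum.compareTo compares declaration order, the same ordering ordinal() exposes.
            if (s.compareTo(Severity.ERROR) >= 0) {
                break;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runUntilError(
            Arrays.asList(Severity.VALID, Severity.ERROR, Severity.WARNING)));
    }
}
```

A comparison on declaration order only behaves as intended while the constants stay sorted by severity, so reordering or inserting a constant can silently change which results stop the loop.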
65 changes: 65 additions & 0 deletions core/src/main/gov/nara/nwts/ftapp/counter/CheckResult.java
@@ -0,0 +1,65 @@
package gov.nara.nwts.ftapp.counter;

public class CheckResult {
    CounterRec rec;
    public Cell cell;
    public CounterStat stat;
    public String message = "";
    public String newVal;

    CheckResult(Cell cell, CounterStat stat) {
        this.rec = CounterRec.CELL;
        this.cell = cell;
        this.stat = stat;
    }

    CheckResult(CounterStat stat) {
        this.rec = CounterRec.FILE;
        this.stat = stat;
    }

    CheckResult setMessage(String message) {
        this.message = message;
        return this;
    }

    CheckResult setNewVal(String newVal) {
        this.newVal = newVal;
        return this;
    }

    static CheckResult createFileStatus(CounterStat stat) {
        return new CheckResult(stat);
    }
    static CheckResult createCellStatus(Cell cell, CounterStat stat) {
        return new CheckResult(cell, stat);
    }
    static CheckResult createCellValid(Cell cell) {
        return createCellStatus(cell, CounterStat.VALID);
    }
    static CheckResult createCellWarning(Cell cell, String message) {
        return createCellStatus(cell, CounterStat.WARNING).setMessage(message);
    }
    static CheckResult createCellInvalid(Cell cell, String message) {
        return createCellStatus(cell, CounterStat.INVALID).setMessage(message);
    }
    static CheckResult createCellInvalidCase(Cell cell, String message) {
        return createCellStatus(cell, CounterStat.WARNING_CASE).setMessage(message);
    }
    static CheckResult createCellInvalidPunct(Cell cell, String message) {
        return createCellStatus(cell, CounterStat.WARNING_PUNCT).setMessage(message);
    }
    static CheckResult createCellInvalidTrim(Cell cell, String message) {
        return createCellStatus(cell, CounterStat.WARNING_TRIM).setMessage(message);
    }
    static CheckResult createCellInvalidDate(Cell cell, String message) {
        return createCellStatus(cell, CounterStat.WARNING_DATE).setMessage(message);
    }
    static CheckResult createCellInvalidSum(Cell cell, String message) {
        return createCellStatus(cell, CounterStat.INVALID_SUM).setMessage(message);
    }
    static CheckResult createCellError(Cell cell, String message) {
        return createCellStatus(cell, CounterStat.ERROR).setMessage(message);
    }

}
22 changes: 22 additions & 0 deletions core/src/main/gov/nara/nwts/ftapp/counter/ColSumCounterCheck.java
@@ -0,0 +1,22 @@
package gov.nara.nwts.ftapp.counter;

public class ColSumCounterCheck extends SumCounterCheck {
    String val;
    int sr;
    int er;
    public ColSumCounterCheck(int sr, int er, String message) {
        super(message);
        this.sr = sr;
        this.er = er;
    }

    public int getRangeSum(CounterData cd, Cell cell) {
        int rangesum = 0;
        for(int r=sr; r<=er; r++) {
            rangesum += getIntValue(cd.getCellValue(new Cell(r, cell.col)), 0);
        }
        return rangesum;
    }

}
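
`getRangeSum` above adds up the cells in one column between a start row and an end row, inclusive, and passes a default of `0` to `getIntValue`. Neither `CounterData` nor `getIntValue` is shown in this part of the diff, so the following sketch stands in a plain `List<List<String>>` for the sheet and assumes that the default is used for cells that are missing or not numeric. The names `ColumnSumSketch`, `intValue`, and `columnSum` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the commit.
class ColumnSumSketch {
    // Assumption: a missing or non-numeric cell contributes the default value.
    static int intValue(String s, int def) {
        if (s == null) {
            return def;
        }
        try {
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException e) {
            return def;
        }
    }

    // Sums column col over rows startRow..endRow inclusive, skipping rows outside the sheet.
    static int columnSum(List<List<String>> rows, int col, int startRow, int endRow) {
        int sum = 0;
        for (int r = Math.max(startRow, 0); r <= endRow && r < rows.size(); r++) {
            List<String> row = rows.get(r);
            String cell = (col >= 0 && col < row.size()) ? row.get(col) : null;
            sum += intValue(cell, 0);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("Jan", "10"),
            Arrays.asList("Feb", " 5 "),
            Arrays.asList("Mar", "n/a"),
            Arrays.asList("Total", "15"));
        System.out.println(columnSum(rows, 1, 0, 2));
    }
}
```

A check built on such a sum would then compare it with the value in the total cell and report a mismatch, which appears to be the role of `CounterStat.INVALID_SUM` in `CheckResult`.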

27 changes: 27 additions & 0 deletions core/src/main/gov/nara/nwts/ftapp/counter/CounterCheck.java
@@ -0,0 +1,27 @@
package gov.nara.nwts.ftapp.counter;

public class CounterCheck {
    String message = "Cell does not match specifications";
    CounterStat stat = CounterStat.INVALID;
    boolean allowNull = false;

    public CheckResult performCheck(CounterData cd, Cell cell, String cellval) {
        return CheckResult.createCellValid(cell);
    }

    public CounterCheck setMessage(String message) {
        this.message = message;
        return this;
    }

    public CounterCheck setCounterStat(CounterStat stat) {
        this.stat = stat;
        return this;
    }

    public CounterCheck setAllowNull(boolean b) {
        allowNull = b;
        return this;
    }

}
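
`CounterCheck` above is a base class whose `performCheck` accepts every cell and whose setters return `this` so that calls can be chained. Concrete checks are not shown in this part of the diff, so the following is only a sketch of how a subclass might override the check and be configured fluently. The names `FluentCheckSketch`, `Check`, `NonEmptyCheck`, and `perform` are hypothetical, results are simplified to a message string (`null` meaning valid), and the interpretation of `allowNull` is an assumption.

```java
// Hypothetical illustration; not part of the commit.
class FluentCheckSketch {
    static class Check {
        String message = "Cell does not match specifications";
        boolean allowNull = false;

        // Each setter returns this so that configuration calls can be chained.
        Check setMessage(String message) {
            this.message = message;
            return this;
        }

        Check setAllowNull(boolean b) {
            this.allowNull = b;
            return this;
        }

        // Base behaviour: accept everything. Returns null when valid, else the message.
        String perform(String cellval) {
            return null;
        }
    }

    static class NonEmptyCheck extends Check {
        @Override
        String perform(String cellval) {
            // Assumption: allowNull means a missing cell passes the check.
            if (cellval == null) {
                return allowNull ? null : message;
            }
            return cellval.trim().isEmpty() ? message : null;
        }
    }

    public static void main(String[] args) {
        Check check = new NonEmptyCheck().setMessage("Title is required");
        System.out.println(check.perform("  "));
        System.out.println(check.perform("Annual Report"));
    }
}
```

Because the setters are declared to return the base type, a chained expression is typed as the base class even when it starts from a subclass; that is harmless while callers only invoke the overridden check method.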