-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add some more tests #33
base: rdbms-sink
Are you sure you want to change the base?
Changes from all commits
88c122b
c083941
abd177b
23f6269
2f5b704
d17a6af
efe0d5e
c9f2839
b35ad3b
bede65e
9b2abff
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,14 +6,12 @@ | |
import org.slf4j.LoggerFactory; | ||
|
||
import java.io.*; | ||
import java.nio.ByteBuffer; | ||
import java.nio.CharBuffer; | ||
import java.nio.charset.*; | ||
import java.sql.*; | ||
import java.sql.Connection; | ||
import java.sql.DriverManager; | ||
import java.sql.SQLException; | ||
import java.sql.Statement; | ||
import java.util.Arrays; | ||
import java.util.regex.Matcher; | ||
import java.util.regex.Pattern; | ||
import java.util.stream.Collector; | ||
import java.util.stream.Collectors; | ||
|
||
/** | ||
|
@@ -57,7 +55,7 @@ public void writeBegin() throws IOException { | |
// We use column names node_id, way_id, relation_id instead of just id to facilitate joins. | ||
statement.execute("create table updates (time timestamp, replication_epoch bigint, operation varchar)"); | ||
statement.execute("create table nodes (node_id bigint, lat float(9), lon float(9), tags varchar)"); | ||
statement.execute("create table ways (way_id bigint, tags varchar, nodes bigint array)"); | ||
statement.execute("create table ways (way_id bigint, tags varchar, node_ids bigint array)"); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a good idea - I almost made this change myself a while back. However it does mean we're introducing a schema change in a PR that's ostensibly about adding tests. Ideally this would be done in a separate commit or even a separate PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the main reason I changed this was because I was trying to debug that other update statement and this was confusing me |
||
statement.execute("create table relations (relation_id bigint, tags varchar)"); | ||
statement.execute("create table relation_members (relation_id bigint, type varchar, member_id bigint, role varchar)"); | ||
connection.commit(); | ||
|
@@ -246,7 +244,11 @@ public void writeEnd() throws IOException { | |
// statement.execute("create index on nodes(lat, lon)"); | ||
LOG.info("Assigning representative coordinates to ways..."); | ||
statement.execute("alter table ways add column rep_lat float(9), add column rep_lon float(9)"); | ||
statement.execute("update ways set (rep_lat, rep_lon) = (select lat, lon from nodes where nodes.node_id = nodes[array_length(nodes, 1)/2])"); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this the change that took you a long time to figure out? I still don't know why it was a problem. Can you add a code comment explaining why the subquery is necessary? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, this is the problem statement. I also don't know why it was a problem on travis either, but I'll add a comment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was having a hard time understanding what you meant by "problem statement". I just finally realized you probably mean "the statement that was causing the problem", i.e. "the problematic statement" not "the statement of a problem". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How did you fix the statement if you don't know why it was failing? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just refactored it to use a subquery and then it worked. ¯_(ツ)_/¯ |
||
// use a subquery because a previous statement without a subquery was failing on travis for unknown reasons | ||
statement.execute("update ways " + | ||
"set rep_lat=subq.lat, rep_lon=subq.lon " + | ||
"from (select lat, lon, node_id FROM nodes) as subq " + | ||
"where subq.node_id = ways.node_ids[array_length(ways.node_ids, 1)/2]"); | ||
LOG.info("Indexing representative coordinates of ways..."); | ||
statement.execute("create index on ways(rep_lat, rep_lon)"); | ||
connection.commit(); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
package com.conveyal.osmlib; | ||
|
||
import org.junit.Assert; | ||
import org.junit.Test; | ||
|
||
import java.sql.*; | ||
|
||
import static org.hamcrest.MatcherAssert.assertThat; | ||
import static org.hamcrest.Matchers.is; | ||
|
||
/** | ||
* Tests that verify osm data can be saved and queried from postgres | ||
*/ | ||
public class PostgresTest { | ||
|
||
/** | ||
* Test that this library can load data into postgres and that the count of records matches expectations | ||
*/ | ||
@Test | ||
public void canLoadFromFileIntoDatabase() throws Exception { | ||
String newDBName = TestUtils.generateNewDB(); | ||
if (newDBName == null) { | ||
throw new Exception("failed to generate test db"); | ||
} | ||
try { | ||
String jdbcUrl = "jdbc:postgresql://localhost/" + newDBName; | ||
String[] args = { | ||
"./src/test/resources/bangor_maine.osm.pbf", | ||
jdbcUrl | ||
}; | ||
|
||
// perform file to postgres load | ||
PostgresSink.main(args); | ||
|
||
// verify that data was loaded into postgres | ||
try { | ||
Connection connection = DriverManager.getConnection(jdbcUrl); | ||
TableTestCase[] tables = { | ||
new TableTestCase("nodes", 35747), | ||
new TableTestCase("relation_members", 435), | ||
new TableTestCase("relations", 34), | ||
new TableTestCase("updates", 0), | ||
new TableTestCase("ways", 2976) | ||
}; | ||
|
||
for (TableTestCase table : tables) { | ||
PreparedStatement preparedStatement = connection.prepareStatement("Select count(*) from " + table.tableName); | ||
preparedStatement.execute(); | ||
ResultSet resultSet = preparedStatement.getResultSet(); | ||
resultSet.next(); | ||
int numNodes = resultSet.getInt(1); | ||
assertThat(numNodes, is(table.expectedNumNodes)); | ||
} | ||
} catch (SQLException e) { | ||
e.printStackTrace(); | ||
Assert.fail(); | ||
} | ||
} finally { | ||
TestUtils.dropDB(newDBName); | ||
} | ||
} | ||
} | ||
|
||
/** | ||
* Helper class to iterate through when testing whether the proper amount of records got loaded into a particular table. | ||
*/ | ||
class TableTestCase { | ||
String tableName; | ||
int expectedNumNodes; | ||
|
||
public TableTestCase(String tableName, int expectedNumNodes) { | ||
this.tableName = tableName; | ||
this.expectedNumNodes = expectedNumNodes; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is running arbitrary code off a remote URL. This raises a red flag for me, especially on a repo that contains signing keys. Can we copy the script into our repo so we know what we're running?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The script at the url is pretty large and may change over time. I think this request is a bit excessive because that would also mean that we should examine and put every single maven dependency in our repo because that code could also echo environment variables or send it to a remote server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True enough but Maven artifacts are signed. This is taking arbitrary code from a URL with no verification, and anyone can stick a proxy in between you and the server. It's always debatable how paranoid one needs to be, but the recent trend in piping the contents of URLs into bash is really a security hole.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that it's also unlikely we'd ever know if someone got the keys, which then compromises the signing system of Maven. The security problem is contagious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah... but it's an https URL so there's a certificate for a particular domain involved. Maybe it's not so insecure then...