Revert-store #22

Open
wants to merge 37 commits into
base: revert-store
c5d76cf
update to 2.1.4
jexp Sep 28, 2014
e81fde1
Update readme.md
jexp Sep 28, 2014
a7aed40
fixed batch-caches in BatchInserterImpl filling up endlessly
jexp Oct 8, 2014
4063e2d
another fix for memory buildup in the batch-inserter
jexp Oct 10, 2014
3b85b0d
another fix for memory buildup in the batch-inserter
jexp Oct 10, 2014
e4e8936
update to 2.1.8
jexp Apr 15, 2015
05a138a
update to Neo4j 2.2.1
jexp May 20, 2015
5770424
Update readme.md
jexp May 22, 2015
2ffa872
Clean up StoreCopy output formatting
Aug 24, 2015
8cbada4
Merge pull request #7 from MikeBenza/cleanup-output
jexp Aug 24, 2015
16dc965
Update to Neo4j 2.2.4
jexp Aug 25, 2015
f219585
added try-catch block to the copyNodes method
ahmetkizilay Aug 25, 2015
eaeba10
updated current version info in readme
ahmetkizilay Aug 25, 2015
4f30b8e
Merge pull request #8 from graphcommons/22
jexp Aug 25, 2015
2b8e104
Update to 2.2.6
jexp Oct 29, 2015
0d70807
Update to 2.3.0
jexp Oct 29, 2015
ad00f81
better error handling
jexp Nov 2, 2015
2053340
better error handling
jexp Nov 2, 2015
b6954da
StoreComparer: add missing transaction wrapping.
jotomo Oct 13, 2015
742e642
Merge branch '22' into 23
jotomo Nov 4, 2015
a59398a
Merge pull request #13 from jotomo/23
jexp Nov 4, 2015
54f8359
flushing without writing to the source-store
jexp Nov 24, 2015
a18cfea
Updated docs, shell script and version to 2.3.1
jexp Nov 24, 2015
4ce6967
minimal readme fixes
jexp Nov 24, 2015
0a084b4
Fixed progress in case of errors, fixed memory usage in 2.3
jexp Dec 7, 2015
163f93c
changed flushing to clear caches directly
jexp Dec 7, 2015
3730ae4
fixed zero time
jexp Jan 7, 2016
2d36328
Upgrade to Neo4j 3.0.0-M05
jexp Apr 2, 2016
dd2c905
Upgrade to 3.0.3
jexp Jun 15, 2016
9e9194a
Ignore (only log) exceptions when shutting down source db in StoreCop…
lutovich Oct 25, 2016
ccc4f9e
added warning about required memory
jexp Nov 11, 2016
b8a84f1
Better parameter handling in copy-store.sh script, update Neo4j to 3.0.8
jexp Dec 21, 2016
f304b20
New branch for Neo4j 3.1.0
jexp Dec 21, 2016
58b9fd7
readme update
jexp Dec 21, 2016
285def0
separate size four source page cache
jexp May 22, 2017
7774b2b
fix node id int to long
jexp May 24, 2017
d6417bc
Updated Capabilities of Store Copy
jexp May 27, 2017
2 changes: 2 additions & 0 deletions .gitignore
@@ -4,4 +4,6 @@ target
*.iml
*.ipr
.idea
*.db*
.shell_history

44 changes: 44 additions & 0 deletions copy-store.sh
@@ -0,0 +1,44 @@
#!/bin/bash

EDITION=${1-community}
shift
SRC=$1
DST=$2
SKIP_RELS=$3
SKIP_PROPS=$4
SKIP_LABELS=$5
DELETE_NODES=$6
KEEP_NODE_IDS=$7
HEAP=4G
CACHE=2G
CACHE_SRC=1G
#$CACHE
echo "Usage: copy-store.sh [community|enterprise] source.db target.db [RELS,TO,SKIP] [props,to,skip] [Labels,To,Skip] [Labels,To,Delete,Nodes] [keep-node-ids:true/false]"

# Abort with a non-zero exit status so callers can detect the failure.
if [[ "$EDITION" != "enterprise" && "$EDITION" != "community" ]]
then
    echo "ATTENTION: The parameter '$EDITION' you passed in for the edition is neither 'community' nor 'enterprise'. Aborting."
    exit 1
fi
if [[ "$SRC" = "" || "$DST" = "" ]]
then
    echo "ATTENTION: Source '$SRC' or target '$DST' directory not provided. Aborting."
    exit 1
fi

if [[ ! -d "$SRC" ]]
then
    echo "ATTENTION: Source '$SRC' is not a directory. Aborting."
    exit 1
fi

echo "Using: Heap $HEAP Pagecache $CACHE Edition '$EDITION' from '$SRC' to '$DST' skipping labels: '$SKIP_LABELS', removing nodes with labels: '$DELETE_NODES' rels: '$SKIP_RELS' props '$SKIP_PROPS' Keeping Node Ids: $KEEP_NODE_IDS"
echo
echo "Please note that you will need this memory ($CACHE + $CACHE_SRC + $HEAP) as it opens 2 databases one for reading and one for writing."
# heap config
export MAVEN_OPTS="-Xmx$HEAP -Xms$HEAP -XX:+UseG1GC"

mvn clean compile exec:java -P${EDITION} -e -Dexec.mainClass="org.neo4j.tool.StoreCopy" -Ddbms.pagecache.memory=$CACHE -Ddbms.pagecache.memory.source=$CACHE_SRC \
-Dexec.args="$SRC $DST $SKIP_RELS $SKIP_PROPS $SKIP_LABELS $DELETE_NODES $KEEP_NODE_IDS"

#-Dneo4j.version=2.3.0
18 changes: 18 additions & 0 deletions neo4j.properties
@@ -0,0 +1,18 @@
dbms.pagecache.memory=2G
dbms.pagecache.memory.source=2G

cache_type=none
allow_store_upgrade=true

source_db_dir=
target_db_dir=

keep_node_ids=true

properties_to_ignore=
labels_to_ignore=
labels_to_delete=
rel_types_to_ignore=

store_copy_log_dir=
bad_entries_log_dir=
31 changes: 26 additions & 5 deletions pom.xml
@@ -4,31 +4,52 @@

<groupId>org.neo4j</groupId>
<artifactId>store-util</artifactId>
<version>2.0.1</version>
<version>3.1.0</version>
<packaging>jar</packaging>

<name>store-util</name>

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<neo4j.version>${project.version}</neo4j.version>
</properties>

<profiles>
<profile>
<id>enterprise</id>
<dependencies>
<dependency>
<groupId>org.neo4j</groupId>
<artifactId>neo4j-enterprise</artifactId>
<version>${neo4j.version}</version>
</dependency>
</dependencies>
</profile>
<profile>
<id>community</id>
</profile>
</profiles>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.neo4j</groupId>
<artifactId>neo4j-io</artifactId>
<version>${neo4j.version}</version>
</dependency>
<dependency>
<groupId>org.neo4j</groupId>
<artifactId>neo4j-kernel</artifactId>
<version>${project.version}</version>
<version>${neo4j.version}</version>
</dependency>
<dependency>
<groupId>org.neo4j</groupId>
<artifactId>neo4j-lucene-index</artifactId>
<version>${project.version}</version>
<version>${neo4j.version}</version>
</dependency>
</dependencies>

@@ -39,8 +60,8 @@
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
<source>1.8</source>
<target>1.8</target>
<compilerArgument>-Xlint:all</compilerArgument>
<showWarnings>true</showWarnings>
<showDeprecation>false</showDeprecation>
71 changes: 58 additions & 13 deletions readme.md
@@ -1,22 +1,67 @@
## Tools to copy and compare Neo4j Stores
## Tool to copy Neo4j Stores

Uses the GraphDatabaseService to read a store and the batch-inserter API to write the target store keeping the node-ids.
Copies the index-files as is.
Ignores broken nodes and relationships.
Uses the BatchInserterImpl to read a store and write the target store keeping the node-ids.
Copies the manual (legacy) index files as is. Please note that it performs no index upgrade!

Also useful to skip no longer wanted properties or relationships with a certain type. Good for store compaction as it
rewrites the store file reclaiming space that is sitting empty.
You will have to recreate any schema indexes too.

Change the Neo4j version in pom.xml before running. (Currently 1.9.5)
Ignores broken nodes and relationships and records them in `target/store-copy.log`.

Also useful to skip properties, relationship types, or labels that are no longer wanted,
and even to leave out nodes that carry certain labels.

Good for store compaction and reorganization of relationships and properties as
it rewrites the store file reclaiming space that is sitting empty.

NOTE: With Neo4j 3.x there are two different store formats, so you have to provide "enterprise" or "community" as first argument of the call!

You can now also compact the node store: to do so, pass "false" as the parameter for keep-node-ids.

Config is read from the `neo4j.properties` file in the current directory if it exists; command-line options override it.

neo4j.properties

```
source_db_dir=
target_db_dir=

keep_node_ids=true

properties_to_ignore=
labels_to_ignore=
labels_to_delete=
rel_types_to_ignore=

store_copy_log_dir=
bad_entries_log_dir=
```
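
The override rule described above (file first, command line wins) can be sketched roughly as follows. This is a minimal illustration, not the code in `StoreCopy`: the class name `StoreCopyConfigSketch`, the `resolve` helper, and the mapping of argument positions to property keys are assumptions made for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of this repository.
final class StoreCopyConfigSketch {

    // Loads key=value pairs in the standard java.util.Properties format.
    static Properties load(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the command-line value at `index` if it is present and non-empty,
    // otherwise the value from the properties file, otherwise `fallback`.
    static String resolve(String[] args, int index, Properties props, String key, String fallback) {
        if (index < args.length && !args[index].isEmpty()) {
            return args[index];
        }
        String fromFile = props.getProperty(key);
        return (fromFile == null || fromFile.isEmpty()) ? fallback : fromFile;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of neo4j.properties.
        Properties props = load(new StringReader("source_db_dir=from-file.db\nkeep_node_ids=true\n"));
        String[] cli = {"from-cli.db"};
        // Assumed positions: 0 = source dir; keep-node-ids is not given on this command line.
        System.out.println(resolve(cli, 0, props, "source_db_dir", ""));    // from-cli.db
        System.out.println(resolve(cli, 6, props, "keep_node_ids", "true")); // true
    }
}
```

Treating an empty value like a missing one means the blank entries in the template above fall through to the default.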

### Store Copy

Usage:
copy-store.sh [enterprise|community] source.db target.db [RELS,TO,SKIP] [props,to,skip] [Labels,To,Skip] [Labels,To,Delete,Nodes] [keep-node-ids:true/false]


The provided script contains these settings for page-cache (note you can configure a different, smaller setting for the source store than the target store).

dbms.pagecache.memory.source=1G
dbms.pagecache.memory=2G

Heap config is in the shell-script, default is:

export MAVEN_OPTS="-Xmx4G -Xms4G -XX:+UseG1GC"

**Please adapt the settings as needed for your store.**

**Please note that you will need the memory for (source-page-cache + target-page-cache + 1x heap), as it opens two databases: one for reading and one for writing.**
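
As a rough worked example of that sum, the sketch below adds up the three settings using the defaults from `copy-store.sh` (`CACHE_SRC=1G`, `CACHE=2G`, `HEAP=4G`). The class `MemoryBudgetSketch` and its size parser are hypothetical and only handle plain `K`/`M`/`G` suffixes. The result is a lower bound, since the JVM needs some memory beyond heap and page cache.

```java
import java.util.Locale;

// Hypothetical helper, not part of this repository.
final class MemoryBudgetSketch {

    // Parses sizes such as "1024K", "512M" or "2G" into bytes (binary units).
    // A value without a suffix is taken as a plain byte count.
    static long parseSize(String text) {
        String s = text.trim().toUpperCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        long multiplier;
        switch (s.charAt(s.length() - 1)) {
            case 'K': multiplier = 1024L; break;
            case 'M': multiplier = 1024L * 1024; break;
            case 'G': multiplier = 1024L * 1024 * 1024; break;
            default:  return Long.parseLong(s);
        }
        return Long.parseLong(s.substring(0, s.length() - 1)) * multiplier;
    }

    // source page cache + target page cache + heap, in bytes.
    static long requiredBytes(String sourcePageCache, String targetPageCache, String heap) {
        return parseSize(sourcePageCache) + parseSize(targetPageCache) + parseSize(heap);
    }

    public static void main(String[] args) {
        long total = requiredBytes("1G", "2G", "4G");
        System.out.println(total / (1024L * 1024 * 1024) + " GiB"); // prints "7 GiB"
    }
}
```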

Change the Neo4j version in pom.xml before running as needed. (Currently 3.1.0)

Optionally changeable from the outside with `-Dneo4j.version=3.1.0` on the `mvn` invocation.

### Internally

mvn compile exec:java -Dexec.mainClass="org.neo4j.tool.StoreCopy" \
-Dexec.args="source-dir target-dir [rel,types,to,ignore] [properties,to,ignore] [labels,to,ignore]"
Note: It calls under the hood:

# Store Compare
mvn compile exec:java -Dexec.mainClass="org.neo4j.tool.StoreCopy" -Penterprise \
-Dexec.args="source-dir target-dir [REL,TYPES,TO,IGNORE] [properties,to,ignore] [Labels,To,Ignore] [Labels,To,Delete,Nodes] [keep-node-ids:true/false]"

mvn compile exec:java -Dexec.mainClass="org.neo4j.tool.StoreComparer" \
-Dexec.args="source-dir target-dir [rel,types,to,ignore] [properties,to,ignore]"
8 changes: 4 additions & 4 deletions src/main/java/org/neo4j/tool/DomainAnalyzer.java
@@ -3,9 +3,9 @@
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.helpers.collection.IteratorUtil;
import org.neo4j.kernel.EmbeddedGraphDatabase;
import org.neo4j.helpers.collection.Iterables;

import java.io.File;
import java.util.*;

/**
@@ -74,13 +74,13 @@ private String toString(Object value) {
}
}
public static void main(String[] args) {
graphDb = new GraphDatabaseFactory().newEmbeddedDatabase(args[0]);
graphDb = new GraphDatabaseFactory().newEmbeddedDatabase(new File(args[0]));

long time = System.currentTimeMillis();
Map<Set<String>,Sample> statistics = new HashMap<Set<String>, Sample>();
int count = 0;
for (Node node : graphDb.getAllNodes()) {
final HashSet<String> keys = IteratorUtil.addToCollection(node.getPropertyKeys(), new HashSet<String>());
final HashSet<String> keys = Iterables.addToCollection(node.getPropertyKeys(), new HashSet<String>());
Sample sample = statistics.get(keys);
if (sample==null) {
sample = new Sample(node);
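
The `main` loop in `DomainAnalyzer` above keys its statistics map by the set of property keys found on each node. A minimal sketch of that grouping idea without Neo4j, using plain maps as stand-ins for nodes, might look like this (the class `KeySetStatsSketch` and the method `countByKeySet` are hypothetical names, and it only counts records rather than building the `Sample` objects the real code uses):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of this repository.
final class KeySetStatsSketch {

    // Counts how many records share exactly the same set of property keys.
    static Map<Set<String>, Integer> countByKeySet(Iterable<? extends Map<String, ?>> records) {
        Map<Set<String>, Integer> counts = new HashMap<>();
        for (Map<String, ?> record : records) {
            // Copy the key set so the map key does not change if the record does.
            Set<String> keys = new HashSet<>(record.keySet());
            counts.merge(keys, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = new ArrayList<>();
        Map<String, Object> first = new HashMap<>();
        first.put("name", "a");
        first.put("age", 1);
        Map<String, Object> second = new HashMap<>();
        second.put("name", "b");
        records.add(first);
        records.add(second);
        // Two distinct key sets, {name, age} and {name}, each seen once.
        System.out.println(countByKeySet(records));
    }
}
```

Because `HashSet` equality ignores insertion order, records with the same keys land in the same bucket regardless of the order in which the keys were read.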
9 changes: 5 additions & 4 deletions src/main/java/org/neo4j/tool/GraphGenerator.java
@@ -5,8 +5,8 @@
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.kernel.EmbeddedGraphDatabase;

import java.io.File;
import java.util.Arrays;

/**
@@ -17,7 +17,7 @@ public class GraphGenerator {
public static final int MILLION = 1000 * 1000;

public static void main(String[] args) {
final GraphDatabaseService gdb = new GraphDatabaseFactory().newEmbeddedDatabase("target/data");
final GraphDatabaseService gdb = new GraphDatabaseFactory().newEmbeddedDatabase(new File("target/data"));
createDatabase(gdb);
gdb.shutdown();
}
@@ -39,14 +39,15 @@ public static void createDatabase(GraphDatabaseService graphdb) {
System.out.print(".");
if ((i % 10000) == 0) {
tx.success();
tx.finish();
tx.close();
System.out.println(" " + i);
tx = graphdb.beginTx();
}
}
}
} finally {
tx.finish();
tx.success();
tx.close();
}
System.out.println();
long delta = (System.currentTimeMillis() - cpuTime);
4 changes: 2 additions & 2 deletions src/main/java/org/neo4j/tool/PropertyAnalyzer.java
@@ -3,8 +3,8 @@
import org.neo4j.graphdb.*;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.helpers.collection.MapUtil;
import org.neo4j.kernel.EmbeddedGraphDatabase;

import java.io.File;
import java.lang.reflect.Array;
import java.util.*;

@@ -96,7 +96,7 @@ public static Map<String,String> config() {
}

public static void main(String[] args) {
final GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedDatabaseBuilder(args[0]).setConfig(config()).newGraphDatabase();
final GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedDatabaseBuilder(new File(args[0])).setConfig(config()).newGraphDatabase();
int withoutProps=0, nodes = 0, rels = 0;
Map<String,PropertyInfo> props=new HashMap<String, PropertyInfo>();
for (Node node : db.getAllNodes()) {
82 changes: 0 additions & 82 deletions src/main/java/org/neo4j/tool/SingleRelationshipDeletion.java

This file was deleted.
