strongbox · sbespalov · Mar 18, 2020 · Mar 18, 2020 · Mar 18, 2020 · Mar 18, 2020
diff --git a/docs/developer-guide/getting-started-with-persistence.md b/docs/developer-guide/getting-started-with-persistence.md
@@ -4,65 +4,33 @@
 
 This page contains explanations and code samples for developers who need to store their entities into the database. 
 
-The Strongbox project uses [OrientDB](http://orientdb.com/orientdb/) as its internal persistent storage through the 
-corresponding `JPA` implementation and `spring-orm` middle tier. Also we use `JTA` for transaction management and 
-`spring-tx` implementation module from Spring technology stack.
+The Strongbox project uses [JanusGraph](https://janusgraph.org/) as its internal persistent storage through the 
+corresponding [Gremlin](https://tinkerpop.apache.org/gremlin.html) implementation and [spring-data-neo4j](https://spring.io/projects/spring-data-neo4j#overview) middle tier. We also use `JTA` for transaction management and the `spring-tx` implementation module from the Spring technology stack.
 
-## OrientDB Studio
+## Persistence stack
 
-As you are learning about Strongbox persistence, you may want to explore the existing persistence implementation.
-For development environments, Strongbox includes an embedded OrientDB server as well as an embedded instance of
-OrientDB Studio. By default, when you run the application from the source tree, you'll use the embedded database
-server. However, OrientDB Studio is disabled by default.
+We're using the following technology stack to deal with persistence:
 
-### Running OrientDB Studio From Source Tree
+ - Embedded Cassandra as direct storage (`CassandraDaemon` allows us to have the Cassandra instance inside the same JVM as the application)
+ - JanusGraph as our graph DBMS (it is not directly a data storage, it just allows you to have access to data in the form of a graph)
+ - [Apache TinkerPop](http://tinkerpop.apache.org/docs/current/reference/) as a set of tools to interact with the database
+ - [spring-data-neo4j](https://github.com/spring-projects/spring-data-neo4j) to manage transactions in Spring with `Neo4jTransactionManager` and implement custom Cypher queries with Spring Data repositories (by custom queries via the `@org.springframework.data.neo4j.annotation.Query` annotation)
+ - [cypher-for-gremlin](https://github.com/opencypher/cypher-for-gremlin) which translates Cypher queries into Gremlin traversals (it has some issues which prevent us from using it for `neo4j-ogm` CRUD operations, these issues will be explained below)
+ - [neo4j-ogm](https://github.com/neo4j/neo4j-ogm) to map Java POJOs into Vertices and Edges of Graph
+ - We also use custom `EntityTraversalAdapters`, which implement anonymous Gremlin traversals for CRUD operations under `neo4j-ogm` entities.
 
-To enable OrientDB Studio, you need only to set the property `strongbox.orientdb.studio.enabled` to `true`. You
-can do this on the Maven command line by running Strongbox as follows:
+## Vertices and Edges
 
-```
-$ mvn spring-boot:run -Dspring-boot.run.jvmArguments="-Dstrongbox.orientdb.studio.enabled=true"
-```
-
-There are two additional properties that can be used to configure OrientDB Studio:
-
-- `strongbox.orientdb.studio.ip.address`
-- `strongbox.orientdb.studio.port`
-
-### Running OrientDB Studio From The Distribution
-
-If you're running from the `tar.gz`, or `rpm` distributions, you can start Strongbox as follows to enable OrientDB Studio:
-
-```
-$ cd /opt/strongbox
-$ STRONGBOX_VAULT=/opt/strongbox-vault STRONGBOX_ORIENTDB_STUDIO_ENABLED=true ./bin/strongbox console
-```
+Unlike a relational DBMS, Graph DBMS have vertices and edges, not rows and tables. So, in terms of Graph, every persistent entity should be stored as vertex or edge. An example of a vertex might be `Artifact` or `AritfactCoordinates` and the relation between them would be an edge. It should be noted that, unlike RDBMS, object relations are represented by a separate edge, instead of just a foreign key column in a table. In addition to vertices, persistence objects can also be edges -- for example, the `ArtifactDependency` would be an edge between `ArtifactCoordinates` vertices.
 
-Please, note that the `STRONGBOX_VAULT` environment variable needs to be pointing to an absolute path for this to work.
+## Gremlin Server
 
-As with the source distribution, you can set additional environment variables to further configure OrientDB Studio:
+`TODO`
 
-```
-$ export STRONGBOX_ORIENTDB_STUDIO_IP_ADDRESS=0.0.0.0
-$ export STRONGBOX_ORIENTDB_STUDIO_PORT=2480
-```
-
-Once the application is running, you can login to OrientDB Studio by visiting
-http://127.0.0.1:2480/studio/index.html in your browser. The initial credentials are `admin` and `password`.
-
-![Login Screen](/assets/screenshots/orientdb-studio/login-screen.png)
-
-After your login, you'll land on the Browse Screen, which allows you to query the embedded database.
-
-![Browse Screen](/assets/screenshots/orientdb-studio/browse-screen.png)
-
-Finally, you can explore the schema defined in the database by clicking `SCHEMA`.
-
-![Schema Screen](/assets/screenshots/orientdb-studio/schema-screen.png)
 
 ## Adding Dependencies
 
-Let's assume that you, as a Strongbox developer, need to create a new module or write some persistence code in an 
+Let's assume that you, as a Strongbox developer, need to create a new module, or write some persistence code in an 
 existing module that does not contain any persistence dependencies yet. (Otherwise you will already have the proper 
 `<dependencies/>` section in your `pom.xml`, similar to the one in the example below). You will need to add the 
 following code snippet to your module's `pom.xml` under the `<dependencies>` section:
@@ -75,173 +43,195 @@ following code snippet to your module's `pom.xml` under the `<dependencies>` sec
 </dependency>
 ```
 
-Notice that there is no need to define any direct dependencies on OrientDB or Spring Data - it's already done via 
+Notice that there is no need to define any direct dependencies on JanusGraph or Spring Data - it's already done via 
 the `strongbox-data-service` module.
 
 ## Creating Your Entity Class
 
 Let's now assume that you have a POJO and you need to save it to the database (and that you probably have at least 
-CRUD operation's implemented in it as well). Place your code under the `org.carlspring.strongbox.domain.yourstuff` 
-package. For the sake of the example, let's pick `MyEntity` as the name of your entity.
+CRUD operations implemented in it as well). Place your code under the `org.carlspring.strongbox.domain` 
+package. For the sake of the example, let's pick `PetEntity` as the name of your entity.
 
 If you want to store that entity properly you need to adopt the following rules:
 
-* Extend the `org.carlspring.strongbox.data.domain.GenericEntity` class to inherit all required fields and logic from 
-  the superclass.
-* Define getters and setters according to the `JavaBeans` coding convention for all non-transient properties in your 
-  class.
-* Define a default empty constructor for safety (even if the compiler will create one for you, if you don't define any 
-  other constructors) and follow the `JPA` and `java.io.Serializable` standards.
-* Override the `equals() `and `hashCode()` methods according to java `hashCode` contract (because your entity could be 
-  used in collection classes such as `java.util.Set` and if you don't define such methods properly other developers or 
-  yourself will be not able to use your entity).
-* _Optional_ - define a `toString()` implementation to let yourself and other developers see something meaningful in 
-  the debug messages.
+* Create the interface for your entity with all the getters and setters that are required to interact with the entity, according to the `JavaBeans` coding convention. This interface should extend `org.carlspring.strongbox.data.domain.DomainObject`. We need an interface in order to hide the implementation-specific details that depend on the underlying database, such as inheritance strategy.
+* Create the entity class which implements the above interface and extend to `org.carlspring.strongbox.data.domain.DomainEntity`.
+* Declare an entity class with `@NodeEntity` or `@RelationshipEntity`.
+* Define a default empty constructor, as this would be required in order to create entity instances from `neo4j-ogm` internals.
 
 The complete source code example that follows all requirements should look something like this:
 
 ```java
 package org.carlspring.strongbox.domain;
 
-import org.carlspring.strongbox.data.domain.GenericEntity;
-
-import com.google.common.base.Objects;
-
-public class MyEntity
-        extends GenericEntity
+@NodeEntity("Pet")
+public class PetEntity
+        extends DomainEntity
+        implements Pet
 {
 
-    private String property;
+    private Integer age;
 
-    public MyEntity()
+    public PetEntity()
     {
     }
 
-    public String getProperty()
+    @Override
+    public Integer getAge()
     {
-        return property;
+        return age;
     }
 
-    public void setProperty(String property)
+    @Override
+    public void setAge(Integer age)
     {
-        this.property = property;
+        this.age = age;
     }
 
+}
+```
+
+## Creating a `EntityTraversalAdapter`
+
+As mentioned above, besides `neo4j-ogm` and `spring-data-neo4j`, we were forced to use custom CRUD implementations based on Gremlin. This has its advantages, as it allows us to optimize OGM entities and make them faster than what the common `neo4j-ogm` provides out of the box. The main thing of the Gremlin based CRUD is `EntityTraversalAdapter` which is a strategy for create/update/read/delete operations. The concrete `EntityTraversalAdapter` provides [Anonymous Traversals](http://tinkerpop.apache.org/docs/current/tutorials/gremlins-anatomy/) for each operation of the specific entity type. These traversals are used in Gremlin-based repositories to perform common CRUD operations:
+
+- `fold` : to construct entity instance based on vertex/edge and its properties
+- `unfold` : to extract entity properties into vertex/edge and its properties
+- `cascade` : to cascade other vertices/edges within delete if needed
+
+Basically these all these operations are implemented using special `__` class, which represent anonymous traversal in Gremlin.
+
+The `EntityTraversalAdapter` implementations can also use each other to support relations between entities, inheritance and cascade operations.
+
+Below is the code example of `EntityTraversalAdapter` implementation for `PetEntity`:
+
+```java
+package org.carlspring.strongbox.gremlin.adapters;
+
+import static org.carlspring.strongbox.gremlin.adapters.EntityTraversalUtils.extractObject;
+
+import java.util.Collections;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.tinkerpop.gremlin.process.traversal.Traverser;
+import org.apache.tinkerpop.gremlin.structure.Element;
+import org.apache.tinkerpop.gremlin.structure.Vertex;
+import org.carlspring.strongbox.domain.Pet;
+import org.carlspring.strongbox.domain.PetEntity;
+import org.carlspring.strongbox.gremlin.dsl.EntityTraversal;
+import org.carlspring.strongbox.gremlin.dsl.__;
+import org.springframework.stereotype.Component;
+
+@Component
+public class PetAdapter extends VertexEntityTraversalAdapter<Pet>
+{
+
     @Override
-    public boolean equals(Object o)
+    public Set<String> labels()
     {
-        if (this == o)
-        {
-            return true;
-        }
-        if (o == null || getClass() != o.getClass())
-        {
-            return false;
-        }
-
-        MyEntity myEntity = (MyEntity) o;
+        return Collections.singleton("Pet");
+    }
 
-        return Objects.equal(property, myEntity.property);
+    @Override
+    public EntityTraversal<Vertex, Pet> fold()
+    {
+        return __.<Vertex, Object>project("uuid", "age")
+                 .by(__.enrichPropertyValue("uuid"))
+                 .by(__.enrichPropertyValue("age"))
+                 .map(this::map);
+    }
+
+    private Pet map(Traverser<Map<String, Object>> t)
+    {
+        PetEntity result = new PetEntity();
+        result.setUuid(extractObject(String.class, t.get().get("uuid")));
+        result.setAge(extractObject(Integer.class, t.get().get("age")));
+
+        return result;
     }
 
     @Override
-    public int hashCode()
+    public UnfoldEntityTraversal<Vertex, Vertex> unfold(Pet entity)
     {
-        return Objects.hashCode(property);
+        EntityTraversal<Vertex, Vertex> t = __.<Vertex>identity();
+        if (entity.getAge() != null)
+        {
+            t = t.property(single, "age", entity.getAge());
+        }
+
+        return new UnfoldEntityTraversal<>("Pet", t);
     }
 
     @Override
-    public String toString()
+    public EntityTraversal<Vertex, ? extends Element> cascade()
     {
-        final StringBuilder sb = new StringBuilder("MyEntity{");
-        sb.append("property='").append(property).append('\'');
-        sb.append('}');
-
-        return sb.toString();
+        return __.identity();
     }
+
 }
-```
 
-## Creating a DAO Layer
+```
 
-First of all you will need to extend the `CrudService` with the second type parameter that corresponds to your ID's data type. Usually it's just strings. 
+## Creating a `Repository`
 
+All the database interactions should be done through repositories. For the compatibility with `spring-data`, we use `org.springframework.data.repository.CrudRepository` as a basis for our repositories. The base class for implementing `EntityTraversalAdapter`-based repositories is `org.carlspring.strongbox.gremlin.repositories.GremlinRepository`. Further repository implementation depends on the type of entity; for vertex-backed entities, it should be `GremlinVertexRepository`.
+In addition to CRUD operations, there is also the need to be able to select data using queries. Queries could be implemented using [Cypher](https://neo4j.com/docs/cypher-manual/current/introduction/) through `spring-data-neo4j` using the `@org.springframework.data.neo4j.annotation.Query` annotation. So, the final repository should be a mixin that extends `GremlinRepository` and delegates custom `Cypher` queries to the `org.springframework.data.repository.Repository` instance provided by `spring-data-neo4j`.
 
-!!! tip "To read more about ID's in OrientDB, check the <a href='http://orientdb.com/docs/2.0/orientdb.wiki/Tutorial-Record-ID.html' target='_blank'>manual</a>"
+Putting together all the above, the repository for the `PetEntity` will look like below:
 
 ```java
-package org.carlspring.strongbox.users.service;
+package org.carlspring.strongbox.repositories;
 
-import org.carlspring.strongbox.data.service.CrudService;
-import org.carlspring.strongbox.users.domain.MyEntity;
+import javax.inject.Inject;
 
-import org.springframework.transaction.annotation.Transactional;
+import org.carlspring.strongbox.domain.Pet;
+import org.carlspring.strongbox.gremlin.adapters.PetAdapter;
+import org.carlspring.strongbox.gremlin.repositories.GremlinVertexRepository;
+import org.springframework.stereotype.Repository;
 
-/**
- * CRUD service for managing {@link MyEntity} entities.
- *
- * @author Alex Oreshkevich
- */
-@Transactional
-public interface MyEntityService
-        extends CrudService<MyEntity, String>
+@Repository
+public class PetRepository extends GremlinVertexRepository<Pet>
+        implements PetQueries
 {
 
-    MyEntity findByProperty(String property);
+    @Inject
+    PetAdapter adapter;
+
+    @Inject
+    PetQueries queries;
 
-}
-```
+    @Override
+    protected PetAdapter adapter()
+    {
+        return adapter;
+    }
 
-After that you will need to define an implementation of your service class. 
-
-Follow these rules for the service implementation:
-
-* Inherit your CRUD service from `CommonCrudService<MyEntity>` class;
-* Name it like your service interface with an `Impl` suffix, for example `MyEntityServiceImpl`;
-* Annotate your class with the Spring `@Service` and `@Transactional` annotations;
-* Do **not** define your service class as public and use interface instead of class for injection (with `@Autowired`); 
-  this follows the best practice principles from Joshua Bloch 'Effective Java' book called Programming to Interface;
-* _Optional_ - define any methods you need to work with your `MyEntity` class; these methods mostly should be based on 
-  common API form `javax.persistence.EntityManager`, or custom queries (see example below);
-
-* !!! warning "Avoid query parameters construction through string concatenation!"
-      Please avoid using query parameter construction through string concatenation!  
-      This usually leads to [SQL Injection](https://en.wikipedia.org/wiki/SQL_injection) issues!    
-      Bad query example:  
-      `String sQuery = "select * from MyEntity where proprety='" + propertyValue + "'"`;  
-      What you should do instead is to create a service which does properly assigns the parameters.  
-      Here's an example service:
-      ```java
-      @Transactional
-      public class MyEntityServiceImpl
-              extends CommonCrudService<MyEntity> implements MyEntityService
-      {
-          public MyEntity findByProperty(String property)
-          {
-              String sQuery = "select * from MyEntity where property = :propertyValue";
-
-              OSQLSynchQuery<Long> oQuery = new OSQLSynchQuery<Long>(sQuery);
-              oQuery.setLimit(1);
-
-              HashMap<String, String> params = new HashMap<String, String>();
-              params.put("propertyValue", property);
-
-              List<MyEntity> resultList = getDelegate().command(oQuery).execute(params);
-              return !resultList.isEmpty() ? resultList.iterator().next() : null;
-          }
-      }
-      ```
-
-## Register entity schema in EntityManager
-Before using entities you will need to register them. Consider the following example:
+    List<Pet> findByAgeGreater(Integer age)
+    {
+        return queries.findByAgeGreater(age);
+    }
 
-```java
-@Inject
-private OEntityManager oEntityManager;
+}
 
-@PostConstruct
-public void init() 
+@Repository
+interface PetQueries
+        extends org.springframework.data.repository.Repository<Pet, String>
 {
-    oEntityManager.registerEntityClass(MyEntity.class);
+
+    @Query("MATCH (pet:Pet) " +
+           "WHERE pet.age > $age " +
+           "RETURN pet")
+    List<Pet> findByAgeGreater(@Param("age") Integer age);
+
 }
 ```
+
+## Issues of `cypher-for-gremlin` and `neo4j-ogm`
+
+The first issue that we have, is the fact that `cypher-for-gremlin` does not fully suport all Cypher syntax that is produced by `neo4j-ogm` for CRUD operations. To be more specific, on every CRUD operation, `neo4j-ogm` generates a Cypher query which is then translated to Gremlin by `cypher-for-gremlin`. As a workadound, we modify Cypher queries produced by `neo4j-ogm` and replace some clauses (see `org.opencypher.gremlin.neo4j.ogm.request.GremlinRequest`). 
+
+Another issue is that `cypher-for-gremlin` has an ambiguous concept for working with `null` values in Gremlin. They put a lot of noisy tokens into Gremlin traversals which prevents the JanusGraph engine from matching expected indexes. This, in term, causes heavy full-scans on every query (see [#342](https://github.com/opencypher/cypher-for-gremlin/issues/342)). This was the main reason why we couldn't use the `neo4j-ogm` for CRUD operations.
+
+Either way, we are still using it for custom Cypher queries via the `@org.springframework.data.neo4j.annotation.Query` annotation. This is a good option to have Cypher queries, instead of Gremlin ones, because it looks more clear and takes less time to read and write queries.
+