This repository integrates the components developed within the EU MONDO project.
The main artefacts developed within this project are:
- `uk.ac.york.mondo.integration.server.product`, an Eclipse product for a Jetty application server that hosts the integrated MONDO components and provides Apache Thrift-based APIs to access them remotely. Clients targeting this API are included in the `uk.ac.york.mondo.integration.updatesite` Eclipse P2 update site (GUI-based clients and OSGi console extensions) and the `uk.ac.york.mondo.integration.clients.cli.product` Eclipse product (command-line client).
- `uk.ac.york.mondo.integration.eclipse.product`, an Eclipse product with a custom Mars SR.1 distribution that includes all the Eclipse-based tools in the MONDO project.
The project has the following external dependencies:
- Most of the OSGi-ready dependencies come from the `uk.ac.york.mondo.integration.targetplatform` target platform definition. Due to conflicting licenses, developers are expected to build the Hawk GPL update site on their own and edit the target platform to refer to their locally built archive. Please refer to `mondo-hawk/README.md` for instructions. The Hawk GPL update site must be served through a local web server on port 8000: in most UNIX environments, changing to its directory and running `python -m SimpleHTTPServer 8000` should be enough.
- The `atl-mr` Eclipse project from the `integrate-hawk-emf` branch of the bluezio ATL/MapReduce fork (https://github.com/bluezio/ATL_MR).
- Some of the plugins in the `org.eclipse.atl.atlMR` project (see the CloudATL section for details).
- The Apache Artemis core client and server libraries. These are listed in `uk.ac.york.mondo.integration.artemis/ivy.xml`, so they are easy to download using IvyDE.
Compiled versions of the remote client components are available as an Eclipse update site and as a Maven repository here.
We recommend using Eclipse Modeling (Luna or later).
- Go to the MONDO update site and install the fix for SNI updates.
- Import all `uk.ac.york.mondo.integration.*` projects from this repository.
The dependencies for the project using Artemis are located in `uk.ac.york.mondo.integration.artemis`. There are two ways to fetch them:
1. Use the Apache IvyDE plug-in. Install everything from the IvyDE update site, right-click the `fetch-deps.xml` file and choose "Run As > Ant Build". If this does not work, go to option 2.
2. Use the command-line tool. Install the `ivy` Ubuntu package and issue the following command in the root of the repository:

   ```
   ant -Dnative-package-type=jar -lib /usr/share/java/ivy.jar -f uk.ac.york.mondo.integration.artemis/fetch-deps.xml
   ```

   You may have to run the Ant job twice in order to succeed.
Before attempting to resolve the target platform, don't forget to start the local web server in the mondo-hawk directory:

```
cd mondo-hawk/org.hawk.updatesite/hawk-gpl
python -m SimpleHTTPServer 8000
```
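The `SimpleHTTPServer` module only exists on Python 2. On a machine that only ships Python 3 (this substitution is our suggestion, not part of the original setup scripts), the equivalent built-in server is:

```
python3 -m http.server 8000
```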
Go to the `uk.ac.york.mondo.integration.targetplatform` project, open the `uk.ac.york.mondo.thrift.osgi.example.targetplatform.target` Target Definition file and click "Set as Target Platform".
The Thrift APIs are defined in `uk.ac.york.mondo.integration.api`, particularly in the `src/api.emf` Emfatic file. Emfatic produces an `.ecore` file from it, which we transform to the `api.thrift` Thrift IDL file using the ecore2thrift plugin. The Thrift compiler then uses `api.thrift` to generate Java and JavaScript client and server stubs (`Client` and `Processor`), as usual.
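For reference, that last step corresponds to an invocation of the standard Thrift compiler along these lines (the actual build may wrap this differently):

```
thrift --gen java --gen js api.thrift
```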
The same `.api` project also provides some utility methods and classes in its `api.utils` package, which is not automatically generated. It is recommended to use `APIUtils` to connect to the Thrift APIs instead of using the Thrift classes directly.
The MONDO Eclipse workbench product (`uk.ac.york.mondo.integration.eclipse.product`) runs on Mars SR.1, which has some graphics corruption issues in some GNU/Linux distributions. GNU/Linux users are recommended to use the provided `run-eclipse.sh` script, which sets up the relevant environment variables appropriately to avoid these issues.
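If you launch the product binary by hand instead, the usual workaround for the Mars-era GTK3 rendering glitches is to force the GTK2 backend; we have not verified that this is exactly what `run-eclipse.sh` does, so treat this as a fallback sketch:

```
SWT_GTK3=0 ./eclipse   # the launcher binary name may differ in the MONDO product
```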
The MONDO server product (`uk.ac.york.mondo.integration.server.product`) consists of several servlets that implement the Thrift APIs for the various MONDO components, plus two customizations for the standard OSGi HttpService: `uk.ac.york.mondo.integration.server.logback` binds SLF4J to the Logback library, and `uk.ac.york.mondo.integration.server.gzip` adds gzip compression to all HTTP responses coming from the server.
When deploying the server in a production environment, it is important to set up the secure store correctly, as it keeps the usernames and passwords of all the VCSs that Hawk indexes. This involves two steps (on Linux, they are automated when using the provided `run-server.sh` script):

- The secure store must be kept somewhere no other program will try to access it concurrently. This can be done by editing `mondo-server.ini` and adding `-eclipse.keyring` and `/path/to/keyringfile` on two separate lines. For added security, that path should be readable only by the user running the server.
- An encryption password must be set. For Windows and Mac, the available OS integration should be enough. For Linux environments, two lines have to be added at the beginning of the `mondo-server.ini` file, specifying the path to a password file: `-eclipse.password` and `/path/to/passwordfile`. A password file with 100 bytes of random data can be created with these commands:

  ```
  head -c 100 /dev/random | base64 > /path/to/password
  chmod 400 /path/to/password
  ```
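Putting both steps together, the first lines of `mondo-server.ini` would end up looking like this sketch (the paths are placeholders; pick locations readable only by the user running the server):

```
-eclipse.keyring
/home/mondo/.mondo-server/secure_storage
-eclipse.password
/home/mondo/.mondo-server/password
```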
The server will test on startup that the secure store has been set properly: if you get a warning that encryption is not available, you will need to revise your setup.
Another important detail for production environments is turning on security. This is disabled by default to help with testing and initial evaluations, but it can be enabled by running the server once, shutting it down, editing the `shiro.ini` file appropriately (the relevant sections include comments on what to do) and switching `artemis.security.enabled` to `true` in the `mondo-server.ini` file. The MONDO server uses an embedded MapDB database, which is managed through the Users Thrift API. Once security is enabled, all Thrift APIs and all external (not in-VM) Artemis connections become password-protected.
If you are deploying this across a network, you will need to edit the `mondo-server.ini` file and customize the `hawk.artemis.host` line with the host that the Artemis server should listen on: normally, this will be the IP address or hostname of the MONDO server in the network. The Thrift API also uses this hostname in its replies to the `watchModelChanges` operation in the Hawk API.

Additionally, if the server IP is dynamic but it has a consistent DNS name (e.g. an Amazon VM plus a dynamic DNS provider), we recommend setting `hawk.artemis.listenAll` to `true` (so the Artemis server will keep listening on all interfaces, even if the IP address changes) and using the DNS name for `hawk.artemis.host` instead of a literal IP address.
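The text above mentions three settings (`artemis.security.enabled`, `hawk.artemis.host` and `hawk.artemis.listenAll`) that live in `mondo-server.ini`. Assuming they are read as Java system properties and therefore belong under `-vmargs` (an assumption we have not checked against the server sources; `mondo.example.org` is a placeholder), the relevant lines could look like:

```
-vmargs
-Dartemis.security.enabled=true
-Dhawk.artemis.host=mondo.example.org
-Dhawk.artemis.listenAll=true
```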
Finally, production environments should enable and enforce SSL as well, since plain HTTP is insecure. Some pointers on how to do this are provided at https://www.eclipse.org/forums/index.php/t/24782/.
Hawk is integrated through its Thrift API, which is implemented in the `uk.ac.york.mondo.integration.hawk.servlet` plugin as standard OSGi HttpService servlets. The plugin takes care of saving and reloading the created Hawk instances and of starting and stopping the embedded Apache Artemis messaging server.
On top of this, the integration project includes the following clients for the Hawk Thrift APIs:
- `uk.ac.york.mondo.integration.hawk.cli` maps the Thrift API to the OSGi console view (use `hawkhelp` to see the available commands).
- `uk.ac.york.mondo.integration.hawk.emf` provides an EMF Resource implementation that allows treating a Hawk index (or a part of it) as a remote read-only model.
- `uk.ac.york.mondo.integration.hawk.remote.thrift` extends Hawk's Eclipse UI with an additional `IModelIndexer` implementation, so it can manage and access remote Hawk indexes in the same way as local ones.
Access details for Hawk model indexes are saved as `.hawkmodel` files or provided through `hawk+http(s)://` URLs. Both can be created with the editor provided by `uk.ac.york.mondo.integration.hawk.emf.dt`. `.hawkmodel` files can then be opened as models with the sample Ecore reflective editor or with the Epsilon Exeed editor.

`.hawkmodel` files are Java property files, so their file format is quite simple (an illustrative sketch is shown after the list below). They define the following options:
- Server URL, instance name and Thrift protocol: these are required to contact the Thrift API. `hawk.servlet` provides servlets in `/thrift/hawk/json`, `/thrift/hawk/binary`, `/thrift/hawk/compact` and `/thrift/hawk/tuple`: they are all backed by the same storage and logic, but they use different Thrift protocols with varying degrees of language compatibility and efficiency. JSON works for all languages but is the least efficient, `binary` works for everything but JavaScript and is quite efficient, `compact` takes up less space at the expense of some extra processing (but only works for a reduced set of languages), and `tuple` is the most space-efficient but only works for Java.
- Pattern for the version control repositories whose contents we want to see (we only support "*" as a wildcard at the moment).
- Comma-separated filename patterns for the files whose contents we want to see (again, only "*" is supported at the moment).
- Loading mode: from most eager to least, `GREEDY` requests all model elements at once, `LAZY_ATTRIBUTES` requests all model elements without attributes and then fetches attributes on the fly, `LAZY_CHILDREN` requests only the roots and then fetches all reference fields of a node once a reference is accessed, and `LAZY_REFERENCES` requests only the roots and fetches single reference fields once they are accessed. It is also possible to combine `LAZY_ATTRIBUTES` with the other two lazy modes (`LAZY_ATTRIBUTES_CHILDREN` and `LAZY_ATTRIBUTES_REFERENCES`). For very large models (with millions of elements), `LAZY_CHILDREN` has performed best so far at browsing models (when combined with the Epsilon Exeed editor).
- By checking the "Subscribe" box, the resource will ask Hawk to feed an Artemis queue with events about changes in the indexed models. This queue will be used by the resource to update the local view of the model incrementally on the fly. A unique client ID must be provided in order to support reconnections and durable queues. The durability of the queue can be `DEFAULT` (survives reconnections), `DURABLE` (survives reconnections and server restarts) or `TEMPORARY` (removed after disconnecting).
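As an illustration of the format only — the real property keys are defined by the `hawk.emf` plugin and are best produced with the `.hawkmodel` editor, so every key name below is hypothetical — such a file could look like:

```
# Illustrative sketch: key names are hypothetical, values are placeholders
hawk.serverURL=http://mondo-server:8080/thrift/hawk/tuple
hawk.instanceName=myInstance
hawk.repositoryPattern=*
hawk.filePatterns=*.xmi,*.uml
hawk.loadingMode=LAZY_CHILDREN
hawk.subscribe=false
```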
CloudATL (also known as ATL/MapReduce) is integrated in a similar way to Hawk: `fr.inria.atlanmod.mondo.integration.cloudatl.servlet` implements the Thrift API, which is exposed by the `fr.inria.atlanmod.mondo.integration.cloudatl.cli` plugin as a set of OSGi console commands.
The CloudATL servlet works as a frontend node for a Hadoop cluster, which must have been set up in advance. The `conf` folder of the `.servlet` project provides an example of what the configuration would look like for a trivial one-node cluster. Using Docker, it is quite simple to start a one-node pseudo-distributed Hadoop cluster: install Docker, make sure you have over 10% of free disk space (required by Hadoop to start a node) and issue the following command:
```
sudo docker run -it bluezio/hadoop-jh /etc/bootstrap.sh -bash
```
After this, the `conf/*.xml` files should be updated to reflect the IP address of the Docker instance. `hdfs-site.xml` should also be edited to provide valid local directories writable by the user running the MONDO server. When running the server, the `HADOOP_USER_NAME` environment variable should be set to `root` (the username running Hadoop in the Docker instance) and `HADOOP_CONF_DIR` should be set to the absolute path of the `conf` directory.
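For example, assuming the configuration lives in the checked-out `.servlet` project and the server is started from a shell (the paths below are placeholders):

```
export HADOOP_USER_NAME=root
export HADOOP_CONF_DIR=/path/to/fr.inria.atlanmod.mondo.integration.cloudatl.servlet/conf
./run-server.sh   # or however you normally launch the MONDO server
```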
NOTE: pseudo-distributed Hadoop clusters are only meant for quick test runs. To obtain actual performance gains, users will need to set up more realistic Hadoop clusters on their own. More advanced Hadoop configurations are outside of the scope of this document.
Once set up, running a CloudATL job from the OSGi console can be done with two commands (the full list is available through `cloudatlhelp`):

```
cloudatlconnect http://mondo_server_ip:port/thrift/cloudatl
cloudatllaunch emftvm_url sourcemetamodel_url targetmetamodel_url sourcemodel_url targetmodel_url
```
The supported URL schemes are `hdfs://` (for files that have been previously uploaded to the Hadoop Distributed File System, using e.g. `hdfs dfs -put`) and `hawk+http://` (for models indexed by Hawk: these URLs are produced by the `.hawkmodel` editor in the `hawk.emf.dt` plugin).
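As a purely illustrative invocation (every host name and path below is made up), a launch mixing both schemes might look like this:

```
cloudatllaunch hdfs://namenode:9000/f2p.emftvm hdfs://namenode:9000/Families.ecore hdfs://namenode:9000/Persons.ecore hawk+http://mondo-server:8080/thrift/hawk/tuple hdfs://namenode:9000/persons-out.xmi
```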
The CloudATL servlet needs to have up-to-date builds of ATL/MapReduce and the Hawk EMF driver in its `libs` folder, so it can send them to Hadoop for processing. Building the `.jar` requires these steps:
- Download and install a recent version of Eclipse and a JDK for Java 7.
- Go to "Window > Preferences > Java > Compiler" and change the default "Compiler compliance level" to 1.7. This is needed for the Docker image above, which comes with Java 7.
- Go to "Help > Install New Software..." and install everything from IvyDE through its update site. Let Eclipse restart.
- Go to the Git perspective with "Window > Perspective > Open Perspective > Other... > Git".
- Clone the `integrate-hawk-emf` branch of the https://github.com/bluezio/ATL_MR Git repository and import its projects. This can be done by copying the URL into the clipboard, right-clicking on the "Git Repositories" view and selecting "Paste Repository Path or URI". Make sure to check the "Import all existing Eclipse projects" box on the last step of the wizard.
- Clone the https://github.com/atlanmod/org.eclipse.atl.atlMR.git repository, but do not import all projects. Instead, uncheck the box, let the clone finish and right-click on the "plugins" folder within "Working Directory", selecting the "Import Projects..." menu entry.
- Close all projects except for `atl-mr`, and open `org.eclipse.m2m.atl.emftvm` and `org.eclipse.m2m.atl.emftvm.trace`, without letting Eclipse open any referenced projects.
- Right-click on `atl-mr` in the "Package Explorer" view and select "Export... > Ant Buildfiles".
- Right-click on the generated `build.xml` file in the "Package Explorer" view and select "Run As > Ant Build...". Select the `dist-emftvm` configuration, and make sure in the "JRE" tab that it runs in the same JRE as the workspace.
- Refresh the `atl-mr` project by right-clicking on it in the "Package Explorer" view and selecting "Refresh".
- Again, right-click on the generated `build.xml` file, but this time select the `dist` configuration. It does not need to run in the same JRE as the workspace, but nevertheless check in the "JRE" tab that a valid JRE has been selected.
- Refresh the project, and we're done: the binary distribution will be located in `atl-mr/dist`.