GSIP 68
Introduce guava-libraries as a GeoServer core dependency and provide some general guidelines on when, why, and how to use them
Gabriel Roldan
2.2.0.
Completed
I’ve been using some of the Guava utilities for most of the last year in other GeoServer-related projects. On the mailing list we decided that a GSIP would be worthwhile, both as an introduction to Guava’s benefits and as a reference for other GeoServer developers.
This proposal aims to introduce the Google guava-libraries as a core GeoServer dependency and to provide some guidelines and material for the progressive adoption of its utility classes, ranging from collection utilities to I/O, concurrency, primitive and String operations, cache facilities, and more.
In a nutshell, excerpt from the Guava Explained wiki:
- Basic utilities: Make using the Java language more pleasant
- Collections: Guava’s extensions to the JDK collections ecosystem. These are some of the most mature and popular parts of Guava.
- Caches: Local caching, done right, and supporting a wide variety of expiration behaviors.
- Functional idioms: Used sparingly, Guava’s functional idioms can significantly simplify code.
- Concurrency: Powerful, simple abstractions to make it easier to write correct concurrent code.
- Strings: A few extremely useful string utilities: splitting, joining, padding, and more.
- Primitives: operations on primitive types, like int and char, not provided by the JDK, including unsigned variants for some types.
- Ranges: Guava’s powerful API for dealing with ranges on Comparable types, both continuous and discrete.
- I/O: Simplified I/O operations, especially on whole I/O streams and files, for Java 5 and 6.
- Hashing: Tools for more sophisticated hashes than what’s provided by Object.hashCode(), including Bloom filters.
- EventBus: Publish-subscribe-style communication between components without requiring the components to explicitly register with one another.
- Math: Optimized, thoroughly tested math utilities not provided by the JDK.
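To give a taste of the "Strings" and "Primitives" items above, here is a minimal sketch. The class and method names (GuavaBasicsSketch, parseLayerNames, and so on) are made up for illustration; only the Guava calls themselves (Splitter, Joiner, Strings.padStart, Ints.max) come from the library.

```java
import com.google.common.base.Joiner;
import com.google.common.base.Splitter;
import com.google.common.base.Strings;
import com.google.common.collect.Lists;
import com.google.common.primitives.Ints;

import java.util.List;

public class GuavaBasicsSketch {

    /** Splits a comma separated list, trimming whitespace and dropping empty tokens. */
    public static List<String> parseLayerNames(String csv) {
        Iterable<String> tokens = Splitter.on(',').trimResults().omitEmptyStrings().split(csv);
        return Lists.newArrayList(tokens);
    }

    /** Joins the names with a separator, silently skipping null entries. */
    public static String joinNames(Iterable<String> names) {
        return Joiner.on(", ").skipNulls().join(names);
    }

    /** Left-pads a number with zeros up to the given width. */
    public static String zeroPad(int value, int width) {
        return Strings.padStart(String.valueOf(value), width, '0');
    }

    /** Largest of the given ints; Ints.max rejects an empty array. */
    public static int largest(int... values) {
        return Ints.max(values);
    }
}
```

None of this is hard to write by hand, but each hand-written variant tends to handle blanks, nulls, and empty input slightly differently.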
It’s on maven central, so just:
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>11.0.1</version>
</dependency>
It’s a single but sizable JAR, around 1.5 MB. In order not to increase
the size of our downloads too much, it looks like we could at least get
rid of the following libraries (thanks Andrea):
1.6M aspectjweaver1.6.8.jar
1.2M xercesImpl2.7.1.jar
The following are just some small concrete examples of using Guava utilities in GeoServer, and focus only on the bits that I got to use so far.
We use a lot of caches, especially in core classes like CatalogImpl and ResourcePool.
Some are plain HashMap instances, others are custom-crafted specializations of SoftValueHashMap, and some need to do additional cleanup when a resource is evicted from the cache.
ResourcePool has all of these cases. Replacing those HashMaps and custom classes with a Guava Cache does more with less code:
- Set a cache capacity bound;
- Entry expiration based on last access time or last write time;
- Ability to use weak keys and/or soft value references;
- Concurrency hints (the table is internally partitioned to try to permit the indicated number of concurrent updates without thread contention);
- For the cases where resource cleanup needs to be done upon entry eviction, the cache population logic and the entry eviction hook can be encapsulated in a single object, so related logic stays close together:
....
// declared as DataStoreLoader so it can be passed both as the loader and as the removal listener
DataStoreLoader loader = new DataStoreLoader();
LoadingCache<String, DataAccess> dataStoreCache = CacheBuilder.newBuilder()
        .concurrencyLevel(10)
        .expireAfterAccess(10, TimeUnit.MINUTES)
        .initialCapacity(10)
        .maximumSize(100)
        .softValues()
        .removalListener(loader)
        .build(loader);
....
class DataStoreLoader extends CacheLoader<String, DataAccess>
        implements RemovalListener<String, DataAccess> {

    @Override
    public DataAccess load(String id) throws Exception {
        DataAccess dataStore = ....
        return dataStore;
    }

    @Override
    public void onRemoval(RemovalNotification<String, DataAccess> notification) {
        String id = notification.getKey();
        DataAccess da = notification.getValue();
        if (da == null) {
            return; // a soft value may already have been garbage collected
        }
        try {
            da.dispose();
        } catch (Exception e) {
            LOGGER.log(Level.WARNING, e.getMessage(), e);
        }
    }
}
- Eliminates the need for the “double-checked locking” anti-pattern, so that every get method on cacheable contents basically becomes:
public Foo getFoo(String someKey) throws ExecutionException {
    return fooCache.get(someKey);
}
instead of
public Foo getFoo(String someKey) {
    Foo foo = fooCache.get(someKey);
    if (foo == null) {
        synchronized (fooCache) {
            foo = fooCache.get(someKey);
            if (foo == null) {
                foo = ....
                fooCache.put(someKey, foo);
            }
        }
    }
    return foo;
}
Here’s a complete patch for using Guava Cache in ResourcePool, and the clean version of it.
Although that patch is not strictly part of this proposal, it would be a good thing to have once/if this proposal is accepted.
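For readers who want to try the loader-plus-removal-listener idea in isolation, the following is a minimal self-contained sketch. It is not the ResourcePool patch: FakeStore is a hypothetical stand-in for a disposable resource such as DataAccess, and CacheSketch, StoreLoader, and getStore are names invented for the example. It assumes Guava 11 or later, where CacheBuilder.build(CacheLoader) returns a LoadingCache.

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;
import com.google.common.cache.RemovalNotification;

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheSketch {

    /** Hypothetical stand-in for a resource that must be disposed, e.g. a DataAccess. */
    public static class FakeStore {
        public final String id;
        public volatile boolean disposed;

        public FakeStore(String id) {
            this.id = id;
        }

        public void dispose() {
            disposed = true;
        }
    }

    /** Creates entries on demand and disposes them when they leave the cache. */
    public static class StoreLoader extends CacheLoader<String, FakeStore>
            implements RemovalListener<String, FakeStore> {

        public final AtomicInteger loads = new AtomicInteger();

        @Override
        public FakeStore load(String id) {
            loads.incrementAndGet();
            return new FakeStore(id);
        }

        @Override
        public void onRemoval(RemovalNotification<String, FakeStore> notification) {
            FakeStore store = notification.getValue();
            if (store != null) { // can be null when weak or soft references were collected
                store.dispose();
            }
        }
    }

    public final StoreLoader loader = new StoreLoader();

    public final LoadingCache<String, FakeStore> cache = CacheBuilder.newBuilder()
            .maximumSize(100)
            .expireAfterAccess(10, TimeUnit.MINUTES)
            .removalListener(loader)
            .build(loader);

    /** No null check and no synchronized block: the cache calls the loader when needed. */
    public FakeStore getStore(String id) {
        return cache.getUnchecked(id);
    }
}
```

Calling getStore twice with the same id is expected to invoke load only once, and invalidating the entry hands the old value to onRemoval so it can be disposed. getUnchecked is used here only because this loader throws no checked exceptions; a loader that does would normally go through get and handle ExecutionException.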
class GeoServerDataProvider<T> {
    ...
    // Before
    public Iterator<T> iterator(int first, int count) {
        List<T> items = getFilteredItems();
        // global sorting
        Comparator<T> comparator = getComparator(getSort());
        if (comparator != null) {
            Collections.sort(items, comparator);
        }
        // in memory paging
        int last = first + count;
        if (last > items.size())
            last = items.size();
        return items.subList(first, last).iterator();
    }

    // After
    public Iterator<T> iterator(int first, int count) {
        Iterable<T> items = getFilteredItems();
        // global sorting
        Comparator<T> comparator = getComparator(getSort());
        if (comparator != null) {
            items = Ordering.from(comparator).sortedCopy(items);
        }
        // in memory paging
        Iterator<T> iterator = items.iterator();
        Iterators.skip(iterator, first);
        return Iterators.limit(iterator, count);
    }

    // Before
    protected List<T> getFilteredItems() {
        List<T> items = getItems();
        // if needed, filter
        if (keywords != null && keywords.length > 0) {
            return filterByKeywords(items);
        } else {
            // make a deep copy anyways, the catalog does not do that for us
            return new ArrayList<T>(items);
        }
    }

    // After
    protected Iterable<T> getFilteredItems() {
        Iterable<T> items = getItems();
        // if needed, filter
        if (keywords != null && keywords.length > 0) {
            return filterByKeywords(items);
        } else {
            return items;
        }
    }

    // Before
    private List<T> filterByKeywords(List<T> items) {
        List<T> result = new ArrayList<T>();
        final Matcher[] matchers = getMatchers();
        List<Property<T>> properties = getProperties();
        for (T item : items) {
            ITEM:
            // find any match of any pattern over any property
            for (Property<T> property : properties) {
                Object value = property.getPropertyValue(item);
                // brute force check for keywords
                for (Matcher matcher : matchers) {
                    matcher.reset(String.valueOf(value));
                    if (matcher.matches()) {
                        result.add(item);
                        break ITEM;
                    }
                }
            }
        }
        return result;
    }

    // After
    private Iterable<T> filterByKeywords(Iterable<T> items) {
        final Matcher[] matchers = getMatchers();
        final List<Property<T>> properties = getProperties();
        Predicate<T> filter = new Predicate<T>() {
            @Override
            public boolean apply(T item) {
                for (Property<T> property : properties) {
                    Object value = property.getPropertyValue(item);
                    // brute force check for keywords
                    for (Matcher matcher : matchers) {
                        matcher.reset(String.valueOf(value));
                        if (matcher.matches()) {
                            return true;
                        }
                    }
                }
                return false;
            }
        };
        // lazy view: the predicate is applied while iterating, nothing is copied
        return Iterables.filter(items, filter);
    }
}
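The same filter, sort, and page pipeline can be tried outside of GeoServer with a small sketch like the one below. PagingSketch and its page method are hypothetical names, and plain strings stand in for catalog items. It uses Iterables.skip and Iterables.limit rather than the Iterators.skip call shown above, on the assumption that the Iterables variants are available in whichever Guava version you have at hand.

```java
import com.google.common.base.Predicate;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Iterables;
import com.google.common.collect.Ordering;

import java.util.Comparator;
import java.util.List;

public class PagingSketch {

    /** Returns one page of the names containing the keyword, sorted with the comparator. */
    public static List<String> page(Iterable<String> names, final String keyword,
            Comparator<String> comparator, int first, int count) {

        // Lazy view: the predicate only runs while something iterates over it
        Iterable<String> filtered = Iterables.filter(names, new Predicate<String>() {
            @Override
            public boolean apply(String name) {
                return name.contains(keyword);
            }
        });

        // Sorting needs every element, so this is the one step that copies
        List<String> sorted = Ordering.from(comparator).sortedCopy(filtered);

        // Lazy views again; asking for a page past the end just yields nothing
        Iterable<String> paged = Iterables.limit(Iterables.skip(sorted, first), count);
        return ImmutableList.copyOf(paged);
    }
}
```

Compared with the subList approach, there is no index arithmetic to get wrong when the requested page runs past the end of the data.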
// Before: hand-written composite and lazy list adapters
class CatalogConfiguration implements org.geowebcache.config.Configuration {
    ...
    @Override
    public Iterable<GeoServerTileLayer> getLayers() {
        List<LayerGroupInfo> layerGroups = catalog.getLayerGroups();
        List<LayerInfo> layerInfos = catalog.getLayers();
        List[] sublists = { layerInfos, layerGroups };
        CompositeList composite = new CompositeList(sublists);
        LazyGeoServerTileLayerList tileLayers = new LazyGeoServerTileLayerList(composite, this);
        return tileLayers;
    }

    private static class CompositeList extends AbstractList<Object> {
        private final List<Object>[] decorated;

        @SuppressWarnings("unchecked")
        public CompositeList(List[] sublists) {
            this.decorated = sublists;
        }

        @Override
        public Object get(final int index) {
            int subIndex = index;
            List<Object> sublist;
            for (int i = 0; i < decorated.length; i++) {
                sublist = decorated[i];
                if (subIndex < sublist.size()) {
                    return sublist.get(subIndex);
                }
                subIndex -= sublist.size();
            }
            throw new IndexOutOfBoundsException();
        }

        @Override
        public int size() {
            int size = 0;
            List<Object> sublist;
            for (int i = 0; i < decorated.length; i++) {
                sublist = decorated[i];
                size += sublist.size();
            }
            return size;
        }
    }

    private static class LazyGeoServerTileLayerList extends AbstractList<GeoServerTileLayer> {
        private final List<Object> infos;
        private final CatalogConfiguration mediator;

        public LazyGeoServerTileLayerList(final List<Object> infos,
                final CatalogConfiguration catalogConfiguration) {
            this.infos = infos;
            this.mediator = catalogConfiguration;
        }

        @Override
        public GeoServerTileLayer get(int index) {
            Object object = infos.get(index);
            if (object instanceof LayerInfo) {
                return new GeoServerTileLayer(mediator, (LayerInfo) object);
            } else if (object instanceof LayerGroupInfo) {
                return new GeoServerTileLayer(mediator, (LayerGroupInfo) object);
            }
            throw new IllegalStateException();
        }

        @Override
        public int size() {
            return infos.size();
        }
    }
}
// After: the same lazy view, built from Iterables.transform and Iterables.concat
class CatalogConfiguration implements org.geowebcache.config.Configuration {
    ...
    @Override
    public Iterable<GeoServerTileLayer> getLayers() {
        Iterable<GeoServerTileLayer> layers = Iterables.transform(catalog.getLayers(),
                new Function<LayerInfo, GeoServerTileLayer>() {
                    @Override
                    public GeoServerTileLayer apply(LayerInfo layer) {
                        CatalogConfiguration mediator = CatalogConfiguration.this;
                        return new GeoServerTileLayer(mediator, layer);
                    }
                });
        Iterable<GeoServerTileLayer> layersGroups = Iterables.transform(catalog.getLayerGroups(),
                new Function<LayerGroupInfo, GeoServerTileLayer>() {
                    @Override
                    public GeoServerTileLayer apply(LayerGroupInfo layerGroup) {
                        CatalogConfiguration mediator = CatalogConfiguration.this;
                        return new GeoServerTileLayer(mediator, layerGroup);
                    }
                });
        return Iterables.concat(layers, layersGroups);
    }
}
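A stripped-down version of that transform-and-concat idea, runnable without the catalog, might look like the sketch below. TileLayer is a hypothetical stand-in for GeoServerTileLayer, plain name lists replace the catalog calls, and LazyViewSketch and tileLayers are names made up for the example.

```java
import com.google.common.base.Function;
import com.google.common.collect.Iterables;

import java.util.List;

public class LazyViewSketch {

    /** Hypothetical stand-in for GeoServerTileLayer. */
    public static class TileLayer {
        public final String name;

        public TileLayer(String name) {
            this.name = name;
        }
    }

    /** Presents layer names and group names as a single lazy sequence of TileLayer objects. */
    public static Iterable<TileLayer> tileLayers(List<String> layerNames,
            List<String> groupNames) {
        Function<String, TileLayer> toTileLayer = new Function<String, TileLayer>() {
            @Override
            public TileLayer apply(String name) {
                return new TileLayer(name);
            }
        };
        // Neither call copies anything: both return views over the backing lists
        return Iterables.concat(
                Iterables.transform(layerNames, toTileLayer),
                Iterables.transform(groupNames, toTileLayer));
    }
}
```

Because the result is a view, elements added to the backing lists later show up the next time it is iterated, and each iteration builds fresh TileLayer objects. That matches the behavior of the hand-written lazy list, but it is worth keeping in mind if the wrapper objects are expensive to create.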
Is Guava meant to replace Apache Commons? No. And maybe. There are lots of things that (IMHO) can be done better with Guava than with commons-collections. But Guava is way more than the collection utilities, and so is Apache commons-*. Each has utilities not present in the other, and there is some overlap. My personal preference is to use Guava from now on for all collection utility needs, as it’s more modern, well designed, faithfully respects the Java collection contracts, leverages immutability and code clarity, is under active development, and is well supported. But Apache Commons is going to be around for sure, as there is a lot more to Commons than collections.
Also, the point of this proposal is to present Guava to you and recommend that you take your own tour, not only of the collection utilities but also of the I/O, net, primitives, concurrent, and other packages.
Googling gives as usual thousands of links. Here are some of the ones that seemed more appealing to me:
- The "Guava Explained" wiki: http://code.google.com/p/guava-libraries/wiki/GuavaExplained
- Presentation slides focusing on base, primitives, and io: http://guava-libraries.googlecode.com/files/Guava_for_Netflix_.pdf
- Presentation slides focusing on cache: http://guava-libraries.googlecode.com/files/ConcurrentCachingAtGoogle.pdf
- Presentation slides focusing on util.concurrent: http://guava-libraries.googlecode.com/files/guava-concurrent-slides.pdf
- What are the big improvements between guava and apache equivalent libraries?: http://stackoverflow.com/questions/4542550/what-are-the-big-improvements-between-guava-and-apache-equivalent-libraries
- Writing more elegant comparison logic with Guava’s Ordering: http://www.polygenelubricants.com/2010/10/elegant-comparison-logic-with-guava.html
- Creating a fluent interface for Google Collections: http://codemonkeyism.com/creating-a-fluent-interface-for-google-collections/
- Google’s guava java: the easy parts: http://www.copperykeenclaws.com/googles-guava-java-the-easy-parts/
- Beautiful code with Google Collections, Guava and static imports: http://codemunchies.com/2009/10/beautiful-code-with-google-collections-guava-and-static-imports-part-1/
Jody Garnett:
“…Well I really like that set of capabilities; while it would represent an increased learning curve to work on GeoServer - it would be a win if we could remove a few more dependencies. We may need to duck back into GeoTools to make that happen; but that would perhaps not be a bad thing.”
“…We are welcome to peruse this library for GeoServer prior to that point. I also have some uDig code that used the earlier google collections library that I can fix up (and get some experience).
So you are getting two bits of feedback:
- Yes - but not for GeoTools until after 8.0
- A good trade if we cut down or out the other dependencies (coming from GeoTools)”
Justin Deoliveira:
“The guava library looks beautiful, no question there, and there is a lot of hype around it at the moment on all the java blogs. But as I mentioned before, and as jody mentioned i don’t love the idea just lumping on another utility library. Obviously it leads to much nicer code, and has some functionality we don’t have now but without a concrete problem it solves i don’t see that as justification enough alone. It is already enough of a maze trying to look up the right utility class to use when you have to do something, this will make it worse.
I would actually be more in favor of a lower level effort at the geotools level to replace commons with guava. Obviously though that is a larger effort and by no means meant to block the proposal”
Andrea Aime:
“I feel the same, but at the same time I’m worried the code will turn into COBOL pretty soon if we don’t do some effort to modernize it. The situation with scripting languages and the various “java successors” seems like a grand royal mess that is not going to give us a clear successor to Java anytime soon, so we better try to get onto more compact/modern code and try to prolongue the life of the code base as much as possible.
Of course once we adopt Guava we must make an effort to use it instead of commons wherever possible/makes sense to get some uniformity back.”
As the proposal aims to add a new set of utilities to the class path for progressive adoption, no backwards compatibility issues are foreseen.
Andrea Aime: +1
Alessio Fabiani: +1
Ben Caradoc Davies: +0
Gabriel Roldan: +1
Justin Deoliveira: +0
Jody Garnett: +1
Mark Leslie:
Rob Atkinson:
Simone Giannecchini: