NEPTUNE is a comprehensive framework, developed at the Politecnico di Milano, for the runtime management of large-scale edge applications. It exploits placement, routing, network delays, and CPU/GPU interplay in a coordinated way to allow the concurrent execution of edge applications that meet user-set response times.

NEPTUNE runs on top of Kubernetes, or any Kubernetes-compatible orchestrator, as a set of Kubernetes Controllers and Custom Resources.
- **System Controller**: the `SystemController` splits large edge topologies into smaller, closely located, and independent communities. Communities are found using an external function that, given a set of nodes and the latencies between all nodes, returns a set of communities and the set of nodes assigned to each community. The base implementation of the `SystemController` adopts SLPA (Speaker-Listener Label Propagation Algorithm, Xie et al., 2019). The implementation of the SLPA algorithm can be found here. The `SystemController` takes `CommunityConfiguration` Custom Resources as input and splits the network of nodes according to them. A network of nodes is split into communities by assigning to each node a label that denotes which community the node belongs to (see the sketch after this list).
- **Community Controller**: the `CommunityController` schedules functions on nodes and provides routing policies for smart request routing. As in the `SystemController`, function schedules and routing policies are found using an external function that requires information about the topology and the set of functions to be deployed on top of the nodes. The `CommunityController` relies on solving a two-step mixed-integer linear programming problem. The solver can be found here. The `CommunityController` is composed of two control loops: the first one updates `CommunitySchedule` using the two-step mixed-integer linear program, and the second one tries to reconcile the actual function schedule with the one specified in `CommunitySchedule`.
- **Node Controller**: the `NodeController` vertically scales functions in order to meet user-set response times. The node controller has been implemented in another project, named Kosmos. The project can be found here.
- **Request Dispatcher**: the `RequestDispatcher` routes requests to function instances according to the routing policies specified in `CommunitySchedule`. The request dispatcher works only with HTTP and HTTPS messages.
- **Monitoring**: the `Monitoring` component monitors and collects information about resource consumption and network performance inside the cluster.
- **Metrics Database**: the `MetricsDatabase` is a SQL database that stores the collected data. TimescaleDB was chosen, but since it exposes a PostgreSQL-compatible interface, standard PostgreSQL can also be used.
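
Community membership is materialized as a node label, as described for the `SystemController` above. Purely as an illustrative sketch, and assuming a hypothetical label key (the actual key and value format used by the controller may differ), a node assigned to a community could look like this:

```yaml
# Hypothetical example: a node labeled with the community it belongs to.
# The label key below is an assumption; the SystemController may use a different one.
apiVersion: v1
kind: Node
metadata:
  name: edge-node-1
  labels:
    edgeautoscaler.polimi.it/community: community-0
```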
- **Community Configuration**: a `CommunityConfiguration` defines the properties of each community inside the cluster. An example is:
  ```yaml
  apiVersion: edgeautoscaler.polimi.it/v1alpha1
  kind: CommunityConfiguration
  metadata:
    name: example-cc
    namespace: openfaas-fn
  spec:
    community-size: 3
    maximum-delay: 100
    probability-threshold: 20
    iterations: 20
    slpa-service: slpa.default.svc.cluster.local:4567
  status:
    generated-communities: []
  ```
- **Community Schedule**: a `CommunitySchedule` defines the function placement and routing policies (a hypothetical example is sketched below).
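
The exact schema of `CommunitySchedule` is defined by the project; the snippet below is only a sketch of the idea, and the field names under `spec` are assumptions rather than the real API. It illustrates how a schedule could pair a function-to-node placement (computed by the `CommunityController`) with per-node routing weights that the `RequestDispatcher` follows:

```yaml
# Hypothetical sketch of a CommunitySchedule: the fields under spec are
# illustrative assumptions, not the actual schema of the Custom Resource.
apiVersion: edgeautoscaler.polimi.it/v1alpha1
kind: CommunitySchedule
metadata:
  name: example-cs
  namespace: openfaas-fn
spec:
  # which nodes should host an instance of each function
  allocations:
    prime-numbers:
      node-1: true
      node-2: false
  # fraction of the traffic that each source node sends to each instance
  routing-rules:
    node-1:
      prime-numbers:
        node-1: 0.8
        node-2: 0.2
```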
NEPTUNE is a complex framework that requires a modified version of K3s for in-place vertical autoscaling.
A Terraform repository to deploy a modified K3s cluster on the AWS Cloud can be found here.
Otherwise, it is necessary to set up a K3s distribution that integrates KEP 1287 (link).
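
For context, KEP 1287 is the upstream Kubernetes enhancement for in-place update of pod resources, i.e., resizing container CPU and memory without restarting the pod. In recent upstream Kubernetes versions it is exposed, behind the `InPlacePodVerticalScaling` feature gate, as a `resizePolicy` field on containers. The snippet below is only a reference sketch of that upstream API, not a NEPTUNE-specific configuration:

```yaml
# Reference sketch of the upstream KEP 1287 API (InPlacePodVerticalScaling):
# containers declare that CPU/memory can be resized without a container restart.
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo
spec:
  containers:
  - name: app
    image: nginx:1.25
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired
    resources:
      requests:
        cpu: 250m
        memory: 128Mi
```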
To deploy a function, it is sufficient to deploy an OpenFaaS Function Custom Resource. An example is:
```yaml
apiVersion: openfaas.com/v1
kind: Function
metadata:
  name: prime-numbers
  namespace: openfaas-fn
spec:
  image: systemautoscaler/prime-numbers:0.1.0
  labels:
    com.openfaas.scale.factor: "20"
    com.openfaas.scale.max: "100"
    com.openfaas.scale.min: "1"
    com.openfaas.scale.zero: "false"
    edgeautoscaler.polimi.it/scheduler: edge-autoscaler
  name: prime-numbers
  readOnlyRootFilesystem: false
  requests:
    memory: 1M
```
If you are using `kubectl`, you can run:

```sh
kubectl apply -f {function_configuration_file}.yaml
```
Since the API code generator used in `hack/update-codegen.sh` was not designed to work with Go modules, it is mandatory to recreate the entire module path in order to make the code generation work.
This gives you two options:
- Create the folder `github.com/deib-polimi` in any location of your filesystem and clone this repository inside it.
- Clone the repository inside the `GOPATH` directory.

Either way, the module hierarchy must be preserved.
You can find a presentation of this work on YouTube.
If you use this code as part of your project or paper, please cite the following work:
```bibtex
@inproceedings{seams2022,
  author    = {Baresi, Luciano and Hu, Davide Yi Xian and Quattrocchi, Giovanni and Terracciano, Luca},
  title     = {NEPTUNE: Network- and GPU-Aware Management of Serverless Functions at the Edge},
  year      = {2022},
  doi       = {10.1145/3524844.3528051},
  booktitle = {Proceedings of the 17th Symposium on Software Engineering for Adaptive and Self-Managing Systems},
  pages     = {144--155},
}
```