Releases: awslabs/multi-model-server

v1.1.1 - Resource cleanup for terminated worker threads

19 May 00:54

This release contains minor fixes that ensure resources are cleaned up for terminated worker threads.

  • Terminates the STDOUT and STDERR ReaderThreads for a Worker when it is scaled down

v1.0.8 - Synchronous resource cleaning and API changes

05 Nov 00:35

This release contains API changes and fixes to make sure resource cleaning is handled synchronously.

  • The load model API now returns a conflict response instead of a bad request response when a client tries to register an already registered model. #851
  • The unregister model API is now synchronous: it waits until all resources are cleaned up before sending a response. A configurable timeout was also added for users who do not want to wait. #853
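
A client that registers models may want to treat the conflict response differently from a genuine bad request. The following is a minimal sketch, assuming that "conflict" means HTTP 409 and "bad request" means HTTP 400 (the standard codes for those names). The function name and the returned labels are hypothetical and not part of MMS.

```python
from http import HTTPStatus


def classify_register_status(status_code: int) -> str:
    """Map the status code of a load-model call to a client-side action.

    Assumption: "conflict" is HTTP 409 and "bad request" is HTTP 400.
    The action labels returned here are illustrative only.
    """
    if 200 <= status_code < 300:
        return "registered"
    if status_code == HTTPStatus.CONFLICT:
        # The model is already registered; a client can usually continue.
        return "already-registered"
    if status_code == HTTPStatus.BAD_REQUEST:
        # The request itself is malformed; retrying unchanged will not help.
        return "invalid-request"
    return "error"
```

With this split, a deployment script can ignore "already-registered" on a re-run while still failing fast on "invalid-request".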

v1.0.7 - Bug fixes to support python 2 better

10 Sep 14:44

This release contains a minor bug fix for Python 2 support.

  • Changed the Python protocol handler between the frontend and backend to better support Python 2.

v1.0.6 - Features to handle OOM errors and enhancements to configurability of MMS

02 Sep 16:28
  • The load model API accepts JSON requests. #818
  • Implementation of the Ping API using the plugins SDK. #814
  • New endpoint for predictions: POST /models/{model-id}/invoke. #823
  • Handling of OOM errors. MMS returns an HTTP 507 error code when an OOM error occurs at runtime. #822
  • Added changes to allow MMS to use the same address for Management and Inference. #826
  • Changes to MMS default behavior. By default, MMS now runs POST /models synchronously, and if default_workers_per_model is set, that value is used when loading models. #836
  • MMS configuration values can take environment variables. #841
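
The new prediction endpoint and the 507 status can be exercised from a client along these lines. This is a sketch only: the base URL, the model name, and the JSON payload format are assumptions for illustration (the payload a model accepts depends on its service code), and the helper names are hypothetical. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

# Per the note above, MMS reports an out-of-memory condition with HTTP 507.
OOM_STATUS = 507


def build_invoke_request(base_url: str, model_id: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /models/{model-id}/invoke request.

    Assumption: the model accepts a JSON body; adjust the content type
    and encoding to whatever the model's service code expects.
    """
    quoted_id = urllib.parse.quote(model_id, safe="")
    url = f"{base_url.rstrip('/')}/models/{quoted_id}/invoke"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_out_of_memory(status_code: int) -> bool:
    """Return True when a response status signals an OOM error."""
    return status_code == OOM_STATUS
```

A caller would pass the request to urllib.request.urlopen and check the status of any urllib.error.HTTPError with is_out_of_memory before deciding whether to retry with a smaller input or scale down workers.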

v1.0.5 - Model Server support for plugins

26 Jun 04:16

This release contains multiple model server changes.

Major features

  1. Plugins support
    1. SDK for plugins
    2. Reference plugins implementation
    3. MMS changes to support plugins
  2. Support for configuring a default service file.
  3. Support for returning custom HTTP headers from the model.

Minor features

  1. Option to run MMS in the foreground
    ....
    And multiple bug fixes

v1.0.4 - Contains model-archiver features, integration test framework and bug fixes

16 May 03:18

This release contains multiple model-archiver features and bug fixes.

Features

  1. Added support for "no-archive"
  2. Added optional conversion of an ONNX model to an MXNet model.
  3. Added integration test framework for model-archiver.

v1.0.3 - Base MMS containers available

11 Apr 16:28

Features and Bug Fixes

  • Published base MMS containers for Python 2.7 and Python 3.6 on Ubuntu 16.04, and on nvidia/cuda 9.2 with cuDNN 7 on Ubuntu 16.04.
  • model-archiver changes to handle multiple archive formats
  • model-server configurable through environment variables
  • Contains multiple bug fixes

v1.0.2 - Multiple features and bug fixes

11 Mar 23:06

This release addresses all reported bugs and adds the following enhancements.

Features and Major Bug Fixes

  • The frontend can listen on a Unix domain socket.
  • Support for asynchronous logging.
  • Added documentation for batching support.
  • Added support for:
    • Starting a default number of workers for models that are launched at MMS startup.
    • A configurable response timeout for individual models. This is the amount of time MMS waits for the model to respond to a request.
    • Configuring the maximum allowable request and response sizes.
    • Changes for new container images.
    • Passing all HTTP headers to the backend worker.
  • Added ShuffleNet to the model server model zoo.
  • Added an example that brings a Sockeye model onto MMS.
    ... And bug fixes

v1.0.1 - Apache Model Server for MXNet adds minor features and addresses bugs

12 Dec 01:20

In this release of MXNet Model Server, we have added the following features.

Features and Bug fixes

  • Changes for batching support.
  • CORS headers support added to responses.
  • Handle the content type returned by the backend code and pass ContentType to the service code.
  • Worked around a timeout when importing the mxnet module. MXNet startup time no longer causes a significant delay when MMS starts on compute-optimized hosts.
  • Make sure that Python prints are not buffered.
  • Refactored the metrics emission logic.
  • Always use UTF-8 to decode bytes.
  • Avoid archiving a model archive file recursively.
  • Fixed PYTHONPATH issues for MMS.
  • Documentation updates
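
Two of the items above, UTF-8 decoding and unbuffered prints, correspond to common Python idioms. The sketch below illustrates those idioms in isolation; it is not the MMS implementation, and both function names are hypothetical.

```python
def decode_payload(raw: bytes) -> str:
    """Decode bytes as UTF-8 explicitly.

    Naming the codec keeps the decoding step independent of the
    platform's default encoding.
    """
    return raw.decode("utf-8")


def log_line(message: str, stream) -> None:
    """Write one line and flush it immediately.

    flush=True pushes the text to the stream right away, so a process
    reading the worker's output does not wait for a buffer to fill.
    """
    print(message, file=stream, flush=True)
```

Running the interpreter with the -u option or setting PYTHONUNBUFFERED has a similar effect on standard output without changing each print call.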

Apache Model Server for MXNet adds support for hot loading of models

30 Oct 17:27

In this release of MXNet Model Server, we have added the following major features.

Features

  1. Loading and unloading models at runtime (hot loading). This is now available through the management REST API exposed by MMS; see the management API documentation for details.
  2. Independent scaling of the number of model-worker instances serving inference requests, also available through the management REST API.
  3. Improved model archive representation; see the model-archiver documentation for details.
  4. Improved Docker container images.
  5. Improved performance compared to MMS v0.4 and fewer dependencies. A major change is the replacement of the monolithic architecture with a separate frontend and backend: Netty serves as the frontend web server instead of the Flask + Gunicorn combination, and Python is used for the backend.
  6. Improved logging and metrics collection, using log4j and its configuration to control metrics, including custom user metrics; see the logging configuration documentation for details.
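
Hot loading and worker scaling through the management REST API might be driven from a client as sketched below. The endpoint paths and the url and min_worker query parameters are assumptions to be checked against the management API document, and the base URL, model names, and helper names are hypothetical. Each helper builds a request without sending it.

```python
import urllib.parse
import urllib.request


def _model_path(mgmt_url: str, model_name: str) -> str:
    """Return the assumed /models/{name} URL for a registered model."""
    name = urllib.parse.quote(model_name, safe="")
    return f"{mgmt_url.rstrip('/')}/models/{name}"


def register_model(mgmt_url: str, model_url: str) -> urllib.request.Request:
    """Build a request that loads a model archive at runtime.

    Assumption: POST /models with a "url" query parameter.
    """
    query = urllib.parse.urlencode({"url": model_url})
    return urllib.request.Request(f"{mgmt_url.rstrip('/')}/models?{query}", method="POST")


def scale_workers(mgmt_url: str, model_name: str, min_worker: int) -> urllib.request.Request:
    """Build a request that changes the number of workers for one model.

    Assumption: PUT /models/{name} with a "min_worker" query parameter.
    """
    query = urllib.parse.urlencode({"min_worker": min_worker})
    return urllib.request.Request(f"{_model_path(mgmt_url, model_name)}?{query}", method="PUT")


def unregister_model(mgmt_url: str, model_name: str) -> urllib.request.Request:
    """Build a request that unloads a model (assumed DELETE /models/{name})."""
    return urllib.request.Request(_model_path(mgmt_url, model_name), method="DELETE")
```

Passing any of these requests to urllib.request.urlopen sends it. Keeping the builders separate from the network call makes the URLs easy to inspect before pointing them at a running server.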

New and updated documents:

  1. Migration document to migrate from MMS 0.4 to MMS 1.0.
  2. New Management API.
  3. Updated model zoo.
  4. Updated Inference API.

For further documentation, please refer to the /docs folder.

Bug fixes:

This release fixes all the bugs logged on GitHub.