Arcane

                                               
_____ _______   ____ _____    ____   ____      
\__  \\_  __ \_/ ___\\__  \  /    \_/ __ \     
 / __ \|  | \/\  \___ / __ \|   |  \  ___/     
(____  /__|    \___  >____  /___|  /\___  > /\ 
     \/            \/     \/     \/     \/  \/

Arcane is a powerful CLI tool designed to simplify and streamline the process of distributed training for machine learning models. It allows you to efficiently manage and monitor training tasks across multiple machines or GPUs.

Features

Distributed Training: Seamlessly distribute training tasks across multiple nodes.
Easy Configuration: Simple setup with configuration files and environment variables.
Resource Monitoring: Track resource usage and performance metrics.
Scalability: Easily scale your training tasks to handle large datasets and complex models.

Installation

To install Arcane, clone the repository and use the following command:

git clone https://github.com/yourusername/arcane.git
cd arcane
pip install .

Usage

Arcane provides a command-line interface for managing distributed training tasks. Here are some common commands:

Start Training: Begin a distributed training session.
```
arcane train --config path/to/config.yaml
```
Monitor Progress: Check the status of your training tasks.
```
arcane status
```
Stop Training: Terminate a running training session.
```
arcane stop
```

Configuration

Arcane uses a YAML configuration file to specify training parameters and machine roles. Here is an example configuration:

master:
  host: master-node
  port: 12345

workers:
  - host: worker-node-1
    port: 12346
  - host: worker-node-2
    port: 12347

training:
  model: resnet50
  dataset: /path/to/dataset
  epochs: 10
  batch_size: 32

Testing

To test Arcane across multiple machines:

Ensure all machines are on the same network and have SSH access.
Set up the environment and dependencies on each machine.
Use the sample configuration file to start a distributed training task.
Monitor logs and resource usage to ensure everything is functioning correctly.

Contributing

We welcome contributions! Please read our contributing guidelines for more details.

PS: This is still under construction

License

This project is licensed under the MIT License - see the LICENSE file for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Arcane

Features

Installation

Usage

Configuration

Testing

Contributing

License

Files

README.md

Latest commit

History

README.md

File metadata and controls

Arcane

Features

Installation

Usage

Configuration

Testing

Contributing

License