somethingnew2-0/go-erasure

go-erasure

Disclaimer: I recommend the klauspost/reedsolomon erasure coding library over this one as it is more performant and has better support for multiple architectures.

Go bindings for erasure coding (Reed-Solomon coding).

Erasure coding is similar to RAID-based parity encoding, but is more general and more powerful. When defining an erasure code, you specify two parameters, k and m. m is the total number of shards produced by encoding, and k is the number of shards it takes to recreate your original data. Hence k can be at most m, and in practice is strictly less (k = m would add no redundancy, making the encoding pointless). The real magic of erasure coding is the fact that ANY k of the m shards can recreate the original data. For example, an erasure coding scheme with k=8 and m=12 means any four of the encoded shards can be lost and the original data can still be reconstructed from the remaining eight valid shards.
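To make the "any k of m" property concrete, here is a minimal sketch of the simplest possible case: k data shards plus a single XOR parity shard, so m = k+1. This is plain RAID-style parity rather than Reed-Solomon, and it does not use this library; the helper names xorParity and recoverShard are made up for the illustration.

```go
package main

import "fmt"

// xorParity returns the byte-wise XOR of equally sized shards.
func xorParity(shards [][]byte) []byte {
	parity := make([]byte, len(shards[0]))
	for _, shard := range shards {
		for i, b := range shard {
			parity[i] ^= b
		}
	}
	return parity
}

// recoverShard rebuilds the single shard at index lost from the others.
// shards holds k data shards followed by one parity shard (m = k+1).
func recoverShard(shards [][]byte, lost int) []byte {
	survivors := make([][]byte, 0, len(shards)-1)
	for i, shard := range shards {
		if i != lost {
			survivors = append(survivors, shard)
		}
	}
	// XOR of all m shards is zero, so XOR of the survivors is the lost shard.
	return xorParity(survivors)
}

func main() {
	data := [][]byte{[]byte("abcd"), []byte("efgh"), []byte("ijkl")} // k = 3
	shards := append(data, xorParity(data))                          // m = 4

	// Losing any one shard (data or parity) still leaves k usable shards.
	for lost := range shards {
		rebuilt := recoverShard(shards, lost)
		fmt.Printf("lost shard %d, rebuilt correctly: %v\n",
			lost, string(rebuilt) == string(shards[lost]))
	}
}
```

Reed-Solomon coding generalizes this idea so that more than one shard can be lost, which is what allows schemes such as k=8, m=12.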

This library is aimed at simplicity and performance. It has only three methods, including a constructor, all of which are thread-safe! Internally it uses Cgo to call into a complex C library. For a more in-depth look at the underlying library, be sure to check out the Intel® Storage Acceleration Library and especially its corresponding video. One feature this library adds is an optimization for decoding. Since there are m choose k possible inverse matrices for decoding, this library caches them (via lazy-loading) so as to reduce the time spent decoding. It does so with a trie, where the sorted error list of shards is the key and the corresponding decode matrix is the value.
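The caching idea can be sketched roughly as follows. This is not the library's actual implementation: the types node and matrixCache, the get method, and the build callback are all hypothetical names for this illustration, the "matrix" is a placeholder byte slice, and the locking needed for thread safety is left out. For scale, with k=8 and m=12 there are 12 choose 8 = 495 possible decode matrices.

```go
package main

import (
	"fmt"
	"sort"
)

// node is one level of the trie; each edge is a lost shard index.
type node struct {
	children map[byte]*node
	matrix   []byte // cached decode matrix, nil until computed
}

// matrixCache is a hypothetical stand-in for an internal decode-matrix cache.
type matrixCache struct {
	root *node
}

func newMatrixCache() *matrixCache {
	return &matrixCache{root: &node{children: map[byte]*node{}}}
}

// get returns the matrix for errList, calling build only on a cache miss.
func (c *matrixCache) get(errList []byte, build func([]byte) []byte) []byte {
	// Sort a copy so that the same set of lost shards always maps to one key.
	key := append([]byte(nil), errList...)
	sort.Slice(key, func(i, j int) bool { return key[i] < key[j] })

	// Walk the trie, creating nodes lazily along the path.
	cur := c.root
	for _, idx := range key {
		next, ok := cur.children[idx]
		if !ok {
			next = &node{children: map[byte]*node{}}
			cur.children[idx] = next
		}
		cur = next
	}
	if cur.matrix == nil {
		cur.matrix = build(key)
	}
	return cur.matrix
}

func main() {
	builds := 0
	build := func(key []byte) []byte {
		builds++
		return append([]byte(nil), key...) // placeholder for a real matrix inversion
	}

	cache := newMatrixCache()
	cache.get([]byte{0, 2, 3, 4}, build)
	cache.get([]byte{4, 3, 2, 0}, build) // same set, different order: cache hit
	cache.get([]byte{1, 5}, build)
	fmt.Println("matrices built:", builds) // prints "matrices built: 2"
}
```

Keying on the sorted list means error lists that share a prefix of lost shards also share trie nodes, and lookup cost depends only on the number of lost shards.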

I hope you find it useful and pull requests are welcome!

Usage

See the GoDoc for an API reference.

Encode and decode random data

package main

import (
  "bytes"
  "log"
  "math/rand"
  
  "github.com/somethingnew2-0/go-erasure"
)

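// corrupt returns a copy of source in which every shard listed in errList
// has been zeroed out, simulating lost shards.
func corrupt(source, errList []byte, shardLength int) []byte {
	corrupted := make([]byte, len(source))
	copy(corrupted, source)
	for _, shard := range errList {
		for i := 0; i < shardLength; i++ {
			corrupted[int(shard)*shardLength+i] = 0x00
		}
	}
	return corrupted
}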

func main() {
	m := 12                 // Total number of shards
	k := 8                  // Number of shards needed to recover the data
	shardLength := 16       // Length of a shard
	size := k * shardLength // Length of the data blob to encode

	code := erasure.NewCode(m, k, size)

	// Fill the source blob with random bytes.
	source := make([]byte, size)
	for i := range source {
		source[i] = byte(rand.Intn(256))
	}

	encoded := code.Encode(source)

	// Indices of the shards to lose; at most m-k = 4 can be lost.
	errList := []byte{0, 2, 3, 4}

	// Zero out the lost shards in the combined source and encoded buffer.
	corrupted := corrupt(append(source, encoded...), errList, shardLength)

	recovered := code.Decode(corrupted, errList, true)

	if !bytes.Equal(source, recovered) {
		log.Fatal("Source was not successfully recovered with 4 errors")
	}
}
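The example above treats the combined buffer as consecutive shards of shardLength bytes each, with the error list holding shard indices into that buffer. A small sketch of that indexing, independent of the library, might look like this; splitShards is a hypothetical helper, and the 12-shard buffer size simply mirrors the m=12 example.

```go
package main

import "fmt"

// splitShards slices buf into consecutive views of shardLength bytes each.
// The returned shards alias buf, so writing to a shard modifies buf.
func splitShards(buf []byte, shardLength int) [][]byte {
	shards := make([][]byte, 0, len(buf)/shardLength)
	for off := 0; off+shardLength <= len(buf); off += shardLength {
		shards = append(shards, buf[off:off+shardLength])
	}
	return shards
}

func main() {
	m, shardLength := 12, 16
	buf := make([]byte, m*shardLength)
	for i := range buf {
		buf[i] = 0xff
	}

	shards := splitShards(buf, shardLength)
	fmt.Println("shards:", len(shards)) // prints "shards: 12"

	// Zero out shard 2, i.e. bytes 32..47 of the buffer.
	for i := range shards[2] {
		shards[2][i] = 0x00
	}
	fmt.Println(buf[31], buf[32], buf[47], buf[48]) // prints "255 0 0 255"
}
```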

Development

To start, run source dev.sh (or more simply . dev.sh) to set up the git hooks and GOPATH for this project.

Run go test to run the unit tests, or go test -bench . to run the benchmarks as well.
