Merge pull request #3 from chengchingwen/main
Enhancement
ToucheSir authored Apr 20, 2024
2 parents d0568a8 + 26165d2 commit 35235fe
Showing 8 changed files with 564 additions and 110 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/CI.yml
@@ -24,10 +24,12 @@ jobs:
- 'nightly'
os:
- ubuntu-latest
- macOS-latest
- windows-latest
arch:
- x64
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: julia-actions/setup-julia@v1
with:
version: ${{ matrix.version }}
7 changes: 7 additions & 0 deletions Project.toml
@@ -4,10 +4,17 @@ authors = ["pevnak <[email protected]> and contributors"]
version = "1.0.0"

[deps]
BFloat16s = "ab4f0b2a-ad5b-11e8-123f-65d77653426b"
DLFP8Types = "f4c16678-4a16-415b-82ef-ed337c5d6c7c"
JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
MappedArrays = "dbb5928d-eab1-5f90-85c2-b9b0edb7c900"
Mmap = "a63ad114-7e13-5084-954f-fe012c677804"

[compat]
BFloat16s = "0.5"
DLFP8Types = "0.1"
JSON3 = "1"
MappedArrays = "0.4"
julia = "1.6"

[extras]
76 changes: 68 additions & 8 deletions README.md
@@ -3,18 +3,14 @@

[![Build Status](https://github.com/FluxML/SafeTensors.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/FluxML/SafeTensors.jl/actions/workflows/CI.yml?query=branch%3Amain)

This package loads data stored in the [safetensors format](https://huggingface.co/docs/safetensors/index).
Since Python is row-major and Julia is column-major, the dimensions are permuted such that the tensor has the same shape as in Python and every element sits at the same index. This incurs a performance penalty, in the sense that loading cannot be completely copy-free.
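
The permutation described above can be illustrated with Base Julia alone. This is a minimal sketch, not the package's own code: the names `buf`, `colmajor`, `lazy`, and `dense` are made up for illustration, and the buffer stands in for a 3×5 tensor written in row-major order.

```julia
# Row-major contents of a 3×5 matrix, in the order Python would store them:
# 0, 1, ..., 14.
buf = Float32.(0:14)

# Julia is column-major, so the same memory is first viewed with the
# dimensions reversed (5×3). reshape shares memory with buf.
colmajor = reshape(buf, 5, 3)

# Swapping the axes lazily recovers the 3×5 shape without copying.
lazy = PermutedDimsArray(colmajor, (2, 1))

# Materialising a plain Array is the step that copies the data.
dense = permutedims(colmajor, (2, 1))

size(lazy)      # (3, 5)
lazy[2, 1]      # 5.0f0, the first element of the second row
dense == lazy   # true
```

The lazy wrapper keeps the file's memory order and only changes indexing, while `permutedims` produces a contiguous column-major `Matrix` at the cost of one copy.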

The list of dependencies is kept minimal to `JSON3` for parsing the header.

The package does not allow to save the data.

The main function is `load_safetensors`, which returns a `Dict{String,V}` whose keys are tensor names and whose values are the tensors. An example from `runtests` follows:
```julia
julia> using SafeTensors

julia> d = load_safetensors("model.safetensors")
julia> d = load_safetensors("test/model.safetensors")
Dict{String, Array} with 27 entries:
"int32_357" => Int32[0 7 21 28; 35 42 56 63; 70 77 91 98;;; 1 8 22 29
"uint8_3" => UInt8[0x00, 0x01, 0x02]
@@ -45,9 +41,73 @@ Dict{String, Array} with 27 entries:
"float64_3" => [0.0, 1.0, 2.0]
```
It is also possible to load just the header using the unexported `load_header`.
Tensors can also be loaded lazily with `SafeTensors.deserialize("model.safetensors")`, which `mmap`s the file and returns a `Dict`-like object:
```julia
julia> d = SafeTensors.load_header("model.safetensors")
julia> tensors = SafeTensors.deserialize("test/model.safetensors"; mmap = true #= default to `true`=#);

julia> tensors["float32_35"]
3×5 mappedarray(ltoh, PermutedDimsArray(reshape(reinterpret(Float32, view(::Vector{UInt8}, 0x0000000000000ef5:0x0000000000000f30)), 5, 3), (2, 1))) with eltype Float32:
0.0 1.0 2.0 3.0 4.0
5.0 6.0 7.0 8.0 9.0
10.0 11.0 12.0 13.0 14.0
```
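
For context on what reading just the header involves: the published safetensors layout is an 8-byte little-endian unsigned integer giving the header size, followed by that many bytes of JSON, followed by the raw tensor data. The sketch below follows that layout using Base Julia only. `read_header_json` is a hypothetical helper written for this illustration, not a function of SafeTensors.jl, and it returns the JSON text without parsing it.

```julia
# Hypothetical helper for illustration; not part of SafeTensors.jl.
function read_header_json(path::AbstractString)
    open(path, "r") do io
        n = ltoh(read(io, UInt64))    # first 8 bytes: header size, little-endian
        String(read(io, Int(n)))      # next n bytes: the JSON header as text
    end
end

# Build a tiny stand-in file: size prefix, JSON header, then the tensor bytes.
header = """{"x":{"dtype":"U8","shape":[3],"data_offsets":[0,3]}}"""
path = tempname()
open(path, "w") do io
    write(io, htol(UInt64(ncodeunits(header))))
    write(io, header)
    write(io, UInt8[0x00, 0x01, 0x02])
end

read_header_json(path) == header   # true
```

Because the header is read on its own, tensor names, dtypes, and shapes can be inspected without touching the data section of the file.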
Serialization is also supported:
```julia
julia> using Random, BFloat16s

julia> weights = Dict("W"=>randn(BFloat16, 3, 5), "b"=>rand(BFloat16, 3))
Dict{String, Array{BFloat16}} with 2 entries:
"W" => [0.617188 0.695312 0.390625 -2.0; -0.65625 -0.617188 0.652344 0.244141; 0.226562 2.70312 -0.174805 -0.7773
"b" => [0.111816, 0.566406, 0.283203]

julia> f = tempname();

julia> SafeTensors.serialize(f, weights)

julia> loaded = SafeTensors.deserialize(f);

julia> loaded["W"] ≈ weights["W"]
true

julia> SafeTensors.serialize(f, weights, Dict("Package"=>"SafeTensors.jl", "version"=>"1"))

julia> loaded = SafeTensors.deserialize(f);

julia> loaded.metadata
Dict{String, String} with 2 entries:
"Package" => "SafeTensors.jl"
"version" => "1"
```
Working with a GPU:
```julia
julia> loaded["W"]
3×5 mappedarray(ltoh, PermutedDimsArray(reshape(reinterpret(BFloat16, view(::Vector{UInt8}, 0x00000000000000b9:0x00000000000000d6)), 5, 3), (2, 1))) with eltype BFloat16:
0.542969 0.201172 1.38281 -0.255859 -1.55469
0.172852 -0.949219 0.0561523 -1.34375 -0.206055
-0.0854492 1.17969 -0.265625 -0.871094 2.25

julia> using CUDA; CUDA.allowscalar(false)

julia> CuArray(loaded["W"])
3×5 CuArray{BFloat16, 2, CUDA.Mem.DeviceBuffer}:
0.542969 0.201172 1.38281 -0.255859 -1.55469
0.172852 -0.949219 0.0561523 -1.34375 -0.206055
-0.0854492 1.17969 -0.265625 -0.871094 2.25

julia> gpu_weights = Dict("W"=>CuArray(loaded["W"]), "b"=>CuArray(loaded["b"]))
Dict{String, CuArray{BFloat16, N, CUDA.Mem.DeviceBuffer} where N} with 2 entries:
"W" => [0.542969 0.201172 -0.255859 -1.55469; 0.172852 -0.949219 -1.34375 -0.206055; -0.0854492 1.17969 -0.871094
"b" => BFloat16[0.871094, 0.773438, 0.703125]

julia> f = tempname();

julia> SafeTensors.serialize(f, gpu_weights)

julia> SafeTensors.deserialize(f)
SafeTensors.SafeTensor{SubArray{UInt8, 1, Vector{UInt8}, Tuple{UnitRange{UInt64}}, true}} with 2 entries:
"W" => BFloat16[0.542969 0.201172 -0.255859 -1.55469; 0.172852 -0.949219 -1.34375 -0.206055; -0.0854492 1.17969 -
"b" => BFloat16[0.871094, 0.773438, 0.703125]
```