Merge pull request #665 from olynch/comptime-refactor
Refactor ACSets to use CompTime.jl
epatters authored Sep 15, 2022
2 parents 39637a1 + 63efe26 commit 17b6965
Showing 13 changed files with 1,308 additions and 824 deletions.
2 changes: 2 additions & 0 deletions Project.toml
@@ -6,6 +6,7 @@ version = "0.14.4"

[deps]
Colors = "5ae59095-9a9b-59fe-a467-6f913c188581"
CompTime = "0fb5dd42-039a-4ca4-a1d7-89a96eae6d39"
Compose = "a81c6b42-2e10-5240-aca2-a61377ecd94b"
DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
GeneralizedGenerated = "6b9d7cbe-bcb9-11e9-073f-15a7a543e2eb"
@@ -27,6 +28,7 @@ Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"

[compat]
Colors = "0.12"
CompTime = "0.1"
Compose = "0.7, 0.8, 0.9"
DataStructures = "0.17, 0.18"
GeneralizedGenerated = "0.2, 0.3"
37 changes: 31 additions & 6 deletions docs/src/apis/categorical_algebra.md
@@ -58,7 +58,7 @@ An acset $$F$$ on a schema consists of...

For those with a categorical background, an acset on a schema $$S$$ consists of a functor from $$S$$ to $$\mathsf{Set}$$, such that objects in $$S^{-1}(0)$$ map to finite sets, and objects in $$S^{-1}(1)$$ map to sets that represent types. For any particular functor $$K \colon S^{-1}(1) \to \mathsf{Set}$$, we can also take the category of acsets that restrict to this map on $$S^{-1}(1)$$.

We can also add relations to this presentation, but we currently do nothing with those relations in the implementation; they mostly serve as documentation.
We can also add equations to this presentation, but we currently do nothing with those equations in the implementation; they mostly serve as documentation.

We will now give an example of how this all works in practice.

@@ -92,22 +92,47 @@ end

### API

We first give an overview of the data types used in the acset machinery.
The mathematical abstraction of an acset can of course be implemented in many different ways. Currently, there are three implementations of acsets in Catlab, which share a great deal of code.

`FreeSchema` is a finite presentation of a category that will be used as the schema of a database in the *algebraic databases* conception of categorical database theory. Functors out of a schema into FinSet are combinatorial structures over the schema. Attributes in a schema allow you to encode numerical data (or data of any Julia type) into the database. You can find several examples of schemas in `Catlab.Graphs`, where they define categorical versions of graph theory.
These implementations can be split into two categories.

`CSet/AttributedCSet` is a struct, with constructors, whose values (tables, indices) are parameterized by a CatDesc/AttrDesc. These are in-memory databases over the schema, equipped with `ACSetTransformation`s as natural transformations that encode relationships between database instances.
The first category is **static acset types**. In this implementation, different schemas correspond to different Julia types. Methods on these Julia types are then custom-generated for the schema, using [CompTime.jl](https://github.com/AlgebraicJulia/CompTime.jl).
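
As a rough illustration of what "custom-generated for the schema" can mean, here is a minimal sketch in plain Julia. It does not use CompTime.jl or any Catlab API; `StaticRecord` and `total` are hypothetical names invented for this example. The field names live in a type parameter (Julia accepts symbols, bits values, and tuples of them there), and a `@generated` function builds an unrolled expression once per concrete type.

```julia
# Hypothetical toy type, not part of Catlab: the "schema" is the type parameter `Fields`.
struct StaticRecord{Fields}
  values::Dict{Symbol,Int}
end

# The generator body runs once per concrete `StaticRecord{Fields}` type, where
# `Fields` is known, and returns an unrolled expression such as
# `(0 + r.values[:a]) + r.values[:b]`.
@generated function total(r::StaticRecord{Fields}) where {Fields}
  ex = :(0)
  for f in Fields
    ex = :($ex + r.values[$(QuoteNode(f))])
  end
  return ex
end

# Only the fields named in the type parameter are summed.
r = StaticRecord{(:a, :b)}(Dict(:a => 1, :b => 2, :ignored => 10))
println(total(r))
```

Because the field names are part of the type, `StaticRecord{(:a, :b)}` and `StaticRecord{(:a,)}` are distinct types, and each gets its own specialized method body.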

`CSetType/AttributedCSetType` provides a function to construct a Julia type for ACSet instances, parameterized by CatDesc/AttrDesc. This function constructs the new type at runtime. In order to have the interactive nature of Julia, and to dynamically construct schemas based on runtime values, we need to define new Julia types at runtime. This function converts the schema spec to the corresponding Julia type.
Under this category, there are two classes of static acset types. The first class is acset types that are generated using the `@acset_type` macro. These acset types are custom-derived structs. The advantage of this is that the structs have names like `Graph` or `WiringDiagram` that are printed out in error messages. The disadvantage is that if you are taking in schemas at runtime, you have to `eval` code in order to use them.

`CatDesc/AttrDesc` is the encoding of a schema into a Julia type. These exist because Julia only allows certain kinds of data in the parameter of a dependent type. Thus, we have to serialize a schema into those primitive data types so that we can use them to parameterize the ACSet type over the schema. This is an implementation detail subject to complete overhaul.
Here is an example of using `@acset_type`:

```julia
@acset_type WeightedGraph(SchWeightedGraph, index=[:src,:tgt])
g = WeightedGraph()
```

The second class is `AnonACSet`s. Like acset types derived from `@acset_type`, these contain the schema in their type. However, they also contain the type of their fields in their types, so the types printed out in error messages are long and ugly. The advantage of these is that they can be used in situations where the schema is passed in at runtime, and they don't require using `eval` to create a new acset type.

Here is an example of using `AnonACSet`:

```julia
const WeightedGraph = AnonACSetType(SchWeightedGraph, index=[:src,:tgt])
g = WeightedGraph()
```

The second category is **dynamic acset types**. Currently, there is just one type that falls under this category: `DynamicACSet`. This type has a **field** for the schema, and no code-generation is done for operations on acsets of this type. This means that if the schema is large compared to the data, this type will often be faster than the static acsets.

However, dynamic acsets are a new addition to Catlab, and much of the machinery of limits, colimits, and other high-level acset constructions assumes that the schema of an acset can be derived from the type. Thus, more work will have to be done before dynamic acsets become a drop-in replacement for static acsets.
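
To make the idea of a schema stored in a field concrete, the following sketch in plain Julia keeps the schema as ordinary runtime data instead of a type parameter. `DynamicRecord` and `total` are hypothetical names for illustration only; this is not how Catlab's `DynamicACSet` is implemented.

```julia
# Hypothetical toy type, not Catlab's DynamicACSet: the "schema" is runtime data.
struct DynamicRecord
  fields::Vector{Symbol}
  values::Dict{Symbol,Int}
end

# One generic method serves every schema; it loops over the field list at runtime.
function total(r::DynamicRecord)
  s = 0
  for f in r.fields
    s += r.values[f]
  end
  return s
end

a = DynamicRecord([:x], Dict(:x => 4))
b = DynamicRecord([:x, :y], Dict(:x => 1, :y => 2))

# Different schemas share one Julia type, so no per-schema code is compiled.
println(typeof(a) === typeof(b))
println(total(a), " ", total(b))
```

The trade-off in this toy version mirrors the one described above: nothing is specialized per schema, but the schema cannot be recovered from the type alone.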

Here is an example of using a dynamic acset:

```julia
g = DynamicACSet("WeightedGraph", SchWeightedGraph; index=[:src,:tgt])
```

```@autodocs
Modules = [
CategoricalAlgebra.CSets,
CategoricalAlgebra.StructuredCospans,
CategoricalAlgebra.ACSetInterface,
CategoricalAlgebra.ACSetColumns,
CategoricalAlgebra.CSetDataStructures
]
Private = false
```
1 change: 1 addition & 0 deletions src/Catlab.jl
@@ -5,6 +5,7 @@ include("theories/Theories.jl")

include("categorical_algebra/IndexUtils.jl")
include("categorical_algebra/ACSetInterface.jl")
include("categorical_algebra/ACSetColumns.jl")
include("categorical_algebra/CSetDataStructures.jl")
include("categorical_algebra/Permutations.jl")
include("graphs/Graphs.jl")
276 changes: 276 additions & 0 deletions src/categorical_algebra/ACSetColumns.jl
@@ -0,0 +1,276 @@
"""
An acset column should satisfy the following interface
```julia
Base.getindex
Base.setindex!
Base.values
clear_index!
codom_hint!
preimage
preimage_multi
resize_clearing!
```
"""
module ACSetColumns
export preimage, preimage_multi, clear_index!, clear_indices!, codom_hint!, IndexedVector, resize_clearing!

using ..IndexUtils


"""
This function takes an acset column and an element of the domain and sets the column
value at that element to be "missing": 0 in the case of an integer, or nothing in the case
of a type that is a supertype of Nothing. It also clears the index of the previous value
at the element.
The semantics of this function are deeply janky because we don't have proper support
for partial acsets; this will be fixed soon.
"""
function clear_index! end

"""
This is called to alert the column that there are new values in its codomain; the column
may then potentially preallocate some space for those new values.
"""
function codom_hint! end

"""
This gets the preimage of a single value in the codomain.
"""
function preimage end

"""
This gets the preimage of several values in the codomain. This is semantically equivalent
to broadcasting preimage, but a column implementation might instead return a view of
the index.
"""
function preimage_multi end

"""
This resizes a column, and if the column grows, initializes the new elements to the "missing"
value: 0 in the case of an integer or nothing in the case of a type that is a supertype of Nothing.
The semantics of this function are deeply janky because we don't have proper support for partial
acsets; this will be fixed soon. Specifically, if we have values that aren't a supertype of Nothing
or an integer, we get random uninitialized memory.
"""
function resize_clearing! end

# Additionally, the type should be able to be called with no arguments to create an empty column

function clear_indices!(v, idxs)
  for i in idxs
    clear_index!(v, i)
  end
end

function preimage(v::AbstractVector, x)
  findall(y -> x == y, v)
end

function preimage_multi(v::AbstractVector, xs)
  broadcast(x -> preimage(v,x), xs)
end

function clear_index!(v::AbstractVector{Int}, i::Int)
  v[i] = 0
end

function clear_index!(v::AbstractVector{T}, i::Int) where {T >: Nothing}
  v[i] = nothing
end

function clear_index!(v::AbstractVector{T}, i::Int) where {T}
end

function codom_hint!(v::AbstractVector{T}, n::Int) where {T}
end

function resize_clearing!(v::AbstractVector{Int}, n::Int)
  oldn = length(v)
  resize!(v, n)
  v[(oldn+1):n] .= 0
end

function resize_clearing!(v::AbstractVector{T}, n::Int) where {T}
  resize!(v, n)
end

struct IndexedVector{T,Index} <: AbstractVector{T}
  vals::Vector{T}
  index::Index
end

function IndexedVector{T,Index}() where {T,Index}
  IndexedVector{T,Index}(T[],Index())
end

Base.copy(v::IndexedVector{T,Index}) where {T,Index} =
  IndexedVector{T,Index}(copy(v.vals), deepcopy(v.index))

Base.size(v::IndexedVector) = size(v.vals)

function Base.getindex(v::IndexedVector{T}, i::Int) where {T}
  v.vals[i]
end

function Base.setindex!(v::IndexedVector{T}, x::T, i::Int) where {T}
  if isassigned(v.vals, i)
    oldx = v.vals[i]
    v.vals[i] = x
    update_index!(v.index, x, oldx, i)
  else
    v.vals[i] = x
    insert_index!(v.index, x, i)
  end
end

function insert_index!(index::Vector{Vector{Int}}, x::Int, i::Int)
  if x != 0
    insertsorted!(index[x], i)
  end
end

function insert_index!(index::Dict{T,Vector{Int}}, x::T, i::Int) where {T}
  if !isnothing(x)
    if x ∈ keys(index)
      insertsorted!(index[x],i)
    else
      index[x] = [i]
    end
  end
end

function insert_index!(index::Vector{Int}, x::Int, i::Int)
  if x != 0
    @assert index[x] == 0
    index[x] = i
  end
end

function insert_index!(index::Dict{T,Int}, x::T, i::Int) where {T}
  if !isnothing(x)
    @assert !(x ∈ keys(index))
    index[x] = i
  end
end

Base.values(v::IndexedVector) = v.vals

function update_index!(index::Vector{Vector{Int}}, x::Int, oldx::Int, i::Int)
  if 1 ≤ oldx ≤ length(index)
    deletesorted!(index[oldx], i)
  end
  insert_index!(index,x,i)
end

function update_index!(index::Dict{T,Vector{Int}}, x::T, oldx::T, i::Int) where {T}
  if oldx ∈ keys(index) # oldx could just be gobbledegook
    deletesorted!(index[oldx],i)
  end
  insert_index!(index,x,i)
end

function update_index!(index::Vector{Int}, x::Int, oldx::Int, i::Int)
  if oldx != 0
    index[oldx] = 0
  end
  insert_index!(index,x,i)
end

function update_index!(index::Dict{T,Int}, x::T, oldx::T, i::Int) where {T}
  if oldx ∈ keys(index)
    delete!(index, oldx)
  end
  insert_index!(index,x,i)
end

function resize_clearing!(v::IndexedVector{Int}, n::Int)
  oldn = length(v.vals)
  resize!(v.vals, n)
  v.vals[(oldn+1):n] .= 0
end

function resize_clearing!(v::IndexedVector{T}, n::Int) where {T}
  resize!(v.vals, n)
end

function clear_index!(v::IndexedVector{T}, i::Int) where {T >: Nothing}
  v[i] = nothing
end

function clear_index!(v::IndexedVector{Int}, i::Int)
  v[i] = 0
end

# There isn't an "empty" variable in this case, but we can still unset the index
function clear_index!(v::IndexedVector{T,Dict{T,Vector{Int}}}, i::Int) where {T}
  if isassigned(v.vals, i)
    oldx = v[i]
    if oldx ∈ keys(v.index)
      deletesorted!(v.index[oldx], i)
    end
  end
end

# There isn't an "empty" variable in this case, but we can still unset the index
function clear_index!(v::IndexedVector{T,Dict{T,Int}}, i::Int) where {T}
  if isassigned(v.vals, i)
    oldx = v[i]
    if oldx ∈ keys(v.index)
      delete!(v.index, oldx)
    end
  end
end

function preimage(v::IndexedVector{Int, <:Vector{<:Union{Int,Vector{Int}}}}, x::Int)
  v.index[x]
end

function preimage(v::IndexedVector{T, Dict{T, Vector{Int}}}, x) where {T}
  if x ∈ keys(v.index)
    v.index[x]
  else
    []
  end
end

function preimage(v::IndexedVector{T, Dict{T, Int}}, x) where {T}
  if x ∈ keys(v.index)
    v.index[x]
  else
    0
  end
end

function preimage_multi(v::IndexedVector{Int, <:Vector{<:Union{Int,Vector{Int}}}},
                        xs::Union{AbstractVector,UnitRange})
  @view v.index[xs]
end

function preimage_multi(v::IndexedVector{T, <:Dict{T, <:Union{Int,Vector{Int}}}},
                        xs::Union{AbstractVector,UnitRange}) where {T}
  [preimage(v, x) for x in xs]
end


function codom_hint!(v::IndexedVector{T}, n::Int) where {T}
  codom_hint_index!(v.index, n)
end

function codom_hint_index!(index::Vector{Vector{Int}}, n::Int)
  oldn = length(index)
  resize!(index, n)
  for i in (oldn + 1):n
    # Each new slot holds an empty list of positions, i.e. an empty Vector{Int}
    index[i] = Int[]
  end
end

function codom_hint_index!(index::Vector{Int}, n::Int)
  oldn = length(index)
  resize!(index, n)
  index[(oldn + 1):n] .= 0
end

end
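
The idea behind an indexed column, a vector of values kept in sync with an inverse index so that preimages are lookups rather than scans, can be sketched in a few lines of self-contained Julia. This is a simplified illustration under stated assumptions, not the `IndexedVector` code above: `ToyIndexedColumn`, `set_value!`, and `toy_preimage` are hypothetical names, values are integers in `1:ncodom`, and 0 plays the role of the "missing" value.

```julia
# Hypothetical toy column: vals[i] is the value at position i (0 means "missing"),
# and index[x] is the sorted list of positions i with vals[i] == x.
struct ToyIndexedColumn
  vals::Vector{Int}
  index::Vector{Vector{Int}}
end

# Create a column with n positions, all "missing", and ncodom possible values.
ToyIndexedColumn(n::Int, ncodom::Int) =
  ToyIndexedColumn(zeros(Int, n), [Int[] for _ in 1:ncodom])

# Set position i to x, removing i from the old value's bucket and adding it
# to the new value's bucket so that the index stays consistent and sorted.
function set_value!(c::ToyIndexedColumn, i::Int, x::Int)
  old = c.vals[i]
  if old != 0
    deleteat!(c.index[old], searchsortedfirst(c.index[old], i))
  end
  c.vals[i] = x
  if x != 0
    insert!(c.index[x], searchsortedfirst(c.index[x], i), i)
  end
  return c
end

# The preimage of x is read directly from the index.
toy_preimage(c::ToyIndexedColumn, x::Int) = c.index[x]

c = ToyIndexedColumn(4, 3)
set_value!(c, 1, 2)
set_value!(c, 3, 2)
set_value!(c, 2, 1)
println(toy_preimage(c, 2))
```

Overwriting a position, for example `set_value!(c, 1, 1)`, moves it from one bucket to the other, which is the job `update_index!` does in the real module.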