05. Extending DataSplits
Custom Clustering
To add a new clustering algorithm:
- Subtype `ClusteringResult`.
- Implement `assignments`, `nclusters`, `counts`, and `wcounts` for your result type.
- Register a clustering function returning your result.
Example:
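    # Result type: stores one cluster label per data point.
    struct MyClusteringResult <: ClusteringResult
        assignments::Vector{Int}
    end

    # Accessors. Extend the existing generic functions (import them or qualify
    # them with their module) so that package code dispatches to these methods.
    assignments(r::MyClusteringResult) = r.assignments
    nclusters(r::MyClusteringResult) = maximum(r.assignments)  # assumes labels are 1:k
    # ...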
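The example above leaves `counts` and `wcounts` open. A minimal, self-contained sketch of all four accessors follows. It rests on several assumptions: the abstract type is redefined locally as a stand-in so the snippet runs without the package, cluster labels are taken to be the contiguous integers 1:k, and `wcounts` is taken to equal `counts` converted to `Float64` when every point has unit weight. In real code, subtype the package's `ClusteringResult` and extend its generic functions instead.

```julia
# Stand-in for the package's abstract type (assumption, for a runnable sketch only).
abstract type ClusteringResult end

struct MyClusteringResult <: ClusteringResult
    assignments::Vector{Int}   # cluster label of each point, assumed to lie in 1:k
end

assignments(r::MyClusteringResult) = r.assignments

# Number of clusters, assuming labels are contiguous and start at 1.
# Note: maximum throws on an empty assignment vector.
nclusters(r::MyClusteringResult) = maximum(r.assignments)

# Number of points assigned to each cluster.
function counts(r::MyClusteringResult)
    c = zeros(Int, nclusters(r))
    for a in r.assignments
        c[a] += 1
    end
    return c
end

# Weighted cluster sizes; with unit weights these equal the plain counts (assumption).
wcounts(r::MyClusteringResult) = Float64.(counts(r))

r = MyClusteringResult([1, 2, 2, 3, 3, 3])
println(counts(r))
```

If your algorithm can produce empty clusters or non-contiguous labels, store the number of clusters as a field rather than deriving it from the largest label.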
Custom Splitter
To add a new splitting strategy:
- Subtype `SplitStrategy`.
- Implement a core function (e.g., `mysplit(N, s, rng, data)`) that returns `(train_pos, test_pos)` for positions `1:N`.
- Implement `_split(data, s; rng)` to call `split_with_positions(data, s, mysplit; rng=rng)`.
- Use `ValidFraction` for fraction validation.
Example:
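    using Random  # provides Random.default_rng()

    struct MySplit <: SplitStrategy
        frac
    end

    # Core function: the first floor(frac * N) positions form the training set,
    # the remaining positions form the test set.
    function mysplit(N, s, rng, data)
        cut = floor(Int, s.frac * N)
        train_pos = 1:cut
        test_pos = (cut+1):N
        return train_pos, test_pos
    end

    # Hook into the package by delegating to split_with_positions. When defining
    # this outside the package, extend the package's own _split (import or qualify it).
    function _split(data, s::MySplit; rng=Random.default_rng())
        split_with_positions(data, s, mysplit; rng=rng)
    end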
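The `mysplit` example above never touches `rng`. The sketch below shows a core function with the same `(N, s, rng, data)` shape that does use it, by shuffling the positions before cutting. Everything here is illustrative: `ShuffledSplit` and `shuffled_split` are hypothetical names, `SplitStrategy` is redefined locally as a stand-in so the snippet runs without the package, and the inner constructor is a hand-rolled range check, not the package's `ValidFraction`, whose signature is not shown in this section.

```julia
using Random

# Stand-in for the package's abstract type (assumption, for a runnable sketch only).
abstract type SplitStrategy end

# Hypothetical strategy with a hand-rolled check that frac lies strictly in (0, 1).
struct ShuffledSplit <: SplitStrategy
    frac::Float64
    function ShuffledSplit(frac::Real)
        0 < frac < 1 || throw(ArgumentError("frac must lie in (0, 1), got $frac"))
        return new(Float64(frac))
    end
end

# Core function: shuffle positions 1:N with the supplied rng, then cut.
function shuffled_split(N, s, rng, data)
    perm = randperm(rng, N)              # random ordering of the positions 1:N
    cut = floor(Int, s.frac * N)
    train_pos = perm[1:cut]
    test_pos = perm[(cut+1):end]
    return train_pos, test_pos
end

rng = MersenneTwister(42)                # fixed seed so repeated runs agree
train_pos, test_pos = shuffled_split(8, ShuffledSplit(0.75), rng, nothing)
println(length(train_pos), " train / ", length(test_pos), " test")
```

Because the core function only works with positions, the same function can be handed to `split_with_positions` for any data container the package supports; passing the `rng` through keeps the split reproducible.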