analysis

import "github.com/umbralcalc/stochadex/pkg/analysis"

Package analysis provides data analysis and aggregation utilities for simulation results. It includes functions for computing statistical measures, working with CSV data, creating dataframes, and performing grouped aggregations over time series data.

Key Features:

Usage Patterns:

Index

func AddPartitionsToStateTimeStorage

func AddPartitionsToStateTimeStorage(storage *simulator.StateTimeStorage, partitions []*simulator.PartitionConfig, windowSizeByPartition map[string]int) *simulator.StateTimeStorage

AddPartitionsToStateTimeStorage extends the state time storage with newly generated values from the specified partitions.

func GetDataFrameFromPartition

func GetDataFrameFromPartition(storage *simulator.StateTimeStorage, partitionName string) dataframe.DataFrame

GetDataFrameFromPartition converts simulation partition data into a Gota DataFrame for convenient data manipulation and analysis.

This function extracts time series data from a simulation partition and converts it into a structured DataFrame format. The resulting DataFrame has a “time” column followed by columns for each state dimension, making it easy to perform data analysis, visualization, and export operations.

DataFrame Structure:

Parameters:

Returns:

Example:

// Extract price data from simulation storage
df := GetDataFrameFromPartition(storage, "prices")

// Access time column
timeCol := df.Col("time")

// Access state columns
price1Col := df.Col("0") // First price dimension
price2Col := df.Col("1") // Second price dimension

// Perform analysis
meanPrice1 := price1Col.Mean()
maxPrice2 := price2Col.Max()

Use Cases:

Performance:

Error Handling:

func NewGroupedAggregationPartition

func NewGroupedAggregationPartition(aggregation func(defaultValues []float64, outputIndexByGroup map[string]int, groupings map[string][]float64, weightings map[string][]float64) []float64, applied AppliedAggregation, storage *GroupedStateTimeStorage) *simulator.PartitionConfig

NewGroupedAggregationPartition creates a partition that performs grouped aggregations over historical state values with customizable binning.

This function creates a partition that aggregates data by grouping values into bins and applying custom aggregation functions within each group. It’s particularly useful for computing statistics over value ranges or categorical data.

Mathematical Concept: Grouped aggregations compute statistics within value bins:

G_i(t) = aggregate({X(s) : X(s) ∈ bin_i, s ≤ t})

where bin_i represents a value range or category, and aggregate is the user-provided function (e.g., mean, sum, count).
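As a rough, library-independent sketch of the formula above, the function below has the same parameter layout as the aggregation argument in the signature and computes a weighted sum per group. The name groupedWeightedSum and the sample groups are hypothetical; indexing the defaults by output index is an assumption that follows the package's own example.

```go
package main

import "fmt"

// groupedWeightedSum mirrors the parameter layout of the aggregation argument:
// it writes one output per group at the index given by outputIndexByGroup and
// falls back to the default value for groups that have no samples.
func groupedWeightedSum(
	defaultValues []float64,
	outputIndexByGroup map[string]int,
	groupings map[string][]float64,
	weightings map[string][]float64,
) []float64 {
	results := make([]float64, len(outputIndexByGroup))
	for group, idx := range outputIndexByGroup {
		values := groupings[group]
		weights := weightings[group]
		if len(values) == 0 {
			results[idx] = defaultValues[idx]
			continue
		}
		sum := 0.0
		for i, v := range values {
			sum += v * weights[i]
		}
		results[idx] = sum
	}
	return results
}

func main() {
	// Hypothetical groups: "low" has two samples, "high" has none.
	out := groupedWeightedSum(
		[]float64{-1, -1},
		map[string]int{"low": 0, "high": 1},
		map[string][]float64{"low": {1, 2}},
		map[string][]float64{"low": {0.5, 0.5}},
	)
	fmt.Println(out)
}
```

Swapping the inner loop for a count or a weighted mean gives the other aggregates mentioned above without changing the signature.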

Parameters:

Returns:

Example:

// Aggregate price data by volatility bins
config := NewGroupedAggregationPartition(
    func(defaults []float64, indices map[string]int, groups, weights map[string][]float64) []float64 {
        results := make([]float64, len(indices))
        for group, idx := range indices {
            values := groups[group]
            w := weights[group]
            // Compute weighted mean
            sum := 0.0
            totalWeight := 0.0
            for i, v := range values {
                sum += v * w[i]
                totalWeight += w[i]
            }
            if totalWeight > 0 {
                results[idx] = sum / totalWeight
            } else {
                results[idx] = defaults[idx]
            }
        }
        return results
    },
    AppliedAggregation{
        Name: "volatility_aggregates",
        Data: DataRef{PartitionName: "prices"},
        Kernel: &kernels.ExponentialIntegrationKernel{},
        DefaultValue: 0.0,
    },
    volatilityStorage,
)

Performance Notes:

func NewLikelihoodComparisonPartition

func NewLikelihoodComparisonPartition(applied AppliedLikelihoodComparison, storage *simulator.StateTimeStorage) *simulator.PartitionConfig

NewLikelihoodComparisonPartition builds a PartitionConfig embedding an inner windowed simulation to evaluate the likelihood over a rolling window, producing a per-step comparison score.

func NewLikelihoodMeanFunctionFitPartition

func NewLikelihoodMeanFunctionFitPartition(applied AppliedLikelihoodMeanFunctionFit, storage *simulator.StateTimeStorage) *simulator.PartitionConfig

NewLikelihoodMeanFunctionFitPartition builds a PartitionConfig embedding an inner simulation that runs gradient descent to fit the likelihood mean to the referenced data window.

func NewLinePlotFromDataFrame

func NewLinePlotFromDataFrame(df *dataframe.DataFrame, xAxis string, yAxis string, groupBy ...string) *charts.Line

NewLinePlotFromDataFrame renders a line chart from a dataframe using the specified X and Y columns.

Usage hints:

func NewLinePlotFromPartition

func NewLinePlotFromPartition(storage *simulator.StateTimeStorage, xRef DataRef, yRefs []DataRef, fillYRefs []FillLineRef) *charts.Line

NewLinePlotFromPartition renders a multi-series line chart from storage using an X reference and one or more Y references.

Usage hints:

func NewPosteriorEstimationPartitions

func NewPosteriorEstimationPartitions(applied AppliedPosteriorEstimation, storage *simulator.StateTimeStorage) []*simulator.PartitionConfig

NewPosteriorEstimationPartitions creates a set of PartitionConfigs for an online posterior estimation process using rolling statistics.

func NewScatterPlotFromDataFrame

func NewScatterPlotFromDataFrame(df *dataframe.DataFrame, xAxis string, yAxis string, groupBy ...string) *charts.Scatter

NewScatterPlotFromDataFrame renders a scatter plot using columns of a dataframe.

Usage hints:

func NewScatterPlotFromPartition

func NewScatterPlotFromPartition(storage *simulator.StateTimeStorage, xRef DataRef, yRefs []DataRef) *charts.Scatter

NewScatterPlotFromPartition renders a scatter plot from storage-backed DataRef axes.

Usage hints:

func NewStateTimeStorageFromCsv

func NewStateTimeStorageFromCsv(filePath string, timeColumn int, stateColumnsByPartition map[string][]int, skipHeaderRow bool) (*simulator.StateTimeStorage, error)

NewStateTimeStorageFromCsv creates a StateTimeStorage from CSV data.

This function reads time series data from a CSV file and organizes it into partitions for use in stochadex simulations. It supports multiple partitions with different column configurations.

Parameters:

Returns:

CSV Format Requirements:

Example:

// Load data from a CSV with time in column 0, prices in columns 1-2, volumes in column 3
storage, err := NewStateTimeStorageFromCsv(
    "market_data.csv",
    0, // time in first column
    map[string][]int{
        "prices": {1, 2}, // prices partition uses columns 1 and 2
        "volumes": {3},   // volumes partition uses column 3
    },
    true, // skip header row
)
if err != nil {
    log.Fatal("Failed to load CSV data:", err)
}
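To make the column-to-partition mapping concrete, here is a minimal standard-library sketch of the same idea. It is not the package's implementation: splitCsv is a hypothetical helper that reads from an in-memory string and returns plain slices rather than a StateTimeStorage.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// splitCsv reads all rows and returns the time column plus, for each named
// partition, one state row per CSV row built from that partition's columns.
func splitCsv(
	data string,
	timeColumn int,
	stateColumnsByPartition map[string][]int,
	skipHeaderRow bool,
) ([]float64, map[string][][]float64, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, nil, err
	}
	if skipHeaderRow && len(records) > 0 {
		records = records[1:]
	}
	times := make([]float64, 0, len(records))
	states := make(map[string][][]float64)
	for _, record := range records {
		t, err := strconv.ParseFloat(record[timeColumn], 64)
		if err != nil {
			return nil, nil, err
		}
		times = append(times, t)
		for name, columns := range stateColumnsByPartition {
			row := make([]float64, len(columns))
			for i, col := range columns {
				if row[i], err = strconv.ParseFloat(record[col], 64); err != nil {
					return nil, nil, err
				}
			}
			states[name] = append(states[name], row)
		}
	}
	return times, states, nil
}

func main() {
	// Hypothetical data matching the layout in the example above.
	data := "t,p1,p2,vol\n0,1.5,2.5,10\n1,1.6,2.4,12\n"
	columns := map[string][]int{"prices": {1, 2}, "volumes": {3}}
	times, states, err := splitCsv(data, 0, columns, true)
	if err != nil {
		panic(err)
	}
	fmt.Println(times, states["prices"], states["volumes"])
}
```

The sketch assumes every column index is within range for every row; a production reader would validate that before indexing.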

Error Handling:

Performance Notes:

func NewStateTimeStorageFromJsonLogEntries

func NewStateTimeStorageFromJsonLogEntries(filename string) (*simulator.StateTimeStorage, error)

NewStateTimeStorageFromJsonLogEntries reads the JSON log entries in the named file into a simulator.StateTimeStorage struct.

func NewStateTimeStorageFromPartitions

func NewStateTimeStorageFromPartitions(partitions []*simulator.PartitionConfig, termination simulator.TerminationCondition, timestep simulator.TimestepFunction, initTime float64) *simulator.StateTimeStorage

NewStateTimeStorageFromPartitions generates a new simulator.StateTimeStorage by running a simulation with the specified partitions configured.

func NewStateTimeStorageFromPostgresDb

func NewStateTimeStorageFromPostgresDb(db *PostgresDb, partitionNames []string, startTime float64, endTime float64) (*simulator.StateTimeStorage, error)

NewStateTimeStorageFromPostgresDb reads from a PostgreSQL database over a pre-defined time interval into a simulator.StateTimeStorage struct.

func NewVectorCovariancePartition

func NewVectorCovariancePartition(mean DataRef, applied AppliedAggregation, storage *simulator.StateTimeStorage) *simulator.PartitionConfig

NewVectorCovariancePartition constructs a PartitionConfig that computes the rolling windowed weighted covariance matrix of the referenced data values. Provide the corresponding rolling mean via the mean DataRef.
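For intuition, a weighted covariance around a supplied mean can be sketched in plain Go as below. The function name and the normalisation by total weight (with no bias correction) are assumptions made for illustration; the package may normalise differently.

```go
package main

import "fmt"

// weightedCovariance returns C[i][j] = sum_k w_k (x_ki - m_i)(x_kj - m_j) / sum_k w_k
// for samples rows[k] with weights w[k] and a precomputed mean vector m.
func weightedCovariance(rows [][]float64, weights []float64, mean []float64) [][]float64 {
	n := len(mean)
	cov := make([][]float64, n)
	for i := range cov {
		cov[i] = make([]float64, n)
	}
	total := 0.0
	for k, row := range rows {
		total += weights[k]
		for i := 0; i < n; i++ {
			for j := 0; j < n; j++ {
				cov[i][j] += weights[k] * (row[i] - mean[i]) * (row[j] - mean[j])
			}
		}
	}
	if total == 0 {
		// No weight in the window: leave the matrix at zero.
		return cov
	}
	for i := range cov {
		for j := range cov[i] {
			cov[i][j] /= total
		}
	}
	return cov
}

func main() {
	rows := [][]float64{{1, 2}, {3, 6}}
	weights := []float64{1, 1}
	mean := []float64{2, 4}
	fmt.Println(weightedCovariance(rows, weights, mean))
}
```

The diagonal of this matrix is the per-index weighted variance, which is why the mean is passed in rather than recomputed.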

func NewVectorMeanPartition

func NewVectorMeanPartition(applied AppliedAggregation, storage *simulator.StateTimeStorage) *simulator.PartitionConfig

NewVectorMeanPartition creates a partition that computes rolling weighted means for each dimension of the referenced data.

This function creates a partition that maintains running weighted averages over historical data using the specified integration kernel for time weighting. Each dimension of the source data is aggregated independently.

Mathematical Concept: Vector mean aggregation computes:

μ_i(t) = Σ w(t-s) * X_i(s) / Σ w(t-s)

where μ_i(t) is the mean for dimension i at time t, w(t-s) is the kernel weight, and X_i(s) is the value of dimension i at historical time s.
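The formula can be checked by hand with a small standalone sketch. The helper below is hypothetical and takes the weight as an arbitrary function of the lag t-s; the exponential weight in main is only an assumed stand-in for the package's exponential kernel.

```go
package main

import (
	"fmt"
	"math"
)

// weightedVectorMean computes mu_i(t) = sum_s w(t-s) * X_i(s) / sum_s w(t-s)
// independently for each dimension i, where rows[k] is the state at times[k].
// If the total weight is zero, defaultValue is returned for every dimension.
func weightedVectorMean(
	t float64,
	times []float64,
	rows [][]float64,
	weight func(lag float64) float64,
	defaultValue float64,
) []float64 {
	if len(rows) == 0 {
		return nil
	}
	sums := make([]float64, len(rows[0]))
	total := 0.0
	for k, row := range rows {
		w := weight(t - times[k])
		total += w
		for i, x := range row {
			sums[i] += w * x
		}
	}
	for i := range sums {
		if total > 0 {
			sums[i] /= total
		} else {
			sums[i] = defaultValue
		}
	}
	return sums
}

func main() {
	// Assumed exponential weighting with unit timescale.
	expWeight := func(lag float64) float64 { return math.Exp(-lag) }
	times := []float64{0, 1, 2}
	rows := [][]float64{{1, 10}, {2, 20}, {3, 30}}
	fmt.Println(weightedVectorMean(2, times, rows, expWeight, 100.0))
}
```

With a constant weight function the result reduces to the ordinary arithmetic mean per dimension, which is a convenient sanity check.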

Parameters:

Returns:

Example:

// Compute exponentially weighted moving averages of price data
meanPartition := NewVectorMeanPartition(
    AppliedAggregation{
        Name: "price_ema",
        Data: DataRef{
            PartitionName: "prices",
            ValueIndices: []int{0, 1, 2}, // Use first 3 price dimensions
        },
        Kernel: &kernels.ExponentialIntegrationKernel{},
        DefaultValue: 100.0, // Initial price assumption
    },
    priceStorage,
)

Performance:

func NewVectorVariancePartition

func NewVectorVariancePartition(mean DataRef, applied AppliedAggregation, storage *simulator.StateTimeStorage) *simulator.PartitionConfig

NewVectorVariancePartition constructs a PartitionConfig that computes the rolling windowed weighted variance per-index of the referenced data values. Provide the corresponding rolling mean via the mean DataRef.

func SetPartitionFromDataFrame

func SetPartitionFromDataFrame(storage *simulator.StateTimeStorage, partitionName string, df dataframe.DataFrame, overwriteTime bool)

SetPartitionFromDataFrame updates a partition’s values from a Gota dataframe with schema [time, 0, 1, …]. If overwriteTime is true, the storage’s time vector is replaced with the “time” column.
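The [time, 0, 1, …] schema can be illustrated without Gota by laying the same data out as string records. The helper toRecords below is hypothetical and only shows the column naming: a "time" column followed by one column per state index.

```go
package main

import (
	"fmt"
	"strconv"
)

// toRecords lays out times and state rows as string records with the header
// "time", "0", "1", ... followed by one record per time step.
func toRecords(times []float64, rows [][]float64) [][]string {
	width := 0
	if len(rows) > 0 {
		width = len(rows[0])
	}
	header := []string{"time"}
	for i := 0; i < width; i++ {
		header = append(header, strconv.Itoa(i))
	}
	records := [][]string{header}
	for k, row := range rows {
		record := []string{strconv.FormatFloat(times[k], 'g', -1, 64)}
		for _, v := range row {
			record = append(record, strconv.FormatFloat(v, 'g', -1, 64))
		}
		records = append(records, record)
	}
	return records
}

func main() {
	records := toRecords([]float64{0, 1}, [][]float64{{1.5, 2.5}, {1.6, 2.4}})
	for _, record := range records {
		fmt.Println(record)
	}
}
```

Records in this header-plus-rows shape should be loadable with Gota's dataframe.LoadRecords, which is one way to build a frame to pass in here.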

func WriteStateTimeStorageToPostgresDb

func WriteStateTimeStorageToPostgresDb(db *PostgresDb, storage *simulator.StateTimeStorage)

WriteStateTimeStorageToPostgresDb writes all of the data in the state time storage to a PostgreSQL database.

type AppliedAggregation

AppliedAggregation describes how to aggregate a referenced dataset over time using customizable weighting kernels.

This struct configures the aggregation process by specifying the source data, output partition name, weighting scheme, and handling of insufficient history. It serves as a blueprint for creating aggregation partitions in simulations.

Mathematical Concept: Aggregations compute weighted averages over historical data:

A(t) = Σ w(t-s) * f(X(s)) / Σ w(t-s)

where w(t-s) is the kernel weight, f(X(s)) is the source data, and the sum is over all historical samples s ≤ t.

Fields:

Related Types:

Example:

aggregation := AppliedAggregation{
    Name: "rolling_mean",
    Data: DataRef{
        PartitionName: "prices",
        ValueIndices: []int{0, 1}, // Use first two price columns
    },
    Kernel: &kernels.ExponentialIntegrationKernel{},
    DefaultValue: 0.0,
}

type AppliedAggregation struct {
    Name         string
    Data         DataRef
    Kernel       kernels.IntegrationKernel
    DefaultValue float64
}

func (*AppliedAggregation) GetKernel

func (a *AppliedAggregation) GetKernel() kernels.IntegrationKernel

GetKernel returns the configured integration kernel, falling back to the instantaneous kernel when none is set.

This method ensures that callers never need to handle nil kernels by providing a sensible default. The instantaneous kernel applies no time weighting, effectively using only the most recent value for aggregation.

Returns:

Usage:

kernel := aggregation.GetKernel()
// Safe to use kernel without nil checks

Performance:

type AppliedGrouping

AppliedGrouping configures a grouping transformation on data.

type AppliedGrouping struct {
    GroupBy   []DataRef
    Precision int
}

type AppliedLikelihoodComparison

AppliedLikelihoodComparison configures a rolling likelihood comparison between referenced data and a model over a sliding window.

type AppliedLikelihoodComparison struct {
    Name   string
    Model  ParameterisedModel
    Data   DataRef
    Window WindowedPartitions
}

type AppliedLikelihoodMeanFunctionFit

AppliedLikelihoodMeanFunctionFit configures online fitting of the model’s likelihood mean to data using a gradient function and learning rate over a finite descent schedule.

type AppliedLikelihoodMeanFunctionFit struct {
    Name              string
    Model             ParameterisedModelWithGradient
    Gradient          LikelihoodMeanGradient
    Data              DataRef
    Window            WindowedPartitions
    LearningRate      float64
    DescentIterations int
}
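To show how LearningRate and DescentIterations interact, here is a deliberately simplified sketch of a fixed-length gradient descent. It fits a single constant mean to a data window under a squared-error loss, which is an assumed stand-in for the model likelihood; fitConstantMean is a hypothetical helper and the window must be non-empty.

```go
package main

import "fmt"

// fitConstantMean runs a fixed number of gradient-descent steps on the squared
// error between a constant mean and the values in a non-empty data window.
func fitConstantMean(window []float64, initial, learningRate float64, descentIterations int) float64 {
	mean := initial
	for iter := 0; iter < descentIterations; iter++ {
		// Gradient of the averaged loss 0.5*(mean - x)^2 with respect to mean.
		grad := 0.0
		for _, x := range window {
			grad += mean - x
		}
		grad /= float64(len(window))
		mean -= learningRate * grad
	}
	return mean
}

func main() {
	window := []float64{1, 2, 3, 4}
	fmt.Println(fitConstantMean(window, 0, 0.5, 50))
}
```

In this toy problem each step shrinks the error by a factor of (1 - learningRate), so a small learning rate needs more descent iterations to reach the same accuracy.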

type AppliedPosteriorEstimation

AppliedPosteriorEstimation is the base configuration for online inference of a simulation (specified by partition configs) from a referenced dataset.

type AppliedPosteriorEstimation struct {
    LogNorm      PosteriorLogNorm
    Mean         PosteriorMean
    Covariance   PosteriorCovariance
    Sampler      PosteriorSampler
    Comparison   AppliedLikelihoodComparison
    PastDiscount float64
    MemoryDepth  int
    Seed         uint64
}

type ColourGenerator

ColourGenerator iterates over the default ECharts categorical palette.

type ColourGenerator struct {
    // contains filtered or unexported fields
}

func (*ColourGenerator) Next

func (cg *ColourGenerator) Next() string

Next returns the next colour in the ECharts palette, cycling when the end is reached.
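The cycling behaviour can be sketched with a small index that wraps modulo the palette length. The type paletteCycler and its colours are hypothetical placeholders, not the actual ECharts palette or the package's unexported fields.

```go
package main

import "fmt"

// paletteCycler hands out colours in order and wraps around at the end.
type paletteCycler struct {
	palette []string
	next    int
}

// Next returns the current colour and advances, cycling modulo the palette size.
func (p *paletteCycler) Next() string {
	colour := p.palette[p.next]
	p.next = (p.next + 1) % len(p.palette)
	return colour
}

func main() {
	// Placeholder colours for illustration only.
	cycler := &paletteCycler{palette: []string{"#111111", "#222222", "#333333"}}
	for i := 0; i < 4; i++ {
		fmt.Println(cycler.Next())
	}
}
```

Because the index wraps, a plot with more series than palette entries reuses colours rather than failing.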

type DataPlotting

DataPlotting declares optional transformations for plotting, such as treating a reference as time and restricting to a time index range.

type DataPlotting struct {
    IsTime    bool
    TimeRange *IndexRange
}

type DataRef

DataRef identifies a subset of data stored in StateTimeStorage. It can reference the special time axis or one or more value indices of a partition. Optional plotting hints may be supplied via Plotting.

type DataRef struct {
    PartitionName string
    ValueIndices  []int
    Plotting      *DataPlotting
}

func (*DataRef) GetFromStorage

func (d *DataRef) GetFromStorage(storage *simulator.StateTimeStorage) [][]float64

GetFromStorage returns the entire referenced series. For a time reference, this is a single series containing all times; for a value reference, this is one series per value index.

func (*DataRef) GetSeriesNames

func (d *DataRef) GetSeriesNames(storage *simulator.StateTimeStorage) []string

GetSeriesNames returns human-readable series labels for plotting. Time references are labeled “time”; value references are labeled as “<partition> <index>”.
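The labelling rule is simple enough to sketch directly. The function seriesNames below is a hypothetical stand-in that takes the partition name and indices as plain arguments instead of reading them from storage.

```go
package main

import "fmt"

// seriesNames labels a time reference as "time" and each value index of a
// partition as "<partition> <index>".
func seriesNames(partition string, indices []int, isTime bool) []string {
	if isTime {
		return []string{"time"}
	}
	names := make([]string, 0, len(indices))
	for _, index := range indices {
		names = append(names, fmt.Sprintf("%s %d", partition, index))
	}
	return names
}

func main() {
	fmt.Println(seriesNames("prices", []int{0, 2}, false))
	fmt.Println(seriesNames("prices", nil, true))
}
```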

func (*DataRef) GetTimeIndexFromStorage

func (d *DataRef) GetTimeIndexFromStorage(storage *simulator.StateTimeStorage, timeIndex int) []float64

GetTimeIndexFromStorage returns the data at a specific time index. For a time reference, this is a single-element slice containing the time value; for a value reference, this is the row slice for that time index.

func (*DataRef) GetValueIndices

func (d *DataRef) GetValueIndices(storage *simulator.StateTimeStorage) []int

GetValueIndices returns the referenced value indices, defaulting to all indices within the partition when ValueIndices is nil.
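The nil-means-all default can be pictured with the short sketch below. resolveValueIndices is a hypothetical helper, and width stands for the partition's state width, which the real method looks up from storage.

```go
package main

import "fmt"

// resolveValueIndices returns indices unchanged when set, and otherwise
// every index in [0, width), where width is the partition's state width.
func resolveValueIndices(indices []int, width int) []int {
	if indices != nil {
		return indices
	}
	all := make([]int, width)
	for i := range all {
		all[i] = i
	}
	return all
}

func main() {
	fmt.Println(resolveValueIndices(nil, 3))
	fmt.Println(resolveValueIndices([]int{2}, 3))
}
```

Note that the check is against nil, so in this sketch an explicitly empty slice selects no indices rather than all of them.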

type FillLineRef

FillLineRef specifies an upper and lower bound series used to fill a confidence region in a line plot.

type FillLineRef struct {
    Upper DataRef
    Lower DataRef
}

type GroupedStateTimeStorage

GroupedStateTimeStorage is a representation of simulator.StateTimeStorage which has already had a grouping transformation applied to it.

type GroupedStateTimeStorage struct {
    Storage *simulator.StateTimeStorage
    // contains filtered or unexported fields
}

func NewGroupedStateTimeStorage

func NewGroupedStateTimeStorage(applied AppliedGrouping, storage *simulator.StateTimeStorage) *GroupedStateTimeStorage

NewGroupedStateTimeStorage creates a new GroupedStateTimeStorage given the provided simulator.StateTimeStorage and applied grouping.

func (*GroupedStateTimeStorage) GetAcceptedValueGroupLabels

func (g *GroupedStateTimeStorage) GetAcceptedValueGroupLabels() []string

GetAcceptedValueGroupLabels returns the unique group labels found in the data, which are typically used for labelling plots.

func (*GroupedStateTimeStorage) GetAcceptedValueGroups

func (g *GroupedStateTimeStorage) GetAcceptedValueGroups(tupIndex int) []float64

GetAcceptedValueGroups returns the unique groups found in the data, which are typically used to configure group aggregation partitions.

func (*GroupedStateTimeStorage) GetAcceptedValueGroupsLength

func (g *GroupedStateTimeStorage) GetAcceptedValueGroupsLength() int

GetAcceptedValueGroupsLength returns the number of accepted value groups (equivalent to the length of the state vector in the simulation partition).

func (*GroupedStateTimeStorage) GetGroupTupleLength

func (g *GroupedStateTimeStorage) GetGroupTupleLength() int

GetGroupTupleLength returns the length of the tuple used in the grouping index construction.

func (*GroupedStateTimeStorage) GetGroupingPartition

func (g *GroupedStateTimeStorage) GetGroupingPartition(tupIndex int) string

GetGroupingPartition returns the partition used in the data for grouping at the given tuple index.

func (*GroupedStateTimeStorage) GetGroupingValueIndices

func (g *GroupedStateTimeStorage) GetGroupingValueIndices(tupIndex int) []float64

GetGroupingValueIndices returns the value indices used in the data for grouping.

func (*GroupedStateTimeStorage) GetPrecision

func (g *GroupedStateTimeStorage) GetPrecision() int

GetPrecision returns the requested float precision for grouping.
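One plausible reading of the precision setting, offered only as an assumption, is that grouping values are formatted to a fixed number of decimal places so that nearby floats fall into the same group. The helper groupLabel and the comma-separated label format are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// groupLabel formats a tuple of grouping values with a fixed number of decimal
// places so that values equal up to that precision share a label.
func groupLabel(values []float64, precision int) string {
	label := ""
	for i, v := range values {
		if i > 0 {
			label += ","
		}
		label += strconv.FormatFloat(v, 'f', precision, 64)
	}
	return label
}

func main() {
	// Both tuples agree to two decimal places, so they produce the same label.
	fmt.Println(groupLabel([]float64{1.2349, 0.5}, 2))
	fmt.Println(groupLabel([]float64{1.2301, 0.5}, 2))
}
```

Under this reading, a lower precision merges more values into each group, while a higher precision keeps them apart.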

type IndexRange

IndexRange represents a half-open [Lower, Upper) span of indices, with Lower included and Upper excluded. It is commonly used to clip time-series windows for plotting.

type IndexRange struct {
    Lower int
    Upper int
}
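The half-open convention matches Go's slice expressions, as the sketch below illustrates. The local indexRange type and the clip helper are hypothetical; clamping out-of-range bounds is a choice made for this example, not documented behaviour.

```go
package main

import "fmt"

// indexRange is a local stand-in for a half-open [Lower, Upper) span.
type indexRange struct {
	Lower int
	Upper int
}

// clip returns series[Lower:Upper] after clamping both bounds to the series length.
func clip(series []float64, r indexRange) []float64 {
	lower, upper := r.Lower, r.Upper
	if lower < 0 {
		lower = 0
	}
	if upper > len(series) {
		upper = len(series)
	}
	if lower >= upper {
		return nil
	}
	return series[lower:upper]
}

func main() {
	times := []float64{0, 1, 2, 3, 4}
	// Selects indices 1 and 2; index 3 is excluded.
	fmt.Println(clip(times, indexRange{Lower: 1, Upper: 3}))
}
```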

type LikelihoodMeanGradient

LikelihoodMeanGradient specifies a function mapping params and the gradient of the likelihood mean to a parameter update direction.

type LikelihoodMeanGradient struct {
    Function func(
        params *simulator.Params,
        likeMeanGrad []float64,
    ) []float64
    Width int
}

type ParameterisedModel

ParameterisedModel bundles a likelihood distribution with its parameter configuration and any cross-partition parameter wiring required at runtime.

type ParameterisedModel struct {
    Likelihood         inference.LikelihoodDistribution
    Params             simulator.Params
    ParamsAsPartitions map[string][]string
    ParamsFromUpstream map[string]simulator.NamedUpstreamConfig
}

func (*ParameterisedModel) Init

func (p *ParameterisedModel) Init()

Init ensures internal parameter wiring maps are initialised.

type ParameterisedModelWithGradient

ParameterisedModelWithGradient augments ParameterisedModel with gradient support for optimisation routines.

type ParameterisedModelWithGradient struct {
    Likelihood         inference.LikelihoodDistributionWithGradient
    Params             simulator.Params
    ParamsAsPartitions map[string][]string
    ParamsFromUpstream map[string]simulator.NamedUpstreamConfig
}

func (*ParameterisedModelWithGradient) Init

func (p *ParameterisedModelWithGradient) Init()

Init ensures internal parameter wiring maps are initialised.

type PosteriorCovariance

PosteriorCovariance defines the configuration needed to specify the posterior covariance in the AppliedPosteriorEstimation.

type PosteriorCovariance struct {
    Name         string
    Default      []float64
    JustVariance bool
}

type PosteriorLogNorm

PosteriorLogNorm defines the configuration needed to specify the posterior log-normalisation in the AppliedPosteriorEstimation.

type PosteriorLogNorm struct {
    Name    string
    Default float64
}

type PosteriorMean

PosteriorMean defines the configuration needed to specify the posterior mean in the AppliedPosteriorEstimation.

type PosteriorMean struct {
    Name    string
    Default []float64
}

type PosteriorSampler

PosteriorSampler defines the configuration needed to specify the posterior sampler in the AppliedPosteriorEstimation.

type PosteriorSampler struct {
    Name         string
    Default      []float64
    Distribution ParameterisedModel
}

type PostgresDb

PostgresDb is a struct which can be configured to define interactions with a PostgreSQL database.

type PostgresDb struct {
    User      string
    Password  string
    Dbname    string
    TableName string
    // contains filtered or unexported fields
}

func (*PostgresDb) OpenTableConnection

func (p *PostgresDb) OpenTableConnection() error

OpenTableConnection connects to the PostgreSQL database or creates it if it doesn’t exist.

func (*PostgresDb) ReadStateInRange

func (p *PostgresDb) ReadStateInRange(partitionName string, startTime float64, endTime float64) (*sql.Rows, error)

ReadStateInRange retrieves all entries between a specified start and end time range for a given partition.

func (*PostgresDb) WriteState

func (p *PostgresDb) WriteState(partitionName string, time float64, state []float64) error

WriteState writes a new partition state value to the database.

type PostgresDbOutputFunction

PostgresDbOutputFunction writes the data from the simulation to a PostgreSQL database when the simulator.OutputCondition is met.

type PostgresDbOutputFunction struct {
    // contains filtered or unexported fields
}

func NewPostgresDbOutputFunction

func NewPostgresDbOutputFunction(db *PostgresDb) *PostgresDbOutputFunction

NewPostgresDbOutputFunction creates a new PostgresDbOutputFunction.

func (*PostgresDbOutputFunction) Output

func (p *PostgresDbOutputFunction) Output(partitionName string, state []float64, cumulativeTimesteps float64)

type WindowedPartition

WindowedPartition configures a partition that participates in a finite windowed simulation.

Usage hints:

type WindowedPartition struct {
    Partition        *simulator.PartitionConfig
    OutsideUpstreams map[string]simulator.NamedUpstreamConfig
}

type WindowedPartitions

WindowedPartitions defines the sliding-window context used by analysis.

Usage hints:

type WindowedPartitions struct {
    Partitions []WindowedPartition
    Data       []DataRef
    Depth      int
}

Generated by gomarkdoc