jelli.core.theory_correlations

TheoryCorrelations

A class to represent theory correlations.

Parameters:

Name Type Description Default
hash_val str

The hash value representing the combination of row and column observable names.

required
data ndarray

The data array containing the correlation data.

required
row_names Dict[str, int]

A dictionary mapping row observable names to their indices.

required
col_names Dict[str, int]

A dictionary mapping column observable names to their indices.

required

Attributes:

Name Type Description
hash_val str

The hash value representing the combination of row and column observable names.

data ndarray

The data array containing the correlation data.

row_names Dict[str, int]

A dictionary mapping row observable names to their indices.

col_names Dict[str, int]

A dictionary mapping column observable names to their indices.

_correlations Dict[str, TheoryCorrelations]

A class attribute to cache all theory correlations.

_covariance_scaled Dict[str, ndarray]

A class attribute to cache scaled covariance matrices.

_popxf_h5_versions Set[str]

A set of supported versions of the popxf-h5 JSON schema.

Methods:

Name Description
load

Load theory correlations from HDF5 files in the specified path.

_load_file

Load theory correlations from a single HDF5 file.

from_hdf5_group

Create a TheoryCorrelations instance from an HDF5 group.

get_data

Get the correlation data for the specified row and column observable names.

get_cov_scaled

Get the scaled covariance matrix for the specified measurements, and row and column observable names.

Examples:

Load theory correlations from HDF5 files in a directory:

>>> TheoryCorrelations.load('path/to/directory')

Load theory correlations from a single HDF5 file:

>>> TheoryCorrelations.load('path/to/file.hdf5')

Get correlation data for specific row and column observable names:

>>> data = TheoryCorrelations.get_data(['obs1', 'obs2'], ['obs3', 'obs4'])

Get scaled covariance matrix for specific measurements and observable names:

>>> cov_scaled = TheoryCorrelations.get_cov_scaled(
...     include_measurements=['meas1', 'meas2'],
...     row_names=['obs1', 'obs2'],
...     col_names=['obs3', 'obs4'],
...     std_th_scaled_row=np.array([[0.1, 0.2], [0.3, 0.4]]),
...     std_th_scaled_col=np.array([[0.5, 0.6], [0.7, 0.8]])
... )
Source code in jelli/core/theory_correlations.py
class TheoryCorrelations:
    '''
    A class to represent theory correlations.

    Parameters
    ----------
    hash_val : str
        The hash value representing the combination of row and column observable names.
    data : np.ndarray
        The data array containing the correlation data.
    row_names : Dict[str, int]
        A dictionary mapping row observable names to their indices.
    col_names : Dict[str, int]
        A dictionary mapping column observable names to their indices.

    Attributes
    ----------
    hash_val : str
        The hash value representing the combination of row and column observable names.
    data : np.ndarray
        The data array containing the correlation data.
    row_names : Dict[str, int]
        A dictionary mapping row observable names to their indices.
    col_names : Dict[str, int]
        A dictionary mapping column observable names to their indices.
    _correlations : Dict[str, 'TheoryCorrelations']
        A class attribute to cache all theory correlations.
    _covariance_scaled : Dict[str, jnp.ndarray]
        A class attribute to cache scaled covariance matrices.
    _popxf_h5_versions : Set[str]
        A set of supported versions of the popxf-h5 JSON schema.

    Methods
    -------
    load(path: str) -> None
        Load theory correlations from HDF5 files in the specified path.
    _load_file(path: str) -> None
        Load theory correlations from a single HDF5 file.
    from_hdf5_group(hash_val: str, hdf5_group: h5py.Group) -> None
        Create a TheoryCorrelations instance from an HDF5 group.
    get_data(row_names: Iterable[str], col_names: Iterable[str]) -> np.ndarray or None
        Get the correlation data for the specified row and column observable names.
    get_cov_scaled(
        include_measurements: Iterable[str],
        row_names: Iterable[str],
        col_names: Iterable[str],
        std_th_scaled_row: np.ndarray,
        std_th_scaled_col: np.ndarray
    ) -> jnp.ndarray
        Get the scaled covariance matrix for the specified measurements, and row and column observable names.

    Examples
    --------
    Load theory correlations from HDF5 files in a directory:

    >>> TheoryCorrelations.load('path/to/directory')

    Load theory correlations from a single HDF5 file:

    >>> TheoryCorrelations.load('path/to/file.hdf5')

    Get correlation data for specific row and column observable names:

    >>> data = TheoryCorrelations.get_data(['obs1', 'obs2'], ['obs3', 'obs4'])

    Get scaled covariance matrix for specific measurements and observable names:

    >>> cov_scaled = TheoryCorrelations.get_cov_scaled(
    ...     include_measurements=['meas1', 'meas2'],
    ...     row_names=['obs1', 'obs2'],
    ...     col_names=['obs3', 'obs4'],
    ...     std_th_scaled_row=np.array([[0.1, 0.2], [0.3, 0.4]]),
    ...     std_th_scaled_col=np.array([[0.5, 0.6], [0.7, 0.8]])
    ... )
    '''

    _correlations: Dict[str, 'TheoryCorrelations'] = {}
    _covariance_scaled: Dict[str, jnp.ndarray] = {}
    _popxf_h5_versions = {'1.0'} # Set of supported versions of the popxf-h5 JSON schema

    def __init__(
        self,
        hash_val: str,
        data: np.ndarray,
        row_names: Dict[str, int],
        col_names: Dict[str, int]
    ) -> None:
        '''
        Initialize an instance of the `TheoryCorrelations` class.

        Parameters
        ----------
        hash_val : str
            The hash value representing the combination of row and column observable names.
        data : np.ndarray
            The data array containing the correlation data.
        row_names : Dict[str, int]
            A dictionary mapping row observable names to their indices.
        col_names : Dict[str, int]
            A dictionary mapping column observable names to their indices.

        Returns
        -------
        None

        Examples
        --------
        Initialize a TheoryCorrelations instance:

        >>> theory_corr = TheoryCorrelations(...)
        '''
        self.hash_val = hash_val
        self.data = data
        self.row_names = row_names
        self.col_names = col_names
        self._correlations[hash_val] = self

    @classmethod
    def _load_file(cls, path: str) -> None:
        '''
        Load theory correlations from a single HDF5 file.

        Parameters
        ----------
        path : str
            The path to the HDF5 file.

        Returns
        -------
        None

        Examples
        --------
        Load theory correlations from a single HDF5 file:

        >>> TheoryCorrelations._load_file('path/to/file.hdf5')
        '''
        with h5py.File(path, 'r') as f:
            schema_name, schema_version = get_json_schema(dict(f.attrs))
            if schema_name == 'popxf-h5' and schema_version in cls._popxf_h5_versions:
                for hash_val in f:
                    cls.from_hdf5_group(hash_val, f[hash_val])

    @classmethod
    def load(cls, path: str) -> None:
        '''
        Load theory correlations from HDF5 files in the specified path.

        Parameters
        ----------
        path : str
            The path to a directory containing HDF5 files or a single HDF5 file.

        Returns
        -------
        None

        Examples
        --------
        Load theory correlations from HDF5 files in a directory:

        >>> TheoryCorrelations.load('path/to/directory')

        Load theory correlations from a single HDF5 file:

        >>> TheoryCorrelations.load('path/to/file.hdf5')
        '''
        # load all hdf5 files in the directory
        if os.path.isdir(path):
            for file in os.listdir(path):
                if file.endswith('.hdf5'):
                    cls._load_file(os.path.join(path, file))
        # load single hdf5 file
        else:
            cls._load_file(path)

    @classmethod
    def from_hdf5_group(cls, hash_val: str, hdf5_group: h5py.Group) -> None:
        '''
        Create a `TheoryCorrelations` instance from an HDF5 group.

        Parameters
        ----------
        hash_val : str
            The hash value representing the combination of row and column observable names.
        hdf5_group : h5py.Group
            The HDF5 group containing the correlation data.

        Returns
        -------
        None

        Examples
        --------
        Create a `TheoryCorrelations` instance from an HDF5 group:

        >>> TheoryCorrelations.from_hdf5_group('hash_value', hdf5_group)
        '''
        data = hdf5_group['data']
        data = np.array(data[()], dtype=np.float64) * data.attrs.get('scale', 1.0)
        row_names = {name: i for i, name in enumerate(hdf5_group['row_names'][()].astype(str))}
        col_names = {name: i for i, name in enumerate(hdf5_group['col_names'][()].astype(str))}
        cls(hash_val, data, row_names, col_names)

    @classmethod
    def get_data(
        cls,
        row_names: Iterable[str],
        col_names: Iterable[str],
    ):
        '''
        Get the correlation data for the specified row and column observable names.

        Parameters
        ----------
        row_names : Iterable[str]
            The names of the row observables.
        col_names : Iterable[str]
            The names of the column observables.

        Returns
        -------
        np.ndarray or None
            The correlation data array if found, otherwise None.

        Examples
        --------
        Get correlation data for specific row and column observable names:

        >>> data = TheoryCorrelations.get_data(['obs1', 'obs2'], ['obs3', 'obs4'])
        '''
        hash_val = hash_names(row_names, col_names)
        if hash_val in cls._correlations:
            data = cls._correlations[hash_val].data
        else:
            hash_val = hash_names(col_names, row_names)
            if hash_val in cls._correlations:
                data = np.moveaxis(
                    cls._correlations[hash_val].data,
                    [0,1,2,3], [1,0,3,2]
                )
            else:
                data = None
        return data

    @classmethod
    def get_cov_scaled(
        cls,
        include_measurements: Iterable[str],
        row_names: Iterable[str],
        col_names: Iterable[str],
        std_th_scaled_row: np.ndarray,
        std_th_scaled_col: np.ndarray,
    ):
        '''
        Get the scaled covariance matrix for the specified measurements, and row and column observable names.

        Parameters
        ----------
        include_measurements : Iterable[str]
            The names of the measurements to include.
        row_names : Iterable[str]
            The names of the row observables.
        col_names : Iterable[str]
            The names of the column observables.
        std_th_scaled_row : np.ndarray
            The standard deviations for the row observables.
        std_th_scaled_col : np.ndarray
            The standard deviations for the column observables.

        Returns
        -------
        jnp.ndarray
            The scaled covariance matrix.

        Examples
        --------
        Get scaled covariance matrix for specific measurements and observable names:

        >>> cov_scaled = TheoryCorrelations.get_cov_scaled(
        ...     include_measurements=['meas1', 'meas2'],
        ...     row_names=['obs1', 'obs2'],
        ...     col_names=['obs3', 'obs4'],
        ...     std_th_scaled_row=np.array([[0.1, 0.2], [0.3, 0.4]]),
        ...     std_th_scaled_col=np.array([[0.5, 0.6], [0.7, 0.8]])
        ... )
        '''
        row_measurements = Measurement.get_measurements(row_names, include_measurements=include_measurements)
        col_measurements = Measurement.get_measurements(col_names, include_measurements=include_measurements)
        hash_val = hash_names(row_measurements, col_measurements, row_names, col_names)
        if hash_val in cls._covariance_scaled:
            cov_scaled = cls._covariance_scaled[hash_val]
        else:
            corr = cls.get_data(row_names, col_names)
            if corr is None:
                raise ValueError(f"Correlation data for {row_names} and {col_names} not found.")
            cov_scaled = corr * np.einsum('ki,lj->ijkl', std_th_scaled_row, std_th_scaled_col)
            cov_scaled = jnp.array(cov_scaled, dtype=jnp.float64)
            cls._covariance_scaled[hash_val] = cov_scaled
        return cov_scaled

__init__(hash_val, data, row_names, col_names)

Initialize an instance of the TheoryCorrelations class.

Parameters:

Name Type Description Default
hash_val str

The hash value representing the combination of row and column observable names.

required
data ndarray

The data array containing the correlation data.

required
row_names Dict[str, int]

A dictionary mapping row observable names to their indices.

required
col_names Dict[str, int]

A dictionary mapping column observable names to their indices.

required

Returns:

Type Description
None

Examples:

Initialize a TheoryCorrelations instance:

>>> theory_corr = TheoryCorrelations(...)
Source code in jelli/core/theory_correlations.py
def __init__(
    self,
    hash_val: str,
    data: np.ndarray,
    row_names: Dict[str, int],
    col_names: Dict[str, int]
) -> None:
    '''
    Initialize an instance of the `TheoryCorrelations` class.

    Parameters
    ----------
    hash_val : str
        The hash value representing the combination of row and column observable names.
    data : np.ndarray
        The data array containing the correlation data.
    row_names : Dict[str, int]
        A dictionary mapping row observable names to their indices.
    col_names : Dict[str, int]
        A dictionary mapping column observable names to their indices.

    Returns
    -------
    None

    Examples
    --------
    Initialize a TheoryCorrelations instance:

    >>> theory_corr = TheoryCorrelations(...)
    '''
    self.hash_val = hash_val
    self.data = data
    self.row_names = row_names
    self.col_names = col_names
    self._correlations[hash_val] = self
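
The constructor stores each new instance in the class-level `_correlations` dictionary under its `hash_val`, so creating an object is enough to register it. A minimal sketch of that self-registering pattern follows. The class `Registry` and its attribute `_items` are hypothetical stand-ins and not part of jelli.

```python
from typing import Dict

import numpy as np


class Registry:
    """Hypothetical stand-in that mimics the self-registering constructor."""

    # Shared by all instances, like TheoryCorrelations._correlations.
    _items: Dict[str, "Registry"] = {}

    def __init__(self, hash_val: str, data: np.ndarray) -> None:
        self.hash_val = hash_val
        self.data = data
        # Registering under an existing key replaces the earlier instance.
        self._items[hash_val] = self


first = Registry("abc", np.zeros((1, 1, 1, 1)))
second = Registry("abc", np.ones((1, 1, 1, 1)))
```

Because the dictionary lives on the class, a second instance created with the same `hash_val` silently replaces the first one in the cache.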

_load_file(path) classmethod

Load theory correlations from a single HDF5 file.

Parameters:

Name Type Description Default
path str

The path to the HDF5 file.

required

Returns:

Type Description
None

Examples:

Load theory correlations from a single HDF5 file:

>>> TheoryCorrelations._load_file('path/to/file.hdf5')
Source code in jelli/core/theory_correlations.py
@classmethod
def _load_file(cls, path: str) -> None:
    '''
    Load theory correlations from a single HDF5 file.

    Parameters
    ----------
    path : str
        The path to the HDF5 file.

    Returns
    -------
    None

    Examples
    --------
    Load theory correlations from a single HDF5 file:

    >>> TheoryCorrelations._load_file('path/to/file.hdf5')
    '''
    with h5py.File(path, 'r') as f:
        schema_name, schema_version = get_json_schema(dict(f.attrs))
        if schema_name == 'popxf-h5' and schema_version in cls._popxf_h5_versions:
            for hash_val in f:
                cls.from_hdf5_group(hash_val, f[hash_val])
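
The source above only iterates over the groups of a file when the schema name is `popxf-h5` and the version is in `_popxf_h5_versions`. The sketch below isolates that gate on plain Python values. The function `accepted_groups` is hypothetical, and it takes an already parsed `(name, version)` pair because the return format of `get_json_schema` is only known from how it is unpacked in the source.

```python
from typing import Iterable, List, Set, Tuple

# Mirrors _popxf_h5_versions in the source.
SUPPORTED_VERSIONS: Set[str] = {"1.0"}


def accepted_groups(
    schema: Tuple[str, str],
    group_names: Iterable[str],
    supported: Set[str] = SUPPORTED_VERSIONS,
) -> List[str]:
    """Return the group names that would be loaded for a given schema."""
    name, version = schema
    if name == "popxf-h5" and version in supported:
        return list(group_names)
    # Any other schema name or version is skipped without raising.
    return []


loaded = accepted_groups(("popxf-h5", "1.0"), ["hash_a", "hash_b"])
skipped = accepted_groups(("popxf-h5", "2.0"), ["hash_a"])
```

A file with a different schema name or an unsupported version is skipped without an error, so a caller that needs to know whether anything was loaded has to check the cache afterwards.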

from_hdf5_group(hash_val, hdf5_group) classmethod

Create a TheoryCorrelations instance from an HDF5 group. The instance is registered in the class-level cache under hash_val and is not returned.

Parameters:

Name Type Description Default
hash_val str

The hash value representing the combination of row and column observable names.

required
hdf5_group Group

The HDF5 group containing the correlation data.

required

Returns:

Type Description
None

Examples:

Create a TheoryCorrelations instance from an HDF5 group:

>>> TheoryCorrelations.from_hdf5_group('hash_value', hdf5_group)
Source code in jelli/core/theory_correlations.py
@classmethod
def from_hdf5_group(cls, hash_val: str, hdf5_group: h5py.Group) -> None:
    '''
    Create a `TheoryCorrelations` instance from an HDF5 group.

    Parameters
    ----------
    hash_val : str
        The hash value representing the combination of row and column observable names.
    hdf5_group : h5py.Group
        The HDF5 group containing the correlation data.

    Returns
    -------
    None

    Examples
    --------
    Create a `TheoryCorrelations` instance from an HDF5 group:

    >>> TheoryCorrelations.from_hdf5_group('hash_value', hdf5_group)
    '''
    data = hdf5_group['data']
    data = np.array(data[()], dtype=np.float64) * data.attrs.get('scale', 1.0)
    row_names = {name: i for i, name in enumerate(hdf5_group['row_names'][()].astype(str))}
    col_names = {name: i for i, name in enumerate(hdf5_group['col_names'][()].astype(str))}
    cls(hash_val, data, row_names, col_names)
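
The conversions in `from_hdf5_group` can be tried without an HDF5 file. The sketch below applies the same three steps (cast to `float64`, multiply by the `scale` attribute, map decoded names to their positions) to in-memory NumPy arrays. The function `parse_group` and the sample values are hypothetical; storing the data as scaled integers is an assumption made for illustration, not something the source states.

```python
from typing import Dict, Tuple

import numpy as np


def parse_group(
    raw_data: np.ndarray,
    scale: float,
    raw_row_names: np.ndarray,
    raw_col_names: np.ndarray,
) -> Tuple[np.ndarray, Dict[str, int], Dict[str, int]]:
    """Mimic the conversions in from_hdf5_group on in-memory arrays."""
    # Cast to float64 first, then apply the scale factor.
    data = np.array(raw_data, dtype=np.float64) * scale
    # Byte strings become str; each name maps to its position.
    row_names = {name: i for i, name in enumerate(raw_row_names.astype(str))}
    col_names = {name: i for i, name in enumerate(raw_col_names.astype(str))}
    return data, row_names, col_names


data, rows, cols = parse_group(
    raw_data=np.array([[[[500]]], [[[-250]]]], dtype=np.int16),
    scale=0.002,
    raw_row_names=np.array([b"obs1", b"obs2"]),
    raw_col_names=np.array([b"obs3"]),
)
```

In the source, a missing `scale` attribute falls back to `1.0`, which leaves the values unchanged apart from the cast.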

get_cov_scaled(include_measurements, row_names, col_names, std_th_scaled_row, std_th_scaled_col) classmethod

Get the scaled covariance matrix for the specified measurements, and row and column observable names.

Parameters:

Name Type Description Default
include_measurements Iterable[str]

The names of the measurements to include.

required
row_names Iterable[str]

The names of the row observables.

required
col_names Iterable[str]

The names of the column observables.

required
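std_th_scaled_row ndarray

The standard deviations for the row observables, as a 2D array whose second axis indexes the row observables.

required
std_th_scaled_col ndarray

The standard deviations for the column observables, as a 2D array whose second axis indexes the column observables.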

required

Returns:

Type Description
ndarray

The scaled covariance matrix.

Examples:

Get scaled covariance matrix for specific measurements and observable names:

>>> cov_scaled = TheoryCorrelations.get_cov_scaled(
...     include_measurements=['meas1', 'meas2'],
...     row_names=['obs1', 'obs2'],
...     col_names=['obs3', 'obs4'],
...     std_th_scaled_row=np.array([[0.1, 0.2], [0.3, 0.4]]),
...     std_th_scaled_col=np.array([[0.5, 0.6], [0.7, 0.8]])
... )
Source code in jelli/core/theory_correlations.py
@classmethod
def get_cov_scaled(
    cls,
    include_measurements: Iterable[str],
    row_names: Iterable[str],
    col_names: Iterable[str],
    std_th_scaled_row: np.ndarray,
    std_th_scaled_col: np.ndarray,
):
    '''
    Get the scaled covariance matrix for the specified measurements, and row and column observable names.

    Parameters
    ----------
    include_measurements : Iterable[str]
        The names of the measurements to include.
    row_names : Iterable[str]
        The names of the row observables.
    col_names : Iterable[str]
        The names of the column observables.
    std_th_scaled_row : np.ndarray
        The standard deviations for the row observables.
    std_th_scaled_col : np.ndarray
        The standard deviations for the column observables.

    Returns
    -------
    jnp.ndarray
        The scaled covariance matrix.

    Examples
    --------
    Get scaled covariance matrix for specific measurements and observable names:

    >>> cov_scaled = TheoryCorrelations.get_cov_scaled(
    ...     include_measurements=['meas1', 'meas2'],
    ...     row_names=['obs1', 'obs2'],
    ...     col_names=['obs3', 'obs4'],
    ...     std_th_scaled_row=np.array([[0.1, 0.2], [0.3, 0.4]]),
    ...     std_th_scaled_col=np.array([[0.5, 0.6], [0.7, 0.8]])
    ... )
    '''
    row_measurements = Measurement.get_measurements(row_names, include_measurements=include_measurements)
    col_measurements = Measurement.get_measurements(col_names, include_measurements=include_measurements)
    hash_val = hash_names(row_measurements, col_measurements, row_names, col_names)
    if hash_val in cls._covariance_scaled:
        cov_scaled = cls._covariance_scaled[hash_val]
    else:
        corr = cls.get_data(row_names, col_names)
        if corr is None:
            raise ValueError(f"Correlation data for {row_names} and {col_names} not found.")
        cov_scaled = corr * np.einsum('ki,lj->ijkl', std_th_scaled_row, std_th_scaled_col)
        cov_scaled = jnp.array(cov_scaled, dtype=jnp.float64)
        cls._covariance_scaled[hash_val] = cov_scaled
    return cov_scaled
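
The scaling step multiplies the correlation array elementwise by `np.einsum('ki,lj->ijkl', std_th_scaled_row, std_th_scaled_col)`. The sketch below reproduces that step in plain NumPy with random inputs to show the resulting axis order. It assumes the correlation array has the same four-axis shape as the einsum result (broadcasting would also accept length-1 axes); the sizes and variable names are illustrative only, and the conversion to a JAX array is left out.

```python
import numpy as np

n_row, n_col = 2, 3   # number of row and column observables
n_k, n_l = 4, 5       # sizes of the leading axes of the two std arrays

rng = np.random.default_rng(0)
corr = rng.uniform(-1.0, 1.0, size=(n_row, n_col, n_k, n_l))
std_row = rng.uniform(0.1, 1.0, size=(n_k, n_row))
std_col = rng.uniform(0.1, 1.0, size=(n_l, n_col))

# Outer product of the two std arrays, reordered to (i, j, k, l).
scale = np.einsum('ki,lj->ijkl', std_row, std_col)
cov_scaled = corr * scale

# The same element computed by hand for one index combination.
i, j, k, l = 1, 2, 3, 4
by_hand = corr[i, j, k, l] * std_row[k, i] * std_col[l, j]
```

Note that the cache key in `get_cov_scaled` is built from the measurement and observable names only. A later call with the same names but different standard deviation arrays therefore returns the matrix cached by the first call.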

get_data(row_names, col_names) classmethod

Get the correlation data for the specified row and column observable names.

Parameters:

Name Type Description Default
row_names Iterable[str]

The names of the row observables.

required
col_names Iterable[str]

The names of the column observables.

required

Returns:

Type Description
ndarray or None

The correlation data array if found, otherwise None.

Examples:

Get correlation data for specific row and column observable names:

>>> data = TheoryCorrelations.get_data(['obs1', 'obs2'], ['obs3', 'obs4'])
Source code in jelli/core/theory_correlations.py
@classmethod
def get_data(
    cls,
    row_names: Iterable[str],
    col_names: Iterable[str],
):
    '''
    Get the correlation data for the specified row and column observable names.

    Parameters
    ----------
    row_names : Iterable[str]
        The names of the row observables.
    col_names : Iterable[str]
        The names of the column observables.

    Returns
    -------
    np.ndarray or None
        The correlation data array if found, otherwise None.

    Examples
    --------
    Get correlation data for specific row and column observable names:

    >>> data = TheoryCorrelations.get_data(['obs1', 'obs2'], ['obs3', 'obs4'])
    '''
    hash_val = hash_names(row_names, col_names)
    if hash_val in cls._correlations:
        data = cls._correlations[hash_val].data
    else:
        hash_val = hash_names(col_names, row_names)
        if hash_val in cls._correlations:
            data = np.moveaxis(
                cls._correlations[hash_val].data,
                [0,1,2,3], [1,0,3,2]
            )
        else:
            data = None
    return data
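
`get_data` first looks for an entry stored as (rows, columns) and, failing that, for one stored as (columns, rows), in which case it swaps axes 0 and 1 as well as axes 2 and 3. The sketch below mirrors that lookup on a plain dictionary. `toy_key` is a hypothetical stand-in for `hash_names`, whose actual behavior is not shown in the source, and `lookup` and `store` are illustrative names.

```python
from typing import Dict, Optional, Sequence, Tuple

import numpy as np

Key = Tuple[Tuple[str, ...], Tuple[str, ...]]


def toy_key(rows: Sequence[str], cols: Sequence[str]) -> Key:
    """Hypothetical stand-in for hash_names: an order-sensitive key."""
    return (tuple(rows), tuple(cols))


def lookup(
    store: Dict[Key, np.ndarray],
    rows: Sequence[str],
    cols: Sequence[str],
) -> Optional[np.ndarray]:
    """Try (rows, cols) first, then (cols, rows) with axes swapped pairwise."""
    key = toy_key(rows, cols)
    if key in store:
        return store[key]
    key = toy_key(cols, rows)
    if key in store:
        # Axis 0 <-> 1 and axis 2 <-> 3, as in get_data.
        return np.moveaxis(store[key], [0, 1, 2, 3], [1, 0, 3, 2])
    return None


stored = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
store = {toy_key(['obs1', 'obs2'], ['obs3', 'obs4', 'obs5']): stored}

direct = lookup(store, ['obs1', 'obs2'], ['obs3', 'obs4', 'obs5'])
swapped = lookup(store, ['obs3', 'obs4', 'obs5'], ['obs1', 'obs2'])
missing = lookup(store, ['obs1'], ['obs9'])
```

`np.moveaxis` returns a view, and the direct hit returns the cached array itself, so in both cases modifying the result in place would also modify the cached data.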

load(path) classmethod

Load theory correlations from HDF5 files in the specified path.

Parameters:

Name Type Description Default
path str

The path to a directory containing HDF5 files or a single HDF5 file.

required

Returns:

Type Description
None

Examples:

Load theory correlations from HDF5 files in a directory:

>>> TheoryCorrelations.load('path/to/directory')

Load theory correlations from a single HDF5 file:

>>> TheoryCorrelations.load('path/to/file.hdf5')
Source code in jelli/core/theory_correlations.py
@classmethod
def load(cls, path: str) -> None:
    '''
    Load theory correlations from HDF5 files in the specified path.

    Parameters
    ----------
    path : str
        The path to a directory containing HDF5 files or a single HDF5 file.

    Returns
    -------
    None

    Examples
    --------
    Load theory correlations from HDF5 files in a directory:

    >>> TheoryCorrelations.load('path/to/directory')

    Load theory correlations from a single HDF5 file:

    >>> TheoryCorrelations.load('path/to/file.hdf5')
    '''
    # load all hdf5 files in the directory
    if os.path.isdir(path):
        for file in os.listdir(path):
            if file.endswith('.hdf5'):
                cls._load_file(os.path.join(path, file))
    # load single hdf5 file
    else:
        cls._load_file(path)
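
The dispatch in `load` can be checked without any HDF5 data. The sketch below lists the paths that would be handed to `_load_file` for a given argument. The helper `files_to_load` is hypothetical, and it sorts the directory listing for reproducible output, which the source does not do.

```python
import os
import tempfile
from typing import List


def files_to_load(path: str) -> List[str]:
    """Return the files that load() would pass to _load_file."""
    if os.path.isdir(path):
        # Only direct children ending in '.hdf5' are picked up.
        return [
            os.path.join(path, name)
            for name in sorted(os.listdir(path))
            if name.endswith('.hdf5')
        ]
    # Anything that is not a directory is treated as a single file.
    return [path]


with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.hdf5', 'b.h5', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    os.mkdir(os.path.join(tmp, 'sub'))
    found = [os.path.basename(p) for p in files_to_load(tmp)]
    single = files_to_load(os.path.join(tmp, 'b.h5'))
    single_name = os.path.basename(single[0])
```

Two consequences follow from the source: inside a directory, files with another extension such as `.h5` are ignored and subdirectories are not searched, while a path that is not a directory is passed on regardless of its extension, so any error then comes from opening the file.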