
dtscalibration

Release 0.6.3

Apr 09, 2019


Contents

1 Overview
  1.1 Installation
  1.2 Learn by examples
  1.3 Documentation

2 Installation

3 Usage

4 Learn by Examples
  4.1 1. Load your first measurement files
  4.2 2. Common DataStore functions
  4.3 3. Define calibration sections
  4.4 4. Calculate variance of Stokes and anti-Stokes measurements
  4.5 5. Calibration of single ended measurement with OLS
  4.6 6. Calibration of double ended measurement with OLS
  4.7 7. Calibration of single ended measurement with WLS and confidence intervals
  4.8 8. Calibration of double ended measurement with WLS and confidence intervals
  4.9 9. Import a time series
  4.10 10. Align double ended measurements

5 Reference
  5.1 dtscalibration

6 Contributing
  6.1 Bug reports
  6.2 Documentation improvements
  6.3 Feature requests and feedback
  6.4 Development

7 Authors

8 Changelog
  8.1 0.6.3 (2019-04-03)
  8.2 0.6.2 (2019-02-26)
  8.3 0.6.1 (2019-01-04)
  8.4 0.6.0 (2018-12-08)
  8.5 0.5.3 (2018-10-26)
  8.6 0.5.2 (2018-10-26)
  8.7 0.5.1 (2018-10-19)
  8.8 0.4.0 (2018-09-06)
  8.9 0.2.0 (2018-08-16)
  8.10 0.1.0 (2018-08-01)

9 Indices and tables

Python Module Index
CHAPTER 1

Overview

A Python package to load raw DTS files, perform a calibration, and plot the result.
• Free software: BSD 3-Clause License

1.1 Installation

pip install dtscalibration

Or the development version directly from GitHub:

pip install https://github.com/dtscalibration/python-dts-calibration/zipball/master --upgrade

1.2 Learn by examples

Interactively run the example notebooks online by clicking the launch-binder button.

1.3 Documentation

https://python-dts-calibration.readthedocs.io/

CHAPTER 2

Installation

At the command line:

pip install dtscalibration

CHAPTER 3

Usage

To use dtscalibration in a project:

import dtscalibration

CHAPTER 4

Learn by Examples

4.1 1. Load your first measurement files

This notebook is located in https://github.com/bdestombe/python-dts-calibration/tree/master/examples/notebooks


The goal of this notebook is to show the different options for loading measurements from raw DTS files. These files are loaded into a DataStore object, which has various methods for calibration and plotting. The currently supported devices are:

• Silixa
• Sensornet

This example loads Silixa files. Both single-ended and double-ended measurements are supported. The first step is to load the correct read routine from dtscalibration:

• Silixa -> dtscalibration.read_silixa_files
• Sensornet -> dtscalibration.read_sensornet_files

import os
import glob

from dtscalibration import read_silixa_files

The example data files are located in ./python-dts-calibration/tests/data.

filepath = os.path.join('..', '..', 'tests', 'data', 'double_ended2')


print(filepath)

../../tests/data/double_ended2

# Bonus: Just to show which files are in the folder


filepathlist = sorted(glob.glob(os.path.join(filepath, '*.xml')))
filenamelist = [os.path.basename(path) for path in filepathlist]

for fn in filenamelist:
    print(fn)


channel 1_20180328014052498.xml
channel 1_20180328014057119.xml
channel 1_20180328014101652.xml
channel 1_20180328014106243.xml
channel 1_20180328014110917.xml
channel 1_20180328014115480.xml

Define the timezone in which the measurements were taken. In this case the Silixa Ultima computer was set to ‘Europe/Amsterdam’. The default timezone of netCDF files is UTC. All steps after loading the raw files are performed in this timezone. Please see www..com for a full list of supported timezones. We also explicitly define the file extension (.xml) because the folder is polluted with files other than measurement files.
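Timezone names follow the IANA tz database. If you are unsure whether a name is valid, Python's standard library can list the supported names (a quick check, assuming Python 3.9+ where zoneinfo is available; this is not part of dtscalibration):

```python
from zoneinfo import available_timezones

# set of all IANA timezone names known to this Python installation
tz_names = available_timezones()
print('Europe/Amsterdam' in tz_names)
```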

ds = read_silixa_files(directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')

6 files were found, each representing a single timestep


6 recorded vars were found: LAF, ST, AST, REV-ST, REV-AST, TMP
Recorded at 1693 points along the cable
The measurement is double ended
Reading the data from disk

The object tries to gather as much metadata from the measurement files as possible (temporal and spatial coordinates, filenames, temperature probe measurements). All other configuration settings are loaded from the first file and stored as attributes of the DataStore.

print(ds)

<dtscalibration.DataStore>
Sections: ()
Dimensions: (time: 6, x: 1693)
Coordinates:
  * x                 (x) float64 -80.5 -80.38 -80.25 ... 134.3 134.4 134.5
    filename          (time) <U31 'channel 1_20180328014052498.xml' ... 'channel 1_20180328014115480.xml'
    filename_tstamp   (time) int64 20180328014052498 ... 20180328014115480
    timeFWstart       (time) datetime64[ns] 2018-03-28T00:40:52.097000 ... 2018-03-28T00:41:15.061000
    timeFWend         (time) datetime64[ns] 2018-03-28T00:40:54.097000 ... 2018-03-28T00:41:17.061000
    timeFW            (time) datetime64[ns] 2018-03-28T00:40:53.097000 ... 2018-03-28T00:41:16.061000
    timeBWstart       (time) datetime64[ns] 2018-03-28T00:40:54.097000 ... 2018-03-28T00:41:17.061000
    timeBWend         (time) datetime64[ns] 2018-03-28T00:40:56.097000 ... 2018-03-28T00:41:19.061000
    timeBW            (time) datetime64[ns] 2018-03-28T00:40:55.097000 ... 2018-03-28T00:41:18.061000
    timestart         (time) datetime64[ns] 2018-03-28T00:40:52.097000 ... 2018-03-28T00:41:15.061000
    timeend           (time) datetime64[ns] 2018-03-28T00:40:56.097000 ... 2018-03-28T00:41:19.061000
  * time              (time) datetime64[ns] 2018-03-28T00:40:54.097000 ... 2018-03-28T00:41:17.061000
    acquisitiontimeFW (time) timedelta64[ns] 00:00:02 00:00:02 ... 00:00:02
    acquisitiontimeBW (time) timedelta64[ns] 00:00:02 00:00:02 ... 00:00:02
Data variables:
ST (x, time) float64 1.281 -0.5321 ... -43.44 -41.08
AST (x, time) float64 0.4917 1.243 ... -30.14 -32.09
REV-ST (x, time) float64 0.4086 -0.568 ... 4.822e+03
REV-AST (x, time) float64 2.569 -1.603 ... 4.224e+03
TMP (x, time) float64 196.1 639.1 218.7 ... 8.442 18.47
acquisitionTime (time) float32 2.098 2.075 2.076 2.133 2.085 2.062
referenceTemperature (time) float32 21.0536 21.054 ... 21.0531 21.057
probe1Temperature (time) float32 4.36149 4.36025 ... 4.36021 4.36118
probe2Temperature (time) float32 18.5792 18.5785 ... 18.5805 18.5723
referenceProbeVoltage (time) float32 0.121704 0.121704 ... 0.121705
probe1Voltage (time) float32 0.114 0.114 0.114 0.114 0.114 0.114
probe2Voltage (time) float32 0.121 0.121 0.121 0.121 0.121 0.121
userAcquisitionTimeFW (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
userAcquisitionTimeBW (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
Attributes:
uid: ...
nameWell: ...
nameWellbore: ...
name: ...
indexType: ...
startIndex:uom: ...
startIndex:#text: ...
endIndex:uom: ...
endIndex:#text: ...

.. and many more attributes. See: ds.attrs

4.2 2. Common DataStore functions

Examples of how to do some of the more commonly used functions:


1. mean, min, max, std
2. Selecting
3. Selecting by index
4. Downsample (time dimension)
5. Upsample / Interpolation (length and time dimension)

import os

from dtscalibration import read_silixa_files

First we load the raw measurements into a DataStore object, as we learned from the previous notebook.

filepath = os.path.join('..', '..', 'tests', 'data', 'single_ended')

ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')


3 files were found, each representing a single timestep


4 recorded vars were found: LAF, ST, AST, TMP
Recorded at 1461 points along the cable
The measurement is single ended
Reading the data from disk

4.2.1 0 Access the data

The implemented read routines try to read as much data from the raw DTS files as possible. Usually they have coordinates (time and space) and Stokes and anti-Stokes measurements. We can access the data by key; it is presented as a DataArray. More examples are found at http://xarray.pydata.org/en/stable/indexing.html

ds['ST'] # is the data stored, presented as a DataArray

<xarray.DataArray 'ST' (x: 1461, time: 3)>


array([[-8.05791e-01, 4.28741e-01, -5.13021e-01],
[-4.58870e-01, -1.24484e-01, 9.68469e-03],
[ 4.89174e-01, -9.57734e-02, 5.62837e-02],
...,
[ 4.68457e+01, 4.72201e+01, 4.79139e+01],
[ 3.76634e+01, 3.74649e+01, 3.83160e+01],
[ 2.79879e+01, 2.78331e+01, 2.88055e+01]])
Coordinates:
  * x                 (x) float64 -80.74 -80.62 -80.49 ... 104.6 104.7 104.8
    filename          (time) <U31 'channel 2_20180504132202074.xml' ... 'channel 2_20180504132303723.xml'
    filename_tstamp   (time) int64 20180504132202074 ... 20180504132303723
    timestart         (time) datetime64[ns] 2018-05-04T12:22:02.710000 ... 2018-05-04T12:23:03.716000
    timeend           (time) datetime64[ns] 2018-05-04T12:22:32.710000 ... 2018-05-04T12:23:33.716000
  * time              (time) datetime64[ns] 2018-05-04T12:22:17.710000 ... 2018-05-04T12:23:18.716000
    acquisitiontimeFW (time) timedelta64[ns] 00:00:30 00:00:30 00:00:30
Attributes:
name: ST
description: Stokes intensity
units: -

ds['TMP'].plot(figsize=(12, 8));

4.2.2 1 mean, min, max, std

The first argument is the dimension along which the function is applied. dim can be any dimension (e.g., time, x). The returned DataStore no longer contains that dimension.
Normally you would like to keep the attributes (the informative texts from the loaded files), so set keep_attrs to True. They don’t take any space compared to your Stokes data, so keep them.
Note that the sections are also stored as an attribute. If you delete the attributes, you have to redefine the sections.

# Take the mean of all data variables (e.g., Stokes, temperature) along the time dimension
ds_mean = ds.mean(dim='time', keep_attrs=True)

# Take the maximum of all data variables along the x dimension
ds_max = ds.max(dim='x', keep_attrs=True)

# Calculate the standard deviation along the time dimension
ds_std = ds.std(dim='time', keep_attrs=True)

4.2.3 2 Selecting

What if you would like to get the maximum temperature between 𝑥 >= 20 m and 𝑥 < 35 m over time? We first have
to select a section along the cable.

section = slice(20., 35.)


section_of_interest = ds.sel(x=section)

section_of_interest_max = section_of_interest.max(dim='x')

What if you would like to have the measurement at approximately 𝑥 = 20 m?

point_of_interest = ds.sel(x=20., method='nearest')

4.2.4 3 Selecting by index

What if you would like to see what the values on the first timestep are? We can use isel (index select)

section_of_interest = ds.isel(time=slice(0, 2)) # The first two time steps

section_of_interest = ds.isel(x=0)

4.2.5 4 Downsample (time dimension)

We currently have measurements at 3 time steps, with 30.001 seconds in between. For our next exercise we would like to downsample the measurements to 2 time steps with 47 seconds in between. The calculated variances are not valid anymore. We use the function resample_datastore.

ds_resampled = ds.resample_datastore(how='mean', time="47S")

4.2.6 5 Upsample / Interpolation (length and time dimension)

So we have measurements approximately every 0.12 m along the cable. What if we would like to shift our coordinate system to have a value every 0.12 m starting at 𝑥 = 0.05 m? We use (linear) interpolation; extrapolation is not supported. The calculated variances are not valid anymore.

x_old = ds.x.data
x_new = x_old[:-1] + 0.05 # no extrapolation
ds_xinterped = ds.interp(coords={'x': x_new})

We can do the same in the time dimension


import numpy as np
time_old = ds.time.data
time_new = time_old + np.timedelta64(10, 's')
ds_tinterped = ds.interp(coords={'time': time_new})

4.3 3. Define calibration sections

The goal of this notebook is to show how you can define calibration sections. That means that we assign certain parts of the fiber to a time series of reference temperature measurements. Here, we assume the temperature time series is already part of the DataStore object.
import os

from dtscalibration import read_silixa_files

filepath = os.path.join('..', '..', 'tests', 'data', 'double_ended2')


ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')

6 files were found, each representing a single timestep


6 recorded vars were found: LAF, ST, AST, REV-ST, REV-AST, TMP
Recorded at 1693 points along the cable
The measurement is double ended
Reading the data from disk

First we have a look at which temperature timeseries are available for calibration. Therefore we access ds.data_vars and we find probe1Temperature and probe2Temperature, which refer to the temperature measurement timeseries of the two probes attached to the Ultima.
Alternatively, we can access the ds.timeseries_keys property to list all timeseries that can be used for calibration.

print(ds.timeseries_keys)    # list the available timeseries
ds.probe1Temperature.plot(figsize=(12, 8));    # plot one of the timeseries

['acquisitionTime', 'referenceTemperature', 'probe1Temperature', 'probe2Temperature',
 'referenceProbeVoltage', 'probe1Voltage', 'probe2Voltage', 'userAcquisitionTimeFW',
 'userAcquisitionTimeBW']


A calibration is needed to estimate temperature from Stokes and anti-Stokes measurements. There are three unknowns for a single ended calibration procedure: 𝛾, 𝐶, and 𝛼. The parameters 𝛾 and 𝛼 remain constant over time, while 𝐶 may vary.
At least two calibration sections of different temperatures are needed to perform a decent calibration procedure.
This setup has two baths, named ‘cold’ and ‘warm’. Each bath has 2 sections. probe1Temperature is the
temperature timeseries of the cold bath and probe2Temperature is the temperature timeseries of the warm bath.

Section      Reference temperature time series   Number of sections   Location of sections (m)
Cold bath    probe1Temperature                   2                    7.5-17.0; 70.0-80.0
Warm bath    probe2Temperature                   2                    24.0-34.0; 85.0-95.0

Sections are defined in a dictionary whose keys are the names of the reference temperature time series and whose values are lists of slice objects, where each slice object is a section.
Note that slice is part of the standard Python library and no import is required.

sections = {
'probe1Temperature': [slice(7.5, 17.), slice(70., 80.)], # cold bath
'probe2Temperature': [slice(24., 34.), slice(85., 95.)], # warm bath
}
ds.sections = sections

ds.sections

{'probe1Temperature': [slice(7.5, 17.0, None), slice(70.0, 80.0, None)],
 'probe2Temperature': [slice(24.0, 34.0, None), slice(85.0, 95.0, None)]}

NetCDF files do not support reading/writing Python dictionaries. Internally, the sections dictionary is stored in ds._sections as a string encoded with yaml, which can be saved to a netCDF file. Each time the sections dictionary is requested, yaml decodes the string and evaluates it to the Python dictionary.
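The idea can be sketched with plain yaml. This is an illustrative round-trip (encoding each slice as a (start, stop) pair), not dtscalibration's exact internal encoding:

```python
import yaml  # PyYAML

sections = {'probe1Temperature': [slice(7.5, 17.0), slice(70.0, 80.0)]}

# encode: represent each slice by its (start, stop) pair so yaml can store it as a string
encoded = yaml.dump({k: [[s.start, s.stop] for s in v] for k, v in sections.items()})

# decode: rebuild the slice objects from the stored pairs
decoded = {k: [slice(a, b) for a, b in v]
           for k, v in yaml.safe_load(encoded).items()}
```

After the round-trip, decoded compares equal to the original sections dictionary, which is exactly what is needed to persist sections in a netCDF attribute.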

4.4 4. Calculate variance of Stokes and anti-Stokes measurements

The goal of this notebook is to estimate the variance of the noise in the Stokes measurement. The measured Stokes and anti-Stokes signals contain noise that is approximately normally distributed. We need to estimate the variance of the noise to:

• Perform a weighted calibration
• Construct confidence intervals
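To see why the noise variance matters, here is a minimal weighted least squares sketch on a synthetic straight-line fit (this is not the dtscalibration solver; the data and model are invented for illustration). Each measurement is weighted by the inverse of its noise variance:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 50)
noise_var = 4.0                                   # known measurement-noise variance
y = 2.0 * x + 1.0 + rng.normal(0.0, noise_var ** 0.5, x.size)

# design matrix for the model y = slope * x + intercept
A = np.vstack([x, np.ones_like(x)]).T
w = np.full(x.size, 1.0 / noise_var)              # weights = 1 / variance

# normal equations of weighted least squares: (A^T W A) p = A^T W y
AtWA = A.T @ (A * w[:, None])
AtWy = A.T @ (w * y)
slope, intercept = np.linalg.solve(AtWA, AtWy)

# parameter covariance matrix, needed later for confidence intervals
p_cov = np.linalg.inv(AtWA)
```

With correct weights, p_cov gives an honest estimate of the parameter uncertainty, which is what the confidence-interval notebooks below build on.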

import os

from dtscalibration import read_silixa_files


from matplotlib import pyplot as plt

%matplotlib inline

filepath = os.path.join('..', '..', 'tests', 'data', 'double_ended2')

ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')


6 files were found, each representing a single timestep


6 recorded vars were found: LAF, ST, AST, REV-ST, REV-AST, TMP
Recorded at 1693 points along the cable
The measurement is double ended
Reading the data from disk

And we define the sections as we learned from the previous notebook. Sections are required to calculate the variance in the Stokes signal.

sections = {
'probe1Temperature': [slice(7.5, 17.), slice(70., 80.)], # cold bath
'probe2Temperature': [slice(24., 34.), slice(85., 95.)], # warm bath
}
ds.sections = sections

Let’s first read the documentation of the ds.variance_stokes method.

print(ds.variance_stokes.__doc__)

Calculates the variance between the measurements and a best fit


at each reference section. This fits a function to the nt * nx
measurements with ns * nt + nx parameters, where nx are the total
number of observation locations along all sections. The temperature is
constant along the reference sections, so the expression of the
Stokes power can be split in a time series per reference section and
a constant per observation location.

Assumptions: 1) the temperature is the same along a reference


section.

Idea from discussion at page 127 in Richter, P. H. (1995). Estimating


errors in least-squares fitting.

Parameters
----------
reshape_residuals
st_label : str
label of the Stokes, anti-Stokes measurement.
E.g., ST, AST, REV-ST, REV-AST
sections : dict, optional
Define sections. See documentation

Returns
-------
I_var : float
Variance of the residuals between measured and best fit
resid : array_like
Residuals between measured and best fit

Notes
-----
Because there are a large number of unknowns, spend time on
calculating an initial estimate. Can be turned off by setting to False.

I_var, residuals = ds.variance_stokes(st_label='ST')

print("The variance of the Stokes signal along the reference sections "
      "is approximately {} on a {} sec acquisition time".format(
          I_var, ds.userAcquisitionTimeFW.data[0]))

The variance of the Stokes signal along the reference sections is approximately 8.181920419777416 on a 2.0 sec acquisition time

from dtscalibration import plot

fig_handle = plot.plot_residuals_reference_sections(
residuals,
sections,
title='Distribution of the noise in the Stokes signal',
plot_avg_std=I_var ** 0.5,
plot_names=True,
robust=True,
units='',
method='single')



The residuals should be normally distributed and independent from previous time steps and other points along the cable. If you observe patterns in the residuals plot (above), it might be caused by:

• The temperature in the calibration bath is not uniform
• Attenuation caused by coils/sharp bends in the cable
• Attenuation caused by a splice

import scipy
import numpy as np

sigma = residuals.std()
mean = residuals.mean()
x = np.linspace(mean - 3*sigma, mean + 3*sigma, 100)
approximated_normal_fit = scipy.stats.norm.pdf(x, mean, sigma)
residuals.plot.hist(bins=50, figsize=(12, 8), density=True)
plt.plot(x, approximated_normal_fit);
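As a numerical companion to the histogram above, a normality test can flag non-normal residuals. The sketch below uses synthetic residuals standing in for the real ones (which come from variance_stokes); the variance value is borrowed from the output above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# synthetic stand-in for the Stokes residuals, with the variance found above (~8.18)
residuals = rng.normal(loc=0.0, scale=8.18 ** 0.5, size=10_000)

# D'Agostino-Pearson test: a large p-value means no evidence against normality
stat, p = stats.normaltest(residuals)
print(f"statistic: {stat:.2f}, p-value: {p:.3f}")
```

If the p-value is very small for your actual residuals, revisit the possible causes listed above before trusting the variance estimate.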


We can follow the same steps to calculate the variance from the noise in the anti-Stokes measurements by setting st_label='AST' and redoing the steps.

4.5 5. Calibration of single ended measurement with OLS

A single ended calibration is performed with ordinary least squares, over all timesteps simultaneously. 𝛾 and 𝛼 remain constant, while 𝐶 varies over time. The weights are considered equal here and no variance or confidence interval is calculated.
Note that the internal reference section cannot be used, since there is a connector between the internal and external fiber and therefore the integrated differential attenuation cannot be considered linear anymore.

import os

from dtscalibration import read_silixa_files


import matplotlib.pyplot as plt

%matplotlib inline

filepath = os.path.join('..', '..', 'tests', 'data', 'single_ended')

ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')

ds100 = ds.sel(x=slice(-30, 101)) # only calibrate parts of the fiber, in meters
sections = {
'probe1Temperature': [slice(20, 25.5)], # warm bath
'probe2Temperature': [slice(5.5, 15.5)], # cold bath
}
ds100.sections = sections

3 files were found, each representing a single timestep


4 recorded vars were found: LAF, ST, AST, TMP
Recorded at 1461 points along the cable
The measurement is single ended
Reading the data from disk

print(ds100.calibration_single_ended.__doc__)

Parameters
----------
store_p_cov : str
Key to store the covariance matrix of the calibrated parameters
store_p_val : str
Key to store the values of the calibrated parameters
nt : int, optional
Number of timesteps. Should be defined if method=='external'
z : array-like, optional
Distances. Should be defined if method=='external'
p_val
p_var
p_cov
sections : dict, optional
st_label : str
Label of the forward stokes measurement
ast_label : str
Label of the anti-Stoke measurement
st_var : float, optional
The variance of the measurement noise of the Stokes signals in
the forward
direction Required if method is wls.
ast_var : float, optional
The variance of the measurement noise of the anti-Stokes signals
in the forward
direction. Required if method is wls.
store_c : str
Label of where to store C
store_gamma : str
Label of where to store gamma
store_dalpha : str
Label of where to store dalpha; the spatial derivative of alpha.
store_alpha : str
Label of where to store alpha; The integrated differential
attenuation.
alpha(x=0) = 0
store_tmpf : str
Label of where to store the calibrated temperature of the forward
direction
variance_suffix : str, optional
String appended for storing the variance. Only used when method
is wls.
method : {'ols', 'wls'}
Use 'ols' for ordinary least squares and 'wls' for weighted least
squares
solver : {'sparse', 'stats'}
Either use the homemade weighted sparse solver or the weighted
dense matrix solver of
statsmodels

Returns
-------

ds100.calibration_single_ended(st_label='ST',
ast_label='AST',
method='ols')

Let’s compare our calibrated values with the device calibration.

ds1 = ds100.isel(time=0)  # take only the first timestep

ds1.TMPF.plot(linewidth=1, figsize=(12, 8), label='User calibrated')  # temperature calibrated by us
ds1.TMP.plot(linewidth=1, label='Device calibrated')  # temperature calibrated by the device

plt.title('Temperature at the first time step')
plt.legend();

4.6 6. Calibration of double ended measurement with OLS


A double ended calibration is performed with ordinary least squares, over all timesteps simultaneously. 𝛾 and ∫₀ˡ 𝛼 d𝑥 remain constant, while 𝐶 varies over time. The weights are considered equal here and no variance is calculated.
Before starting the calibration procedure, the forward and the backward channel should be aligned.
import os

from dtscalibration import read_silixa_files


import matplotlib.pyplot as plt
%matplotlib inline

filepath = os.path.join('..', '..', 'tests', 'data', 'double_ended2')

ds = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml')

ds100 = ds.sel(x=slice(0, 100)) # only calibrate parts of the fiber


sections = {
'probe1Temperature': [slice(7.5, 17.), slice(70., 80.)], # cold bath
'probe2Temperature': [slice(24., 34.), slice(85., 95.)], # warm bath
}

6 files were found, each representing a single timestep


6 recorded vars were found: LAF, ST, AST, REV-ST, REV-AST, TMP
Recorded at 1693 points along the cable
The measurement is double ended
Reading the data from disk

print(ds100.calibration_double_ended.__doc__)

Parameters
----------
store_p_cov
store_p_val
nt
z
p_val
p_var
p_cov
sections : dict, optional
st_label : str
Label of the forward stokes measurement
ast_label : str
Label of the anti-Stoke measurement
rst_label : str
Label of the reversed Stoke measurement
rast_label : str
Label of the reversed anti-Stoke measurement
st_var : float, optional
The variance of the measurement noise of the Stokes signals in
the forward
direction Required if method is wls.
ast_var : float, optional
The variance of the measurement noise of the anti-Stokes signals
in the forward
direction. Required if method is wls.
rst_var : float, optional
The variance of the measurement noise of the Stokes signals in
the backward
direction. Required if method is wls.
rast_var : float, optional
The variance of the measurement noise of the anti-Stokes signals
in the backward
direction. Required if method is wls.
store_d : str
Label of where to store D. Equals the integrated differential
attenuation at x=0
And should be equal to half the total integrated differential
attenuation.
store_gamma : str
Label of where to store gamma
store_alpha : str
Label of where to store alpha
store_tmpf : str
Label of where to store the calibrated temperature of the forward
direction
store_tmpb : str
Label of where to store the calibrated temperature of the
backward direction
store_tmpw : str
tmpw_mc_size : int
variance_suffix : str, optional
String appended for storing the variance. Only used when method
is wls.
method : {'ols', 'wls', 'external'}
Use 'ols' for ordinary least squares and 'wls' for weighted least
squares
solver : {'sparse', 'stats'}
Either use the homemade weighted sparse solver or the weighted
dense matrix solver of
statsmodels

Returns
-------

st_label = 'ST'
ast_label = 'AST'
rst_label = 'REV-ST'
rast_label = 'REV-AST'
ds100.calibration_double_ended(sections=sections,
st_label=st_label,
ast_label=ast_label,
rst_label=rst_label,
rast_label=rast_label,
method='ols')

After calibration, two data variables are added to the DataStore object:

• TMPF, temperature calculated along the forward direction
• TMPB, temperature calculated along the backward direction

A better estimate of the temperature along the fiber, with a lower expected variance, is the average of the two. We cannot weigh one more than the other, as we have no information to support a different weighing.
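The variance argument can be checked numerically. The sketch below uses synthetic forward and backward estimates with independent noise, standing in for TMPF and TMPB:

```python
import numpy as np

rng = np.random.default_rng(1)
true_T = 20.0
n = 10_000

tmpf = true_T + rng.normal(0.0, 0.5, n)  # forward estimate with noise
tmpb = true_T + rng.normal(0.0, 0.5, n)  # backward estimate with independent noise
tmpw = (tmpf + tmpb) / 2                 # equal-weight average

# averaging two independent, equal-variance estimates halves the variance
print(tmpf.var(), tmpw.var())
```

The averaged series has roughly half the variance of either channel alone, which is why the weighted-average temperature is preferred in the notebooks that follow.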

ds1 = ds100.isel(time=0)  # take only the first timestep

ds1.TMPF.plot(linewidth=1, label='User cali. Forward', figsize=(12, 8))  # temperature calibrated by us
ds1.TMPB.plot(linewidth=1, label='User cali. Backward')  # temperature calibrated by us
ds1.TMP.plot(linewidth=1, label='Device calibrated')  # temperature calibrated by the device

plt.legend();

Let’s compare our calibrated values with the device calibration, averaging the temperature of the forward channel and the backward channel first.

ds1['TMPAVG'] = (ds1.TMPF + ds1.TMPB) / 2


ds1_diff = ds1.TMP - ds1.TMPAVG

ds1_diff.plot(figsize=(12, 8));


The device calibration sections differ from the sections we defined: the device only allows two sections, one per thermometer, and most likely 𝛾 is fixed in the device calibration.

4.7 7. Calibration of single ended measurement with WLS and confidence intervals

A single ended calibration is performed with weighted least squares, over all timesteps simultaneously. 𝛾 and 𝛼 remain constant, while 𝐶 varies over time. The weights are not considered equal here: they decrease quadratically with the signal strength of the measured Stokes and anti-Stokes signals.
Because the weights are correctly defined, confidence intervals can be calculated. They comprise two sources of uncertainty:
1. Measurement noise in the measured Stokes and anti-Stokes signals, expressed in a single variance value.
2. Inherent to least squares procedures / overdetermined systems, the parameters are estimated with limited certainty and all parameters are correlated, which is expressed in the covariance matrix.
Both sources of uncertainty are propagated to an uncertainty in the estimated temperature via Monte Carlo.
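The Monte Carlo idea can be sketched as follows. Note this is a toy model: the parameter values, covariance, ratio, and the temperature expression are invented for illustration and are not dtscalibration's actual calibration model:

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical parameter estimates (gamma, C) and their covariance matrix
p_mean = np.array([482.6, 1.47])
p_cov = np.array([[4.0e-2, 1.0e-3],
                  [1.0e-3, 2.0e-4]])

def temperature(p, ratio):
    """Toy single-ended expression: T = gamma / (C - ln(Stokes / anti-Stokes))."""
    gamma, C = p
    return gamma / (C - np.log(ratio)) - 273.15  # degrees Celsius

ratio = 0.85          # assumed Stokes / anti-Stokes ratio at one location
noise_var = 1.0e-5    # assumed measurement-noise variance of that ratio

# 1) sample parameter sets from the covariance; 2) add measurement noise
p_samples = rng.multivariate_normal(p_mean, p_cov, size=5000)
ratio_samples = ratio + rng.normal(0.0, noise_var ** 0.5, size=5000)
T_mc = np.array([temperature(p, r) for p, r in zip(p_samples, ratio_samples)])

# empirical 95% confidence interval of the estimated temperature
ci_low, ci_high = np.percentile(T_mc, [2.5, 97.5])
```

Both uncertainty sources enter the sampled temperatures, so the percentile interval reflects their combined effect, which is the same strategy the notebook applies with the real calibration model.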

import os

from dtscalibration import read_silixa_files


import matplotlib.pyplot as plt
%matplotlib inline


filepath = os.path.join('..', '..', 'tests', 'data', 'single_ended')

ds = read_silixa_files(
    directory=filepath,
    timezone_netcdf='UTC',
    file_ext='*.xml')

ds = ds.sel(x=slice(-30, 101))  # only calibrate parts of the fiber

sections = {
    'probe1Temperature': [slice(20, 25.5)],  # warm bath
    'probe2Temperature': [slice(5.5, 15.5)],  # cold bath
    # 'referenceTemperature': [slice(-24., -4)]  # The internal coil is not so uniform
}
ds.sections = sections

3 files were found, each representing a single timestep


4 recorded vars were found: LAF, ST, AST, TMP
Recorded at 1461 points along the cable
The measurement is single ended
Reading the data from disk

print(ds.calibration_single_ended.__doc__)

Parameters
----------
store_p_cov : str
    Key to store the covariance matrix of the calibrated parameters
store_p_val : str
    Key to store the values of the calibrated parameters
nt : int, optional
    Number of timesteps. Should be defined if method=='external'
z : array-like, optional
    Distances. Should be defined if method=='external'
p_val
p_var
p_cov
sections : dict, optional
st_label : str
    Label of the forward stokes measurement
ast_label : str
    Label of the anti-Stoke measurement
st_var : float, optional
    The variance of the measurement noise of the Stokes signals in
    the forward direction. Required if method is wls.
ast_var : float, optional
    The variance of the measurement noise of the anti-Stokes signals
    in the forward direction. Required if method is wls.
store_c : str
    Label of where to store C
store_gamma : str
    Label of where to store gamma
store_dalpha : str
    Label of where to store dalpha; the spatial derivative of alpha.
store_alpha : str
    Label of where to store alpha; the integrated differential
    attenuation. alpha(x=0) = 0
store_tmpf : str
    Label of where to store the calibrated temperature of the forward
    direction
variance_suffix : str, optional
    String appended for storing the variance. Only used when method
    is wls.
method : {'ols', 'wls'}
    Use 'ols' for ordinary least squares and 'wls' for weighted least
    squares
solver : {'sparse', 'stats'}
    Either use the homemade weighted sparse solver or the weighted
    dense matrix solver of statsmodels

Returns
-------

st_label = 'ST'
ast_label = 'AST'

First calculate the variance in the measured Stokes and anti-Stokes signals. The Stokes and anti-Stokes signals should follow a smooth decaying exponential. This function fits a decaying exponential to each reference section for each time step. The variance of the residuals between the measured signals and the fitted signals is used as an estimate of the variance in the measured signals.
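The fitting idea can be sketched with synthetic data. This is only an illustration of the principle, not the library's exact implementation, and all numbers below are made up:

```python
import numpy as np

# Sketch of the noise-variance estimate: fit a decaying exponential to the
# Stokes intensity over a reference section and take the variance of the
# residuals. Positions, decay rate and noise level are synthetic.
rng = np.random.default_rng(0)
x = np.linspace(20., 25.5, 50)                                 # positions in a warm bath (m)
st = 4000. * np.exp(-0.002 * x) + rng.normal(0., 5., x.size)   # synthetic Stokes signal

# A decaying exponential is linear in log-space, so an order-1 polyfit suffices.
coef = np.polyfit(x, np.log(st), 1)
st_fit = np.exp(np.polyval(coef, x))
st_var_est = np.var(st - st_fit)
print(st_var_est)  # close to the injected noise variance of 5**2 = 25
```

The residual variance recovers the injected noise level because the fitted exponential absorbs the smooth attenuation trend, leaving (mostly) noise.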

st_var, resid = ds.variance_stokes(st_label=st_label)


ast_var, _ = ds.variance_stokes(st_label=ast_label)

Similar to the ols procedure, we make a single function call to calibrate the temperature. If the method is wls and confidence intervals are passed to conf_ints, the confidence intervals are calculated. As weights are correctly passed to the least squares procedure, the covariance matrix can be used. This matrix holds the covariances between all the parameters. A large parameter set is generated from this matrix, assuming the parameter space is normally distributed with its mean at the best estimate of the least squares procedure.
The large parameter set is used to calculate a large set of temperatures. By using percentiles or quantiles, the 95% confidence interval of the calibrated temperature, between 2.5% and 97.5%, is calculated.
The confidence intervals differ per time step. If you would like to calculate confidence intervals of all time steps together, you have the option ci_avg_time_flag=True: ‘We can say with 95% confidence that the temperature remained between this line and this line during the entire measurement period’.
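The Monte Carlo step described above can be sketched as follows. The parameter values, the covariance matrix and the temperature model are all made up; the model is only a simplified stand-in resembling the calibration equation:

```python
import numpy as np

# Draw parameter sets from a normal distribution defined by the least-squares
# estimate (mean) and its covariance matrix, evaluate the model for each draw,
# and take percentiles of the resulting temperatures.
rng = np.random.default_rng(42)
p_best = np.array([482.6, 1.465])        # e.g. gamma and C (illustrative values)
p_cov = np.array([[0.04, 1e-4],
                  [1e-4, 5e-7]])         # covariance of the estimated parameters

p_mc = rng.multivariate_normal(p_best, p_cov, size=500)   # (500, 2) parameter sets

def temperature(p):
    # Hypothetical model resembling T = gamma / (ln(ST/AST) + C), in deg C
    gamma, c = p[..., 0], p[..., 1]
    return gamma / (np.log(4000. / 3300.) + c) - 273.15

t_mc = temperature(p_mc)
ci_low, ci_high = np.percentile(t_mc, [2.5, 97.5])
print(ci_low, ci_high)
```

Because the draws respect the full covariance matrix, correlated parameter errors partially cancel in the temperature, which a naive per-parameter error propagation would miss.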

ds.calibration_single_ended(sections=sections,
                            st_label=st_label,
                            ast_label=ast_label,
                            st_var=st_var,
                            ast_var=ast_var,
                            method='wls',
                            solver='sparse',
                            store_p_val='p_val',
                            store_p_cov='p_cov'
                            )


ds.conf_int_single_ended(
    p_val='p_val',
    p_cov='p_cov',
    st_label=st_label,
    ast_label=ast_label,
    st_var=st_var,
    ast_var=ast_var,
    store_tmpf='TMPF',
    store_tempvar='_var',
    conf_ints=[2.5, 97.5],
    mc_sample_size=500,
    ci_avg_time_flag=False)

Let's compare our calibrated values with the device calibration.

ds1 = ds.isel(time=0)  # take only the first timestep

ds1.TMPF.plot(linewidth=0.8, figsize=(12, 8), label='User calibrated')  # plot the temperature calibrated by us
ds1.TMP.plot(linewidth=0.8, label='Device calibrated')  # plot the temperature calibrated by the device
ds1.TMPF_MC.plot(linewidth=0.8, hue='CI', label='CI device')

plt.title('Temperature at the first time step')
plt.legend();


ds.TMPF_MC_var.plot(figsize=(12, 8));

ds1.TMPF_MC.sel(CI=2.5).plot(label='2.5% CI', figsize=(12, 8))
ds1.TMPF_MC.sel(CI=97.5).plot(label='97.5% CI')
ds1.TMPF.plot(label='User calibrated')
plt.title('User calibrated temperature with 95% confidence interval')
plt.legend();


We can tell from the graph above that the 95% confidence interval widens further down the cable. Let's have a look at the calculated variance along the cable for a single timestep. According to the device manufacturer this should be around 0.0059 degC.

ds1.TMPF_MC_var.plot(figsize=(12, 8));


The variance of the temperature measurement appears to be larger than what the manufacturer reports. This is already the case for the internal cable; it is not caused by a dirty connector/bad splice on our side. Maybe the length of the calibration section was not sufficient.
At 30 m the variance sharply increases. There are several possible explanations, e.g., high temperatures or decreased signal strength.
Let's have a look at the Stokes and anti-Stokes signals.

ds1.ST.plot(figsize=(12, 8))
ds1.AST.plot();


Clearly there was a bad splice at 30 m that resulted in the sharp increase of measurement uncertainty for the cable
section after the bad splice.

4.8 8. Calibration of double ended measurement with WLS and confidence intervals

4.8.1 Calibration procedure

A double ended calibration is performed with weighted least squares, over all timesteps simultaneously. 𝛾 and 𝛼 remain constant, while 𝐶 varies over time. The weights are not considered equal here: they decrease quadratically with the signal strength of the measured Stokes and anti-Stokes signals.
The confidence intervals can be calculated because the weights are correctly defined.
The confidence intervals consist of two sources of uncertainty.
1. Measurement noise in the measured Stokes and anti-Stokes signals, expressed in a single variance value.
2. Inherent to least squares procedures / overdetermined systems, the parameters are estimated with limited certainty and all parameters are correlated. This is expressed in the covariance matrix.
Both sources of uncertainty are propagated to an uncertainty in the estimated temperature via Monte Carlo.
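The weighted least squares idea itself is simple to sketch: scale each row of the design matrix and the observations by the reciprocal noise standard deviation and solve the ordinary problem. The system below is a toy stand-in, not the actual calibration matrices:

```python
import numpy as np

# Toy weighted least squares with heteroscedastic noise: observations with
# more noise get less weight. A, b, x_true and sigma are all made up.
rng = np.random.default_rng(1)
A = np.column_stack([np.ones(100), np.linspace(0., 1., 100)])
x_true = np.array([2.0, -0.5])
sigma = np.linspace(0.01, 0.1, 100)    # noise standard deviation grows along the fiber
b = A @ x_true + rng.normal(0., sigma)

w = 1.0 / sigma**2                     # weights: inverse variance
Aw = A * np.sqrt(w)[:, None]           # scale rows of the design matrix
bw = b * np.sqrt(w)                    # scale the observations
x_wls, *_ = np.linalg.lstsq(Aw, bw, rcond=None)
print(x_wls)  # close to x_true
```

Scaling by the square root of the weights turns the weighted problem into an ordinary one, which is why a standard (sparse) least squares solver can be reused.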

import os

from dtscalibration import read_silixa_files

import matplotlib.pyplot as plt
%matplotlib inline

filepath = os.path.join('..', '..', 'tests', 'data', 'double_ended2')

ds_ = read_silixa_files(
    directory=filepath,
    timezone_netcdf='UTC',
    file_ext='*.xml')

ds = ds_.sel(x=slice(0, 100))  # only calibrate parts of the fiber

sections = {
    'probe1Temperature': [slice(7.5, 17.), slice(70., 80.)],  # cold bath
    'probe2Temperature': [slice(24., 34.), slice(85., 95.)],  # warm bath
}
ds.sections = sections

6 files were found, each representing a single timestep


6 recorded vars were found: LAF, ST, AST, REV-ST, REV-AST, TMP
Recorded at 1693 points along the cable
The measurement is double ended
Reading the data from disk

st_label = 'ST'
ast_label = 'AST'
rst_label = 'REV-ST'
rast_label = 'REV-AST'

First calculate the variance in the measured Stokes and anti-Stokes signals, in the forward and backward directions. The Stokes and anti-Stokes signals should follow a smooth decaying exponential. This function fits a decaying exponential to each reference section for each time step. The variance of the residuals between the measured signals and the fitted signals is used as an estimate of the variance in the measured signals.

st_var, resid = ds.variance_stokes(st_label=st_label)


ast_var, _ = ds.variance_stokes(st_label=ast_label)
rst_var, _ = ds.variance_stokes(st_label=rst_label)
rast_var, _ = ds.variance_stokes(st_label=rast_label)

resid.plot(figsize=(12, 8));


We calibrate the measurement with a single method call. The labels refer to the keys in the DataStore object containing the Stokes, anti-Stokes, reverse Stokes and reverse anti-Stokes signals. The variance in those measurements was calculated in the previous step. We use a sparse solver because it saves memory.

ds.calibration_double_ended(
    st_label=st_label,
    ast_label=ast_label,
    rst_label=rst_label,
    rast_label=rast_label,
    st_var=st_var,
    ast_var=ast_var,
    rst_var=rst_var,
    rast_var=rast_var,
    store_tmpw='TMPW',
    method='wls',
    solver='sparse')

ds.TMPW.plot()

<matplotlib.collections.QuadMesh at 0x11e485e48>


4.8.2 Confidence intervals

With another method call we estimate the confidence intervals. If the method is wls and confidence intervals are passed to conf_ints, the confidence intervals are calculated. As weights are correctly passed to the least squares procedure, the covariance matrix can be used as an estimator for the uncertainty in the parameters. This matrix holds the covariances between all the parameters. A large parameter set is generated from this matrix as part of the Monte Carlo routine, assuming the parameter space is normally distributed with its mean at the best estimate of the least squares procedure.
The large parameter set is used to calculate a large set of temperatures. By using percentiles or quantiles, the 95% confidence interval of the calibrated temperature, between 2.5% and 97.5%, is calculated.
The confidence intervals differ per time step. If you would like to calculate confidence intervals of all time steps together, you have the option ci_avg_time_flag=True: ‘We can say with 95% confidence that the temperature remained between this line and this line during the entire measurement period’. This is ideal if you’d like to calculate the background temperature with a confidence interval.
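The difference between per-time-step intervals and ci_avg_time_flag=True can be illustrated with a random stand-in for the Monte Carlo temperature set; pooling realisations and time steps into one interval is one plausible reading of "all time steps together":

```python
import numpy as np

# Random stand-in for the Monte Carlo temperature set: 500 realisations of a
# temperature trace over 6 time steps (all values are made up).
rng = np.random.default_rng(7)
t_mc = rng.normal(12.0, 0.3, size=(500, 6))           # (mc_sample, time)

# Default: one confidence interval per time step
ci_per_t = np.percentile(t_mc, [2.5, 97.5], axis=0)   # shape (2, 6)

# All time steps together: pool realisations and time steps into one interval
ci_avg = np.percentile(t_mc, [2.5, 97.5])             # shape (2,)
print(ci_per_t.shape, ci_avg)
```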

ds.conf_int_double_ended(
    p_val='p_val',
    p_cov='p_cov',
    st_label=st_label,
    ast_label=ast_label,
    rst_label=rst_label,
    rast_label=rast_label,
    st_var=st_var,
    ast_var=ast_var,
    rst_var=rst_var,
    rast_var=rast_var,
    store_tmpf='TMPF',
    store_tmpb='TMPB',
    store_tmpw='TMPW',
    store_tempvar='_var',
    conf_ints=[2.5, 50., 97.5],
    mc_sample_size=500,  # <- choose a much larger sample size
    ci_avg_time_flag=False)

ds1 = ds.isel(time=-1)  # take only the last timestep


ds1.TMPW.plot(linewidth=0.7, figsize=(12, 8))
ds1.TMPW_MC.isel(CI=0).plot(linewidth=0.7, label='CI: 2.5%')
ds1.TMPW_MC.isel(CI=2).plot(linewidth=0.7, label='CI: 97.5%')
plt.legend();

The DataArrays TMPF_MC and TMPB_MC and the dimension CI are added. MC stands for Monte Carlo and the CI dimension holds the confidence interval ‘coordinates’.

(ds1.TMPW_MC_var**0.5).plot(figsize=(12, 4));
plt.ylabel(r'$\sigma$ ($^\circ$C)');


ds.data_vars

Data variables:
    ST                     (x, time) float64 4.049e+03 4.044e+03 ... 3.501e+03
    AST                    (x, time) float64 3.293e+03 3.296e+03 ... 2.803e+03
    REV-ST                 (x, time) float64 4.061e+03 4.037e+03 ... 4.584e+03
    REV-AST                (x, time) float64 3.35e+03 3.333e+03 ... 3.707e+03
    TMP                    (x, time) float64 16.69 16.87 16.51 ... 13.6 13.69
    acquisitionTime        (time) float32 2.098 2.075 2.076 2.133 2.085 2.062
    referenceTemperature   (time) float32 21.0536 21.054 ... 21.0531 21.057
    probe1Temperature      (time) float32 4.36149 4.36025 ... 4.36021 4.36118
    probe2Temperature      (time) float32 18.5792 18.5785 ... 18.5805 18.5723
    referenceProbeVoltage  (time) float32 0.121704 0.121704 ... 0.121705
    probe1Voltage          (time) float32 0.114 0.114 0.114 0.114 0.114 0.114
    probe2Voltage          (time) float32 0.121 0.121 0.121 0.121 0.121 0.121
    userAcquisitionTimeFW  (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
    userAcquisitionTimeBW  (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
    gamma                  float64 482.6
    alpha                  (x) float64 -0.007156 -0.003301 ... -0.005165
    d                      (time) float64 1.465 1.465 1.464 1.465 1.465 1.465
    gamma_var              float64 0.03927
    alpha_var              (x) float64 1.734e-07 1.814e-07 ... 1.835e-07
    d_var                  (time) float64 4.854e-07 4.854e-07 ... 4.854e-07
    TMPF                   (x, time) float64 16.8 17.05 16.32 ... 13.49 13.78
    TMPB                   (x, time) float64 16.8 16.83 16.88 ... 13.74 13.69
    TMPF_MC_var            (x, time) float64 dask.array<shape=(787, 6), chunksize=(699, 6)>
    TMPB_MC_var            (x, time) float64 dask.array<shape=(787, 6), chunksize=(699, 6)>
    TMPW                   (x, time) float64 dask.array<shape=(787, 6), chunksize=(699, 6)>
    TMPW_MC_var            (x, time) float64 dask.array<shape=(787, 6), chunksize=(699, 6)>
    p_val                  (params1) float64 482.6 1.465 ... -0.005271 -0.005165
    p_cov                  (params1, params2) float64 0.03927 ... 1.835e-07
    TMPF_MC                (CI, x, time) float64 dask.array<shape=(3, 787, 6), chunksize=(3, 699, 6)>
    TMPB_MC                (CI, x, time) float64 dask.array<shape=(3, 787, 6), chunksize=(3, 699, 6)>
    TMPW_MC                (CI, x, time) float64 dask.array<shape=(3, 787, 6), chunksize=(3, 699, 6)>


4.9 9. Import a time series

In this tutorial we are adding a timeseries to the DataStore object. This might be useful if the temperature in one of the calibration baths was measured with an external device. It requires three steps to add the measurement files to the DataStore object:
1. Load the measurement files (e.g., csv, txt) with pandas into a pandas.Series object
2. Add the pandas.Series object to the DataStore
3. Align the time to that of the DTS measurement (required for calibration)

import pandas as pd
import os

from dtscalibration import read_silixa_files

4.9.1 Step 1: load the measurement files

filepath = os.path.join('..', '..', 'tests', 'data',
                        'external_temperature_timeseries',
                        'Loodswaternet2018-03-28 02h.csv')

# Bonus:
print(filepath, '\n')
with open(filepath, 'r') as f:
    head = [next(f) for _ in range(5)]
print(' '.join(head))

../../tests/data/external_temperature_timeseries/Loodswaternet2018-03-28 02h.csv

"time","Pt100 2"
2018-03-28 02:00:05, 12.748
2018-03-28 02:00:10, 12.747
2018-03-28 02:00:15, 12.746
2018-03-28 02:00:20, 12.747

ts = pd.read_csv(filepath, sep=',', index_col=0, parse_dates=True,
                 squeeze=True, engine='python')  # the latter 2 kwargs are to ensure a pd.Series is returned

ts = ts.tz_localize('Europe/Amsterdam')  # set the timezone

ts.head() # Double check the timezone

time
2018-03-28 02:00:05+02:00 12.748
2018-03-28 02:00:10+02:00 12.747
2018-03-28 02:00:15+02:00 12.746
2018-03-28 02:00:20+02:00 12.747
2018-03-28 02:00:26+02:00 12.747
Name: Pt100 2, dtype: float64

Now we quickly create a DataStore from xml-files with Stokes measurements to add the external timeseries to.

filepath_ds = os.path.join('..', '..', 'tests', 'data', 'double_ended2')

ds = read_silixa_files(directory=filepath_ds,
                       timezone_netcdf='UTC',
                       file_ext='*.xml')


6 files were found, each representing a single timestep


6 recorded vars were found: LAF, ST, AST, REV-ST, REV-AST, TMP
Recorded at 1693 points along the cable
The measurement is double ended
Reading the data from disk

4.9.2 Step 2: Add the temperature measurements of the external probe to the DataStore

First add the coordinates

ds.coords['time_external'] = ts.index.values

Second we add the measured values

ds['external_probe'] = (('time_external',), ts)

4.9.3 Step 3: Align the time of the external measurements to the Stokes measurement times

We linearly interpolate the measurements of the external sensor to the times we have DTS measurements

ds['external_probe_dts'] = ds['external_probe'].interp(time_external=ds.time)

print(ds.data_vars)

Data variables:
ST (x, time) float64 1.281 -0.5321 ... -43.44 -41.08
AST (x, time) float64 0.4917 1.243 ... -30.14 -32.09
REV-ST (x, time) float64 0.4086 -0.568 ... 4.822e+03
REV-AST (x, time) float64 2.569 -1.603 ... 4.224e+03
TMP (x, time) float64 196.1 639.1 218.7 ... 8.442 18.47
acquisitionTime (time) float32 2.098 2.075 2.076 2.133 2.085 2.062
referenceTemperature (time) float32 21.0536 21.054 ... 21.0531 21.057
probe1Temperature (time) float32 4.36149 4.36025 ... 4.36021 4.36118
probe2Temperature (time) float32 18.5792 18.5785 ... 18.5805 18.5723
referenceProbeVoltage (time) float32 0.121704 0.121704 ... 0.121705
probe1Voltage (time) float32 0.114 0.114 0.114 0.114 0.114 0.114
probe2Voltage (time) float32 0.121 0.121 0.121 0.121 0.121 0.121
userAcquisitionTimeFW (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
userAcquisitionTimeBW (time) float32 2.0 2.0 2.0 2.0 2.0 2.0
external_probe (time_external) float64 12.75 12.75 ... 12.76 12.76
external_probe_dts (time) float64 12.75 12.75 12.75 12.75 12.75 12.75

Now we can use external_probe_dts when we define sections and use it for calibration.

4.10 10. Align double ended measurements

The cable length was initially configured during the DTS measurement. For double ended measurements it is important
to enter the correct length so that the forward channel and the backward channel are aligned.


This notebook shows how to better align the forward and the backward measurements. Do this before the calibration
steps.
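The idea behind suggesting a shift can be sketched with a toy profile: try a range of integer shifts and pick the one with the smallest mismatch between the two channels. This is only an illustration in the spirit of suggest_cable_shift_double_ended, not its actual implementation:

```python
import numpy as np

# Synthetic attenuation-like profile; the "backward" channel is a copy of the
# "forward" channel offset by 3 spatial indices.
rng = np.random.default_rng(3)
signal = np.cumsum(rng.normal(0., 1., 200))   # random walk as a stand-in profile
fw = signal[5:195]
bw = signal[8:198]                            # off by 3 indices

shifts = np.arange(-5, 6)
# Mean squared mismatch for every candidate shift; trim the edges so the
# wrap-around of np.roll does not contaminate the comparison.
errors = [np.mean((fw - np.roll(bw, s))[5:-5] ** 2) for s in shifts]
best = int(shifts[np.argmin(errors)])
print(best)  # → 3: shifting bw by 3 indices aligns it with fw
```

The library compares log(ST/AST) of the forward channel against the flipped backward channel in a similar spirit, which is why invalid log values trigger the RuntimeWarnings seen below.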

import os
from dtscalibration import read_silixa_files
from dtscalibration.datastore_utils import suggest_cable_shift_double_ended, shift_double_ended

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

suggest_cable_shift_double_ended?

filepath = os.path.join('..', '..', 'tests', 'data', 'double_ended2')

ds_aligned = read_silixa_files(
directory=filepath,
timezone_netcdf='UTC',
file_ext='*.xml') # this one is already correctly aligned

6 files were found, each representing a single timestep


6 recorded vars were found: LAF, ST, AST, REV-ST, REV-AST, TMP
Recorded at 1693 points along the cable
The measurement is double ended
Reading the data from disk

Because our loaded files were already nicely aligned, we purposely offset the forward and backward channels by 3 ‘spatial indices’.

ds_notaligned = shift_double_ended(ds_aligned, 3)

I dont know what to do with the following data ['TMP']

The device-calibrated temperature doesnot have a valid meaning anymore and is dropped

suggested_shift = suggest_cable_shift_double_ended(
    ds_notaligned,
    np.arange(-5, 5),
    plot_result=True,
    figsize=(12, 8))

/Users/bfdestombe/Projects/dts-calibration/python-dts-calibration/src/dtscalibration/datastore_utils.py:240: RuntimeWarning: invalid value encountered in log
  i_f = np.log(st / ast)
/Users/bfdestombe/Projects/dts-calibration/python-dts-calibration/src/dtscalibration/datastore_utils.py:241: RuntimeWarning: invalid value encountered in log
  i_b = np.log(rst / rast)


The two approaches suggest a shift of -3 and -4. It is up to the user which suggestion to follow. Usually the two suggested shifts are close.

ds_restored = shift_double_ended(ds_notaligned, suggested_shift[0])

print(ds_aligned.x, 3*'\n', ds_restored.x)

<xarray.DataArray 'x' (x: 1693)>


array([-80.5043, -80.3772, -80.2501, ..., 134.294 , 134.421 , 134.548 ])
Coordinates:
* x (x) float64 -80.5 -80.38 -80.25 -80.12 ... 134.2 134.3 134.4 134.5
Attributes:
name: distance
description: Length along fiber
long_description: Starting at connector of forward channel
units: m

<xarray.DataArray 'x' (x: 1687)>


array([-80.123 , -79.9959, -79.8688, ..., 133.913 , 134.04 , 134.167 ])
Coordinates:
* x (x) float64 -80.12 -80.0 -79.87 -79.74 ... 133.8 133.9 134.0 134.2
Attributes:
name: distance
description: Length along fiber
long_description: Starting at connector of forward channel
units: m

Note that our fiber has become shorter by 2*3 spatial indices.
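The bookkeeping behind that statement: shifting a double ended measurement by n spatial indices trims n points from each end of the x-coordinate, so the record shrinks by 2·n points:

```python
# Point counts taken from the x-coordinate printouts above.
n_points_before = 1693   # len(ds_aligned.x)
shift = 3
n_points_after = n_points_before - 2 * abs(shift)
print(n_points_after)  # → 1687, matching len(ds_restored.x)
```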



CHAPTER 5

Reference

5.1 dtscalibration

class dtscalibration.DataStore(*args, **kwargs)


The data class that stores the measurements, contains calibration methods to relate Stokes and anti-Stokes to
temperature. The user should never initiate this class directly, but use read_xml_dir or open_datastore functions
instead.
data_vars [dict-like, optional] A mapping from variable names to DataArray objects,
Variable objects or tuples of the form (dims, data[, attrs]) which can be used
as arguments to create a new Variable. Each dimension must have the same length in all
variables in which it appears.
coords [dict-like, optional] Another mapping in the same form as the variables argument, except that each item is saved on the datastore as a “coordinate”. These variables have an associated meaning: they describe constant/fixed/independent quantities, unlike the varying/measured/dependent quantities that belong in variables. Coordinates values may be given by 1-dimensional arrays or scalars, in which case dims do not need to be supplied: 1D arrays will be assumed to give index values along the dimension with the same name.
attrs [dict-like, optional] Global attributes to save on this datastore.
sections [dict, optional] Sections for calibration. The dictionary should contain key-var couples in
which the key is the name of the calibration temp time series. And the var is a list of slice objects
as ‘slice(start, stop)’; start and stop in meter (float).
compat [{‘broadcast_equals’, ‘equals’, ‘identical’}, optional] String indicating how to compare variables of the same name for potential conflicts when initializing this datastore:

• ‘broadcast_equals’: all values must be equal when variables are broadcast against each other to ensure common dimensions.

• ‘equals’: all values and dimensions must be the same.

• ‘identical’: all values, dimensions and attributes must be the same.


See also:
dtscalibration.read_xml_dir : Load measurements stored in XML-files
dtscalibration.open_datastore : Load (calibrated) measurements from netCDF-like file
calibration_double_ended(sections=None, st_label=’ST’, ast_label=’AST’,
rst_label=’REV-ST’, rast_label=’REV-AST’, st_var=None,
ast_var=None, rst_var=None, rast_var=None,
store_d=’d’, store_gamma=’gamma’, store_alpha=’alpha’,
store_tmpf=’TMPF’, store_tmpb=’TMPB’, store_tmpw=’TMPW’,
tmpw_mc_size=50, store_p_cov=’p_cov’, store_p_val=’p_val’,
variance_suffix=’_var’, method=’ols’, solver=’sparse’,
nt=None, z=None, p_val=None, p_var=None, p_cov=None,
remove_mc_set_flag=True, reduce_memory_usage=False)
Parameters
• store_p_cov
• store_p_val
• nt
• z
• p_val
• p_var
• p_cov
• sections (dict, optional)
• st_label (str) – Label of the forward stokes measurement
• ast_label (str) – Label of the anti-Stoke measurement
• rst_label (str) – Label of the reversed Stoke measurement
• rast_label (str) – Label of the reversed anti-Stoke measurement
• st_var (float, optional) – The variance of the measurement noise of the Stokes signals in the forward direction. Required if method is wls.
• ast_var (float, optional) – The variance of the measurement noise of the anti-Stokes signals in the forward direction. Required if method is wls.
• rst_var (float, optional) – The variance of the measurement noise of the Stokes signals in the backward direction. Required if method is wls.
• rast_var (float, optional) – The variance of the measurement noise of the anti-Stokes signals in the backward direction. Required if method is wls.
• store_d (str) – Label of where to store D. Equals the integrated differential attenuation at x=0 and should be equal to half the total integrated differential attenuation.
• store_gamma (str) – Label of where to store gamma
• store_alpha (str) – Label of where to store alpha
• store_tmpf (str) – Label of where to store the calibrated temperature of the forward direction
• store_tmpb (str) – Label of where to store the calibrated temperature of the backward direction
• store_tmpw (str)
• tmpw_mc_size (int)


• variance_suffix (str, optional) – String appended for storing the variance. Only used when
method is wls.
• method ({‘ols’, ‘wls’, ‘external’}) – Use ‘ols’ for ordinary least squares and ‘wls’ for
weighted least squares
• solver ({‘sparse’, ‘stats’}) – Either use the homemade weighted sparse solver or the
weighted dense matrix solver of statsmodels
calibration_single_ended(sections=None, st_label=’ST’, ast_label=’AST’, st_var=None,
ast_var=None, store_c=’c’, store_gamma=’gamma’,
store_dalpha=’dalpha’, store_alpha=’alpha’, store_tmpf=’TMPF’,
store_p_cov=’p_cov’, store_p_val=’p_val’, variance_suffix=’_var’,
method=’ols’, solver=’sparse’, nt=None, z=None, p_val=None,
p_var=None, p_cov=None)
Parameters
• store_p_cov (str) – Key to store the covariance matrix of the calibrated parameters
• store_p_val (str) – Key to store the values of the calibrated parameters
• nt (int, optional) – Number of timesteps. Should be defined if method==’external’
• z (array-like, optional) – Distances. Should be defined if method==’external’
• p_val
• p_var
• p_cov
• sections (dict, optional)
• st_label (str) – Label of the forward stokes measurement
• ast_label (str) – Label of the anti-Stoke measurement
• st_var (float, optional) – The variance of the measurement noise of the Stokes signals in the forward direction. Required if method is wls.
• ast_var (float, optional) – The variance of the measurement noise of the anti-Stokes signals in the forward direction. Required if method is wls.
• store_c (str) – Label of where to store C
• store_gamma (str) – Label of where to store gamma
• store_dalpha (str) – Label of where to store dalpha; the spatial derivative of alpha.
• store_alpha (str) – Label of where to store alpha; the integrated differential attenuation. alpha(x=0) = 0
• store_tmpf (str) – Label of where to store the calibrated temperature of the forward direction
• variance_suffix (str, optional) – String appended for storing the variance. Only used when method is wls.
• method ({‘ols’, ‘wls’}) – Use ‘ols’ for ordinary least squares and ‘wls’ for weighted least squares
• solver ({‘sparse’, ‘stats’}) – Either use the homemade weighted sparse solver or the weighted dense matrix solver of statsmodels
channel_configuration


chbw
chfw
conf_int_double_ended(p_val=’p_val’, p_cov=’p_cov’, st_label=’ST’, ast_label=’AST’, rst_label=’REV-ST’, rast_label=’REV-AST’, st_var=None, ast_var=None, rst_var=None, rast_var=None, store_tmpf=’TMPF’, store_tmpb=’TMPB’, store_tmpw=’TMPW’, store_tempvar=’_var’, conf_ints=None, mc_sample_size=100, ci_avg_time_flag=False, ci_avg_x_flag=False, var_only_sections=False, da_random_state=None, remove_mc_set_flag=True, reduce_memory_usage=False)
Parameters
• p_val (array-like or string) – parameter solution directly from calibration_double_ended_wls
• p_cov (array-like or string) – parameter covariance at the solution directly from calibration_double_ended_wls. If set to False, no uncertainty in the parameters is propagated into the confidence intervals. Similar to the spec sheets of the DTS manufacturers, and similar to passing an array filled with zeros.
• st_label (str) – Key of the forward Stokes
• ast_label (str) – Key of the forward anti-Stokes
• rst_label (str) – Key of the backward Stokes
• rast_label (str) – Key of the backward anti-Stokes
• st_var (float) – Float of the variance of the Stokes signal
• ast_var (float) – Float of the variance of the anti-Stokes signal
• rst_var (float) – Float of the variance of the backward Stokes signal
• rast_var (float) – Float of the variance of the backward anti-Stokes signal
• store_tmpf (str) – Key of how to store the Forward calculated temperature. Is calculated
using the forward Stokes and anti-Stokes observations.
• store_tmpb (str) – Key of how to store the Backward calculated temperature. Is calculated
using the backward Stokes and anti-Stokes observations.
• store_tmpw (str) – Key of how to store the forward-backward-weighted temperature. First, the variance of TMPF and TMPB are calculated. The Monte Carlo set of TMPF and TMPB are averaged, weighted by their variance. The median of this set is thought to be a reasonable estimate of the temperature
• store_tempvar (str) – a string that is appended to the store_tmp_ keys, and the variance is calculated for those store_tmp_ keys
• conf_ints (iterable object of float) – A list with the confidence boundaries that are calculated. Valid values are between [0, 1].
• mc_sample_size (int) – Size of the Monte Carlo parameter set used to calculate the confidence interval
• ci_avg_time_flag (bool) – The confidence intervals differ per time step. If you would like to calculate confidence intervals of all time steps together: ‘We can say with 95% confidence that the temperature remained between this line and this line during the entire measurement period’.


• ci_avg_x_flag (bool) – Similar to ci_avg_time_flag but then the averaging takes place over the x dimension, and we can observe the variance over time.
• var_only_sections (bool) – useful if using the ci_avg_x_flag. Only calculates the var over the sections, so that the values can be compared with accuracy along the reference sections. Where the accuracy is the variance of the residuals between the estimated temperature and the temperature of the water baths.
• da_random_state – For testing purposes. Similar to a random seed. The seed for dask. Makes random not so random. To produce reproducible results for testing environments.
• remove_mc_set_flag (bool) – Remove the Monte Carlo data set, from which the CI and the variance are calculated.
• reduce_memory_usage (bool) – Use less memory but at the expense of longer computation time
conf_int_single_ended(p_val=’p_val’, p_cov=’p_cov’, st_label=’ST’, ast_label=’AST’,
st_var=None, ast_var=None, store_tmpf=’TMPF’,
store_tempvar=’_var’, conf_ints=None, mc_sample_size=100,
ci_avg_time_flag=False, ci_avg_x_flag=False, da_random_state=None,
remove_mc_set_flag=True, reduce_memory_usage=False)
Parameters
• p_val (array-like or string) – parameter solution directly from calibration_double_ended_wls
• p_cov (array-like or string or bool) – parameter covariance at p_val directly from calibration_double_ended_wls. If set to False, no uncertainty in the parameters is propagated into the confidence intervals. Similar to the spec sheets of the DTS manufacturers, and similar to passing an array filled with zeros. If set to string, the p_cov is retrieved by accessing ds[p_cov]. See the p_cov keyword argument in the calibration routine.
• st_label (str) – Key of the forward Stokes
• ast_label (str) – Key of the forward anti-Stokes
• st_var (float) – Float of the variance of the Stokes signal
• ast_var (float) – Float of the variance of the anti-Stokes signal
• store_tmpf (str) – Key of how to store the Forward calculated temperature. Is calculated
using the forward Stokes and anti-Stokes observations.
• store_tempvar (str) – a string that is appended to the store_tmp_ keys. and the variance
is calculated for those store_tmp_ keys
• conf_ints (iterable object of float) – A list with the confidence boundaries that are calcu-
lated. Valid values are between [0, 1].
• mc_sample_size (int) – Size of the monte carlo parameter set used to calculate the confi-
dence interval
• ci_avg_time_flag (bool) – By default, a separate confidence interval is computed per time step. Set this flag to compute the confidence intervals over all time steps together: ‘We can say with 95% confidence that the temperature remained between this line and this line during the entire measurement period’.
• ci_avg_x_flag (bool) – Similar to ci_avg_time_flag, but over the x-dimension instead of the time-dimension.
• da_random_state – For testing purposes. Seed for dask’s random number generator, similar to a random seed, to produce reproducible results in testing environments.


• remove_mc_set_flag (bool) – Remove the Monte Carlo data set from which the confidence intervals and the variance are calculated.
• reduce_memory_usage (bool) – Use less memory, at the expense of longer computation time.
get_default_encoding()
get_time_dim(data_var_key=None)
Find the relevant time dimension by educated guessing.
Parameters data_var_key (str) – The data variable key that contains a relevant time dimension.
If None, ‘ST’ is used.
get_x_dim(data_var_key=None)
Find the relevant x dimension by educated guessing.
Parameters data_var_key (str) – The data variable key that contains a relevant x dimension.
If None, ‘ST’ is used.
in_confidence_interval(ci_label, conf_ints, sections=None)
Returns an array of bools indicating whether the temperature of the reference sections is within the confidence intervals.
Parameters
• sections (Dict[str, List[slice]])
• ci_label
• conf_ints
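The check itself is elementwise interval logic. A minimal numpy sketch (the bound and reference arrays below are made-up values, not output of this method):

```python
import numpy as np

# Hypothetical lower/upper confidence bounds along a reference section
ci_low = np.array([19.5, 19.8, 20.0])
ci_high = np.array([20.5, 20.6, 20.4])
# Hypothetical reference temperature at the same locations
t_ref = np.array([20.0, 20.7, 20.2])

# True where the reference temperature lies within the interval
inside = (ci_low <= t_ref) & (t_ref <= ci_high)
```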
inverse_variance_weighted_mean(tmp1=’TMPF’, tmp2=’TMPB’,
tmp1_var=’TMPF_MC_var’,
tmp2_var=’TMPB_MC_var’, tmpw_store=’TMPW’,
tmpw_var_store=’TMPW_var’)
Average two temperature datasets with the inverse of the variance as weights. The two temperature datasets tmp1 and tmp2, with their variances tmp1_var and tmp2_var respectively, are averaged and stored in the DataStore.
Parameters
• tmp1 (str) – The label of the first temperature dataset that is averaged
• tmp2 (str) – The label of the second temperature dataset that is averaged
• tmp1_var (str) – The variance of tmp1
• tmp2_var (str) – The variance of tmp2
• tmpw_store (str) – The label of the averaged temperature dataset
• tmpw_var_store (str) – The label of the variance of the averaged temperature dataset
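The weighting can be illustrated with plain numpy (made-up numbers; the actual method operates on the DataArrays named by the arguments above):

```python
import numpy as np

# Hypothetical forward and backward temperature estimates with variances
tmpf = np.array([20.1, 20.5, 19.8])
tmpf_var = np.array([0.04, 0.09, 0.04])
tmpb = np.array([20.3, 20.1, 20.0])
tmpb_var = np.array([0.01, 0.09, 0.16])

# Inverse-variance weights: the more precise estimate contributes more
w_f, w_b = 1.0 / tmpf_var, 1.0 / tmpb_var
tmpw = (w_f * tmpf + w_b * tmpb) / (w_f + w_b)
# Variance of the weighted mean is never larger than the smallest input variance
tmpw_var = 1.0 / (w_f + w_b)
```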
inverse_variance_weighted_mean_array(tmp_label=’TMPF’,
tmp_var_label=’TMPF_MC_var’,
tmpw_store=’TMPW’,
tmpw_var_store=’TMPW_var’, dim=’time’)
Calculates the weighted average across a dimension.
is_double_ended


resample_datastore(how, freq=None, dim=None, skipna=None, closed=None, label=None, base=0, keep_attrs=True, **indexer)
Returns a resampled DataStore. Always define the how. Handles both downsampling and upsampling. If any intervals contain no values from the original object, they will be given the value NaN.
Parameters
• freq
• dim
• how (str) – Any function that is available via groupby, e.g., ‘mean’. See http://pandas.pydata.org/pandas-docs/stable/groupby.html#groupby-dispatch
• skipna (bool, optional) – Whether to skip missing values when aggregating in downsampling.
• closed (‘left’ or ‘right’, optional) – Side of each interval to treat as closed.
• label (‘left or ‘right’, optional) – Side of each interval to use for labeling.
• base (int, optional) – For frequencies that evenly subdivide 1 day, the “origin” of the aggregated
intervals. For example, for ‘24H’ frequency, base could range from 0 through 23.
• keep_attrs (bool, optional) – If True, the object’s attributes (attrs) will be copied from the original
object to the new one. If False (default), the new object will be returned without attributes.
• **indexer ({dim: freq}) – Dictionary with a key indicating the dimension name to resample over and
a value corresponding to the resampling frequency.

Returns resampled (same type as caller) – This object resampled.
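The resampling semantics follow pandas/xarray. A small pandas sketch of downsampling with how=’mean’ (synthetic data, not a DataStore):

```python
import pandas as pd

# Four synthetic measurements, one every 30 s
idx = pd.date_range('2019-01-01', periods=4, freq='30s')
s = pd.Series([1.0, 3.0, 5.0, 7.0], index=idx)

# Downsample to 1-minute bins with 'mean', as a call like
# ds.resample_datastore(how='mean', time='1min') would do over the time dim
out = s.resample('1min').mean()
```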

sections
Define calibration sections. Each section requires a reference temperature time series, such as the temper-
ature measured by an external temperature sensor. They should already be part of the DataStore object.
Please look at the example notebook on sections if you encounter difficulties.
Parameters sections (Dict[str, List[slice]]) – Sections are defined in a dictionary with its key-
words of the names of the reference temperature time series. Its values are lists of slice
objects, where each slice object is a stretch.
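A minimal sketch of such a dictionary (the time-series names ‘probe1Temperature’ and ‘probe2Temperature’ are hypothetical; use the names of your own reference sensors):

```python
# Keys are names of reference temperature time series in the DataStore;
# values are lists of slices of the x-coordinate (in meters), one per stretch
sections = {
    'probe1Temperature': [slice(7.5, 17.0), slice(70.0, 80.0)],   # e.g. cold bath
    'probe2Temperature': [slice(24.0, 34.0), slice(85.0, 95.0)],  # e.g. warm bath
}
# ds.sections = sections  # assuming ds is a DataStore
```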
temperature_residuals(label=None)
Parameters label (str) – The key of the temperature DataArray
Returns resid_da (xarray.DataArray) – The residuals as DataArray
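The residual is simply the calibrated temperature minus the reference temperature at the reference sections; sketched with made-up numbers:

```python
import numpy as np

tmp_est = np.array([20.2, 20.6, 19.9])  # calibrated temperature (made up)
tmp_ref = np.array([20.0, 20.5, 20.0])  # reference bath temperature
resid = tmp_est - tmp_ref               # what temperature_residuals returns per section
```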
timeseries_keys
Returns the keys of all time series that can be used for calibration.
to_netcdf(path=None, mode=’w’, format=None, group=None, engine=None, encoding=None, unlimited_dims=None, compute=True)
Write datastore contents to a netCDF file.
Parameters
• path (str, Path or file-like object, optional) – Path to which to save this dataset. File-like objects are only supported by the scipy engine. If no path is provided, this function returns the resulting netCDF file as bytes; in this case, we need to use scipy, which does not support netCDF version 4 (the default format becomes NETCDF3_64BIT).

• mode ({‘w’, ‘a’}, optional) – Write (‘w’) or append (‘a’) mode. If mode=’w’, any existing
file at this location will be overwritten. If mode=’a’, existing variables will be overwritten.
• format ({‘NETCDF4’, ‘NETCDF4_CLASSIC’, ‘NETCDF3_64BIT’, ‘NETCDF3_CLASSIC’}, optional) – File format for the resulting netCDF file:
– NETCDF4: Data is stored in an HDF5 file, using netCDF4 API features.
– NETCDF4_CLASSIC: Data is stored in an HDF5 file, using only netCDF 3 compatible API features.
– NETCDF3_64BIT: 64-bit offset version of the netCDF 3 file format, which fully supports 2+ GB files, but is only compatible with clients linked against netCDF version 3.6.0 or later.
– NETCDF3_CLASSIC: The classic netCDF 3 file format. It does not handle 2+ GB files very well.

All formats are supported by the netCDF4-python library. scipy.io.netcdf only supports the
last two formats. The default format is NETCDF4 if you are saving a file to disk and have
the netCDF4-python library available. Otherwise, xarray falls back to using scipy to write
netCDF files and defaults to the NETCDF3_64BIT format (scipy does not support netCDF4).
• group (str, optional) – Path to the netCDF4 group in the given file to open (only works for
format=’NETCDF4’). The group(s) will be created if necessary.
• engine ({‘netcdf4’, ‘scipy’, ‘h5netcdf’}, optional) – Engine to use when writing netCDF
files. If not provided, the default engine is chosen based on available dependencies, with a
preference for ‘netcdf4’ if writing to a file on disk.
• encoding (dict, optional) – Defaults to reasonable compression. Use encoding={} to disable encoding. Nested dictionary with variable names as keys and dictionaries of variable-specific encodings as values, e.g., {'my_variable': {'dtype': 'int16', 'scale_factor': 0.1, 'zlib': True}, ...}
The h5netcdf engine supports both the NetCDF4-style compression encoding parameters {'zlib': True, 'complevel': 9} and the h5py ones {'compression': 'gzip', 'compression_opts': 9}. This allows using any compression plugin installed in the HDF5 library, e.g. LZF.
• unlimited_dims (sequence of str, optional) – Dimension(s) that should be serialized as un-
limited dimensions. By default, no dimensions are treated as unlimited dimensions. Note
that unlimited_dims may also be set via dataset.encoding['unlimited_dims'].
• compute (boolean) – If true compute immediately, otherwise return a dask.delayed.
Delayed object that can be computed later.

ufunc_per_section(sections=None, func=None, label=None, subtract_from_label=None, temp_err=False, x_indices=False, ref_temp_broadcasted=False, calc_per=’stretch’, **func_kwargs)
User function applied to parts of the cable. Super useful, many options and slightly complicated.
The function func is applied over all time steps and calculated per calc_per. The result is returned as a dictionary.
Parameters
• sections (Dict[str, List[slice]])
• func (callable, str) – A numpy function, or lambda function to apply to each ‘calc_per’.
• label
• subtract_from_label
• temp_err (bool) – The argument of the function is label minus the reference temperature.


• x_indices (bool) – To retrieve an integer array with the indices of the x-coordinates in the section/stretch.
• ref_temp_broadcasted (bool)
• calc_per ({‘all’, ‘section’, ‘stretch’})
• func_kwargs (dict) – Dictionary with options that are passed to func
• TODO – Spend time on creating a slice instead of appending everything to a list and concatenating afterwards.

Examples

# Calculate the variance of the residuals along ALL the reference
# sections wrt the temperature of the water baths
TMPF_var = d.ufunc_per_section(
    func='var', calc_per='all', label='TMPF', temp_err=True)

# Calculate the variance of the residuals PER reference section stretch
# wrt the temperature of the water baths
TMPF_var = d.ufunc_per_section(
    func='var', calc_per='stretch', label='TMPF', temp_err=True)

# Calculate the variance of the residuals PER water bath (section)
# wrt the temperature of the water baths
TMPF_var = d.ufunc_per_section(
    func='var', calc_per='section', label='TMPF', temp_err=True)

# Obtain the coordinates of the measurements per section
locs = d.ufunc_per_section(
    func=None, label='x', temp_err=False, ref_temp_broadcasted=False,
    calc_per='stretch')

# Number of observations per stretch
nlocs = d.ufunc_per_section(
    func=len, label='x', temp_err=False, ref_temp_broadcasted=False,
    calc_per='stretch')

# Broadcast the temperature of the reference sections to
# stretch/section/all dimensions. The value of the reference temperature
# (a time series) is broadcast to the shape of self[label]; self[label]
# is not used for anything else.
temp_ref = d.ufunc_per_section(
    label='ST', ref_temp_broadcasted=True, calc_per='all')

# x-coordinate index
ix_loc = d.ufunc_per_section(x_indices=True)

Note: If self[label] or self[subtract_from_label] is a Dask array, a Dask array is returned; otherwise a numpy array is returned.

variance_stokes(st_label, sections=None, reshape_residuals=True)
Calculates the variance between the measurements and a best fit at each reference section. This fits a function to the nt * nx measurements with ns * nt + nx parameters, where nx is the total number of observation locations along all sections. The temperature is constant along the reference sections, so the expression of the Stokes power can be split into a time series per reference section and a constant per observation location.
Assumptions: 1) the temperature is the same along a reference section.
Idea from discussion at page 127 in Richter, P. H. (1995). Estimating errors in least-squares fitting.
Parameters
• reshape_residuals


• st_label (str) – label of the Stokes, anti-Stokes measurement. E.g., ST, AST, REV-ST,
REV-AST
• sections (dict, optional) – Define sections. See documentation
Returns
• I_var (float) – Variance of the residuals between measured and best fit
• resid (array_like) – Residuals between measured and best fit

Notes

Because there is a large number of unknowns, time is spent on calculating an initial estimate. This can be turned off by setting it to False.
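The idea can be sketched with synthetic data: simulate one reference section as an outer product of a time series and per-location constants plus noise, compute the best fit of that same form (here via a rank-1 SVD, a simplified stand-in for the actual solver), and take the variance of the residuals:

```python
import numpy as np

rng = np.random.default_rng(42)
nt, nx = 50, 8                            # time steps, locations in one section
g_t = 100.0 + rng.normal(0.0, 1.0, nt)    # Stokes time series of the section
c_x = rng.uniform(0.9, 1.1, nx)           # constant per observation location
noise_std = 0.5
st = np.outer(g_t, c_x) + rng.normal(0.0, noise_std, (nt, nx))

# Best fit of the form (time series) x (constant per location): rank-1 SVD
u, s, vt = np.linalg.svd(st, full_matrices=False)
fit = s[0] * np.outer(u[:, 0], vt[0])

resid = st - fit
i_var = resid.var()  # estimate of the Stokes noise variance, near noise_std**2
```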
variance_stokes_exponential(st_label, sections=None, use_statsmodels=False, suppress_info=True, reshape_residuals=True)
Calculates the variance between the measurements and a best fit exponential at each reference section.
This fits a two-parameter exponential to the Stokes measurements. The temperature is constant and there
are no splices/sharp bends in each reference section. Therefore all signal decrease is due to differential
attenuation, which is the same for each reference section. The scale of the exponential does differ per
reference section.
Assumptions: 1) the temperature is the same along a reference section. 2) no sharp bends and splices in
the reference sections. 3) Same type of optical cable in each reference section.
Idea from discussion at page 127 in Richter, P. H. (1995). Estimating errors in least-squares fitting. For the weights, error propagation is used: w^2 = 1/sigma(ln y)^2 = y^2/sigma(y)^2 = y^2
Parameters
• reshape_residuals
• use_statsmodels
• suppress_info
• st_label (str) – label of the Stokes, anti-Stokes measurement. E.g., ST, AST, REV-ST,
REV-AST
• sections (dict, optional) – Define sections. See documentation
Returns
• I_var (float) – Variance of the residuals between measured and best fit
• resid (array_like) – Residuals between measured and best fit
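The log-linearized weighted fit can be sketched in numpy (synthetic single-section data; the actual routine fits all reference sections simultaneously with a shared decay):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 100.0, 40)               # position along one section (m)
st_obs = 120.0 * np.exp(-0.002 * x) + rng.normal(0.0, 0.3, x.size)

# Linearize: ln(st) = ln(a) - b*x. With constant sigma(y), error
# propagation gives sigma(ln y) = sigma(y)/y, hence weights w^2 = y^2
w = st_obs
A = np.stack([np.ones_like(x), -x], axis=1)
coef, *_ = np.linalg.lstsq(A * w[:, None], np.log(st_obs) * w, rcond=None)
a_est, b_est = np.exp(coef[0]), coef[1]       # close to 120 and 0.002
```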
dtscalibration.open_datastore(filename_or_obj, group=None, decode_cf=True, mask_and_scale=None, decode_times=True, concat_characters=True, decode_coords=True, engine=None, chunks=None, lock=None, cache=None, drop_variables=None, backend_kwargs=None, **kwargs)
Load and decode a datastore from a file or file-like object.
Parameters
• filename_or_obj (str, Path, file or xarray.backends.*DataStore) – Strings and Path objects are interpreted as a path to a netCDF file or an OpenDAP URL and opened with python-netCDF4, unless the filename ends with .gz, in which case the file is gunzipped and opened with scipy.io.netcdf (only netCDF3 supported). File-like objects are opened with scipy.io.netcdf (only netCDF3 supported).

• group (str, optional) – Path to the netCDF4 group in the given file to open (only works for
netCDF4 files).


• decode_cf (bool, optional) – Whether to decode these variables, assuming they were saved
according to CF conventions.
• mask_and_scale (bool, optional) – If True, replace array values equal to _FillValue with NA and scale values according to the formula original_values * scale_factor + add_offset, where _FillValue, scale_factor and add_offset are taken from variable attributes (if they exist). If the _FillValue or missing_value attribute contains multiple values, a warning will be issued and all array values matching one of the multiple values will be replaced by NA. mask_and_scale defaults to True except for the pseudonetcdf backend.
• decode_times (bool, optional) – If True, decode times encoded in the standard NetCDF datetime
format into datetime objects. Otherwise, leave them encoded as numbers.
• concat_characters (bool, optional) – If True, concatenate along the last dimension of character
arrays to form string arrays. Dimensions will only be concatenated over (and removed) if they
have no corresponding variable and if they are only used as the last dimension of character
arrays.
• decode_coords (bool, optional) – If True, decode the ‘coordinates’ attribute to identify coordi-
nates in the resulting dataset.
• engine ({‘netcdf4’, ‘scipy’, ‘pydap’, ‘h5netcdf’, ‘pynio’, ‘pseudonetcdf’}, optional) – Engine to use when reading files. If not provided, the default engine is chosen based on available dependencies, with a preference for ‘netcdf4’.
• chunks (int or dict, optional) – If chunks is provided, it is used to load the new dataset into dask arrays. chunks={} loads the dataset with dask using a single chunk for all arrays.
• lock (False, True or threading.Lock, optional) – If chunks is provided, this argument is passed
on to dask.array.from_array(). By default, a global lock is used when reading data
from netCDF files with the netcdf4 and h5netcdf engines to avoid issues with concurrent access
when using dask’s multithreaded backend.
• cache (bool, optional) – If True, cache data loaded from the underlying datastore in memory as NumPy arrays when accessed, to avoid reading from the underlying datastore multiple times. Defaults to True unless you specify the chunks argument to use dask, in which case it defaults to False. Does not change the behavior of coordinates corresponding to dimensions, which always load their data from disk into a pandas.Index.
• drop_variables (string or iterable, optional) – A variable or list of variables to exclude from
being parsed from the dataset. This may be useful to drop variables with problems or inconsistent
values.
• backend_kwargs (dictionary, optional) – A dictionary of keyword arguments to pass on to the
backend. This may be useful when backend options would improve performance or allow user
control of dataset processing.

Returns dataset (Dataset) – The newly created dataset.

See also:
read_xml_dir()
dtscalibration.read_sensornet_files(filepathlist=None, directory=None, file_ext=’*.ddf’, timezone_netcdf=’UTC’, timezone_input_files=’UTC’, silent=False, **kwargs)
Read a folder with measurement files. Each measurement file contains values for a single timestep. Remember
to check which timezone you are working in.
Parameters


• filepathlist (list of str, optional) – List of paths that point to the sensornet files
• directory (str, Path, optional) – Path to folder
• timezone_netcdf (str, optional) – Timezone string of the netcdf file. UTC follows CF-
conventions.
• timezone_input_files (str, optional) – Timezone string of the measurement files. Remember to check when the measurements were taken, and whether daylight saving time was in use.
• file_ext (str, optional) – file extension of the measurement files
• silent (bool) – If set to True, some verbose texts are not printed to stdout/screen
• kwargs (dict-like, optional) – keyword-arguments are passed to DataStore initialization

Notes

Compressed sensornet files cannot be decoded directly, because the files are encoded with encoding=’windows-1252’ instead of UTF-8.
Returns datastore (DataStore) – The newly created datastore.
dtscalibration.read_silixa_files(filepathlist=None, directory=None, zip_handle=None, file_ext=’*.xml’, timezone_netcdf=’UTC’, silent=False, load_in_memory=’auto’, **kwargs)
Read a folder with measurement files. Each measurement file contains values for a single timestep. Remember
to check which timezone you are working in.
The Silixa files are already timezone aware.
Parameters
• filepathlist (list of str, optional) – List of paths that point to the Silixa files
• directory (str, Path, optional) – Path to folder
• timezone_netcdf (str, optional) – Timezone string of the netcdf file. UTC follows CF-
conventions.
• file_ext (str, optional) – file extension of the measurement files
• silent (bool) – If set to True, some verbose texts are not printed to stdout/screen
• load_in_memory ({‘auto’, True, False}) – If ‘auto’, the Stokes data is only loaded into memory for small files
• kwargs (dict-like, optional) – keyword-arguments are passed to DataStore initialization
Returns datastore (DataStore) – The newly created datastore.
dtscalibration.plot_dask(arr, file_path=None)
For debugging the scheduling of the calculation of dask arrays. Requires additional libraries to be installed.
Parameters
• arr (Dask-Array) – An uncomputed dask array
• file_path (Path-like, str, optional) – Path to save graph
Returns out (array-like) – The calculated array

CHAPTER 6

Contributing

Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

6.1 Bug reports

When reporting a bug please include:


• Your operating system name and version.
• Any details about your local setup that might be helpful in troubleshooting.
• Detailed steps to reproduce the bug.

6.2 Documentation improvements

dtscalibration could always use more documentation, whether as part of the official dtscalibration docs, in docstrings,
or even on the web in blog posts, articles, and such.

6.3 Feature requests and feedback

The best way to send feedback is to file an issue at https://github.com/bdestombe/python-dts-calibration/issues.


If you are proposing a feature:
• Explain in detail how it would work.
• Keep the scope as narrow as possible, to make it easier to implement.
• Remember that this is a volunteer-driven project, and that code contributions are welcome :)

53
dtscalibration, Release 0.6.3

6.4 Development

To set up python-dts-calibration for local development:


1. Fork python-dts-calibration (look for the “Fork” button).
2. Clone your fork locally:

git clone git@github.com:your_name_here/python-dts-calibration.git

3. Create a branch for local development:

git checkout -b name-of-your-bugfix-or-feature

Now you can make your changes locally.


4. When you’re done making changes, run all the checks, doc builder and spell checker with one tox command:

tox

5. Commit your changes and push your branch to GitHub:

git add .
git commit -m "Your detailed description of your changes."
git push origin name-of-your-bugfix-or-feature

6. Submit a pull request through the GitHub website.

6.4.1 Pull Request Guidelines

If you need some code review or feedback while you’re developing the code, just make the pull request.
For merging, you should:
1. Include passing tests (run tox)[1].
2. Update documentation when there’s new API, functionality etc.
3. Add a note to CHANGELOG.rst about the changes.
4. Add yourself to AUTHORS.rst.

6.4.2 Tips

To run a subset of tests:

tox -e envname -- pytest -k test_myfeature

To run all the test environments in parallel (you need to pip install detox):

detox

[1] If you don’t have all the necessary Python versions available locally, you can rely on Travis; it will run the tests for each change you add in the pull request. It will be slower though...

CHAPTER 7

Authors

• Bas des Tombe - https://github.com/bdestombe


• Bart Schilperoort - https://github.com/BSchilperoort


CHAPTER 8

Changelog

8.1 0.6.3 (2019-04-03)

• Added reading support for zipped Silixa files. It still rarely fails due to an upstream bug.
• pretty __repr__
• Reworked double ended calibration procedure. Integrated differential attenuation outside of reference sections
is now calculated separately.
• New approach for estimation of Stokes variance. Not restricted to a decaying exponential
• Fixed a bug in averaging TMPF and TMPB to TMPW
• Modified residuals plot, especially useful for long fibers (Great work Bart!)
• Example notebooks updated accordingly
• Fixed a bug in to_netcdf when passing encodings
• Better support for sections that are not related to a timeseries.

8.2 0.6.2 (2019-02-26)

• Double-ended weighted calibration procedure is rewritten so that the integrated differential attenuation outside
of the reference sections is calculated separately. Better memory usage and faster
• Other calibration routines cleaned up
• Official support for Python 3.7
• Coverage figures are now trustworthy
• String representation improved
• Include test for aligning double ended measurements
• Example for aligning double ended measurements


8.3 0.6.1 (2019-01-04)

• Many examples were shown in the documentation


• Fixed verbose settings of solvers
• Revised example notebooks
• Moved to 80 characters per line (PEP)
• More Python formatting using YAPF
• Use example of plot_residuals_reference_sections function in Stokes variance example notebook
• Support Python 3.7

8.4 0.6.0 (2018-12-08)

• Reworked the double-ended calibration routine and the routine for confidence intervals. The integrated differ-
ential attenuation is not zero at x=0 anymore.
• Verbose commands carpentry
• Bug fixed that would make the read_silixa routine crash if there are copies of the same file in the same folder
• Routine to read sensornet files. Only single-ended configurations are supported for now. Does anyone have double-ended measurements?
• Lazy calculation of the confidence intervals
• Bug solved. The x-coordinates were not calculated correctly. The bug only appeared for measurements along
long cables.
• Example notebook of importing a timeseries. For example, importing measurements from an external temperature sensor for calibration.
• Updated documentation

8.5 0.5.3 (2018-10-26)

• No changes

8.6 0.5.2 (2018-10-26)

• New resample_datastore method (see basic usage notebook)


• New notebook on basic usage of DataStore
• Support for Silixa v4 (Windows xp based system) and Silixa v6 (Windows 7) measurement files
• The representation string now includes the sections
• Reorganized the IO related files
• CI: Add AppVeyor to continuously test on the Windows platform
• Auto load Silixa files to memory option, if size is small


8.7 0.5.1 (2018-10-19)

• Rewritten the routine that reads Silixa measurement files


• dts-calibration is now citable
• Refactored the MC confidence interval routine
• MC confidence interval routine speed up, with full dask support
• Link to mybinder.org to try the example notebooks online
• Added a few missing dependencies
• The routine to read the Silixa files is completely refactored. Faster, smarter. Supports both the path to a directory and a list of file paths.
• Changed imports from dtscalibration to be relative

8.8 0.4.0 (2018-09-06)

• Single ended calibration


• Confidence intervals for single ended calibration
• Example notebooks have figures embedded
• Several bugs squashed
• Reorganized DataStore functions

8.9 0.2.0 (2018-08-16)

• Double ended calibration


• Confidence intervals for double ended calibration

8.10 0.1.0 (2018-08-01)

• First release on PyPI.

CHAPTER 9

Indices and tables

• genindex
• modindex
• search




Python Module Index

d
dtscalibration, 41

Index

C
calibration_double_ended() (dtscalibration.DataStore method), 42
calibration_single_ended() (dtscalibration.DataStore method), 43
channel_configuration (dtscalibration.DataStore attribute), 43
chbw (dtscalibration.DataStore attribute), 43
chfw (dtscalibration.DataStore attribute), 44
conf_int_double_ended() (dtscalibration.DataStore method), 44
conf_int_single_ended() (dtscalibration.DataStore method), 45

D
DataStore (class in dtscalibration), 41
dtscalibration (module), 41

G
get_default_encoding() (dtscalibration.DataStore method), 46
get_time_dim() (dtscalibration.DataStore method), 46
get_x_dim() (dtscalibration.DataStore method), 46

I
in_confidence_interval() (dtscalibration.DataStore method), 46
inverse_variance_weighted_mean() (dtscalibration.DataStore method), 46
inverse_variance_weighted_mean_array() (dtscalibration.DataStore method), 46
is_double_ended (dtscalibration.DataStore attribute), 46

O
open_datastore() (in module dtscalibration), 50

P
plot_dask() (in module dtscalibration), 52

R
read_sensornet_files() (in module dtscalibration), 51
read_silixa_files() (in module dtscalibration), 52
resample_datastore() (dtscalibration.DataStore method), 46

S
sections (dtscalibration.DataStore attribute), 47

T
temperature_residuals() (dtscalibration.DataStore method), 47
timeseries_keys (dtscalibration.DataStore attribute), 47
to_netcdf() (dtscalibration.DataStore method), 47

U
ufunc_per_section() (dtscalibration.DataStore method), 48

V
variance_stokes() (dtscalibration.DataStore method), 49
variance_stokes_exponential() (dtscalibration.DataStore method), 50