pygwb.preprocessing
The preprocessing module combines all the functions that handle the preprocessing of the data used in the analysis.
This covers anything related to preparing the data for a pygwb analysis run.
It can read data from frame files, either stored locally or publicly available (for additional information on frame files, see here).
Other functionality includes resampling the data, applying a high-pass filter to the data, and applying a time shift.
These functionalities come together in the triplet of preprocessing_data functions, which read in data and resample and/or high-pass the data on the fly.
The triplet can work with a gwpy.timeseries.TimeSeries, a normal array, or a gravitational-wave channel name, in which case the data is read from that channel using the provided local or public frame files.
Another functionality of the module is to gate data based on the gating function in gwpy, gwpy.timeseries.TimeSeries.gate.
More information can be found here.
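The operations named above (resampling, high-pass filtering, cropping the filter transients, and time shifting) can be illustrated on a synthetic array. The sketch below is not the pygwb implementation: it calls numpy and scipy.signal directly, the helper names resample_and_highpass and circular_timeshift are hypothetical, the Butterworth filter is a stand-in choice, and treating the time shift as a circular shift is an assumption made here for illustration.

```python
import numpy as np
from scipy import signal


def resample_and_highpass(data, fs_in, fs_out, cutoff, crop_seconds):
    """Downsample by an integer factor, high-pass, then crop both ends."""
    factor = fs_in // fs_out
    if factor * fs_out != fs_in:
        raise ValueError("this sketch only handles integer downsampling factors")
    # Polyphase resampling applies an anti-aliasing FIR filter internally.
    resampled = signal.resample_poly(data, up=1, down=factor)
    # Zero-phase Butterworth high-pass (a stand-in, not the library's filter).
    sos = signal.butter(4, cutoff, btype="highpass", fs=fs_out, output="sos")
    filtered = signal.sosfiltfilt(sos, resampled)
    # Discard samples near the edges, where filter transients are largest.
    n_crop = int(crop_seconds * fs_out)
    return filtered[n_crop: filtered.size - n_crop]


def circular_timeshift(data, fs, shift_seconds):
    """Shift the samples circularly by shift_seconds (assumed convention)."""
    return np.roll(data, int(round(shift_seconds * fs)))


# Synthetic input: a 2 Hz line (below the cutoff) plus a 50 Hz line (above it).
fs_in, fs_out = 1024, 256
t = np.arange(8 * fs_in) / fs_in
raw = np.sin(2 * np.pi * 2 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)

clean = resample_and_highpass(raw, fs_in, fs_out, cutoff=11, crop_seconds=1)
shifted = circular_timeshift(clean, fs_out, shift_seconds=1)
print(clean.size, clean.size / fs_out)
```

With an 11 Hz cutoff, the 2 Hz component is strongly suppressed while the 50 Hz component passes essentially unchanged, and one second is removed from each end of the eight-second input.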
Examples
As an example, we read in some data from a certain channel and then resample, high-pass and apply gating to the data. First, we have to import the module.
>>> import pygwb.preprocessing as ppp
Then, we read in some data using the read_data function.
For concreteness, we read in public data from the LIGO Hanford “H1” detector. This can be done as shown below.
The “public” tag indicates we are obtaining public data from the GWOSC servers.
>>> IFO = "H1"
>>> data_timeseries = ppp.read_data(
...     IFO,
...     "public",  # data_type
...     "H1:GWOSC-16KHZ_R1_STRAIN",  # channel
...     1247644138,  # t0
...     1247648138,  # tf
...     "",  # local_data_path
...     16384,  # input_sample_rate
... )
>>> print(data_timeseries.sample_rate)
16384.0 Hz
The sample rate is shown for illustrative purposes. Now, we preprocess the data, meaning that it is resampled and a high-pass filter is applied to it. As an example, the data is resampled from 16384 Hz to 4096 Hz (nominally 4 kHz).
>>> new_sample_rate = 4096
>>> preprocessed_timeseries = ppp.preprocessing_data_gwpy_timeseries(
...     IFO,
...     data_timeseries,
...     new_sample_rate,
...     11,  # cutoff_frequency
...     2,  # number_cropped_seconds
...     "hamming",  # window_downsampling
...     "fir",  # ftype
...     0,  # timeshift
... )
>>> print(preprocessed_timeseries.sample_rate)
4096.0 Hz
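As a quick sanity check on the numbers used above, the expected durations and sample counts can be worked out by hand. The sketch below is plain arithmetic and does not call pygwb. It assumes the requested stretch of data has no gaps and, as a further assumption, that number_cropped_seconds is removed from both the start and the end of the data.

```python
# Values copied from the example above.
t0, tf = 1247644138, 1247648138
input_sample_rate = 16384
new_sample_rate = 4096
number_cropped_seconds = 2

duration = tf - t0  # seconds of data requested
downsampling_factor = input_sample_rate // new_sample_rate
n_input_samples = duration * input_sample_rate

# Assumption: the crop removes number_cropped_seconds at BOTH ends.
cropped_duration = duration - 2 * number_cropped_seconds
n_output_samples = cropped_duration * new_sample_rate

print(duration, downsampling_factor, n_input_samples)
print(cropped_duration, n_output_samples)
```

Under these assumptions, 4000 s of data at 16384 Hz is downsampled by a factor of 4, leaving 3996 s at 4096 Hz after cropping.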
One can see that the sample rate was indeed modified. Another important part of preprocessing is gating the data. Again using default values for the parameters, one can run the following lines:
>>> gated_timeseries, deadtime = ppp.self_gate_data(
...     preprocessed_timeseries,
...     1.0,  # gate_tzero
...     0.5,  # gate_tpad
...     50.0,  # gate_threshold
...     0.5,  # cluster_window
...     True,  # gate_whiten
... )
More information on the gating procedure can be found here.
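To give a feel for what gating does, the sketch below zeroes out the data around samples whose absolute value exceeds a threshold and reports how much time was removed. It is a deliberately simplified illustration and not the gwpy or pygwb algorithm: it uses a hard cut to zero with no tapering, it does not whiten the data or cluster nearby triggers, and the helper name simple_gate and its return convention (total gated seconds as a float) are hypothetical.

```python
import numpy as np


def simple_gate(data, sample_rate, threshold, tzero):
    """Zero out +/- tzero seconds around every sample above threshold.

    Returns the gated copy and the total gated time in seconds.
    """
    data = np.asarray(data, dtype=float)
    half_width = int(round(tzero * sample_rate))
    mask = np.zeros(data.size, dtype=bool)
    # Flag a window around each loud sample; overlapping windows merge.
    for idx in np.flatnonzero(np.abs(data) > threshold):
        lo = max(idx - half_width, 0)
        hi = min(idx + half_width + 1, data.size)
        mask[lo:hi] = True
    gated = data.copy()
    gated[mask] = 0.0
    deadtime = mask.sum() / sample_rate
    return gated, deadtime


# Unit-variance noise sampled at 100 Hz with one loud glitch injected.
rng = np.random.default_rng(0)
sample_rate = 100
noise = rng.normal(size=1000)
noise[500] = 100.0

gated, deadtime = simple_gate(noise, sample_rate, threshold=50.0, tzero=1.0)
print(deadtime)
```

The hard cut keeps the sketch short; a smooth taper at the edges of each gate, as controlled by a padding parameter such as gate_tpad above, reduces the spectral leakage that an abrupt transition to zero would introduce.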
Functions
Function to apply a high-pass filter to a timeseries.
Function doing the pre-processing of the data to be used in the remainder of the code.
Function doing the pre-processing of a gwpy timeseries to be used in the remainder of the code.
Function performing the pre-processing of a time-series array to be used in the remainder of the code.
Function that reads in the data to be used in the rest of the code.
Function doing part of the pre-processing (resampling and filtering) of the data to be used in the remainder of the code.
Function to self-gate data to be used in the stochastic pipeline.
Function to identify segment start times either with or without sidereal option.
Function that shifts a timeseries by an amount |