Once we have the configuration file set, we can collect and upload data. The usage for the command can be shown using the --help. Running that will display the following:

python3 collect_data.py --help

usage: collect_data.py [-h] [--config_file CONFIG_FILE]

Automate Regatta cluster data collection.

optional arguments:

 -h, --help show this help message and exit

 --config_file CONFIG_FILE Path to 'data_collector_config.json' config file.

Below is a description of the main functions.

Collecting Data

Once deployed, you may now run the collect_data script on each server. The command to run it is:

python3 collect_data.py --config_file /path/to/collect_data_configuration.json

This will execute the collection based on the predefined parameters in the configuration file.

For example, lets show a run with the defaults:

python3 collect_data.py –config_file collect_data_configuration.json

The script then will do as follows:

  1. Collect all the files from the default directories into a compressed archive.

    The archive name have the format:

    [hostname]-[ip]-[date]-[time].tar.zst

  2. The archive will be stored in the archive output directory

The file will be stored in the archive directory available for uploading.

An example of the compressed archive file:

tmp/regatta-host05-172_27_12_2-2024_11_14-15_52_56.tar

As we can see in this example, the directory is constructed of the 4 parameters mentioned:

a. Server Name: regatta-host05

b. Internal IP: 172_27_12_2

c. Date: 2024_11_14

d. Time: 15_52_56

Uploading Data

If we would like to upload the data collected from the previous command, we would update the following parameters in the data_collect_configuration.json:

upload=true

tar_path = "/tmp/regatta-host05-172_27_12_2-2024_11_14-15_52_56.tar"

As described previously, this will cause the data_collect script to do as follows:

  1. Avoid collecting data:

When tar_path is not null, the data will not be collected from the default paths but rather read from the path in tar_path.

  1. Apply the filter:

The data collected from the directory will be filtered according to the templates in file_ignore_list.

  1. Upload

The data will be uploaded to the Regatta cloud storage bucket

NOTE: We assume that there is active internet connection, otherwise the upload will fail.

Once the data is uploaded, Regatta Support will be able to review and analyze the data.

Note that if you have internet access when collecting the data you may set upload to true and leave the tar_path as null. In this case the data will be collected and uploaded to the Regatta cloud bucket in a single run.