Encoding Data

The encoding process mirrors the decoding process in reverse. Data in memory is described by an Encoder object. This data is then encoded into a sequence of frames and written to an output data stream.

Encoder

The Encoder provides several options for handling memory layouts, in the same way as the Decoder.

Row-major layout

In row-major layout, consecutive elements in a data row reside adjacent to each other in memory, and the block of memory comprises a sequence of rows.

long nrows = 1000;
int ncols = 6;
double data[nrows][ncols];
// set up the data here...

odc_encoder_t* encoder = NULL;
odc_new_encoder(&encoder);

odc_encoder_add_column(encoder, "column0", ODC_INTEGER);
odc_encoder_add_column(encoder, "column1", ODC_INTEGER);
odc_encoder_add_column(encoder, "column2", ODC_REAL);
odc_encoder_add_column(encoder, "column3", ODC_STRING);
odc_encoder_add_column(encoder, "column4", ODC_REAL);

// column3 is a 16-byte string column (hence takes 2 cols in the array --> ncols=6)
odc_encoder_column_set_data_size(encoder, 3, 16);

odc_encoder_set_data_array(encoder, data, ncols*sizeof(double), nrows, 0);

// encode the data here...

odc_free_encoder(encoder);
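
The "set up the data here" step has to place a 16-byte string into two adjacent 8-byte slots of a row. A minimal sketch of one way to do that follows. The helper name pack_string_cell is hypothetical, and padding short strings with NUL bytes is an assumption about what the library expects, not something stated above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a C string into a fixed-width cell of `width` bytes.
 * Shorter strings are padded with NUL bytes (an assumption about the expected
 * padding); longer strings are truncated to `width` bytes.
 * Returns the number of characters copied from `s`. */
size_t pack_string_cell(void* cell, size_t width, const char* s) {
    size_t len = strlen(s);
    if (len > width) {
        len = width;
    }
    memset(cell, 0, width);   /* clear the whole cell first */
    memcpy(cell, s, len);     /* then copy the string bytes */
    return len;
}
```

With the array declared above, pack_string_cell(&data[row][3], 16, "station_a") would fill slots 3 and 4 of the given row, which are contiguous in a row-major block.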

Column-major layout

In column-major layout, consecutive elements in a single data column are adjacent to each other in memory, and the block of memory comprises a sequence of columns.

long nrows = 1000;
int ncols = 6;
double data[ncols][nrows];
// set up the data here...

odc_encoder_t* encoder = NULL;
odc_new_encoder(&encoder);

odc_encoder_add_column(encoder, "column0", ODC_INTEGER);
odc_encoder_add_column(encoder, "column1", ODC_INTEGER);
odc_encoder_add_column(encoder, "column2", ODC_REAL);
odc_encoder_add_column(encoder, "column3", ODC_STRING);
odc_encoder_add_column(encoder, "column4", ODC_REAL);

// column3 is a 16-byte string column (hence takes 2 cols in the array --> ncols=6)
odc_encoder_column_set_data_size(encoder, 3, 16);

odc_encoder_set_data_array(encoder, data, ncols*sizeof(double), nrows, sizeof(double));

// encode the data here...

odc_free_encoder(encoder);
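
To make the difference between the two layouts concrete, the sketch below computes where a given (row, slot) pair lives in each kind of block. It relies only on how C lays out two-dimensional arrays of double; the function names are hypothetical and nothing here is part of the library.

```c
#include <stddef.h>

/* Byte offset of (row, slot) in a row-major block: double data[nrows][ncols].
 * Elements of one row are adjacent; rows follow one another. */
size_t row_major_offset(size_t row, size_t slot, size_t ncols) {
    return (row * ncols + slot) * sizeof(double);
}

/* Byte offset of (row, slot) in a column-major block: double data[ncols][nrows].
 * Elements of one column are adjacent; columns follow one another. */
size_t column_major_offset(size_t row, size_t slot, size_t nrows) {
    return (slot * nrows + row) * sizeof(double);
}
```

In the column-major case, consecutive rows of one slot are sizeof(double) bytes apart. The non-zero final argument in the column-major call above presumably conveys this element width, but that reading is an inference from the example rather than a statement of the API contract.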

Custom layout

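In a custom layout, a periodic memory layout can be specified for each column independently, so that the Encoder matches the layout of a specific data source. In the example below, each column is described by its element size, the distance in bytes between consecutive elements, and a pointer to its first element.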

long nrows = 1000;

uint64_t data0[nrows];
uint64_t data1[nrows];
double data2[nrows];
char data3[nrows][16];
double data4[nrows];
// set up the data here...

odc_encoder_t* encoder = NULL;
odc_new_encoder(&encoder);

odc_encoder_set_row_count(encoder, nrows);

odc_encoder_add_column(encoder, "column0", ODC_INTEGER);
odc_encoder_add_column(encoder, "column1", ODC_INTEGER);
odc_encoder_add_column(encoder, "column2", ODC_REAL);
odc_encoder_add_column(encoder, "column3", ODC_STRING);
odc_encoder_add_column(encoder, "column4", ODC_REAL);

// column3 is a 16-byte string column
odc_encoder_column_set_data_size(encoder, 3, 16);

odc_encoder_column_set_data_array(encoder, 0, sizeof(uint64_t), sizeof(uint64_t), data0);
odc_encoder_column_set_data_array(encoder, 1, sizeof(uint64_t), sizeof(uint64_t), data1);
odc_encoder_column_set_data_array(encoder, 2, sizeof(double), sizeof(double), data2);
odc_encoder_column_set_data_array(encoder, 3, 16, 16, data3);
odc_encoder_column_set_data_array(encoder, 4, sizeof(double), sizeof(double), data4);

// encode the data here...

odc_free_encoder(encoder);
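
A periodic layout is also what an array of structs provides: each field recurs at a fixed distance equal to the size of the struct. The sketch below illustrates this with a hypothetical record_t type and a hypothetical column_view_t helper. Neither belongs to the library; they only show how a base pointer, an element size and a stride identify one column.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type, standing in for an application's own data. */
typedef struct {
    int64_t id;
    double  value;
    char    station[16];
} record_t;

/* Hypothetical helper type: one column of a periodic layout. */
typedef struct {
    const void* base;         /* address of the column's first element */
    size_t      element_size; /* bytes occupied by one element */
    size_t      stride;       /* bytes from one row's element to the next */
} column_view_t;

/* Describe the `value` field across an array of records. */
column_view_t value_column(const record_t* records) {
    column_view_t v = { &records[0].value, sizeof(double), sizeof(record_t) };
    return v;
}

/* Address of the element belonging to `row`. */
const void* column_element(const column_view_t* v, size_t row) {
    return (const char*)v->base + row * v->stride;
}
```

Going by the argument order in the calls above (column index, two sizes, then the data pointer), such a view would presumably be passed as element size, stride and base. This is an assumption and should be checked against the library header.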

Once an Encoder describing the data has been constructed, the data can be encoded into frames.

The C API supports three ways of encoding data.

File descriptor

Data can be encoded into an already open file descriptor using the odc_encode_to_file_descriptor() function.

#include <fcntl.h>
#include <unistd.h>

int file_descriptor = open("imaginary/path.odb", O_CREAT|O_TRUNC|O_WRONLY, 0666);
long bytes_encoded;

odc_encode_to_file_descriptor(encoder, file_descriptor, &bytes_encoded);

close(file_descriptor);

Memory buffer

Data can be encoded into a pre-allocated memory buffer using the odc_encode_to_buffer() function.

char buffer[4096];
long bytes_encoded;

odc_encode_to_buffer(encoder, buffer, sizeof(buffer), &bytes_encoded);

Note

If the supplied buffer is too small to hold the encoded data, an error is returned.

Stream handler

Data can be encoded via a stream handler using the odc_encode_to_stream() function. A write callback function is called for each chunk of data to be written to the output stream, analogous to the POSIX write() function.

long write_fn(void* context, const void* buffer, long length) {
    // user defined action
    return length; // return handled length
}

// user defined context, passed unchanged to callback
void* context = NULL; // point this at your own state if the callback needs any
long bytes_encoded;

odc_encode_to_stream(encoder, context, write_fn, &bytes_encoded);
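
As an illustration of a non-trivial callback, the sketch below collects the encoded output in a heap buffer that grows on demand, which avoids having to guess a buffer size in advance. The type growable_buffer_t and the function append_fn are hypothetical; only the callback signature is taken from write_fn above. Returning -1 on failure mirrors POSIX write(), and how the library reacts to such a return value is an assumption not covered here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical context type: a heap buffer that grows as data arrives. */
typedef struct {
    char*  data;
    size_t size;
    size_t capacity;
} growable_buffer_t;

/* Write callback with the same signature as write_fn above.
 * Appends `length` bytes to the buffer and returns the number of bytes
 * handled, or -1 if the length is negative or memory cannot be allocated. */
long append_fn(void* context, const void* buffer, long length) {
    growable_buffer_t* out = context;
    if (length < 0) {
        return -1;
    }
    if (length == 0) {
        return 0;
    }
    size_t needed = out->size + (size_t)length;
    if (needed > out->capacity) {
        /* Double the capacity until the new chunk fits. */
        size_t new_capacity = out->capacity ? out->capacity : 4096;
        while (new_capacity < needed) {
            new_capacity *= 2;
        }
        char* grown = realloc(out->data, new_capacity);
        if (!grown) {
            return -1;
        }
        out->data = grown;
        out->capacity = new_capacity;
    }
    memcpy(out->data + out->size, buffer, (size_t)length);
    out->size += (size_t)length;
    return length;
}
```

A zero-initialised growable_buffer_t would be passed as the context argument and append_fn as the callback, in place of context and write_fn in the call above. The caller remains responsible for calling free() on the data member afterwards.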

Bitfields

Bitfield columns can be used to store flags, up to a maximum of 32 bits per column. Within an integer, the bits can be identified and named by their offset. Named items may be single bits or groups of bits, so each item has both an offset and a size.

long nrows = 1000;
int ncols = 1;
uint64_t data[nrows];
// set up the data here...

odc_encoder_t* encoder = NULL;
odc_new_encoder(&encoder);

odc_encoder_set_row_count(encoder, nrows);

odc_encoder_add_column(encoder, "flags", ODC_BITFIELD);

odc_encoder_column_add_bitfield(encoder, 0, "flag_a", 1);
odc_encoder_column_add_bitfield(encoder, 0, "flag_b", 2);
odc_encoder_column_add_bitfield(encoder, 0, "flag_c", 3);
odc_encoder_column_add_bitfield(encoder, 0, "flag_d", 1);

odc_encoder_column_set_data_array(encoder, 0, sizeof(uint64_t), sizeof(uint64_t), data);

// encode the data here...

odc_free_encoder(encoder);
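
To fill the data array for such a column, the individual flag values have to be packed into one integer per row. The sketch below shows one way of doing so for the four fields declared above. It assumes that fields are laid out from bit 0 upwards in the order they were added, giving offsets 0, 1, 3 and 6; that ordering is an assumption, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Widths of the four fields declared above: flag_a, flag_b, flag_c, flag_d. */
static const int FIELD_BITS[4] = { 1, 2, 3, 1 };

/* Offset of field `index`: the sum of the widths of the fields before it.
 * Assumes fields are laid out from bit 0 upwards in declaration order. */
int field_offset(int index) {
    int offset = 0;
    for (int i = 0; i < index; ++i) {
        offset += FIELD_BITS[i];
    }
    return offset;
}

/* Store `value` into field `index` of `word`, leaving other fields untouched.
 * Bits of `value` that do not fit in the field are discarded. */
uint64_t set_field(uint64_t word, int index, uint64_t value) {
    int offset = field_offset(index);
    uint64_t mask = ((UINT64_C(1) << FIELD_BITS[index]) - 1) << offset;
    return (word & ~mask) | ((value << offset) & mask);
}

/* Read field `index` back out of `word`. */
uint64_t get_field(uint64_t word, int index) {
    uint64_t mask = (UINT64_C(1) << FIELD_BITS[index]) - 1;
    return (word >> field_offset(index)) & mask;
}
```

Each element of the data array would then be built by calling set_field once per flag, starting from zero.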

Properties

An arbitrary dictionary of string key:value pairs can be associated with a frame.

const char* property_key = "encoded_by";
const char* property_value = "ECMWF";

odc_encoder_add_property(encoder, property_key, property_value);