API reference

pybufrkit

Work with WMO BUFR messages with Pure Python.

Note

The APIs of version 0.2.0 introduce breaking changes relative to those of version 0.1.x. Upgrading to the latest version is always recommended.

github:https://github.com/ywangd/pybufrkit
docs:http://pybufrkit.readthedocs.io/
author:Yang Wang (ywangd@gmail.com)

pybufrkit.bufr

Classes for representing a BUFR message and its components.

class pybufrkit.bufr.BufrMessage(filename='')

This class represents a single BUFR message, which is composed of different sections. Note that this is different from BufrTemplateData, which is only part of the overall message and is dedicated to the data associated with the Template.

Properties of this class are proxies to the actual fields of its sections. They are set by the sections when they are processed. The proxy approach allows these properties to be referenced in a consistent way no matter where they actually come from. This keeps sections loosely coupled, i.e. one section does not need to know about other sections and is free to change if needed.
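The proxy idea can be illustrated with a small standalone sketch. The Section and Message classes and the register method below are hypothetical stand-ins invented for this illustration; they are not pybufrkit classes and are not claimed to mirror its implementation.

```python
class Section:
    """Hypothetical stand-in for a message section holding decoded fields."""

    def __init__(self):
        self.fields = {}


class Message:
    """Hypothetical message whose attributes proxy to section fields."""

    def __init__(self):
        self._proxies = {}  # property name -> section that owns the field

    def register(self, name, section):
        # Called by a section when it is processed.
        self._proxies[name] = section

    def __getattr__(self, name):
        # Only invoked when normal attribute lookup fails.
        proxies = self.__dict__.get('_proxies', {})
        if name in proxies:
            return proxies[name].fields[name]
        raise AttributeError(name)


section = Section()
section.fields['edition'] = 4
message = Message()
message.register('edition', section)
print(message.edition)
```

The caller reads message.edition without knowing which section owns the field, which is the loose coupling the paragraph above describes.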

add_section(section)

Add a section to the message

Parameters:section (BufrSection) – The Bufr Section to add
build_template(tables_root_dir, normalize=1)

Build the BufrTemplate object using the list of unexpanded descriptors and corresponding table group.

Parameters:
  • tables_root_dir – The root directory to find BUFR tables
  • normalize – Whether to use some default table group if the specific one is not available.
Returns:

A tuple of BufrTemplate and the associated TableGroup

wire()

Wire the flat list of descriptors and values to a full hierarchical structure. Also allocate all attributes to their corresponding descriptors.

class pybufrkit.bufr.BufrSection

This class represents a Section in a Bufr Message.

add_parameter(parameter)

Add a parameter to the section object.

Parameters:parameter (SectionParameter) –
get_metadata(k)

Get value for metadata of the given name.

Parameters:k (str) – Name of the metadata.
Returns:Value of the metadata
get_parameter_offset(parameter_name)

Get the bit offset, measured from the beginning of the section, of the parameter with the given name.

Returns:The bit offset.
Return type:int
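As a rough illustration of what a bit offset means here, the sketch below sums the bit widths of all parameters that precede the requested one. The Param record, the parameter_offset helper and the parameter names are assumptions made for this example, not part of the documented API.

```python
from collections import namedtuple

# Hypothetical, simplified parameter record: just a name and a bit width.
Param = namedtuple('Param', ['name', 'nbits'])


def parameter_offset(params, parameter_name):
    """Return the bit offset of the named parameter from the section start."""
    offset = 0
    for param in params:
        if param.name == parameter_name:
            return offset
        offset += param.nbits
    raise KeyError(parameter_name)


# Illustrative layout only; real layouts come from the section config files.
params = [
    Param('start_signature', 32),
    Param('length', 24),
    Param('edition', 8),
]
print(parameter_offset(params, 'edition'))  # 56
```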
set_metadata(k, v)

Set the value of the metadata with the given key.

Parameters:
  • k (str) – Name of the metadata
  • v (object) – Value of the metadata
class pybufrkit.bufr.SectionConfigurer(definitions_dir=None)

This class is responsible for loading the section config JSON files. It also initialises and configures a requested Section.

configure_section(bufr_message, section_index, configuration_transformers=())

Initialise and configure a section for the given section index and version.

Parameters:
  • bufr_message (BufrMessage) – The Bufr Message object to configure
  • section_index (int) – (Zero-based) index of the section
  • configuration_transformers (collection) – A collection of configuration transformation functions. These functions make it possible to use the same set of JSON files while still dynamically providing different coder behaviours.
Returns:

The configured section or None if not present

configure_section_with_values(bufr_message, section_index, values, overrides=None)

Initialise and configure a section for the given section index and version, and also populate the value of each section parameter with the given list of values. Used by the encoder.

Parameters:
  • bufr_message (BufrMessage) – The BUFR message object to configure
  • section_index (int) – The zero-based section index
  • values (list) – A list of values for the parameters.
Returns:

The configured section or None if the section is not present

static get_section_index_and_edition(fname)

Get Section Index and version from file name of a configuration file.

Parameters:fname (str) – The base file name
Returns:The index and edition numbers.
static ignore_value_expectation(config)

Remove any expectation value check.

Parameters:config (dict) – The config JSON object loaded from a configuration file.
static info_configuration(config)

This is a configuration transformation function to make the decoder work only for the part of message before the template data.

Parameters:config (dict) – The config JSON object loaded from a configuration file.
class pybufrkit.bufr.SectionNamespace(**kwds)

A Section Namespace is an ordered dictionary that stores the decoded parameters with their names as the keys.

class pybufrkit.bufr.SectionParameter(name, nbits, data_type, expected, as_property, value=None)

This class represents a Parameter of a Bufr Section.

pybufrkit.descriptors

The Descriptors should always be instantiated by Tables, because the Tables provide caching and other wiring work. Do NOT instantiate the Descriptors directly!

This module contains many Descriptor classes, covering not only the canonical descriptor types of the BUFR spec, but also Conceptual Descriptors that help the processing. For example, an AssociatedDescriptor class is needed to represent associated values signified by operator descriptor 204YYY.

class pybufrkit.descriptors.AssociatedDescriptor(id_, nbits)

Associated field for element descriptor

Parameters:nbits (int) – Number of bits used by this descriptor.
class pybufrkit.descriptors.BufrTemplate(id_=999999, name='', members=None)

This class represents a BUFR Template. A Template is composed of one or more BUFR Descriptors. It is used in a BUFR message to describe the data section.

original_descriptor_ids

Get the list of descriptor IDs that can be used to instantiate the Template.

Return type:[int]

class pybufrkit.descriptors.DelayedReplicationDescriptor(id_, members=None, factor=None)

Delayed replication Descriptor 1XX000

class pybufrkit.descriptors.Descriptor(id_)

This class is the base class of all BUFR descriptors. It provides common machinery for Descriptors.

Parameters:id (int) – The descriptor ID.
F

The F value of the descriptor.

X

The X value of the descriptor.

Y

The Y value of the descriptor.
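The F, X and Y values follow the usual BUFR convention of reading a six-digit descriptor ID as FXXYYY. A minimal standalone sketch of that arithmetic follows; split_fxy is a hypothetical helper name, not a pybufrkit function.

```python
def split_fxy(descriptor_id):
    """Split a six-digit descriptor ID of the form FXXYYY into (F, X, Y)."""
    f = descriptor_id // 100000      # leading digit
    x = descriptor_id // 1000 % 100  # middle two digits
    y = descriptor_id % 1000         # last three digits
    return f, x, y


print(split_fxy(309052))  # (3, 9, 52)
print(split_fxy(1002))    # (0, 1, 2), i.e. descriptor 001002
```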

class pybufrkit.descriptors.ElementDescriptor(id_, name, unit, scale, refval, nbits, crex_unit, crex_scale, crex_nchars)

Element Descriptor 0XXYYY

Parameters:
  • id (int) – The descriptor ID
  • name (str) – Name of the descriptor
  • unit (str) – Units of the descriptor
  • scale (int) – Scale factor of the descriptor value
  • refval (int) – Reference value of the descriptor value
  • nbits (int) – The number of bits used by the descriptor
  • crex_unit (str) – Units of the descriptor for CREX spec
  • crex_scale (int) – Scale factor of the descriptor value for CREX Spec
  • crex_nchars (int) – Number of characters used by the descriptor for CREX Spec
class pybufrkit.descriptors.FixedReplicationDescriptor(id_, members=None)

Fixed replication Descriptor 1XXYYY

n_repeats

Number of times to perform the replication. This value is decoded directly from the descriptor ID.
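For a replication descriptor 1XXYYY, the BUFR convention is that XX gives the number of following descriptors to replicate and YYY gives the number of repetitions, with YYY equal to 000 marking delayed replication. The sketch below only demonstrates that arithmetic; replication_counts is a hypothetical helper, not the library's code.

```python
def replication_counts(descriptor_id):
    """Return (n_items, n_repeats) read from a 1XXYYY replication descriptor ID."""
    if descriptor_id // 100000 != 1:
        raise ValueError('not a replication descriptor: %06d' % descriptor_id)
    n_items = descriptor_id // 1000 % 100  # XX: descriptors to replicate
    n_repeats = descriptor_id % 1000       # YYY: repetitions, 0 means delayed
    return n_items, n_repeats


print(replication_counts(103004))  # (3, 4)
print(replication_counts(101000))  # (1, 0)
```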

class pybufrkit.descriptors.MarkerDescriptor(id_, name, unit, scale, refval, nbits, crex_unit, crex_scale, crex_nchars)

A marker descriptor is useful when marker operator descriptors, for example 224255 and 225255, are used to signify a statistical value of an element descriptor.

static from_element_descriptor(ed, marker_id, scale=None, refval=None, nbits=None)

Create from a given element descriptor with the option to override its scale, refval and nbits.

Parameters:
  • ed (ElementDescriptor) – The element descriptor
  • marker_id (int) – The marker operator ID
  • scale (int) – Overridden value for scale.
  • refval (int) – Overridden value for reference.
  • nbits (int) – Overridden value for number of bits.
Return type:

MarkerDescriptor

class pybufrkit.descriptors.OperatorDescriptor(id_)

Operator Descriptor 2XXYYY

class pybufrkit.descriptors.ReplicationDescriptor(id_, members=None)

The replication factor member stores only the replication factor descriptor, NOT the actual value. This makes it safe to reuse for the same sequence descriptor. That is, when a Sequence Descriptor, e.g. 309052, is reused, the Replication Descriptor inside it always has the same replication factor descriptor. Although this replication factor descriptor can take different values in different uses of 309052, that does not matter because the descriptor does not store the actual values.

When the replication descriptor is used as a naked descriptor, i.e. not as part of a Sequence Descriptor but directly under a Template, the associated replication factor descriptor could be different. However, the replication descriptor is NOT cached when used naked: a new Replication Descriptor is spawned every time, so there is no risk of the associated replication factor descriptors getting mixed up.

Parameters:members ([Descriptor]) – The group of descriptors to be replicated
n_items

Number of descriptors to be replicated. This value is decoded from the ID of the descriptor.

n_members

Due to the hierarchical structure of the BUFR Template, the number of members is not always equal to the number of items. For example, the delayed replication factor counts towards the number of items to be repeated for its outer replication (if nested). However, it is never counted towards the number of members. Another potential difference comes from virtual descriptors, where virtual sequences and fixed replications are inserted or removed without fixing the enclosing replication descriptors. In summary, the number of members is a more accurate count of the members to be replicated by the replication descriptor.

class pybufrkit.descriptors.SequenceDescriptor(id_, name, members=None)

Sequence Descriptor 3XXYYY

class pybufrkit.descriptors.SkippedLocalDescriptor(id_, nbits)

The skipped local descriptor is a placeholder for any descriptors followed by operator descriptor 206YYY.

class pybufrkit.descriptors.UndefinedDescriptor(id_)

Undefined Descriptors are only useful when loading BUFR tables that are NOT completely defined. For example, an element descriptor may be used by one of the sequence descriptors while the element descriptor itself is not defined in Table B. In this case, an Undefined descriptor is created in place of the actual element descriptor so that the tables can be loaded normally. As long as the Undefined descriptor is not used in actual decoding (the Template of a BUFR message may not contain the descriptor at all), it is harmless for it to stay in the loaded Table Group.

Ideally this would not be necessary, because all tables would be well defined. In practice, however, it is needed so that some not-well-defined local tables can be used.

class pybufrkit.descriptors.UndefinedElementDescriptor(id_)
class pybufrkit.descriptors.UndefinedSequenceDescriptor(id_)
pybufrkit.descriptors.flat_member_ids(descriptor)

Return a flat list of expanded numeric IDs for the given descriptor. The list is generated by recursively flattening all its child members.

Parameters:descriptor (Descriptor) – A BUFR descriptor
Returns:[int]
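The recursive flattening can be pictured with the standalone sketch below. The Node class and the flat_ids function are hypothetical, and the choice to emit only leaf IDs is an assumption of this sketch; the library function may treat composite descriptors differently.

```python
class Node:
    """Hypothetical descriptor stand-in: an ID plus optional child members."""

    def __init__(self, id_, members=None):
        self.id = id_
        self.members = members or []


def flat_ids(node):
    """Recursively expand a node into a flat list of leaf IDs."""
    if not node.members:
        return [node.id]
    ids = []
    for member in node.members:
        ids.extend(flat_ids(member))
    return ids


# Illustrative structure: a sequence of three elements followed by one element.
sequence = Node(301011, [Node(4001), Node(4002), Node(4003)])
template = Node(999999, [sequence, Node(12001)])
print(flat_ids(template))  # [4001, 4002, 4003, 12001]
```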

pybufrkit.tables

The Table Cache makes sure tables of the same version only get loaded from disk once. When the get_table_group method is called, it returns a group of tables either from the cache or, if they are not yet in the cache, loaded from disk (and saved to the cache for future use).

  • A Table Group contains a set of tables, e.g. A, B, C, D, that belong to the same table group key.
  • A Table instance, e.g. B, D, maintains a cache of its descriptors so that only a single instance is created for a unique descriptor.
  • The Pseudo Replication Descriptor table is created to make the API for all tables look alike.
  • TableCache –creates–> TableGroup –lookup–> Descriptors/Template

Templates are then processed by a Coder in conjunction with a bit operator to create a BufrMessage object.

class pybufrkit.tables.TableGroupKey(tables_root_dir, wmo_tables_sn, local_tables_sn)
local_tables_sn

Alias for field number 2

tables_root_dir

Alias for field number 0

wmo_tables_sn

Alias for field number 1
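The load-once behaviour described for this module, keyed by a tuple shaped like TableGroupKey, can be sketched as follows. Key, GroupCache, fake_loader and the key values are all hypothetical names and data chosen for illustration; this is not the library's cache implementation.

```python
from collections import namedtuple

# Same field order as the documented TableGroupKey, rebuilt here for the sketch.
Key = namedtuple('Key', ['tables_root_dir', 'wmo_tables_sn', 'local_tables_sn'])


class GroupCache:
    """Hypothetical cache that loads each table group at most once."""

    def __init__(self, loader):
        self._loader = loader
        self._groups = {}

    def get(self, key):
        # Named tuples are hashable, so they can be used as dictionary keys.
        if key not in self._groups:
            self._groups[key] = self._loader(key)
        return self._groups[key]


loads = []


def fake_loader(key):
    # Stands in for reading table files from disk.
    loads.append(key)
    return {'key': key}


cache = GroupCache(fake_loader)
key = Key('/tables', 'wmo-example', 'local-example')
first = cache.get(key)
second = cache.get(key)
print(first is second, len(loads))  # True 1
```

Because the key is a named tuple, key[0] and key.tables_root_dir refer to the same field, which is what the "Alias for field number" entries above express.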

pybufrkit.coder

class pybufrkit.coder.AuditedList

This class provides wrappers for some list methods, e.g. append, so that additional code can be executed when the method is invoked. It is used mainly for debugging purposes.

append(p_object)

L.append(object) – append object to end
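The general technique of wrapping a list method so that extra code runs on each call can be sketched as below. LoggingList and its on_append hook are hypothetical; they illustrate the pattern and are not the AuditedList implementation.

```python
class LoggingList(list):
    """Hypothetical list subclass that runs a hook on every append."""

    def __init__(self, *args, on_append=None):
        super().__init__(*args)
        self._on_append = on_append

    def append(self, item):
        # Run the extra code first, then fall through to the normal append.
        if self._on_append is not None:
            self._on_append(item)
        super().append(item)


seen = []
values = LoggingList(on_append=seen.append)
values.append(42)
print(values, seen)  # [42] [42]
```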

class pybufrkit.coder.BSRModifier(nbits_increment, scale_increment, refval_factor)
nbits_increment

Alias for field number 0

refval_factor

Alias for field number 2

scale_increment

Alias for field number 1

class pybufrkit.coder.Coder(definitions_dir=None, tables_root_dir=None)

This class is an abstract superclass for Decoder and Encoder. By itself it cannot do anything. But it provides common operations for subclasses.

Parameters:
  • definitions_dir – Where to find the BPCL definition files.
  • tables_root_dir – Where to find the BUFR table files.
define_bitmap(state, reuse)

Define a bit map.

Parameters:
  • state
  • reuse – Is this bitmap for reuse?
get_value_for_delayed_replication_factor(state)

Get value of the latest delayed replication factor. This is called when processing through the Template. But the actual implementation will be provided by sub-classes.

Parameters:state
Returns:The value for the latest processed delayed replication factor
process(*args, **kwargs)

Entry point of the class

process_associated_field(state, bit_operator, descriptor)
Parameters:
  • bit_operator
  • descriptor
process_bitmap_definition(state, bit_operator, descriptor)

Process bitmap definition. This is basically done as a state machine for processing all the bits associated to the bitmap.

Parameters:
  • bit_operator
  • descriptor
process_bitmapped_descriptor(state, bit_operator, descriptor)

A generic method for processing bitmapped descriptors. It is wrapped by providing different funcs to handle encoding and decoding for uncompressed and compressed data.

process_codeflag(state, bit_operator, descriptor, nbits)

Process a descriptor that has a code/flag value. A code/flag value does not need scale and refval to be applied.

Parameters:
  • descriptor – The BUFR descriptor
  • nbits – Number of bits to process for the descriptor.
process_constant(state, bit_operator, descriptor, value)

Process a constant, with no bit operations, for the given descriptor.

Parameters:
  • descriptor – The BUFR descriptor.
  • value – The constant value.
process_define_new_refval(state, bit_operator, descriptor)

Process defining a new reference value for the given descriptor.

Parameters:
  • state
  • bit_operator
  • descriptor
process_delayed_replication_descriptor(state, bit_operator, descriptor)

Process the delayed replication factor descriptor.

Parameters:
  • state
  • bit_operator
process_element_descriptor(state, bit_operator, descriptor)

Process an ElementDescriptor.

process_fixed_replication_descriptor(state, bit_operator, descriptor)

Process a fixed replication descriptor, including all members that belong to this replication structure.

Parameters:
  • state
  • bit_operator
process_marker_operator_descriptor(state, bit_operator, descriptor)
Parameters:
  • bit_operator
  • descriptor
process_members(state, bit_operator, members)

Process a list of descriptors that are members of a composite descriptor.

Parameters:
  • state – The state of the processing.
  • bit_operator – The bit operator for read/write bits.
  • members – A list of descriptors.
process_new_refval(state, bit_operator, descriptor, nbits)

Process the new reference value for the given descriptor.

Parameters:
  • descriptor – The BUFR descriptor.
  • nbits – Number of bits to process.
process_numeric(state, bit_operator, descriptor, nbits, scale_powered, refval)

Process a descriptor that has numeric value.

Parameters:
  • descriptor – A BUFR descriptor that has numeric value
  • nbits – Number of bits to process for the descriptor.
  • scale_powered – 10 to the scale factor power, i.e. 10 ** scale
  • refval – The reference value
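The roles of nbits, scale_powered and refval follow the standard BUFR relation between a raw unsigned integer and its physical value: value = (raw + refval) / 10 ** scale. The sketch below shows only that arithmetic; decode_numeric and encode_numeric are hypothetical helpers, and the sample numbers are made up for illustration.

```python
def decode_numeric(raw, scale, refval):
    """Convert a raw unsigned integer from the bit stream to a physical value."""
    scale_powered = 10 ** scale
    return (raw + refval) / scale_powered


def encode_numeric(value, scale, refval):
    """Inverse of decode_numeric: physical value back to the raw integer."""
    return int(round(value * 10 ** scale)) - refval


# Illustrative numbers: scale 2 and reference value -9000.
print(decode_numeric(12345, 2, -9000))  # 33.45
print(encode_numeric(33.45, 2, -9000))  # 12345
```

Passing scale_powered rather than scale, as the signature above does, presumably avoids recomputing the power of ten for every value.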
process_numeric_of_new_refval(state, bit_operator, descriptor, nbits, scale_powered, refval_factor)

Process a descriptor that has numeric value with new reference value.

Parameters:
  • descriptor – The BUFR descriptor.
  • nbits – Number of bits to process for the descriptor.
  • scale_powered – 10 to the scale factor power, i.e. 10 ** scale
  • refval_factor – The factor to be applied to the new refval.
process_operator_descriptor(state, bit_operator, descriptor)

Process Operator Descriptor.

Parameters:
  • state
  • bit_operator
process_section(bufr_message, bit_operator, section)

Process the given section of a BUFR message

Parameters:
  • bufr_message – The BufrMessage object to process
  • bit_operator – The bit operator (reader or writer)
  • section
process_skipped_local_descriptor(state, bit_operator, descriptor)

Skip number of bits defined for the local descriptor.

Parameters:
  • state
  • bit_operator
  • descriptor
process_string(state, bit_operator, descriptor, nbytes)

Process a descriptor that has string value

Parameters:
  • descriptor – The BUFR descriptor
  • nbytes – Number of BYTES to process for the descriptor.
process_template(state, bit_operator, template)

Process the top level BUFR Template

Parameters:
  • state – The state of the processing.
  • bit_operator – The bit operator for read/write bits.
  • template – The BUFR Template of the message.
class pybufrkit.coder.CoderState(is_compressed, n_subsets, decoded_values_all_subsets=None)

The state of Coder for keeping track of variables when a Coder is working. The use of a new state for each run makes it possible to use a single Coder to run multiple decoding/encoding tasks.

Parameters:decoded_values_all_subsets – This is only for Encoder use.

Must be called before the descriptor is processed

build_bitmapped_descriptors(bitmap)

Build the bitmapped descriptors based on the given bitmap. Also build the back referenced descriptors if they are not already defined.
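In BUFR, a bitmap bit of 0 means data is present for the corresponding back referenced descriptor, and 1 means it is not. Under that convention, selecting the bitmapped descriptors can be sketched as below; select_bitmapped and the sample IDs are hypothetical and are not taken from the library.

```python
def select_bitmapped(back_referenced, bitmap):
    """Keep the descriptors whose bitmap bit is 0, i.e. data present."""
    if len(back_referenced) != len(bitmap):
        raise ValueError('bitmap length must match the back referenced descriptors')
    return [d for d, bit in zip(back_referenced, bitmap) if bit == 0]


back_referenced = [12001, 10004, 11001]  # illustrative descriptor IDs
bitmap = [0, 1, 0]
print(select_bitmapped(back_referenced, bitmap))  # [12001, 11001]
```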

static minmax(values)

Given a list of values, find the minimum and maximum, ignoring any Nones.
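A minimal standalone sketch of such a helper follows. The name minmax_ignoring_none is hypothetical, and returning (None, None) when every value is None is an assumption of this sketch; the documentation above does not say what the library does in that case.

```python
def minmax_ignoring_none(values):
    """Return (minimum, maximum) of values, skipping None entries."""
    present = [v for v in values if v is not None]
    if not present:
        # Assumption of this sketch: nothing to compare, so report no bounds.
        return None, None
    return min(present), max(present)


print(minmax_ignoring_none([3, None, 1, 7]))  # (1, 7)
```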

switch_subset_context(idx_subset)

This function is only useful for uncompressed data.

pybufrkit.decoder

class pybufrkit.decoder.Decoder(definitions_dir=None, tables_root_dir=None, compiled_template_cache_max=None)

The decoder takes a bytes string and decodes it to a BUFR Message object.

define_bitmap(state, reuse)

For compressed data, the bitmap and back referenced descriptors must be identical. Otherwise there is no sense in compressing different bitmapped descriptors into one slot.

Parameters:
  • state
  • reuse – Is this bitmap for reuse?
Returns:

The bitmap as a list of 0 and 1.

get_value_for_delayed_replication_factor(state)

Get value of the latest delayed replication factor. This is called when processing through the Template. But the actual implementation will be provided by sub-classes.

Parameters:state
Returns:The value for the latest processed delayed replication factor
process(s, file_path='<string>', start_signature='BUFR', info_only=False, ignore_value_expectation=False, wire_template_data=True)

Decode the given message string.

Parameters:
  • s – Message string that contains the BUFR Message
  • file_path – The file where this string is read from.
  • start_signature – Locate the starting position of the message string with the given signature.
  • info_only – Only show information up to template data (exclusive)
  • ignore_value_expectation – Do not validate the expected value
  • wire_template_data – Whether to wire the template data to construct a fully hierarchical structure from the flat lists. Only takes effect when it is NOT info_only.
Returns:

A BufrMessage object that contains the decoded information.

process_codeflag(state, bit_reader, descriptor, nbits)

Process a descriptor that has a code/flag value. A code/flag value does not need scale and refval to be applied.

Parameters:
  • descriptor – The BUFR descriptor
  • nbits – Number of bits to process for the descriptor.
process_constant(state, bit_reader, descriptor, value)

Process a constant, with no bit operations, for the given descriptor.

Parameters:
  • descriptor – The BUFR descriptor.
  • value – The constant value.
process_new_refval(state, bit_reader, descriptor, nbits)

Process the new reference value for the given descriptor.

Parameters:
  • descriptor – The BUFR descriptor.
  • nbits – Number of bits to process.
process_numeric(state, bit_reader, descriptor, nbits, scale_powered, refval)

Process a descriptor that has numeric value.

Parameters:
  • descriptor – A BUFR descriptor that has numeric value
  • nbits – Number of bits to process for the descriptor.
  • scale_powered – 10 to the scale factor power, i.e. 10 ** scale
  • refval – The reference value
process_numeric_of_new_refval(state, bit_reader, descriptor, nbits, scale_powered, refval_factor)

Process a descriptor that has numeric value with new reference value.

Parameters:
  • descriptor – The BUFR descriptor.
  • nbits – Number of bits to process for the descriptor.
  • scale_powered – 10 to the scale factor power, i.e. 10 ** scale
  • refval_factor – The factor to be applied to the new refval.
process_section(bufr_message, bit_reader, section)

Decode the given configured Section.

Parameters:
  • bufr_message – The BUFR message object.
  • section – The BUFR section object.
  • bit_reader
Returns:

Number of bits decoded for this section.

process_string(state, bit_reader, descriptor, nbytes)

Process a descriptor that has string value

Parameters:
  • descriptor – The BUFR descriptor
  • nbytes – Number of BYTES to process for the descriptor.
process_template_data(bufr_message, bit_reader)

Decode data described by the template.

Parameters:
  • bufr_message – The BUFR message object.
  • bit_reader
Returns:

TemplateData decoded from the bit stream.

process_unexpanded_descriptors(bit_reader, section)

Decode for the list of unexpanded descriptors.

Parameters:
  • section – The BUFR section object.
  • bit_reader
Returns:

The unexpanded descriptors as a list.

pybufrkit.decoder.generate_bufr_message(decoder, s, info_only=False, continue_on_error=False, filter_expr=None, *args, **kwargs)

This is a generator function that processes the given string for one or more BufrMessage till it is exhausted.

Parameters:
  • decoder (Decoder) – Decoder to use
  • s (bytes) – String to decode for messages
Returns:

BufrMessage object
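To show what it means to walk a byte string that holds several messages, here is a standalone sketch that yields raw byte slices rather than BufrMessage objects. It assumes BUFR edition 2 or later, where octets 5 to 7 of section 0 hold the total message length, and it is not how generate_bufr_message is implemented. The function name iter_raw_messages and the sample bytes are hypothetical.

```python
def iter_raw_messages(data, start_signature=b'BUFR'):
    """Yield one byte slice per BUFR message found in data."""
    pos = 0
    while True:
        start = data.find(start_signature, pos)
        if start == -1 or start + 8 > len(data):
            return
        # Octets 5-7 of section 0 hold the total message length (edition 2+).
        length = int.from_bytes(data[start + 4:start + 7], 'big')
        end = start + length
        # A message needs at least section 0 (8 octets) plus the end mark '7777'.
        if length < 12 or end > len(data) or data[end - 4:end] != b'7777':
            pos = start + 1  # not a valid message start, keep scanning
            continue
        yield data[start:end]
        pos = end


# A fake 16-octet message: signature, length, edition, 4 filler octets, end mark.
fake = b'BUFR' + (16).to_bytes(3, 'big') + b'\x04' + b'\x00' * 4 + b'7777'
stream = b'junk' + fake + b'\r\n' + fake
print(len(list(iter_raw_messages(stream))))  # 2
```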

pybufrkit.encoder

class pybufrkit.encoder.Encoder(definitions_dir=None, tables_root_dir=None, ignore_declared_length=True, compiled_template_cache_max=None, master_table_number=None, master_table_version=None)

The encoder takes a JSON object or string and encodes it to a BUFR message.

Parameters:ignore_declared_length – If set, ignore the section_length declared in the input JSON message and always calculate it.
define_bitmap(state, reuse)

For compressed data, the bitmap and back referenced descriptors must be identical. Otherwise there is no sense in compressing different bitmapped descriptors into one slot.

Parameters:reuse – Is this bitmap for reuse?
get_value_for_delayed_replication_factor(state)

Get value of the latest delayed replication factor. This is called when processing through the Template. But the actual implementation will be provided by sub-classes.

Parameters:state
Returns:The value for the latest processed delayed replication factor
process(s, file_path='<string>', wire_template_data=True)

Entry point for the encoding process. The process encodes a JSON format message to BUFR message.

Parameters:
  • s – A JSON or its string serialized form
  • file_path – The file path to the JSON file.
  • wire_template_data – Whether to wire the template data to construct a fully hierarchical structure from the flat lists.
Returns:

A bitstring object of the encoded message.

process_codeflag(state, bit_writer, descriptor, nbits)

Process a descriptor that has a code/flag value. A code/flag value does not need scale and refval to be applied.

Parameters:
  • descriptor – The BUFR descriptor
  • nbits – Number of bits to process for the descriptor.
process_codeflag_uncompressed(state, bit_writer, descriptor, nbits)

Decode a descriptor of code or flag value. A code or flag value does not need scale and refval to be applied.

process_constant(state, bit_writer, descriptor, value)

Process a constant, with no bit operations, for the given descriptor.

Parameters:
  • descriptor – The BUFR descriptor.
  • value – The constant value.
process_constant_compressed(state, bit_writer, descriptor, value)

This method is used to pop out the value 0 for 222000, etc.

process_constant_uncompressed(state, bit_writer, descriptor, value)

This in fact skips the value when encoding. Useful for operator descriptor 222000, etc.

process_new_refval(state, bit_writer, descriptor, nbits)

Process the new reference value for the given descriptor.

Parameters:
  • descriptor – The BUFR descriptor.
  • nbits – Number of bits to process.
process_numeric(state, bit_writer, descriptor, nbits, scale_powered, refval)

Process a descriptor that has numeric value.

Parameters:
  • descriptor – A BUFR descriptor that has numeric value
  • nbits – Number of bits to process for the descriptor.
  • scale_powered – 10 to the scale factor power, i.e. 10 ** scale
  • refval – The reference value
process_numeric_of_new_refval(state, bit_writer, descriptor, nbits, scale_powered, refval_factor)

Encode a descriptor of numeric value that has a new reference value set by 203 YYY. This new reference value must be retrieved at runtime as it is defined in the data section.

Parameters:refval_factor (int) – The refval factor set as part of 207 YYY
Returns:
process_section(bufr_message, bit_writer, section)

Process the given section of a BUFR message

Parameters:
  • bufr_message – The BufrMessage object to process
  • bit_operator – The bit operator (reader or writer)
  • section
process_string(state, bit_writer, descriptor, nbytes)

Process a descriptor that has string value

Parameters:
  • descriptor – The BUFR descriptor
  • nbytes – Number of BYTES to process for the descriptor.
process_string_uncompressed(state, bit_writer, descriptor, nbytes)

Decode a string value of the given number of bytes

Parameters:
  • descriptor
  • nbytes – Number of bytes to read for the string.
process_template_data(bufr_message, bit_writer, section_parameter)
Parameters:bit_writer
Returns:
process_unexpanded_descriptors(bit_writer, section_parameter)

Encode the list of unexpanded descriptors.

Parameters:
  • bit_writer (bitops.BitWriter) –
  • section_parameter (bufr.SectionParameter) –

pybufrkit.templatecompiler

pybufrkit.templatecompiler.loads_compiled_template(s)

Load a compiled template object from its JSON string representation.

Parameters:s – A JSON string that represents the compiled template.
Returns:The compiled template
class pybufrkit.templatecompiler.TemplateCompiler

The compiler for the BUFR Template. This class does its job by recording calls from the generic Coder.

define_bitmap(state, reuse)

Define a bit map.

Parameters:
  • state
  • reuse – Is this bitmap for reuse?
get_value_for_delayed_replication_factor(state)

Get value of the latest delayed replication factor. This is called when processing through the Template. But the actual implementation will be provided by sub-classes.

Parameters:state
Returns:The value for the latest processed delayed replication factor
process(template, table_group)

Entry point of the Compiler.

Parameters:
  • template (descriptors.BufrTemplate) – The BUFR template to compile
  • table_group (tables.TableGroup) – The Table Group used to instantiate the Template.
Returns:

CompiledTemplate

process_bitmap_definition(state, bit_operator, descriptor)

Process bitmap definition. This is basically done as a state machine for processing all the bits associated to the bitmap.

Parameters:
  • bit_operator
  • descriptor
process_bitmapped_descriptor(state, bit_operator, descriptor)

A generic method for processing bitmapped descriptors. It is wrapped by providing different funcs to handle encoding and decoding for uncompressed and compressed data.

process_codeflag(state, bit_operator, descriptor, nbits)

Process a descriptor that has a code/flag value. A code/flag value does not need scale and refval to be applied.

Parameters:
  • descriptor – The BUFR descriptor
  • nbits – Number of bits to process for the descriptor.
process_constant(state, bit_operator, descriptor, value)

Process a constant, with no bit operations, for the given descriptor.

Parameters:
  • descriptor – The BUFR descriptor.
  • value – The constant value.
process_delayed_replication_descriptor(state, bit_operator, descriptor)

Process the delayed replication factor descriptor.

Parameters:
  • state
  • bit_operator
process_fixed_replication_descriptor(state, bit_operator, descriptor)

Process a fixed replication descriptor, including all members that belong to this replication structure.

Parameters:
  • state
  • bit_operator
process_new_refval(state, bit_operator, descriptor, nbits)

Process the new reference value for the given descriptor.

Parameters:
  • descriptor – The BUFR descriptor.
  • nbits – Number of bits to process.
process_numeric(state, bit_operator, descriptor, nbits, scale_powered, refval)

Process a descriptor that has numeric value.

Parameters:
  • descriptor – A BUFR descriptor that has numeric value
  • nbits – Number of bits to process for the descriptor.
  • scale_powered – 10 to the scale factor power, i.e. 10 ** scale
  • refval – The reference value
process_numeric_of_new_refval(state, bit_operator, descriptor, nbits, scale_powered, refval_factor)

Process a descriptor that has numeric value with new reference value.

Parameters:
  • descriptor – The BUFR descriptor.
  • nbits – Number of bits to process for the descriptor.
  • scale_powered – 10 to the scale factor power, i.e. 10 ** scale
  • refval_factor – The factor to be applied to the new refval.
process_section(bufr_message, bit_operator, section)

Process the given section of a BUFR message

Parameters:
  • bufr_message – The BufrMessage object to process
  • bit_operator – The bit operator (reader or writer)
  • section
process_string(state, bit_operator, descriptor, nbytes)

Process a descriptor that has string value

Parameters:
  • descriptor – The BUFR descriptor
  • nbytes – Number of BYTES to process for the descriptor.
class pybufrkit.templatecompiler.CompiledTemplateManager(cache_max)

A management class for compiled templates that handles caching and lookup.

Parameters:cache_max – The maximum number of compiled templates to cache.
get_or_compile(template, table_group)
Parameters:
  • template (descriptors.BufrTemplate) – The BUFR template to compile
  • table_group (tables.TableGroup) – The Table Group used to instantiate the Template.
Returns:

pybufrkit.templatecompiler.process_compiled_template(coder, state, bit_operator, compiled_template)

This function runs the compiled code from the TemplateCompiler

Parameters:
  • coder (Coder) –
  • state (VmState) –
  • bit_operator
  • compiled_template (Block) –

pybufrkit.templatedata

The TemplateData object is dedicated to the data decoded for the template of a BUFR message, while the Bufr object is for the entire BUFR message. The TemplateData object provides a fully hierarchical view of the data with attributes properly allocated to their corresponding values.

class pybufrkit.templatedata.AssociatedFieldNode(descriptor, index)
class pybufrkit.templatedata.DataNode(descriptor)

A node is composed of a descriptor and its value (if exists) and any possible child or attribute nodes.

class pybufrkit.templatedata.DelayedReplicationNode(descriptor)
class pybufrkit.templatedata.DifferenceStatsNode(descriptor, index)
class pybufrkit.templatedata.FirstOrderStatsNode(descriptor, index)
class pybufrkit.templatedata.FixedReplicationNode(descriptor)
class pybufrkit.templatedata.NoValueDataNode(descriptor)

A no value node is for any descriptors that cannot have a value, e.g. replication descriptors, sequence descriptors and some operator descriptors, e.g. 201YYY.

class pybufrkit.templatedata.QualityInfoNode(descriptor, index)
class pybufrkit.templatedata.ReplacementNode(descriptor, index)
class pybufrkit.templatedata.SequenceNode(descriptor)
class pybufrkit.templatedata.SubstitutionNode(descriptor, index)
class pybufrkit.templatedata.TemplateData(template, is_compressed, decoded_descriptors_all_subsets, decoded_values_all_subsets, bitmap_links_all_subsets)

This class is dedicated to the data section of a BUFR message and produces a fully hierarchical structure for the otherwise flat list of decoded descriptors and values. Attributes like associated fields and statistical values are properly allocated to their corresponding referred elements.

wire()

From the flat list of descriptors and values, construct a fully hierarchical structure of data including sequence descriptors and correctly set bitmapped values to their corresponding node as attributes.

wire_delayed_replication_descriptor(descriptor)
Parameters:descriptor (DelayedReplicationDescriptor) –
wire_fixed_replication_descriptor(descriptor)
Parameters:descriptor (FixedReplicationDescriptor) –
wire_operator_descriptor(descriptor)
Parameters:descriptor (OperatorDescriptor) –
Returns:
class pybufrkit.templatedata.ValueDataNode(descriptor, index)

A value node is for any descriptor that can have a value attached to it. This includes all Element descriptors, Associated descriptors, Skipped local descriptors, and some operator descriptors, e.g. 205YYY, 223255, etc.

Parameters:index (int) – The index to the descriptors and values array for getting the descriptor and its associated value.
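
As a rough illustration of the index parameter, the sketch below shows a node that stores only a position in the flat descriptor and value arrays and resolves both on demand. The class SketchValueNode, its resolve method, and the sample data are hypothetical and do not reproduce the library's classes.

```python
class SketchValueNode:
    """Hypothetical node that stores only an index into the flat arrays."""

    def __init__(self, index):
        self.index = index
        self.attributes = []   # nodes attached to this value as attributes

    def resolve(self, descriptors, values):
        """Look up the descriptor and value this node points to."""
        return descriptors[self.index], values[self.index]


# Flat, parallel lists standing in for decoder output (made-up sample data).
descriptors = ["001001", "001002", "033007"]
values = [94, 461, 70]

station = SketchValueNode(1)
station.attributes.append(SketchValueNode(2))  # attach another value as an attribute

descriptor, value = station.resolve(descriptors, values)
attr_descriptor, attr_value = station.attributes[0].resolve(descriptors, values)
```

Keeping an index rather than a copy of the value means the hierarchical view and the flat lists never disagree, which is one plausible reason for the design.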

pybufrkit.dataquery

class pybufrkit.dataquery.NodePathParser(bare_id_matches_all=True)

This class provides a parser for parsing path query string.

Parameters:bare_id_matches_all (bool) – By default, a path component with a bare ID, i.e. with no slicing part, matches all occurrences, i.e. [::]. If set to False, it matches only the first occurrence.
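
The effect of bare_id_matches_all can be mimicked with ordinary Python slices. The helper select_occurrences below is hypothetical and only models the rule stated above; it does not parse path strings.

```python
def select_occurrences(matches, slicing=None, bare_id_matches_all=True):
    """Hypothetical helper: pick the occurrences matched by one path component."""
    if slicing is None:
        # A bare ID: either every occurrence ([::]) or only the first one.
        slicing = slice(None, None, None) if bare_id_matches_all else slice(0, 1)
    return matches[slicing]


occurrences = ["first", "second", "third"]
all_matches = select_occurrences(occurrences)
first_only = select_occurrences(occurrences, bare_id_matches_all=False)
explicit = select_occurrences(occurrences, slicing=slice(1, None))
```
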
class pybufrkit.dataquery.DataQuerent(path_parser)

This class provides an interface to query the BUFR Data section.

create_values_from_nodes(nodes, decoded_values)

Process the nested list of matching nodes and create a values list of identical structure. This method is recursive.

Parameters:
  • nodes – A nested list of matching nodes.
  • decoded_values
Returns:

A nested values list corresponding to the given nodes.
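
The recursion described for create_values_from_nodes can be sketched as follows. The names SketchNode and values_from_nodes are hypothetical, and the sketch assumes each node carries an index into decoded_values.

```python
class SketchNode:
    """Hypothetical node that only knows its index into the decoded values."""

    def __init__(self, index):
        self.index = index


def values_from_nodes(nodes, decoded_values):
    """Build a values list with the same nesting as the given node list."""
    result = []
    for item in nodes:
        if isinstance(item, list):
            # A nested list of nodes: recurse to mirror the structure.
            result.append(values_from_nodes(item, decoded_values))
        else:
            result.append(decoded_values[item.index])
    return result


decoded_values = [10, 20, 30, 40]
nodes = [SketchNode(0), [SketchNode(2), [SketchNode(3)]]]
nested_values = values_from_nodes(nodes, decoded_values)
```
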

descend_and_proceed(nodes, path_components)

Process the given list of nodes. For any node that is not a direct match of the path component, descend to its sub-nodes for further matching. Once a match is found, proceed through the path components until all components are matched or no match is found.

Parameters:
  • nodes – A list of nodes to descend into its sub-nodes
  • path_components – The path components used for matching.
filter_for_attribute_sub_nodes(node, path_components)

This method is a specific version of the filter_for_sub_nodes method. It first filters through the attribute nodes of the given node and then goes depth first until all the path components are matched or no match is found.

filter_for_child_sub_nodes(node, path_components)

This method is a specific version of the filter_for_sub_nodes method. It first filters through the child nodes of the given node and then goes depth first until all the path components are matched or no match is found.

filter_for_descendant_sub_nodes(node, path_components)

This method is a specific version of the filter_for_sub_nodes method. It filters through all descendant nodes of the given node in a depth first fashion. A descendant node can be a child, attribute or factor node, all the way down to the leaf nodes. It then processes the path components until every component is matched or no match is found.

filter_for_nodes(nodes, path_component)

Filter the given list of nodes using the path component. Note this method is different from the filter_for_sub_nodes method in that it filters the given nodes themselves, NOT their sub-nodes. The return value will be a selection of the given nodes.

Parameters:
  • nodes – A list of nodes to be filtered
  • path_component – The path component used for the filtering.
Returns:

A list of nodes that qualify for the path component.

filter_for_sub_nodes(node, path_components)

For the given node, filter through its sub-nodes, which can be child, attribute, factor or descendant nodes depending on the separator value of the first member of the path components. Note that the filtering is performed in a depth first fashion, i.e. the filtering continues from the direct sub-nodes down to the leaves of the node tree or the end of the path components, whichever is encountered first.

Parameters:
  • node – The node for which the sub-nodes will be filtered
  • path_components – A list of path components used to filter the nodes.
Returns:

A list of qualified nodes matching through the entire path components.
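
A simplified picture of the depth-first filtering is sketched below. It handles child nodes only and matches on a plain ID; the separator handling, slicing, and attribute or factor nodes of the real method are left out. SketchNode and filter_children are hypothetical names, and the IDs are sample data.

```python
class SketchNode:
    """Hypothetical tree node with an ID and child nodes."""

    def __init__(self, node_id, children=()):
        self.node_id = node_id
        self.children = list(children)


def filter_children(node, path_components):
    """Depth-first match of path_components against the children of node."""
    if not path_components:
        return [node]   # every component has been matched
    first, rest = path_components[0], path_components[1:]
    matched = []
    for child in node.children:
        if child.node_id == first:
            matched.extend(filter_children(child, rest))
    return matched


root = SketchNode("root", [
    SketchNode("301001", [SketchNode("001001"), SketchNode("001002")]),
    SketchNode("301011", [SketchNode("004001")]),
])

found = filter_children(root, ["301001", "001002"])
missing = filter_children(root, ["301001", "004001"])
```

A node is returned only when the whole list of path components has been consumed, so a partial match along the way yields nothing.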

node_matches(node, path_component)

Check whether the given node qualifies for the path component. If the path component’s separator is descendant, any node that contains sub-nodes also qualifies.

Parameters:
  • node
  • path_component
Returns:

True or False

proceed_next_path_component(nodes, path_components)

Proceed further down the path components.

Parameters:
  • nodes
  • path_components
Returns:

query(bufr_message, path_expr)

Entry method of the class. Query the data section of the given BUFR message with the query string.

Parameters:
  • bufr_message – A BufrMessage object with wired nodes
  • path_expr – A query string for data.
Returns:

A QueryResult object

class pybufrkit.dataquery.QueryResult(path_expr='')

This class represents the query result.

pybufrkit.mdquery

class pybufrkit.mdquery.MetadataQuerent(metadata_expr_parser)
Parameters:metadata_expr_parser (MetadataExprParser) – Parser for metadata expression

pybufrkit.query

class pybufrkit.query.BufrMessageQuerent

This is a convenience class that wraps the querents of the metadata and data sections. It provides a uniform interface for querying the BufrMessage object.

pybufrkit.script

pybufrkit.script.process_embedded_query_expr(input_string)

This function scans through the given script and identifies any path/metadata expressions. For each expression found, a unique Python variable name is generated. The expression is then substituted by the variable name.

Parameters:input_string (str) – The input script
Returns:A 2-element tuple of the substituted string and a dict of substitutions
Return type:(str, dict)
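
The substitution step can be sketched with a regular expression. Everything specific here is an assumption made for illustration: the ${...} delimiter, the generated names of the form _expr_N, the reuse of one name for a repeated expression, and the direction of the returned dict (variable name to expression). The function substitute_embedded_exprs is hypothetical and is not the library's code.

```python
import re

EMBEDDED_EXPR = re.compile(r"\$\{([^}]+)\}")  # assumed ${...} delimiter


def substitute_embedded_exprs(script):
    """Replace each embedded expression with a generated variable name."""
    names = {}  # expression -> generated variable name

    def replace(match):
        expr = match.group(1)
        if expr not in names:
            names[expr] = "_expr_{}".format(len(names))
        return names[expr]

    substituted = EMBEDDED_EXPR.sub(replace, script)
    # Return the rewritten script plus a name -> expression mapping.
    return substituted, {name: expr for expr, name in names.items()}


code, substitutions = substitute_embedded_exprs(
    "print(${001002} if ${%n_subsets} > 1 else 0)"
)
```

After substitution the script is plain Python, so it can be compiled once and executed with the variables bound to query results.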
class pybufrkit.script.ScriptRunner(input_string, data_values_nest_level=None, mode='exec')

This class is responsible for running the given script against a BufrMessage object.

code_string
The processed/substituted source code.
code_object
The compiled code object from the code string.
pragma
Extra processing directives
metadata_only
Whether the script requires only metadata part of the BUFR message to work.
querent
The BufrMessageQuerent object for performing the values query.

pybufrkit.renderer

class pybufrkit.renderer.FlatJsonRenderer

This renderer converts the given object to a JSON string by flattening its internal structure.

class pybufrkit.renderer.FlatTextRenderer

This renderer converts the given object to a text string by flattening all its sub-structures.

class pybufrkit.renderer.NestedJsonRenderer

The counterpart to NestedTextRenderer but with JSON as output

class pybufrkit.renderer.NestedTextRenderer

This renderer converts the given object to a text string by honoring all its nested sub-structures.

class pybufrkit.renderer.Renderer

This class is the abstract base Renderer. A renderer provides the contract to take in an object and convert it into a string representation.

render(obj)

Render the given object as string.

Parameters:obj (object) – The object to render
Returns:A string representation of the given object.
Return type:str
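
The contract of an abstract base renderer can be sketched with the standard abc module. SketchRenderer and SketchJsonRenderer are hypothetical classes, and whether the library itself uses abc is not stated in this reference.

```python
import abc
import json


class SketchRenderer(abc.ABC):
    """Hypothetical abstract renderer: subclasses must implement render."""

    @abc.abstractmethod
    def render(self, obj):
        """Return a string representation of obj."""


class SketchJsonRenderer(SketchRenderer):
    """Concrete renderer that emits JSON with sorted keys."""

    def render(self, obj):
        return json.dumps(obj, sort_keys=True)


rendered = SketchJsonRenderer().render({"n_subsets": 1, "edition": 4})
```

Because every renderer exposes the same render(obj) method, calling code can switch output formats without changing anything else.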

pybufrkit.bitops

class pybufrkit.bitops.BitStringBitReader(s)

A BitReader implementation using the bitstring module.

Parameters:bit_stream (bitstring.BitStream) – Bit stream created from the input string
get_pos()

Retrieve the bit position for next read

read_bin(nbits)

Read the given number of bits and return them as the bytes representation of a binary number

read_bool()

Read one bit for value of boolean

read_bytes(nbytes)

Read number of bytes for value of bytes type

read_int(nbits)

Read number of bits as integer

read_uint(nbits)

Read the given number of bits as an unsigned integer
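
To make the reader methods concrete, here is a minimal pure-Python bit reader. It is a sketch only: SketchBitReader is hypothetical, it does not use the bitstring module, and it assumes bits are consumed most significant bit first.

```python
class SketchBitReader:
    """Hypothetical bit reader over a bytes object (not the bitstring-based class)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0   # bit position of the next read

    def get_pos(self):
        return self.pos

    def read_uint(self, nbits):
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1   # most significant bit first
            value = (value << 1) | bit
            self.pos += 1
        return value

    def read_bool(self):
        return self.read_uint(1) == 1


reader = SketchBitReader(bytes([0b10110000, 0b00000101]))
flag = reader.read_bool()            # consumes 1 bit
three_bits = reader.read_uint(3)     # consumes the next 3 bits
twelve_bits = reader.read_uint(12)   # crosses the byte boundary
```

Reads are not aligned to byte boundaries, which is why the reader tracks its position in bits rather than bytes.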

class pybufrkit.bitops.BitStringBitWriter

A BitWriter implementation using the bitstring module.

get_pos()

Retrieve the bit position for next write

set_uint(value, nbits, bitpos)

Set an unsigned integer value of given number of bits at the bit position and replace the old value.

skip(nbits)

Skip ahead by the given number of bits

to_bytes()

Dump all content as a value of bytes type

write_bin(value)

Write a binary number represented by the given value. The length is determined by the value.

write_bool(value)

Write one bit for value of boolean type

write_bytes(value, nbytes=None)

Write the given number of bytes from a value of bytes type. If nbytes is None, use the length of the given bytes value

write_int(value, nbits)

Write given number of bits for value of signed integer

write_uint(value, nbits)

Write the given number of bits for a value of unsigned integer type
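
One plausible use of get_pos together with set_uint is to reserve space for a length field and fill it in once the length is known. The sketch below illustrates that pattern with a hypothetical SketchBitWriter that stores one integer per bit for clarity. It is not the bitstring-based implementation, and the back-filling scenario is an assumption about how the methods are meant to be combined.

```python
class SketchBitWriter:
    """Hypothetical bit writer that keeps one 0/1 integer per bit."""

    def __init__(self):
        self.bits = []

    @staticmethod
    def _to_bits(value, nbits):
        # Most significant bit first.
        return [(value >> shift) & 1 for shift in range(nbits - 1, -1, -1)]

    def get_pos(self):
        return len(self.bits)

    def write_uint(self, value, nbits):
        self.bits.extend(self._to_bits(value, nbits))

    def set_uint(self, value, nbits, bitpos):
        # Overwrite nbits starting at bitpos without moving the write position.
        self.bits[bitpos:bitpos + nbits] = self._to_bits(value, nbits)

    def to_bytes(self):
        padded = self.bits + [0] * (-len(self.bits) % 8)
        return bytes(
            int("".join(map(str, padded[i:i + 8])), 2)
            for i in range(0, len(padded), 8)
        )


writer = SketchBitWriter()
length_pos = writer.get_pos()
writer.write_uint(0, 24)   # placeholder for a length that is not yet known
writer.write_uint(7, 8)    # some payload
writer.set_uint(writer.get_pos() // 8, 24, length_pos)  # back-fill length in bytes
encoded = writer.to_bytes()
```
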

pybufrkit.bitops.get_bit_reader(s)

Initialise and return a BitReader for the given string. This function is intended to shield the caller from the actual implementation of BitReader.

Parameters:s – The byte string to read from.
Returns:BitReader
pybufrkit.bitops.get_bit_writer()

Initialise and return a BitWriter.

Returns:BitWriter

pybufrkit.commands

This file gathers all the functions that support the command line usages.

pybufrkit.commands.command_decode(ns)

Command to decode given files from command line.

pybufrkit.commands.command_info(ns)

Command to show metadata information of given files from command line.

pybufrkit.commands.command_encode(ns)

Command to encode given JSON file from command line into BUFR file.

pybufrkit.commands.command_lookup(ns)

Command to look up the given descriptors from command line

pybufrkit.commands.command_compile(ns)

Command to compile the given descriptors.

pybufrkit.commands.command_subset(ns)

Command to subset and save the given BUFR file.

pybufrkit.commands.command_query(ns)

Command to query given BUFR files.

pybufrkit.commands.command_script(ns)

Command to execute script against given BUFR files.

pybufrkit.commands.command_split(ns)

Command to split given files from command line into one file per BufrMessage.

pybufrkit.utils

class pybufrkit.utils.EntityEncoder(skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, encoding='utf-8', default=None)
default(o)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
pybufrkit.utils.fixed_width_repr_of_int(value, width, pad_left=True)

Format the given integer and ensure the result string is of the given width. The string is padded with spaces on the left if the number is too short, or replaced by a string of asterisks if the number is too long to fit.

Parameters:
  • value (int) – An integer number to format
  • width (int) – The result string must have the exact width
Returns:

A string representation of the given integer.
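
The formatting rule can be sketched in a few lines. The function fixed_width_int is a hypothetical re-statement of the description above, not the library's code, and the behaviour for pad_left=False (padding on the right) is an assumption.

```python
def fixed_width_int(value, width, pad_left=True):
    """Hypothetical sketch: format value in exactly width characters."""
    text = str(value)
    if len(text) > width:
        return "*" * width   # the number does not fit
    # Assumed: pad_left=False pads on the right instead.
    return text.rjust(width) if pad_left else text.ljust(width)


padded = fixed_width_int(42, 5)
overflow = fixed_width_int(123456, 4)
right_padded = fixed_width_int(42, 5, pad_left=False)
```
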

pybufrkit.utils.flat_text_to_flat_json(flat_text)

Convert the flat Text output to the flat JSON output format

Parameters:flat_text (str) – The flat text output
pybufrkit.utils.flatten_list(values)

Flatten a list so everything is in a list without nesting.
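
A recursive flatten along these lines might look like the sketch below. The function flatten is hypothetical, and it treats only list instances as nested containers, which may differ from the library's behaviour.

```python
def flatten(values):
    """Hypothetical sketch: return a new list with all nesting removed."""
    flat = []
    for item in values:
        if isinstance(item, list):
            flat.extend(flatten(item))   # recurse into nested lists
        else:
            flat.append(item)
    return flat


flattened = flatten([1, [2, [3, 4]], 5])
```
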

pybufrkit.utils.generate_quiet(iterable, next_val)

Iterate, returning if the generator function raises StopIteration.

https://www.python.org/dev/peps/pep-0479/
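
Under PEP 479, a StopIteration that escapes from inside a generator body is turned into a RuntimeError, so it has to be caught and converted into a plain return. A minimal sketch of that pattern follows. The function quiet_generator is hypothetical, and it assumes next_val is a callable that takes the iterable and returns its next value.

```python
def quiet_generator(iterable, next_val):
    """Yield values from next_val(iterable) until it raises StopIteration."""
    while True:
        try:
            yield next_val(iterable)
        except StopIteration:
            return   # end quietly instead of leaking StopIteration


numbers = iter([1, 2, 3])
collected = list(quiet_generator(numbers, next))
```
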

pybufrkit.utils.nested_json_to_flat_json(nested_json_data)

Convert the nested JSON output to the flat JSON output. This is useful because the Encoder only works with flat JSON.

Parameters:nested_json_data – The nested JSON object
Returns:Flat JSON object.
pybufrkit.utils.nested_text_to_flat_json(nested_text)

Convert a string in nested text format to a flat JSON object.

Parameters:nested_text (str) – The nested text output
Returns:A flat JSON object

pybufrkit.utils.section_text_to_flat_json(lines, idxline, func_subsets_text_to_flat_json)

Convert a section from text output to a section of flat JSON.

pybufrkit.utils.subsets_flat_text_to_flat_json(lines, idxline)

Convert all subsets data from flat text output to all subsets data of flat JSON.

pybufrkit.utils.subsets_nested_text_to_flat_json(lines, idxline)

Convert all subsets data from nested text format to flat JSON format.

pybufrkit.utils.template_data_nested_json_to_flat_json(template_data_value)

Helper function to convert nested JSON of template data to flat JSON.

pybufrkit.constants

Various constants used in the module.