LAL: Linear Arrangement Library 24.10.00
A library focused on algorithms on linear arrangements of graphs.
lal::io::treebank_reader Class Reference

A reader for a single treebank file.

#include <treebank_reader.hpp>

Public Member Functions

treebank_file_error init (const std::string &treebank_filename, const std::string &treebank_id="") noexcept
 Initializes the treebank reader.
 
bool end () const noexcept
 Returns whether all trees have been processed, i.e., whether no tree is left to read.
 
void next_tree () noexcept
 Retrieves the next tree in the file.
 
std::size_t get_num_trees () const noexcept
 Returns the number of trees processed so far.
 
const std::string & get_treebank_identifier () const noexcept
 Returns the identifier of the treebank.
 
const std::string & get_treebank_filename () const noexcept
 Returns the name of the treebank file.
 
graphs::rooted_tree get_tree () const noexcept
 Returns the current tree.
 
head_vector get_head_vector () const noexcept
 Returns the current head vector.
 
bool is_open () const noexcept
 Can the treebank be read?
 
void set_normalize (const bool v) noexcept
 Should trees be normalized?
 
void set_calculate_size_subtrees (const bool v) noexcept
 Should the size of the subtrees be calculated?
 
void set_calculate_tree_type (const bool v) noexcept
 Should the tree be classified into types?
 
void set_identifier (const std::string &id) noexcept
 Set this treebank's identifier string.
 

Private Attributes

std::string m_treebank_identifier = "none"
 Identifier for the treebank.
 
std::string m_treebank_file = "none"
 Treebank's file name (with the full path).
 
std::ifstream m_treebank
 Handler for main file reading.
 
std::size_t m_num_trees = 0
 Number of trees processed so far.
 
std::string m_current_line
 Current line.
 
head_vector m_current_head_vector
 Current head vector.
 
bool m_normalize_tree = true
 Normalize the current tree.
 
bool m_calculate_size_subtrees = true
 Calculate the size of the subtrees of the generated rooted tree.
 
bool m_calculate_tree_type = true
 Calculate the type of tree of the generated tree.
 
bool m_no_more_trees = false
 Have all trees in the file been consumed?
 

Detailed Description

A reader for a single treebank file.

This class, the objects of which will be referred to as the "readers", offers a simple interface for iterating over the trees in a single treebank file, henceforth referred to as the treebank (see Treebank for further details on treebank files).

To use a reader, first initialize it with the treebank file and, optionally, a self-descriptive string that identifies the treebank (e.g., the ISO code of a language). Once initialized, the first tree can be retrieved with get_tree. The remaining trees are reached by calling next_tree, which may only be called while end returns false.

If an object of this class was returned by the class treebank_collection_reader, then methods get_treebank_filename and get_treebank_identifier might prove useful for debugging since they return, respectively, the full name (path included) of the treebank and an identifier string.

An example of usage of this class is given in the following piece of code.

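lal::io::treebank_reader tbread;
// 'main_file' is the path to the treebank file.
// it is advisable to check for errors: 'err' holds the result of the
// initialization, and is_open() tells whether the treebank can be read
const auto err = tbread.init(main_file);
if (tbread.is_open()) {
    while (not tbread.end()) {
        const lal::graphs::rooted_tree t = tbread.get_tree();
        // process tree 't'
        // ....
        tbread.next_tree();
    }
}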
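The iteration protocol described above (initialize, read the current tree, advance, stop once end returns true) can also be sketched without LAL itself. The block below is a minimal stand-in and not the library's implementation: mock_reader, edges_of and summarize are hypothetical names, the input is a string rather than a file, and it assumes the plain-text treebank format in which every non-empty line is a head vector (entry i is the parent of vertex i, with 0 marking the root).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only; this is not LAL's code.
using head_vector = std::vector<std::uint64_t>;

class mock_reader {
public:
    // Plays the role of init(): binds the input and loads the first tree.
    explicit mock_reader(const std::string& contents) : m_input(contents) {
        next_tree();
    }

    // True once every tree has been consumed.
    bool end() const noexcept { return m_no_more_trees; }

    // Reads the next non-empty line and parses it as a head vector.
    void next_tree() {
        std::string line;
        while (std::getline(m_input, line)) {
            if (line.find_first_not_of(" \t\r") == std::string::npos) {
                continue; // skip blank lines
            }
            std::istringstream fields(line);
            m_current.clear();
            for (std::uint64_t h; fields >> h;) {
                m_current.push_back(h);
            }
            ++m_num_trees;
            return;
        }
        m_no_more_trees = true;
    }

    const head_vector& get_head_vector() const noexcept { return m_current; }
    std::size_t get_num_trees() const noexcept { return m_num_trees; }

private:
    std::istringstream m_input;
    head_vector m_current;
    std::size_t m_num_trees = 0;
    bool m_no_more_trees = false;
};

// Converts a head vector into (parent, child) edges with 1-based vertices.
std::vector<std::pair<std::uint64_t, std::uint64_t>>
edges_of(const head_vector& hv) {
    std::vector<std::pair<std::uint64_t, std::uint64_t>> edges;
    for (std::size_t i = 0; i < hv.size(); ++i) {
        assert(hv[i] <= hv.size()); // a parent must be an existing vertex
        if (hv[i] != 0) {
            edges.emplace_back(hv[i], i + 1);
        }
    }
    return edges;
}

// Iterates over all trees; returns (number of trees, total number of edges).
std::pair<std::size_t, std::size_t> summarize(const std::string& contents) {
    mock_reader reader(contents);
    std::size_t num_edges = 0;
    while (not reader.end()) {
        num_edges += edges_of(reader.get_head_vector()).size();
        reader.next_tree();
    }
    return {reader.get_num_trees(), num_edges};
}
```

For an input made of the two lines "0 3 4 1 6 3" and "2 0", summarize reports 2 trees and 6 edges. With LAL, the same loop would call init, end, get_tree (or get_head_vector) and next_tree on a lal::io::treebank_reader.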

Member Function Documentation

◆ get_num_trees()

std::size_t lal::io::treebank_reader::get_num_trees ( ) const
inline nodiscard noexcept

Returns the number of trees processed so far.

When method end returns 'true', this method returns the exact number of trees in the treebank.

◆ init()

treebank_file_error lal::io::treebank_reader::init ( const std::string & treebank_filename,
const std::string & treebank_id = "" )
nodiscard noexcept

Initializes the treebank reader.

Parameters
treebank_filename  Treebank file name.
treebank_id        Identifier string for the treebank.
Returns
The type of the error, if any. The list of errors that this method can return is:
Postcondition
The number of trees processed, m_num_trees, is always set to 0.

◆ is_open()

bool lal::io::treebank_reader::is_open ( ) const
inline nodiscard noexcept

Can the treebank be read?

If the init method returned an error different from lal::io::treebank_file_error_type::no_error then this returns false.

Returns
Whether the treebank is readable or not.
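Since the reader keeps its file in a std::ifstream (the private attribute m_treebank), the check made by is_open resembles the standard stream check sketched below. This is only a sketch under the assumption that is_open reflects the state of that stream; can_be_read is a hypothetical helper and not part of LAL.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: true if the file at 'path' can be opened for reading.
bool can_be_read(const std::string& path) {
    std::ifstream file(path);
    return file.is_open();
}
```

A caller would skip the iteration loop entirely whenever such a check fails, just as it would when is_open returns false on a reader.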

◆ next_tree()

void lal::io::treebank_reader::next_tree ( )
noexcept

Retrieves the next tree in the file.

Postcondition
Increments the number of trees found.

◆ set_calculate_size_subtrees()

void lal::io::treebank_reader::set_calculate_size_subtrees ( const bool v)
inline noexcept

Should the size of the subtrees be calculated?

Parameters
v  Boolean value.

◆ set_calculate_tree_type()

void lal::io::treebank_reader::set_calculate_tree_type ( const bool v)
inline noexcept

Should the tree be classified into types?

See lal::graphs::tree_type for details on the classification.

Parameters
v  Boolean value.

◆ set_identifier()

void lal::io::treebank_reader::set_identifier ( const std::string & id)
inline noexcept

Set this treebank's identifier string.

This method overwrites the contents of m_treebank_identifier. It is most useful when the identifier string needs to change after the treebank reader has been initialized.

Parameters
id  Identifier string.

◆ set_normalize()

void lal::io::treebank_reader::set_normalize ( const bool v)
inline noexcept

Should trees be normalized?

Parameters
v  Boolean value.
