|
treebank_file_error | init (const std::string &treebank_filename, const std::string &treebank_id="") noexcept |
| Initializes the treebank reader.
|
|
bool | end () const noexcept |
| Returns whether all trees have been processed, i.e., true when there is no other tree left to be processed.
|
|
void | next_tree () noexcept |
| Retrieves the next tree in the file.
|
|
std::size_t | get_num_trees () const noexcept |
| Returns the number of trees processed so far.
|
|
const std::string & | get_treebank_identifier () const noexcept |
| Returns the identifier of the treebank.
|
|
const std::string & | get_treebank_filename () const noexcept |
| Returns the name of the treebank file.
|
|
graphs::rooted_tree | get_tree () const noexcept |
| Returns the current tree.
|
|
head_vector | get_head_vector () const noexcept |
| Returns the current head vector.
|
|
bool | is_open () const noexcept |
| Can the treebank be read?
|
|
void | set_normalize (const bool v) noexcept |
| Should trees be normalized?
|
|
void | set_calculate_size_subtrees (const bool v) noexcept |
| Should the size of the subtrees be calculated?
|
|
void | set_calculate_tree_type (const bool v) noexcept |
| Should the tree be classified into types?
|
|
void | set_identifier (const std::string &id) noexcept |
| Set this treebank's identifier string.
|
|
|
std::string | m_treebank_identifier = "none" |
| Identifier for the treebank.
|
|
std::string | m_treebank_file = "none" |
| Treebank's file name (with the full path).
|
|
std::ifstream | m_treebank |
| Handler for main file reading.
|
|
std::size_t | m_num_trees = 0 |
| Number of trees processed so far.
|
|
std::string | m_current_line |
| Current line.
|
|
head_vector | m_current_head_vector |
| Current head vector.
|
|
bool | m_normalize_tree = true |
| Normalize the current tree.
|
|
bool | m_calculate_size_subtrees = true |
| Calculate the size of the subtrees of the generated rooted tree.
|
|
bool | m_calculate_tree_type = true |
| Calculate the type of tree of the generated tree.
|
|
bool | m_no_more_trees = false |
| Have all trees in the file been consumed?
|
|
A reader for a single treebank file.
This class, the objects of which will be referred to as the "readers", offers a simple interface for iterating over the trees in a single treebank file, henceforth referred to as the treebank (see Treebank for further details on treebank files).
To use it, an object of this class must first be initialized with the treebank file and, optionally, a self-descriptive string, i.e., something that identifies the treebank (e.g., an ISO code of a language). Once initialized, the first tree can be retrieved with get_tree. The remaining trees can be iterated over by calling next_tree. This function can only be called as long as end returns false.
If an object of this class was returned by the class treebank_collection_reader, then methods get_treebank_filename and get_treebank_identifier might prove useful for debugging since they return, respectively, the full name (path included) of the treebank and an identifier string.
An example of usage of this class is given in the following piece of code.
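treebank_reader tbread;
const auto err = tbread.init(main_file);
// check 'err' for errors before iterating
while (not tbread.end()) {
    const graphs::rooted_tree t = tbread.get_tree();
    // process tree 't'
    tbread.next_tree();
}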
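The iteration protocol described above (load the first tree on initialization, test end, read the current tree, call next_tree) can also be illustrated without the library. The following is only a minimal sketch under stated assumptions: the class mini_reader and the function tree_sizes are hypothetical names that are not part of this library, the input is assumed to hold one head vector per line (the i-th number is the parent of node i, and 0 marks the root), and the text is read from a string rather than from a file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Assumption: a head vector is a plain sequence of unsigned integers.
using head_vector = std::vector<std::uint64_t>;

// Hypothetical stand-in that mimics the reader's interface.
// It is NOT the library's implementation.
class mini_reader {
public:
	// Takes the whole treebank as text and loads the first tree, if any.
	explicit mini_reader(const std::string& contents) : m_input(contents) {
		next_tree();
	}

	// True once every tree in the input has been consumed.
	bool end() const noexcept { return m_no_more_trees; }

	// Advances to the next non-blank line. Must only be called while
	// end() returns false.
	void next_tree() {
		assert(not m_no_more_trees);
		std::string line;
		while (std::getline(m_input, line)) {
			// skip lines that contain only whitespace
			if (line.find_first_not_of(" \t\r") == std::string::npos) {
				continue;
			}
			std::istringstream ss(line);
			m_current.clear();
			std::uint64_t head;
			while (ss >> head) {
				m_current.push_back(head);
			}
			++m_num_trees;
			return;
		}
		// no line left: signal the end of the treebank
		m_no_more_trees = true;
	}

	// Head vector of the current tree.
	const head_vector& get_head_vector() const noexcept { return m_current; }

	// Number of trees read so far.
	std::size_t get_num_trees() const noexcept { return m_num_trees; }

private:
	std::istringstream m_input;
	head_vector m_current;
	std::size_t m_num_trees = 0;
	bool m_no_more_trees = false;
};

// Collects the number of nodes of every tree in the given text.
std::vector<std::size_t> tree_sizes(const std::string& contents) {
	std::vector<std::size_t> sizes;
	mini_reader reader(contents);
	while (not reader.end()) {
		sizes.push_back(reader.get_head_vector().size());
		reader.next_tree();
	}
	return sizes;
}
```

For instance, tree_sizes("0 1 1\n2 0 2 3\n") visits two trees, with 3 and 4 nodes respectively. Because the first tree is loaded at construction, the loop body always sees a valid current tree, which is the same pattern the reader above follows with init, end, get_tree and next_tree.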