LAL: Linear Arrangement Library 24.10.00
A library focused on algorithms on linear arrangements of graphs.
A reader for a collection of treebanks.
#include <treebank_collection_reader.hpp>
Public Member Functions
treebank_file_error | init (const std::string &main_file) noexcept
Initialize the reader with a new collection.
bool | end () const noexcept
void | next_treebank () noexcept
Opens the file of the next treebank in the main file.
treebank_reader & | get_treebank_reader () noexcept
Returns a treebank reader class instance for processing a treebank.
Private Member Functions
void | step_line () noexcept
Consumes one line of the main file m_main_file.
Private Attributes
std::string | m_main_file = "none"
File containing the list of languages and their treebanks.
std::string | m_cur_treebank_id = "none"
The identifier of the current treebank file.
std::string | m_cur_treebank_filename = "none"
The name of the current treebank file.
std::ifstream | m_list
Handler for main file reading.
treebank_reader | m_treebank_reader
Object to process a language's treebank.
bool | m_reached_end = false
Did we reach the end of the file?
bool | m_no_more_treebanks = false
Have all trees in the file been consumed?
A reader for a collection of treebanks.
Objects of this class, referred to as "collection readers", provide an interface for custom processing of a set of treebanks (see Treebank Collection and Treebank for further details on treebanks and treebank collections).
The user has to initialize a collection reader with the collection's main file, that is, the file that lists the treebank files. For example, to read the Stanford collection the reader has to be initialized with the main file stanford.txt, which could contain the contents given above. Bear in mind that a collection reader only processes the main file: it iterates through the list of files within the main file using the method next_treebank, which can be called as long as the method end returns false. Each call to next_treebank builds an object of class treebank_reader, which allows the user to iterate through the trees within the corresponding file. This object can be retrieved by calling the method get_treebank_reader.
An example of usage of this class is given in the following piece of code.
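The iteration protocol described above can be sketched without the library itself. What follows is a minimal, self-contained illustration and not LAL code: mock_collection_reader and mock_treebank_reader are hypothetical stand-ins that only mirror the method names end, next_treebank and get_treebank_reader used on this page. The inner methods end, get_tree and next_tree, the representation of trees as plain strings, and the assumption that the first treebank is already available before the first call to next_treebank are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for treebank_reader: iterates over the trees of one
// treebank file. Trees are plain strings here (an assumption of this sketch).
class mock_treebank_reader {
public:
    mock_treebank_reader() = default;
    explicit mock_treebank_reader(std::vector<std::string> trees)
        : m_trees(std::move(trees)) {}

    // True once every tree of this treebank has been consumed.
    bool end() const noexcept { return m_pos >= m_trees.size(); }
    // Current tree; must not be called when end() is true.
    const std::string& get_tree() const noexcept {
        assert(!end());
        return m_trees[m_pos];
    }
    void next_tree() noexcept { ++m_pos; }

private:
    std::vector<std::string> m_trees;
    std::size_t m_pos = 0;
};

// Hypothetical stand-in for the collection reader: each inner vector plays
// the role of one treebank file listed in the main file.
class mock_collection_reader {
public:
    explicit mock_collection_reader(std::vector<std::vector<std::string>> treebanks)
        : m_treebanks(std::move(treebanks)) { load_current(); }

    // True once there are no more treebanks to read.
    bool end() const noexcept { return m_pos >= m_treebanks.size(); }
    // Moves on to the next treebank and builds a fresh reader for it.
    void next_treebank() { ++m_pos; load_current(); }
    mock_treebank_reader& get_treebank_reader() noexcept { return m_reader; }

private:
    void load_current() {
        if (!end()) { m_reader = mock_treebank_reader(m_treebanks[m_pos]); }
    }

    std::vector<std::vector<std::string>> m_treebanks;
    std::size_t m_pos = 0;
    mock_treebank_reader m_reader;
};

// The two nested loops: the outer one walks the treebanks of the collection,
// the inner one walks the trees of the current treebank.
std::vector<std::string> collect_trees(mock_collection_reader& collection) {
    std::vector<std::string> seen;
    while (!collection.end()) {
        mock_treebank_reader& reader = collection.get_treebank_reader();
        while (!reader.end()) {
            seen.push_back(reader.get_tree()); // "process" the tree
            reader.next_tree();
        }
        collection.next_treebank();
    }
    return seen;
}
```

With the real class, the collection reader would first be initialized through init, and the returned treebank_file_error checked, before entering the outer loop.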
bool end () const noexcept | inline nodiscard noexcept
Returns true if there are no more treebanks to be read, and false otherwise.
treebank_file_error init (const std::string &main_file) noexcept | nodiscard noexcept
Initialize the reader with a new collection.
Objects of this class cannot be used to read a treebank until this method has returned without error.
Parameters
main_file | Main file of the collection.
void next_treebank () noexcept | noexcept
Opens the file of the next treebank in the main file.
This method can be called even after it has returned an error.
std::string m_main_file = "none" | private
File containing the list of languages and their treebanks.
Each line of this file contains two strings: first the language name (used mainly for debugging purposes), then the name of the file containing the syntactic dependency trees of that language.
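To make the two-strings-per-line layout concrete, here is a minimal sketch of how such a main file could be parsed with the standard library alone. It is not the library's implementation: main_file_entry and parse_main_file are hypothetical names, and the sketch assumes that the two strings are separated by whitespace, that file names contain no spaces, and that blank or incomplete lines are simply skipped.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one line of the main file.
struct main_file_entry {
    std::string id;       // language name / treebank identifier
    std::string filename; // file with that language's dependency trees
};

// Reads every line of `in` and keeps those that contain two
// whitespace-separated strings (an assumption about the format).
std::vector<main_file_entry> parse_main_file(std::istream& in) {
    std::vector<main_file_entry> entries;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        main_file_entry entry;
        if (fields >> entry.id >> entry.filename) {
            // A successful extraction never yields empty strings.
            assert(!entry.id.empty() && !entry.filename.empty());
            entries.push_back(std::move(entry));
        }
        // Lines with fewer than two strings are skipped silently.
    }
    return entries;
}
```

Passing a std::ifstream opened on the main file, or a std::istringstream holding its contents, yields one entry per well-formed line, in file order.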