LAL: Linear Arrangement Library 23.01.00
A library focused on algorithms on linear arrangements of graphs.
lal::io::treebank_collection_reader Class Reference

A reader for a collection of treebanks.

#include <treebank_collection_reader.hpp>

Public Member Functions

treebank_error init (const std::string &main_file) noexcept
 Initialise the reader with a new collection.
 
bool end () const noexcept
 
void next_treebank () noexcept
 Opens the file of the next treebank in the main file.
 
treebank_reader & get_treebank_reader () noexcept
 Returns a treebank reader class instance for processing a treebank.
 

Private Member Functions

void step_line () noexcept
 Consumes one line of the main file m_main_file.
 

Private Attributes

std::string m_main_file = "none"
 File containing the list of languages and their treebanks.
 
std::string m_cur_treebank_name = "none"
 The name of the current treebank file.
 
std::string m_cur_treebank_filename = "none"
 The name of the current treebank file.
 
std::ifstream m_list
 Handler for main file reading.
 
treebank_reader m_treebank_reader
 Object to process a language's treebank.
 
bool m_reached_end = false
 Did we reach the end of the file?
 
bool m_no_more_treebanks = false
 Have all trees in the file been consumed?
 

Detailed Description

A reader for a collection of treebanks.

Objects of this class, referred to as "collection readers", provide an interface for custom processing of a set of treebanks (see Treebank Collection and Treebank for further details on treebanks and treebank collections).

The user has to initialise a collection reader with the main file of the collection (the file that lists its treebanks). For example, to read the Stanford collection, the collection reader has to be initialised with that collection's main file, stanford.txt. Bear in mind that a collection reader only processes the main file: it iterates through the list of files within the main file using the method next_treebank. This method can be called as long as the method end returns false. Each call to next_treebank builds an object of class treebank_reader, which allows the user to iterate through the trees within the corresponding file. This object can be retrieved by calling the method get_treebank_reader.

An example of usage of this class is given in the following piece of code.

lal::io::treebank_collection_reader tbcolreader;
// it is advisable to check for errors
auto err = tbcolreader.init(mainf);
while (not tbcolreader.end()) {
    lal::io::treebank_reader& tbreader = tbcolreader.get_treebank_reader();
    // skip any treebank whose file could not be opened
    if (not tbreader.is_open()) {
        tbcolreader.next_treebank();
        continue;
    }
    // here goes your custom processing of the treebank
    // ...
    tbcolreader.next_treebank();
}
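The control flow of the loop above (test end, skip a treebank that cannot be opened, process, advance) can be tried without LAL or any files on disk. The following is only a minimal sketch of that protocol: toy_entry, toy_collection_reader and readable_treebanks are hypothetical names introduced here for illustration, and nothing in it reflects how LAL implements its readers.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one entry of a main file (NOT part of LAL).
// 'readable' plays the role of treebank_reader::is_open() in the example above.
struct toy_entry {
    std::string filename;
    bool readable;
};

// Toy model of the end() / next_treebank() protocol over an in-memory list.
class toy_collection_reader {
public:
    explicit toy_collection_reader(std::vector<toy_entry> entries)
        : m_entries(std::move(entries)) {}

    // True once every entry has been consumed.
    bool end() const noexcept { return m_pos >= m_entries.size(); }

    // Move on to the next entry; does nothing once the end is reached.
    void next_treebank() noexcept {
        if (!end()) { ++m_pos; }
    }

    // Entry currently pointed at; only valid while end() is false.
    const toy_entry& current() const { return m_entries[m_pos]; }

private:
    std::vector<toy_entry> m_entries;
    std::size_t m_pos = 0;
};

// Same loop shape as the example: skip unreadable entries, process the rest.
std::vector<std::string> readable_treebanks(toy_collection_reader reader) {
    std::vector<std::string> processed;
    while (!reader.end()) {
        const toy_entry& entry = reader.current();
        if (!entry.readable) {
            reader.next_treebank();
            continue;
        }
        processed.push_back(entry.filename); // "custom processing" goes here
        reader.next_treebank();
    }
    return processed;
}
```

Note that both branches of the loop call next_treebank before the next test of end. Forgetting the call in the skip branch would make the loop spin forever on the first unreadable entry.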

Member Function Documentation

◆ end()

bool lal::io::treebank_collection_reader::end ( ) const
inline noexcept

Returns true if there are no more treebanks to be read, and false if there is still a next treebank to be read.

◆ init()

treebank_error lal::io::treebank_collection_reader::init ( const std::string &  main_file)
noexcept

Initialise the reader with a new collection.

Objects of this class can't be used to read a treebank until this method returns no error.

Parameters
main_file    Main file of the collection.
Returns
The type of the error, if any. The list of errors that this method can return is:

◆ next_treebank()

void lal::io::treebank_collection_reader::next_treebank ( )
noexcept

Opens the file of the next treebank in the main file.

This method can be called even after it has returned an error.

Member Data Documentation

◆ m_main_file

std::string lal::io::treebank_collection_reader::m_main_file = "none"
private

File containing the list of languages and their treebanks.

Each line of this file contains two strings: the first is the language name (used mainly for debugging purposes), and the second is the name of the file containing the syntactic dependency trees of that language.
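To make the two-strings-per-line layout concrete, here is a small sketch that splits such lines into (name, file name) pairs using only the standard library. The function parse_main_file is a hypothetical helper, not part of LAL. It assumes the two strings are separated by whitespace and contain no spaces themselves, and it skips lines that do not hold two strings; the library may treat such lines differently.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (NOT part of LAL): read "name filename" pairs,
// one pair per line, from any input stream.
std::vector<std::pair<std::string, std::string>>
parse_main_file(std::istream& in) {
    std::vector<std::pair<std::string, std::string>> entries;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name, filename;
        // Assumption: blank or incomplete lines are ignored.
        if (fields >> name >> filename) {
            entries.emplace_back(name, filename);
        }
    }
    return entries;
}
```

Passing a std::ifstream opened on the main file, or a std::istringstream holding its contents, yields one pair per well-formed line, in file order.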


The documentation for this class was generated from the following file: