Zip package, which contains
- One text file per clipping
- Log of downloaded clippings
- Clippings metadata Excel files for 1820-1883 and 1884-1885
Folder Structure:
translocalis_data/year/ISSN/txt/388373_2973653_1457-4403_1868-04-09_15_page-2
- The file name consists of article_id, binding_id, ISSN, publishing date (YYYY-MM-DD), issue, and the page from which the clipping was taken, separated by underscores.
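The naming scheme above can be unpacked programmatically. The following is a minimal Python sketch, not part of the package: the names `ClippingName`, `parse_clipping_name` and `iter_clippings` are hypothetical, and the pattern assumes that an empty issue leaves an empty segment between two underscores and that files may carry an optional `.txt` extension. Adjust it if the actual files differ.

```python
import re
from datetime import date
from pathlib import Path
from typing import Iterator, NamedTuple, Optional, Tuple


class ClippingName(NamedTuple):
    """Fields encoded in a clipping file name (hypothetical helper type)."""
    article_id: int
    binding_id: int
    issn: str
    published: date
    issue: str  # may be empty or contain letters such as A or B
    page: int


# article_id _ binding_id _ ISSN _ YYYY-MM-DD _ issue _ page-N
# Assumption: an optional ".txt" suffix; the example path shows none.
_NAME_PATTERN = re.compile(
    r"^(?P<article_id>\d+)_(?P<binding_id>\d+)"
    r"_(?P<issn>\d{4}-\d{3}[\dXx])"
    r"_(?P<published>\d{4}-\d{2}-\d{2})"
    r"_(?P<issue>.*)_page-(?P<page>\d+)(?:\.txt)?$"
)


def parse_clipping_name(name: str) -> Optional[ClippingName]:
    """Return the parsed fields, or None if the name does not fit the scheme."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        return None
    try:
        published = date.fromisoformat(match["published"])
    except ValueError:
        # Date-shaped but not a real calendar date.
        return None
    return ClippingName(
        article_id=int(match["article_id"]),
        binding_id=int(match["binding_id"]),
        issn=match["issn"],
        published=published,
        issue=match["issue"],
        page=int(match["page"]),
    )


def iter_clippings(root: str) -> Iterator[Tuple[Path, ClippingName]]:
    """Walk <root>/<year>/<ISSN>/txt/ and yield each file with its parsed name."""
    for path in sorted(Path(root).glob("*/*/txt/*")):
        if not path.is_file():
            continue
        parsed = parse_clipping_name(path.name)
        if parsed is not None:
            yield path, parsed
```

For example, `parse_clipping_name("388373_2973653_1457-4403_1868-04-09_15_page-2")` yields article_id 388373, binding_id 2973653, ISSN 1457-4403, date 1868-04-09, issue "15" and page 2. Calling `iter_clippings("translocalis_data")` would walk the unpacked folder tree in the same way.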
Clippings metadata (2 files)
(1820-1883): translocalis_clippings_export_1820_1883.xlsx
(1884-1885): translocalis_clippings_export_1884_1885.xlsx
Fields in the metadata Excel files:
Main title - name of the newspaper
ISSN - International Standard Serial Number, i.e. the newspaper identifier
Date - Publishing date of the issue
Issue - Issue number of the newspaper (can be empty, or contain letters such as A or B, e.g. for extra editions)
URL - link to the clipping page
Title - Title of the article
Keywords - keywords of clipping (see Wiki for full explanations of translocalis_xyz fields)
Category - category of the article
Subject
Notes - extra notes, if any
Created - when clipping was created
OCR - text contents of the clipping