To work around tables that fail to parse automatically, tables can now be imported from TSV input. This is equivalent to copying and pasting a table from Excel into the curator HTML table input.
This gives you the flexibility to copy an HTML (or PDF) table into Excel and edit it by hand until it conforms to the requirements of the parser. There are some quirks and tricks to this, so please report any common issues.
Importantly, this means you can delete one-line residues, split up multiple atom indices into multiple tables, or break the “bad atom indices” into multiple rows.
REQUIREMENTS:
There are two possible TSV import formats:
- HTML-like TSV
This format requires ONE header row. The parser still splits chemical shifts, multiplicities, and J-coupling constants out of the table cells, as it would for a pure HTML table.
Here is an example TSV file that matches the required format: HTML-like TSV
- Manual TSV
This format allows a curator to manually input data from an old document that cannot be copied and pasted. The requirements for this input are much more stringent, to simplify the parsing. It requires ONE header row, which MUST use the following headers:
  - atom_index: MUST be the first column and header
  - hshift: for proton chemical shifts; MUST be a single decimal value
  - mult: for proton multiplicity
  - coup: for proton coupling constants; MUST be a comma-separated list of decimal values
  - cshift: for carbon-13 chemical shifts; MUST be a single decimal value
You may have multiple compounds in a single table.
Here is an example TSV file that matches the required format: Manual TSV
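
The Manual TSV rules can be checked before import with a small script. The following is only a rough sketch and not the curator's own parser: the function name check_manual_tsv is hypothetical, the sample values are made up for illustration, and it assumes that empty cells are allowed and that no headers other than the five listed may appear. It does not attempt to handle how multiple compounds are separated within one table.

```python
import csv
import io
import re

# Header names taken from the Manual TSV requirements.
KNOWN_HEADERS = {"atom_index", "hshift", "mult", "coup", "cshift"}
# A plain decimal value such as 7.26 or 128 (assumption: no exponents, no ranges).
DECIMAL = re.compile(r"-?\d+(\.\d+)?")


def check_manual_tsv(text):
    """Return a list of problem descriptions; an empty list means none were found."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return ["file is empty"]

    problems = []
    header = [name.strip() for name in rows[0]]
    if not header or header[0] != "atom_index":
        problems.append("first column header must be 'atom_index'")
    for name in header:
        if name not in KNOWN_HEADERS:
            problems.append(f"unknown header: {name!r}")

    # Data rows start on line 2 because line 1 is the single header row.
    for line_no, row in enumerate(rows[1:], start=2):
        cells = dict(zip(header, (cell.strip() for cell in row)))
        for name in ("hshift", "cshift"):
            value = cells.get(name, "")
            if value and not DECIMAL.fullmatch(value):
                problems.append(
                    f"line {line_no}: {name} is not a single decimal value: {value!r}"
                )
        coup = cells.get("coup", "")
        if coup:
            for part in coup.split(","):
                if not DECIMAL.fullmatch(part.strip()):
                    problems.append(
                        f"line {line_no}: coup has a non-decimal entry: {part.strip()!r}"
                    )
    return problems


# Made-up sample: the second data row uses a shift range, which is not allowed.
sample = (
    "atom_index\thshift\tmult\tcoup\tcshift\n"
    "1\t7.26\td\t8.0, 1.5\t128.4\n"
    "2\t3.1-3.3\tm\t\t55.0\n"
)
for problem in check_manual_tsv(sample):
    print(problem)
```

Running a check like this before import catches the most common manual-entry mistakes, such as a shift range typed where a single value is required.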