WOSRecord(ExtendedRecord)

class metaknowledge.WOS.WOSRecord(inRecord, sFile='', sLine=0)

Class for full WOS records

It is meant to be immutable. Many of its methods and attributes are evaluated lazily, when first called rather than when the object is created, and the results are stored privately.

The record’s metadata is stored in an ordered dictionary keyed by WOS tag. To access the raw data stored in the original record, use the tags() method. To access data that has been processed and cleaned, use the attributes named after the tags.

Customizations

The Record’s hashing and equality testing are based on the WOS number (stored under the ‘UT’ tag and also called the accession number). WOS numbers are strings starting with 'WOS:' followed by roughly 15 digits and letters, although both the length and the character set are known to vary. Each number is unique to its record, so it is used for comparisons. If a record is bad, all equality checks return False.

When a record is converted to a string, its title is used, so for a record R, R.TI == R.title == str(R). Its representation uses the WOS number instead of the memory location.
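
The customizations above can be illustrated with a small stand-in class. This is only a sketch of the described behaviour, not the metaknowledge implementation: SketchRecord, its constructor arguments and the WOS number used are all made up for the example.

```python
class SketchRecord:
    """Hypothetical stand-in for a record; NOT metaknowledge's WOSRecord."""

    def __init__(self, ut, title, bad=False):
        self.UT = ut          # WOS number, e.g. 'WOS:' plus ~15 characters
        self.title = title
        self.bad = bad        # True if parsing failed

    def __eq__(self, other):
        if not isinstance(other, SketchRecord):
            return NotImplemented
        # A bad record never compares equal to anything, itself included.
        if self.bad or other.bad:
            return False
        return self.UT == other.UT

    def __hash__(self):
        # Hash on the same key that equality uses.
        return hash(self.UT)

    def __str__(self):
        return self.title

    def __repr__(self):
        # Show the WOS number rather than a memory address.
        return "<SketchRecord {}>".format(self.UT)


# The WOS number below is invented for illustration.
a = SketchRecord("WOS:A1979GV55600001", "A title")
b = SketchRecord("WOS:A1979GV55600001", "A title, re-exported")
broken = SketchRecord("WOS:A1979GV55600001", "A title", bad=True)

unique = {a, b}  # both share a WOS number, so the set keeps one of them
```

Because hashing and equality share the WOS number as their key, two copies of one record collapse to a single entry in a set or dict, which is what makes de-duplication across files straightforward.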

Attributes

When a record is created, if parsing of the WOS file fails, the record is marked as bad: the bad attribute is set to True and the error attribute is created to hold the exception object.

Generally, to get information from a Record, its attributes should be used. For a Record R, accessing R.CR causes citations() from the tagProcessing module to be called on the contents of the raw ‘CR’ field; the result is then saved and returned. In this case, a list of Citation objects is returned. R.citations has the same effect, as each known field tag also has a longer name (currently there are 61 field tags). These names are meant to make accessing tags more readable, and the mapping from tag to name can be found in the tagToFull dict. If a tag is known (in tagToFull) but not in the raw data, None is returned instead. Most tags, when cleaned, return a string or a list of strings; the exact results can be found in the help for the particular function.

The attribute authors is also defined as a convenience and returns the same as ‘AF’ or if that is not found ‘AU’.
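
The attribute lookup described in this section (process on first access, cache the result, return None for a known but absent tag, fall back from ‘AF’ to ‘AU’ for authors) could be sketched roughly as follows. Everything here is an assumption made for illustration: LazyRecord, TAG_TO_FULL, PROCESSORS and the long names given for ‘AF’ and ‘AU’ are hypothetical, and the processing functions are trivial placeholders rather than the ones in tagProcessing.

```python
# Hypothetical four-entry excerpt standing in for the real tagToFull dict.
TAG_TO_FULL = {
    "TI": "title",
    "CR": "citations",
    "AF": "authorsFull",   # assumed long name
    "AU": "authorsShort",  # assumed long name
}
FULL_TO_TAG = {full: tag for tag, full in TAG_TO_FULL.items()}

# Placeholder cleaners; the real ones live in the tagProcessing module.
PROCESSORS = {
    "TI": lambda lines: " ".join(lines),
    "CR": lambda lines: list(lines),
    "AF": lambda lines: list(lines),
    "AU": lambda lines: list(lines),
}


class LazyRecord:
    """Hypothetical sketch of lazy, cached tag attributes."""

    def __init__(self, field_dict):
        self._fields = field_dict  # tag -> list of raw lines
        self._cache = {}           # tag -> processed value
        self.calls = 0             # processing runs, counted for the demo

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name == "authors":
            full = self.AF
            return full if full is not None else self.AU
        tag = FULL_TO_TAG.get(name, name)
        if tag not in TAG_TO_FULL:
            raise AttributeError(name)
        if tag not in self._cache:
            if tag in self._fields:
                self.calls += 1
                self._cache[tag] = PROCESSORS[tag](self._fields[tag])
            else:
                # Known tag that is missing from the raw data.
                self._cache[tag] = None
        return self._cache[tag]


rec = LazyRecord({"TI": ["A title split", "over two lines"], "AU": ["Doe, J"]})
```

With this sketch, rec.TI and rec.title trigger a single processing run between them, rec.CR gives None, and rec.authors falls back to the ‘AU’ value because ‘AF’ is absent.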

__init__

Records are generally created as collections in RecordCollections, not as individual objects. It is possible to create one on its own; the arguments are as follows.

Parameters

inRecord: file stream, dict, str or itertools.chain

If it is a file stream the file must be open at the location of the first tag in the record, usually ‘PT’, and the file will be read until ‘ER’ is found, which indicates the end of the record in the file.

If a dict is passed the dictionary is used as the database of fields and tags, so each key is considered a WOS tag and each value a list of the lines of the original associated with the tag. This is the same form of dict that recordParser returns.

For a string, the input must be the raw textual data of a single record in the WOS style. Like the file stream, it must start at the first tag and end in 'ER'.

itertools.chain is treated identically to a file stream and is used by RecordCollections.

sFile : optional [str]

The name of the file the raw data was in; by default it is blank. It is mostly used to make error messages more informative.

sLine : optional [int]

The line the record starts on in the raw data file. It is mostly used to make error messages more informative.
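
To make the dict form of inRecord concrete, here is a rough sketch that turns the raw text of one record into a mapping from tag to list of lines. It is not recordParser: parse_record_text is a hypothetical helper, and it assumes the common WOS plain-text layout (a two-character tag, a space, then the value, with continuation lines indented by three spaces). Whether the real parser strips that three-character prefix from each stored line is also an assumption.

```python
def parse_record_text(text):
    """Hypothetical sketch: raw single-record text -> {tag: [lines]}."""
    fields = {}  # plain dicts keep insertion order on Python 3.7+
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        tag = line[:2]
        if tag == "ER":
            break  # end-of-record marker
        if tag == "  ":
            # Continuation of the previous tag's value.
            if current is None:
                raise ValueError("continuation line before any tag")
            fields[current].append(line[3:])
        else:
            current = tag
            fields[current] = [line[3:]]
    else:
        # The loop finished without meeting 'ER'.
        raise ValueError("no ER line found")
    return fields


# Invented record text, for illustration only.
raw = (
    "PT J\n"
    "AU Doe, J\n"
    "   Roe, R\n"
    "TI A title split\n"
    "   over two lines\n"
    "UT WOS:A1979GV55600001\n"
    "ER\n"
)
fields = parse_record_text(raw)
```

The resulting dict starts with ‘PT’ and holds two lines each under ‘AU’ and ‘TI’, which matches the description of each value as a list of the original lines associated with its tag.
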
UT

Returns the UT tag (WOS number) of the record

__init__(inRecord, sFile='', sLine=0)

See help on Record for details

encoding()

An abstractmethod, gives the encoding string of the record.

static getAltName(tag)

An abstractmethod, gives the alternate name of tag or None

specialFuncs(key)

An abstractmethod, processes the special tag key using the whole Record

static tagProcessingFunc(tag)

An abstractmethod, gives the function for processing tag

wosString

Returns the WOS number (UT tag) of the record

writeRecord(infile)

Writes to infile the original contents of the Record. This is intended for use by RecordCollections to write to file. What is written to infile is bit for bit identical to the original record file (if utf-8 is used). No newline is inserted above the write but the last character is a newline.
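
The newline contract described above (nothing written before the record, a newline as the final character) matters mainly to callers that write several records to one file. The following is a minimal sketch of that contract using an in-memory buffer. StoredRecord and write_record are hypothetical names, and keeping the raw lines verbatim is an assumption about how a byte-identical copy could be produced, not a description of how metaknowledge does it.

```python
import io


class StoredRecord:
    """Hypothetical sketch of a record that can write itself back out."""

    def __init__(self, raw_text):
        # keepends=True preserves the original line terminators.
        self._lines = raw_text.splitlines(keepends=True)

    def write_record(self, outfile):
        # No leading newline: the first write is the record's first line.
        for line in self._lines:
            outfile.write(line)
        # Guarantee that the last character written is a newline.
        if self._lines and not self._lines[-1].endswith("\n"):
            outfile.write("\n")


# Invented record texts, for illustration only.
raw1 = "PT J\nTI First title\nUT WOS:A1979GV55600001\nER\n"
raw2 = "PT J\nTI Second title\nUT WOS:A1979GV55600002\nER\n"

buffer = io.StringIO()
StoredRecord(raw1).write_record(buffer)
buffer.write("\n")  # the caller, not the record, adds the separating blank line
StoredRecord(raw2).write_record(buffer)
written = buffer.getvalue()
```

Since the record never writes a leading newline, any separator between records is left to the caller, which keeps each record's own output identical to its source text.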