WOSRecord(ExtendedRecord)¶
class metaknowledge.WOS.WOSRecord(inRecord, sFile='', sLine=0)¶
Class for full WOS records
It is meant to be immutable; many of the methods and attributes are evaluated when first called, not when the object is created, and the results are stored privately.
The record’s meta-data is stored in an ordered dictionary keyed by WOS tags. To access the raw data stored in the original record, the tags() method can be used. To access data that has been processed and cleaned, use the attributes named after the tags.
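The evaluate-on-first-access behaviour described above can be illustrated with a minimal sketch. This is not metaknowledge's implementation: the class `LazyRecord`, its `raw()` and `get()` methods, the `calls` counter and the two tag processors are all hypothetical names chosen for the example.

```python
from collections import OrderedDict


class LazyRecord:
    """Hypothetical stand-in for a record; not the library's own class."""

    # Assumed processors: each maps a tag's raw lines to a cleaned value
    _processors = {
        "TI": lambda lines: " ".join(lines),  # title: join wrapped lines
        "AU": lambda lines: list(lines),      # authors: one per line
    }

    def __init__(self, fields):
        self._fields = OrderedDict(fields)  # raw data, tag -> list of lines
        self._cache = {}                    # processed values, filled on demand
        self.calls = 0                      # counts how often a processor runs

    def raw(self, tag):
        """Return the unprocessed lines for tag, or None if absent."""
        return self._fields.get(tag)

    def get(self, tag):
        """Process tag on first access, then return the stored result."""
        if tag not in self._cache:
            lines = self._fields.get(tag)
            if lines is None:
                self._cache[tag] = None
            else:
                self.calls += 1
                self._cache[tag] = self._processors.get(tag, list)(lines)
        return self._cache[tag]


record = LazyRecord([("TI", ["A study of", "citations"]), ("AU", ["Smith, J"])])
title_first = record.get("TI")   # processor runs here
title_second = record.get("TI")  # served from the cache
```

The raw lines stay untouched in `_fields`, so the processed and unprocessed views of a tag remain available side by side.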
Customizations¶
The Record’s hashing and equality testing are based on the WOS number (the tag is ‘UT’; it is also called the accession number). WOS numbers are strings starting with 'WOS:' and followed by 15 or so numbers and letters, although both the length and character set are known to vary. The numbers are unique to each record, so they are used for comparisons. If a record is bad, all equality checks return False.
When converted to a string the record’s title is used, so for a record R, R.TI == R.title == str(R), and its representation uses the WOS number instead of the memory location.
Attributes¶
When a record is created, if the parsing of the WOS file failed it is marked as bad. The bad attribute is set to True and the error attribute is created to contain the exception object.
Generally, to get the information from a Record its attributes should be used. For a Record R, calling R.CR causes citations() from the tagProcessing module to be called on the contents of the raw ‘CR’ field. The result is then saved and returned; in this case, a list of Citation objects is returned. You can also call R.citations to get the same effect, as each known field tag has a longer name (currently there are 61 field tags). These names are meant to make accessing tags more readable, and the mapping from tag to name can be found in the tagToFull dict. If a tag is known (in tagToFull) but not in the raw data, None is returned instead. Most tags when cleaned return a string or list of strings; the exact results can be found in the help for the particular function.
The attribute authors is also defined as a convenience and returns the same as ‘AF’ or, if that is not found, ‘AU’.
__Init__¶
Records are generally created as collections in RecordCollections, not as individual objects. It is possible to create one on its own; the arguments are as follows.
Parameters¶
inRecord :
file stream, dict, str or itertools.chain
If it is a file stream the file must be open at the location of the first tag in the record, usually ‘PT’, and the file will be read until ‘ER’ is found, which indicates the end of the record in the file.
If a dict is passed the dictionary is used as the database of fields and tags, so each key is considered a WOS tag and each value a list of the lines of the original associated with the tag. This is the same form of dict that recordParser returns.
For a string the input must be the raw textual data of a single record in the WOS style; like the file stream, it must start at the first tag and end in 'ER'.
An itertools.chain is treated identically to a file stream and is used by RecordCollections.
sFile :
optional [str]
Is the name of the file the raw data was in; by default it is blank. It is mostly used to make error messages more informative.
sLine :
optional [int]
Is the line the record starts on in the raw data file. It is mostly used to make error messages more informative.
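The dict form of inRecord described above (each key a WOS tag, each value a list of the original lines for that tag) can be sketched as follows. The helper `parse_wos_text` is hypothetical and is not the library's recordParser; it assumes the common WOS plain-text layout in which a two-letter tag starts the line and continuation lines are indented by three spaces, and it assumes the tag prefix is stripped from the stored lines.

```python
from collections import OrderedDict


def parse_wos_text(text):
    """Hypothetical helper: split one WOS-style record into tag -> lines."""
    fields = OrderedDict()
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue                     # skip blank lines
        tag = line[:2]
        if tag == "ER":
            break                        # 'ER' marks the end of the record
        if tag.strip():
            current = tag                # a new tag begins in the first column
            fields[current] = [line[3:]]
        elif current is not None:
            fields[current].append(line[3:])  # indented continuation line
    return fields


sample = (
    "PT J\n"
    "AU Smith, J\n"
    "   Doe, A\n"
    "TI An example title\n"
    "UT WOS:000000000000001\n"
    "ER\n"
)
fields = parse_wos_text(sample)
```

The sample record and its WOS number are made up for illustration; a dict of this shape is the kind of value the inRecord parameter accepts according to the description above.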
UT¶
Returns the UT tag (WOS number) of the record
encoding()¶
An abstractmethod, gives the encoding string of the record.
static getAltName(tag)¶
An abstractmethod, gives the alternate name of tag or None
specialFuncs(key)¶
An abstractmethod, processes the special tag, key, using the whole Record
static tagProcessingFunc(tag)¶
An abstractmethod, gives the function for processing tag
wosString¶
Returns the WOS number (UT tag) of the record
writeRecord(infile)¶
Writes to infile the original contents of the Record. This is intended for use by RecordCollections to write to file. What is written to infile is bit for bit identical to the original record file (if utf-8 is used). No newline is inserted above the write but the last character is a newline.
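The newline behaviour described above (nothing written before the record, a newline as the last character) can be illustrated with a small sketch. The function `write_fields` is hypothetical and differs from writeRecord in one important way: it re-serialises a tag-to-lines dict rather than reproducing the original bytes. It also assumes the WOS plain-text layout with three-space continuation indents and a closing 'ER' line.

```python
import io


def write_fields(fields, outfile):
    """Hypothetical sketch: write tag -> lines in a WOS-like layout."""
    for tag, lines in fields.items():
        for index, line in enumerate(lines):
            # The tag appears on the first line only; later lines are indented
            prefix = tag if index == 0 else "  "
            outfile.write(f"{prefix} {line}\n")
    outfile.write("ER\n")  # assumed end-of-record marker, ends in a newline


buffer = io.StringIO()
write_fields({"PT": ["J"], "AU": ["Smith, J", "Doe, A"]}, buffer)
written = buffer.getvalue()
```

Because no leading newline is written, a caller writing several records in sequence is responsible for any blank line between them.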