
utils.ids

Warning

Importing these functions outside of the library is usually unnecessary.

  • Collection items already provide IDs in both string form (full_id) and tuple form (full_id_tuple), as well as a year attribute.
  • Person objects have an is_explicit attribute, which corresponds to checking whether their ID is verified.
  • ID validation functions are called automatically upon modifying relevant attributes, so you probably just want to set them directly and check for an exception.

Functions for manipulating Anthology IDs.

AnthologyID module-attribute

AnthologyID = str | AnthologyIDTuple

Any type that can be parsed into an Anthology ID.

AnthologyIDTuple module-attribute

AnthologyIDTuple = tuple[str, Optional[str], Optional[str]]

A tuple representing an Anthology ID.

build_id

build_id(collection_id, volume_id=None, paper_id=None)

Transforms collection ID, volume ID, and paper ID to a width-padded Anthology ID.

Parameters:

  • collection_id (str, required): A collection ID, e.g. "P18".
  • volume_id (Optional[str], default None): A volume ID, e.g. "1".
  • paper_id (Optional[str], default None): A paper ID, e.g. "42".

Returns:

  • str: The full Anthology ID.

Examples:

>>> build_id("P18", "1", "1")
P18-1001
>>> build_id("2022.acl", "long", "42")
2022.acl-long.42

Warning

Does not perform any kind of input validation.
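
The width padding can be illustrated with a minimal sketch. This is not the library's implementation: sketch_build_id is a hypothetical name, and the padding rules are assumptions inferred from the examples above and from the note on two-digit volume IDs under parse_id.

```python
from typing import Optional


def sketch_build_id(
    collection_id: str,
    volume_id: Optional[str] = None,
    paper_id: Optional[str] = None,
) -> str:
    """Hypothetical illustration of width padding; performs no validation."""
    if volume_id is None:
        return collection_id
    if paper_id is None:
        return f"{collection_id}-{volume_id}"
    if collection_id[0].isdigit():
        # New-style IDs (e.g. "2022.acl") are joined without any padding.
        return f"{collection_id}-{volume_id}.{paper_id}"
    # Old-style IDs always carry four digits after the hyphen.
    # Assumption: these are the cases that use a two-digit volume ID.
    two_digit_volume = (
        collection_id.startswith("W")
        or collection_id == "C69"
        or (collection_id == "D19" and volume_id[0] >= "5")
    )
    if two_digit_volume:
        return f"{collection_id}-{int(volume_id):02d}{int(paper_id):02d}"
    return f"{collection_id}-{int(volume_id):01d}{int(paper_id):03d}"
```

Under these assumptions, the paper ID is padded to three digits when the volume ID takes one digit, and to two digits when the volume ID takes two.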

build_id_from_tuple

build_id_from_tuple(anthology_id)

Like build_id(), but takes any AnthologyID type.

Parameters:

  • anthology_id (AnthologyID, required): The Anthology ID to convert into a string.

Returns:

  • str: The full Anthology ID.

Examples:

>>> build_id_from_tuple(("P18", "1", "1"))
P18-1001

infer_year

infer_year(anthology_id)

Infer the year from an Anthology ID.

Parameters:

  • anthology_id (AnthologyID, required): An arbitrary Anthology ID.

Returns:

  • str: The year of the item represented by the Anthology ID, as a four-character string.
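
One way such an inference could work is sketched below. The function name sketch_infer_year is hypothetical, it accepts only the string form, and the century cutoff of 60 for old-style two-digit years is an assumption, not something this page states.

```python
def sketch_infer_year(anthology_id: str) -> str:
    """Hypothetical illustration; performs no validation."""
    collection_id = anthology_id.split("-", 1)[0]
    if collection_id[0].isdigit():
        # New-style collection IDs such as "2022.acl" start with the year.
        return collection_id.split(".", 1)[0]
    # Old-style collection IDs are a letter followed by a two-digit year.
    digits = collection_id[1:]
    # Assumption: two-digit years from 60 upwards belong to the 1900s.
    century = "19" if int(digits) >= 60 else "20"
    return century + digits
```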

is_valid_collection_id

is_valid_collection_id(id_)

Validate that a string is formatted like a proper collection ID.

Returns:

  • bool: True if the string is valid, False otherwise.

is_valid_item_id

is_valid_item_id(id_)

Validate that a string is a valid volume or paper ID.

Volume or paper IDs may consist only of lower-case ASCII letters and digits.

Returns:

  • bool: True if the string is valid, False otherwise.
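
The rule above can be expressed as a regular expression. The following is a minimal sketch, not the library's code: sketch_is_valid_item_id is a hypothetical name, and rejecting the empty string is an assumption.

```python
import re

# Assumption: at least one character is required.
_ITEM_ID = re.compile(r"[a-z0-9]+")


def sketch_is_valid_item_id(id_: str) -> bool:
    """Hypothetical check for lower-case ASCII letters and digits only."""
    return _ITEM_ID.fullmatch(id_) is not None
```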

is_valid_orcid

is_valid_orcid(orcid)

Validate that a string looks like an ORCID and has the correct checksum.

Returns:

  • bool: True if the ORCID validates, False otherwise.
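
For readers unfamiliar with the checksum, ORCID iDs end in a check character computed with the ISO 7064 MOD 11-2 algorithm. The sketch below illustrates that algorithm; sketch_is_valid_orcid is a hypothetical name, and accepting only the hyphenated 19-character form is an assumption about what "looks like an ORCID" means here.

```python
import re

# Assumption: only the hyphenated form "dddd-dddd-dddd-dddX" is accepted.
_ORCID = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def sketch_is_valid_orcid(orcid: str) -> bool:
    """Hypothetical format and checksum test (ISO 7064 MOD 11-2)."""
    if _ORCID.fullmatch(orcid) is None:
        return False
    chars = orcid.replace("-", "")
    total = 0
    for ch in chars[:-1]:
        # Fold in each of the first 15 digits, doubling after every step.
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[-1] == expected
```

A check value of 10 is written as "X", which is why the final character is allowed to be a letter.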

is_verified_person_id

is_verified_person_id(id_)

Validate that a string is formatted like a verified person ID.

Returns:

  • bool: True if this ID can refer to a verified person.

Warning

Does not perform any kind of input validation.

parse_id

parse_id(anthology_id)

Parses an Anthology ID into its constituent collection ID, volume ID, and paper ID parts.

Parameters:

  • anthology_id (AnthologyID, required): The Anthology ID to parse.

Returns:

  • AnthologyIDTuple: The parsed collection ID, volume ID, and paper ID.

Examples:

>>> parse_id("P18-1007")
('P18', '1', '7')
>>> parse_id("W18-6310")
('W18', '63', '10')
>>> parse_id("D19-1001")
('D19', '1', '1')
>>> parse_id("D19-5702")
('D19', '57', '2')
>>> parse_id("2022.acl-main.1")
('2022.acl', 'main', '1')

Also works with volumes:

>>> parse_id("P18-1")
('P18', '1', None)
>>> parse_id("W18-63")
('W18', '63', None)

And even with just collections:

>>> parse_id("P18")
('P18', None, None)

Warning

Does not perform any kind of input validation.

Note

For Anthology IDs prior to 2020, the volume ID is the first digit after the hyphen, except in the following cases, where it is the first two digits:

  • All collections starting with 'W'
  • The collection "C69"
  • IDs in the collection "D19" where the first digit after the hyphen is >= 5
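
The splitting rules in this note can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: sketch_parse_id is a hypothetical name, it accepts only the string form, and stripping leading zeros from the numeric parts is inferred from the examples above.

```python
from typing import Optional, Tuple

IDTuple = Tuple[str, Optional[str], Optional[str]]


def sketch_parse_id(anthology_id: str) -> IDTuple:
    """Hypothetical illustration of the splitting rules; no validation."""
    if "-" not in anthology_id:
        return (anthology_id, None, None)
    collection_id, rest = anthology_id.split("-", 1)
    if collection_id[0].isdigit():
        # New-style IDs separate volume and paper with a dot.
        if "." in rest:
            volume_id, paper_id = rest.split(".", 1)
            return (collection_id, volume_id, paper_id)
        return (collection_id, rest, None)
    if len(rest) < 4:
        # Fewer than four digits after the hyphen: a volume, not a paper.
        return (collection_id, rest, None)
    # Old-style paper IDs: decide how many digits belong to the volume.
    two_digit_volume = (
        collection_id.startswith("W")
        or collection_id == "C69"
        or (collection_id == "D19" and rest[0] >= "5")
    )
    split_at = 2 if two_digit_volume else 1
    # Assumption: leading zeros are padding and are dropped.
    volume_id = rest[:split_at].lstrip("0") or "0"
    paper_id = rest[split_at:].lstrip("0") or "0"
    return (collection_id, volume_id, paper_id)
```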