spotterbase.spotters package

Subpackages

Submodules

spotterbase.spotters.spotter module

class spotterbase.spotters.spotter.Spotter(ctx: SpotterContext)

Bases: ABC

ctx: SpotterContext
abstract process_document(document: Document) → Iterable[tuple[Uri | BlankNode, Uri, Uri | BlankNode | Literal]]
classmethod setup_run(**kwargs) → tuple[SpotterContext, Iterable[tuple[Uri | BlankNode, Uri, Uri | BlankNode | Literal]]]
spotter_short_id: str

class spotterbase.spotters.spotter.SpotterContext(run_uri: Uri | None = None)

Bases: object

run_uri: Uri

class spotterbase.spotters.spotter.UriGenerator(base_uri: Uri)

Bases: object

class spotterbase.spotters.spotter.UriGeneratorMixin(ctx: SpotterContext)

Bases: Spotter, ABC

get_uri_generator_for(document: Document) → UriGenerator
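
As a rough illustration of the interface listed above, the following sketch defines a toy Spotter subclass. To stay self-contained it does not import spotterbase: Spotter, SpotterContext, and Document are simplified stand-ins that only mirror the signatures shown in this section, and plain strings take the place of Uri, BlankNode, and Literal. The WordCountSpotter class, the uri and text attributes of Document, and all example URIs are hypothetical and not part of the library.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Stand-in: real triples use Uri | BlankNode | Literal; plain strings are used here.
Triple = Tuple[str, str, str]


@dataclass
class SpotterContext:
    """Stand-in mirroring SpotterContext(run_uri: Uri | None = None)."""
    run_uri: Optional[str] = None


@dataclass
class Document:
    """Hypothetical minimal document; the real Document class is not described here."""
    uri: str
    text: str


class Spotter(ABC):
    """Stand-in mirroring the abstract base class listed above."""
    spotter_short_id: str

    def __init__(self, ctx: SpotterContext):
        self.ctx = ctx

    @abstractmethod
    def process_document(self, document: Document) -> Iterable[Triple]:
        ...


class WordCountSpotter(Spotter):
    """Hypothetical spotter that emits one triple per document."""
    spotter_short_id = 'wordcount'

    def process_document(self, document: Document) -> Iterable[Triple]:
        # Yield one (subject, predicate, object) triple; the predicate URI is made up.
        word_count = len(document.text.split())
        yield (document.uri, 'http://example.org/wordCount', str(word_count))


ctx = SpotterContext(run_uri='http://example.org/run/1')
spotter = WordCountSpotter(ctx)
triples = list(spotter.process_document(Document('http://example.org/doc/1', 'a b c')))
print(triples)
```

Writing process_document as a generator fits the Iterable return type in the signature: triples can be produced one at a time instead of being collected in a list first.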

spotterbase.spotters.spotter_runner module

Code for running spotters over a corpus.

The current implementation is rather hacky. Possible improvements:
  • Split the module into multiple files (core functionality vs. caller functions).

  • Serialization currently uses N-Triples, and results are simply copied. Can we do better?

class spotterbase.spotters.spotter_runner.RunnerTtlSerializer(path: Path)

Bases: TurtleSerializer

close()

spotterbase.spotters.spotter_runner.auto_run_spotter(spotter_class: type[Spotter] | list[type[Spotter]])

Runs the spotter(s) and handles all the command line arguments etc.

spotterbase.spotters.spotter_runner.run(spotter_classes: list[type[Spotter]], documents: Iterable[Document], *, corpus_descr: str, directory: Path)

Module contents