1 dataset found

Keywords: text deduplication

  • onion

    onion (ONe Instance ONly) is a tool for removing duplicate parts from large collections of texts. The tool is implemented in Python, licensed under the New BSD License and...