About Vercut

What is Vercut?

Vercut is an online text segmentation tool that breaks text down into sentences, phrases, and words. It supports multiple languages and reports metadata for each token, such as character offsets and punctuation markers.

Whether you're analyzing text, building NLP applications, or just curious about how text is structured, Vercut provides an intuitive interface to explore text segmentation.

Features

  • Multi-language support (English, German, Spanish, French, Italian, Portuguese, Russian)
  • Full Unicode and CJK character support
  • Sentence, phrase, and word segmentation
  • Detailed token metadata (offsets, punctuation markers)
  • Clean and intuitive user interface
  • Raw JSON export for integration

Technology

Vercut is built on top of the @echogarden/text-segmentation library, which provides robust multilingual text segmentation using a combination of regex-based rules and optional WebAssembly ICU segmentation for CJK languages.
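
To give a feel for what regex-based word segmentation with offsets can look like, here is a minimal sketch. It does not use the @echogarden/text-segmentation API and is not how the library is implemented; the `Token` shape, the `segmentWords` function, and the pattern are all assumptions made up for illustration.

```typescript
// Hypothetical token shape for illustration only; not the library's output type.
interface Token {
  text: string;
  start: number; // offset of the token's first UTF-16 code unit
  end: number; // offset just past the token's last code unit
  isPunctuation: boolean;
}

// Assumed pattern: a run of letters, digits, or combining marks (optionally
// joined by an apostrophe or hyphen), or else a single non-whitespace symbol.
// Built from strings so the Unicode property escapes are resolved at runtime.
const WORD_CHARS = "[\\p{L}\\p{N}\\p{M}]";
const TOKEN_PATTERN = `${WORD_CHARS}+(?:['’-]${WORD_CHARS}+)*|[^\\s]`;
const HAS_LETTER_OR_DIGIT = new RegExp("[\\p{L}\\p{N}]", "u");

function segmentWords(text: string): Token[] {
  const tokenRegex = new RegExp(TOKEN_PATTERN, "gu");
  const tokens: Token[] = [];
  let match: RegExpExecArray | null;

  // exec() with the global flag walks the string one match at a time.
  while ((match = tokenRegex.exec(text)) !== null) {
    const piece = match[0];
    tokens.push({
      text: piece,
      start: match.index,
      end: match.index + piece.length,
      // A token with no letter or digit is treated as punctuation or a symbol.
      isPunctuation: !HAS_LETTER_OR_DIGIT.test(piece),
    });
  }
  return tokens;
}
```

Calling `JSON.stringify(segmentWords("Hello, world!"), null, 2)` produces a plain JSON array of such tokens. A pattern like this treats an unbroken run of CJK characters as a single token, since those scripts do not put spaces between words, which is the kind of case where a dictionary-backed segmenter such as ICU is useful.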

The web application is built with React, TanStack Router, and TailwindCSS.

Open Source

Vercut is open source and available on GitHub. Contributions, bug reports, and feature requests are welcome!